🦀 CrabGlamp Docs
Build

DEADPOOL: A Corporate Digital Decay Index in 36 Hours

Watching every company in the Russell 2000 for signs of digital rot — SSL failures, dead hiring, late SEC filings — scored daily across six dimensions.

March 2026
Astro · TypeScript · SQLite · OpenClaw · CrabGlamp

There’s a class of signal that financial analysts largely ignore: the state of a company’s own digital infrastructure. When a company’s SSL cert is about to expire, their DNS is misconfigured, their website hasn’t been updated in months, they’ve quietly pulled every job posting, and their SEC filings show “notice of late filing” — that’s not noise. That pattern tends to show up right before something very bad becomes public.

DEADPOOL watches every company in the Russell 2000 for exactly these signals, scores them daily across six dimensions, and surfaces the ones showing signs of decay.


The idea

The Russell 2000 is the primary U.S. small-cap benchmark — 2,000 companies spanning biotech, regional banks, industrials, tech, energy, everything. What makes it a good target for this kind of analysis is the coverage gap. S&P 500 companies have dozens of analysts on every earnings call. A $200M industrial in the Russell 2000 might have two. When that company starts missing SEC deadlines while its SSL cert expires and its careers page goes dark, the signal exists — but nobody’s pulling it together.

The premise is simple: a company’s public-facing digital infrastructure tells you something about what’s happening inside. Not everything. Not reliably in isolation. But when multiple signals cluster — expired certs, abandoned DNS, stale website, no job postings, late filings, stock cratering — that convergence is worth paying attention to.

Six signals, collected daily:

| Signal | What it measures | Weight |
|---|---|---|
| SSL | Certificate validity, expiry, chain quality | 18% |
| DNS | Record health, TTL sanity, MX presence | 12% |
| Web Freshness | Last-modified headers, content staleness, HTTP response codes | 15% |
| Jobs | Active job postings via ATS detection | 12% |
| SEC | Late filings, restatements, auditor changes, going-concern 8-Ks | 18% |
| Price | 52-week range position + momentum | 25% |

Each signal produces a 0–100 subscore. The composite is a weighted average over the signals that actually returned data: a missing signal’s weight is redistributed proportionally across the rest rather than being scored as zero. Score labels: Healthy (≥90), Watch (≥70), Warning (≥50), Critical (≥25), Deadpool (<25).
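
A minimal sketch of that composite, assuming subscores arrive as a plain object keyed by signal. The names `WEIGHTS`, `compositeScore`, and `scoreLabel` are illustrative and not taken from the project's `scorer.ts`; only the weights and label thresholds come from the text above.

```typescript
// Signal weights from the table above (percent; they sum to 100).
const WEIGHTS = {
  ssl: 18,
  dns: 12,
  web: 15,
  jobs: 12,
  sec: 18,
  price: 25,
} as const;

type Signal = keyof typeof WEIGHTS;
// A missing key (or null) means the collector produced no subscore.
type Subscores = Partial<Record<Signal, number | null>>;

// Weighted average over the signals that are present. Dividing by the sum of
// the present weights is what redistributes the weight of missing signals.
function compositeScore(subscores: Subscores): number | null {
  let weighted = 0;
  let totalWeight = 0;
  for (const signal of Object.keys(WEIGHTS) as Signal[]) {
    const value = subscores[signal];
    if (value === null || value === undefined) continue;
    weighted += value * WEIGHTS[signal];
    totalWeight += WEIGHTS[signal];
  }
  return totalWeight === 0 ? null : weighted / totalWeight;
}

// Label thresholds as listed in the text.
function scoreLabel(score: number): string {
  if (score >= 90) return "Healthy";
  if (score >= 70) return "Watch";
  if (score >= 50) return "Warning";
  if (score >= 25) return "Critical";
  return "Deadpool";
}
```

For example, if only SSL (100) and Price (0) returned data, the composite is (100 × 18 + 0 × 25) / (18 + 25) ≈ 41.9, which lands in Critical.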


The setup

This project has a few non-negotiable requirements: a persistent server (the database grows every day), a public URL (it’s a live dashboard), an AI coding agent that can touch the running system directly (the data quality work demands it), and the ability to skip the usual devops ceremony.

The environment is a CrabGlamp agent: persistent Linux VM, public HTTPS, browser-based editor and terminal, OpenClaw pre-installed. Deployment is one nginx location block followed by one command:

location / {
    proxy_pass http://127.0.0.1:4321;
    proxy_http_version 1.1;
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_read_timeout 60s;
}

crabglamp web enable deadpool

Site’s live. No CI pipeline, no container registry, no DNS provisioning. The interesting part of this project was never the infrastructure.


Day one

First commit at 20:43 UTC, March 23. The second commit, three minutes later, loaded the full Russell 2000.

The initial scaffold landed as ~9,800 lines in a single commit: database schema, all six collectors, the scoring engine, the seed script, and the Astro frontend. Built clean on the first pass.

feat: initial DEADPOOL scaffold — full stack, builds clean, smoke tested
 36 files changed, 9844 insertions(+)

OpenClaw wrote directly into the project, ran the build, checked output, fixed issues, and iterated — all inside the same environment where the site would eventually run. When the first overnight collection run hit SEC rate limits at 3am, the fix was a conversation and a file edit, not a deployment cycle.


The domain problem

This was the hardest part of the entire project. Not the scoring. Not the frontend. Figuring out the correct website for each of 1,916 public companies.

Guessing companyname.com fails spectacularly at Russell 2000 scale. These aren’t household names. “Target Hospitality Corp” doesn’t own targethospitalitycorp.com. Microvast Holdings (MVST) was resolving to thcb.com — the Tuscan Holdings SPAC that took them public four years ago. Kinetik Holdings (KNTK) was pointing at apachecorp.com — an entirely different company. Scores were being computed against the wrong websites.

The Russell 2000 is full of this. Renamed companies on pre-acquisition domains. SPACs that merged years ago but never updated their EDGAR metadata. Corporate “websites” that are actually third-party IR subdomains.

The fix: SEC XBRL namespaces.

Every 10-K filing includes XML namespace declarations:

xmlns:TICKER="https://www.actualcompanydomain.com/2024"

Set by the company’s own IR team. As authoritative as it gets. The resolver pulls the most recent 10-K from EDGAR and extracts the namespace domain, then validates it against DNS before accepting.

async function fromXbrlNamespace(cik: string): Promise<string | null> {
  // Get filing index, find the .htm document
  // Fetch it, grep for xmlns:TICKER="http://DOMAIN/..."
  // Extract and clean the domain
  // Must pass DNS verification before accepting
}

Blocklist for the obvious traps: registrars, sec.gov, blogging platforms. Special-case handling for XBRL artifacts that append .com to non-.com TLDs (nanox.vision.com → nanox.vision).
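
The extraction and cleanup steps can be sketched as two pure functions. This is an illustration under assumptions, not the project's `domain-resolver.ts`: the function names, the blocklist entries, and the list of affected TLDs are hypothetical, and the EDGAR fetch and DNS verification steps are left out.

```typescript
// Illustrative blocklist; the project's real list is not shown in the write-up.
const BLOCKLIST = new Set(["sec.gov", "godaddy.com", "wordpress.com"]);

// Non-.com TLDs known to pick up a spurious ".com" suffix. "vision" comes from
// the nanox.vision case; other entries would be added as they are discovered.
const NON_COM_TLDS = ["vision"];

// Lowercase, drop a leading "www.", and strip the ".com" artifact.
function cleanDomain(host: string): string {
  let domain = host.toLowerCase().replace(/^www\./, "");
  for (const tld of NON_COM_TLDS) {
    if (domain.endsWith(`.${tld}.com`)) {
      domain = domain.slice(0, -".com".length);
    }
  }
  return domain;
}

// Find xmlns:TICKER="http(s)://DOMAIN/..." in a filing document and return the
// cleaned domain, or null if there is no match or the domain is blocklisted.
function extractNamespaceDomain(filingHtml: string, ticker: string): string | null {
  // Keep only [a-z0-9] so the ticker is safe to embed in a regular expression.
  const prefix = ticker.toLowerCase().replace(/[^a-z0-9]/g, "");
  if (prefix === "") return null;
  const pattern = new RegExp(`xmlns:${prefix}="https?://([^/"\\s]+)`, "i");
  const match = pattern.exec(filingHtml);
  if (match === null) return null;
  const domain = cleanDomain(match[1]);
  return BLOCKLIST.has(domain) ? null : domain;
}
```

A caller would still need to confirm that the returned domain resolves in DNS before accepting it, as described above.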

Even after all this, three manual passes:

fix: nanox.vision domain + XBRL .com artifact stripping
fix: alert dedup, 30+ bad domains, subdomain stripping in resolver
fix: 12 more domain corrections + jobs collector hang prevention

Having the AI agent inside the running environment changed how this work happened. The domain corrections weren’t a write-deploy-check cycle. They were: query the database for companies with suspicious scores, cross-reference against EDGAR, apply the fix, re-score, confirm the output makes sense. All in one loop, all against real data.

Final count: 1,727 domains verified via SEC filing or manual research and 189 unverified (auto-resolved, likely correct), for 1,916 in total. Five of the verified domains are XBRL-confirmed but have no web server: the domain is right, the company just doesn’t serve HTTP.


The collectors

Each collector takes a Company and returns { raw_data, subscore, signals }. The runner batches them with controlled concurrency to stay under rate limits.
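
The collector contract and a bounded-concurrency runner might look roughly like this. The result shape `{ raw_data, subscore, signals }` is from the text; the `Company` fields, the `runWithConcurrency` name, and the worker-pool approach are assumptions for illustration.

```typescript
interface Company {
  id: number;
  ticker: string;
  domain: string;
}

interface CollectorResult {
  raw_data: unknown;
  subscore: number | null; // null when the collector has nothing to report
  signals: string[];
}

type Collector = (company: Company) => Promise<CollectorResult>;

// Runs `collector` over `companies` with at most `limit` calls in flight.
// A failure for one company is captured as an Error in its slot and does not
// stop the rest of the run.
async function runWithConcurrency(
  companies: Company[],
  collector: Collector,
  limit: number,
): Promise<(CollectorResult | Error)[]> {
  const results: (CollectorResult | Error)[] = new Array(companies.length);
  let next = 0;

  // Each worker pulls the next unclaimed index until the list is exhausted.
  async function worker(): Promise<void> {
    while (next < companies.length) {
      const index = next++;
      try {
        results[index] = await collector(companies[index]);
      } catch (err) {
        results[index] = err instanceof Error ? err : new Error(String(err));
      }
    }
  }

  const workerCount = Math.max(1, Math.min(limit, companies.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Results come back in input order, so each slot can be stored as a snapshot for the matching company.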

SSL (124 lines) — Pure Node TLS socket. Checks certificate validity, days until expiry, chain depth. Fails on 8.6% of hosts, mostly timeouts and connection resets. Public companies have worse SSL hygiene than you’d think.
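
A rough sketch of a TLS-socket check using Node's built-in `tls` module. The function names and the 10-second timeout are assumptions; chain-depth inspection and subscore mapping are omitted.

```typescript
import { connect } from "node:tls";

interface CertInfo {
  authorized: boolean; // did the chain validate against the system CAs?
  daysUntilExpiry: number; // negative once the certificate has expired
}

// Whole days from `now` until the certificate's valid_to timestamp.
function daysUntil(validTo: string, now: Date = new Date()): number {
  return Math.floor((new Date(validTo).getTime() - now.getTime()) / 86_400_000);
}

function checkCertificate(host: string, timeoutMs = 10_000): Promise<CertInfo> {
  return new Promise((resolve, reject) => {
    // rejectUnauthorized: false lets the handshake finish even for expired or
    // untrusted certificates, so they can be inspected instead of just failing.
    const socket = connect(
      { host, port: 443, servername: host, rejectUnauthorized: false },
      () => {
        const cert = socket.getPeerCertificate();
        const info: CertInfo = {
          authorized: socket.authorized,
          daysUntilExpiry: daysUntil(cert.valid_to),
        };
        socket.end();
        resolve(info);
      },
    );
    // Timeouts and resets surface as 'error' and reject the promise.
    socket.setTimeout(timeoutMs, () => socket.destroy(new Error(`TLS timeout for ${host}`)));
    socket.on("error", reject);
  });
}
```

Timeouts and connection resets land in the rejection path here, which is where the failure rate above would be counted.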

DNS (141 lines) — A/AAAA records, MX presence, SPF/DMARC, TTL values. Companies that have stopped maintaining their digital presence let these rot first. A missing MX record on a corporate domain is a surprisingly strong signal — it often means they’ve stopped using corporate email on that domain entirely.
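
The record checks can be approximated with `node:dns/promises`. This is a sketch with hypothetical names; it ignores TTL values and treats any lookup failure as "no records", which conflates NXDOMAIN with transient resolver errors.

```typescript
import { resolve4, resolveMx, resolveTxt } from "node:dns/promises";

interface DnsHealth {
  hasA: boolean;
  hasMx: boolean;
  hasSpf: boolean;
  hasDmarc: boolean;
}

// TXT records arrive as arrays of chunks; join each record before matching.
function hasTxtPrefix(records: string[][], prefix: string): boolean {
  const wanted = prefix.toLowerCase();
  return records.some((chunks) => chunks.join("").toLowerCase().startsWith(wanted));
}

// Treat a failed lookup as an empty record set (an assumption of this sketch).
async function orEmpty<T>(lookup: Promise<T[]>): Promise<T[]> {
  try {
    return await lookup;
  } catch {
    return [];
  }
}

async function checkDns(domain: string): Promise<DnsHealth> {
  const [a, mx, txt, dmarc] = await Promise.all([
    orEmpty(resolve4(domain)),
    orEmpty(resolveMx(domain)),
    orEmpty(resolveTxt(domain)),
    orEmpty(resolveTxt(`_dmarc.${domain}`)), // DMARC policy lives on this label
  ]);
  return {
    hasA: a.length > 0,
    hasMx: mx.length > 0,
    hasSpf: hasTxtPrefix(txt, "v=spf1"),
    hasDmarc: hasTxtPrefix(dmarc, "v=DMARC1"),
  };
}
```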

Web Freshness (149 lines) — Last-Modified and Age headers, parking page detection, response validation. Most reliable collector. Zero errors in 24 hours of continuous operation.
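
As an illustration of the header side of this check, the snippet below turns a Last-Modified value into a staleness figure. The function names are hypothetical, and the Age header and parking-page detection are not covered.

```typescript
// Days since the resource was last modified, or null when the header is
// missing or cannot be parsed as a date.
function daysSinceModified(lastModified: string | null, now: Date = new Date()): number | null {
  if (!lastModified) return null;
  const timestamp = Date.parse(lastModified);
  if (Number.isNaN(timestamp)) return null;
  return Math.max(0, Math.floor((now.getTime() - timestamp) / 86_400_000));
}

// HEAD request against the site; many servers omit Last-Modified on dynamic
// pages, in which case staleDays is null rather than zero.
async function checkFreshness(url: string): Promise<{ status: number; staleDays: number | null }> {
  const res = await fetch(url, { method: "HEAD", redirect: "follow" });
  return {
    status: res.status,
    staleDays: daysSinceModified(res.headers.get("last-modified")),
  };
}
```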

Jobs (268 lines) — The most complex collector. Detects which ATS the company uses (Greenhouse, Lever, Workday, BambooHR, among others), hits the API or careers page, counts active postings, classifies by department. A company that’s stopped hiring in engineering and sales but still posting legal and compliance roles tells a different story than one with zero listings. Many Russell 2000 companies have fewer than 500 employees — when hiring goes to zero, it’s more telling than at a company with 50,000 people.
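
One plausible way to do the ATS detection step is to scan the careers page for links or embeds pointing at known ATS hosts. The host patterns below are the public job-board domains of the four vendors named above; the function name and the link-scanning approach are assumptions, and counting or classifying postings is out of scope here.

```typescript
type Ats = "greenhouse" | "lever" | "workday" | "bamboohr";

// Hostname suffixes used by each vendor's hosted job boards.
const ATS_HOSTS: Array<[Ats, RegExp]> = [
  ["greenhouse", /(^|\.)greenhouse\.io$/],
  ["lever", /(^|\.)lever\.co$/],
  ["workday", /(^|\.)myworkdayjobs\.com$/],
  ["bamboohr", /(^|\.)bamboohr\.com$/],
];

// Returns the first ATS whose host appears in an href or src attribute.
function detectAts(careersHtml: string): Ats | null {
  const linkPattern = /(?:href|src)=["'](https?:\/\/[^"']+)["']/gi;
  let match: RegExpExecArray | null;
  while ((match = linkPattern.exec(careersHtml)) !== null) {
    let host: string;
    try {
      host = new URL(match[1]).hostname.toLowerCase();
    } catch {
      continue; // skip values that are not parseable URLs
    }
    for (const [ats, pattern] of ATS_HOSTS) {
      if (pattern.test(host)) return ats;
    }
  }
  return null;
}
```

A null result means no recognized ATS was found, which is different from a company having zero open roles.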

SEC (235 lines) — EDGAR submissions API. Looks for NT (notice of late filing) forms, 8-K items for restatements (4.02), auditor changes (4.01), bankruptcy (1.03), going concern (8.01). Flags companies that haven’t filed a 10-K in 14+ months. Late filings cluster disproportionately in the Russell 2000.
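
A sketch of the flagging logic over the `filings.recent` section of an EDGAR submissions response. It assumes the parallel-array layout with `form`, `filingDate`, and a comma-separated `items` string per filing; the flag names and the 14 × 30-day approximation of "14 months" are illustrative choices, not the project's.

```typescript
// Only the fields used here are declared.
interface RecentFilings {
  form: string[];
  filingDate: string[]; // YYYY-MM-DD
  items: string[]; // e.g. "4.02,9.01" for an 8-K, "" for other forms
}

// 8-K item numbers mapped the way the text describes them. Item 8.01 is the
// general "Other Events" item, so treating it as going-concern is a heuristic.
const ITEM_FLAGS: Record<string, string> = {
  "4.02": "restatement",
  "4.01": "auditor_change",
  "1.03": "bankruptcy",
  "8.01": "going_concern",
};

function secFlags(recent: RecentFilings, now: Date = new Date()): string[] {
  const flags = new Set<string>();
  let latest10K: number | null = null;

  for (let i = 0; i < recent.form.length; i++) {
    const form = recent.form[i];
    if (form.startsWith("NT ")) flags.add("late_filing_notice"); // NT 10-K, NT 10-Q
    if (form === "8-K") {
      for (const item of (recent.items[i] ?? "").split(",")) {
        const flag = ITEM_FLAGS[item.trim()];
        if (flag) flags.add(flag);
      }
    }
    if (form === "10-K") {
      const filed = Date.parse(recent.filingDate[i]);
      if (!Number.isNaN(filed) && (latest10K === null || filed > latest10K)) latest10K = filed;
    }
  }

  // Roughly 14 months; also fires when no 10-K appears in the window at all.
  const fourteenMonthsMs = 14 * 30 * 86_400_000;
  if (latest10K === null || now.getTime() - latest10K > fourteenMonthsMs) {
    flags.add("no_recent_10k");
  }
  return Array.from(flags);
}
```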

Price (154 lines) — Yahoo Finance. 52-week range position plus momentum from weekly closes. Carries the highest weight (25%) because price aggregates information efficiently. But the alert threshold is tighter than other signals — being near your 52-week low is common in small caps. Being there while the rest of your signals are also deteriorating is the convergence that matters.
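
The two inputs to the price subscore can be written down directly. This is a sketch: the function names, the clamping, and the four-week momentum window are assumptions, and how the project blends the two numbers into a 0–100 subscore is not shown in the write-up.

```typescript
// Position of the latest price inside the 52-week range:
// 0 = at the low, 1 = at the high.
function rangePosition(price: number, low52: number, high52: number): number {
  if (high52 <= low52) return 0.5; // flat range or bad data: treat as neutral
  return Math.min(1, Math.max(0, (price - low52) / (high52 - low52)));
}

// Fractional change between the latest weekly close and the close `weeks`
// weeks earlier. Closes are ordered oldest to newest.
function momentum(weeklyCloses: number[], weeks = 4): number | null {
  if (weeklyCloses.length < weeks + 1) return null;
  const latest = weeklyCloses[weeklyCloses.length - 1];
  const past = weeklyCloses[weeklyCloses.length - 1 - weeks];
  return past > 0 ? latest / past - 1 : null;
}
```

A stock at 15 with a 52-week range of 10 to 20 sits at position 0.5; closes falling from 100 to 90 over four weeks give a momentum of about -0.10.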


The schema

Append-only raw data, clean separation:

-- companies: one row per company, domain is the key
-- snapshots: immutable raw collector output, one per (company, collector, run)
-- scores: one composite score per (company, day), overwritten on re-run
-- alerts: deduped by (company_id, alert_type, title)

SQLite, WAL mode, better-sqlite3 for synchronous queries in the SSR layer. No ORM. Prepared statements in db.ts, exported as a Q object.
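
To make the four-table layout concrete, here is hypothetical DDL held in TypeScript constants, in the spirit of prepared statements living in one module. Column names are assumptions; only the keys and the dedup and overwrite rules come from the description above.

```typescript
// Hypothetical schema; only the uniqueness rules are taken from the write-up.
const SCHEMA_SQL = `
CREATE TABLE IF NOT EXISTS companies (
  id     INTEGER PRIMARY KEY,
  ticker TEXT NOT NULL,
  domain TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS snapshots (
  id         INTEGER PRIMARY KEY,
  company_id INTEGER NOT NULL REFERENCES companies(id),
  collector  TEXT NOT NULL,
  run_id     TEXT NOT NULL,
  raw_data   TEXT NOT NULL,
  UNIQUE (company_id, collector, run_id)
);
CREATE TABLE IF NOT EXISTS scores (
  company_id INTEGER NOT NULL REFERENCES companies(id),
  day        TEXT NOT NULL,
  composite  REAL NOT NULL,
  PRIMARY KEY (company_id, day)
);
CREATE TABLE IF NOT EXISTS alerts (
  id         INTEGER PRIMARY KEY,
  company_id INTEGER NOT NULL REFERENCES companies(id),
  alert_type TEXT NOT NULL,
  title      TEXT NOT NULL,
  UNIQUE (company_id, alert_type, title)
);
`;

// Upsert: one score per (company, day), overwritten when the day is re-run.
const UPSERT_SCORE_SQL = `
INSERT INTO scores (company_id, day, composite) VALUES (?, ?, ?)
ON CONFLICT (company_id, day) DO UPDATE SET composite = excluded.composite;
`;

// INSERT OR IGNORE leans on the UNIQUE constraint to drop duplicate alerts.
const INSERT_ALERT_SQL = `
INSERT OR IGNORE INTO alerts (company_id, alert_type, title) VALUES (?, ?, ?);
`;
```

Pushing deduplication into constraints keeps the collector code free of read-before-write checks.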

In WAL mode, readers don’t block the writer and the writer doesn’t block readers, so collectors can write thousands of rows while the web server handles requests. Six days of continuous operation, zero contention. For a system that writes daily and reads constantly, SQLite is more than enough, and it’s one file. No connection pool, no managed database, no credentials to rotate.


The aesthetic

The brief: make it look like a surveillance terminal from a movie you probably shouldn’t be watching.

redesign: pizzint aesthetic — boot sequence, scanlines, radar sweep, live ticker,
DEFCON status, self-hosted JetBrains Mono, dp-pulse on deadpool badge

Boot sequence on load. DEFCON-style status indicator. Live alert ticker across the top. Pulsing red badges on critical companies. CRT scanlines over everything. All CSS — no canvas, no WebGL, no JavaScript framework. JetBrains Mono, self-hosted. Terminal green (#39ff14), warning amber, critical red, all on near-black.

If you’re building a system that watches two thousand companies for signs of corporate decay, the UI should feel like it.


Hardening

First full collection run finished around 03:00 UTC on March 24. All six collectors across the Russell 2000 in about three and a half hours, producing 11,496 data points per run (1,916 companies × 6 collectors).

What broke immediately:

False deadpool scores from bad domains. Companies with wrong domain guesses scored zero on SSL and DNS and landed in the worst category. The Russell 2000 has a long tail of obscure companies where naive domain guessing fails completely. Fix: filter unverified domains out before scoring.

Jobs collector hanging. Some ATS endpoints return chunked responses that never terminate. More common with smaller companies on older platforms. Fix: hard timeouts, AbortController on every request.
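
The timeout fix might look something like the following. The helper name, the 15-second default, and the injectable `fetchImpl` parameter (there only so the helper can be exercised without a network) are assumptions of this sketch.

```typescript
// Fetches a URL as text, aborting the whole exchange after `timeoutMs`.
async function fetchWithTimeout(
  url: string,
  timeoutMs = 15_000,
  fetchImpl: typeof fetch = fetch,
): Promise<string> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(url, { signal: controller.signal });
    // Reading the body stays inside the try block: a chunked response that
    // never terminates stalls here, and the abort signal is what ends it.
    return await res.text();
  } finally {
    clearTimeout(timer);
  }
}
```

The important detail is that the timer covers the body read as well as the initial response, since the hang described above happens after headers have already arrived.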

SEC rate limiting. EDGAR allows roughly 10 requests per second. With 1,916 companies, the SEC collector needed exponential backoff with jitter and serialized calls instead of batched concurrency.
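
Exponential backoff with jitter plus serialized calls can be sketched as below. This uses the "full jitter" variant (a random delay between zero and the exponential ceiling); the base delay, cap, attempt count, and function names are assumptions rather than the project's settings.

```typescript
// Delay before retrying after failed attempt number `attempt` (0-based).
function backoffDelay(
  attempt: number,
  baseMs = 500,
  capMs = 30_000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

const defaultSleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retries `task` up to `maxAttempts` times, sleeping between attempts.
async function withRetry<T>(
  task: () => Promise<T>,
  maxAttempts = 5,
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) await sleep(backoffDelay(attempt));
    }
  }
  throw lastError;
}

// Serialized collection: one request at a time, each wrapped in retry.
async function collectSerially<T>(
  ciks: string[],
  fetchOne: (cik: string) => Promise<T>,
): Promise<T[]> {
  const results: T[] = [];
  for (const cik of ciks) {
    results.push(await withRetry(() => fetchOne(cik)));
  }
  return results;
}
```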

Too many alerts. 1,326 open alerts across 1,916 companies, with 65% of companies triggering at least one. The biggest driver: 960 price alerts for companies near their 52-week low. That’s just the Russell 2000 being the Russell 2000. Not wrong, but noisy. The fix, not yet implemented, is compound alerting: a price-only warning shouldn’t fire unless at least one other signal is also weak.
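
Since compound alerting is not built yet, the following is only one way the rule could be expressed. The names and the "weak" threshold of 50 (borrowed from the Warning boundary) are hypothetical.

```typescript
type SignalName = "ssl" | "dns" | "web" | "jobs" | "sec" | "price";
type SubscoreMap = Partial<Record<SignalName, number | null>>;

// A price alert fires only when the price subscore is weak AND at least one
// other signal with data is weak too. Missing signals never count as weak.
function shouldFirePriceAlert(subscores: SubscoreMap, weakBelow = 50): boolean {
  const price = subscores.price;
  if (price === null || price === undefined || price >= weakBelow) return false;
  return (Object.keys(subscores) as SignalName[]).some((name) => {
    if (name === "price") return false;
    const value = subscores[name];
    return value !== null && value !== undefined && value < weakBelow;
  });
}
```

Under this rule a company near its 52-week low with otherwise healthy infrastructure stays quiet, while the same price combined with a weak SEC or SSL subscore raises an alert.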


The numbers

Six days of continuous collection:

| Metric | Value |
|---|---|
| Index | Russell 2000 |
| Companies monitored | 1,916 |
| Snapshots collected | 199,220 |
| Score records | 13,411 |
| Open alerts | 1,326 |
| Full collection time | ~3.5 hours |
| SSL failure rate | 8.6% |
| Codebase | ~4,100 lines |
| Time to first deploy | ~3 hours |
| Time to hardened production | ~36 hours |

What 4,100 lines covers

| File | Lines | Purpose |
|---|---|---|
| src/pages/index.astro | 486 | Leaderboard, DEFCON panel, filters |
| src/pages/about.astro | 366 | Methodology |
| src/pages/quality.astro | 361 | Data quality transparency |
| src/layouts/Layout.astro | 324 | Shell, nav, boot sequence, scanlines |
| src/domain-resolver.ts | 268 | SEC EDGAR domain resolution |
| src/collectors/jobs.ts | 268 | ATS detection, job analysis |
| src/collectors/sec.ts | 235 | EDGAR filings analysis |
| src/collectors/price.ts | 154 | Yahoo Finance, range + momentum |
| src/db.ts | 175 | Schema, migrations, queries |
| src/scorer.ts | 137 | Composite scoring, alerts |

No UI framework. No GraphQL. No message queue. No Redis.


What worked

Astro SSR with synchronous SQLite. Fast, simple, no hydration tax. Each page fetches data at request time from prepared statements. The entire frontend is server-rendered HTML and CSS.

XBRL namespaces for domain resolution. Companies embed their own domain in their SEC filings. Once we found this, it became the gold standard. Authoritative, machine-readable, public.

The persistent monitor log. A status line printed every two minutes during collection — company count, snapshot counts by collector, score totals. When something went wrong at 3am, the log told us exactly when and where.

What didn’t

Domain guessing without an authoritative source. Dozens of wrong domains before the XBRL resolver was built. Every wrong domain produced a wrong score. At Russell 2000 scale, naive approaches fail hard.

The jobs collector against companies with no public ATS. Some companies don’t post jobs publicly at all. The collector returns nothing, which we treat as neutral (excluded from composite) rather than zero. Still a blind spot.

Alert thresholds. 65% of the index triggering at least one alert is noise. Price near 52-week low is a normal condition for a large chunk of the Russell 2000 at any given time. Compound alerting — requiring signal convergence — would cut the noise dramatically. It’s the obvious next step.


Closing

The hardest problems here were never about code. They were about data — messy entity resolution across nearly two thousand companies, many of which have renamed themselves, merged via SPAC, or quietly let their digital infrastructure go dark. Those problems require judgment. You read the SEC filing, you check the DNS, you look at what the website actually serves, and you make a call.

Having an AI agent that sits inside the same environment as the database, the collectors, and the live site made that judgment loop fast. Find a bad domain, fix it, re-score, check the result. All without context-switching between local dev, staging, production. The whole system lives in one place.

The infrastructure signals are real. A company whose SSL has expired, whose careers page is empty, and whose last 10-K is overdue is telling you something — even if the market hasn’t noticed yet.

Whether DEADPOOL can surface that signal reliably enough to act on is still an open question. The system is honest about its limitations — that’s what the /quality page is for. But after 199,000 snapshots and six days of watching the Russell 2000, the patterns are there. The question is what to do with them.