Market Galaxy

An interactive graph of the S&P 500 — clusters, hubs, and hidden connections.

What this is

Market Galaxy is a relationship-first view of public equity markets. Most market tools show one company at a time — a price chart, a fundamentals page, a screener row. Market Galaxy treats the whole S&P 500 as a single living graph: companies are nodes, sectors form regions, and four kinds of relationships connect them.

The goal is to show you the shape of the market — which companies cluster, which sit at the center, which depend on whom — in a way a 500-row spreadsheet cannot.

Data sources

| Source | Used for | Refresh | Free-tier limits |
| --- | --- | --- | --- |
| yfinance (Yahoo Finance) | Company fundamentals, sector/industry, daily-close prices | Daily seed | Rate-limited and unofficial; never called at request time |
| SEC EDGAR | 10-K supplier/customer mentions, corporate actions | Weekly | ≤8 req/s with a descriptive User-Agent header |
| Wikipedia | S&P 500 constituent list | Weekly | Open data; attribution preserved |
| Yahoo + Google RSS | News co-mention edges (summaries only) | Hourly | Article bodies are NOT scraped (RSS summary text only) |
| Alpha Vantage | Historical price/volume backfill for replay tours | Backfill | Free key, 25 req/day |

All HTTP fetches go through a centralized client that sets the User-Agent header, enforces rate limits, keeps a persistent disk cache, and validates responses against zod schemas at the boundary.
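As a rough illustration of the rate-limiting piece only, here is a minimal sketch of a minimum-interval limiter. The class name `IntervalLimiter` and its `reserve` method are hypothetical and are not taken from the project's code; the 8 req/s figure comes from the SEC EDGAR row in the table above. The limiter does no I/O: it only tells the caller how long to wait, so it can sit in front of any fetch function.

```typescript
// Hypothetical helper (not the project's actual client): spaces request
// starts at least 1000 / maxPerSecond milliseconds apart.
class IntervalLimiter {
  private nextFreeMs = 0;
  private readonly gapMs: number;

  constructor(maxPerSecond: number) {
    this.gapMs = 1000 / maxPerSecond;
  }

  // Given the current time, reserve the next free slot and return how many
  // milliseconds the caller should wait before sending its request.
  reserve(nowMs: number): number {
    const startMs = Math.max(nowMs, this.nextFreeMs);
    this.nextFreeMs = startMs + this.gapMs;
    return startMs - nowMs;
  }
}

// Example: three requests arriving at the same instant under an 8 req/s cap
// are told to wait 0 ms, 125 ms, and 250 ms respectively.
const edgarLimiter = new IntervalLimiter(8);
const waits = [0, 0, 0].map((now) => edgarLimiter.reserve(now));
```

A caller would typically sleep for the returned delay (for example with `setTimeout`) before issuing the request; the cache lookup and schema validation steps described above would wrap around this.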

How edges are computed

Sector & industry edges

Two companies in the same GICS sector are connected by a sector edge; two in the same GICS industry are connected by an industry edge. Classification metadata comes from yfinance and is normalized into the canonical 11-sector palette.
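The grouping rule can be sketched as follows. This is an illustrative version under assumed names (`Company`, `GroupEdge`, `groupEdges` are hypothetical): it buckets tickers by their sector or industry label and emits one undirected edge per pair inside each bucket.

```typescript
interface Company {
  ticker: string;
  sector: string;
  industry: string;
}

interface GroupEdge {
  source: string;
  target: string;
  kind: "sector" | "industry";
}

// Build one edge for every pair of companies sharing the same label.
function groupEdges(companies: Company[], kind: "sector" | "industry"): GroupEdge[] {
  // Bucket tickers by the chosen classification field.
  const groups: Record<string, string[]> = {};
  for (const company of companies) {
    const key = company[kind];
    let members = groups[key];
    if (members === undefined) {
      members = [];
      groups[key] = members;
    }
    members.push(company.ticker);
  }

  // Emit each unordered pair once per bucket.
  const edges: GroupEdge[] = [];
  for (const key of Object.keys(groups)) {
    const members = groups[key];
    members.forEach((source, i) => {
      for (const target of members.slice(i + 1)) {
        edges.push({ source, target, kind });
      }
    });
  }
  return edges;
}
```

Note that a bucket of n companies yields n(n-1)/2 edges, so large sectors produce many edges; how the real renderer thins or bundles them is not described here.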

Price-correlation edges

Daily returns are computed from adjusted closing prices, then market-adjusted by subtracting the same-day SPY return (so we measure deviation from the broad market, not co-movement with it). Pearson correlation is computed on these adjusted returns over a 1-year rolling window. Each company keeps its top 10 correlations: strong enough to be informative, sparse enough to avoid a hairball.
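The pipeline described above can be sketched in a few small functions. This is a minimal illustration, not the project's implementation: the function names are hypothetical, the caller is assumed to pass in only the returns inside the 1-year window (roughly 252 trading days), and ranking by signed correlation rather than absolute value is an assumption, since the text does not say which is used.

```typescript
// Simple daily returns from a series of adjusted closes.
function dailyReturns(closes: number[]): number[] {
  const out: number[] = [];
  for (let i = 1; i < closes.length; i++) {
    out.push(closes[i] / closes[i - 1] - 1);
  }
  return out;
}

// Market-adjusted returns: stock return minus the SPY return for the same day.
function excessReturns(stock: number[], spy: number[]): number[] {
  return stock.map((r, i) => r - spy[i]);
}

// Pearson correlation coefficient; returns 0 when either series is constant.
function pearson(x: number[], y: number[]): number {
  const n = Math.min(x.length, y.length);
  let sumX = 0;
  let sumY = 0;
  for (let i = 0; i < n; i++) {
    sumX += x[i];
    sumY += y[i];
  }
  const meanX = sumX / n;
  const meanY = sumY / n;
  let cov = 0;
  let varX = 0;
  let varY = 0;
  for (let i = 0; i < n; i++) {
    const dx = x[i] - meanX;
    const dy = y[i] - meanY;
    cov += dx * dy;
    varX += dx * dx;
    varY += dy * dy;
  }
  const denom = Math.sqrt(varX * varY);
  return denom === 0 ? 0 : cov / denom;
}

// Keep the k most correlated peers for one ticker (assumption: highest signed r first).
function topCorrelations(
  ticker: string,
  excess: Record<string, number[]>,
  k: number = 10
): Array<{ peer: string; r: number }> {
  const own = excess[ticker];
  return Object.keys(excess)
    .filter((peer) => peer !== ticker)
    .map((peer) => ({ peer, r: pearson(own, excess[peer]) }))
    .sort((a, b) => b.r - a.r)
    .slice(0, k);
}
```

For a rolling window, the same functions would be re-run as each new trading day arrives, dropping the oldest return from each series.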

Supplier/customer edges

Public 10-K filings sometimes name specific suppliers or customers (often when one counterparty makes up >10% of revenue). We extract those mentions via regex from EDGAR-filed 10-Ks and treat them as directional supplier-of / customer-of edges. Coverage is partial: companies that don't disclose specific counterparties appear as "no supplier data yet." LLM-based extraction is explicitly out of v1 scope.
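To give a feel for what regex extraction of this kind involves, here is a small sketch. The pattern below is purely illustrative and is not the project's actual regex: it only catches one common phrasing ("<Name> accounted for approximately N% of our net sales"), and the names `CUSTOMER_PATTERN`, `CustomerMention`, and `extractCustomerMentions` are hypothetical. Mapping the captured company name to a ticker is a separate step that is not shown.

```typescript
interface CustomerMention {
  filer: string;        // ticker of the company whose 10-K is being read
  customerName: string; // counterparty name as written in the filing
  revenuePct: number;   // disclosed share of revenue or sales
}

// Illustrative pattern only: one or more capitalized words, followed by
// "accounted for [approximately] N% of [our] [net|total] revenue(s)/sales".
const CUSTOMER_PATTERN =
  /((?:[A-Z][\w&.'-]*)(?: [A-Z][\w&.'-]*)*) accounted for (?:approximately )?(\d+(?:\.\d+)?)% of (?:our )?(?:net |total )?(?:revenues?|sales)/g;

function extractCustomerMentions(filer: string, text: string): CustomerMention[] {
  const mentions: CustomerMention[] = [];
  CUSTOMER_PATTERN.lastIndex = 0; // reset state left over from earlier calls
  let match: RegExpExecArray | null;
  while ((match = CUSTOMER_PATTERN.exec(text)) !== null) {
    mentions.push({
      filer,
      customerName: match[1],
      revenuePct: parseFloat(match[2]),
    });
  }
  return mentions;
}
```

Each mention would then become a directed edge from the filer (the supplier) to the named customer. A single pattern like this misses many phrasings, which is consistent with the partial coverage noted above.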

News co-mention edges

When two companies appear in the same news article we record a co-mention observation. Articles that mention more than 5 tickers are excluded (too generic). At least 3 supporting articles are required before an edge surfaces. Time decay de-emphasizes older co-mentions. We use RSS summaries only, never full article bodies.
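The filtering rules above can be sketched as a single aggregation pass. This is an illustrative version with hypothetical names (`Article`, `CoMentionEdge`, `coMentionEdges`). The text only says that time decay is applied, so the exponential half-life form and the 30-day default used here are assumptions, not the project's actual parameters.

```typescript
interface Article {
  tickers: string[];   // tickers detected in the RSS summary
  publishedMs: number; // publication time, epoch milliseconds
}

interface CoMentionEdge {
  a: string;
  b: string;
  articles: number; // count of supporting articles
  weight: number;   // decay-weighted count
}

function coMentionEdges(
  articles: Article[],
  nowMs: number,
  halfLifeDays: number = 30, // assumed value, not from the source
  maxTickers: number = 5,
  minArticles: number = 3
): CoMentionEdge[] {
  const DAY_MS = 86400000;
  const byPair: Record<string, CoMentionEdge> = {};

  for (const article of articles) {
    // Sort and de-duplicate so each pair gets one stable key per article.
    const tickers = article.tickers
      .slice()
      .sort()
      .filter((t, i, arr) => i === 0 || arr[i - 1] !== t);
    // Skip articles with no pair, or with too many tickers (too generic).
    if (tickers.length < 2 || tickers.length > maxTickers) continue;

    // Assumed decay: the weight halves every `halfLifeDays`.
    const ageDays = Math.max(0, (nowMs - article.publishedMs) / DAY_MS);
    const decay = Math.pow(0.5, ageDays / halfLifeDays);

    for (let i = 0; i < tickers.length; i++) {
      for (let j = i + 1; j < tickers.length; j++) {
        const key = tickers[i] + "|" + tickers[j];
        let edge = byPair[key];
        if (edge === undefined) {
          edge = { a: tickers[i], b: tickers[j], articles: 0, weight: 0 };
          byPair[key] = edge;
        }
        edge.articles += 1;
        edge.weight += decay;
      }
    }
  }

  // Only pairs with enough supporting articles surface as edges.
  return Object.keys(byPair)
    .map((key) => byPair[key])
    .filter((edge) => edge.articles >= minArticles);
}
```

Keeping the raw article count separate from the decayed weight lets the minimum-support rule stay a simple integer threshold while the weight drives visual emphasis.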

Known limitations

About

Built by Ryan Hoitt as a public showcase of what modern WebGL plus a relationship-first data model can do for finance-curious users. The galaxy is the product — open the demo and explore.
