Market Galaxy
An interactive graph of the S&P 500 — clusters, hubs, and hidden connections.
What this is
Market Galaxy is a relationship-first view of public equity markets. Most market tools show one company at a time — a price chart, a fundamentals page, a screener row. Market Galaxy treats the whole S&P 500 as a single living graph: companies are nodes, sectors form regions, and four kinds of relationships connect them.
The goal is to show you the shape of the market — which companies cluster, which sit at the center, which depend on whom — in a way a 500-row spreadsheet cannot.
Data sources
| Source | Used for | Refresh | Free-tier limits |
|---|---|---|---|
| yfinance (Yahoo Finance) | Company fundamentals, sector/industry, daily-close prices | Daily seed | Rate-limited and unofficial; never called at request time |
| SEC EDGAR | 10-K supplier/customer mentions, corporate actions | Weekly | ≤8 req/s with a descriptive User-Agent header |
| Wikipedia | S&P 500 constituent list | Weekly | Open data; attribution preserved |
| Yahoo + Google RSS | News co-mention edges (summaries only) | Hourly | Article bodies are NOT scraped (RSS summary text only) |
| Alpha Vantage | Historical price/volume backfill for replay tours | Backfill | Free key, 25 req/day |
All HTTP fetches go through a centralized client that sets a descriptive User-Agent, enforces per-source rate limits, keeps a persistent disk cache, and validates every response against a zod schema at the boundary.
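As a rough illustration of the rate-limiting piece only, here is a minimal sketch of a minimum-interval limiter. The class name `MinIntervalLimiter` and its `reserve` method are hypothetical and not taken from the project. The disk cache, User-Agent handling, and zod validation are left out, and the clock is passed in so the logic stays easy to test.

```typescript
// Hypothetical sketch: spaces requests so that at most `maxPerSecond`
// are dispatched per second. Not the project's actual client.
class MinIntervalLimiter {
  private readonly intervalMs: number;
  private nextFreeMs = 0;

  constructor(maxPerSecond: number) {
    if (maxPerSecond <= 0) throw new Error("maxPerSecond must be positive");
    this.intervalMs = 1000 / maxPerSecond;
  }

  // Returns the earliest timestamp (ms) at which a request arriving at
  // `nowMs` may be sent, and reserves that slot for it.
  reserve(nowMs: number): number {
    const slot = Math.max(nowMs, this.nextFreeMs);
    this.nextFreeMs = slot + this.intervalMs;
    return slot;
  }
}

// Example: the 8 req/s EDGAR budget from the table gives 125 ms spacing.
const edgarLimiter = new MinIntervalLimiter(8);
const firstSlot = edgarLimiter.reserve(0);  // sent immediately
const secondSlot = edgarLimiter.reserve(0); // delayed by one interval
```

A caller would wait until the returned slot before issuing the request. Reserving slots up front keeps bursts from exceeding the budget even when many fetches are queued at once.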
How edges are computed
Sector & industry edges
Two companies in the same GICS sector are connected by a sector edge; two in the same GICS industry are connected by an industry edge. Classification metadata comes from yfinance and is normalized into the canonical 11-sector palette.
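The grouping rule can be sketched as follows. This is an assumption-laden illustration: the `Company` and `Edge` shapes, the `groupEdges` function, and the tickers are all made up, and the sketch emits every same-group pair without the pruning a real 500-node graph would probably need.

```typescript
// Hypothetical shapes; the project's real types are not shown in this document.
interface Company { ticker: string; sector: string; industry: string; }
type EdgeKind = "sector" | "industry";
interface Edge { source: string; target: string; kind: EdgeKind; }

// Connects every pair of companies that share the same sector (or industry).
function groupEdges(companies: Company[], kind: EdgeKind): Edge[] {
  const groups: Record<string, string[]> = {};
  for (const c of companies) {
    const key = kind === "sector" ? c.sector : c.industry;
    let members: string[] | undefined = groups[key];
    if (!members) {
      members = [];
      groups[key] = members;
    }
    members.push(c.ticker);
  }

  const edges: Edge[] = [];
  for (const key of Object.keys(groups)) {
    const members = groups[key];
    // Emit each unordered pair once (i < j).
    for (let i = 0; i < members.length; i++) {
      for (let j = i + 1; j < members.length; j++) {
        edges.push({ source: members[i], target: members[j], kind });
      }
    }
  }
  return edges;
}

// Made-up companies for illustration only.
const sample: Company[] = [
  { ticker: "AAA", sector: "Energy", industry: "Oil" },
  { ticker: "BBB", sector: "Energy", industry: "Oil" },
  { ticker: "CCC", sector: "Energy", industry: "Gas" },
];
const sectorEdges = groupEdges(sample, "sector");
const industryEdges = groupEdges(sample, "industry");
```

A sector with n members yields n(n-1)/2 sector edges, so a renderer would likely thin or bundle them. That step is outside this sketch.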
Price-correlation edges
Daily returns are computed from adjusted closing prices, then market-adjusted by subtracting the same-day SPY return, so the correlation captures co-movement beyond the broad market rather than shared exposure to it. Pearson correlation is computed over a rolling 1-year window. Each company keeps only its top 10 correlations: strong enough to be informative, sparse enough to avoid a hairball.
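The pipeline described above might look roughly like this. It is a sketch under stated assumptions: the function names are hypothetical, the caller is assumed to pass in series already trimmed to the 1-year window, and "top" is taken to mean the highest signed correlation (ranking by absolute value would be an equally plausible reading).

```typescript
type ReturnSeries = Record<string, number[]>;

// Simple daily returns from a series of adjusted closes.
function dailyReturns(closes: number[]): number[] {
  const out: number[] = [];
  for (let i = 1; i < closes.length; i++) out.push(closes[i] / closes[i - 1] - 1);
  return out;
}

// Market-adjusted returns: stock return minus benchmark (SPY) return, day by day.
function excessReturns(stock: number[], benchmark: number[]): number[] {
  const n = Math.min(stock.length, benchmark.length);
  const out: number[] = [];
  for (let i = 0; i < n; i++) out.push(stock[i] - benchmark[i]);
  return out;
}

// Pearson correlation; returns NaN when it is undefined (too short or zero variance).
function pearson(x: number[], y: number[]): number {
  const n = Math.min(x.length, y.length);
  if (n < 2) return NaN;
  let sx = 0, sy = 0;
  for (let i = 0; i < n; i++) { sx += x[i]; sy += y[i]; }
  const mx = sx / n, my = sy / n;
  let sxy = 0, sxx = 0, syy = 0;
  for (let i = 0; i < n; i++) {
    const dx = x[i] - mx, dy = y[i] - my;
    sxy += dx * dy; sxx += dx * dx; syy += dy * dy;
  }
  const denom = Math.sqrt(sxx * syy);
  return denom === 0 ? NaN : sxy / denom;
}

// Keeps the k highest correlations for one ticker (assumption: signed ranking).
function topCorrelations(
  ticker: string, excess: ReturnSeries, k: number
): Array<{ peer: string; r: number }> {
  const own = excess[ticker];
  if (own === undefined) return [];
  const scored: Array<{ peer: string; r: number }> = [];
  for (const peer of Object.keys(excess)) {
    if (peer === ticker) continue;
    const r = pearson(own, excess[peer]);
    if (isFinite(r)) scored.push({ peer, r });
  }
  scored.sort((a, b) => b.r - a.r);
  return scored.slice(0, k);
}
```

With `k` set to 10 this matches the per-company cap in the text. Because each company picks its own top list, an edge can exist in one direction only, and how that asymmetry is resolved is not specified here.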
Supplier/customer edges
Public 10-K filings sometimes name specific suppliers or customers (often when one counterparty makes up >10% of revenue). We extract those mentions via regex from EDGAR-filed 10-Ks and treat them as directional supplier-of / customer-of edges. The coverage is partial — companies that don't disclose specific counterparties appear as "no supplier data yet." LLM-based extraction is explicitly out of v1 scope.
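To make the regex approach concrete, here is one illustrative pattern. The document does not list the project's actual patterns, so the regex, the `extractCustomers` function, and the sample sentences with fictional company names are all assumptions. The pattern only handles the simple phrasing "Name accounted for N% of revenue" and would miss many real disclosures.

```typescript
// Hypothetical output shape: the filer supplies the named customer.
interface CustomerMention { supplier: string; customer: string; percentOfRevenue: number; }

// Illustrative pattern only: a run of capitalized words, then
// "accounted for [approximately] N% of [our] [total|net] revenue(s)|sales".
const CUSTOMER_PATTERN =
  /\b((?:[A-Z][A-Za-z0-9&.]*\s)+?)accounted for (?:approximately )?(\d{1,3}(?:\.\d+)?)% of (?:our )?(?:total |net )?(?:revenues?|sales)/g;

function extractCustomers(filer: string, text: string): CustomerMention[] {
  const mentions: CustomerMention[] = [];
  CUSTOMER_PATTERN.lastIndex = 0; // reset state on the shared global regex
  let m: RegExpExecArray | null;
  while ((m = CUSTOMER_PATTERN.exec(text)) !== null) {
    mentions.push({
      supplier: filer,
      customer: m[1].trim(),
      percentOfRevenue: parseFloat(m[2]),
    });
  }
  return mentions;
}

// Fictional filing excerpt for illustration.
const excerpt =
  "Sales to Acme Widgets accounted for approximately 14% of our net revenue in fiscal 2023. " +
  "Globex Industries accounted for 11% of total sales.";
const found = extractCustomers("FILER", excerpt);
```

Each hit maps to a directional pair: the filer is supplier-of the named company, and the named company is customer-of the filer. Resolving the captured name to an S&P 500 ticker is a separate step that this sketch does not attempt.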
News co-mention edges
When two companies appear in the same news item's RSS summary, we record a co-mention observation. Items that mention more than 5 tickers are excluded as too generic. An edge surfaces only once at least 3 articles support it, and time decay down-weights older co-mentions. We use RSS summaries only, never full article bodies.
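The filtering and weighting rules can be sketched like this. The 5-ticker cap and the 3-article minimum come from the text. Everything else is an assumption: the `Article` shape, the function name, and especially the exponential decay with a 30-day half-life, since the document does not state which decay function or constant is used.

```typescript
interface Article { id: string; publishedAtMs: number; tickers: string[]; }
interface CoMentionEdge { a: string; b: string; articles: number; weight: number; }

const MAX_TICKERS = 5;     // from the text: items with more than 5 tickers are skipped
const MIN_ARTICLES = 3;    // from the text: minimum supporting articles
const HALF_LIFE_DAYS = 30; // hypothetical decay constant
const MS_PER_DAY = 86400000;

function coMentionEdges(articles: Article[], nowMs: number): CoMentionEdge[] {
  const acc: Record<string, CoMentionEdge> = {};
  for (const art of articles) {
    // De-duplicate and sort tickers so each pair gets one stable key.
    const tickers = art.tickers.filter((t, i, all) => all.indexOf(t) === i).sort();
    if (tickers.length < 2 || tickers.length > MAX_TICKERS) continue;

    // Exponential decay: an article HALF_LIFE_DAYS old counts half as much.
    const ageDays = Math.max(0, (nowMs - art.publishedAtMs) / MS_PER_DAY);
    const decay = Math.pow(0.5, ageDays / HALF_LIFE_DAYS);

    for (let i = 0; i < tickers.length; i++) {
      for (let j = i + 1; j < tickers.length; j++) {
        const key = tickers[i] + "|" + tickers[j];
        let edge: CoMentionEdge | undefined = acc[key];
        if (!edge) {
          edge = { a: tickers[i], b: tickers[j], articles: 0, weight: 0 };
          acc[key] = edge;
        }
        edge.articles += 1;
        edge.weight += decay;
      }
    }
  }
  // Only pairs with enough supporting articles surface as edges.
  return Object.keys(acc).map((k) => acc[k]).filter((e) => e.articles >= MIN_ARTICLES);
}
```

In this sketch the article count gates visibility while the decayed weight could drive edge thickness, so a pair with three old articles still appears but renders faintly.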
Known limitations
- S&P 500 only. Other universes (Russell, international, private) are out of scope for v1.
- Daily-close prices only. Free APIs don't reliably support intraday or real-time streams.
- Partial supplier coverage. Only companies that explicitly disclose counterparties in 10-Ks appear. Major firms with privacy-preserving disclosure read as "no supplier data yet."
- Scheduled ingest, not real time. Fundamentals refresh once a day, news refreshes hourly, and SEC filings refresh weekly, so some lag is intentional.
- No price predictions. Market Galaxy is a relationship-intelligence tool, not a forecaster. Every edge is evidence-based, not opinion-based.
About
Built by Ryan Hoitt as a public showcase of what modern WebGL plus a relationship-first data model can do for finance-curious users. The galaxy is the product — open the demo and explore.