I spent a decade reading markets.
Then I built the tools to analyse them properly.
A decade of trading teaches you to be wrong quickly and update. That turns out to be the same discipline that makes data analysis useful rather than decorative - knowing which question is worth asking, holding it lightly, and changing your view when the data says to.
MSc Economics from the University of Copenhagen (2023-2025), following a BSc in Economics and ten years as an independent trader and analyst. Financial markets are the domain I built in; the underlying skills transfer wherever data is complex and voluminous.
Every data system has a label problem. Official categories describe what a company was, not what it is exposed to right now. A gaming company pivots its treasury to Ethereum, yet it still says "Electronic Gaming" in every database. A photonics maker, a memory chip supplier, a data-centre cooling firm, and a power equipment company all start moving together: every standard classification scheme places them in different sectors, but the same underlying bet on AI infrastructure drives all four.
Industry labels like GICS and SIC are slow to change by design. But risk doesn't wait. When a company's actual exposure diverges from its label, standard tools miss it - and so does anyone relying on them.
The platform is built around one question: what is this company actually, right now? It reclassifies 5,200 US stocks by real risk exposure, extracted from regulatory filings by AI. Unusual price moves are logged daily and processed to surface the narratives that explain why stocks move together, and when that changes.
GameSquare is labelled an esports gaming company - reasonable on the surface. But the platform read its regulatory filing and found something the label misses: the company holds Ethereum as a treasury asset, meaning its stock moves with crypto prices, not just gaming ad spend. No standard classification captures both at once.
The raw classification data shows confidence at 0.72 - the system flagging that this company doesn't fit cleanly into one category. The price drivers were extracted directly from Item 1A of the 10-K filing. The rationale explains why: dual exposure to both gaming fundamentals and ETH price complicates any single label.
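
As a rough illustration of what such a record might look like, here is a minimal sketch. The field names, the `fits_single_label` helper, and the 0.8 cut-off are all assumptions made for this example; only the values (the 0.72 confidence, the two price drivers, and the rationale) come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Classification:
    """One reclassification record. Field names are hypothetical."""
    company: str
    official_label: str
    confidence: float  # 0..1, lower means a weaker fit to a single category
    price_drivers: list[str] = field(default_factory=list)
    rationale: str = ""

    def fits_single_label(self, threshold: float = 0.8) -> bool:
        # The 0.8 cut-off is an assumption chosen for illustration only.
        return self.confidence >= threshold


gamesquare = Classification(
    company="GameSquare",
    official_label="esports gaming",
    confidence=0.72,
    price_drivers=["gaming ad spend", "ETH price"],
    rationale="Dual exposure to gaming fundamentals and ETH price "
              "complicates any single label.",
)

print(gamesquare.fits_single_label())  # False: 0.72 is below the assumed cut-off
```

Keeping the confidence and rationale next to the label means a low-confidence record can be surfaced for a closer read instead of being forced into one bucket.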
Every stock is scored daily against its own history. When something moves far outside its normal range, it rises to the top - filterable by sector and company size.
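
One common way to score a stock against its own history is a z-score of today's return relative to that stock's past returns. The sketch below assumes that approach; the platform's actual scoring method is not specified here, and the tickers, returns, and threshold of 3.0 are hypothetical.

```python
from statistics import mean, stdev


def anomaly_score(history, today):
    """Z-score of today's return against the stock's own past returns."""
    sigma = stdev(history)
    if sigma == 0:
        return 0.0  # a flat history gives no scale to measure against
    return (today - mean(history)) / sigma


def rank_anomalies(returns_by_ticker, threshold=3.0):
    """Return (ticker, score) pairs beyond the threshold, largest move first."""
    flagged = []
    for ticker, returns in returns_by_ticker.items():
        *history, today = returns
        if len(history) < 2:
            continue  # stdev needs at least two past observations
        score = anomaly_score(history, today)
        if abs(score) >= threshold:
            flagged.append((ticker, score))
    return sorted(flagged, key=lambda pair: abs(pair[1]), reverse=True)


# Hypothetical daily returns; the last value in each list is today's move.
sample_returns = {
    "AAA": [0.01, -0.01, 0.01, -0.01, 0.10],
    "BBB": [0.02, -0.02, 0.02, -0.02, 0.01],
}

flagged = rank_anomalies(sample_returns)
print([ticker for ticker, _ in flagged])
```

Scoring each stock on its own scale means a 3% move in a normally quiet stock can outrank a 6% move in a volatile one.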
Every anomalous move is logged with an AI-generated summary and keyword tags - capturing what drove it, and which narrative thread it belongs to. The trail stays readable weeks later.
Co-moving stocks are clustered into evolving themes. POET joins 20+ tickers inside "AI & Energy Infrastructure Buildout" - a narrative Claude has versioned eight times since March.
Open any narrative to see the full thesis, candlestick charts for every member stock, and the event that pulled each one in - readable long after the move has passed. What the system shows isn't just that POET moved - it's that it moved as part of a theme that had been building for weeks. That context is the analysis.
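
The grouping step can be sketched in a few lines. This version assumes that "co-moving" means a high pairwise Pearson correlation of daily returns and that groups are formed by linking correlated pairs; the platform's real clustering method is not described here, and the tickers, returns, and 0.8 threshold are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length return series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def co_moving_groups(returns, min_corr=0.8):
    """Link tickers whose correlation clears min_corr, then return the groups."""
    parent = {ticker: ticker for ticker in returns}

    def find(ticker):
        # Walk up to the group's representative, shortening the path as we go.
        while parent[ticker] != ticker:
            parent[ticker] = parent[parent[ticker]]
            ticker = parent[ticker]
        return ticker

    for a, b in combinations(returns, 2):
        if pearson(returns[a], returns[b]) >= min_corr:
            parent[find(a)] = find(b)

    groups = {}
    for ticker in returns:
        groups.setdefault(find(ticker), []).append(ticker)
    return [sorted(members) for members in groups.values() if len(members) > 1]


# Hypothetical daily returns for three tickers.
sample = {
    "AAA": [0.01, 0.02, -0.01, 0.03],
    "BBB": [0.02, 0.04, -0.02, 0.06],
    "CCC": [-0.01, 0.00, 0.02, -0.03],
}

groups = co_moving_groups(sample)
print(groups)
```

Re-running the grouping on a rolling window is one way a theme could gain or lose members over time, which is what makes versioning a narrative meaningful.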
When the screener flags something, the question I actually want to answer isn't "what moved?" - I already know that. It's "what do I know about this company, and what is its peer group saying?" Answering that properly means reading: earnings releases, call transcripts, management commentary on the specific thing that matters - margins, bookings, certification milestones. I was doing that manually. The problem compounds when the question is cross-company: "what are battery companies saying about production ramp in Q1 2026?" isn't answerable from a screener.
This system changes that. Earnings releases and transcripts are ingested into a vector database, then retrieved, reranked, and synthesised by a 4-layer pipeline. The classifier from P-01 provides the routing layer: when a question arrives, the system finds the right cluster of companies first, then retrieves from the accumulated filings for those companies specifically. Company knowledge compounds over time; a spike on BW today draws on every quarterly filing the system has seen.
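
The route-then-retrieve idea can be illustrated with a toy example. Everything here is an assumption made for the sketch: the `Chunk` type, the two-dimensional vectors standing in for real embeddings, the cluster names, and the use of cluster centroids for routing. It covers only routing and retrieval; reranking and synthesis are left out.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Chunk:
    ticker: str
    text: str
    vector: list[float]  # stand-in for a real embedding


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def route(query_vector, centroids):
    """Pick the cluster whose centroid is closest to the query."""
    return max(centroids, key=lambda name: cosine(query_vector, centroids[name]))


def retrieve(query_vector, chunks, members, centroids, k=3):
    """Route first, then rank only chunks from companies in that cluster."""
    allowed = members[route(query_vector, centroids)]
    candidates = [c for c in chunks if c.ticker in allowed]
    candidates.sort(key=lambda c: cosine(query_vector, c.vector), reverse=True)
    return candidates[:k]


# Hypothetical clusters, companies, and filing excerpts.
centroids = {"batteries": [1.0, 0.0], "gaming": [0.0, 1.0]}
members = {"batteries": {"BAT1", "BAT2"}, "gaming": {"GAME1"}}
chunks = [
    Chunk("BAT1", "Production ramp on track for Q1.", [0.9, 0.1]),
    Chunk("BAT2", "Cell yields improved this quarter.", [0.6, 0.4]),
    Chunk("GAME1", "We ramped production of new titles.", [0.95, 0.05]),
]

results = retrieve([1.0, 0.0], chunks, members, centroids)
print([c.ticker for c in results])
```

The `GAME1` excerpt is the closest vector to the query, but routing excludes it because the company sits in the wrong cluster. That is the point of finding the peer group before searching the text.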
Every change was tested against 38 queries with known correct answers. The table shows what each technique added - and one case where adding more made things worse.
| What was tested | Right companies found | Relevant results returned | Change vs. baseline (percentage points) |
|---|---|---|---|
| Baseline (no optimisation) | 88.9% | 30.0% | - |
| + re-ranking layer | 88.9% | 33.3% | +3.3 |
| + query expansion via LLM | 100.0% | 38.9% | +8.9 |
| Both together ★ (final system) | 100.0% | 41.9% | +11.9 |
| Query expansion at two stages | 100.0% | 41.4% | +11.4 (−0.5 vs. final system) |
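
A small harness for this kind of test might look like the sketch below. It assumes "right companies found" is the share of queries whose expected companies all appear in the results, and "relevant results returned" is the average share of returned chunks judged relevant. Both definitions are assumptions for illustration, and the sample runs are made up rather than drawn from the 38 real queries.

```python
def evaluate(runs):
    """Return (company_hit_rate, mean_precision) over a list of query runs.

    Each run is a dict with:
      retrieved: list of (company, chunk_id) pairs the system returned
      expected_companies: set of companies a correct answer must draw on
      relevant_chunks: set of chunk ids judged relevant
    """
    if not runs:
        raise ValueError("need at least one query run")
    hits = 0
    precisions = []
    for run in runs:
        companies = {company for company, _ in run["retrieved"]}
        if run["expected_companies"] <= companies:
            hits += 1  # every expected company was found
        returned = [chunk_id for _, chunk_id in run["retrieved"]]
        relevant = sum(1 for cid in returned if cid in run["relevant_chunks"])
        precisions.append(relevant / len(returned) if returned else 0.0)
    return hits / len(runs), sum(precisions) / len(runs)


# Hypothetical runs for two queries.
sample_runs = [
    {
        "retrieved": [("BAT1", "c1"), ("BAT2", "c2"), ("GAME1", "c3")],
        "expected_companies": {"BAT1", "BAT2"},
        "relevant_chunks": {"c1"},
    },
    {
        "retrieved": [("GAME1", "c4")],
        "expected_companies": {"GAME2"},
        "relevant_chunks": {"c9"},
    },
]

hit_rate, precision = evaluate(sample_runs)
print(f"{hit_rate:.1%} companies found, {precision:.1%} relevant")
```

Running the same fixed query set after every change is what makes a small regression, like the two-stage expansion row, visible at all.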
Spent a decade developing analytical depth independently - then formalised it with an MSc. Now looking for a role where that depth gets sharpened through collaboration and insights actually reach the people making decisions.