Deerfield Green — Prototypes

§ AUGUR / NETWORK
Augur

Polymarket prediction market graph analyzing cross-domain contagion, correlated price movements, capital flows, and whale conviction patterns across 1,800+ markets.

Stack: Memgraph · Python · FastAPI · Next.js · D3.js · Tailwind CSS
- Graph: Memgraph stores 6 node types (Market, Trade, User, Event, Tag, Outcome) with statistical overlays (CO_TRADED, CORRELATED, CAPITAL_FLOW, WHALE_LINKED).
- Time Series: 4-hour price bucketing for Pearson correlation; Granger causality lag detection across markets.
- Anomaly Detection: Mispricing identification where correlated markets diverge in price; whale clustering as outlier behavior.
1. Granger causality inference to identify which markets lead others, with prediction error scoring for information leakage.
2. Real-time streaming pipeline (Kafka to Memgraph to WebSocket) replacing static data snapshots.
3. Arbitrage exploit detection integrating AMM liquidity curves with slippage estimates and fill simulation.
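
The 4-hour bucketing and Pearson step described above might look roughly like the sketch below. It assumes trades arrive as `(unix_timestamp, price)` pairs and that each bucket keeps its last traded price; the function names are illustrative and are not taken from the Augur codebase.

```python
from math import sqrt

BUCKET_SECONDS = 4 * 60 * 60  # 4-hour buckets, as described above


def bucket_prices(trades):
    """Map (unix_ts, price) pairs to {bucket_start: last price in that bucket}."""
    buckets = {}
    for ts, price in sorted(trades):
        # Assumption: the latest trade in a bucket represents the bucket.
        buckets[ts - ts % BUCKET_SECONDS] = price
    return buckets


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; None if undefined."""
    n = len(xs)
    if n < 2 or n != len(ys):
        return None
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return None  # a flat series has no defined correlation
    return cov / sqrt(vx * vy)


def market_correlation(trades_a, trades_b):
    """Correlate two markets over the 4-hour buckets they share."""
    a, b = bucket_prices(trades_a), bucket_prices(trades_b)
    shared = sorted(a.keys() & b.keys())
    return pearson([a[k] for k in shared], [b[k] for k in shared])
```

Restricting the comparison to shared buckets avoids inventing prices for periods in which one market did not trade; forward-filling would be a reasonable alternative.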
Open Augur →
§ BAROS / INDEX
Baros

Composite crisis-peace index fusing 62 Polymarket geopolitical prediction markets with Geopolitical Risk (GPR) index indicators to surface emerging geopolitical risk before traditional indicators react.

Stack: ClickHouse · Kafka · Python · FastAPI · Next.js · Tailwind CSS
- Time Series: ClickHouse stores intraday market snapshots; composite index is a forecastable time series (ARIMA, Prophet, or LLM-based).
- Ensemble Methods: Weighted aggregation of Polymarket sentiment and GPR indicators; weights tunable per geopolitical zone.
- NLP: Market descriptions parsed for geopolitical keywords (sanctions, military, border); LLM-based semantic risk scoring.
1. Multi-scale time-series decomposition (STL/wavelet) to separate trend, seasonality, and anomalies across historical geopolitical events.
2. Regional sub-indices (Middle East, South China Sea, Eastern Europe) with separate weighting schemes.
3. LLM-powered news-to-risk pipeline feeding Reuters/AP newswire through event extraction and auto-tagging markets.
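
One way to picture the per-zone weighted aggregation is the sketch below. It assumes both inputs are already scaled to the range 0 to 1, that market sentiment is the mean implied probability of the crisis outcome, and that a single weight per zone sets the market share of the blend. The zone names and weight values are hypothetical.

```python
def market_sentiment(crisis_probabilities):
    """Mean implied probability across a zone's markets (assumed definition)."""
    probs = list(crisis_probabilities)
    if not probs:
        raise ValueError("need at least one market probability")
    return sum(probs) / len(probs)


def composite_index(market_signal, gpr_signal, zone, zone_weights, default_weight=0.5):
    """Blend two signals in [0, 1]; the zone weight is the market share."""
    w = zone_weights.get(zone, default_weight)
    if not 0.0 <= w <= 1.0:
        raise ValueError(f"weight for {zone!r} must lie in [0, 1], got {w}")
    return w * market_signal + (1.0 - w) * gpr_signal


# Hypothetical weights: lean on markets where they are liquid.
ZONE_WEIGHTS = {"middle_east": 0.7, "eastern_europe": 0.6}
```

Keeping the weights in a plain mapping makes them easy to tune per zone without touching the blending logic.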
Open Baros →
§ PALIMPSEST / ANALYTICS
Palimpsest

SEC 10-K knowledge graph overlay for FANG equities, extracting 4.4M structured triples from filings to expose risk disclosure patterns, strategic shifts, and hidden supply chain dependencies across companies and years.

Stack: Neo4j · Python · HuggingFace Datasets · FinReflectKG · Next.js · Tailwind CSS
- NLP: LLM-based information extraction converts unstructured 10-K prose into (subject, predicate, object) triples; NER identifies RISK_FACTOR, FIN_METRIC, PRODUCT, SEGMENT entities.
- Graph: Neo4j queries compute entity centrality, risk clustering, and temporal evolution of disclosure patterns across filings.
- Vector: Jaccard similarity of risk sets between FANG companies; entity density scoring reveals how much explanation companies devote to specific risks.
1. Cross-FANG comparative analysis adding META, GOOGL, NFLX, AMZN with side-by-side risk profiles and asymmetric exposure detection.
2. Causal inference linking extracted risks to stock price movements via causal forest or Granger causality.
3. Supply chain reconstruction from entity-link predicates to identify single-point-of-failure suppliers across multiple 10-Ks.
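
The Jaccard comparison of risk sets can be illustrated with a few lines of Python. The sketch assumes each company's risks have already been normalised to a set of labels; the tickers and labels in the usage example are placeholders, not extracted data.

```python
from itertools import combinations


def jaccard(a, b):
    """|A intersect B| / |A union B|; defined as 0.0 when both sets are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_risk_similarity(risks_by_company):
    """Jaccard score for every unordered pair of companies."""
    return {
        (x, y): jaccard(risks_by_company[x], risks_by_company[y])
        for x, y in combinations(sorted(risks_by_company), 2)
    }


# Placeholder inputs for illustration only.
example = {
    "AAA": {"fx", "supply_chain", "regulation"},
    "BBB": {"regulation", "fx", "competition"},
}
scores = pairwise_risk_similarity(example)  # {("AAA", "BBB"): 0.5}
```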
Open Palimpsest →
§ PELAGOS / RISK MODEL
Pelagos

Maritime supply chain disruption risk platform ingesting AIS vessel tracking, port congestion signals, and Polymarket logistics markets to identify emerging disruptions before they impact earnings.

Stack: Neo4j · ClickHouse · Bytewax · Kafka · Python · FastAPI · Next.js · Deck.gl · Maplibre · Tailwind CSS
- Geospatial ML: Bytewax processes streaming AIS data with Rtree spatial indexing and Shapely geofencing for route anomaly detection.
- Time Series: Vessel speed/heading over time; port occupancy forecasting; statistical process control on dwell time and queue length.
- Graph: Neo4j encodes fleet ownership networks, inter-port shipping patterns, and vessel operational constraints to detect bottleneck routes.
1. Predictive port queue models using Prophet/LSTM to forecast occupancy and alert on projected 7+ day queues in high-margin corridors.
2. Commodity supply chain impact simulation tracing electronics, oil, and rare earths through the graph with disruption scenario modeling.
3. Multi-modal fusion integrating satellite imagery for port congestion estimation with AIS spoofing detection classifiers.
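
Statistical process control on dwell time could be as simple as a Shewhart-style check against a baseline window. The sketch below assumes dwell times in hours and the conventional three-sigma limits; the baseline values are invented for the example.

```python
from statistics import mean, stdev


def control_limits(baseline, k=3.0):
    """Lower and upper control limits: baseline mean plus or minus k sigma."""
    centre = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation; needs 2+ points
    return centre - k * sigma, centre + k * sigma


def out_of_control(observations, baseline, k=3.0):
    """Return (index, value) for observations outside the control limits."""
    lower, upper = control_limits(baseline, k)
    return [(i, x) for i, x in enumerate(observations) if x < lower or x > upper]


# Invented dwell times (hours) for a single port.
baseline_hours = [20, 22, 21, 19, 20, 22, 21, 20]
flags = out_of_control([21, 23, 30, 18], baseline_hours)  # only the 30 h stay is flagged
```

The same two functions apply unchanged to queue length; only the baseline window differs.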
Open Pelagos →
§ YATAGARASU / PLATFORM
Yatagarasu

LangGraph agent swarm for Japan-focused VC that sources, researches, and scores startup deals across 7 weighted dimensions using parallel LLM agents, web search, and vector memory to output ranked deal lists with investment memos.

Stack: LangGraph · Novita AI · Qdrant · ClickHouse · Tavily · Perplexity · Python · FastAPI · Next.js · ECharts · Tailwind CSS
- Agentic: LangGraph orchestrates 7 parallel scoring agents (market, team, product, traction, syndication, risk, japan_fit) with dynamic tool routing.
- LLM: Each dimension agent uses chain-of-thought scoring with JSON schema constraints; output formatter generates natural-language investment memos.
- RAG: Qdrant stores past research for recall; Perplexity provides sourced citations; Parallel.ai discovers comparable companies.
1. Portfolio-level impact scoring computing how each deal affects portfolio risk, sector concentration, and FX/geopolitical hedging.
2. Founder social proof integration scraping Crunchbase, LinkedIn, and AngelList for network size and past investor reviews.
3. Preference learning feedback loop accepting investor pass/invest decisions to fine-tune dimension weights via Bradley-Terry model.
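
Once the seven dimension agents have returned their scores, the ranking step reduces to a weighted average. A minimal sketch, assuming each agent emits a numeric score on a shared scale and that weights are supplied as a plain mapping; the weight values and deal names used with it are hypothetical.

```python
# Dimension names as listed above.
DIMENSIONS = ("market", "team", "product", "traction", "syndication", "risk", "japan_fit")


def weighted_score(scores, weights):
    """Weighted mean of the seven dimension scores."""
    missing = [d for d in DIMENSIONS if d not in scores or d not in weights]
    if missing:
        raise KeyError(f"missing dimensions: {missing}")
    total_weight = sum(weights[d] for d in DIMENSIONS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[d] * weights[d] for d in DIMENSIONS) / total_weight


def rank_deals(deals, weights):
    """Return deal names ordered from highest to lowest weighted score."""
    return sorted(deals, key=lambda name: weighted_score(deals[name], weights), reverse=True)
```

Dividing by the total weight means the weights need not sum to one, which keeps later tuning (for example from a preference-learning loop) from having to renormalise.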
Open Yatagarasu →
§ CHREMATA / ANALYTICS
Chremata

Earnings transcript NLP pipeline classifying call transcripts across 5 dimensions (outcome, guidance, tone, margins, headwinds) with model performance metrics, label distribution analysis, and interactive inference exploration.

Stack: Python · React · Babel · SVG Charts
- NLP: Multi-label classification of earnings transcripts into outcome, guidance, tone, margins, and headwinds dimensions using a custom 14-class taxonomy.
- NER: Named entity extraction from financial text identifying risk factors, financial metrics, products, and business segments.
- Evaluation: Precision, recall, and F1 scoring across all classification dimensions with per-label confusion analysis and radar chart visualization.
1. Multi-quarter trend analysis tracking how company tone and guidance shift across consecutive earnings periods.
2. Expanded company coverage beyond current dataset with automated transcript ingestion pipeline.
3. Real-time earnings call processing with streaming classification updates during live calls.
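
For a multi-label task, per-label precision, recall, and F1 can be computed directly from sets of true and predicted labels. The sketch below assumes one label set per transcript; the label names in the example are placeholders rather than the project's 14-class taxonomy.

```python
def per_label_prf(y_true, y_pred, labels):
    """Per-label precision, recall, and F1 for multi-label predictions.

    y_true and y_pred are equal-length lists of label sets, one per transcript.
    """
    report = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if label in t and label in p)
        fp = sum(1 for t, p in pairs if label not in t and label in p)
        fn = sum(1 for t, p in pairs if label in t and label not in p)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


# Placeholder labels for illustration.
truth = [{"beat", "positive_tone"}, {"miss"}, {"beat"}]
preds = [{"beat"}, {"beat"}, {"beat", "positive_tone"}]
report = per_label_prf(truth, preds, ["beat", "positive_tone"])
```

In this example "beat" has two true positives and one false positive, giving precision 2/3 and recall 1.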
Open Chremata →
§ KITSUNE / RESEARCH
Kitsune

RLHF data curation pipeline dashboard visualizing trace generation, dataset materialization (SFT, preference pairs, prompts), validation gates, and fine-tuning evaluation comparing base and fine-tuned models with win rate scoring.

Stack: Python · React · Babel · SVG Charts
- RLHF: Reinforcement learning from human feedback data curation with trace scoring, preference pair construction, and SFT dataset materialization.
- Evaluation: Base model vs fine-tuned model comparison with win rates, score deltas, refusal rate tracking, and judge-model assessment.
- Data Quality: Schema validation and quality gates across materialized datasets with pass ratios, duplicate detection, and invalid record flagging.
1. Expanded trace generation with configurable scoring rubrics and domain-specific quality criteria.
2. Multi-model comparison dashboard supporting arbitrary model pairs with statistical significance testing.
3. Automated quality threshold tuning based on downstream fine-tuning performance feedback loops.
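
Preference pair construction and win-rate scoring can be sketched as follows. The trace schema (`prompt`, `response`, `score`), the best-versus-worst pairing rule, the minimum score margin, and the half-credit treatment of ties are all assumptions made for illustration.

```python
def build_preference_pairs(traces, min_margin=1.0):
    """Pair the best and worst scored response for each prompt.

    A pair is emitted only when the score gap reaches min_margin, so
    near-ties do not become noisy training signal.
    """
    by_prompt = {}
    for trace in traces:
        by_prompt.setdefault(trace["prompt"], []).append(trace)

    pairs = []
    for prompt, group in by_prompt.items():
        ordered = sorted(group, key=lambda t: t["score"], reverse=True)
        best, worst = ordered[0], ordered[-1]
        if best["score"] - worst["score"] >= min_margin:
            pairs.append(
                {"prompt": prompt, "chosen": best["response"], "rejected": worst["response"]}
            )
    return pairs


def win_rate(judgements):
    """Share of comparisons won by the fine-tuned model; ties count as half."""
    judgements = list(judgements)
    if not judgements:
        return 0.0
    wins = sum(1.0 for j in judgements if j == "tuned")
    ties = sum(0.5 for j in judgements if j == "tie")
    return (wins + ties) / len(judgements)
```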
Open Kitsune →
§ KITSUNE-B / SDR COACHING
Kitsune-B

Static design snapshot of an SDR coaching dashboard frozen against real API data — overview, transcripts, hostile-judge analysis, coaching replay diff viewer, training role-play sessions, calibration distributions, and per-SDR weekly reports.

Stack: Python · Static HTML · Kintsugi CSS
- Coaching: 8-dim hostile-judge scoring with coaching moments, per-dim lift, and side-by-side diff viewer between recorded and replayed turns.
- Role-play: Red/Blue/Judge training sessions with turn-by-turn judge scorecards and branch exploration.
- Calibration: Weekly score-distribution stats and SDR composite averages with heatmap rendering across coaching dimensions.
1. Live data wiring back to the sdr-transcripts API instead of frozen JSON fixtures.
2. Interactive radar overlay with brushing across calls and time windows.
3. Coach-authored rubric editor with versioned rubric history and downstream score recomputation.
Open Kitsune-B →

§ 02 / IN DEVELOPMENT — 2026

On the bench.

Prototypes currently being scaffolded. Briefs below; deployed builds will replace these cards as they ship.

§ CHIMERA / QUANT · IN DEVELOPMENT 2026
Chimera

Adversarial Strategy Arena. Two competing agent swarms: Blue Team generates FX trading strategies while Red Team stress-tests them — finding regime failures, black swan edge cases, and overfitting signals. Thompson Sampling allocates attack budgets across strategy weaknesses.

Stack: LangGraph · ClickHouse · Python
- Multi-Agent: Adversarial self-play evaluation where Blue Team strategy generation faces autonomous Red Team stress-testing.
- Reinforcement Learning: Thompson Sampling allocates attack budgets across strategy weaknesses, prioritizing the most informative failure modes.
- Quant: FX backtesting with regime detection; strategy validation across historical black swan events and volatility regimes.
1. Cross-asset strategy generalization beyond FX into equities, commodities, and crypto pairs.
2. Evolutionary strategy mutation with genetic programming for novel signal discovery.
3. Live paper-trading integration with execution simulation and slippage modeling.
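
Thompson Sampling over strategy weaknesses can be sketched with a Beta-Bernoulli bandit: each weakness is an arm, and an attack "succeeds" when it breaks the strategy. The class and method names, the uniform Beta(1, 1) prior, and the binary success signal are assumptions for this example, not the Chimera design.

```python
import random


class AttackAllocator:
    """Beta-Bernoulli Thompson Sampling over named strategy weaknesses."""

    def __init__(self, weaknesses, seed=None):
        self.rng = random.Random(seed)
        # [alpha, beta] per arm, starting from a uniform Beta(1, 1) prior.
        self.stats = {w: [1, 1] for w in weaknesses}

    def choose(self):
        """Sample each arm's posterior and attack the highest draw."""
        draws = {w: self.rng.betavariate(a, b) for w, (a, b) in self.stats.items()}
        return max(draws, key=draws.get)

    def update(self, weakness, broke_strategy):
        """Record the outcome: alpha counts successes, beta counts failures."""
        self.stats[weakness][0 if broke_strategy else 1] += 1

    def allocate(self, budget):
        """Split an integer attack budget by repeated posterior sampling."""
        counts = {w: 0 for w in self.stats}
        for _ in range(budget):
            counts[self.choose()] += 1
        return counts
```

Because arms are chosen by sampling rather than by their mean, weaknesses with little evidence still receive occasional attacks, which is what lets the allocator keep exploring.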
§ PANOPTIKON / GEOSPATIAL · IN DEVELOPMENT 2026
Panoptikon

Satellite + AIS Supply Chain Fusion. Extends Pelagos by fusing satellite imagery (Sentinel-2/AlphaEarth) with AIS vessel data. Agents detect port congestion from imagery, correlate with vessel dwell times, and produce composite supply chain disruption scores.

Stack: Bytewax · ClickHouse · Memgraph · CV Model · Python
- Computer Vision: Satellite imagery analysis for port congestion detection, vessel counting, and infrastructure utilization scoring.
- Geospatial ML: AIS vessel tracking fused with satellite observations for multi-modal maritime situational awareness.
- Multi-Modal: Imagery and streaming AIS data correlation producing composite disruption scores with spatial and temporal dimensions.
1. Automated anomaly alerts for sudden congestion spikes detected across satellite and AIS signals.
2. Historical disruption pattern matching against past port closures and weather events.
3. Integration with commodity futures for supply chain disruption impact scoring.
§ MNEMOS / BENCHMARK · IN DEVELOPMENT 2026
Mnemos

Agent Memory Architecture Benchmark. Implements and races four agent memory systems (buffer, Mem0, Zep graph, LangMem procedural) against identical financial research tasks. Measures token efficiency, retrieval latency, answer quality, and cost.

Stack: LangGraph · Qdrant · Memgraph · ClickHouse · Python
- Memory Systems: Comparative evaluation of buffer, Mem0, Zep graph, and LangMem procedural architectures under identical conditions.
- Benchmarking: Token efficiency, retrieval latency, and cost metrics collected in ClickHouse for reproducible architecture comparison.
- RAG: Retrieval quality scoring across memory types measuring answer accuracy, hallucination rate, and source fidelity.
1. Expanded benchmark suite with adversarial memory tasks designed to stress-test retrieval under ambiguity.
2. Cost-performance Pareto frontier visualization across memory architectures and model combinations.
3. Community-contributed memory architecture plugins with standardized evaluation harness.
§ SYNOD / INDEX · IN DEVELOPMENT 2026
Synod

G10 Central Bank Hawk-Dove Spectrum. Multi-agent system monitoring all G10 central banks: speeches, minutes, press conferences, dot plots. Each agent specializes in one central bank, builds a temporal hawkish-dovish score, and an orchestrator produces a global monetary policy heat map.

Stack: LangGraph · Tavily · Qdrant · ClickHouse · ECharts · Python
- Agentic RAG: Per-bank specialist agents with temporal reasoning over speeches, minutes, and press conferences.
- NLP: Monetary policy sentiment scoring calibrated to central bank communication styles and historical language shifts.
- Time Series: Hawk-dove drift detection and regime shift identification across the G10 monetary policy landscape.
1. Cross-bank contagion analysis detecting policy coordination and divergence patterns.
2. Forward guidance parsing with commitment language scoring and credibility tracking.
3. Integration with yield curve data for policy announcement impact validation.
§ HERMES / PROTOCOL · IN DEVELOPMENT 2026
Hermes

A2A/MCP Agent Interop Gateway. Protocol playground implementing Google's Agent2Agent (A2A) protocol alongside the Model Context Protocol (MCP). Agents built on different frameworks (LangGraph, CrewAI, OpenAI Agents SDK) collaborate through the gateway, which logs all interactions for observability.

Stack: FastAPI · MCP · A2A Protocol · LangGraph · Python
- Multi-Framework: Cross-framework agent interoperability enabling LangGraph, CrewAI, and OpenAI agents to collaborate on shared tasks.
- Observability: Full interaction logging and replay for debugging cross-agent communication failures and latency bottlenecks.
- Protocol Design: A2A and MCP message translation layer handling schema negotiation, capability discovery, and error propagation.
1. Protocol conformance test suite validating third-party agent implementations against A2A and MCP specs.
2. Latency and throughput benchmarking across framework combinations under concurrent load.
3. Visual interaction graph showing real-time agent communication patterns and message flows.
§ OUROBOROS / RESEARCH · IN DEVELOPMENT 2026
Ouroboros

Self-Improving Agent Codebase. Agent swarm that writes, tests, and improves its own tool implementations. Given a goal (e.g., "improve search relevance"), it generates code variations, runs them in sandboxed environments, evaluates outputs, and promotes winners.

Stack: LangGraph · AST Sandbox · ClickHouse · Python
- Code Generation: Automated tool implementation variants generated, tested, and ranked in isolated sandbox environments.
- Evaluation: Output quality scoring and A/B comparison across code variants with statistical significance testing.
- Meta-Agentic: Agents that improve agent infrastructure; the system's tools evolve through its own experimentation.
1. Safety-bounded mutation constraints preventing runaway changes with rollback guarantees.
2. Multi-objective optimization balancing speed, quality, and cost across tool variants.
3. Lineage tracking showing evolutionary path of each tool version with performance genealogy.
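
The brief does not say what the "AST Sandbox" does. One plausible reading is a static screen that rejects generated code before it ever reaches an isolated runtime, and the sketch below illustrates that reading with Python's standard `ast` module. The blocked-name list and the rules are assumptions, and a static screen like this is a pre-filter only; it does not replace process-level isolation.

```python
import ast

# Assumed deny-list for illustration; a real policy would be project-specific.
BLOCKED_NAMES = {"eval", "exec", "open", "__import__", "compile"}


def violations(source):
    """Return reasons a generated snippet fails the screen (empty list = pass)."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    problems = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            problems.append(f"import on line {node.lineno}")
        elif isinstance(node, ast.Name) and node.id in BLOCKED_NAMES:
            problems.append(f"blocked name '{node.id}' on line {node.lineno}")
        elif isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            problems.append(f"dunder attribute '{node.attr}' on line {node.lineno}")
    return problems
```

A candidate tool would be promoted to sandboxed execution only when `violations(source)` returns an empty list.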
§ TESSERA / ANALYTICS · IN DEVELOPMENT 2026
Tessera

Multi-Modal Earnings Analyzer. Processes earnings calls across three modalities simultaneously: audio (executive tone/stress via Whisper), slides (visual chart extraction), and transcript (text sentiment). Fuses signals into a composite earnings quality score.

Stack: Whisper · Vision Model · LangGraph · ClickHouse · Memgraph · Python
- Multi-Modal: Audio, visual, and text fusion producing earnings quality signals that no single modality captures alone.
- NLP: Earnings transcript sentiment analysis with executive language pattern detection and hedging identification.
- Computer Vision: Slide chart extraction and interpretation converting visual financial data into structured metrics.
1. Historical earnings quality backtesting against subsequent price reactions and guidance accuracy.
2. Cross-company comparative analysis within sectors for relative earnings quality ranking.
3. Real-time processing during live earnings calls with streaming signal updates.
§ DAIMON / RESEARCH · IN DEVELOPMENT 2026
Daimon

Regulatory Diff Engine. Monitors regulatory feeds (SEC rules, Fed guidance, FEFTA updates, EU AI Act) and autonomously diffs new regulations against portfolio positions and strategies. Produces compliance impact scores with citation chains.

Stack: Memgraph · Qdrant · LangGraph · Tavily · ClickHouse · Python
- Graph RAG: Temporal regulatory knowledge graph tracking rule evolution, amendments, and cross-references over time.
- NLP: Regulatory text diffing and impact extraction identifying material changes between rule versions.
- Reasoning: Compliance chain-of-thought with citation chains linking regulatory clauses to portfolio exposure.
1. Automated compliance report generation for stakeholders with executive summary and action items.
2. Predictive regulation modeling based on comment periods, drafts, and historical regulatory patterns.
3. Multi-jurisdiction conflict detection across overlapping regulatory regimes.
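
The text-diffing step between two versions of a rule can be prototyped with the standard library's `difflib`. The sketch assumes each clause sits on its own line; the clause text in the example is invented, and judging whether a change is material would still need a downstream step.

```python
import difflib


def changed_clauses(old_text, new_text):
    """List inserted, deleted, and replaced clause runs between two versions."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)

    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append({"op": tag, "old": old_lines[i1:i2], "new": new_lines[j1:j2]})
    return changes


# Invented rule text for illustration.
v1 = "1. Scope.\n2. Report within 30 days.\n3. Penalties."
v2 = "1. Scope.\n2. Report within 10 days.\n3. Penalties."
diff = changed_clauses(v1, v2)
```

Here `diff` holds a single `replace` entry for clause 2, which is the unit a citation chain could then point to.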
§ NOCTURNE / NETWORK · IN DEVELOPMENT 2026
Nocturne

Dark Pool / Options Flow Sentiment Graph. Ingests real-time options flow data (unusual activity, sweep orders, dark pool prints), builds a graph of institutional positioning, and correlates with Polymarket prediction signals from Augur/Baros.

Stack: Memgraph · Kafka · ClickHouse · Bytewax · Python
- Graph: Institutional positioning network construction from dark pool prints and sweep order clustering.
- Anomaly Detection: Unusual flow identification and smart money convergence detection across options chains and dark pools.
- Signal Fusion: Options flow and prediction market correlation revealing institutional conviction aligned with crowd sentiment.
1. Sector rotation detection from aggregate institutional flow patterns across equity options.
2. Earnings event positioning analysis with historical accuracy tracking for smart money signals.
3. Real-time alert system for convergence events across options flow and prediction market data.
§ ARACHNE / RESEARCH · IN DEVELOPMENT 2026
Arachne

Agentic Web of Trust for Research. Research agent that builds a citation graph of trust — tracking source reliability over time, detecting citation loops, identifying primary vs. derivative sources. Produces research with confidence-weighted citations.

Stack: LangGraph · Memgraph · Qdrant · Tavily · Python
- Graph: Epistemic trust and citation network analysis tracking source reliability and detecting circular references.
- RAG: Confidence-weighted source retrieval prioritizing primary sources with verified provenance chains.
- Reasoning: Primary vs. derivative source classification with confidence decay modeling across citation hops.
1. Source reliability scoring that evolves with verification outcomes and prediction accuracy.
2. Cross-domain trust transfer for interdisciplinary research spanning finance, policy, and technology.
3. Visualization of citation provenance chains with confidence decay and source lineage maps.
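
Citation-loop detection is cycle detection on a directed graph, and confidence decay across hops can be modelled as a geometric discount. The sketch below assumes the citation graph fits in a dictionary of `source -> cited sources` and uses a decay factor of 0.8 per hop; both the representation and the factor are illustrative choices.

```python
def find_citation_loop(cites):
    """Return one citation cycle as a list of sources, or None if acyclic.

    Depth-first search; a node found again while still on the current
    path closes a loop.
    """
    UNSEEN, ON_PATH, DONE = 0, 1, 2
    state = {}
    path = []

    def visit(node):
        state[node] = ON_PATH
        path.append(node)
        for cited in cites.get(node, ()):
            status = state.get(cited, UNSEEN)
            if status == ON_PATH:
                return path[path.index(cited):] + [cited]
            if status == UNSEEN:
                found = visit(cited)
                if found:
                    return found
        path.pop()
        state[node] = DONE
        return None

    for source in cites:
        if state.get(source, UNSEEN) == UNSEEN:
            loop = visit(source)
            if loop:
                return loop
    return None


def decayed_confidence(base, hops, decay=0.8):
    """Confidence left after `hops` citation steps from the primary source."""
    return base * decay ** hops
```

The recursive search is fine for small graphs; a graph database query or an iterative traversal would be the natural replacement at scale.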

Quiet experiments in market structure.
