Personal Project

AlphaOne

Full-stack sentiment platform powered by DeBERTa-ABSA-v2, a model we trained through 9 iterations to 82.5% accuracy on per-stock sentiment classification from Reddit.

Full-Stack · DeBERTa-ABSA-v2 · Aspect-Based Sentiment · LoRA · Docker

The Vision

Retail investors are drowning in social media noise. AlphaOne cuts through it with DeBERTa-ABSA-v2, a model we trained ourselves through 9 iterations of data curation, error analysis, and retraining to classify sentiment per stock within the same sentence. When a post says "AAPL is great but TSLA is doomed," the model correctly labels AAPL as bullish and TSLA as bearish.
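The per-stock split can be pictured as building one marked input per ticker, so the classifier scores sentiment toward a single target at a time. A minimal sketch, assuming a regex ticker matcher and a `[TGT]`-marker input format (both are illustrative; the model's actual input scheme may differ):

```python
# Sketch of per-aspect input construction. The ticker regex and the
# [TGT] marker format are assumptions for illustration, not the
# model's actual preprocessing.
import re

TICKER_RE = re.compile(r"\b([A-Z]{2,5})\b")

def build_aspect_inputs(sentence: str) -> list[tuple[str, str]]:
    """For each ticker in the sentence, emit one (ticker, marked_text)
    pair where the target ticker is wrapped in aspect markers, so the
    classifier scores sentiment toward that ticker alone."""
    inputs = []
    for ticker in TICKER_RE.findall(sentence):
        marked = sentence.replace(ticker, f"[TGT] {ticker} [/TGT]")
        inputs.append((ticker, marked))
    return inputs

pairs = build_aspect_inputs("AAPL is great but TSLA is doomed")
# One input targeting AAPL, one targeting TSLA
```

Each pair is then classified independently, which is how one sentence can yield a bullish label for AAPL and a bearish one for TSLA.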

The system ingests posts from 6 financial subreddits through a reproducible data pipeline into PostgreSQL, processes each post through a 3-pass NLP pipeline (CPU normalization, GPU batch inference, database commit), and surfaces ticker sentiment trends through a React dashboard with an interactive inference playground.

Built with production-grade engineering across 6 containerized services. Concurrent-safe, idempotent batch ingestion with content versioning ensures no duplicate or corrupted records, even when workers run in parallel.
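The idempotent, content-versioned ingestion described above typically reduces to an upsert keyed on post ID plus a content hash. A minimal sketch using SQLite for illustration (AlphaOne uses PostgreSQL, where the `ON CONFLICT` pattern is the same); the table and column names are assumptions:

```python
# Hedged sketch of idempotent, content-versioned ingestion.
# Schema and names are illustrative, not the project's actual DDL.
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE posts (
        post_id      TEXT PRIMARY KEY,
        body         TEXT NOT NULL,
        content_hash TEXT NOT NULL,
        version      INTEGER NOT NULL DEFAULT 1
    )
""")

def ingest(post_id: str, body: str) -> None:
    """Insert a post, or bump its version only if the content changed.
    Re-running the same batch is a no-op, so parallel or retried
    workers cannot create duplicates or corrupt existing rows."""
    content_hash = hashlib.sha256(body.encode()).hexdigest()
    conn.execute(
        """
        INSERT INTO posts (post_id, body, content_hash)
        VALUES (?, ?, ?)
        ON CONFLICT(post_id) DO UPDATE SET
            body = excluded.body,
            content_hash = excluded.content_hash,
            version = posts.version + 1
        WHERE excluded.content_hash != posts.content_hash
        """,
        (post_id, body, content_hash),
    )

ingest("t3_abc", "AAPL to the moon")
ingest("t3_abc", "AAPL to the moon")   # duplicate delivery: no change
ingest("t3_abc", "AAPL edited post")   # edited content: version bumps
row = conn.execute(
    "SELECT version FROM posts WHERE post_id = 't3_abc'"
).fetchone()
```

The `WHERE` clause on the conflict branch is what makes redelivery harmless: identical content leaves the row untouched, while a real edit advances the version.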

Technical Highlights

  • 6-Service Architecture

    React, Spring Boot API, Celery workers, FastAPI inference server, Redis, and PostgreSQL via Docker Compose.

  • DeBERTa-ABSA-v2

Our own model, trained through 9 iterations with 6,200+ hand-audited triples. 82.5% accuracy, within 3 percentage points of formal-news SOTA.

  • Attention Heatmap Playground

    Interactive inference with real-time entity replacement visualization and 12-head attention heatmaps.

  • Idempotent Data Pipeline

    3-pass batch processing with concurrent-safe ingestion, content versioning, and lifecycle state tracking.
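The six services from the architecture bullet can be pictured in a Compose file. A hypothetical sketch only; service names, build paths, and image tags are assumptions, not the project's actual configuration:

```yaml
# Hypothetical docker-compose.yml sketch of the 6-service layout.
services:
  web:        # React dashboard
    build: ./frontend
  api:        # Spring Boot sentiment API
    build: ./api
    depends_on: [db, redis]
  inference:  # FastAPI inference server
    build: ./inference
  worker:     # Celery batch workers (Celery Beat schedules runs)
    build: ./worker
    depends_on: [redis, db]
  redis:
    image: redis:7
  db:
    image: postgres:16
```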

Under The Hood

Dual API Layer

Spring Boot serves processed sentiment data (daily trends, evidence feeds, macro aggregation). FastAPI handles real-time inference with attention extraction for the interactive playground.
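Behind the playground's 12-head heatmaps, the inference server has to reduce raw attention tensors to per-token weights. A hedged, framework-free sketch of one such reduction (averaging a layer's heads and reading the `[CLS]` row); the reduction choice is an assumption, not necessarily what AlphaOne serves:

```python
# Illustrative attention post-processing: collapse one layer's 12 head
# matrices to a single per-token weight vector for the [CLS] row, then
# renormalize. Plain lists stand in for the model's tensors.

def cls_attention_weights(heads: list[list[list[float]]]) -> list[float]:
    """heads: [num_heads][seq_len][seq_len] attention matrices.
    Returns the head-averaged attention from the [CLS] token (row 0)
    to every token, normalized to sum to 1."""
    num_heads = len(heads)
    seq_len = len(heads[0])
    avg = [sum(h[0][j] for h in heads) / num_heads for j in range(seq_len)]
    total = sum(avg)
    return [w / total for w in avg]
```

Serving the per-head matrices unreduced, as the 12-head heatmap view implies, is the same data one step earlier; this helper shows the aggregate view a summary heatmap would use.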

3-Pass Batch Pipeline

Celery workers run a 3-pass pipeline: CPU normalization (sentence split, entity replacement), GPU batch inference (single forward pass), and database commit with idempotency fallback. Scheduled every 2 hours via Celery Beat.
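The three passes above can be sketched as plain functions (the real pipeline runs them as Celery tasks on a 2-hour Beat schedule; all names below, and the stubbed inference, are illustrative):

```python
# Illustrative 3-pass batch structure. Function names are assumptions,
# and pass 2 is a stub standing in for the real GPU model call.

def pass1_normalize(posts: list[str]) -> list[str]:
    """CPU pass: clean text (placeholder for sentence splitting and
    entity replacement)."""
    return [p.strip() for p in posts]

def pass2_infer(batch: list[str]) -> list[str]:
    """GPU pass: one batched forward through the model (stubbed with a
    keyword heuristic here)."""
    return ["bullish" if "great" in t else "bearish" for t in batch]

def pass3_commit(results: list[tuple[str, str]]) -> int:
    """DB pass: write results; an idempotent upsert makes retries safe.
    Returns a stand-in for rows written."""
    return len(results)

def run_batch(posts: list[str]) -> int:
    texts = pass1_normalize(posts)
    labels = pass2_infer(texts)        # single forward pass per batch
    return pass3_commit(list(zip(texts, labels)))
```

Keeping the GPU pass as one batched call is the point of the split: normalization and commits stay on cheap CPU workers, and the model sees full batches instead of one post at a time.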

Training DeBERTa-ABSA-v2

We built and iterated our own model across 9 versions, combining LLM-based labeling, 615 manual corrections, and error-targeted synthetic data generation (6,200+ triples total), reaching 0.823 macro F1 on informal Reddit text.
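For reference, macro F1 weights every class equally regardless of how often it appears, which matters on imbalanced Reddit sentiment data. A minimal sketch of the metric itself; the three-way bullish/bearish/neutral label set is an assumption for illustration:

```python
# Macro-F1 sketch: average per-class F1 with equal weight per class.
# The bullish/bearish/neutral labels are assumed for illustration.

def macro_f1(y_true: list[str], y_pred: list[str]) -> float:
    classes = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

score = macro_f1(
    ["bullish", "bearish", "neutral", "bullish"],
    ["bullish", "bearish", "bullish", "bullish"],
)
```

Because a rare class's F1 counts as much as a common one's, a 0.823 macro F1 cannot be reached by just predicting the majority label, which is why it is a stricter headline number than plain accuracy here.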

Platform Architecture

AlphaOne 6-service platform architecture

See It In Action

Walk through the full AlphaOne experience: from the operations dashboard to per-ticker analytics and the interactive inference playground.

  View Product Demo