Personal Project

AlphaOne

Per-ticker financial sentiment platform powered by DeBERTa-ABSA-v2, fine-tuned to 82.5% accuracy (+65pp over FinBERT) on aspect-based sentiment from multi-entity financial text, with 6,287 curated training triples.

Full-Stack · DeBERTa-ABSA-v2 · Aspect-Based Sentiment · LoRA · Docker

The Vision

Retail investors are drowning in social media noise. AlphaOne cuts through it with DeBERTa-ABSA-v2, a LoRA fine-tune of DeBERTa-v3-base (184M params) that uses entity replacement to classify sentiment per stock, even when several tickers share a sentence. When a post says "AAPL is great but TSLA is doomed," the model labels AAPL as bullish and TSLA as bearish. It reaches 82.5% accuracy, 65pp above FinBERT.
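The entity-replacement idea can be sketched as follows. This is a minimal illustration, not the project's actual preprocessing: the placeholder tokens `[TARGET]` and `[OTHER]`, the `replace_entities` function, and the regex-based ticker matching are all assumptions made for the example.

```python
import re

# Hypothetical placeholder tokens; the real model's inputs may look different.
TARGET, OTHER = "[TARGET]", "[OTHER]"

# An optional '$' followed by a 1-5 letter uppercase token (e.g. AAPL, $TSLA).
TICKER_RE = re.compile(r"\$?\b([A-Z]{1,5})\b")

def replace_entities(sentence: str, tickers: set[str]) -> dict[str, str]:
    """Return one masked copy of the sentence per known ticker it mentions."""
    mentioned = [m.group(1) for m in TICKER_RE.finditer(sentence)
                 if m.group(1) in tickers]
    views = {}
    for target in dict.fromkeys(mentioned):  # de-duplicate, keep order
        def mask(m, target=target):
            symbol = m.group(1)
            if symbol not in tickers:
                return m.group(0)  # ordinary uppercase word, leave untouched
            return TARGET if symbol == target else OTHER
        views[target] = TICKER_RE.sub(mask, sentence)
    return views

views = replace_entities("AAPL is great but TSLA is doomed", {"AAPL", "TSLA"})
for ticker, text in views.items():
    print(ticker, "->", text)
```

Each masked view can then be classified on its own, so the same sentence can yield a different label for each ticker.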

The system ingests posts from 7 financial subreddits (wallstreetbets, stocks, investing, stockmarket, options, dividends, ValueInvesting) through a 3-pass batch NLP pipeline: sentence splitting and entity replacement, single-batch GPU inference, and database commit. Claim-based row locking via PostgreSQL SELECT ... FOR UPDATE SKIP LOCKED lets parallel workers ingest concurrently without ever claiming the same rows.
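A claim query built on SKIP LOCKED might look like the sketch below. The table name `posts`, the columns `status` and `claimed_by`, and the `claim_batch` helper are hypothetical, since the schema is not shown here. `conn` is assumed to be a DB-API connection to PostgreSQL that accepts pyformat parameters, as psycopg does.

```python
# Hypothetical schema: posts(id, status, claimed_by).
CLAIM_SQL = """
UPDATE posts
SET status = 'processing', claimed_by = %(worker)s
WHERE id IN (
    SELECT id FROM posts
    WHERE status = 'pending'
    ORDER BY id
    LIMIT %(batch)s
    FOR UPDATE SKIP LOCKED
)
RETURNING id
"""

def claim_batch(conn, worker: str, batch: int = 64) -> list[int]:
    """Claim up to `batch` pending rows for one worker and return their ids.

    Rows already locked by another worker's in-flight claim are skipped
    instead of waited on, so two workers never receive the same row.
    """
    with conn.cursor() as cur:
        cur.execute(CLAIM_SQL, {"worker": worker, "batch": batch})
        ids = [row[0] for row in cur.fetchall()]
    conn.commit()  # releases the row locks; the status change is now visible
    return ids
```

Because locked rows are skipped rather than blocked on, adding workers increases throughput without introducing lock waits on the claim step.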

The platform runs as 6 containerized services, with a React dashboard that serves per-ticker sentiment trends and an interactive inference playground with 12-head attention heatmaps.

Technical Highlights

  • 6-Service Architecture

    React, Spring Boot API, Celery workers, FastAPI inference server, Redis, and PostgreSQL via Docker Compose.

  • DeBERTa-ABSA-v2

    Our own model, trained through 9 iterations with 6,200+ hand-audited triples. 82.5% accuracy, within 3pp of formal-news SOTA.

  • Attention Heatmap Playground

    Interactive inference with real-time entity replacement visualization and 12-head attention heatmaps.

  • Idempotent Data Pipeline

    3-pass batch processing with concurrent-safe ingestion, content versioning, and lifecycle state tracking.
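One way to make a commit step idempotent with content versioning is sketched below. This is an illustration under assumptions: it uses SQLite so that it runs standalone (the platform itself uses PostgreSQL, which supports the same `ON CONFLICT ... DO NOTHING` clause), and the `sentiment` table, its columns, and the hash-based `content_version` helper are hypothetical.

```python
import hashlib
import sqlite3

def content_version(text: str) -> str:
    """Derive a short version id from the post text (hypothetical scheme)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]

def commit_result(db, post_id: str, text: str, ticker: str, label: str) -> bool:
    """Insert one sentiment row; return False if that exact row already exists."""
    before = db.total_changes
    db.execute(
        "INSERT INTO sentiment (post_id, version, ticker, label) "
        "VALUES (?, ?, ?, ?) "
        "ON CONFLICT (post_id, version, ticker) DO NOTHING",
        (post_id, content_version(text), ticker, label),
    )
    db.commit()
    return db.total_changes > before

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE sentiment (post_id TEXT, version TEXT, ticker TEXT, label TEXT, "
    "PRIMARY KEY (post_id, version, ticker))"
)

print(commit_result(db, "p1", "AAPL is great", "AAPL", "bullish"))  # first write
print(commit_result(db, "p1", "AAPL is great", "AAPL", "bullish"))  # replayed write
```

Replaying the same post is a no-op, while an edited post hashes to a new version and gets its own row.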

Under The Hood

Dual API Layer

Spring Boot serves processed sentiment data (daily trends, evidence feeds, macro aggregation). FastAPI handles real-time inference with attention extraction for the interactive playground.

3-Pass Batch Pipeline

Celery workers run a 3-pass pipeline: CPU normalization (sentence split, entity replacement), GPU batch inference (single forward pass), and database commit with idempotency fallback. Scheduled every 2 hours via Celery Beat.
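The three passes can be sketched in plain Python as below. This is a simplified illustration, not the production code: the `Item` dataclass, the function names, the `[TARGET]` placeholder, and the dictionary standing in for the database are assumptions, and `classify_batch` is a stand-in for the real GPU model call.

```python
import re
from dataclasses import dataclass

@dataclass
class Item:
    post_id: str
    ticker: str
    text: str        # sentence with the target ticker masked
    label: str = ""

def pass1_normalize(posts, tickers):
    """CPU pass: split sentences and emit one masked item per ticker mention."""
    items = []
    for post_id, body in posts:
        for sentence in re.split(r"(?<=[.!?])\s+", body.strip()):
            for ticker in tickers:
                pattern = rf"\b{re.escape(ticker)}\b"
                if re.search(pattern, sentence):
                    masked = re.sub(pattern, "[TARGET]", sentence)
                    items.append(Item(post_id, ticker, masked))
    return items

def pass2_infer(items, classify_batch):
    """Inference pass: one call over the whole batch instead of one per item."""
    labels = classify_batch([item.text for item in items])
    for item, label in zip(items, labels):
        item.label = label
    return items

def pass3_commit(items, store):
    """Commit pass: keyed writes, so a re-run overwrites rather than duplicates."""
    for item in items:
        store[(item.post_id, item.ticker, item.text)] = item.label
    return len(store)

def toy_classifier(texts):
    # Stand-in for the model: a keyword rule, for demonstration only.
    return ["bullish" if "great" in t else "bearish" for t in texts]

store = {}
posts = [("p1", "AAPL is great. TSLA is doomed!")]
items = pass2_infer(pass1_normalize(posts, ["AAPL", "TSLA"]), toy_classifier)
pass3_commit(items, store)
```

Keeping the CPU work and the database writes outside the inference pass means the model is called once per batch, which is what the single forward pass requires.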

Training DeBERTa-ABSA-v2

We built and iterated on our own model across 9 versions, combining LLM-based labeling, 615 manual corrections, and error-targeted synthetic data generation (6,200+ triples total). The final version reaches 0.823 macro F1 on informal Reddit text.
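Accuracy and macro F1 measure different things, and the short sketch below shows how each is computed. The labels and predictions are made-up toy data, not results from the model, and the three-class label set is an assumption. Macro F1 is the unweighted mean of per-class F1 scores, so a rare class counts as much as a common one.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, where F1 = 2TP / (2TP + FP + FN)."""
    scores = []
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy example only; these are not the model's predictions.
gold = ["bullish", "bullish", "bearish", "neutral"]
pred = ["bullish", "bearish", "bearish", "neutral"]
print(f"accuracy={accuracy(gold, pred):.3f}  macro_f1={macro_f1(gold, pred):.3f}")
```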

Platform Architecture

Figure: AlphaOne 6-service platform architecture

See It In Action

Walk through the full AlphaOne experience: from the operations dashboard to per-ticker analytics and the interactive inference playground.