Personal Project

AlphaOne

Full-stack sentiment platform powered by DeBERTa-ABSA-v2, a model we trained through 9 iterations to 82.5% accuracy on per-stock sentiment classification from Reddit.

Full-Stack · DeBERTa-ABSA-v2 · Aspect-Based Sentiment · LoRA · Docker

The Vision

Retail investors are drowning in social media noise. AlphaOne cuts through it with DeBERTa-ABSA-v2, a model we trained ourselves through 9 iterations of data curation, error analysis, and retraining to classify sentiment per stock within the same sentence. When a post says "AAPL is great but TSLA is doomed," the model correctly labels AAPL as bullish and TSLA as bearish.
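The per-stock split can be pictured as building one marked input per ticker, so the classifier scores sentiment toward a single target at a time. A minimal sketch, assuming a regex ticker matcher and a `[TGT]`-marker input format (both are illustrative; the model's actual input scheme may differ):

```python
# Sketch of per-aspect input construction. The ticker regex and the
# [TGT] marker format are assumptions for illustration, not the
# model's actual preprocessing.
import re

TICKER_RE = re.compile(r"\b([A-Z]{2,5})\b")

def build_aspect_inputs(sentence: str) -> list[tuple[str, str]]:
    """For each ticker in the sentence, emit one (ticker, marked_text)
    pair where the target ticker is wrapped in aspect markers, so the
    classifier scores sentiment toward that ticker alone."""
    inputs = []
    for ticker in TICKER_RE.findall(sentence):
        marked = sentence.replace(ticker, f"[TGT] {ticker} [/TGT]")
        inputs.append((ticker, marked))
    return inputs

pairs = build_aspect_inputs("AAPL is great but TSLA is doomed")
# One input targeting AAPL, one targeting TSLA
```

Each pair is then classified independently, which is how one sentence can yield a bullish label for AAPL and a bearish one for TSLA.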

The system ingests posts from 6 financial subreddits through a reproducible data pipeline into PostgreSQL, processes each post through a 3-pass NLP pipeline (CPU normalization, GPU batch inference, database commit), and surfaces ticker sentiment trends through a React dashboard with an interactive inference playground.

Built with production-grade engineering across 6 containerized services. Concurrent-safe, idempotent batch ingestion with content versioning ensures no duplicate or corrupted records, even when workers run in parallel.
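The idempotent, content-versioned ingestion described above typically reduces to an upsert keyed on post ID plus a content hash. A minimal sketch using SQLite for illustration (AlphaOne uses PostgreSQL, where the `ON CONFLICT` pattern is the same); the table and column names are assumptions:

```python
# Hedged sketch of idempotent, content-versioned ingestion.
# Schema and names are illustrative, not the project's actual DDL.
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE posts (
        post_id      TEXT PRIMARY KEY,
        body         TEXT NOT NULL,
        content_hash TEXT NOT NULL,
        version      INTEGER NOT NULL DEFAULT 1
    )
""")

def ingest(post_id: str, body: str) -> None:
    """Insert a post, or bump its version only if the content changed.
    Re-running the same batch is a no-op, so parallel or retried
    workers cannot create duplicates or corrupt existing rows."""
    content_hash = hashlib.sha256(body.encode()).hexdigest()
    conn.execute(
        """
        INSERT INTO posts (post_id, body, content_hash)
        VALUES (?, ?, ?)
        ON CONFLICT(post_id) DO UPDATE SET
            body = excluded.body,
            content_hash = excluded.content_hash,
            version = posts.version + 1
        WHERE excluded.content_hash != posts.content_hash
        """,
        (post_id, body, content_hash),
    )

ingest("t3_abc", "AAPL to the moon")
ingest("t3_abc", "AAPL to the moon")   # duplicate delivery: no change
ingest("t3_abc", "AAPL edited post")   # edited content: version bumps
row = conn.execute(
    "SELECT version FROM posts WHERE post_id = 't3_abc'"
).fetchone()
```

The `WHERE` clause on the conflict branch is what makes redelivery harmless: identical content leaves the row untouched, while a real edit advances the version.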

Technical Highlights

  • 6-Service Architecture

    React, Spring Boot API, Celery workers, FastAPI inference server, Redis, and PostgreSQL via Docker Compose.

  • DeBERTa-ABSA-v2

Our own model, trained through 9 iterations with 6,200+ hand-audited triples. 82.5% accuracy, within 3 percentage points of formal-news SOTA.

  • Attention Heatmap Playground

    Interactive inference with real-time entity replacement visualization and 12-head attention heatmaps.

  • Idempotent Data Pipeline

    3-pass batch processing with concurrent-safe ingestion, content versioning, and lifecycle state tracking.
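The six services from the architecture bullet can be pictured in a Compose file. A hypothetical sketch only; service names, build paths, and image tags are assumptions, not the project's actual configuration:

```yaml
# Hypothetical docker-compose.yml sketch of the 6-service layout.
services:
  web:        # React dashboard
    build: ./frontend
  api:        # Spring Boot sentiment API
    build: ./api
    depends_on: [db, redis]
  inference:  # FastAPI inference server
    build: ./inference
  worker:     # Celery batch workers (Celery Beat schedules runs)
    build: ./worker
    depends_on: [redis, db]
  redis:
    image: redis:7
  db:
    image: postgres:16
```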

Under The Hood

Dual API Layer

Spring Boot serves processed sentiment data (daily trends, evidence feeds, macro aggregation). FastAPI handles real-time inference with attention extraction for the interactive playground.
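Behind the playground's 12-head heatmaps, the inference server has to reduce raw attention tensors to per-token weights. A hedged, framework-free sketch of one such reduction (averaging a layer's heads and reading the `[CLS]` row); the reduction choice is an assumption, not necessarily what AlphaOne serves:

```python
# Illustrative attention post-processing: collapse one layer's 12 head
# matrices to a single per-token weight vector for the [CLS] row, then
# renormalize. Plain lists stand in for the model's tensors.

def cls_attention_weights(heads: list[list[list[float]]]) -> list[float]:
    """heads: [num_heads][seq_len][seq_len] attention matrices.
    Returns the head-averaged attention from the [CLS] token (row 0)
    to every token, normalized to sum to 1."""
    num_heads = len(heads)
    seq_len = len(heads[0])
    avg = [sum(h[0][j] for h in heads) / num_heads for j in range(seq_len)]
    total = sum(avg)
    return [w / total for w in avg]
```

Serving the per-head matrices unreduced, as the 12-head heatmap view implies, is the same data one step earlier; this helper shows the aggregate view a summary heatmap would use.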

3-Pass Batch Pipeline

Celery workers run a 3-pass pipeline: CPU normalization (sentence split, entity replacement), GPU batch inference (single forward pass), and database commit with idempotency fallback. Scheduled every 2 hours via Celery Beat.
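The three passes above can be sketched as plain functions (the real pipeline runs them as Celery tasks on a 2-hour Beat schedule; all names below, and the stubbed inference, are illustrative):

```python
# Illustrative 3-pass batch structure. Function names are assumptions,
# and pass 2 is a stub standing in for the real GPU model call.

def pass1_normalize(posts: list[str]) -> list[str]:
    """CPU pass: clean text (placeholder for sentence splitting and
    entity replacement)."""
    return [p.strip() for p in posts]

def pass2_infer(batch: list[str]) -> list[str]:
    """GPU pass: one batched forward through the model (stubbed with a
    keyword heuristic here)."""
    return ["bullish" if "great" in t else "bearish" for t in batch]

def pass3_commit(results: list[tuple[str, str]]) -> int:
    """DB pass: write results; an idempotent upsert makes retries safe.
    Returns a stand-in for rows written."""
    return len(results)

def run_batch(posts: list[str]) -> int:
    texts = pass1_normalize(posts)
    labels = pass2_infer(texts)        # single forward pass per batch
    return pass3_commit(list(zip(texts, labels)))
```

Keeping the GPU pass as one batched call is the point of the split: normalization and commits stay on cheap CPU workers, and the model sees full batches instead of one post at a time.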

Training DeBERTa-ABSA-v2

We built and iterated our own model across 9 versions, combining LLM-based labeling, 615 manual corrections, and error-targeted synthetic data generation (6,200+ triples total), reaching 0.823 macro F1 on informal Reddit text.
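For reference, macro F1 weights every class equally regardless of how often it appears, which matters on imbalanced Reddit sentiment data. A minimal sketch of the metric itself; the three-way bullish/bearish/neutral label set is an assumption for illustration:

```python
# Macro-F1 sketch: average per-class F1 with equal weight per class.
# The bullish/bearish/neutral labels are assumed for illustration.

def macro_f1(y_true: list[str], y_pred: list[str]) -> float:
    classes = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

score = macro_f1(
    ["bullish", "bearish", "neutral", "bullish"],
    ["bullish", "bearish", "bullish", "bullish"],
)
```

Because a rare class's F1 counts as much as a common one's, a 0.823 macro F1 cannot be reached by just predicting the majority label, which is why it is a stricter headline number than plain accuracy here.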

Platform Architecture

AlphaOne 6-service platform architecture

See It In Action

Walk through the full AlphaOne experience: from the operations dashboard to per-ticker analytics and the interactive inference playground.

  View Product Demo