Production Systems

Shipped AI systems: problem, why it was hard, tech stack, and architecture.

RAG-Powered Knowledge Base

Problem — Teams needed accurate, cited answers from internal docs while minimizing hallucination.
Why it was hard — Balancing retrieval quality, context length, and latency for real-time chat.
Tech stack — LLM, BGE-M3, Qdrant, LangChain, FastAPI
Architecture — User query → embedding → vector search (Qdrant) → top-k retrieval → LLM prompt with context → streamed response.
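The query path above can be sketched end to end in a few functions. This is a minimal illustration, not the production code: the toy bag-of-characters embedder and in-memory cosine scan stand in for BGE-M3 and Qdrant, and `DOCS` is invented sample data.

```python
import math

# Invented sample corpus standing in for the internal docs.
DOCS = [
    "VPN access requires an MFA token issued by IT.",
    "Expense reports are due by the 5th of each month.",
    "The on-call rotation is managed in PagerDuty.",
]

def embed(text: str) -> list:
    # Stand-in embedder: letter-frequency vector. The real pipeline
    # would call an embedding model (BGE-M3) here.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query: str, k: int = 2) -> list:
    # In production this is a Qdrant vector search; here, a brute-force scan.
    qv = embed(query)
    ranked = sorted(DOCS, key=lambda d: cosine(qv, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    # Retrieved chunks are numbered so the LLM can cite them as [n].
    context = "\n".join(f"[{i + 1}] {d}" for i, d in enumerate(top_k(query)))
    return (
        "Answer using ONLY the sources below and cite them as [n].\n\n"
        f"{context}\n\nQuestion: {query}"
    )
```

The numbered-source prompt is what makes cited answers possible: the model is constrained to the retrieved context rather than its own parametric memory, and the streamed response step simply streams the completion of this prompt.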

AI Chat for Portfolio

Problem — Visitors wanted to ask questions about my work and get accurate, contextual answers.
Why it was hard — Building a small, reliable RAG pipeline with minimal infra and clear observability.
Tech stack — Next.js, Django REST, OpenAI / Llama, Qdrant, PostgreSQL
Architecture — Next.js frontend → Django API → embedding + vector search → LLM → response. Optional caching and rate limiting.
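The optional caching and rate-limiting layer might look like the sketch below. `RateLimiter` and `CachedChat` are illustrative names, and `answer_fn` stands in for the embedding-plus-LLM call; the real service wires these into Django middleware or views.

```python
import time
from collections import deque
from typing import Callable, Dict, Optional

class RateLimiter:
    """Sliding-window limiter: at most `limit` requests per `window` seconds."""
    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self.hits: Dict[str, deque] = {}

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        q = self.hits.setdefault(client_id, deque())
        # Drop timestamps that have aged out of the window.
        while q and now - q[0] > self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False
        q.append(now)
        return True

class CachedChat:
    """Exact-match response cache around the expensive LLM call."""
    def __init__(self, answer_fn: Callable[[str], str]):
        self.answer_fn = answer_fn
        self.cache: Dict[str, str] = {}

    def ask(self, question: str) -> str:
        key = question.strip().lower()
        if key not in self.cache:
            # Only a cache miss pays for embedding + retrieval + LLM.
            self.cache[key] = self.answer_fn(question)
        return self.cache[key]
```

With minimal infra this is often enough: repeated visitor questions ("what stack do you use?") hit the cache, and the limiter keeps a single client from burning through API quota.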

Document Classification Pipeline

Problem — Large volumes of documents needed consistent tagging and routing for downstream workflows.
Why it was hard — Throughput, cost control, and handling edge cases without manual review.
Tech stack — Python, Transformers, Celery, PostgreSQL, S3
Architecture — Ingest → queue (Celery) → embedding + classifier → write labels and metadata → trigger workflows.
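The classify-and-route step can be sketched as follows. The keyword scorer is a stand-in for the Transformer classifier, the label set and threshold are invented for illustration, and the Celery fan-out, S3 ingest, and PostgreSQL writes are omitted; a confidence threshold is one common way to keep edge cases out of downstream workflows.

```python
from dataclasses import dataclass

# Hypothetical label set; the real pipeline scores with a Transformer model.
RULES = {
    "invoice": ("invoice", "payment due", "amount"),
    "contract": ("agreement", "party", "hereby"),
}

@dataclass
class Result:
    doc_id: str
    label: str
    confidence: float
    routed_to: str = ""

def classify(text: str) -> tuple:
    # Stand-in scorer: fraction of a label's keywords found in the text.
    best, score = "other", 0.0
    low = text.lower()
    for label, kws in RULES.items():
        s = sum(kw in low for kw in kws) / len(kws)
        if s > score:
            best, score = label, s
    return best, score

def process(doc_id: str, text: str, threshold: float = 0.5) -> Result:
    label, conf = classify(text)
    result = Result(doc_id, label, conf)
    # Confident labels trigger the downstream workflow directly;
    # low-confidence docs fall back to a review queue.
    result.routed_to = f"workflow:{label}" if conf >= threshold else "queue:review"
    return result
```

In the real pipeline each `process` call would be a Celery task, so throughput scales by adding workers and cost is controlled by batching model inference per worker.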