Engineering

AI Infrastructure for Startups

Mukesh Barnwal1 min read
startupsinfrastructureRAGAPIs

Startups need to move fast without building a platform they don't need yet. Here's a pragmatic approach to AI infrastructure.

Start with managed services

Use managed offerings for embeddings, vector search, and LLM APIs. Avoid self-hosting until you have clear scale or cost drivers. Focus engineering time on product and data quality, not cluster ops.

Keep the stack simple

A typical MVP stack:

  • Embeddings — OpenAI or a single open-source model (e.g., BGE) via an API.
  • Vector DB — Pinecone, Qdrant Cloud, or Supabase pgvector.
  • LLM — One primary API (OpenAI or Anthropic) with a simple fallback if needed.

Add complexity only when you hit limits: rate limits, cost, or latency.

Design for change

  • Abstract the LLM and embedding calls behind your own interfaces so you can swap providers or add caching later.
  • Version your prompts and document your evaluation set so you can iterate safely.

Startups that ship AI features quickly and then refine infrastructure based on real usage usually win over those that build the "perfect" stack first.