From Model Building to LLM Integration

November 7, 2025

A short guide to shipping useful AI features with clear evaluation, safe integration, and predictable costs.


Start with the product constraint

Before choosing a model, write down what “good” means for the feature:

  • Latency and cost budgets
  • Allowed failure modes (and what the UI should do)
  • Data boundaries (PII, retention, logging)
  • A measurable success metric (accuracy, resolution rate, time saved)
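
One way to make these constraints concrete is to keep them next to the feature as a typed object. The sketch below is illustrative only: the FeatureConstraints shape, the field names, and every number are assumptions, not a standard.

```typescript
// Hypothetical shape for a feature's constraints; names and values are illustrative.
interface FeatureConstraints {
  maxLatencyMs: number;         // latency budget per request
  maxCostPerRequestUsd: number; // cost budget per request
  onFailure: "hide" | "show_fallback" | "escalate_to_human"; // what the UI does
  logPrompts: boolean;          // data boundary: may raw prompts be logged?
  retentionDays: number;        // data boundary: how long outputs are kept
  successMetric: { name: string; target: number };
}

interface Observation {
  latencyMs: number;
  costUsd: number;
}

// Returns the budgets a single request exceeded (empty array means within budget).
function violations(c: FeatureConstraints, o: Observation): string[] {
  const out: string[] = [];
  if (o.latencyMs > c.maxLatencyMs) out.push("latency");
  if (o.costUsd > c.maxCostPerRequestUsd) out.push("cost");
  return out;
}

// Example constraints for a hypothetical ticket-summary feature.
const summarizeFeature: FeatureConstraints = {
  maxLatencyMs: 2000,
  maxCostPerRequestUsd: 0.01,
  onFailure: "show_fallback",
  logPrompts: false,
  retentionDays: 30,
  successMetric: { name: "resolution_rate", target: 0.8 },
};
```

Checking each request against the object (for example, violations(summarizeFeature, observation)) turns the budget from a document into something a dashboard or alert can track.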

Model building: prefer baselines and iteration

If you’re training a model:

  • Begin with a baseline (logistic regression, gradient boosting, a small transformer).
  • Use a clean split strategy (time-based splits for temporal data).
  • Track features, training data versions, and metrics per run.
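
A time-based split can be sketched in a few lines. This assumes each row carries a numeric timestamp (epoch milliseconds here); the Row type and the timeSplit name are hypothetical.

```typescript
// Assumes every row has a numeric timestamp (epoch milliseconds).
interface Row {
  timestamp: number;
}

// Everything strictly before the cutoff is training data; the rest is held out.
// No shuffling happens, so later rows never end up in the training set.
function timeSplit<T extends Row>(
  rows: T[],
  cutoff: number,
): { train: T[]; test: T[] } {
  const train = rows.filter((r) => r.timestamp < cutoff);
  const test = rows.filter((r) => r.timestamp >= cutoff);
  return { train, test };
}
```

A random split on temporal data tends to overstate quality, because the model is scored on periods it has effectively already seen.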

Treat “model quality” as more than a single number: an aggregate metric can hide error clusters and edge cases, so inspect those directly.

LLM integration: treat it like a distributed system

Hosted LLM APIs are non-deterministic and rate-limited. Build for that:

  • Use structured outputs (JSON schema) and validate every response before using it.
  • Add retries with backoff and idempotency keys.
  • Implement fallbacks (smaller model, cached answer, human-in-the-loop).
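
The retry bullet can be sketched as follows. This is a minimal version under stated assumptions: withRetry, backoffDelayMs, and RetryOptions are hypothetical names, the operation is any async function you supply, and no specific LLM SDK is implied.

```typescript
// Hypothetical helper names; not tied to any specific LLM SDK.
interface RetryOptions {
  maxAttempts: number;
  baseDelayMs: number;
  maxDelayMs: number;
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

// Capped exponential backoff: base * 2^attempt, never more than maxDelayMs.
function backoffDelayMs(attempt: number, baseDelayMs: number, maxDelayMs: number): number {
  return Math.min(maxDelayMs, baseDelayMs * 2 ** attempt);
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Calls `op` with the same idempotency key on every attempt, so a server that
// supports such keys can deduplicate work if an earlier attempt went through.
async function withRetry<T>(
  op: (idempotencyKey: string) => Promise<T>,
  idempotencyKey: string,
  opts: RetryOptions,
): Promise<T> {
  const sleep = opts.sleep ?? defaultSleep;
  let lastError: unknown;
  for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
    try {
      return await op(idempotencyKey);
    } catch (err) {
      lastError = err;
      // Wait only if another attempt will follow.
      if (attempt < opts.maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt, opts.baseDelayMs, opts.maxDelayMs));
      }
    }
  }
  throw lastError;
}
```

This sketch retries on every error. In practice you would likely retry only errors that are worth retrying (rate limits, timeouts, server errors) and add random jitter to the delay so that many clients do not retry in lockstep.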

Example (validate the response shape before using it):

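// Expected response shape: an answer plus the sources it cites.
type Answer = { answer: string; citations: string[] }

// Type guard: true only if x has a string `answer` and a string[] `citations`.
function isAnswer(x: unknown): x is Answer {
    if (typeof x !== "object" || x === null) return false
    const r = x as Record<string, unknown>
    return typeof r.answer === "string" &&
        Array.isArray(r.citations) &&
        r.citations.every((c) => typeof c === "string")
}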
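
Validation and fallbacks fit together: when one source fails or returns the wrong shape, move on to the next. The sketch below assumes a hypothetical Source abstraction (a name plus an async function) and takes the validator as a parameter, so any type guard can be passed in.

```typescript
// A source is any async producer of a candidate response (hypothetical abstraction),
// e.g. the primary model, a smaller model, or a cache lookup.
type Source = { name: string; run: () => Promise<unknown> };

// Tries sources in order and returns the first response that passes validation.
async function firstValid<T>(
  sources: Source[],
  isValid: (x: unknown) => x is T,
): Promise<{ value: T; from: string } | null> {
  for (const s of sources) {
    try {
      const candidate = await s.run();
      if (isValid(candidate)) return { value: candidate, from: s.name };
    } catch {
      // A failed source is treated like an invalid response: try the next one.
    }
  }
  // Nothing usable: the caller decides what happens next,
  // for example routing the request to a human-in-the-loop queue.
  return null;
}
```

Returning the source name along with the value makes it easy to log how often each fallback is actually used.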

Vector databases: retrieval is a system, not a query

RAG works when the retrieval pipeline is solid:

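  • Treat chunking strategy (size, overlap, metadata) as part of the model: it determines what the generator gets to see.
  • Index with filterable metadata (tenant, permission, doc type) and apply those filters at query time.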
  • Evaluate retrieval separately from generation.
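
Evaluating retrieval on its own can start with a metric like recall@k over a small labeled set of queries. The sketch below assumes you have, for each query, the IDs the retriever returned (in rank order) and the IDs a human marked as relevant; the type and function names are hypothetical.

```typescript
// One labeled example: what the retriever returned vs. what should have been found.
interface RetrievalCase {
  query: string;
  retrievedIds: string[]; // in rank order, best first
  relevantIds: string[];  // ground-truth labels, assumed to contain no duplicates
}

// Fraction of the relevant documents that appear in the top k results.
function recallAtK(retrievedIds: string[], relevantIds: string[], k: number): number {
  if (relevantIds.length === 0) return 0;
  const topK = new Set(retrievedIds.slice(0, k));
  const hits = relevantIds.filter((id) => topK.has(id)).length;
  return hits / relevantIds.length;
}

// Average recall@k across all labeled cases.
function meanRecallAtK(cases: RetrievalCase[], k: number): number {
  if (cases.length === 0) return 0;
  const total = cases.reduce(
    (sum, c) => sum + recallAtK(c.retrievedIds, c.relevantIds, k),
    0,
  );
  return total / cases.length;
}
```

If this number is low, prompt changes are unlikely to help, because the generator never sees the right documents in the first place.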

Agents: scope them tightly

Agents are great for multi-step workflows, but only when bounded:

  • Explicit tools and permissions
  • Tool-call auditing
  • Hard limits on steps, time, and spending

Keep planning separate from execution, and always log the tool trace.
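
These bounds can be sketched as a small executor that takes a plan as plain data and enforces the limits while running it. Everything here is an assumption for illustration: the Limits, ToolCall, and TraceEntry shapes, the synchronous tools, and the idea that each tool reports its own cost.

```typescript
interface Limits { maxSteps: number; maxMillis: number; maxSpendUsd: number }
interface ToolCall { tool: string; input: string }
interface TraceEntry { step: number; tool: string; input: string; output: string; costUsd: number }
type Tool = (input: string) => { output: string; costUsd: number };
type StopReason = "done" | "steps" | "time" | "spend" | "forbidden_tool";

// Executes a pre-built plan. Only tools present in the map may run,
// and every call is appended to the trace for auditing.
function executePlan(
  plan: ToolCall[],
  tools: Map<string, Tool>,
  limits: Limits,
  now: () => number = Date.now,
): { trace: TraceEntry[]; stoppedBecause: StopReason } {
  const trace: TraceEntry[] = [];
  const startedAt = now();
  let spentUsd = 0;
  for (let step = 0; step < plan.length; step++) {
    if (step >= limits.maxSteps) return { trace, stoppedBecause: "steps" };
    if (now() - startedAt > limits.maxMillis) return { trace, stoppedBecause: "time" };
    const call = plan[step];
    const tool = tools.get(call.tool);
    if (tool === undefined) return { trace, stoppedBecause: "forbidden_tool" };
    const { output, costUsd } = tool(call.input);
    spentUsd += costUsd;
    trace.push({ step, tool: call.tool, input: call.input, output, costUsd });
    // Checked after the call, so spending can overshoot by at most one call.
    if (spentUsd > limits.maxSpendUsd) return { trace, stoppedBecause: "spend" };
  }
  return { trace, stoppedBecause: "done" };
}
```

Because the plan is data, it can be reviewed, logged, or rejected before any tool runs, and the returned trace records exactly which calls were made and why execution stopped.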

Hi, I'm Martin Duchev. You can find more about my projects on my GitHub page.