Blog — lesspoo

The cheapest part of RAG is the embeddings

Why most RAG systems return crap even when the retrieval pipeline works, and the corpus-curation work that actually fixes it.

How to build the evaluation pipeline a RAG team will actually keep using, in five pieces, with the loop that prevents abandonment.

A guest post from the SaaSPerform team on the five places production RAG latency actually lives, and why the embeddings are usually fine.