Does a reranker actually help on top of my vector search?
A reranker (Cohere Rerank, Voyage rerank-2, BGE reranker) re-orders the top 20 or 50 chunks from vector search using a finer scoring model. For RAG with a few thousand documents it can make a real difference, but it is not magic and it adds latency and tokens.
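To make the two-stage idea concrete, here is a minimal sketch of the rerank step. It does not call any vendor API: `score_fn` is a hypothetical placeholder for whichever reranker you use, and `word_overlap` is a toy scorer invented here so the example runs on its own.

```python
from typing import Callable, List, Sequence


def rerank(query: str, candidates: Sequence[str],
           score_fn: Callable[[str, str], float], top_n: int = 5) -> List[str]:
    """Re-order vector-search candidates by a finer relevance score, keep top_n."""
    scored = [(score_fn(query, doc), i, doc) for i, doc in enumerate(candidates)]
    # Highest score first; the original index breaks ties, so the
    # vector-search order is preserved among equally scored chunks.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [doc for _, _, doc in scored[:top_n]]


def word_overlap(query: str, doc: str) -> float:
    """Toy stand-in scorer (assumption): share of query words found in the chunk."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0


# Pretend these chunks came back from vector search, in that order.
candidates = [
    "reset your router",
    "how to reset a forgotten password",
    "password policy overview",
]
result = rerank("reset password", candidates, word_overlap, top_n=2)
print(result)
```

In a real pipeline you would replace `word_overlap` with a call to your reranker and pass it the top 20 to 50 chunks from vector search.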
Try this first
- 1. Measure your baseline first: how often is the right chunk in the top 3 without a reranker? If that is already the case for around 90 percent of queries, a reranker is unlikely to add much. At 60 percent, gains are likely.
- 2. Test Cohere Rerank 3 or Voyage rerank-2 on your eval set. Take the same top-50 from vector search, rerank it, then check the top 3 again. If you see 15 to 25 percent more correct answers, it is worth it.
- 3. Account for cost and latency: a rerank step adds 100 to 500 ms per query plus an API call. For an internal tool that is fine; for a live customer chatbot it matters.
- 4. Combine smartly: top-3 straight from vector search is the cheaper option; with a reranker, take top-50 from vector search and keep the top 5 after reranking. Do not over-engineer either setup.
- 5. Pin the reranker version, just as you pin the embedding model. Models get deprecated.
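The first two steps can be sketched as a small evaluation script. This is an illustration under assumptions: the function names, the chunk ids, and the four-query eval set are made up, and the gain is computed in percentage points, which is one possible reading of "15 to 25 percent more correct answers".

```python
from typing import Dict, List


def hit_rate_at_k(rankings: Dict[str, List[str]], gold: Dict[str, str], k: int = 3) -> float:
    """Share of queries whose correct chunk id appears in the first k results."""
    if not gold:
        return 0.0
    hits = sum(1 for q, g in gold.items() if g in rankings.get(q, [])[:k])
    return hits / len(gold)


def rerank_gain(baseline: Dict[str, List[str]], reranked: Dict[str, List[str]],
                gold: Dict[str, str], k: int = 3) -> float:
    """Absolute change in hit rate at k after reranking (0.25 = 25 points)."""
    return hit_rate_at_k(reranked, gold, k) - hit_rate_at_k(baseline, gold, k)


# Hypothetical eval set: query id -> id of the chunk holding the right answer.
gold = {"q1": "c1", "q2": "c7", "q3": "c4", "q4": "c9"}

# Ranked chunk ids per query, before and after reranking the same candidates.
vector_only = {
    "q1": ["c1", "c2", "c3", "c8"],
    "q2": ["c5", "c6", "c2", "c7"],
    "q3": ["c4", "c1", "c0", "c3"],
    "q4": ["c2", "c3", "c5", "c9"],
}
reranked = {
    "q1": ["c1", "c8", "c2", "c3"],
    "q2": ["c7", "c5", "c6", "c2"],
    "q3": ["c4", "c3", "c1", "c0"],
    "q4": ["c2", "c3", "c5", "c9"],
}

baseline_rate = hit_rate_at_k(vector_only, gold, k=3)
reranked_rate = hit_rate_at_k(reranked, gold, k=3)
gain = rerank_gain(vector_only, reranked, gold, k=3)
print(f"baseline={baseline_rate:.2f} reranked={reranked_rate:.2f} gain={gain:.2f}")
```

Four queries are far too few for a real decision; the point is only that both runs must use the same queries and the same candidate set, so the difference reflects the reranker and nothing else.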
When to bring us in
Want us to measure the reranker gain on your own content? We can run the A/B test in a day.
See also
- Can I paste a customer file or email into ChatGPT? Depends on the account and settings. Free ChatGPT and a Team tenant behave very differently from what most people assume.
- I want a one-page AI policy for my team. A real one-pager beats a thick document nobody reads. Four headers and concrete examples.
- How do I tell if an AI answer is made up? Models sound confident even when they are wrong. A few habits catch most mistakes.