Which embedding model do I use for my RAG: OpenAI, BGE, or Cohere?

Embeddings turn text into vectors for semantic search. The models differ in language quality (especially for Dutch), cost per million tokens, and vendor lock-in. For an SMB with Dutch content, a multilingual model almost always beats an English-only one.
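To make the mechanics concrete, here is a minimal sketch of vector search. The `toy_embed` function is a stand-in, not a real model; real embedding models return dense vectors of hundreds to thousands of floats, but the retrieval step (cosine similarity against pre-embedded documents) works the same way:

```python
import math

def toy_embed(text: str) -> list[float]:
    # Stand-in for a real embedding model: a tiny bag-of-characters
    # vector, purely to make the example runnable without an API key.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity: the standard ranking metric for embeddings.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Embed the documents once, then score each query against them.
docs = ["openingstijden kantoor", "factuur betalen", "wachtwoord resetten"]
doc_vecs = [toy_embed(d) for d in docs]

query = "hoe reset ik mijn wachtwoord"
scores = [cosine(toy_embed(query), v) for v in doc_vecs]
best = docs[scores.index(max(scores))]  # best-matching document
```

Swapping `toy_embed` for a real model client is the only change needed; the search logic stays identical across OpenAI, Cohere, and self-hosted models.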

Try this first

  1. OpenAI text-embedding-3-small or -large: simple, reliable, decent multilingual. Cheap on tokens. Vendor coupling with OpenAI, and you ship documents to their API.
  2. Cohere Embed v3 (multilingual): strong multilingual, explicit query and document modes. Cheap. EU data location via Bedrock in Frankfurt.
  3. BGE M3 or multilingual-e5: open source, self-host on GPU or even CPU. No token billing, hosting cost instead. Pick this for on-prem or strict data locality.
  4. Test on your own content: take twenty real questions, embed your documents with each model, and check which model's top-3 hits are most correct for your domain. Domain fit beats benchmark score.
  5. Pin the model and version in code. An upgrade means re-embedding everything, so plan it as a migration, not a tweak.
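Step 4 can be scripted as a small eval harness. A minimal sketch, assuming you wrap each candidate model in an `embed(text) -> vector` function; the `char_embed` stub below is a placeholder for those real clients, purely to keep this runnable:

```python
# Top-3 retrieval eval: embed the corpus once per model, then check
# whether the known-correct document lands in the top 3 for each
# test question. Higher hit rate = better domain fit.
import math
from typing import Callable

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def char_embed(text: str) -> list[float]:
    # Placeholder embedder; swap in a real model client here.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def top3_hit_rate(embed, docs, cases):
    doc_vecs = [embed(d) for d in docs]  # embed corpus once per model
    hits = 0
    for question, correct_doc in cases:
        q = embed(question)
        ranked = sorted(range(len(docs)),
                        key=lambda i: cosine(q, doc_vecs[i]), reverse=True)
        if correct_doc in [docs[i] for i in ranked[:3]]:
            hits += 1
    return hits / len(cases)

docs = ["wachtwoord resetten", "factuur betalen",
        "openingstijden kantoor", "vpn instellen"]
cases = [("hoe reset ik mijn wachtwoord", "wachtwoord resetten"),
         ("wanneer is het kantoor open", "openingstijden kantoor")]

# One entry per candidate model; only a stub is shown here.
models: dict[str, Callable] = {"stub-model": char_embed}
for name, embed in models.items():
    print(name, top3_hit_rate(embed, docs, cases))
```

With twenty real questions and your own documents, the hit rates per model give a direct, domain-specific comparison that no public benchmark provides.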

When to bring us in

Want a comparison on your own documents and questions? We can run all three models against an eval set.

None of the above fits?

Describe your situation below. We pass your input plus the steps you already saw to our AI and return tailored next-step advice. If it's too risky to DIY, we'll say so.

Who are you?

For the AI question we need your email and company name, so we can follow up if the AI gets stuck and to prevent abuse.

Limited to 2 questions per hour and 5 per day, kept lean so the AI stays useful. For more than that, contacting us directly works better for both of us.

Or skip the DIY entirely

Our Managed IT clients do not look these things up. One point of contact, a fixed monthly price, resolved within working hours.