We want to make our own knowledge searchable for AI but do not know which parts we need

A working RAG (retrieval-augmented generation) pipeline has four blocks: sources, ingest with chunking, a vector database, and the query layer that feeds the model. For an SMB, keep each block as simple as possible and only extend it when you hit a real bottleneck.
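To make the four blocks concrete, here is a minimal sketch in plain Python. It is an illustration under loud assumptions, not a production setup: `embed` is a toy bag-of-words stand-in for a real embedding model, `InMemoryIndex` stands in for the vector database, and the file names and sample sentences are made up.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40, overlap=10):
    """Ingest block: split text into overlapping word windows."""
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """ASSUMPTION: toy embedding (word counts), not a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryIndex:
    """ASSUMPTION: stand-in for pgvector, Qdrant or Pinecone."""

    def __init__(self):
        self.rows = []  # each row: (source, chunk, vector)

    def add(self, source, text):
        for chunk in chunk_text(text):
            self.rows.append((source, chunk, embed(chunk)))

    def top_k(self, question, k=3):
        q = embed(question)
        scored = [(cosine(q, vec), src, chunk) for src, chunk, vec in self.rows]
        scored.sort(key=lambda row: row[0], reverse=True)
        return scored[:k]


def build_prompt(question, hits):
    """Query layer: paste the retrieved chunks, tagged with their source."""
    context = "\n".join(f"[{src}] {chunk}" for _, src, chunk in hits)
    return (
        "Answer using only the context below and cite the sources.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Sources block: two made-up documents.
index = InMemoryIndex()
index.add("handbook.txt", "Employees request vacation through the HR portal "
                          "at least two weeks in advance.")
index.add("it-procedures.txt", "To reset a password, open the self-service "
                               "page and confirm with your phone.")

question = "How do I reset my password?"
hits = index.top_k(question, k=1)
prompt = build_prompt(question, hits)
print(prompt)
```

Because every chunk keeps its source name, the query layer can show where an answer came from. In a real pipeline the prompt would then be sent to the model of your choice.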

Try this first

  1. Sources: decide which documents really matter. Start with SharePoint, Drive, or one specific folder of handbooks and procedures. Do not load everything at once, or you will never be able to debug what is in the index.
  2. Ingest: pick a tool or script that fetches the sources, converts them to text, splits the text into chunks, and stores the chunks as embeddings. For an SMB, an open-source ingest tool or a SaaS such as Vectorize or Carbon is usually enough.
  3. Vector database: from a few thousand up to a million chunks, pgvector on Postgres, Qdrant, and Pinecone are roughly interchangeable. Pick the one closest to where your other data already lives.
  4. Query layer: a thin app or n8n flow takes the user's question, fetches the top-k chunks, pastes them into the prompt as context, and sends it to the model. Show the sources under the answer, not just the answer.
  5. Eval: test with twenty real questions whose correct answers you know. Only then do you know the pipeline works. Include questions where the right answer is 'not in our docs' too, because testing for hallucinations matters.
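The eval step can be sketched as a small harness. This is one possible shape, not a standard tool: `EvalCase`, `run_eval` and `fake_pipeline` are hypothetical names, the questions and answers are made up, and the pass check is a simple substring match that a real eval would likely replace with something stricter.

```python
from dataclasses import dataclass

# Phrase the pipeline is expected to use when the answer is not in the index.
NOT_IN_DOCS = "not in our docs"


@dataclass
class EvalCase:
    question: str
    expected: str  # substring the answer must contain, or NOT_IN_DOCS


def run_eval(answer_fn, cases):
    """Run every case through answer_fn and return (passed_count, failures)."""
    failures = []
    for case in cases:
        answer = answer_fn(case.question)
        if case.expected.lower() not in answer.lower():
            failures.append((case.question, answer))
    return len(cases) - len(failures), failures


def fake_pipeline(question):
    """ASSUMPTION: stand-in for the real query layer, with canned answers."""
    canned = {
        "How many vacation days do we get?": "You get 25 vacation days per year.",
    }
    return canned.get(question, "That is not in our docs.")


cases = [
    EvalCase("How many vacation days do we get?", "25"),
    # A question the documents cannot answer: the pipeline should say so.
    EvalCase("What is the CEO's shoe size?", NOT_IN_DOCS),
]

passed, failures = run_eval(fake_pipeline, cases)
print(f"{passed}/{len(cases)} passed")
for question, answer in failures:
    print(f"FAILED: {question!r} -> {answer!r}")
```

Swap `fake_pipeline` for a function that calls your real query layer, and grow `cases` to the twenty real questions. Rerun it after every change to chunking, retrieval, or the prompt.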

When to bring us in

If you want us to set up the first pipeline in a day, using your own documents and an eval set of your most-asked questions, we can do that.

Or skip the DIY entirely

Our Managed IT clients do not look these things up. One point of contact, a fixed monthly price, resolved within working hours.