My document is bigger than the model context window, now what?
Context windows are larger than before but still finite. Stuffing a 200-page report into one prompt sometimes works, but the answers tend to be slow and unreliable. Three patterns address this: chunking plus retrieval, hierarchical summarisation, and a long-context model for specific tasks.
Try this first
1. First decide the task: are you searching for a fact (retrieval-augmented generation, RAG, fits) or synthesising over the whole document (summarisation fits)? The right pattern depends on the goal.
2. For search: chunk and embed the document as in RAG, and pull only the relevant pieces into the prompt. There is no need to paste all 200 pages to answer one question.
3. For synthesis: use hierarchical summarisation. Generate a summary per chapter, then a summary of the summaries. The chain costs more tokens but gives more coherent results.
4. For specific tasks (legal lookup, contract comparison), long-context models (Claude Sonnet, Gemini Pro with a large window) are an option. Mind the cost: one 500K-token prompt can be pricier than a whole day of normal chat.
5. Measure 'lost in the middle': models often recall the middle of a long context less reliably than the start or end. Test by placing known facts at different positions and seeing what the model recalls.
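
For the search step, the chunk-and-retrieve idea can be sketched roughly as follows. This is a minimal illustration, not a production RAG setup: `chunk_text`, `vectorize`, `cosine` and `top_chunks` are hypothetical helper names, and plain word counts stand in for a real embedding model so the example runs without any external service.

```python
import math
import re
from collections import Counter


def chunk_text(text, size=200, overlap=40):
    """Split text into chunks of `size` words that overlap by `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    words = text.split()
    chunks = []
    start = 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
        start += size - overlap
    return chunks


def vectorize(text):
    # Assumption: word counts as a stand-in for an embedding model.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_chunks(question, chunks, k=3):
    """Return the k chunks most similar to the question."""
    q = vectorize(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, vectorize(c)), reverse=True)
    return ranked[:k]
```

Only the chunks returned by `top_chunks` would go into the prompt, together with the question. In a real setup you would likely replace `vectorize` with calls to an embedding model and keep the vectors in a store, so the document is embedded once rather than on every question.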
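
For the synthesis step, hierarchical summarisation comes down to a short loop. The sketch below assumes you supply your own `summarize` function that sends text to a model and returns its summary; that function, `hierarchical_summary` and the `group_size` parameter are hypothetical names chosen for illustration.

```python
from typing import Callable, List


def hierarchical_summary(
    sections: List[str],
    summarize: Callable[[str], str],
    group_size: int = 5,
) -> str:
    """Summarise each section, then summarise groups of summaries until one is left."""
    if group_size < 2:
        raise ValueError("group_size must be at least 2")
    if not sections:
        return ""
    # Level 1: one summary per chapter or section.
    level = [summarize(section) for section in sections]
    # Higher levels: merge summaries in groups until a single summary remains.
    while len(level) > 1:
        groups = [level[i:i + group_size] for i in range(0, len(level), group_size)]
        level = [summarize("\n\n".join(group)) for group in groups]
    return level[0]
```

Each level adds model calls, which is where the extra token cost comes from: seven sections with `group_size=3` take seven calls for the section summaries, three for the next level and one for the final summary.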
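
The 'lost in the middle' check can be set up along these lines. This is a sketch under stated assumptions: `ask_model` is a hypothetical function you provide that sends a prompt to your model and returns its answer as a string, and `build_probe` and `run_probe` are illustrative names.

```python
def build_probe(filler_paragraphs, needle, position):
    """Insert `needle` at a relative position: 0.0 is the start, 1.0 is the end."""
    index = round(position * len(filler_paragraphs))
    parts = filler_paragraphs[:index] + [needle] + filler_paragraphs[index:]
    return "\n\n".join(parts)


def run_probe(ask_model, filler_paragraphs, needle, question, expected,
              positions=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Return {position: True/False} showing where the model recalled the fact."""
    results = {}
    for position in positions:
        prompt = build_probe(filler_paragraphs, needle, position)
        prompt += "\n\nQuestion: " + question
        answer = ask_model(prompt)
        # Simple check: does the expected string appear in the answer?
        results[position] = expected.lower() in answer.lower()
    return results
```

Use filler text of roughly the same length as your real document and a fact the model cannot guess, such as a made-up access code. Repeating the run with a few different facts gives a steadier picture than a single probe.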
When to bring us in
Want us to decide whether RAG, hierarchical summarisation or a long-context model is right for your case? We can map it out.
See also
- Can I paste a customer file or email into ChatGPT? Depends on the account and settings. Free ChatGPT and a Team tenant behave very differently from what most people assume.
- I want a one-page AI policy for my team. A real one-pager beats a thick document nobody reads. Four headers and concrete examples.
- How do I tell if an AI answer is made up? Models sound confident even when they are wrong. A few habits catch most mistakes.