On-premises document processing for health under strict privacy rules

For health, government, and businesses with sensitive data: document analysis and classification without a byte going abroad.

A Dutch healthcare provider wanted file summaries and automated finding classification, but patient data through a foreign API was not allowed. We built a setup that does the work inside their own walls, no lock-in, with an audit trail.

The situation

Many organisations sit on data that legally or contractually cannot leave the EU, let alone be processed by a third-party API. Hospitals with patient records. Government bodies with classified documents. Law firms with confidential client data. Pharma R&D with research data. Defence suppliers. And a growing group of SMBs working under NIS2, GDPR or sectoral regulators.

They still have a real workload: summarising documents, classifying them, searching across them, or an internal knowledge search over their own procedures. The question was not "do we want automation?"; that was already answered. The question was whether it could run in-house at reasonable cost and in reasonable time.

What we did

On this engagement we worked in five phases over four months:

Week 1-2: discovery. Which use-cases could actually run on-prem (summarising files, classifying findings, an internal knowledge chatbot for protocols), who would use them, and what quality and privacy requirements applied.

Week 3-5: hardware. We advised against the "buy the biggest GPU server" they initially wanted. For the expected volume a single-GPU workstation with an RTX 6000 Ada (around 18,000 euro) was enough, with a clear upgrade path. Procurement and on-site setup took three weeks.

Week 6-7: model selection and benchmarking. We tested three open-source models (Llama 3 70B, Qwen 3 32B, Mistral Large) against real example files. Qwen 3 won on quality-per-watt for their specific use-cases.
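To give a feel for how the comparison worked, here is a minimal sketch of the kind of automated scoring we ran alongside specialist review of the outputs. The token-overlap F1 metric and the function names are illustrative assumptions, not the exact harness from the engagement:

```python
def token_f1(candidate: str, reference: str) -> float:
    """Token-overlap F1 between a model summary and a reference summary.

    A deliberately simple metric: what fraction of reference tokens the
    model recovered, balanced against how much it padded the answer.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    # Count reference tokens, then consume them as the candidate matches.
    ref_counts: dict[str, int] = {}
    for t in ref:
        ref_counts[t] = ref_counts.get(t, 0) + 1
    overlap = 0
    for t in cand:
        if ref_counts.get(t, 0) > 0:
            ref_counts[t] -= 1
            overlap += 1
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def rank_models(outputs: dict[str, str], reference: str) -> list[tuple[str, float]]:
    """Rank candidate models on one reference file, best score first."""
    scores = [(name, token_f1(text, reference)) for name, text in outputs.items()]
    return sorted(scores, key=lambda s: s[1], reverse=True)
```

Automated scores like this catch regressions cheaply across the whole test set; the final call on quality-per-watt still came from specialists reading real summaries.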

Week 8-13: application layer. RAG pipeline built on top of existing document stores, a simple web interface for specialists, and integration with their existing EHR system for "push this summary back into the file". Audit logging on every call, so an inspector can see what was asked, when, and by whom.
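A sketch of what one audit entry can look like. The field names and the append-only JSONL file are illustrative assumptions, not the exact schema from the engagement; the point is that every call records who asked what, when, and over which documents, without copying patient data into the log itself:

```python
import hashlib
import json
from datetime import datetime, timezone


def audit_record(user: str, question: str, doc_ids: list[str], answer: str) -> dict:
    """Build one audit entry: who asked what, when, over which documents."""
    return {
        "ts": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "question": question,
        "doc_ids": doc_ids,
        # Hash of the answer so the trail proves what was returned
        # without duplicating sensitive content into the log.
        "answer_sha256": hashlib.sha256(answer.encode("utf-8")).hexdigest(),
    }


def append_audit(path: str, record: dict) -> None:
    """Append one JSON line; the file itself is the inspector-readable trail."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```

One line per call, append-only, plain JSON: an inspector can walk the trail with nothing more exotic than `grep`.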

Week 14-16: pilot, hardening, acceptance and handover. Three specialists worked with the pilot for two weeks, gave feedback, we tuned, and rollout went wider.

What it delivered

After four months:

- 100% of processing on-prem, no byte going overseas.
- Time saved per specialist on administration: 45 minutes per day on average (measured across a 12-week pilot).
- Audit trail on every call, easy to walk through under inspection.
- Total project cost: 73,000 euro (hardware + engineering). Operating cost: around 2,200 euro per month for maintenance, monitoring and model updates.
- Cloud-API comparison: legally not an option, so the ROI sits in specialist time saved (against their loaded cost the project pays back in 8-10 months, excluding the compliance value).
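The payback figure is simple arithmetic on the numbers above. The specialist count and loaded hourly rate below are illustrative assumptions, not figures from the engagement; plug in your own:

```python
# Figures from the project.
PROJECT_COST = 73_000        # euro, hardware + engineering
OPEX_MONTH = 2_200           # euro/month: maintenance, monitoring, model updates
MINUTES_SAVED_PER_DAY = 45   # per specialist, measured over the pilot
WORKDAYS_PER_MONTH = 21

# Illustrative assumptions -- NOT from the engagement.
specialists = 7              # headcount using the system daily
loaded_rate = 95             # euro/hour loaded cost per specialist

hours_saved = specialists * MINUTES_SAVED_PER_DAY / 60 * WORKDAYS_PER_MONTH
net_monthly = hours_saved * loaded_rate - OPEX_MONTH
payback_months = PROJECT_COST / net_monthly
print(f"payback: {payback_months:.1f} months")
```

With these assumed inputs the result lands inside the 8-10 month range quoted above; the compliance value comes on top and is not priced in.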

No vendor lock-in: the stack is open source. If we leave tomorrow it keeps running on their own hardware with their own team.

What this wasn't

Not a "we have the biggest GPU cluster in NL" vanity build. Not a black box no one can read. Not a multi-year license contract you can't exit. What it was: a focused setup matched to the real use-cases, with hardware that fits and software the in-house team can run.