
Miriam Kümmel & Mathis Lucka•2026-04-15•15 min read
We created PlaceboBench, a challenging pharmaceutical RAG benchmark based on real clinical questions and official EMA documents. Twelve state-of-the-art LLMs show hallucination rates between 24% and 64%.
Mathis Lucka•2025-12-09•5 min read
Our LLM OCR Cost Calculator lets AI builders compare PDF parsing costs across LLM providers and models.
Miriam Kümmel•2025-11-20•12 min read
We enhanced the RAGTruth benchmark by finding 10x more hallucinations through automated detection and human review.
Miriam Kümmel•2025-11-12•8 min read
The gap between hallucination benchmarks and production reality, and what bears have to do with it.