
Miriam Kümmel & Mathis Lucka•2026-02-17•15 min read
We created PlaceboBench, a challenging pharmaceutical RAG benchmark based on real clinical questions and official EMA documents. Seven state-of-the-art LLMs show hallucination rates between 26% and 64%.
Read more
Mathis Lucka•2025-12-09•5 min read
Our LLM OCR Cost Calculator lets AI builders compare PDF parsing costs across different LLM providers and models.
Read more
Miriam Kümmel•2025-11-20•12 min read
We enhanced the RAGTruth benchmark by finding 10x more hallucinations through automated detection and human review.
Read more
Miriam Kümmel•2025-11-12•8 min read
The gap between hallucination benchmarks and production reality, and what bears have to do with it.
Read more