
Miriam Kümmel & Mathis Lucka•2026-05-07•12 min read
Agent steering intercepts agents mid-run to provide state-specific feedback, improving completeness, hallucination rates, and entity resolution by up to 14 percentage points for knowledge graph creation in life sciences.
Read more
Miriam Kümmel & Mathis Lucka•2026-04-26•10 min read
What's the best tool for hallucination detection? We put 7 of them to the test.
Read more
Miriam Kümmel & Mathis Lucka•2026-04-15•15 min read
We created PlaceboBench, a challenging pharmaceutical RAG benchmark based on real clinical questions and official EMA documents. Twelve state-of-the-art LLMs show hallucination rates between 24% and 64%.
Read more
Mathis Lucka•2025-12-09•5 min read
Our LLM OCR Cost Calculator lets AI builders compare PDF parsing costs across different LLM providers and models.
Read more
Miriam Kümmel•2025-11-20•12 min read
We enhanced the RAGTruth benchmark by finding 10x more hallucinations through automated detection and human review.
Read more
Miriam Kümmel•2025-11-12•8 min read
The gap between hallucination benchmarks and production reality, and what bears have to do with it.
Read more