Blog

Latest insights, tutorials, and news about AI governance and guardrails

Miriam Kümmel & Mathis Lucka•2026-05-07•12 min read

Improving knowledge graph creation in life sciences through agent steering

Agent steering intercepts agents mid-run to provide state-specific feedback, improving completeness, hallucination rates, and entity resolution by up to 14 percentage points for knowledge graph creation in life sciences.

Miriam Kümmel & Mathis Lucka•2026-04-26•10 min read

Hallucination Detection Comparison

What's the best tool for hallucination detection? We put 7 of them to the test.

Miriam Kümmel & Mathis Lucka•2026-04-15•15 min read

PlaceboBench: An LLM Hallucination Benchmark for Pharma

We created PlaceboBench, a challenging pharmaceutical RAG benchmark based on real clinical questions and official EMA documents. Twelve state-of-the-art LLMs show hallucination rates between 24% and 64%.

Mathis Lucka•2025-12-09•5 min read

What does it cost to do OCR with Large Language Models?

Our LLM OCR Cost Calculator lets AI builders compare PDF parsing costs across different LLM providers and models.

Miriam Kümmel•2025-11-20•12 min read

RAGTruth++: Enhanced Hallucination Detection Benchmark

We enhanced the RAGTruth benchmark by finding 10x more hallucinations through automated detection and human review.

Miriam Kümmel•2025-11-12•8 min read

Why Hallucination Benchmarks Miss the Mark

The gap between hallucination benchmarks and production reality, and what bears have to do with it.