Abstract:
Large Language Models (LLMs) have achieved remarkable capabilities across diverse applications, yet their tendency to generate hallucinations, fluent but factually incorrect outputs, poses significant challenges for deployment in critical domains. While numerous detection strategies exist, practical constraints often limit access to model internals, necessitating black-box approaches that operate solely on inputs and outputs. This systematic literature review examines hallucination detection methods specifically designed for black-box LLMs under zero-resource constraints, where no additional training data or model modifications are permitted.
Following Kitchenham’s evidence-based systematic review protocol, we analyzed 15 peer-reviewed studies published between 2020 and 2025 that satisfy rigorous inclusion criteria. Our analysis reveals six primary detection categories: self-consistency verification, uncertainty quantification, fact-level verification, metamorphic testing, contradiction detection, and multi-agent validation. Performance analysis shows methods achieving F1 scores up to 0.82, though significant trade-offs exist between accuracy, computational cost, and external dependencies. Key findings indicate that hybrid approaches combining multiple detection signals outperform single-method strategies, while zero-resource techniques demonstrate competitive performance without external knowledge requirements. However, substantial challenges remain in cross-domain generalization, computational scalability, and standardized evaluation. This review provides a structured taxonomy for practitioners and identifies critical research gaps, including multimodal robustness, cross-lingual effectiveness, and interpretability requirements for high-stakes applications.