Abstract:
Large Language Models (LLMs) have achieved remarkable capabilities across diverse applications, yet their tendency to generate hallucinations, fluent but factually incorrect outputs, poses significant challenges for deployment in critical domains. While numerous detection strategies exist, practical constraints often limit access to model internals, necessitating black-box approaches that operate solely on inputs and outputs. This systematic literature review examines hallucination detection methods specifically designed for black-box LLMs under zero-resource constraints, where no additional training data or model modifications are permitted.
Following Kitchenham’s evidence-based systematic review protocol, we analyzed 15 peer-reviewed studies published between 2020 and 2025 that satisfy rigorous inclusion criteria. Our analysis reveals six primary detection categories: self-consistency verification, uncertainty quantification, fact-level verification, metamorphic testing, contradiction detection, and multi-agent validation. Performance analysis shows methods achieving F1 scores up to 0.82, though significant trade-offs exist between accuracy, computational cost, and external dependencies. Key findings indicate that hybrid approaches combining multiple detection signals outperform single-method strategies, while zero-resource techniques demonstrate competitive performance without external knowledge requirements. However, substantial challenges remain in cross-domain generalization, computational scalability, and standardized evaluation. This review provides a structured taxonomy for practitioners and identifies critical research gaps, including multimodal robustness, cross-lingual effectiveness, and interpretability requirements for high-stakes applications.