Releasing my latest open-source AI project — Halluciguard, a framework to QC AI agents: https://github.com/prateekt/halluciguard/
Halluciguard uses prompt-engineered ensembles of checker agents to safeguard against LLM hallucinations, detect false claims made by other agents, and rigorously, automatically evaluate and grade agents' work. At its core is a set of engineered prompts that turn LLMs into fact checkers and hallucination detectors for other LLMs.
When deployed across an ensemble of agents, it boosts the accuracy of information flowing through the network (think “boosting”, but with LLM agents). It is especially useful for AI pipelines handling healthcare, medical, or bioinformatics data, or other domains where misinformation has extremely negative consequences.
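For intuition, here is a minimal sketch of the majority-vote idea. The `checkers` below are placeholder callables standing in for real prompt-engineered checker agents, so the names and toy logic are illustrative only, not Halluciguard's actual API:

```python
# Minimal sketch of ensemble checking: each checker independently judges a
# claim, and the ensemble accepts it only if a majority of checkers agree.
from typing import Callable, List

def ensemble_verdict(claim: str, checkers: List[Callable[[str], bool]]) -> bool:
    """Return True if a majority of checker agents judge the claim to be supported."""
    votes = [checker(claim) for checker in checkers]
    return sum(votes) > len(votes) / 2

# Placeholder checkers standing in for LLM-backed agents (e.g., fact_checker.yaml).
checkers = [
    lambda claim: "capital of france" in claim.lower(),  # toy "fact checker" #1
    lambda claim: "paris" in claim.lower(),              # toy "fact checker" #2
    lambda claim: bool(claim.strip()),                   # toy "fact checker" #3
]

print(ensemble_verdict("Paris is the capital of France.", checkers))  # True
```

In a real pipeline, each callable would wrap an LLM call driven by one of the checker prompts described below.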
The prompts folder contains ready-to-use prompts for checker agents, which can be ported to any agentic AI framework (LangChain, Pydantic AI, AutoGPT, etc.).
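For example, a checker prompt could be wired into LangChain roughly like this. The file path, the "prompt" key, and the model name are assumptions for illustration, not the repo's actual schema:

```python
# Hypothetical sketch: load a checker prompt from the repo and run it via LangChain.
import yaml
from langchain_core.messages import SystemMessage
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# Assumed layout: the YAML file exposes the checker instructions under a "prompt" key.
with open("prompts/fact_checker.yaml") as f:
    checker_instructions = yaml.safe_load(f)["prompt"]

checker = ChatPromptTemplate.from_messages([
    SystemMessage(content=checker_instructions),
    ("human", "Claim to check: {claim}"),
]) | ChatOpenAI(model="gpt-4o-mini")  # assumed model choice

result = checker.invoke({"claim": "The human genome has 23 pairs of chromosomes."})
print(result.content)
```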
Example checker prompts currently in the repository:
- fact_checker.yaml: An agent that checks facts and claims made by other LLMs and evaluates their truthfulness. Built to detect hallucinations by other agents.
- general_checker_agent.yaml: An agent that evaluates the work done by an arbitrary agent, given that agent's prompt (including inputs) and produced outputs. Assigns the work a letter grade from ‘A’ to ‘F’ and explains the rationale for the grade.
- logical_inference.yaml / logical_evaluator.yaml: An LLM-based logical inference engine that checks whether an LLM's claims follow logical rules.
- hallucinator.yaml: An agent that sometimes tells the truth and hallucinates with a certain probability. Used for testing the other checker agents (see the sketch after this list).
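As a sketch of how the hallucinator and fact checker fit together in a test loop (the "prompt" key, file paths, and model are assumptions, not the repo's actual API):

```python
# Sketch of a self-test: the hallucinator emits a claim (possibly fabricated),
# and the fact checker judges it. Uses the OpenAI client directly.
import yaml
from openai import OpenAI

client = OpenAI()

def load_prompt(path: str) -> str:
    # Assumed schema: checker instructions live under a "prompt" key.
    with open(path) as f:
        return yaml.safe_load(f)["prompt"]

def run_agent(system_prompt: str, user_input: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model choice
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_input},
        ],
    )
    return response.choices[0].message.content

hallucinator = load_prompt("prompts/hallucinator.yaml")
fact_checker = load_prompt("prompts/fact_checker.yaml")

claim = run_agent(hallucinator, "State one fact about the BRCA1 gene.")
verdict = run_agent(fact_checker, f"Evaluate the truthfulness of this claim: {claim}")
print(claim)
print("Fact checker verdict:", verdict)
```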