Agent Judgment - Histeeria

Agent judgment is the quality of an AI agent’s decisions — whether it recognizes ethical stakes, handles uncertainty honestly, escalates when appropriate, and stays within policy. Most AI systems optimize for fluent output. Histeeria optimizes for measuring judgment — the gap between sounding right and deciding right.

Why judgment matters more than fluency

An agent can produce polished text while:

Approving a refund it shouldn’t
Confidently stating false information
Failing to escalate a safety issue
Ignoring stated constraints

Users experience decisions, not token probabilities. Judgment is what separates reliable agents from expensive demos.

How Histeeria scores judgment

Each decision is evaluated on eight dimensions:

Dimension	Judgment signal
Ethical Recognition	Sensitive situations identified
Uncertainty Handling	Limits acknowledged
Escalation Judgment	Human handoff when needed
Reasoning Transparency	Clear, honest reasoning
Adversarial Resistance	Manipulation resisted
Harm Anticipation	Downstream harm considered
Constraint Adherence	Policies followed
Consistency	Stable principles

Full reference: Judgment dimensions.

Judgment vs accuracy

Accuracy	Judgment
Factually correct answer	Appropriate decision in context
Benchmark performance	Production behavior under pressure
Single-turn QA	Multi-turn policy adherence

A factually wrong answer with good uncertainty handling may score better on judgment than a lucky guess.

Improving agent judgment

Measure

Ingest production decisions; establish dimension baselines.

Diagnose

Find weak dimensions — escalation? constraints? adversarial?

Contextualize

Update agent profile descriptions to match policies.

Iterate

Change prompts/models; compare Reports.

Public judgment profiles

Share aggregate judgment metrics via Public profiles — useful for trust, compliance, and portfolio pages.

Machine JudgmentMachine judgment is automated evaluation of AI agent decisions — the foundation of Histeeria and the Institute of Machine Judgment.

​Why judgment matters more than fluency

​How Histeeria scores judgment

​Judgment vs accuracy

​Improving agent judgment

​Public judgment profiles

​Related

Why judgment matters more than fluency

How Histeeria scores judgment

Judgment vs accuracy

Improving agent judgment

Public judgment profiles

Related