Microsoft 365 Copilot Studio: Grader Framework for Agent Evaluation

[Web]
The Grader Framework lets makers evaluate agent responses with several grading methods: exact match, similarity, intent, and AI‑based metrics.
Details:
What changed:
Makers can now choose among multiple grading approaches when evaluating agent performance, and review the results with more transparency.
Why:
Different scenarios require different evaluation methods, and matching the grading method to the scenario improves testing accuracy.
Try this:
Evaluate an agent using similarity scoring.
Compare intent‑based grading with exact match.
Review detailed grading results to identify improvement areas.
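To make the difference between exact match and similarity scoring concrete, here is a minimal Python sketch. It is not how Copilot Studio implements its graders: the function names, the character-level similarity from Python's standard difflib module, and the 0.7 pass threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def exact_match_grade(expected: str, actual: str) -> bool:
    """Hypothetical exact-match grader: pass only if the normalized strings are identical."""
    return normalize(expected) == normalize(actual)


def similarity_grade(expected: str, actual: str, threshold: float = 0.7):
    """Hypothetical similarity grader: return (score, passed).

    The score is difflib's character-level ratio in [0, 1]; the threshold
    is an illustrative assumption, not a product default.
    """
    score = SequenceMatcher(None, normalize(expected), normalize(actual)).ratio()
    return score, score >= threshold


if __name__ == "__main__":
    expected = "Your order ships in 3 days."
    actual = "Your order will ship in 3 days."
    print("exact match passed:", exact_match_grade(expected, actual))
    score, passed = similarity_grade(expected, actual)
    print(f"similarity score: {score:.2f}, passed: {passed}")
```

In this sketch a reworded but equivalent answer fails the exact-match check while still passing the similarity check, which is the kind of gap the comparison in the steps above is meant to surface.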
Why this matters:
Business impact:
Enhances quality assurance for AI agents.
Personal impact:
Helps makers understand how agents perform in real scenarios.
Additional resources:
Learn:
Create a single response test set