Regression_Test
Status: Complete · Duration: 0 min 35 sec · Agent: My Org Butler
Topic Pass %: 100% (always 100% for single-topic agents)
Action Pass %: 100% (order doesn't matter if output is correct)
Response Pass %: 73.33% (the only one that matters, and it's broken)
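Each of these percentages is presumably a plain ratio of passed tests to total tests. The page reports only the percentage, not the underlying counts, so the counts in this minimal sketch are assumptions: 11 of 15 is one pair that reproduces 73.33%, and `pass_percent` is a hypothetical helper, not part of any Salesforce API.

```python
def pass_percent(passed: int, total: int) -> float:
    """Return the share of passed tests as a percentage, rounded to 2 places."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= passed <= total:
        raise ValueError("passed must be between 0 and total")
    return round(100 * passed / total, 2)


# Assumed counts (not shown on the page): 11 of 15 matches the reported figure.
response_rate = pass_percent(11, 15)
print(response_rate)
```

A ratio like this treats every judge verdict as ground truth, so a mislabeled test shifts the figure without any visible trace.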
Columns: Test Utterance · Topic (Trivial) · Actions (Noisy) · Agent Response · Outcome (Flaky)

Test Utterance: Which opportunity should I focus on?
Agent Response: Acme Q1 Expansion, $500K, closes in 3 days...
Outcome: Pass

Test Utterance: Create a follow-up task for Burlington
Agent Response: Task created with a follow-up for Burlington...
Outcome: Pass

Test Utterance: List custom objects in this org
Agent Response: Your org has the following custom objects...
Outcome: Pass

Test Utterance: Show open GitHub issues for our repo
Agent Response: Found 4 open issues: #32 Fix SOQL injection...
Outcome: Fail
Note: Response was correct: the agent returned real GitHub data. The LLM judge said "no match." You can't see why.

Test Utterance: How many Apex classes do we have?
Agent Response: Your org contains 64 Apex classes...
Outcome: Pass

Test Utterance: Remember: I prefer bullet points
Agent Response: Got it! I've saved your preference for bullet...
Outcome: Fail
Note: The agent stored the preference correctly. The judge failed it because the phrasing didn't match the expected text exactly.

Test Utterance: Read the NDA file attached to Acme
Agent Response: The Acme NDA was signed on Jan 15, valid for...
Outcome: Pass

Test Utterance: Generate a diagram of custom objects
Agent Response: Sorry, I wasn't able to generate the diagram...
Outcome: Pass
Note: Response was wrong: the agent failed to generate the diagram. The LLM judge passed it anyway.

Test Utterance: Schedule a weekly opp check
Agent Response: Scheduled! Every Monday at 9 AM I'll check...
Outcome: Pass

Test Utterance: Send me a notification about it
Agent Response: Notification sent successfully to your...
Outcome: Fail
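One way to quantify how flaky the Outcome column is would be to tally the judge's verdicts against the reviewer notes on the rows above. The sketch below is illustrative only: the row data is copied from this page, a verdict of `None` marks a row that carries no reviewer note (its correctness is unknown here, not assumed), and `classify` is a hypothetical helper.

```python
from collections import Counter

# (utterance, judge outcome, reviewer verdict)
# Reviewer verdict: True = response was actually correct, False = actually
# wrong, None = the row has no reviewer note on this page.
ROWS = [
    ("Which opportunity should I focus on?", "Pass", None),
    ("Create a follow-up task for Burlington", "Pass", None),
    ("List custom objects in this org", "Pass", None),
    ("Show open GitHub issues for our repo", "Fail", True),
    ("How many Apex classes do we have?", "Pass", None),
    ("Remember: I prefer bullet points", "Fail", True),
    ("Read the NDA file attached to Acme", "Pass", None),
    ("Generate a diagram of custom objects", "Pass", False),
    ("Schedule a weekly opp check", "Pass", None),
    ("Send me a notification about it", "Fail", None),
]


def classify(judge_outcome: str, reviewer_verdict):
    """Label a row by whether the judge agreed with the human reviewer."""
    if reviewer_verdict is None:
        return "unreviewed"
    judged_pass = judge_outcome == "Pass"
    if judged_pass == reviewer_verdict:
        return "agree"
    # Judge said Pass on a wrong response, or Fail on a correct one.
    return "false_pass" if judged_pass else "false_fail"


tally = Counter(classify(outcome, verdict) for _, outcome, verdict in ROWS)
for label in ("agree", "false_fail", "false_pass", "unreviewed"):
    print(f"{label}: {tally[label]}")
```

Every row that carries a reviewer note here is a disagreement with the judge, in both directions: two correct responses failed and one wrong response passed.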