When Anthropic Confirms What the Trenches Already Taught Us
Reading research papers between code reviews. Patterns learned in production meet the patterns described in Anthropic's agent evals and Constitutional Classifiers++ papers.
Real implementation stories from the trenches. Classical QE practices evolved with PACT principles. No hype, no vendor speak—just what actually works in production.
Quality isn't tested in—it's built in. We're moving from testing-as-activity to agents-as-orchestrators, bridging classical QE with agentic intelligence through PACT principles.
Everything here is battle-tested in production. Real implementations, actual failures, honest lessons. No ivory tower theory—just what works (and what doesn't) from someone in the trenches.
Building the Serbian Agentic Foundation and sharing globally. Learning happens together—through meetups, open source, and honest conversations about what quality means in the AI age.
A tale of data loss, brutal honesty, and the infrastructure of trust in agentic systems. Twelve releases in fourteen days, and one near-catastrophic failure.
The earthquake has already happened. Combining PACT principles with Human Experience Testing for a quality practice that works in the agentic age.
When verification becomes a feature. Nine days, eleven releases, and the journey from completion theater to verified results. 79.9% token reduction with receipts.
// AQE Fleet v2.5.0 - 31 Specialized Agents
const fleet = {
  core: [
    "test-generator",    // AI-powered test creation
    "test-executor",     // Multi-framework execution
    "coverage-analyzer", // O(log n) gap detection
    "quality-gate",      // ML-driven validation
    "quality-analyzer",  // ESLint, SonarQube, Lighthouse
    "code-complexity"    // Cyclomatic complexity
  ],
  performance: [
    "performance-tester", // k6, JMeter, Gatling
    "security-scanner"    // SAST, DAST scanning
  ],
  strategic: [
    "requirements-validator",  // INVEST + BDD
    "production-intelligence", // Incident replay
    "fleet-commander"          // 50+ agent coordination
  ],
  advanced: [
    "regression-risk-analyzer", // ML test selection
    "test-data-architect",      // 10k+ records/sec
    "api-contract-validator",   // Breaking changes
    "flaky-test-hunter"         // 90%+ ML accuracy
  ],
  specialized: [
    "deployment-readiness", // Multi-factor risk
    "visual-tester",        // AI-powered UI diff
    "chaos-engineer",       // Resilience testing
    "a11y-ally"             // WCAG 2.2 compliance
  ],
  tdd: [
    "test-writer",      // RED phase specialist
    "test-implementer", // GREEN phase specialist
    "test-refactorer"   // REFACTOR phase specialist
    // + 8 more TDD subagents
  ]
};
// Key Metrics: 70-81% cost savings • 41 QE skills
// 7 frameworks • 85 MCP tools • 4 RL algorithms
Enterprise-grade agentic testing framework with 31 specialized AI agents achieving 70-81% cost savings. Built with TypeScript, featuring 4 RL algorithms, multi-model routing, and 41 world-class QE skills.
Multi-model router with OpenRouter integration (300+ models with intelligent selection)
Q-Learning, SARSA, Actor-Critic, PPO with 85%+ pattern matching across 7 frameworks
Comprehensive skills covering TDD, accessibility (a11y-ally), shift-left/right, and chaos engineering
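For readers new to the first algorithm in that list, here is a minimal sketch of the textbook tabular Q-learning update. It is not the fleet's actual implementation, which this page does not show. The state and action names are hypothetical, and the learning rate and discount values are arbitrary defaults chosen for illustration.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning step and return the new Q(state, action)."""
    # Value of the best action available from the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical test-ordering scenario: names are illustrative only.
ACTIONS = ["run_unit_first", "run_e2e_first"]
q_table = defaultdict(float)  # unseen (state, action) pairs start at 0.0
new_value = q_update(
    q_table, "suite_failing", "run_unit_first", 1.0, "suite_passing", ACTIONS
)
```

Starting from an empty table, a reward of 1.0 moves the estimate for that state and action pair one tenth of the way toward the target, which is what the default alpha of 0.1 means in practice.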
Python reimplementation of the AQE framework using LionAGI orchestration. 18 specialized agents with async-first architecture, alcall integration, and 82% test coverage.
Builder pattern and Session management for persistent multi-agent coordination
alcall integration with exponential backoff and fuzzy JSON parsing (95% error reduction)
Real-time progress streaming and parallel execution with <1ms tracking overhead
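The retry behavior mentioned above can be sketched in plain asyncio. This is a generic illustration of exponential backoff and deliberately does not use or imitate LionAGI's alcall signature. The helper name retry_with_backoff and all of its parameters are assumptions made for this example.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def retry_with_backoff(
    call: Callable[[], Awaitable[T]],
    retries: int = 3,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    jitter: float = 0.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Await call(), retrying failures with exponentially growing delays."""
    for attempt in range(retries + 1):
        try:
            return await call()
        except Exception:
            if attempt == retries:
                raise  # out of retries: surface the last error
            # Delay doubles per attempt (base, 2*base, 4*base, ...), capped at max_delay.
            delay = min(base_delay * (2 ** attempt), max_delay)
            # Optional random jitter spreads out retries from parallel callers.
            await sleep(delay + random.uniform(0, jitter))
    raise ValueError("retries must be >= 0")  # only reached if retries is negative
```

The sleep function is injectable so the delay schedule can be checked in a test without actually waiting. In real use the default asyncio.sleep applies.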
# LionAGI QE Fleet v1.2.0 - Python Implementation
from lionagi import Branch, Session
fleet = {
    "core": [
        "test_generator",      # ReAct reasoning loops
        "test_executor",       # Async parallel execution
        "coverage_analyzer",   # AST-based analysis
        "quality_gate",        # ML validation
        "quality_analyzer",    # Multi-tool integration
        "complexity_analyzer"  # Cyclomatic < 10
    ],
    "performance": [
        "performance_tester",  # Load testing
        "security_scanner"     # SAST/DAST
    ],
    "strategic": [
        "requirements_validator",   # INVEST + BDD
        "production_intelligence",  # Incident replay
        "fleet_commander"           # Orchestration
    ],
    "advanced": [
        "regression_risk",      # ML test selection
        "test_data_architect",  # Distributed data
        "api_contract",         # Breaking changes
        "flaky_hunter"          # Statistical detection
    ],
    "specialized": [
        "deployment_readiness",  # Risk assessment
        "visual_tester",         # UI diff
        "chaos_engineer"         # Fault injection
    ]
}
# Key Metrics: 80% cost savings • 82% coverage
# pytest, Jest, Mocha, Cypress • Python 3.10+
Open-source agentic testing framework built with Rust and Python. Specialized agents working in concert (functional testing, security injection, performance planning), all with explainability first.
Proactive, Autonomous, Collaborative, Targeted from the ground up
Every agent decision comes with reasoning traces
Critical checkpoints, not blind automation
// Sentinel Agent Configuration
{
  "swarm": {
    "agents": [
      {
        "type": "functional-positive",
        "role": "happy_path_validator",
        "autonomy": 0.8
      },
      {
        "type": "functional-negative",
        "role": "chaos_explorer",
        "autonomy": 0.9
      },
      {
        "type": "security-injection",
        "role": "threat_simulator",
        "autonomy": 0.7
      }
    ],
    "orchestration": "hybrid",
    "explainability": "required",
    "human_checkpoints": ["pre-deploy", "security"]
  }
}
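A configuration like the one above is easy to sanity-check before a swarm starts. The sketch below is not part of Sentinel. The validate_swarm function is hypothetical, and its rules are assumptions read off the sample: that autonomy is a fraction between 0 and 1, that every agent needs type, role, and autonomy, and that at least one human checkpoint is listed.

```python
import json

# The sample configuration from above, without the leading comment line,
# since strict JSON does not allow comments.
CONFIG_TEXT = """
{
  "swarm": {
    "agents": [
      {"type": "functional-positive", "role": "happy_path_validator", "autonomy": 0.8},
      {"type": "functional-negative", "role": "chaos_explorer", "autonomy": 0.9},
      {"type": "security-injection", "role": "threat_simulator", "autonomy": 0.7}
    ],
    "orchestration": "hybrid",
    "explainability": "required",
    "human_checkpoints": ["pre-deploy", "security"]
  }
}
"""


def validate_swarm(config: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    swarm = config.get("swarm", {})
    agents = swarm.get("agents", [])
    if not agents:
        problems.append("swarm.agents is empty or missing")
    for index, agent in enumerate(agents):
        # Assumed required keys, taken from the fields every sample agent has.
        for key in ("type", "role", "autonomy"):
            if key not in agent:
                problems.append(f"agent {index}: missing '{key}'")
        autonomy = agent.get("autonomy")
        # Assumption: autonomy is a number in the closed range [0, 1].
        if autonomy is not None and not (
            isinstance(autonomy, (int, float)) and 0.0 <= autonomy <= 1.0
        ):
            problems.append(f"agent {index}: autonomy {autonomy!r} outside [0, 1]")
    # Assumption: a swarm should always name at least one human checkpoint.
    if not swarm.get("human_checkpoints"):
        problems.append("swarm.human_checkpoints is empty or missing")
    return problems


if __name__ == "__main__":
    print(validate_swarm(json.loads(CONFIG_TEXT)))  # prints []
```

Returning a list of problems instead of raising on the first one keeps the check in line with the explainability-first idea: a reviewer sees everything that is wrong in one pass.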
Real-world testing of agentic approaches. Results updated weekly. Failures documented honestly.
Can agents improve overnight? Testing the Nightly-Learner system that consolidates patterns while the conductor sleeps.
Using agent swarms for intelligent mutation testing. Can we beat random mutations?
Making every agent decision explainable. v2.3.0-v2.5.0 shipped automatic learning capture and audit trails.
Building the first Agentic Foundation chapter in Serbia and beyond. Monthly meetups, open discussions, and learning together about quality in the age of intelligent agents.
Join the worldwide Agentic QE community. Monthly meetups happening across the globe—from Novi Sad to San Francisco, building the future of quality engineering together.
Member of the global Agentics Foundation. Building chapters worldwide, bringing PACT principles and agentic engineering to quality practices across continents.
Explore Agentics.org →
Looking for a speaker on Agentic QE, PACT principles, or bridging classical to modern quality practices? Let's talk about bringing practical insights to your conference or team.
Get in Touch
Questions about Agentic QE? Want to discuss consulting or speaking opportunities? Let's connect.