The Same Line in Every Room

The last article ended on a promise. I said the next one would follow Madrid and Budapest, once I had heard what others in the field were seeing. I closed with the line I keep coming back to — keep learning, keep sharing — and went to practice a talk.

Three weeks later, I have heard it. I stood on a conference stage in Madrid, shook hands with people I had only ever met as faces in a Zoom grid, shipped six releases between trips, and spent more hours in more conversations with more practitioners than in any stretch this year. And underneath all of it — the talks, the dinners, the hallway questions, the demos — there was a single line. The same line, in every room.

This is the article about what that line was, and why it convinced me that owning your own AI stopped being a preference and became the work.

Madrid, in a Paragraph

The last week of May was ExpoQA in Madrid: my first time in the city, my first time at the conference, and my first time on that stage as a speaker. My talk was the one I had been preparing for months: Bridging Classical and Agentic Quality Engineering (building bridges, not burning them). The argument is simple, and it landed: our classical quality skills are not retired by agents, they are promoted. When change arrives at this speed, the ability to reason about risk, design good oracles, and ask “how do I actually know this worked?” becomes the most valuable skill in the room. I will not recap the whole conference here. I wrote a full piece on it, and the photos are there too: “Make Tech Make Sense: Notes from My First ExpoQA in Madrid.” What I will carry forward is one thing I noticed at the networking dinner: some very good people in that room are still where I was about eighteen months ago, wrestling with single-threaded coding agents that barely leave the chat window. That is not a criticism. It is a measurement.

Budapest, in a Paragraph

The first week of June in Budapest — Craft Conference, the Hungarian Agentics Foundation meetup, and, for the first time after more than a year of meeting several times a week on screens, the Agentics Foundation crew in the same physical rooms. I walked into the hotel reception at almost the same moment as Reuven, his son Finn, Rob Ranson, and Nick. I finally put handshakes and hugs to faces I knew only as video tiles — Adam, Klara, Antoaneta, Bence, Dekel, Mark, and so many more. Chris from Cognitum One took us all to dinner on his birthday, and midweek, the company announced its founding advisory team and investors, as well as its mission of sovereign agentic AI for enterprise infrastructure. It was an extraordinary week, and again I will not retell it — it is all here, with the photos: A Week in Budapest: The First Time We Met in Person. The part that matters for this article is that Budapest gave me the second data point. The same line I heard in Madrid, I heard again in Budapest — only sharper, because the people saying it were the ones building the thing everyone else is a year behind.

The Line

Here it is.

In Madrid, the focus was on agents — how to use them, how to understand them, how to trust them. But most of the energy in the room was still going into the basics: refining an agent definition, sharpening a skill definition, and getting a single agent to behave. Only a few people were thinking one level up — at the harness, at memory, at how you actually give an agent the right context at the right moment. And that is exactly where the Agentics Foundation was nine months to a year ago, when we defined the first harness around Claude Flow.

I do not say that to feel ahead. I say it because the gap is measurable and because of what happened to it. Most of what we designed and bolted onto Claude Flow — now RuFlow — and onto my own Agentic QE Fleet has since been absorbed. By my rough count, the platform vendor has now implemented something like ninety-six percent of the ideas that were once our additions. The money machines behind the vendors eat small ideas. That is not a conspiracy; it is gravity. You cannot out-spend them, and you cannot keep up with them on their terms.

So the line — the one in every room — was never “the vendors are winning.” The line was: then what do we build? And the answer I kept giving and kept hearing back was the same one. You build on top. You stop renting all of your intelligence. You train your own specialized small models for specific tasks that do not need the frontier, you keep your data behind your own walls, and you make vendor independence cheap enough that switching providers does not cost you capability. This is precisely the work Reuven and I are now doing on Cognitum One — figuring out which tasks can be offloaded to a specialized agent on local hardware, and how to put enterprise data back where it belongs: in the enterprise, not in someone’s cloud.

And here is what made the line urgent rather than aspirational. The economics flipped while we were all watching the leaderboards. Open-weight models have collapsed the cost of near-frontier capability: DeepSeek V4-Pro scores in the 80s on SWE-bench Verified, ships under an MIT license, and costs roughly 34 times less per output token than the closed frontier; Gemma 4, Qwen, Kimi, and GLM all sit in the same new tier. Self-hosting a capable coding-and-reasoning model is no longer a research stunt; it is a defensible line item. Europe even has a sovereignty answer now in EuroLLM-22B, fully open, built for all 24 EU languages on EU supercomputing resources. The tools to own your stack exist. The only thing missing is the decision to use them.

A Few Releases, One Direction

Between Madrid and Budapest, I had bandwidth for only a couple of things on the Agentic QE Fleet, but the releases — v3.10.2 through v3.10.7 — all pointed the same way: make the learning honest, and make it independent.

The integrity work was the unglamorous kind that matters most. v3.10.7 fixed two bugs that had been silently stopping the self-learning loop — a native-binary mismatch that dropped every captured experience with zero trace, and a SONA upstream bug where reward feedback produced no weight change at all, so the system looked like it was learning and was not. This is the lesson from the routing loop and the dead LLM router, told a third time: the most dangerous failure is the one that passes every check while doing nothing. A learning system that cannot prove it is learning is just a confident story.
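One way to make "prove it is learning" concrete is a liveness check that sends one reward through the loop and then asserts two things: the experience was actually stored, and at least one weight moved. The sketch below is illustrative only. `ToyLearner`, `SilentlyBrokenLearner`, and `learning_is_live` are hypothetical names invented for this example, not the fleet's or SONA's real API, and the update rule is a deliberately trivial stand-in.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLearner:
    """Hypothetical stand-in for a self-learning component (not a real fleet API)."""
    weights: list[float] = field(default_factory=lambda: [0.0, 0.0, 0.0])
    experiences: list[tuple[list[float], float]] = field(default_factory=list)
    learning_rate: float = 0.1

    def record(self, features: list[float], reward: float) -> None:
        # Store the experience, then nudge each weight in proportion to the reward.
        self.experiences.append((features, reward))
        for i, x in enumerate(features):
            self.weights[i] += self.learning_rate * reward * x


class SilentlyBrokenLearner(ToyLearner):
    """Simulates the failure mode: accepts feedback, raises nothing, changes nothing."""

    def record(self, features: list[float], reward: float) -> None:
        return None


def learning_is_live(learner: ToyLearner, features: list[float], reward: float) -> dict:
    """Send one reward through the loop and report whether anything observable changed."""
    weights_before = list(learner.weights)
    count_before = len(learner.experiences)

    learner.record(features, reward)

    stored = len(learner.experiences) == count_before + 1
    moved = any(abs(a - b) > 1e-12 for a, b in zip(learner.weights, weights_before))
    return {"experience_stored": stored, "weights_changed": moved, "live": stored and moved}


if __name__ == "__main__":
    probe = [1.0, 0.5, 0.0]
    print(learning_is_live(ToyLearner(), probe, reward=1.0))
    print(learning_is_live(SilentlyBrokenLearner(), probe, reward=1.0))
```

The design choice worth noting is that the check inspects state before and after instead of trusting a return value or the absence of an exception, which is exactly what a silent failure would pass.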

The independence work is the part that ties directly to the line. v3.10.4 added an optional local Nagual pattern hub and a local LLM judge — point the fleet at your own nagual serve and score pattern quality with a local model, no API call leaving the building. I spent real time validating that this works, because it is the hinge of the whole vendor-independence argument. I wrote an improvement plan to connect the Agentic QE Fleet’s automatic outcomes to Nagual’s memory, and then I benchmarked local models as the semantic oracle — the judge that decides which patterns are worth transferring from the fleet into Nagual. On a 150-pattern set from my own knowledge base, qwen3:8b and gemma4:12b-mlx both cleared AUROC 0.997 — near-perfect discrimination between genuine knowledge and lifecycle noise — running entirely on my own M-series Mac, for zero dollars, with nothing phoned home. The judge even caught five patterns my linear reward had under-rated and one ADR whose stored text was truncated.

That is the point made concrete: the component that decides what my system learns runs on hardware I own. Vendor independence is not a manifesto. It is a benchmark you can re-run.
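For readers who want to re-run that kind of benchmark on their own patterns, here is a minimal sketch of the metric itself. AUROC is the probability that a randomly chosen genuine pattern receives a higher judge score than a randomly chosen noise pattern, with ties counted as half. The scores and labels below are made-up illustration data, not my 150-pattern set, and how the scores are obtained from a local model is left out on purpose.

```python
def auroc(scores: list[float], labels: list[int]) -> float:
    """Pairwise AUROC: share of (positive, negative) pairs the judge ranks correctly.

    labels use 1 for genuine knowledge and 0 for lifecycle noise.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # judge ranked the genuine pattern higher
            elif p == n:
                wins += 0.5      # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical judge scores in [0, 1]; one noise item (0.75) is over-rated.
    judge_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.75]
    is_genuine = [1, 1, 1, 0, 0, 0]
    print(f"AUROC = {auroc(judge_scores, is_genuine):.3f}")
```

The pairwise form is quadratic in the number of patterns, which is fine for a few hundred items; for larger sets a rank-based implementation or a library routine would be the usual choice.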

Cross-Pollination

A couple of days before I sat down to write this, Nikhil Vallishayee from the Agentics Foundation reached out and asked me to look at his Universal Pattern Space from a quality perspective — not a standard codebase, but a framework evolving toward grounded, labeled responses that improve the depth of a conversation. So I did what I do — pointed the fleet and my own judgment at it from a couple of angles, including a run from my own AQE dev environment, and sent him the reports. He asked me to file them as a public issue on his repo, the kind of move that turns one review into a shared test-and-eval bed.

Then the interesting thing happened. Reviewing his framework surfaced disciplines worth turning back on my own. That review became v3.10.6 — evidence-class labels on every finding (executed, static, inferred, conjecture, so a gate only blocks on verified evidence), a real pass/fail safety eval for the data-protection rules, a verifier that stops shipped agents from drifting away from their source, and pre-registered benchmark rubrics so a claim is auditable from the repo alone. Six architecture decisions, each with a live verification record, none of which I would have written that week without Nikhil’s project to react to.
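The evidence-class idea can be sketched in a few lines. The four labels come from the release described above; everything else here is an assumption made for illustration: the `Finding` shape, the `gate` function, the severity values, and in particular the choice that only `executed` evidence counts as verified enough to block. The real implementation may draw that line differently.

```python
from dataclasses import dataclass
from enum import Enum


class Evidence(Enum):
    EXECUTED = "executed"      # observed by actually running something
    STATIC = "static"          # derived from reading code or config
    INFERRED = "inferred"      # reasoned from indirect signals
    CONJECTURE = "conjecture"  # a hypothesis with no supporting run

# Assumption for this sketch: only executed evidence is allowed to block a gate.
BLOCKING_EVIDENCE = {Evidence.EXECUTED}


@dataclass(frozen=True)
class Finding:
    title: str
    severity: str          # hypothetical values: "high" or "low"
    evidence: Evidence


def gate(findings: list[Finding]) -> dict:
    """Block only on high-severity findings backed by verified evidence."""
    blockers = [
        f for f in findings
        if f.severity == "high" and f.evidence in BLOCKING_EVIDENCE
    ]
    advisories = [f for f in findings if f not in blockers]
    return {"passed": not blockers, "blockers": blockers, "advisories": advisories}


if __name__ == "__main__":
    report = gate([
        Finding("Data-protection rule bypassed in test run", "high", Evidence.EXECUTED),
        Finding("Possible race in cache warm-up", "high", Evidence.CONJECTURE),
        Finding("Unused config key", "low", Evidence.STATIC),
    ])
    print("passed:", report["passed"])
    for f in report["blockers"]:
        print("BLOCK:", f.title, f"[{f.evidence.value}]")
    for f in report["advisories"]:
        print("note: ", f.title, f"[{f.evidence.value}]")
```

The point of the label is that a high-severity conjecture still shows up in the report, but it cannot stop a release on its own until someone turns it into executed evidence.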

This is what the open-source corner of the Agentics Foundation actually produces. Not just code you can fork — friction you can learn from. One member’s project, looked at honestly, makes another member’s project better with the next release. The vendors have the money. We have each other’s work, in the open, pushing every project forward. That is the one advantage the money machines cannot buy back from us, and we should stop being shy about it.

Fable, Mythos, and the Leash

A week ago, Anthropic released two new models — Fable and Mythos. Mythos was limited to the enterprise from day one; I had a chance to play with Fable. And then, a couple of days ago, a U.S. government decision further tightened access.

I want to be careful and specific here, because this is exactly the kind of thing that looks like isolated news until you stack it. Mythos did not just ship as a product — it shipped into governance. Anthropic gave the EU’s cybersecurity agency, ENISA, access to a cyber-tuned Mythos through a partnership; OpenAI did the same, extending a cybersecurity-tuned frontier model to the EU. Frontier models are becoming instruments of state and regional security policy, creating a two-tier access landscape: agencies and the well-connected get the hardened, capable models ahead of the open market, and access can be granted or revoked by a policy decision overnight. At the same time, the same vendor put autonomous agent workloads on a separate meter, decoupling the economics of renting intelligence from ordinary interactive use. None of these moves is sinister on its own. Together they describe a single shift: the capability you depend on is increasingly something you are granted, on terms you do not set.

Here is where I will say plainly what I think, because this is my blog and not a vendor’s. This is a play of power, and I do not believe it serves the broad benefit of humanity. When access to the most capable systems in the world can change with one government decision, anyone who built their future entirely on rented intelligence just learned they do not control it — someone else does. The Fable and Mythos moment is not an exception. It is a preview.

The response is not to be angry at the vendors. They are doing what money machines do. The response is the line from every room: own your own AI. Own your data, your memory, your vector store, your agent framework, your deployment path — and your own small, specialized models for the tasks that never needed the frontier in the first place. Keep the option to switch providers without losing what you can do. That is not a retreat from the frontier. It is the only way to keep using the frontier on your own terms.

What Fills the Heart

I do not want to skip past the part that actually refuels me. The feedback after Madrid and Budapest was the kind that gives you strength for the next stretch. Roman, who spent a Friday in Madrid walking the city with me, wrote afterward that he came with questions and left with something better, and called the recap “a strong signal that classical engineering discipline and AI are not in competition — they’re converging.” A respected colleague called my conference notes some of the clearest he had ever read. People I had only known on a screen told me, in person, that the work matters to them.

I am not immune to this, and I have stopped pretending to be. The releases, the benchmarks, and the arguments about sovereignty are the work. But the reason I keep doing meetups, keep accepting invitations, keep showing the rough version instead of the polished one — the reason is the rooms, and the people in them who light up and go build something. That is the part that makes the rest sustainable.

What Is Ahead

The calendar does not slow down, and for once, I am glad.

The next Novi Sad meetup is only days away — number thirteen. And in about two weeks, something I am genuinely happy about: Adam and Klara, who hosted our Friday hackerspace out of their rented Budapest apartment, are coming to Novi Sad to help me boost the local meetup. The community that finally met in person in Hungary is now traveling to each other’s cities. That is exactly the texture I want the Serbian chapter to have.

In parallel, the work as chair of the Agentics Foundation Training Committee is taking shape. I have a curriculum architecture drafted out of my own Nagual assets — a six-level competency model from AI-Curious to Architect, five tracks, and a Train-the-Trainer spine running through all of it. But the design principle I care about most is the delivery: I want to train the trainers through pairing and ensemble — everyone in the room building together, sharing as we go, learning by shipping rather than by slide deck. Mentors all the way down. The Serbian chapter, with the StartIt centers, is the pilot. If the line in every room is “build your own,” then the most important thing I can build is more people who can build.

There is a temptation, in a moment like this, to close softly. To say it will all work out, the community is strong, keep going. But the editor in me — the one who has learned that this blog earns its keep with a sharp last line — wants to be exact.

Models come and go. Access gets granted and revoked. The leaderboard you optimized for last month is already being re-priced. Rent the intelligence, and you rent the future along with it. Own the harness, and the models can come and go without taking your capability with them.

So that is the line, and now I have said it back to every room that said it to me.

Keep learning. Keep sharing. Knowledge is power.

This is the thirty-first article in The Quality Forge series. Previous: “The Question That Followed Me Home” described the week the interest started arriving faster than I could schedule it, and the question a London practitioner left me with. This one describes the three weeks after ExpoQA in Madrid, the first in-person gathering of the Agentics Foundation in Budapest, six fleet releases, a cross-pollination with Nikhil Vallishayee’s Universal Pattern Space, and the Fable/Mythos moment that turned vendor independence from a slogan into the work. The full Madrid and Budapest write-ups, with photos, are on LinkedIn. The releases are public on github.com/proffesor-for-testing/agentic-qe. Nagual-QE is open-source at github.com/proffesor-for-testing. Cognitum is at cognitum.one.

Dragan Spiridonov is the Founder of Quantum Quality Engineering, an Agentic Quality Engineer, Secretary of the Agentics Foundation Board, chair of the Agentic Engineering Training Committee, and one of the AI Chapter leads for the Ministry of Testing. He is currently building the Serbian Agentics Foundation Chapter in partnership with StartIt centers across Serbia.
