privacy improvements to agents

In my previous post, privacy and safety in AI tooling are still missing, I outlined several concerns facing our brave new agentic world: the absence of verifiable controls for model outputs, the exposure of sensitive user data through third-party APIs, and the gap between the speed at which agents are deployed and the speed at which we can audit what they actually do. Those concerns have not gone away. If anything, the proliferation of agent-to-agent interactions has widened the attack surface and raised the stakes, making standardization and cryptographic trust prerequisites for the agentic future. Two recent efforts move the trust and quality needle meaningfully: Marco De Rossi’s Protocol Agent, a benchmark for assessing how well agents select and negotiate cryptographic primitives, and ERC-8004, a draft standard for trustless agent identity, reputation, and validation on-chain. Neither claims to solve everything, but together they sketch the kind of verifiable, privacy-first infrastructure the agentic economy will need to earn trust at scale.

benchmarking agent quality

In Protocol Agent: What If Agents Could Use Cryptography in Everyday Life?, Marco De Rossi, Director of Product at Consensys and Founder of Agent0, introduces a protocol for benchmarking agent-to-agent (A2A) interactions, specifically their ability to properly select cryptographic primitives in everyday scenarios like age verification, group membership, scheduling, password security, and private token spend, among other scenarios that can be found in the paper’s Appendix C. The Protocol Agent standard assesses agents on five dimensions: cryptographic primitive selection, negotiation skills with other agents, implementation correctness of results, computation/tool usage quality, and security strength. Combined, these dimensions provide a valuable heuristic for assessing agent quality and reputation over time, increasing agent determinism. The models used in Protocol Agent’s v1 benchmark are refreshingly all open-source or open-weight variants, and can be tested on self-hosted hardware: github.com/agent0lab/protocol-agent. The highest-performing model without fine-tuning was gpt-oss-120b, while models with SFT (supervised fine-tuning) outperformed it, chiefly deepseek-v3p1 (which improved by 46.5% over its pre-SFT baseline) and qwen3-30b-i2507 (which improved by 73.3%). Interestingly, gpt-oss-120b improved only marginally after SFT (3.61%), dropping to rank four behind the deepseek and qwen3 models. A full table of results comparing the various models pre- and post-SFT can be found on Page 6, Table 1, and in Appendix A of the paper. For benchmarking v1, De Rossi used a cluster of two NVIDIA H100s and eight H200s in two configurations: the first, with the two H100s, ran a total of four replicas with 80GB of memory, and the second, with the eight H200s, ran a single replica with 141GB of memory.
These results clearly indicate the value of benchmarking, assessing, and fine-tuning model-agents, as seen through the SFT improvements across primitive selection, communication, security strength, and other dimensions.
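To make the five-dimension idea concrete, here is a minimal sketch of how per-dimension scores might be rolled up into a single comparable quality number and used to measure SFT improvement. The dimension names come from the paper; the weighting scheme, the uniform weights, and the sample scores are illustrative assumptions, not the paper’s actual scoring formula.

```python
from dataclasses import dataclass

@dataclass
class AgentScores:
    """Per-dimension scores in [0, 1]; dimension names follow the paper,
    the numbers below are made up for illustration."""
    primitive_selection: float
    negotiation: float
    implementation: float
    tool_usage: float
    security_strength: float

def composite_score(s: AgentScores, weights=None) -> float:
    """Weighted mean across the five dimensions; uniform weights by default."""
    dims = [s.primitive_selection, s.negotiation, s.implementation,
            s.tool_usage, s.security_strength]
    if weights is None:
        weights = [1.0] * len(dims)
    return sum(w * d for w, d in zip(weights, dims)) / sum(weights)

# hypothetical pre- and post-SFT scores for one model
base = AgentScores(0.62, 0.55, 0.58, 0.60, 0.50)
sft = AgentScores(0.85, 0.80, 0.78, 0.82, 0.76)

improvement = (composite_score(sft) - composite_score(base)) / composite_score(base)
print(f"relative improvement after SFT: {improvement:.1%}")
```

A composite like this is also what makes scores usable downstream: a single scalar per agent is easy to post as a reputation signal, while the per-dimension breakdown stays available for risk-sensitive consumers.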

where ai, blockchain, and cryptography meet

In a world where new agents are deployed daily and agent-to-agent interactions are increasingly commonplace, protocols like De Rossi’s are integral to standardization and security as the AI revolution converges with existing technologies like blockchain and quantum-resistant cryptographic primitives. With the draft release of ERC-8004 in August 2025, authored by De Rossi, Jordan Ellis (Google), Erik Reppel (Coinbase), and Davide Crapis (Ethereum Foundation), new applications are now possible beyond financial use cases, enabling developers to build closer to real-world needs such as identity verification, logistics management, and personal correspondence, with an emphasis on implicit security. A fully agentic future is still a ways off; that said, the standards being developed today, like Protocol Agent and ERC-8004, lay the groundwork, furthering research and frontier model development. De Rossi’s emphasis on cryptographic primitives as an initial communication layer highlights use cases that protect user privacy. Many of the agents being deployed today treat security as a characteristic rather than building user privacy in as a cornerstone, and thereby risk data exposure, making them attractive targets for attackers who are themselves growing more sophisticated through AI tools and other forms of computation such as quantum processors. By creating Protocol Agent and ERC-8004, De Rossi and others bridge three of the most important emergent technologies at the frontier today: cryptographic primitives with potential quantum resistance; AI agents, which offer a future free of laborious tasks such as picking the right calendar slot or remembering a driver’s license left at home; and blockchain’s world computer, which enables applications like decentralized finance, identity, gaming, and ultimately, sovereignty.

trust as a vector

What makes ERC-8004 particularly compelling is the way it treats agent trustworthiness as a vector, mirroring the real world far better than a centralized rating system or API key allowlist. Through ERC-8004, developers can adjust security proportional to risk, such as the example given in the standard of reduced requirements for ordering pizza compared with increased requirements for applications like medical diagnosis. To coordinate the agent economy, the standard describes three registries: an identity registry for censorship-resistant identifiers, a reputation registry for posting and fetching signals, and a validation registry for independently verifiable proofs. A flexible tiered trust model coupled with registries means signals such as benchmarked scores can be published as composable on-chain data through smart contracts, creating a flywheel between agent evaluation and discovery. Refreshingly, ERC-8004 deliberately avoids coupling itself to a payment rail, remaining unopinionated on settlement while collaborating with parallel efforts like Coinbase’s x402 protocol for agent-to-agent payments. It also doesn’t try to solve everything cryptographically, openly stating it cannot guarantee that advertised capabilities are functional and non-malicious, which is why the three trust models (identity, reputation, and validation) exist as layered defenses rather than a single point of failure.
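The risk-proportional idea can be sketched in a few lines: an agent record carries evidence from all three registries, and a consumer checks it against a threshold that scales with the stakes of the task. This is a toy model under stated assumptions, not ERC-8004’s actual Solidity interfaces; the tier thresholds and field names are invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum

class RiskTier(Enum):
    LOW = 1     # e.g. ordering pizza
    MEDIUM = 2
    HIGH = 3    # e.g. medical diagnosis

@dataclass
class AgentRecord:
    """Evidence an agent accumulates across the three registries."""
    agent_id: str          # identity registry: censorship-resistant identifier
    reputation: float      # reputation registry: aggregated feedback in [0, 1]
    validations: int       # validation registry: independently verified proofs

# minimum evidence per tier -- thresholds are illustrative, not from the spec
REQUIREMENTS = {
    RiskTier.LOW: {"reputation": 0.2, "validations": 0},
    RiskTier.MEDIUM: {"reputation": 0.5, "validations": 1},
    RiskTier.HIGH: {"reputation": 0.8, "validations": 3},
}

def trusted_for(agent: AgentRecord, tier: RiskTier) -> bool:
    """Accept the agent only if its evidence meets the tier's floor."""
    req = REQUIREMENTS[tier]
    return (agent.reputation >= req["reputation"]
            and agent.validations >= req["validations"])

pizza_bot = AgentRecord("agent:pizza", reputation=0.3, validations=0)
print(trusted_for(pizza_bot, RiskTier.LOW))   # True: low stakes, low bar
print(trusted_for(pizza_bot, RiskTier.HIGH))  # False: not enough evidence
```

The point of the sketch is the shape of the check, not the numbers: the same agent can be trustworthy enough for one task and insufficient for another, which is exactly what a scalar rating or a binary allowlist cannot express.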

from HTTP to RLAF

Current Reinforcement Learning (RL) and other model-improvement techniques such as Reinforcement Learning from Human Feedback (RLHF) are improving how well model-agents communicate with humans. However, as De Rossi points out in the Protocol Agent paper, human-optimized training comes at the cost of communication quality between agents. With this in mind, standardized quality and communication assessment metrics will be needed to gauge what will become more common forms of communication as the internet is flooded with personal, group, and enterprise agents. Traditionally, computers communicate with each other through protocols written at the genesis of the internet in the 1970s, 80s, and 90s, like HTTP, FTP, SMTP, DNS, and IRC. The first-principles approach of these protocols builds up from symbols and math to high-level human-readable text. With LLMs, the opposite is true: models are trained on a large corpus of text and other modalities with the goal of producing text or multimedia for human consumption. The developers of LLMs still don’t know exactly, beyond probabilities and some observability, how these models achieve the next-token prediction accuracy they do; all that is clear is that improvements to scale (The Scaling Hypothesis) make models more accurate, reducing hallucinations and increasing value. When the future looks like multiple personal agents coordinating on behalf of a human and communicating with the agents of friends, family, and businesses, it becomes increasingly risky to ignore the divide between how models are trained to communicate with humans and how they are trained to communicate with each other, computers, and services through code.
While the solution is probably something like Reinforcement Learning from Agent Feedback (RLAF) coupled with reputational ranking, new standards and primitives like Protocol Agent and ERC-8004’s registries will be required to improve trust and assess quality, highlighting the importance of De Rossi’s work. The ability to discover, choose, and interact with agents across organizational boundaries in an open-ended economy is a clear necessity for adoption.
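RLAF is speculative, but the reputational-ranking half is easy to picture: each interaction ends with the counterpart agent posting a rating, and the subject’s reputation is updated incrementally so that recent behavior matters more than ancient history. The exponentially weighted update below is one common aggregation choice, offered purely as an illustrative sketch; neither paper prescribes this formula.

```python
def update_reputation(current: float, peer_rating: float, alpha: float = 0.1) -> float:
    """Blend a new peer rating in [0, 1] into the running reputation.
    alpha controls how fast old behavior is forgotten (illustrative choice)."""
    return (1 - alpha) * current + alpha * peer_rating

# a stream of ratings posted by counterpart agents after interactions
rep = 0.5  # neutral starting reputation
for rating in (0.9, 0.8, 1.0, 0.2):
    rep = update_reputation(rep, rating)
print(f"reputation after four interactions: {rep:.3f}")
```

An update rule like this composes naturally with a reputation registry: only the compact running score needs to be posted on-chain, while the raw ratings can stay off-chain or be pruned.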

mind the gap

The gap between what agents can do and what we can verify they have done is the defining bottleneck for the future of agentic computing. Protocol Agent creates a benchmark to determine quality, and ERC-8004 creates an on-chain primitive enabling composability, visibility, and trustworthiness. With these specifications in place, what’s next is the ironing out of agentic network dynamics, orchestration layers, and post-quantum guarantees. De Rossi and others sketch a future built on verifiable interactions and trust heuristics based on reputation, earned the same way humans earn it: over time, and with registries. That makes sense, but what remains to be seen is whether retail models, given their restrictions on SFT, will compare with open-source and open-weight variants in benchmarked rankings, and how network costs, load, and TPS might change once on-chain agents are deployed en masse.

agency over time

Protocol Agent and ERC-8004 increase the trust and quality of use cases like proving your age or group membership without revealing your identity, verifying insurance coverage at a hospital without exposing your full medical history, and coordinating schedules, flights, and dinner reservations across multiple agents, but we are not all the way there yet. As we enter the agentic future, it is important to consider what it means to pass agency to our future in order to gain agency over time. Protocol Agent and ERC-8004 are refreshing reminders that there are still developers building intentionally and making decisions that prioritize safety and user privacy, but open-source and open-weight models still have a long way to go before they match retail variants. It is important to remember the foibles of the social media era, where convenience was traded for surveillance and data extraction at scale. Prioritizing open-source alternatives to paid variants supports work like Protocol Agent and ERC-8004 and democratizes access for all. If proven, as has happened in the past, these standards and countless future ones will likely be adopted by enterprise developers, but only time will tell how the future plays out.

thank you

A huge shout out to Marco De Rossi for Protocol Agent as well as De Rossi, Davide Crapis, Jordan Ellis, and Erik Reppel for their contributions to ERC-8004.

contributing

If you would like to contribute to either project, you can find the details below:

  1. Protocol Agent: What If Agents Could Use Cryptography in Everyday Life?
  2. ERC-8004: Trustless Agents
  3. Agent0