OpenAI has launched EVMbench, a new tool that measures how well artificial intelligence agents perform in a series of tests (benchmarks) on finding, patching, and exploiting vulnerabilities in smart contracts on Ethereum. The company developed EVMbench in collaboration with Paradigm, a firm that funds cryptocurrency network projects.
The benchmark evaluates AI agents on Ethereum across three key tasks.
First, detecting real vulnerabilities in open-source Ethereum code, using data from public audits. The assessment then considers the likelihood of vulnerabilities and the audit rewards available to other developers in the ecosystem.
Second, proposing patches that fix those vulnerabilities without compromising the operation of the protocol. “Agents must modify vulnerable contracts to eliminate exploitability while preserving intended functionality. This is verified through automated tests and exploit checks,” OpenAI said.
Third, simulating attacks that exploit those flaws to drain funds inside a safe, controlled simulation environment (sandbox). However, OpenAI clarifies that EVMbench “does not represent the full complexity of real-world smart contract security.”
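The patch check described in the second task can be pictured as a two-part gate: the contract's functional tests must still pass, and the known exploit must no longer succeed. The following is a minimal Python sketch of that idea only. The names `PatchResult` and `grade_patch` are hypothetical, and nothing here reflects EVMbench's actual harness or API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PatchResult:
    """Outcome of grading one candidate patch (hypothetical structure)."""
    tests_passed: bool     # intended functionality is preserved
    exploit_blocked: bool  # the known exploit no longer succeeds

    @property
    def accepted(self) -> bool:
        # A patch counts only if it satisfies both conditions at once.
        return self.tests_passed and self.exploit_blocked


def grade_patch(
    functional_tests: List[Callable[[], bool]],
    exploit: Callable[[], bool],
) -> PatchResult:
    """Run every functional test, then replay the exploit.

    Each test returns True when it passes. The exploit returns True
    when it still succeeds against the patched contract.
    """
    tests_passed = all(test() for test in functional_tests)
    exploit_blocked = not exploit()
    return PatchResult(tests_passed, exploit_blocked)


# A patch that stops the exploit but breaks a feature is rejected.
broken_fix = grade_patch(
    functional_tests=[lambda: True, lambda: False],
    exploit=lambda: False,
)

# A patch that keeps every feature working and stops the exploit is accepted.
clean_fix = grade_patch(
    functional_tests=[lambda: True, lambda: True],
    exploit=lambda: False,
)
```

Under these assumptions, `broken_fix.accepted` is false and `clean_fix.accepted` is true, which illustrates why removing a flaw without breaking the contract is harder than removing the flaw alone.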
EVMbench uses 120 real-world vulnerabilities drawn from 40 public competitive audits, including those run on Code4rena. Performance tests conducted so far show that the GPT-5.3-Codex agent model achieved a 72.2% success rate in exploiting flaws. The earlier GPT-5 model had a success rate of only 31.9%.
However, the models' results in vulnerability detection and patching are not very encouraging.
Performance is weaker on the detection and patching tasks. In the detection phase, the agent may stop after identifying a single issue rather than thoroughly auditing the codebase. In the patching phase, maintaining full functionality while eliminating subtle vulnerabilities remains a challenge.
OpenAI, artificial intelligence company.
Why is EVMbench important?
According to the company, the importance of this agent auditing tool lies in the fact that smart contracts routinely secure $100 billion in crypto assets within open-source protocols.
“As AI agents improve at reading, writing, and executing code, it becomes increasingly important to measure their capabilities in economically meaningful environments and to encourage the defensive use of AI systems to audit and strengthen deployed contracts,” the company said in a statement presenting the product.
OpenAI's benchmark for agents on Ethereum arrives at a time when autonomous agents are advancing rapidly within the crypto asset ecosystem. As reported by CriptoNoticias, they are already able to interact with complex environments such as the Lightning Network. On this second layer, agents can manage liquidity channels and economic interactions with other AIs.
Coinbase has launched Agentic Wallet, a wallet that allows AI agents to operate on the Base network without paying fees. And Phantom, Solana's most popular wallet, activated its MCP server to allow AI agents to manage balances and operate autonomously.
According to data from Token Terminal, the number of weekly transactions on the Ethereum network reached 17.3 million. The surge in transactions on the network came after the launch of ERC-8004 in August 2025. It is a standard that enables “discovering, choosing, and interacting with agents across organizational boundaries without the need for pre-existing trust” in the open agent economy.

