OpenAI has launched EVMbench, a new tool that measures the performance of artificial intelligence agents across a series of tests (benchmarks) on detecting, patching, and exploiting smart contracts on Ethereum. The company developed EVMbench in collaboration with Paradigm, a firm that funds cryptocurrency network projects.
The benchmark evaluates AI agents on Ethereum across three important aspects of a protocol.
First, detecting real vulnerabilities in open source Ethereum code, using data from public audits. The risk of those vulnerabilities and the associated audit rewards are then assessed for the rest of the developers in the ecosystem.
Second, suggesting patches that fix these vulnerabilities without compromising the operation of the protocol. “Agents must modify vulnerable contracts to eliminate exploitability while preserving intended functionality. This is verified by automated testing and exploit checks,” OpenAI said.
Third, simulating attacks that exploit these flaws to drain funds, inside a safe, controlled simulation environment (sandbox). However, OpenAI clarifies that EVMbench “doesn’t fully represent the complexity of real-world smart contract security.”
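The patch check OpenAI describes (tests must still pass and the exploit must stop working) can be pictured as a simple pass/fail rule. The Python sketch below is illustrative only: the names `PatchResult`, `patch_accepted`, and `success_rate` are hypothetical and are not taken from EVMbench, whose actual grading harness is not detailed in this article.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class PatchResult:
    """Outcome of checking one patched contract (hypothetical structure)."""
    tests_passed: bool       # the project's functional tests still pass
    exploit_succeeded: bool  # the known exploit still works after the patch


def patch_accepted(result: PatchResult) -> bool:
    # A patch counts only if functionality is preserved
    # AND the vulnerability is no longer exploitable.
    return result.tests_passed and not result.exploit_succeeded


def success_rate(results: Sequence[PatchResult]) -> float:
    # Fraction of accepted patches; 0.0 for an empty run.
    if not results:
        return 0.0
    return sum(patch_accepted(r) for r in results) / len(results)


if __name__ == "__main__":
    run = [
        PatchResult(tests_passed=True, exploit_succeeded=False),   # accepted
        PatchResult(tests_passed=True, exploit_succeeded=True),    # still exploitable
        PatchResult(tests_passed=False, exploit_succeeded=False),  # broke functionality
        PatchResult(tests_passed=True, exploit_succeeded=False),   # accepted
    ]
    print(success_rate(run))  # prints 0.5
```

Under this assumed rule, a patch that closes the hole but breaks the contract's intended behavior scores the same as one that fixes nothing, which matches the difficulty OpenAI reports for patching.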
EVMbench uses 120 real-world vulnerabilities drawn from 40 public competitive audits, including those run on Code4rena. Performance tests conducted so far show that the GPT-5.3-Codex agent model achieved a 72.2% success rate in exploiting flaws. The earlier GPT-5 model’s success rate was only 31.9%.
However, the models’ results in vulnerability detection and patching are not very encouraging.
“Performance is lower on detection and patching tasks. During the detection phase, the agent may stop after identifying a single issue rather than fully auditing the codebase. During the patching phase, maintaining full functionality while eliminating subtle vulnerabilities remains a challenge.”
OpenAI, artificial intelligence company.
Why is EVMbench important?
According to the company, the importance of this agent audit tool lies in the fact that smart contracts routinely secure $100 billion in crypto assets within open source protocols.
“As AI agents improve their skills in reading, writing, and executing code, it becomes increasingly important to measure their capabilities in economically relevant environments and to encourage the defensive use of AI systems to audit and strengthen deployed contracts,” the company said in a statement accompanying the product launch.
OpenAI’s move into agents on Ethereum comes at a time when autonomous agents are rapidly advancing across the crypto asset ecosystem. As reported by CriptoNoticias, they are already able to interact with complex environments such as the Lightning Network. On this second layer, agents can manage liquidity channels and economic interactions with other AIs.
Coinbase has launched Agentic Wallet, a wallet that allows AI agents to operate on the Base network without paying fees. And Phantom, Solana’s most popular wallet, activated its MCP server to allow AI agents to manage balances and operate autonomously.
According to data from Token Terminal, the number of weekly transactions on the Ethereum network reached 17.3 million. The surge in transactions on the network follows the launch of ERC-8004 in August 2025. It is a standard that enables “discovering, choosing, and interacting with agents across organizational boundaries without the need for pre-existing trust” in the open agent economy.

