Sonar Foundation Agent, with Anthropic's Claude Opus 4.5, achieves top scores in SWE-bench in 'verified' and 'full' categories

GENEVA, Switzerland and AUSTIN, Texas, March 11, 2026 /PRNewswire/ -- Sonar, an industry leader in code review and application verification, today announced that its Sonar Foundation Agent has achieved the top ranking on the unfiltered¹ SWE-bench leaderboard. The agent scored 79.2% on SWE-bench Verified and 52.62% on SWE-bench Full, marking a new milestone for autonomous agents in software remediation.

Developed by the research team behind AutoCodeRover, the Sonar Foundation Agent achieved its top benchmark results while maintaining high efficiency:

79.2% success rate: The highest recorded score on SWE-bench Verified

Elite efficiency: An average resolution time of 9 minutes per issue on SWE-bench Verified

Cost-effectiveness: A low average cost of $1.9 per issue on SWE-bench Verified, proving that high-performance AI remediation is viable for enterprise-scale deployment

SWE-bench is widely recognized as the most rigorous benchmark for evaluating how Large Language Models (LLMs) and AI agents perform on real-world software engineering tasks. It is described as "...a benchmark for evaluating LLMs on real-world software engineering tasks. It consists of GitHub issues and their corresponding fixes, allowing LLMs to be evaluated on their ability to generate patches that resolve these issues." Unlike other benchmarks that test isolated code snippets, SWE-bench requires agents to navigate entire codebases, reason across multiple files, and generate functional patches that must pass a project's original unit tests to be considered resolved. A subset of these tests is withheld for validation and is not accessible to the agent.
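The resolution criterion described above can be illustrated with a minimal Python sketch. It is not the SWE-bench harness: the function names, the test names, and the shape of the outcome dictionary are all hypothetical, chosen only to show the idea that an issue counts as resolved when every required test passes after the patch is applied.

```python
from typing import Dict, Iterable


def is_resolved(test_outcomes: Dict[str, bool], required_tests: Iterable[str]) -> bool:
    """Return True only if every required test ran and passed.

    test_outcomes maps a test name to True (passed) or False (failed).
    A required test missing from the outcomes is treated as a failure.
    An empty list of required tests is treated as "not resolved".
    """
    required = list(required_tests)
    if not required:
        return False
    return all(test_outcomes.get(name, False) for name in required)


def resolution_rate(results: Iterable[bool]) -> float:
    """Percentage of issues marked resolved (0.0 for an empty run)."""
    results = list(results)
    if not results:
        return 0.0
    return 100.0 * sum(results) / len(results)


# Hypothetical outcomes for one patched repository.
outcomes = {"test_parse_empty": True, "test_parse_unicode": False}

print(is_resolved(outcomes, ["test_parse_empty"]))                        # True
print(is_resolved(outcomes, ["test_parse_empty", "test_parse_unicode"]))  # False
print(resolution_rate([True, False, True, True]))                         # 75.0
```

Treating a missing test as a failure is a deliberate choice in this sketch: a patch that prevents a test from running should not be counted as a fix.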

As the industry embraces the rapid expansion of AI code generation, Sonar is establishing the critical layer of trusted, independent verification necessary for agentic software development. By focusing on the essential infrastructure of autonomous remediation, Sonar enables AI agents to not only investigate but reliably solve identified issues autonomously—reducing developer toil and accelerating innovation while ensuring the highest standards of code integrity.

Key technical innovations

The Sonar Foundation Agent builds on the technology of AutoCodeRover and incorporates several core advancements:

Advanced tool-calling: Built on the LlamaIndex framework, the agent utilizes stateful bash execution and AST (Abstract Syntax Tree) symbol searching to navigate complex codebases.

Thinking model integration: By leveraging "extended thinking" prompts, the agent can reason through complex logic errors rather than following prescriptive, brittle instructions.

Test-driven remediation: The agent prioritizes a test-driven approach to ensure that every patch generated is not only syntactically correct but functionally verified against the original issue.
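To give a sense of what AST symbol searching means in practice, here is a small sketch that uses Python's standard `ast` module to locate function and class definitions by name. It is an illustration only and does not represent the Sonar Foundation Agent's implementation; the `find_symbol` helper and the sample source are hypothetical.

```python
import ast
from typing import List, Tuple


def find_symbol(source: str, name: str) -> List[Tuple[str, int]]:
    """Return (kind, line_number) for each function or class named `name`.

    The search walks the parsed syntax tree, so it matches definitions
    only, not comments or string literals that merely contain the name.
    """
    tree = ast.parse(source)
    matches = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            if node.name == name:
                kind = "class" if isinstance(node, ast.ClassDef) else "function"
                matches.append((kind, node.lineno))
    # Sort by line number so results follow source order.
    return sorted(matches, key=lambda match: match[1])


# Hypothetical source file contents used only for this example.
SAMPLE = '''
class Parser:
    def parse(self, text):
        return text.split()

def parse(text):
    return Parser().parse(text)
'''

print(find_symbol(SAMPLE, "parse"))   # [('function', 3), ('function', 6)]
print(find_symbol(SAMPLE, "Parser"))  # [('class', 2)]
```

Compared with a plain text search, a syntax-tree lookup of this kind distinguishes a definition from a mention, which is one reason such a tool can help an agent orient itself in a large codebase.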

While the Sonar Foundation Agent is LLM-agnostic, it achieved peak efficacy on both SWE-bench Verified and SWE-bench Full with Anthropic's Claude Opus 4.5. This result reflects the agent's specialized design.

Unlocking autonomy: From AutoCodeRover to Foundation Agent

The Sonar Foundation Agent represents a shift in agentic architecture. By moving away from rigid, multi-stage workflows toward a "free workflow" model, Sonar demonstrates the full potential of modern Large Language Models (LLMs) for application verification.

"The challenge in building production-ready software isn't just identifying bugs and vulnerabilities—it's also about coping with the growing volume of issues and fixing them timely, especially as AI dramatically speeds up the development process with ever more code waiting to be reviewed," said Harry Wang, Chief Growth Officer at Sonar. "By providing agents with a higher degree of autonomy coupled with sound tests and verification in the feedback loop, we move from manual intervention to truly autonomous software issue remediation. We can now reliably solve challenges that were previously too complex for AI, ensuring software remains resilient and secure at scale without being held back by technical debt or review bottlenecks."

Fueling the future of AI software development

The technology behind the Sonar Foundation Agent powers the SonarQube Remediation Agent (in Beta), and will be further incorporated into the Sonar platform.

"As models get better at coding and reasoning, we should give them more autonomy and better tools so they can take on new classes of actions," said Ridwan Shariffdeen, Principal Research Scientist at Sonar. "Reaching #1 on SWE-bench Verified proves that when you give an agent the right context and the right tools, it can solve real-world engineering challenges at a fraction of the time and cost of manual intervention."

The Sonar Foundation Agent remains a research innovation and is not available commercially. For more technical details on the Sonar Foundation Agent and the research behind its SWE-bench performance, visit the Sonar Blog.

About Sonar

Sonar, the industry standard for code verification and automated code review, helps reduce outages, improve security, and lower risks associated with AI and agentic coding. As an independent verification platform, Sonar enables organizations to securely develop at the speed of AI. Sonar is the foundation for high-performance software engineering, analyzing over 750 billion lines of code daily to ensure applications are secure, reliable, and maintainable. Rooted in the open source community, Sonar is trusted by 7M+ developers globally, including teams at Snowflake, Booking.com, Deutsche Bank, AstraZeneca, and Ford Motor Company.

To learn more about Sonar, please visit: www.sonar.com

¹ To see unfiltered results, ensure the "Filters" button is toggled to "No Filters".

