Accessibility Statement Skip Navigation
  • Resources
  • Investor Relations
  • Journalists
  • Agencies
  • Client Login
  • Send a Release
Return to PR Newswire homepage
  • News
  • Products
  • Contact
When typing in this field, a list of search results will appear and be automatically updated as you type.

Searching for your content...

No results found. Please change your search terms and try again.
  • News in Focus
      • Browse News Releases

      • All News Releases
      • All Public Company
      • English-only
      • News Releases Overview

      • Multimedia Gallery

      • All Multimedia
      • All Photos
      • All Videos
      • Multimedia Gallery Overview

      • Trending Topics

      • All Trending Topics
  • Business & Money
      • Auto & Transportation

      • All Automotive & Transportation
      • Aerospace, Defense
      • Air Freight
      • Airlines & Aviation
      • Automotive
      • Maritime & Shipbuilding
      • Railroads and Intermodal Transportation
      • Supply Chain/Logistics
      • Transportation, Trucking & Railroad
      • Travel
      • Trucking and Road Transportation
      • Auto & Transportation Overview

      • View All Auto & Transportation

      • Business Technology

      • All Business Technology
      • Blockchain
      • Broadcast Tech
      • Computer & Electronics
      • Computer Hardware
      • Computer Software
      • Data Analytics
      • Electronic Commerce
      • Electronic Components
      • Electronic Design Automation
      • Financial Technology
      • High Tech Security
      • Internet Technology
      • Nanotechnology
      • Networks
      • Peripherals
      • Semiconductors
      • Business Technology Overview

      • View All Business Technology

      • Entertain­ment & Media

      • All Entertain­ment & Media
      • Advertising
      • Art
      • Books
      • Entertainment
      • Film and Motion Picture
      • Magazines
      • Music
      • Publishing & Information Services
      • Radio & Podcast
      • Television
      • Entertain­ment & Media Overview

      • View All Entertain­ment & Media

      • Financial Services & Investing

      • All Financial Services & Investing
      • Accounting News & Issues
      • Acquisitions, Mergers and Takeovers
      • Banking & Financial Services
      • Bankruptcy
      • Bond & Stock Ratings
      • Conference Call Announcements
      • Contracts
      • Cryptocurrency
      • Dividends
      • Earnings
      • Earnings Forecasts & Projections
      • Financing Agreements
      • Insurance
      • Investments Opinions
      • Joint Ventures
      • Mutual Funds
      • Private Placement
      • Real Estate
      • Restructuring & Recapitalization
      • Sales Reports
      • Shareholder Activism
      • Shareholder Meetings
      • Stock Offering
      • Stock Split
      • Venture Capital
      • Financial Services & Investing Overview

      • View All Financial Services & Investing

      • General Business

      • All General Business
      • Awards
      • Commercial Real Estate
      • Corporate Expansion
      • Earnings
      • Environmental, Social and Governance (ESG)
      • Human Resource & Workforce Management
      • Licensing
      • New Products & Services
      • Obituaries
      • Outsourcing Businesses
      • Overseas Real Estate (non-US)
      • Personnel Announcements
      • Real Estate Transactions
      • Residential Real Estate
      • Small Business Services
      • Socially Responsible Investing
      • Surveys, Polls and Research
      • Trade Show News
      • General Business Overview

      • View All General Business

  • Science & Tech
      • Consumer Technology

      • All Consumer Technology
      • Artificial Intelligence
      • Blockchain
      • Cloud Computing/Internet of Things
      • Computer Electronics
      • Computer Hardware
      • Computer Software
      • Consumer Electronics
      • Cryptocurrency
      • Data Analytics
      • Electronic Commerce
      • Electronic Gaming
      • Financial Technology
      • Mobile Entertainment
      • Multimedia & Internet
      • Peripherals
      • Social Media
      • STEM (Science, Tech, Engineering, Math)
      • Supply Chain/Logistics
      • Wireless Communications
      • Consumer Technology Overview

      • View All Consumer Technology

      • Energy & Natural Resources

      • All Energy
      • Alternative Energies
      • Chemical
      • Electrical Utilities
      • Gas
      • General Manufacturing
      • Mining
      • Mining & Metals
      • Oil & Energy
      • Oil and Gas Discoveries
      • Utilities
      • Water Utilities
      • Energy & Natural Resources Overview

      • View All Energy & Natural Resources

      • Environ­ment

      • All Environ­ment
      • Conservation & Recycling
      • Environmental Issues
      • Environmental Policy
      • Environmental Products & Services
      • Green Technology
      • Natural Disasters
      • Environ­ment Overview

      • View All Environ­ment

      • Heavy Industry & Manufacturing

      • All Heavy Industry & Manufacturing
      • Aerospace & Defense
      • Agriculture
      • Chemical
      • Construction & Building
      • General Manufacturing
      • HVAC (Heating, Ventilation and Air-Conditioning)
      • Machinery
      • Machine Tools, Metalworking and Metallurgy
      • Mining
      • Mining & Metals
      • Paper, Forest Products & Containers
      • Precious Metals
      • Textiles
      • Tobacco
      • Heavy Industry & Manufacturing Overview

      • View All Heavy Industry & Manufacturing

      • Telecomm­unications

      • All Telecomm­unications
      • Carriers and Services
      • Mobile Entertainment
      • Networks
      • Peripherals
      • Telecommunications Equipment
      • Telecommunications Industry
      • VoIP (Voice over Internet Protocol)
      • Wireless Communications
      • Telecomm­unications Overview

      • View All Telecomm­unications

  • Lifestyle & Health
      • Consumer Products & Retail

      • All Consumer Products & Retail
      • Animals & Pets
      • Beers, Wines and Spirits
      • Beverages
      • Bridal Services
      • Cannabis
      • Cosmetics and Personal Care
      • Fashion
      • Food & Beverages
      • Furniture and Furnishings
      • Home Improvement
      • Household, Consumer & Cosmetics
      • Household Products
      • Jewelry
      • Non-Alcoholic Beverages
      • Office Products
      • Organic Food
      • Product Recalls
      • Restaurants
      • Retail
      • Supermarkets
      • Toys
      • Consumer Products & Retail Overview

      • View All Consumer Products & Retail

      • Entertain­ment & Media

      • All Entertain­ment & Media
      • Advertising
      • Art
      • Books
      • Entertainment
      • Film and Motion Picture
      • Magazines
      • Music
      • Publishing & Information Services
      • Radio & Podcast
      • Television
      • Entertain­ment & Media Overview

      • View All Entertain­ment & Media

      • Health

      • All Health
      • Biometrics
      • Biotechnology
      • Clinical Trials & Medical Discoveries
      • Dentistry
      • FDA Approval
      • Fitness/Wellness
      • Health Care & Hospitals
      • Health Insurance
      • Infection Control
      • International Medical Approval
      • Medical Equipment
      • Medical Pharmaceuticals
      • Mental Health
      • Pharmaceuticals
      • Supplementary Medicine
      • Health Overview

      • View All Health

      • Sports

      • All Sports
      • General Sports
      • Outdoors, Camping & Hiking
      • Sporting Events
      • Sports Equipment & Accessories
      • Sports Overview

      • View All Sports

      • Travel

      • All Travel
      • Amusement Parks and Tourist Attractions
      • Gambling & Casinos
      • Hotels and Resorts
      • Leisure & Tourism
      • Outdoors, Camping & Hiking
      • Passenger Aviation
      • Travel Industry
      • Travel Overview

      • View All Travel

  • Policy & Public Interest
      • Policy & Public Interest

      • All Policy & Public Interest
      • Advocacy Group Opinion
      • Animal Welfare
      • Congressional & Presidential Campaigns
      • Corporate Social Responsibility
      • Domestic Policy
      • Economic News, Trends, Analysis
      • Education
      • Environmental
      • European Government
      • FDA Approval
      • Federal and State Legislation
      • Federal Executive Branch & Agency
      • Foreign Policy & International Affairs
      • Homeland Security
      • Labor & Union
      • Legal Issues
      • Natural Disasters
      • Not For Profit
      • Patent Law
      • Public Safety
      • Trade Policy
      • U.S. State Policy
      • Policy & Public Interest Overview

      • View All Policy & Public Interest

  • People & Culture
      • People & Culture

      • All People & Culture
      • Aboriginal, First Nations & Native American
      • African American
      • Asian American
      • Children
      • Diversity, Equity & Inclusion
      • Hispanic
      • Lesbian, Gay & Bisexual
      • Men's Interest
      • People with Disabilities
      • Religion
      • Senior Citizens
      • Veterans
      • Women
      • People & Culture Overview

      • View All People & Culture

      • In-Language News

      • Arabic
      • español
      • português
      • Česko
      • Danmark
      • Deutschland
      • España
      • France
      • Italia
      • Nederland
      • Norge
      • Polska
      • Portugal
      • Россия
      • Slovensko
      • Suomi
      • Sverige
  • Explore Our Platform
  • Plan Campaigns
  • Create with AI
  • Distribute Press Releases
  • Report Results
  • Amplify Content
  • All Products
  • General Inquiries
  • Editorial Bureaus
  • Partnerships
  • Media Inquiries
  • Worldwide Offices
  • Hamburger menu
  • PR Newswire: news distribution, targeting and monitoring
  • Send a Release
    • ALL CONTACT INFO
    • Contact Us

      888-776-0942
      from 8 AM - 10 PM ET

  • Send a Release
  • Client Login
  • Resources
  • Blog
  • Journalists
  • RSS
  • News in Focus
    • Browse All News
    • Multimedia Gallery
    • Trending Topics
  • Business & Money
    • Auto & Transportation
    • Business Technology
    • Entertain­ment & Media
    • Financial Services & Investing
    • General Business
  • Science & Tech
    • Consumer Technology
    • Energy & Natural Resources
    • Environ­ment
    • Heavy Industry & Manufacturing
    • Telecomm­unications
  • Lifestyle & Health
    • Consumer Products & Retail
    • Entertain­ment & Media
    • Health
    • Sports
    • Travel
  • Policy & Public Interest
  • People & Culture
    • People & Culture
  • Send a Release
  • Client Login
  • Resources
  • Blog
  • Journalists
  • RSS
  • Explore Our Platform
  • Plan Campaigns
  • Create with AI
  • Distribute Press Releases
  • Report Results
  • Amplify Content
  • All Products
  • Send a Release
  • Client Login
  • Resources
  • Blog
  • Journalists
  • RSS
  • General Inquiries
  • Editorial Bureaus
  • Partnerships
  • Media Inquiries
  • Worldwide Offices
  • Send a Release
  • Client Login
  • Resources
  • Blog
  • Journalists
  • RSS

Endor Labs Launches Agentic Code Security Benchmark, Finds Top-Performing AI Coding Agents Pass Tests But Still Fail Security


News provided by

Endor Labs

Apr 15, 2026, 08:00 ET

Share this article

Share toX

Share this article

Share toX

The benchmark extends the Carnegie Mellon SusVibes framework to continuously evaluate leading AI coding agents, updates as new agents and models are released

PALO ALTO, Calif., April 15, 2026 /PRNewswire/ -- Endor Labs, today announced the launch of the agentic code security benchmark, extending the existing SusVibes framework from leading academic researchers to evaluate how securely AI coding agents generate code in real-world scenarios. Alongside the benchmark, Endor Labs is introducing the Agent Security League, a public leaderboard tracking the performance of leading AI coding agents on both functional correctness and security outcomes.

Built on real-world code and peer-reviewed research developed at Carnegie Mellon University, the benchmark extends SusVibes, a framework that evaluates 200 real-world tasks drawn from 108 open-source projects and covers 77 Common Weakness Enumeration (CWE) vulnerability classes. Endor Labs introduced new test harnesses for agents like Cursor, evaluated new AI models, and introduced new anti-cheating safeguards, including prompt hardening and automated detection systems. This is an important extension of SusVibes to address the observed cheating of newer agents.

As AI coding agents become increasingly embedded in modern development workflows, organizations face a growing but under-measured risk: code that works but isn't secure. Endor Labs' benchmark reveals just how significant that gap is. For the highest performing agent, 84.4% of AI-generated code passed functional tests, but the highest performing security agent still only achieved 17.3% of tests, leaving over 80% of outputs vulnerable.

"AI coding agents are dramatically increasing the speed and scale at which software gets written, but security isn't keeping pace," said Varun Badhwar, CEO at Endor Labs. "The challenge isn't just whether the code works, it's whether it's actually safe in the context of a real system. This work builds on rigorous university research grounded in real-world open source code. Today, Endor Labs is extending that foundation and making the continuous evaluation of new models public, pushing the industry toward greater accountability and giving teams a clearer view of how these systems actually behave."

The Agent Security League evaluates coding agents across two key dimensions: whether the code works, and whether it does so without introducing vulnerabilities. The results reveal a stark gap between the two. While many agents perform well on functionality, security consistently fails. Even the top-performing agent achieved just 17.3% security correctness, and 87% of code generated by AI coding agents contains at least one security vulnerability, underscoring how systemic and unresolved this challenge remains.

Other notable findings include:

  • OpenAI Codex with GPT 5.4 scored the highest on security correctness (17.3%), and Cursor with Claude Opus 4.6 scored the highest for functional correctness (84.4%).
  • Even the best-performing functional combo, Cursor with Claude Opus 4.6, produced secure code only 7.8% of the time. The gap between that and the lowest-scoring security combo — SWE-Agent with Gemini 2.5 Pro at 4.5% — was just 3 points.
  • Newer agent/model combinations exhibited "cheating" behavior — where the agents ignored explicit instructions not to inspect git history. In one mode, cheating was observed in 81.5% of all benchmark tasks (163/200).

"This benchmark moves beyond synthetic tests to show how AI coding agents actually behave in real development environments," said Luca Compagna, Senior Security Researcher at Endor Labs. "Unlike human developers, agents lack the contextual awareness and security best practices that uphold our industry. The result is code that passes functional tests while still introducing exploitable vulnerabilities — a gap the industry can not afford to ignore."

This benchmark is the first to evaluate real-world applications at scale, covering a broader range of vulnerability classes, requiring larger code edits, and analyzing how agents perform across different models.

The Agent Security League leaderboard provides an ongoing, transparent view into how leading AI coding agents perform over time. Updated as new agents and models are released, the leaderboard provides some indicators to help developers to choose safer coding tools, security teams to benchmark risk exposure, and model providers to improve security performance.

The launch of the AI Coding Agentic Security Benchmark is part of Endor Labs' broader mission to secure the software supply chain in the age of AI. To help address these risks, Endor Labs also provides AURI, a security harness designed to bring real-time security context into AI-assisted development workflows.

To see where your agent may fall on the leaderboard, visit: endorlabs.com/research/ai-code-security-benchmark.

About Endor Labs
Endor Labs is the AI-native application security platform for teams that refuse to compromise between speed and security.  helps teams identify, prioritize, and fix the vulnerabilities across source code, open-source dependencies, and container images. With deep program analysis, automated remediation, and unmatched coverage, Endor Labs empowers modern engineering and security teams to move fast without compromise.

Media Contact
Rebecca Reese
[email protected]

SOURCE Endor Labs

21%

more press release views with 
Request a Demo

Modal title

Also from this source

Endor Labs Finds Malware in Open Source Ecosystems Surges 14x in Two Years as Organizations Struggle to Respond

Malware in open source software is no longer a fringe threat–it's accelerating at an unprecedented rate. In 2025 alone, more than 90% of open source...

Endor Labs Introduces AURI, Security Intelligence for Agentic Software Development

Endor Labs, the leader in AI-native application security, today announced the launch of AURI by Endor Labs, a unified security intelligence platform...

More Releases From This Source

Explore

Computer & Electronics

Computer & Electronics

Computer Software

Computer Software

Computer Software

Computer Software

High Tech Security

High Tech Security

News Releases in Similar Topics

Contact PR Newswire

  • Call PR Newswire at 888-776-0942
    from 8 AM - 9 PM ET
  • Chat with an Expert
  • General Inquiries
  • Editorial Bureaus
  • Partnerships
  • Media Inquiries
  • Worldwide Offices

Products

  • For Marketers
  • For Public Relations
  • For IR & Compliance
  • For Agency
  • All Products

About

  • About PR Newswire
  • About Cision
  • Become a Publishing Partner
  • Become a Channel Partner
  • Careers
  • Accessibility Statement
  • APAC
  • APAC - Simplified Chinese
  • APAC - Traditional Chinese
  • Brazil
  • Canada
  • Czech
  • Denmark
  • Finland
  • France
  • Germany
  • India
  • Indonesia
  • Israel
  • Italy
  • Japan
  • Korea
  • Mexico
  • Middle East
  • Middle East - Arabic
  • Netherlands
  • Norway
  • Poland
  • Portugal
  • Russia
  • Slovakia
  • Spain
  • Sweden
  • United Kingdom
  • Vietnam

My Services

  • All New Releases
  • Platform Login
  • ProfNet
  • Data Privacy

Do not sell or share my personal information:

  • Submit via [email protected] 
  • Call Privacy toll-free: 877-297-8921

Contact PR Newswire

Products

About

My Services
  • All News Releases
  • Platform Login
  • ProfNet
Call PR Newswire at
888-776-0942
  • Terms of Use
  • Privacy Policy
  • Information Security Policy
  • Site Map
  • RSS
  • Cookies
Copyright © 2026 Cision US Inc.