Blitzy Blows Past SWE-bench Verified, Demonstrating Next Frontier in AI Progress

News provided by

Blitzy

Sep 09, 2025, 08:30 ET

New autonomous code generation platform achieves breakthrough performance through extended inference time compute, signaling new paradigm beyond pretraining scaling

CAMBRIDGE, Mass., Sept. 9, 2025 /PRNewswire/ -- Blitzy, the autonomous software engineering orchestration platform, today announced it has achieved the top position on SWE-bench Verified, the industry's leading benchmark for AI coding capabilities. Blitzy's 86.8% score represents a 13.02% relative improvement (a 10 percentage point leap) over the previous best of 76.8%. It is the largest single advance since March 2024, when Devin achieved a 6.9% improvement (11.9 percentage point leap) over existing state-of-the-art models. The result demonstrates Blitzy's technical excellence, shows that inference time scaling delivers exponential rather than incremental improvements, and establishes the company's credibility as the leader in autonomous software development.
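
The two figures quoted for Blitzy's result measure different things: the percentage point gain is a simple difference between scores, while the relative improvement divides that difference by the earlier score. A minimal Python sketch of the arithmetic, using the 86.8% score and the 76.8% previous best cited in this release (the function name `improvement` is illustrative, not part of any benchmark tooling):

```python
def improvement(previous: float, current: float) -> tuple[float, float]:
    """Return (percentage-point gain, relative gain in percent)."""
    points = current - previous          # absolute difference between scores
    relative = points / previous * 100   # gain as a share of the earlier score
    return points, relative


# Scores taken from this release: previous best 76.8%, Blitzy 86.8%.
points, relative = improvement(previous=76.8, current=86.8)
print(f"{points:.1f} percentage points, {relative:.2f}% relative")
# prints "10.0 percentage points, 13.02% relative"
```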

Blitzy's unprecedented result comes at a time when AI progress has notably decelerated across multiple dimensions. Pre-training improvements have become increasingly incremental compared to the dramatic jumps of previous generations. Even single models appear to be hitting performance ceilings, with leading systems clustering around 70-75% on SWE-bench Verified (a plateau suggesting fundamental limitations in current approaches). While reasoning capabilities have shown promise through foundation models including OpenAI's recent GPT-5 release, the true potential of scaling inference time compute to drive exponential results remained largely unproven — until now.

Smashing Through The "Unsolvables" Ceiling

Blitzy's 86.8% performance isn't just a benchmark victory — it's a breakthrough beyond what the AI community considered the practical ceiling for SWE-bench Verified. OpenAI's analysis during the creation of SWE-bench Verified found that human evaluators identified numerous samples as "hard or impossible to solve," due to ambiguous issue descriptions, insufficient context, or contradictory requirements that rapid AI systems couldn't navigate. Previous attempts plateaued as they encountered these "unsolvable" problems that stumped single-pass reasoning.

"The 'unsolvables' weren't actually unsolvable — they just required deeper thinking than System-1 AI could provide," explained Sid Pardeshi, Blitzy CTO and Co-founder. "By design, our platform enables AI to think for hours or days rather than seconds or minutes, unlocking solutions to problems that stumped every previous approach. This validates inference time scaling as the key to exponential capability improvements."

Blitzy's System-2 approach transforms these roadblocks into solvable challenges through extended reasoning time. This accomplishment signals that the ceiling for complex problem-solving isn't determined by problem difficulty, but by reasoning depth. As Blitzy demonstrates, with sufficient thinking time, AI systems can break through barriers that seemed insurmountable under time pressure.

Blitzy's SWE-bench Verified performance may signal a fundamental shift in how companies develop AI coding solutions. The industry's current emphasis on quick responses and immediate feedback has started to give way to more in-depth reasoning and the higher-quality solutions that follow.

The Evolution of Benchmarks

The Foundation Era

This evolution is reflected in the benchmark landscape itself. SWE-bench Verified served an important purpose when AI coding capabilities were nascent, providing standardized evaluation for models attempting basic programming tasks. The benchmark proved AI could move beyond code completion to actual problem-solving, establishing credibility for the entire autonomous coding category.

For over a year, SWE-bench Verified remained the gold standard, driving incremental progress that lifted top scores from 13.86% (March 2024) to the 76.8% reached by previous leaders. This incremental progression served the industry well, providing clear metrics for comparing approaches and validating improvements.

However, recent research has highlighted evaluation challenges inherent in any static benchmark. Studies indicate that 32.67% of SWE-bench's patches may involve solution leakage (where problem descriptions inadvertently contain guidance), while 94% of issues predate the training cutoffs of the LLMs being evaluated, raising questions about whether high performance reflects genuine reasoning or pattern recognition from training. These findings illuminate the complexity of measuring true AI capabilities versus optimized performance on known problem sets.

The Next AI Frontier: System-2 Everywhere

As AI capabilities have rapidly matured, the benchmark's limitations have become apparent. Its focus on isolated Python bug fixes is disconnected from enterprise realities, which require sustained reasoning across massive codebases, architectural transformation, and multi-step workflow orchestration. Research from Berkeley AI published in February 2024 predicted that cutting-edge results would increasingly emerge from compound AI systems rather than individual models. Blitzy's SWE-bench performance validates this prediction and shows that the approach works at both benchmark and enterprise scale.

Inference time compute is the scaling frontier that enables exponential rather than incremental AI progress. Unlike pretraining scaling, with its resource constraints and diminishing returns, inference time scaling offers improvement potential bounded only by problem complexity and computational budget allocation.
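
As a rough illustration of how additional inference budget can translate into solved problems, consider repeated sampling, one generic way to spend more compute at inference time. This is not a description of Blitzy's platform. The sketch below assumes each attempt succeeds independently with the same probability and that a perfect checker can recognize a correct attempt. Both assumptions are simplifications, and the name `solve_rate` is hypothetical.

```python
def solve_rate(p_single: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent tries succeeds.

    Assumes every try succeeds with the same probability `p_single` and
    that a correct try is always recognized (both are simplifications).
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    # All attempts fail with probability (1 - p)^k; take the complement.
    return 1.0 - (1.0 - p_single) ** attempts


# Compare an easier problem (p = 0.5) with a harder one (p = 0.05)
# as the attempt budget grows.
for budget in (1, 4, 16, 64):
    easy = solve_rate(0.5, budget)
    hard = solve_rate(0.05, budget)
    print(f"budget={budget:>2}  easy={easy:.3f}  hard={hard:.3f}")
```

Under these assumptions, a harder problem (lower per-attempt success rate) needs a larger budget to reach the same solve rate, which is one way to read the statement that progress is bounded by problem complexity and budget allocation.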

This paradigm extends far beyond coding. Medical diagnosis, financial analysis, legal research, and engineering design all represent domains requiring careful consideration and multi-step reasoning that benefit from extended inference time approaches. The transition from System-1 to System-2 AI represents the next exponential improvement curve the industry has been seeking.

Enterprise Validation: Beyond Benchmarks

Blitzy's 86.8% SWE-bench Verified performance validates its technical excellence, but its enterprise impact reveals capabilities that current benchmarks fundamentally cannot measure. Its real-world transformations demonstrate the exponential power of inference time compute — proving AI can architect, modernize, and transform entire systems at unprecedented scale. This enterprise-scale context management and multi-step workflow orchestration represents the next frontier beyond isolated coding benchmarks.

Examples:

  • Modernization of 4 million lines of legacy Java, leveraging 72+ hours of distributed reasoning time per major architectural decision, a level of complexity impossible with time-constrained approaches.

  • Service extraction from 500,000-line monoliths, which requires 24+ hours of architectural analysis to identify optimal boundaries and integration patterns.

  • Cross-language migration that maintains mathematical precision through extended verification cycles, which ensure semantic equivalence across algorithmic transformations.

About Blitzy

Blitzy is the System-2 AI code generation platform that achieved breakthrough performance on SWE-bench Verified through extended inference time compute and multi-agent orchestration. Unlike traditional AI coding tools that rely on rapid single-pass generation, Blitzy enables hours or days of reasoning time for complex enterprise challenges, coordinating multiple specialized agents to deliver comprehensive solutions.

The platform maintains coherent understanding across multi-million-line codebases, enabling semantic-preserving transformation between programming languages and entire technology stacks. Blitzy orchestrates comprehensive redevelopment rather than incremental patches, coordinating complex development processes from requirements through deployment while engaging in progressive refinement cycles that optimize results far beyond single-pass generation.

Enterprise customers across financial services, professional services, and technology sectors rely on Blitzy's extended reasoning capabilities to solve problems that require architectural depth rather than coding speed — transforming legacy systems, extracting services from monolithic applications, and modernizing entire technology ecosystems through sustained AI reasoning that no benchmark currently measures.

Media Contact(s): Brian Elliott, [email protected]

Source(s):

SWE-Bench http://swebench.com

White paper:

https://paper.blitzy.com/blitzy_system_2_ai_platform_topping_swe_bench_verified.pdf

SOURCE Blitzy
