APTO Releases Training Dataset to Enhance the Mathematical Reasoning Capabilities of Large Language Models (LLMs)

News provided by

APTO, Inc.

Sep 30, 2025, 13:00 ET

TOKYO, Sept. 30, 2025 /PRNewswire/ -- As generative AI use continues to increase, accuracy has become the most important metric and a key factor in decisions around adoption and utilization. APTO is committed to supporting companies and organizations through high-quality AI data.

An example of the reasoning process breaking down midway.

In recent years, the performance of LLMs has improved dramatically. However, in mathematical tasks that require multi-step calculations or strict answer formats, errors and formatting issues are still frequently observed. To address these challenges, we have developed and released a training dataset for LLMs designed to enhance reasoning and answer accuracy in mathematical problem-solving.

Background: Developing an LLM Dataset for Mathematical Reasoning

LLM developers and users have no doubt encountered the following challenges when dealing with mathematics:

  • The model does not output step-by-step calculations
  • The model fails to follow the calculation process accurately and produces an incorrect answer
  • The response does not conform to the required answer format, such as an integer or a fraction
  • The response does not show the problem-solving process (for example, it omits intermediate steps or outputs only the final answer)

In tackling complex mathematical problems, it is not uncommon to encounter outputs that disregard instructions or rules and fail to provide accurate answers.

We drew on our experience enhancing the reasoning abilities of LLMs to improve the accuracy of answers to mathematical problems. Based on this expertise, we developed a dataset of mathematical problems that includes complex reasoning processes.

About This Dataset

This dataset is mathematical reasoning data in JSON Lines format, created through a combination of machine generation and human review. It is designed for training PRMs (Process/Preference Reward Models) and includes not only the problem statement, correct answer, and generated responses, but also the reasoning process (Chain-of-Thought) and evaluation information for each step. This enables qualitative assessment of the reasoning process, rather than just a simple right-or-wrong judgment.

What Is Included in the Dataset?

  • problem (the mathematical problem given as input)
  • expected_answer (used for automatic grading, evaluation, and format checking)
  • generated_answer (used for analyzing error patterns and extracting difficult cases)
  • answer_match (used for difficulty adjustment, stratified evaluation, and sampling control)
  • step_evaluations: an array in which each element consists of {step_index, step_text, verdict} (the text of each step and its correctness label are used for PRM/process supervision)
  • metadata.step_evaluation_category (e.g., all_correct/partial_correct)
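
As a rough illustration of the fields listed above, the following Python sketch parses one JSON Lines record. The field names come from the list; the record itself, the value types (for example, answer_match as a boolean and step_index starting at 0), and the helper name read_records are assumptions made for this example and are not taken from the published data.

```python
import json

# Hypothetical record: field names follow the list above,
# but every value here is invented for illustration.
SAMPLE_JSONL = """\
{"problem": "Compute 1/2 + 1/3.", "expected_answer": "5/6", "generated_answer": "5/6", "answer_match": true, "step_evaluations": [{"step_index": 0, "step_text": "Rewrite both fractions over 6: 3/6 + 2/6.", "verdict": "correct"}, {"step_index": 1, "step_text": "Add the numerators to get 5/6.", "verdict": "correct"}], "metadata": {"step_evaluation_category": "all_correct"}}
"""


def read_records(text):
    """Parse JSON Lines text: one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


records = read_records(SAMPLE_JSONL)
first = records[0]

# Collect the per-step correctness labels used for process supervision.
verdicts = [step["verdict"] for step in first["step_evaluations"]]
print(first["problem"], verdicts, first["metadata"]["step_evaluation_category"])
```

To read an actual file, the same function could be applied to the file's text content, one record per line.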

An Example of the Reasoning Process Breaking Down Midway

In this example, some of the geometric constraints are overlooked, and the range is limited only by the number-theoretic condition that "the area must be an integer." As a result, configurations that should not be allowed are included, leading to the incorrect conclusion of a difference of 300. Because the initial derivation is correct and the reasoning collapses only at the end, this is categorized as partial_correct, a typical failure pattern.

(This is an example of an incorrect answer and constitutes only a part of the published dataset.)
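
One way to locate where such a breakdown occurs is to scan the step labels in order. The sketch below is an assumption about how the step_evaluations field might be used, not a description of APTO's tooling; the function name first_failed_step and the three sample steps are hypothetical.

```python
def first_failed_step(step_evaluations):
    """Return the step_index of the first step not labeled "correct", or None.

    Steps labeled "incorrect" or "unclear" both count as a breakdown here;
    that choice is an assumption of this sketch.
    """
    ordered = sorted(step_evaluations, key=lambda step: step["step_index"])
    for step in ordered:
        if step["verdict"] != "correct":
            return step["step_index"]
    return None


# Hypothetical steps mirroring the failure pattern described above:
# a correct start followed by a late error.
steps = [
    {"step_index": 0, "step_text": "Set up the area formula.", "verdict": "correct"},
    {"step_index": 1, "step_text": "Require the area to be an integer.", "verdict": "correct"},
    {"step_index": 2, "step_text": "Count cases without the geometric constraint.", "verdict": "incorrect"},
]
print(first_failed_step(steps))  # prints 2
```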

We ensure quality by combining automated evaluation with human checks, covering both format compliance and the accuracy of final answers. The dataset we're releasing includes 300 samples taken from the data used in training.
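
The automated part of such a check could look roughly like the following sketch, which tests whether an answer is an integer or a fraction and then compares it with the expected answer. The accepted format (an optional minus sign, digits, and an optional "/denominator") and the names parse_answer and grade are assumptions for illustration; the actual checks used for the dataset are not described in this release.

```python
import re
from fractions import Fraction

# Assumed answer format: an integer such as "42" or a fraction such as "5/6".
_ANSWER_PATTERN = re.compile(r"-?[0-9]+(/[0-9]+)?")


def parse_answer(text):
    """Return the answer as a Fraction, or None if the format is not accepted."""
    text = text.strip()
    if not _ANSWER_PATTERN.fullmatch(text):
        return None
    try:
        return Fraction(text)
    except ZeroDivisionError:  # e.g. "3/0"
        return None


def grade(expected_answer, generated_answer):
    """Check format compliance first, then equality of the parsed values."""
    expected = parse_answer(expected_answer)
    generated = parse_answer(generated_answer)
    format_ok = generated is not None
    match = format_ok and expected is not None and generated == expected
    return {"format_ok": format_ok, "answer_match": match}


print(grade("5/6", "10/12"))   # equal values, accepted format
print(grade("5/6", "0.8333"))  # decimal is rejected by the assumed format
```

Comparing parsed Fraction values rather than raw strings means that equivalent forms such as 5/6 and 10/12 are treated as the same answer, while a decimal approximation fails the format check.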

The Types of Questions

To keep the reasoning processes from being skewed toward specific domains, we divided the questions into the following categories:

  1. Calculus

  2. Algebra

  3. Geometry

  4. Probability, Statistics, and Discrete Mathematics

The Reasoning Process

'Reasoning Process (Chain-of-Thought)' refers to the step-by-step thinking involved in solving a mathematical problem.

The dataset structures this process as an ordered sequence: reading the problem, carrying out calculations step by step, and arriving at the answer. Each solution contains between two and eight reasoning steps.

Data Performance Evaluation Results

A model trained on this dataset was evaluated using the following AIME problem sets as external benchmarks:

  • 2024 AIME (HuggingFaceH4/aime_2024) (*1)
  • 2025 AIME (math-ai/aime25) (*2)

The evaluation procedure is outlined below:

1. Fine-tuned the model using reasoning data for mathematical problem solving as training data.

This fine-tuning is a multitask approach that combines PRM (Process Reward Model) and CLM (Causal Language Modeling).

In the PRM task, we used step_evaluations.step_text from the dataset as input, and the verdict labels ("correct / incorrect / unclear") as the training signal to classify whether each step was right or wrong. At the same time, we combined this with CLM (next-token prediction) to help the model maintain its ability to generate text. For training, we applied LoRA, which allowed us to fine-tune efficiently while keeping most of the base model frozen.
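
The multitask objective described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the release does not say how the two losses were weighted or how the classification head was built, so prm_weight, the label-to-id mapping, and all function names are hypothetical, and the logits are toy numbers rather than model outputs. LoRA is not shown.

```python
import math

# Assumed mapping from the verdict labels named above to class ids.
VERDICT_TO_ID = {"correct": 0, "incorrect": 1, "unclear": 2}


def prm_examples(record):
    """Turn one record into (step_text, label_id) pairs for step classification."""
    return [
        (step["step_text"], VERDICT_TO_ID[step["verdict"]])
        for step in record["step_evaluations"]
    ]


def cross_entropy(logits, target):
    """Softmax cross-entropy for one example, using the log-sum-exp trick."""
    peak = max(logits)
    log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_z - logits[target]


def multitask_loss(prm_logits, prm_targets, clm_logits, clm_targets, prm_weight=0.5):
    """Weighted sum of the mean PRM loss and the mean CLM loss.

    prm_weight is an assumed hyperparameter, not a value from the release.
    """
    prm = sum(cross_entropy(l, t) for l, t in zip(prm_logits, prm_targets)) / len(prm_targets)
    clm = sum(cross_entropy(l, t) for l, t in zip(clm_logits, clm_targets)) / len(clm_targets)
    return prm_weight * prm + (1.0 - prm_weight) * clm


# Toy inputs: one step scored over 3 verdict classes, one token over a 4-word vocabulary.
loss = multitask_loss([[0.0, 0.0, 0.0]], [0], [[0.0, 0.0, 0.0, 0.0]], [1])
print(round(loss, 4))
```

The PRM term is an ordinary three-way classification loss over step texts, and the CLM term is the usual next-token loss; combining them in one objective is what lets the model learn step verdicts without giving up text generation.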

2. Evaluated answer accuracy on the AIME dataset before and after training. Since outputs can vary, each problem was answered four times, and the average score was compared.
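
The averaging over repeated attempts can be sketched as below. The function name mean_accuracy and the sample answers are hypothetical; the only details taken from the text are that each problem is answered four times and that the scores are averaged.

```python
def mean_accuracy(runs, expected):
    """Average the per-run accuracy over several independent attempts.

    runs: one list of answers per attempt, aligned with `expected`.
    """
    per_run = [
        sum(answer == target for answer, target in zip(run, expected)) / len(expected)
        for run in runs
    ]
    return sum(per_run) / len(per_run)


# Hypothetical data: 3 problems, each answered in 4 separate runs.
expected = [204, 25, 73]
runs = [
    [204, 25, 73],   # 3 of 3 correct
    [204, 25, 10],   # 2 of 3 correct
    [204, 0, 10],    # 1 of 3 correct
    [0, 0, 10],      # 0 of 3 correct
]
print(mean_accuracy(runs, expected))
```

Averaging over several runs reduces the influence of sampling variation, which matters on a 30-question benchmark where a single problem is worth about 3.3 points.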

To measure the improvement in answer accuracy, we ran the evaluation on openai/gpt-oss-20b (*3), a model with strong mathematical performance.

After training this model on the dataset, we observed an improvement of 10.0 points in answer accuracy on both the 2024 and 2025 problem sets.

Exam Year    No. of Questions    Pre-Training Accuracy    Post-Training Accuracy    Margin of Improvement
2024         30                  26.7%                    36.7%                     +10.0pt
2025         30                  33.3%                    43.3%                     +10.0pt
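
To put the margins in concrete terms, the short sketch below converts each gain from percentage points into an average number of problems per run. The figures are those reported in the table; the function name gains is hypothetical, and the conversion assumes the reported accuracies are averages over all 30 questions.

```python
def gains(num_questions, before_pct, after_pct):
    """Return (gain in percentage points, equivalent number of problems)."""
    gain_points = round(after_pct - before_pct, 1)
    gain_problems = round(num_questions * gain_points / 100, 1)
    return gain_points, gain_problems


# Figures from the table above: year -> (questions, accuracy before, accuracy after).
results = {2024: (30, 26.7, 36.7), 2025: (30, 33.3, 43.3)}
for year, (n, before, after) in results.items():
    points, problems = gains(n, before, after)
    print(year, points, problems)
```

On a 30-question set, a 10.0-point gain corresponds to about three additional problems answered correctly per run on average.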

Using data that includes complex calculation steps not only improved accuracy in mathematical reasoning tasks but also helped prevent breakdowns in step-by-step calculations.

This dataset is available on Hugging Face:

https://huggingface.co/datasets/APTOinc/llm-math-reasoning-dataset 

For our existing clients, this will be delivered shortly via our email newsletter.

Future Dataset Development

In areas that require logical reasoning, it's important not only to reach the right answer but also to clearly understand and reproduce the steps that lead there. As a result, approaches that emphasize the quality of the reasoning process itself are gaining attention. (*4)

Because LLM technology is advancing rapidly, we believe it is necessary to develop datasets for other fields as well: datasets that enable models to follow step-by-step processes accurately, while keeping constantly shifting needs and technical challenges in mind.

We are developing new datasets in line with future technology trends and customer needs. We hope these will help accelerate your AI development and further improve accuracy.

Sources:

*1: https://huggingface.co/datasets/HuggingFaceH4/aime_2024 

*2: https://huggingface.co/datasets/math-ai/aime25 

*3: https://huggingface.co/openai/gpt-oss-20b 

*4: The Lessons of Developing Process Reward Models in Mathematical Reasoning (2025), https://arxiv.org/abs/2410.17621 

About APTO

We provide AI development support services that focus on data, the factor that has the greatest impact on accuracy in any AI development. Our offerings include:

  • harBest, a data collection and annotation platform that uses crowd workers;
  • harBest Dataset, which accelerates the preparation of data often considered a bottleneck in the early stages of development;
  • harBest Expert, which improves data accuracy by incorporating expert knowledge.

By supporting AI development that might be restricted due to data-related challenges, we have earned recognition from many enterprises both in Japan and abroad.

SOURCE APTO, Inc.
