Yandex releases world's largest event dataset for advancing recommender systems


News provided by

Yandex

May 29, 2025, 12:05 ET


  • Yandex introduces the world's largest currently available dataset for recommender systems, advancing research and development on a global scale.

  • The open dataset contains 4.79B anonymized user interactions (listens, likes, dislikes) from the Yandex music streaming service collected over 10 months.

  • The dataset includes anonymized audio embeddings, organic interaction flags, and precise timestamps for real-world behavioral analysis.

  • It introduces Global Temporal Split (GTS) evaluation to preserve event sequences, paired with baseline algorithms for reference points.

  • The dataset is available on Hugging Face in three sizes — 5B, 500M, and 50M events — to accommodate diverse research and development needs.

SINGAPORE, May 29, 2025 /PRNewswire/ -- Yandex has published Yambda (Yandex Music Billion-Interactions Dataset), the world's largest open dataset for recommender systems, containing nearly 5 billion anonymized user interactions with audio tracks from its music streaming platform, Yandex Music.

Yambda serves as a universal benchmark for testing new approaches and algorithms across all domains utilizing recommender systems — e-commerce, social networks, and short-form video platforms.

The dataset enables researchers to develop and test new recommender algorithms against its baseline models. Startups with limited data of their own can use Yambda to build and test systems before scaling, which speeds up the creation of advanced technologies tailored to business needs worldwide.

Bridging the research-industry gap

The quality and scale of training data are critical to delivering relevant recommendations on platforms like streaming services, social networks, short-form video apps, and e-commerce marketplaces. However, research in recommender systems has lagged behind rapidly advancing fields like large language models, largely due to limited access to large-scale datasets. Effective recommendation models require terabytes of behavioral data, which commercial platforms possess but rarely share publicly.

Researchers are often left with small, outdated datasets that fail to capture the complexity of modern usage:

  • Spotify's Million Playlist Dataset is too small for commercial-scale recommender systems.
  • Netflix Prize dataset, with ~17,000 items and date-only timestamps, limits temporal modeling and large-scale research.
  • Criteo 1TB Click Logs dataset lacks proper documentation and identifiers, and focuses narrowly on ad clicks.

"Recommender systems are inherently tied to sensitive data. Companies can only publish recommender system datasets publicly after exhaustive anonymization, a resource-intensive process that's slowed open innovation," explains Nikolai Savushkin, Head of Recommender Systems at Yandex.

This data scarcity creates a gap: models that excel in academic settings often underperform in real-world applications. Efforts to integrate recommender systems with advanced architectures are also constrained by the lack of suitable training data.

About the Yambda dataset

Yambda addresses recommender system challenges by providing a massive, anonymized dataset from Yandex Music, a music streaming service with ~28 million monthly users. The dataset provides insights into how users interact with the content offered by Yandex Music, which is known for My Wave, a sophisticated recommendation system that tailors the listening experience to the tastes of each user. To protect privacy, all user and track data is anonymized, with numeric identifiers used to meet privacy standards.

Key features of the dataset:

  • 4.79 billion anonymized user interactions collected over 10 months.

  • Data from 1 million users and anonymized descriptors for 9.39 million tracks.

  • Includes two feedback types: implicit interactions (listens) and explicit interactions (likes, dislikes, and their removal).

  • Offers audio embeddings (vector representations generated via convolutional neural networks) and anonymized information about tracks.

  • Features an "is_organic" flag marking whether users discovered tracks independently or through recommendations, enabling deeper behavioral analysis.

  • All events are timestamped, which supports the analysis of user behavior over time and allows models to be evaluated under conditions that closely resemble real-world use.
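
As a rough illustration of how the features above might be used together, the sketch below separates implicit from explicit feedback and computes, per user, the share of listens that were organic. Only the "is_organic" flag is named in this release; the record layout, the other field names, and the event-type labels are hypothetical, and the five sample events are made up for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # Hypothetical record layout; only `is_organic` is named in the release.
    user_id: int
    track_id: int
    timestamp: int    # assumed unit: seconds since the start of the log
    event_type: str   # assumed labels: listen, like, dislike, unlike, undislike
    is_organic: bool  # True if the user found the track without a recommendation


# Explicit feedback covers likes, dislikes, and their removal.
EXPLICIT = {"like", "dislike", "unlike", "undislike"}


def split_feedback(events):
    """Separate implicit feedback (listens) from explicit feedback."""
    implicit = [e for e in events if e.event_type == "listen"]
    explicit = [e for e in events if e.event_type in EXPLICIT]
    return implicit, explicit


def organic_share(events):
    """Fraction of each user's listens that were organic."""
    total, organic = Counter(), Counter()
    for e in events:
        if e.event_type == "listen":
            total[e.user_id] += 1
            organic[e.user_id] += e.is_organic  # bool counts as 0 or 1
    return {user: organic[user] / total[user] for user in total}


# Made-up sample events, for illustration only.
events = [
    Event(1, 10, 0, "listen", True),
    Event(1, 11, 5, "listen", False),
    Event(1, 11, 6, "like", False),
    Event(2, 10, 7, "listen", False),
    Event(2, 12, 9, "dislike", False),
]

implicit, explicit = split_feedback(events)
shares = organic_share(events)
```

A low organic share for a user suggests that most of their listening was driven by recommendations, which is the kind of behavioral distinction the flag is meant to expose.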

The dataset is released in Apache Parquet format, compatible with distributed processing systems such as Spark or Hadoop and analytical libraries like Pandas and Polars.

"Yambda empowers researchers to test innovative hypotheses and businesses to build smarter recommender systems. Ultimately, users benefit — finding the perfect song, product, or service effortlessly," notes Nikolai Savushkin.

Dataset versions and evaluation

Available in three sizes — approximately 5 billion, 500 million, and 50 million events — the Yambda dataset accommodates researchers and developers with different needs and computational resource capacities.

The dataset uses Global Temporal Split (GTS) for evaluation, a method that splits data by timestamps to preserve event sequences. Unlike Leave-One-Out, which removes the last positive interaction from each user's history for testing, GTS avoids breaking temporal dependencies between training and test sets. This makes model testing more realistic, mimicking real-world conditions where future data is unavailable.
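
The difference between the two splits can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Yambda evaluation code: the (user, item, timestamp) tuple layout, the cutoff value, and the helper names are all hypothetical, and the release does not say how the actual cutoff is chosen.

```python
from collections import defaultdict


def global_temporal_split(events, cutoff):
    """Train on events strictly before `cutoff`, test on events at or after it."""
    train = [e for e in events if e[2] < cutoff]
    test = [e for e in events if e[2] >= cutoff]
    return train, test


def leave_one_out_split(events):
    """Hold out each user's last interaction, whenever it happened."""
    by_user = defaultdict(list)
    for e in events:
        by_user[e[0]].append(e)
    train, test = [], []
    for history in by_user.values():
        history.sort(key=lambda e: e[2])
        train.extend(history[:-1])
        test.append(history[-1])
    return train, test


def leaks_future(train, test):
    """True if any training event is later than the earliest test event."""
    return max(e[2] for e in train) > min(e[2] for e in test)


# Made-up events: (user_id, item_id, timestamp).
events = [
    ("u1", "a", 1), ("u1", "b", 2), ("u1", "c", 3),
    ("u2", "a", 8), ("u2", "d", 9), ("u2", "e", 10),
]

gts_train, gts_test = global_temporal_split(events, cutoff=8)
loo_train, loo_test = leave_one_out_split(events)
```

With these sample events, the Leave-One-Out training set contains interactions from timestamps 8 and 9 while one test interaction sits at timestamp 3, so the model would be trained on events that happen after something it is tested on. The global split keeps every training event before every test event, at the cost of leaving some users (here "u2") with no training history.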

Baseline implementations include MostPop, DecayPop, ItemKNN, iALS, BPR, SANSA, and SASRec, providing benchmarks for comparing new recommender system approaches. These baselines are evaluated using standard metrics, including:

  • NDCG@k (ranking quality)
  • Recall@k (retrieval effectiveness)
  • Coverage@k (catalog diversity)
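
For reference, the three metrics can be written down compactly. The sketch below uses common textbook definitions with binary relevance; the exact variants used in the Yambda baselines may differ (for example, some Recall@k definitions divide by min(k, number of relevant items)), and the function names and sample data are illustrative only.

```python
import math


def recall_at_k(recommended, relevant, k):
    """Share of the user's relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(recommended, relevant, k):
    """Binary-relevance NDCG: DCG of the list over the best achievable DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(recommended[:k])
        if item in relevant
    )
    ideal = sum(1.0 / math.log2(rank + 2) for rank in range(min(k, len(relevant))))
    return dcg / ideal if ideal > 0 else 0.0


def coverage_at_k(all_recommendations, catalog_size, k):
    """Fraction of the catalog that appears in at least one user's top k."""
    shown = {item for recs in all_recommendations for item in recs[:k]}
    return len(shown) / catalog_size


# Made-up recommendations and held-out ground truth for two users.
recs = {"u1": ["a", "b", "c"], "u2": ["a", "d", "b"]}
truth = {"u1": {"b"}, "u2": {"a", "e"}}
k = 2

mean_recall = sum(recall_at_k(recs[u], truth[u], k) for u in recs) / len(recs)
mean_ndcg = sum(ndcg_at_k(recs[u], truth[u], k) for u in recs) / len(recs)
coverage = coverage_at_k(recs.values(), catalog_size=5, k=k)
```

Recall and NDCG are averaged over users, while coverage is computed once over all recommendation lists, which is why it takes the whole collection rather than a single user's list.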

"When industry leaders share hard-won tools and data, a rising tide lifts all boats: researchers gain real-world benchmarks, startups access resources once reserved for tech giants, and users everywhere enjoy greater personalization," added Nikolai Savushkin.

Yambda, the world's largest open recommender system dataset, is now available on Hugging Face.

About Yandex

Yandex is a global technology company that builds intelligent products and services powered by machine learning. The company's goal is to help consumers and businesses better navigate the online and offline world. Since 1997, Yandex has been delivering world-class, locally relevant search and information services and has also developed market-leading on-demand transportation services, navigation products, and other mobile applications for millions of consumers across the globe.

About My Wave

My Wave, a personalized recommendation system integrated into the multi-million-user music streaming service, Yandex Music, employs deep neural models and AI algorithms to analyze over a thousand factors — including user interactions, customizable mood/language settings, and real-time music analysis of spectrograms, frequency ranges, rhythm, vocal tone, and genre. By processing listening history and track sequences, it dynamically adapts to user preferences, identifies audio similarities, and predicts musical tastes to deliver tailored suggestions.

SOURCE Yandex
