DataStax Releases Dramatically Simpler, More Reliable, High-Performance Hadoop Solution

DataStax' Brisk is Open-Source and Available Now for Download

May 09, 2011, 07:32 ET from DataStax

BURLINGAME, Calif., May 9, 2011 /PRNewswire/ -- Today, DataStax, the commercial leader in Apache Cassandra™, released DataStax' Brisk – a second-generation open-source Hadoop distribution that eliminates the key operational complexities of deploying and running Hadoop and Hive in production. Brisk is powered by Cassandra and offers a single platform containing a low-latency database for extremely high-volume web and real-time applications, while providing tightly coupled Hadoop and Hive analytics.

"The goal in developing and deploying Brisk was to provide customers an easier way to manage large volumes of data, while extracting business insights from that data in the most efficient way possible," said Matt Pfeil, CEO and co-founder, DataStax.  "By utilizing Apache Cassandra™ as the foundation for Hadoop, we've created a distribution where the real-time analysis and creation of data live in the same data-store, effectively eliminating the need to move data or risk systems failing due to work overload."

Brisk was previewed at the GigaOM Structure Big Data conference and has gained industry attention for the way it utilizes the best of Hadoop (i.e., MapReduce and Hive capabilities) while replacing the weaker pieces (i.e., HDFS and HBase) with Cassandra-based technology. The result is a Hadoop distribution containing a single layer of peer nodes that communicate via a state-of-the-art 'gossip protocol' for replication and fault tolerance. This entirely eliminates the HDFS 'name node' and its associated single points of failure and scalability pains.

DataStax' Brisk also benefits from Cassandra's intrinsic support for multi-datacenter replication. For the first time, Hadoop provides automatic synchronous or asynchronous replication of data between two or more distributed datacenters – all controlled through simple policy definitions.

"At Ooyala we focus on delivering an extremely personalized online video experience to end-users as well as deep insights to our customers on how their content is performing, and this requires vast amounts of analytics data," said Harry Robertson, Analytic Technical Lead, Ooyala. "DataStax' Brisk really has the potential to change the way enterprises use Hadoop through its Cassandra core. The potential gains are faster time to insight, simpler operation and effortless multi-datacenter Hadoop replication."

According to the company's early customers and a wide variety of benchmarks, the initial Brisk beta release performs as well as or better than existing, widely used Hadoop distributions. Brisk is compatible with all widely used Hadoop distributions and tools. To download DataStax' Brisk, please go to

"DataStax is an exciting entrant into the Hadoop Big Data ecosystem," said Martin Hall, executive vice president, Karmasphere. "Its Brisk product offers companies a compelling way to integrate operational data stored in Cassandra with Hadoop, expanding the options for analyzing data using Hadoop MapReduce.  Through our partnership, customers will be able to use Karmasphere solutions to analyze broader sets of Hadoop data and bring new innovation and services to market."

About DataStax

DataStax, the commercial leader in Apache Cassandra™ and Hadoop™, offers products and services that make it easy for customers to build, deploy and operate elastically scalable and cloud-optimized applications and data services. The company has over 80 customers, including leaders such as Netflix, Cisco, Rackspace and Constant Contact, spanning verticals including web, financial services, telecommunications, logistics and government. DataStax is backed by industry-leading investors, including Lightspeed Venture Partners, Sequoia Capital and Rackspace Hosting, and is based in Burlingame, CA with offices in Austin, TX and Stamford, CT. For more information, visit

About Apache Cassandra™

Apache Cassandra™ is an open source distributed database management system. It is designed to store and allow very low-latency access to very large amounts of data spread out across many commodity servers while providing a highly available service with no single point of failure. This next-generation data platform evolved from work at Google, Amazon and Facebook, and is an Apache Software Foundation top-level project.

For more information, visit