EMC Delivers Hadoop 'Big Data' Analytics to the Enterprise

EMC Software Distribution, Support and New Appliance Solidify Apache Hadoop as an Enterprise-Ready Tool Enabling New Real-Time Data Capabilities

May 09, 2011, 14:30 ET from EMC Corporation

LAS VEGAS, May 9, 2011 /PRNewswire/ -- EMC WORLD 2011 --

News Summary:

  • EMC today announced a comprehensive strategy for distributing, integrating and supporting the Apache Hadoop open-source software as an enterprise-ready tool for Big Data.
  • EMC introduced the world's first purpose-built, high-performance, data co-processing Hadoop appliance for structured and unstructured data.
  • EMC announced the availability of EMC® Greenplum® HD Community Edition and EMC Greenplum HD Enterprise Edition -- a complete Hadoop platform including installation, training, global support and value add beyond simple packaging of the Apache Hadoop distribution.
  • Collaboration between EMC and a dozen leading partners will help enable technology innovations such as real-time data interaction, offer greater reliability, and make Hadoop much easier to deploy and use.

Full Story:

Extending its leadership in providing customers with the most powerful and efficient ways to extract value from Big Data, EMC Corporation (NYSE: EMC), the world leader in information infrastructure solutions, today announced a comprehensive strategy for distributing, integrating and supporting the Apache Hadoop open-source software used for data-intensive distributed applications. The company is introducing the world's first purpose-built, high-performance, data co-processing Hadoop appliance -- the Greenplum HD Data Computing Appliance. The appliance marries Hadoop with the EMC Greenplum Database, allowing the co-processing of both structured and unstructured data within a single, seamless solution. In addition, EMC announced the availability of the Hadoop-based EMC Greenplum HD Community Edition and EMC Greenplum HD Enterprise Edition software. Combined with product certification by a dozen leading partners, these will enable technology innovations such as real-time data interaction, offer greater reliability, and make Hadoop much easier to deploy and use.

For the multimedia version of this press release and related content, please go to:  http://www.emc.com/about/news/press/2011/20110509-03.htm

Apache Hadoop has rapidly emerged as the preferred solution for Big Data analytics across unstructured data. Organizations looking for opportunity in an ever-changing business environment are finding that Big Data analysis is the competitive advantage. Hadoop-based batch processing of unstructured and structured data at massive scale using commodity hardware has led to a profound change in analytics. By extracting the knowledge wrapped within unstructured machine-generated data, organizations can make better decisions that drive revenue, improve service and reduce costs.

The EMC Greenplum HD product family enables an organization to take advantage of Big Data analytics without the overhead and complexity that come with the cumbersome tools and solutions on the market today. Available in two editions -- Community and Enterprise -- Greenplum HD software provides a complete platform including installation, training, global support and value add beyond simple packaging of the Apache distribution.

EMC's unique value and capabilities for Hadoop include:

  • EMC Greenplum HD Data Computing Appliance - Apache Hadoop is seamlessly integrated with the Greenplum Database in the Greenplum HD Data Computing Appliance. The solution supports Hadoop external tables, thereby enabling users to access data residing on the Hadoop Distributed File System (HDFS) without materializing the data. Administrators can read and write files in parallel from Greenplum to HDFS, enabling rapid and simple data sharing. Cross-platform analysis can be performed using the power of Greenplum SQL and advanced analytic functions accessing data on HDFS. The combined solution delivers the industry's only complete Big Data Analytics Platform.
  • EMC Greenplum HD Enterprise Edition - The Enterprise Edition is a 100 percent interface-compatible implementation of the Apache Hadoop stack. By maintaining Hadoop interface compatibility, the Enterprise Edition provides seamless application portability while delivering advanced features required by larger organizations. These include:
    • Data management features such as snapshots and wide area replication
    • Simple data loading and access using a native network file system (NFS) interface
    • End-to-end manageability including simple cluster deployment, automatic failure detection and notification, multi-site management and rolling upgrades

Best of all, these capabilities are delivered with a two- to five-fold performance improvement over the standard packaged versions of Apache Hadoop.

  • EMC Greenplum HD Community Edition - The Community Edition is a 100 percent open source, certified and supported version of the Apache Hadoop stack comprising HDFS, MapReduce, Zookeeper, Hive and HBase. EMC Greenplum provides fault tolerance for the Name Node and Job Tracker, both single points of failure in standard Hadoop implementations.
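
As a rough illustration of the Hadoop external tables mentioned for the Data Computing Appliance above, the sketch below builds the kind of statement a Greenplum Database user might issue to expose a delimited HDFS file as a table without first loading it. This release does not describe the syntax; the sketch assumes the `gphdfs://` external-table protocol documented for Greenplum Database, and the helper name, table name, columns, host and path are all hypothetical.

```python
# Minimal sketch: generate Greenplum external-table DDL that points at HDFS.
# Assumptions (not stated in the release): the gphdfs:// protocol from the
# Greenplum Database documentation, plus hypothetical table/host/path names.

def external_table_ddl(table, columns, namenode, path,
                       delimiter=",", writable=False):
    """Return CREATE [WRITABLE] EXTERNAL TABLE DDL for a delimited HDFS file.

    columns  -- list of (column_name, sql_type) pairs
    namenode -- "host:port" of the HDFS Name Node (hypothetical value)
    path     -- file or directory path on HDFS
    """
    cols = ", ".join(f"{name} {sqltype}" for name, sqltype in columns)
    # A readable external table queries HDFS data in place;
    # a writable one lets segments write rows out to HDFS.
    kind = "WRITABLE EXTERNAL" if writable else "EXTERNAL"
    hdfs_path = path.lstrip("/")
    return (
        f"CREATE {kind} TABLE {table} ({cols})\n"
        f"LOCATION ('gphdfs://{namenode}/{hdfs_path}')\n"
        f"FORMAT 'TEXT' (DELIMITER '{delimiter}');"
    )


if __name__ == "__main__":
    print(external_table_ddl(
        "web_logs",
        [("ts", "timestamp"), ("url", "text")],
        "namenode.example.com:8020",
        "/logs/2011/05",
    ))
```

Once such a table exists, ordinary SQL (for example a join between `web_logs` and a regular database table) would read the HDFS data at query time, which is one way to read the "without materializing the data" claim above.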

In addition to its Hadoop offerings, EMC has created a vibrant and powerful ecosystem with twelve companies offering business intelligence, data transfer and other technology capabilities. These companies are Concurrent, CSC, Datameer, Informatica, Jaspersoft, Karmasphere, MicroStrategy, Pentaho, SAS, SnapLogic, Talend, and VMware. This breadth of support is a testament to the value EMC brings to Hadoop. Technology companies and enterprises can now extend the trust they have in EMC to the open source data analytics tool.

EMC Global Services has developed an integrated family of professional services, support and training to help customers accelerate the adoption of data warehousing and business analytics using the EMC Greenplum Data Computing Appliance. This includes a new Enterprise Business Analytics Assessment Service that helps customers identify, deploy, optimize, and operationalize advanced analytics in support of their key business initiatives. In addition, EMC will assist with customers' data migration and consolidation efforts from their Oracle, Teradata and other existing data warehouse environments onto the EMC Greenplum DCA.

Supporting Quotes:

John Webster, Senior Partner, Evaluator Group:

"Hadoop has played a leading role in the transformation from traditional data warehousing to Big Data Analytics. EMC's Hadoop commercialization strategy is aimed at streamlining and bulletproofing Hadoop for enterprise users, making Hadoop more of a must-have real-time analytics tool for the enterprise."

Gartner, Inc. Research: Cool Vendors in Data Management and Integration, 2010 report by Eric Thoo, Donald Feinberg, Ted Friedman and Andreas Bitterer:  

"Use of Hadoop is growing in commercial organizations. We believe there are many implementations in 'stealth' mode today, hidden within business analytic groups and with little or no support from the IT department. As the use grows within the organization and becomes more mission-critical, there is an increasing need for support and other services."

Bill Cook, President and General Manager, Data Computing Division, EMC:

"EMC has a responsibility to help our customers realize all that's possible with Big Data, both structured and unstructured. There's a time and a place for the value that relational databases add to structured data, and there's a time and a place for the value Hadoop can give to unstructured data. Many of our enterprise customers need both and, with the help of our partners, we're able to provide them both, while also meeting their expectations around high availability, fault tolerance, and enterprise-class support and service."


The EMC Greenplum HD Community Edition, EMC Greenplum HD Enterprise Edition and the EMC Greenplum HD Data Computing Appliance are expected to be available in the third quarter of calendar 2011.

About Greenplum and the Data Computing Division of EMC

EMC's Data Computing Division is driving the future of data warehousing and analytics with breakthrough products including Greenplum Data Computing Appliance, Greenplum Database, Greenplum Community Edition, Greenplum Apache Hadoop distribution, and Greenplum Chorus™ -- the industry's first Enterprise Data Cloud platform. The division's products embody the power of open systems, cloud computing, virtualization and social collaboration -- enabling global organizations to gain greater insight and value from their data than ever before possible.

About EMC

EMC Corporation (NYSE: EMC) is the world's leading developer and provider of information infrastructure technology and solutions that enable organizations of all sizes to transform the way they compete and create value from their information. Information about EMC's products and services can be found at www.EMC.com.

EMC and Greenplum are trademarks or registered trademarks of EMC Corporation in the U.S. and other countries. All other trademarks are the property of their respective owners.

Forward-Looking Statements

This release contains "forward-looking statements" as defined under the Federal Securities Laws.  Actual results could differ materially from those projected in the forward-looking statements as a result of certain risk factors, including but not limited to: (i) adverse changes in general economic or market conditions; (ii) delays or reductions in information technology spending; (iii) the relative and varying rates of product price and component cost declines and the volume and mixture of product and services revenues; (iv) competitive factors, including but not limited to pricing pressures and new product introductions; (v) component and product quality and availability; (vi) fluctuations in VMware, Inc.'s operating results and risks associated with trading of VMware stock; (vii) the transition to new products, the uncertainty of customer acceptance of new product offerings and rapid technological and market change; (viii) risks associated with managing the growth of our business, including risks associated with acquisitions and investments and the challenges and costs of integration, restructuring and achieving anticipated synergies; (ix) the ability to attract and retain highly qualified employees; (x) insufficient, excess or obsolete inventory; (xi) fluctuating currency exchange rates; (xii) threats and other disruptions to our secure data centers or networks; (xiii) our ability to protect our proprietary technology; (xiv) war or acts of terrorism; and (xv) other one-time events and other important factors disclosed previously and from time to time in EMC's filings with the U.S. Securities and Exchange Commission.  EMC disclaims any obligation to update any such forward-looking statements after the date of this release.

SOURCE EMC Corporation