Big Data and EMC Isilon Drive Bioinformatic Breakthroughs at Columbia University

Renowned University Deploys EMC Isilon Scale-out NAS for Center for Computational Biology and Bioinformatics, Streamlining Data Access and Accelerating Vital Research Programs

Aug 16, 2011, 09:00 ET from EMC Corporation

SEATTLE, Aug. 16, 2011 /PRNewswire/ -- EMC Corporation (NYSE: EMC) today announced that Columbia University's Center for Computational Biology and Bioinformatics ("C2B2") has deployed EMC® Isilon® scale-out NAS to support the advanced, computing-intensive demands of its ground-breaking research programs. Using the EMC Isilon X-Series and NL-Series Platform Nodes combined with the EMC Isilon SmartPools and SmartQuotas software applications, C2B2 has dramatically increased system performance and flexibility, streamlining researchers' access to data, while reducing operational expenses and big data management complexities.

"We were using a traditional NAS system that struggled to support the huge amounts of input/output demands on the 400 CPUs in our computing infrastructure," said John Lowell Wofford, director of IT services for Columbia University's C2B2. "We knew that we'd soon outgrow that number by at least ten times. After switching to Isilon, we no longer had to worry that our system couldn't handle our research demands. We knew that we could independently scale capacity and performance, so that we buy only what we need, when we need it."

The Isilon storage system that C2B2 employs now supports some 4,000 CPUs, which can handle the heavy I/O and data analysis demands of C2B2's research into such areas as computational biophysics and structural biology. In addition to C2B2, the Isilon system also supports the storage needs of Columbia University's Herbert Irving Comprehensive Cancer Center, the Institute for Cancer Genomics, and the J.P. Sulzberger Columbia Genome Center. By leveraging Isilon system capabilities, these research centers reduce cost and IT complexity by sharing a computing and storage infrastructure that's flexible enough to match a broad range of tasks.

Genomic databases, such as the ones C2B2 employs, are typically directories with an enormous number of files that require a large number of namespace reads to index. Wofford notes that 40 percent of the read-write requests on the C2B2 system are namespace-related. The Isilon X-Series' intelligent use of SSDs for metadata and file-based storage speeds namespace-read performance, providing a "huge improvement" over C2B2's previous computing/NAS system infrastructure.

The Isilon system enables C2B2 to speed researchers' access to data and streamline big data system management. "We're processing big sets of genomic data, doing molecular biophysical simulations and sequence analysis, which can lead to new drug discoveries and advances in basic science," said Wofford. "Despite the fact that we have nearly one petabyte of data, the EMC Isilon OneFS® operating system is so easy to manage, we don't need a dedicated storage administrator. Aside from the cost savings, we're able to free up our IT people to focus on other initiatives, such as our virtualization infrastructure."

"Columbia University's Center for Computational Biology and Bioinformatics is leading the way in the scientific community with its ground-breaking research programs," said Sam Grocott, vice president of marketing, Isilon. "We are happy to provide the storage foundation for its computing-intensive demands, so C2B2 can focus on the important discoveries at hand and not on managing the headaches of complex big data infrastructure."

