Eric Bruno - Contributing Editor

Storing Big Data

It's key to ensure that big data is stored economically so it can yield optimal benefits in predictive analytics projects.


Most people associate big data with the need to store vast amounts of data, and they’re correct. However, big data can also be thought of as data in motion. All of your valuable data doesn’t do you much good if it’s just sitting there on a drive. Instead, you need regular, efficient access to it in order to analyze it as you capture it. And you need access on demand at a later time. Big data also needs to be fast data. How can you achieve this? Modernization!

Modernization will help you increase storage capacity and performance while reducing cost per gigabyte. Modern storage systems that combine flash drives with traditional hard drives are simultaneously high performance and high capacity, with a lower cost per I/O and a lower cost per gigabyte stored.
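To make the cost trade-off concrete, here is a minimal sketch of how a blended cost per gigabyte and cost per I/O could be estimated for a flash-plus-disk system. Every figure in it (capacities, IOPS, prices) is a hypothetical placeholder chosen only to show the shape of the calculation, and the `Tier` class and `blended_costs` function are illustrative names, not part of any product. It also assumes the tiers' IOPS simply add up, which real systems only approach when hot data actually lands on flash.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    """One storage tier; all figures used below are hypothetical placeholders."""
    name: str
    capacity_gb: float  # usable capacity in gigabytes
    iops: float         # sustained I/O operations per second
    cost_usd: float     # purchase cost of the tier


def blended_costs(tiers):
    """Return (cost per GB, cost per IOPS) for the combined system."""
    total_cost = sum(t.cost_usd for t in tiers)
    total_gb = sum(t.capacity_gb for t in tiers)
    total_iops = sum(t.iops for t in tiers)
    return total_cost / total_gb, total_cost / total_iops


# Hypothetical numbers, not vendor pricing.
flash = Tier("flash", capacity_gb=10_000, iops=500_000, cost_usd=20_000)
disk = Tier("disk", capacity_gb=200_000, iops=10_000, cost_usd=10_000)

for label, tiers in [("flash only", [flash]), ("disk only", [disk]), ("hybrid", [flash, disk])]:
    per_gb, per_iops = blended_costs(tiers)
    print(f"{label:10s}  ${per_gb:.3f}/GB  ${per_iops:.3f}/IOPS")
```

With these made-up inputs, the hybrid system costs far less per gigabyte than flash alone and far less per I/O than disk alone, which is the effect described above. Substitute your own quotes and measured workload numbers before drawing conclusions.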

Your Multi-pronged Big Data Storage Strategy

According to IBM, more than 2.5 quintillion bytes of data are created each day. This pace of data creation far outstrips the growth in average hard drive capacity. Although it's a difficult problem to solve, according to John Bantleman of RainStor, the benefits of big data outweigh the IT investment required. Here's a proposed strategy based on Bantleman's recommendation and other sources:

  • Follow the lead of Google and others, and migrate to a hyperscale storage solution that uses both network-attached storage (NAS) and large numbers of servers with direct-attached storage (DAS).
  • Integrate Hadoop with your existing RDBMS and BI tools, an area that has seen a lot of interest and work over the last few years. Use flash-based DAS on the Hadoop servers for the best performance and scale. This approach preserves your existing data warehouse, business analytics, and visualization tools while letting you grow with newer, more efficient big data tools and products. Open source technology with a large community behind it also helps keep costs down.
  • Consider migrating to a scale-out clustered NAS solution, which can scale out as your data grows, has built-in redundancy for reliability, and allows parallel data access across large numbers of servers. This type of storage is high performance and cost effective for big data storage and real-time analytics at hyperscale.
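As a rough illustration of what "scaling out as your data grows" means for capacity planning, the sketch below converts the IBM figure into exabytes and estimates how many storage nodes a replicated cluster would need. The replication factor, fill ratio, node size, and dataset size are all assumptions made for the example, and `nodes_needed` is a hypothetical helper, not a function from Hadoop or any NAS product.

```python
import math

BYTES_PER_EXABYTE = 10**18  # decimal (SI) exabyte


def nodes_needed(data_bytes, node_capacity_bytes, replicas=3, fill_ratio=0.8):
    """Estimate the node count for a scale-out cluster.

    replicas and fill_ratio are illustrative assumptions, not values
    taken from the article or from any particular product.
    """
    raw_bytes = data_bytes * replicas                    # every block is stored `replicas` times
    usable_per_node = node_capacity_bytes * fill_ratio   # leave headroom on each node
    return math.ceil(raw_bytes / usable_per_node)


daily_bytes = 25 * 10**17  # 2.5 quintillion bytes, the IBM figure cited above
print(daily_bytes / BYTES_PER_EXABYTE, "exabytes per day")

# Hypothetical cluster: nodes with 48 TB of raw disk each.
node = 48 * 10**12
for petabytes in (1, 2, 4):
    dataset = petabytes * 10**15
    print(f"{petabytes} PB of data -> {nodes_needed(dataset, node)} nodes")
```

The point of the sketch is that node count grows roughly linearly with data volume, so capacity is added by attaching more servers rather than by replacing a central array.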

Moving forward, modern high-speed solutions including SSD-based servers, hyperscale direct attached storage, and clustered NAS solutions will help you find the hidden assets in your enterprise data while keeping the associated storage costs in line. What’s more, you’ll be enabling business process efficiency and reliability to boot.
