Scott Koegler - February 19, 2019

Big data is the new data


The store of data is growing and will continue to grow for the foreseeable future. Now that all data has become ‘big’, enterprises need to shift their mindset and treat the enormity of data as a common condition rather than as a new trend. CIOs need to build end-to-end analytics into their data management portfolio as their data expands across an increasingly hybrid cloud environment.

Make data count

Data has two values. The first is the value it delivers immediately as a product of the process that created it. The second is the hoped-for value that may accrue as it accumulates into what’s been termed big data. Coping with data accumulation has been eased by the relative reduction in the cost of storing it, and the fact is that there is no going back. Data will continue to accumulate, so the superlatives we use to describe today’s volume of data will be eclipsed almost immediately as more bits land on our already massive mountain of data. Now that we’ve collectively figured out how to securely store and manage it, we need to do more to derive value from it. Doing that means treating data like it’s always been treated and not as something new. Big data is just data - but more of it.

Reduce retention

Storage facilities may be less expensive and easier to manage than they were when all data was local data, but costs add up as data volume grows. Some applications that started out as experimental projects went into production and are rolling along collecting data. Once a project has moved from trial to production and been counted as a success, the team moves on to the next project. But real success includes managing the app’s deployment. Users may complain about slow performance, prompting a change to higher-performance servers or other optimization steps, but there are few triggers to guide managers to review storage capacities and their costs.

CIOs need to establish ongoing reviews of storage by application, use priorities, and costs to determine what data needs to be kept or migrated to lower-cost storage tiers. That analysis is complicated by the billing practices themselves, which can include thousands of lines of detailed billing with cryptic descriptions. Spreadsheets are only marginally useful because of the complexity of the invoices, and when multiple cloud providers are involved the analysis can be as costly as the services themselves. Bring appropriate management software to bear on these tasks to identify apps that are growing and consuming resources, and put migration practices in place.
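To make the review described above concrete, here is a minimal sketch of rolling billing line items up by application and flagging data that has not been read recently as a candidate for a cheaper tier. Everything in it is an assumption for illustration: the `LineItem` fields, the provider and app names, and the 90-day idle threshold are hypothetical, and real cloud invoices would first need to be normalized into a shape like this.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class LineItem:
    """One normalized billing line (hypothetical shape, not a real invoice format)."""
    provider: str
    app: str
    storage_gb: float
    monthly_cost: float
    days_since_last_access: int


def summarize_by_app(items):
    """Total storage and monthly cost per application, across all providers."""
    totals = defaultdict(lambda: {"storage_gb": 0.0, "monthly_cost": 0.0})
    for item in items:
        totals[item.app]["storage_gb"] += item.storage_gb
        totals[item.app]["monthly_cost"] += item.monthly_cost
    return dict(totals)


def migration_candidates(items, idle_days=90):
    """Line items untouched for at least idle_days, most expensive first.

    The 90-day default is an assumed policy, not a recommendation.
    """
    idle = [item for item in items if item.days_since_last_access >= idle_days]
    return sorted(idle, key=lambda item: item.monthly_cost, reverse=True)


if __name__ == "__main__":
    items = [
        LineItem("cloud_a", "orders", 500.0, 11.5, 3),
        LineItem("cloud_b", "orders", 200.0, 4.0, 120),
        LineItem("cloud_a", "pilot_logs", 1000.0, 23.0, 200),
    ]
    for app, total in summarize_by_app(items).items():
        print(app, total)
    for item in migration_candidates(items):
        print("review for lower-cost tier:", item.provider, item.app)
```

Sorting the candidates by cost puts the largest potential savings at the top of the review list, which is one reasonable way to prioritize, though not the only one.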

Sculpt data at its origin

Applications that live in the cloud typically store their data in the cloud as well. But a growing array of remote devices, typically referred to as IoT devices, populates everything from manufacturing floors to smart city facilities to private homes. Each of these devices generates data and has a connection to a computing facility where the data is transferred and used in a variety of processing tasks. With a projected 20 billion devices sometime in 2020, that’s a lot of data transferred and stored, then eventually processed. The bulk of that data is likely to be sensor readings that in themselves have little use until they have been combined and processed.

IoT deployments increasingly make use of edge computing, which places compute and storage capabilities either within IoT devices or in proximity to a group of them. IoT data is then stored and processed by the edge devices, and only the results are uploaded to main storage. An article in NetworkWorld reports that an autonomous car may generate 4TB of data per day, mostly from its sensors, but 96% of that data is what is called “true but irrelevant,” according to Martin Olsen, vice president of global edge and integrated solutions at Vertiv, a data center and cloud computing solutions provider. “It’s that last 4% of what’s not true that is the relevant piece. That’s the data we want to take somewhere else,” he said.
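As a rough illustration of processing at the edge and uploading only the results, the sketch below reduces a batch of raw sensor readings to a small summary plus any out-of-range values. The function name, the summary fields, and the fixed `low`/`high` thresholds are assumptions made for this example; what counts as relevant data in a real deployment depends on the device and the application.

```python
from statistics import mean


def summarize_at_edge(readings, low, high):
    """Reduce a batch of raw readings to a compact summary for upload.

    Readings outside the assumed [low, high] range are kept individually;
    everything else is represented only by count, mean, min, and max.
    """
    if not readings:
        return {"count": 0, "mean": None, "min": None, "max": None, "anomalies": []}
    anomalies = [r for r in readings if r < low or r > high]
    return {
        "count": len(readings),
        "mean": mean(readings),
        "min": min(readings),
        "max": max(readings),
        "anomalies": anomalies,
    }


if __name__ == "__main__":
    # Hypothetical temperature readings collected by one edge node.
    batch = [20.0, 21.0, 19.0, 80.0]
    payload = summarize_at_edge(batch, low=0.0, high=50.0)
    print(payload)  # only this summary would be sent to main storage
```

The raw batch can be discarded or kept briefly on the edge device, so only the summary crosses the network and lands in long-term storage.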

CIOs need to evaluate the value of the data they are moving across network connections and committing to long-term storage. The goal is to minimize the use of relatively expensive storage and processing resources and maximize the value of the data they do retain.
