
Speak like a data center geek: Big data

18 Nov 2016

Big data is big for a lot of reasons. Some are literal (its massive datasets) and some are based on the promise of what it could one day deliver. For instance, IDC estimates a 44-zettabyte (that's 44 trillion gigabytes) digital universe by 2020, and the big data inside it offers potentially huge amounts of actionable and mind-blowing insight.

At Equinix, we’re into helping uncover all of it. But a first step is understanding some key big data definitions. That’s what our “How to Speak Like a Data Center Geek” series is for.

We’ll start basic on our first big data entry, since the list of definitions associated with big data is … big.

Big data

Too obvious? Well, we wanted to expand the big data definition a bit beyond what’s clear just by reading it – namely, it involves “big” amounts of “data.” A geek can do better. Here’s a solid definition from McKinsey: “Big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze.” So maybe big data can also be accurately called “too big data”?

The 3Vs

In an important 2001 report for META Group (later acquired by Gartner), analyst Doug Laney laid out the defining dimensions of big data, and they all happen to begin with “V”:

Volume: This refers to the depth and breadth of the data that must be managed, and is always growing. For instance, IBM says we create 2.5 quintillion bytes of data every day. That’s enough to fill about 100 million single-layer Blu-ray discs (see the quick calculation after this list).

Variety: This is the diversity of the types of data that make up big data datasets. It could be from video, audio, text, photos, etc., and proper analysis involves reconciling it all.

Velocity: The sheer and increasing speed with which data is acquired and used.

People have added or proposed more Vs over the years (value, veracity, variability), but it all starts with the 3Vs.
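
To keep the Volume numbers honest, here is the back-of-the-envelope arithmetic as a few lines of Python (the 25 GB single-layer disc capacity is our assumption, not from IBM):

```python
# Back-of-the-envelope check on the "Volume" figure above.
# Assumption: a single-layer Blu-ray disc holds 25 GB.
bytes_per_day = 2.5e18    # 2.5 quintillion bytes created per day
disc_capacity = 25e9      # 25 GB per disc, in bytes

print(f"{bytes_per_day / disc_capacity:,.0f} discs per day")  # 100,000,000
```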

Structured data

This is data that has a defined length and format, such as numbers and dates, and is usually stored in a database. It accounts for about 20% of the data out there, and its structured nature makes it easier to access and organize. So it is potentially powerful and widely usable.
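
As a minimal sketch of what “defined length and format” means in practice, here is structured data in a tiny SQLite table (the schema and values are invented for illustration):

```python
# Structured data: every record conforms to a schema the database enforces.
# Table and column names here are illustrative, not from the article.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        order_id   INTEGER PRIMARY KEY,
        customer   TEXT NOT NULL,
        amount_usd REAL NOT NULL,
        order_date TEXT NOT NULL  -- ISO 8601 date, e.g. 2016-11-18
    )
""")
conn.execute(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    (1, "Acme Corp", 1250.00, "2016-11-18"),
)

# Because the format is fixed, querying is straightforward.
for row in conn.execute("SELECT customer, amount_usd FROM orders"):
    print(row)
```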

Unstructured data

This type of data does not follow a predefined data model or fit into relational databases. Examples include video, the text of email messages and social media. This makes up the bulk of the big data universe and has huge potential, but also presents bigger challenges for those trying to organize and gain insight from it.
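
By contrast, here is a sketch of what working with unstructured data often looks like: there is no schema to query, so even pulling a single date out of an email (the sample text below is invented) requires pattern matching or heavier analytics:

```python
# Unstructured data: the same fact can be buried anywhere in free text,
# so we fall back to a regular expression just to extract it.
import re

email_body = """Hi team,
Great call today. Let's ship the update on 2016-11-18 and
sync again next week. - J."""

dates = re.findall(r"\d{4}-\d{2}-\d{2}", email_body)
print(dates)  # ['2016-11-18']
```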

Analytics

DataInformed has a concise definition of analytics: “Using software-based algorithms and statistics to derive meaning from data.” But the reality is that big data analytics could have an entire Geek entry of its own (and maybe someday, it will). Here are a few subgroups of big data analytics: behavioral analytics, event analytics, location analytics and text analytics. The bottom line is that without good analytics, big data is akin to a mountainous pile of papers dumped on the floor of a 100-acre warehouse. Big data analytics makes big data make sense.
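
To make that definition concrete, here is a toy text-analytics example in the same spirit: deriving a little meaning (word frequencies) from a couple of invented documents:

```python
# A toy instance of "using software-based algorithms and statistics
# to derive meaning from data": simple word-frequency text analytics.
from collections import Counter

documents = [
    "big data needs big analytics",
    "analytics turns data into insight",
]

words = " ".join(documents).split()
print(Counter(words).most_common(3))
# [('big', 2), ('data', 2), ('analytics', 2)]
```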

Article by Jim Poole, Equinix blog network
