Speak like a data center geek: Big data

18 Nov 16

Big data is big for a lot of reasons. Some are literal (its massive datasets) and some are based on the promise of what it could one day deliver. For instance, IDC estimates that the digital universe will reach 44 zettabytes (44 trillion gigabytes) by 2020, and the big data inside it offers potentially huge amounts of actionable and mind-blowing insights.

At Equinix, we’re into helping uncover all of it. But a first step is understanding some key big data definitions. That’s what our “How to Speak Like a Data Center Geek” series is for.

We’ll start basic on our first big data entry, since the list of definitions associated with big data is … big.

Big data

Too obvious? Well, we wanted to expand the big data definition a bit beyond what’s clear just by reading it – namely, that it involves “big” amounts of “data.” A geek can do better. Here’s a solid definition from McKinsey: “Big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze.” So maybe big data can also be accurately called “too big data”?

The 3Vs

In an important 2001 report, analyst Doug Laney (then at META Group, which Gartner later acquired) laid out the defining dimensions of big data, and they all happen to begin with “V”:

Volume: This refers to the depth and breadth of the data that must be managed, and it is always growing. For instance, IBM says we create 2.5 quintillion bytes of data every day. That’s enough to fill 10 million Blu-ray discs.

Variety: This is the diversity of the types of data that make up big data datasets. It could be from video, audio, text, photos, etc., and proper analysis involves reconciling it all.

Velocity: The sheer and increasing speed with which data is acquired and used.

People have added or proposed more Vs over the years (value, veracity, variability), but it all starts with the 3Vs.

Structured data

This is data that has a defined length and format, such as numbers and dates, and is usually stored in a database. It accounts for about 20% of the data out there, and its structured nature makes it easier to access and organize. So it is potentially powerful and widely usable.

Unstructured data

This type of data does not follow a predefined data model or fit into relational databases. Examples include video, the text of email messages and social media. This makes up the bulk of the big data universe and has huge potential, but also presents bigger challenges for those trying to organize and gain insight from it.
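To make the contrast between the two concrete, here is a minimal Python sketch. The order records and the email text are invented for illustration and do not come from any real dataset. It shows how a structured record can be queried by field, while free text has to be searched or interpreted before it yields an answer.

```python
from datetime import date

# Structured: every field has a known name, type and format,
# so "total order value for 2016" is a one-line query.
# (These records are hypothetical.)
orders = [
    {"order_id": 1, "placed": date(2016, 11, 1), "amount": 120.50},
    {"order_id": 2, "placed": date(2016, 11, 3), "amount": 80.00},
    {"order_id": 3, "placed": date(2015, 6, 9), "amount": 42.25},
]
total_2016 = sum(o["amount"] for o in orders if o["placed"].year == 2016)

# Unstructured: free text has no predefined fields. Even a simple
# question ("is this a shipping complaint?") needs text matching,
# and a naive substring check like this one is easy to fool.
email = "Hi team, the order I placed last week still hasn't shipped. Can someone check?"
looks_like_shipping_complaint = "hasn't shipped" in email.lower()
```

The structured query is exact and cheap. The text check is only a rough guess, which is a small taste of why unstructured data is harder to organize at scale.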

Analytics

DataInformed has a concise definition of analytics: “Using software-based algorithms and statistics to derive meaning from data.” But the reality is that big data analytics could have an entire Geek entry on its own (and maybe someday, it will). Here are a few subgroups of big data analytics: behavioral analytics, event analytics, location analytics and text analytics. The bottom line is that without good analytics, big data is akin to a mountainous pile of papers dumped on the floor of a 100-acre warehouse. Big data analytics makes big data make sense.
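As a toy illustration of text analytics, one of the subgroups named above, the following Python sketch counts the most frequent meaningful words across a handful of comments. The comments, the stopword list and the `top_terms` helper are all assumptions made up for this example. Real text analytics tools go far beyond word counts.

```python
import re
from collections import Counter

# Hypothetical customer comments, invented for this sketch.
comments = [
    "The network latency was great",
    "Latency spiked during the outage",
    "Great support during the outage",
]

# A tiny, hand-picked stopword list (an assumption, not a standard one).
STOPWORDS = {"the", "was", "during"}

def top_terms(texts, n=3):
    """Return the n most common non-stopword terms with their counts."""
    words = []
    for text in texts:
        tokens = re.findall(r"[a-z]+", text.lower())
        words += [w for w in tokens if w not in STOPWORDS]
    return Counter(words).most_common(n)

print(top_terms(comments))
```

Here "latency", "great" and "outage" each appear twice, so even this crude statistic hints at what the comments are about.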

Article by Jim Poole, Equinix blog network