
Data lakes transforming the enterprise data warehouse

14 Jan 2016

The introduction of data lakes is one of the most significant changes to enterprise data warehouse technology, according to CenturyLink’s head of business development Martin Hooper.

Hooper says classic enterprise data warehouse architecture is evolving under the influence of new technologies, new requirements, and changing economics.

He says data lakes, large storage repositories and processing engines, are transforming the way data is handled by enterprises.

“Data lakes let enterprise data warehouses store massive amounts of data, offer enormous processing power, and let organisations handle a virtually limitless number of tasks at the same time,” Hooper explains.

Classic enterprise data warehouses have sources feeding a staging area, with the resulting warehouse data consumed by analytic applications, he says.

“In this model, the access layer of the data warehouse, known as the data mart, is often part of the data warehouse fabric, and applications are responsible for knowing which databases to query.”
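The classic flow Hooper describes can be illustrated with a small sketch. This is a hypothetical, simplified model (names and data are invented for illustration): raw rows sit temporarily in a staging area, are loaded into a data mart, and the application must know which mart database to query.

```python
import sqlite3

# Staging area: raw rows land here only temporarily before loading.
staging = [
    ("2016-01-14", "sales", 120.0),
    ("2016-01-14", "returns", -15.0),
]

# Each data mart is its own database; in the classic model, the
# application is responsible for knowing which one to query.
sales_mart = sqlite3.connect(":memory:")
sales_mart.execute("CREATE TABLE facts (day TEXT, metric TEXT, value REAL)")

# Load step: staging rows are pushed into the mart, then the
# staging area is cleared -- it holds data only in transit.
sales_mart.executemany("INSERT INTO facts VALUES (?, ?, ?)", staging)
staging.clear()

# The analytic application queries the mart it knows about.
total = sales_mart.execute(
    "SELECT SUM(value) FROM facts WHERE day = '2016-01-14'"
).fetchone()[0]
print(total)  # 105.0
```

The key limitation this sketch shows is the one Hooper highlights: once loaded, the staging copy of the raw data is gone.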

According to Hooper, in modern enterprise data warehouses, data lake facilities based on the Apache Hadoop open source software framework replace the staging area that sits at the centre of traditional data warehouse models. While data lakes provide all of the capabilities offered by the staging area, they also have several other important benefits, he says.

“A data lake can hold raw data forever, rather than being restricted to storing it temporarily, as the classic staging area is,” Hooper explains.

“Data lakes also have compute power and other tools, so they can be used to analyse raw data to identify trends and anomalies.

“Furthermore, data lakes can store semi-structured and unstructured data, along with big data.”
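The capabilities Hooper lists can be sketched in miniature. This is an illustrative toy (the directory layout, event records, and anomaly threshold are all invented): raw semi-structured JSON records are kept as-is in a lake-style store, and analysis runs directly over them to pick out anomalies, with no prior schema or staging cleanup.

```python
import json
import statistics
import tempfile
from pathlib import Path

# The "lake": a durable store where raw files land and stay.
lake = Path(tempfile.mkdtemp())

# Raw semi-structured events are written as-is, never deleted.
events = [
    {"sensor": "a", "reading": 21.0},
    {"sensor": "a", "reading": 21.5},
    {"sensor": "a", "reading": 98.0},  # anomalous spike
]
for i, event in enumerate(events):
    (lake / f"event_{i}.json").write_text(json.dumps(event))

# Compute runs over the raw files directly: flag readings
# that sit far from the mean (threshold chosen for the toy).
readings = [
    json.loads(p.read_text())["reading"]
    for p in sorted(lake.glob("*.json"))
]
mean = statistics.mean(readings)
anomalies = [r for r in readings if abs(r - mean) > 30]
print(anomalies)  # [98.0]
```

In a real Hadoop-based lake the files would live in HDFS and the scan would be a distributed job, but the pattern is the same: store raw, analyse in place.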

Using Hadoop as an enterprise data warehouse staging area is not a new concept, says Hooper.

“A data lake based on Hadoop not only provides far more flexible storage and compute power, but it is also an economically different model that can save businesses money,” he says.

In addition, a data lake provides a cost-effective, extensible platform for building more sandboxes, which are testing environments designed to isolate and execute untested code, Hooper explains.

“A Hadoop staging approach begins to solve a number of the problems with traditional enterprise data warehouse architecture, while full-blown data lakes have created an entirely new data warehouse model that is more agile, more cost-effective, and provides companies with a greater ability to leverage successful experiments across the enterprise, resulting in a greater return on data investment,” he says.
