DataCenterNews Asia Pacific
Specialist data center news for Asia Pacific

How to keep your data lake initiative from becoming a data swamp

Business leaders who make good use of their data stay ahead of their competitors. Enterprises can utilise data lakes to increase agile data delivery; however, they can't reap those benefits without addressing the challenges.

The data analytics market in Australia is predicted to grow at a CAGR of 20% between 2021 and 2025, which reflects the growing need for enterprises to manage vast amounts of data.

The amount of data enterprises capture daily to drive critical business decisions, improve product offerings, and serve customers better is growing faster than ever before. In 2021, the total amount of data generated in the world was upwards of 74 zettabytes, with the projection for 2025 being more than 180 zettabytes. To put a single zettabyte in perspective, if each terabyte in it were a kilometre, the total distance would be equivalent to about 1,300 round trips to the moon.
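
The moon comparison can be checked with a few lines of arithmetic. The sketch below is only a rough sanity check: it assumes decimal units (one zettabyte equals 10^9 terabytes) and an average Earth-Moon distance of about 384,400 km, neither of which is stated in the article, and the function name is made up for illustration.

```python
# Rough check of the "terabytes as kilometres" comparison.
# Assumptions (not from the article): decimal units and an
# average Earth-Moon distance of roughly 384,400 km.
TERABYTES_PER_ZETTABYTE = 10**9  # 10^21 bytes / 10^12 bytes
EARTH_MOON_KM = 384_400          # approximate average distance


def moon_round_trips(zettabytes: float) -> float:
    """Treat every terabyte as one kilometre and count round trips."""
    kilometres = zettabytes * TERABYTES_PER_ZETTABYTE
    return kilometres / (2 * EARTH_MOON_KM)


if __name__ == "__main__":
    print(f"1 ZB: about {moon_round_trips(1):,.0f} round trips")
    print(f"74 ZB: about {moon_round_trips(74):,.0f} round trips")
```

Under these assumptions, one zettabyte works out to roughly 1,300 round trips, so the comparison describes a single zettabyte rather than the full 74.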

But what good is all this data if companies aren't able to utilise it to guide insights in a timely manner and in accordance with their immediate goals?

Enterprises can utilise data lakes to increase data elasticity and agile data delivery; however, they cannot reap those benefits without addressing the challenges of a data lake initiative. For example, if you try to create analytics-ready data sets from heterogeneous data manually, you'll quickly find yourself in the middle of an extremely complex (if not impossible) and time-consuming project. And by the time all your data is finally ready for business consumption, it's already outdated.

What is the difference between a data lake, data mart, or data warehouse? 

Before getting too deep into the facts about data lakes, let's talk about the differences between a data lake, a data warehouse, and a data mart. While these types of centralised repositories all provide the ability to store data for analysis and reporting, there are some key differences when it comes to structures, data types, and functionality.

Data warehouse 

A data warehouse is used by companies with a massive amount of data from specific sources, such as an ERP, core systems, or custom applications, usually for business intelligence, batch reporting, or data visualisation. Data warehouses typically have the following properties:

  • They represent an abstracted picture of the business organised by subject area. 
  • They are highly transformed and structured.
  • Data is not loaded to a data warehouse until the use for it has been defined.
  • They generally follow a methodology, such as dimensional modelling or textual disambiguation.

Data mart 

A data mart is a subset of a data warehouse in which the data is curated to be highly accurate for specific users or data consumers. It is subject-oriented and designed to help a specific group of users make tactical decisions for their department. For this reason, a data mart would be of use to Australian companies with a lot of focused sales data, or perhaps to a marketing department analysing customer-specific data sets from multiple sources.
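
As a minimal sketch of the "subset of a warehouse" idea, the example below builds a tiny in-memory warehouse table and exposes a sales-only data mart over it. Everything here is hypothetical: the table, view, and column names and the sample rows are invented for illustration, and a SQL view stands in for what is often a separately stored set of tables in practice.

```python
import sqlite3


def build_demo_warehouse() -> sqlite3.Connection:
    """Create a tiny in-memory 'warehouse' with one fact table (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fact_orders (order_id INTEGER, department TEXT, "
        "region TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO fact_orders VALUES (?, ?, ?, ?)",
        [
            (1, "sales", "NSW", 120.0),
            (2, "sales", "VIC", 80.0),
            (3, "support", "NSW", 0.0),
            (4, "sales", "NSW", 45.5),
        ],
    )
    # The "data mart": a subject-oriented subset for the sales team only.
    conn.execute(
        "CREATE VIEW sales_mart AS "
        "SELECT region, amount FROM fact_orders WHERE department = 'sales'"
    )
    return conn


def sales_by_region(conn: sqlite3.Connection) -> list:
    """Answer a departmental question using only the mart, not the full table."""
    cursor = conn.execute(
        "SELECT region, SUM(amount) FROM sales_mart "
        "GROUP BY region ORDER BY region"
    )
    return cursor.fetchall()


if __name__ == "__main__":
    print(sales_by_region(build_demo_warehouse()))
```

The sales team queries only `sales_mart` and never sees the support rows, which is the narrowing of scope that distinguishes a mart from the warehouse behind it.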

Data lake 

A data lake stores raw, free-flowing data, structured or unstructured, from a wide variety of sources, such as social media, devices, apps, or production databases. Its main uses are machine learning, data discovery, and predictive analysis. The data in a data lake is in its natural state: not cleansed or validated, with no insights derived, just data. Some features of data lakes include:

  • All data is loaded from source systems. No data is turned away. 
  • Data is stored at the leaf level in an untransformed or nearly untransformed state. 
  • Data is transformed, and a schema is applied, only as needed to fulfil a given analysis.
  • Data lakes retain all data. 
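
The features listed above amount to what is commonly called schema-on-read: data is stored as it arrives, and structure is imposed only when someone reads it for a particular analysis. The following is a minimal sketch of that idea, not any specific product's behaviour. The sample records, field names, and the `read_with_schema` helper are all hypothetical.

```python
import json

# Hypothetical raw events as they might land in a lake: one JSON
# document per line, with inconsistent fields and types.
RAW_EVENTS = """\
{"user": "a1", "amount": "19.90", "channel": "app"}
{"user": "b2", "amount": 5, "device": "sensor-7"}
{"user": "c3", "note": "no amount recorded"}
"""


def read_with_schema(raw_lines: str, schema: dict) -> list:
    """Apply a schema at read time; the stored data is never modified."""
    rows = []
    for line in raw_lines.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        row = {}
        for field, cast in schema.items():
            value = record.get(field)
            # Missing fields become None instead of rejecting the record.
            row[field] = cast(value) if value is not None else None
        rows.append(row)
    return rows


# One analysis needs only users and amounts; another analysis could
# apply a different schema to the same raw data.
SALES_SCHEMA = {"user": str, "amount": float}

if __name__ == "__main__":
    for row in read_with_schema(RAW_EVENTS, SALES_SCHEMA):
        print(row)
```

No record is turned away at load time, even the one with no amount. The cost is that every reader has to decide how to handle gaps and mixed types, which is where much of the "swamp" risk comes from.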

Since data lakes are designed for storing massive amounts of data in its raw form, they may be of use to government agencies that ingest vast amounts of data about citizens in Australia from, say, a census, then keep it on hand until specific analysis needs to be done on one of those data sets.

How to start a data lake initiative 

A data lake is the answer to organising large volumes of data from diverse sources. Today more than ever, Australian businesses are facing the universal challenge of managing exponentially growing data sets. Thankfully, the data lake landscape is evolving quickly and adding real business value to a wide range of industries - from healthcare, retail and banking to mining, manufacturing, transportation and many more.

An advanced, modern data replication solution lets organisations handle all the connections from multiple source systems to the data lake, helping them achieve their goals with greater efficiency and accuracy and without a negative financial impact.

Typically, to start a data lake project, you move your current data out of your source systems through a full load process, or “Refresh Process,” which lets you schedule how and when that data loads. For instance, you can schedule data to load automatically in batches overnight, so it's fresh and up to date at the start of business every day.

This thoughtful loading process can help prevent possible network or user issues, making your data lake creation as smooth and painless as possible.
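
A full load in batches can be sketched as follows. This is an illustrative outline under stated assumptions, not the replication product's actual mechanism: the source is an in-memory SQLite database, the `customers` table and its rows are invented, and a plain Python list stands in for lake storage. Scheduling (for example, an overnight run) would be handled by an external scheduler and is not shown.

```python
import sqlite3


def full_load(source: sqlite3.Connection, table: str, batch_size: int = 2):
    """Yield every row of `table` in batches (a simple full-load pass)."""
    # The table name is interpolated directly into the query, so in
    # this sketch it must come from trusted configuration.
    cursor = source.execute(f"SELECT * FROM {table}")
    columns = [col[0] for col in cursor.description]
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        yield [dict(zip(columns, row)) for row in batch]


def make_demo_source() -> sqlite3.Connection:
    """Build a hypothetical source system with three customer rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "Ana"), (2, "Ben"), (3, "Cy")],
    )
    return conn


if __name__ == "__main__":
    lake = []  # stand-in for files written to lake storage
    for batch in full_load(make_demo_source(), "customers"):
        lake.extend(batch)
    print(len(lake), "rows loaded")
```

Reading in fixed-size batches keeps memory use and network bursts bounded, which is one way a load can be made gentler on the source system and its users.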

With all your current data stored in your new data lake system, you can now take advantage of functionality such as Change Data Capture (CDC), which captures only the changed data using log-reading technology.
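
The core idea of CDC, applying only what changed since the last run, can be illustrated with a simplified simulation. In real log-based CDC the change entries are read from the source database's transaction log. Here they are written by hand, and the entry format, `apply_changes` function, and sample data are all assumptions made for the sketch.

```python
# Hypothetical change log: in real log-based CDC these entries would be
# read from the source database's transaction log, not built by hand.
CHANGE_LOG = [
    {"seq": 1, "op": "insert", "key": 4, "row": {"name": "Dee"}},
    {"seq": 2, "op": "update", "key": 1, "row": {"name": "Ana M."}},
    {"seq": 3, "op": "delete", "key": 2, "row": None},
]


def apply_changes(target: dict, change_log: list, last_seq: int = 0) -> int:
    """Apply only entries newer than last_seq; return the new position."""
    for change in sorted(change_log, key=lambda c: c["seq"]):
        if change["seq"] <= last_seq:
            continue  # already applied in an earlier run
        if change["op"] == "delete":
            target.pop(change["key"], None)
        else:
            # Inserts and updates both write the latest row for the key.
            target[change["key"]] = change["row"]
        last_seq = change["seq"]
    return last_seq


if __name__ == "__main__":
    # State of the target after an initial full load (made-up rows).
    target = {1: {"name": "Ana"}, 2: {"name": "Ben"}, 3: {"name": "Cy"}}
    position = apply_changes(target, CHANGE_LOG)
    print(position, target)
```

Storing the returned position and passing it back in as `last_seq` on the next run means entries that were already applied are skipped, so each run moves only the new changes instead of reloading everything.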

There's no risk in learning more about how this kind of solution can help you use the massive amount of data at your fingertips to make substantial, strategic improvements to your business.
