DataCenterNews Asia
Specialist data center news for Asia

How to stop data lakes from getting swamped

By Julia Gabel
Mon 30 Apr 2018

A “data lake” sure sounds inviting.

Cool flows of structured and unstructured data, all streaming into a vast repository, where companies are free to fish out awesome new insights all day long.

But without the right approach, that data lake isn’t as welcoming as it looks on the surface.

The sheer volume of data, for instance, can easily overwhelm companies that aren’t discerning about what is filling the lake, and why.

The weight of all this data can also clog things up unless companies are committed to using the latest technology to integrate it and process it for maximum insight.

The data also needs to be fast and easy to access and secure, so companies can get value from it while ensuring the data isn’t misused or compromised.

In short, it doesn’t take much for a data lake to start looking like a data swamp: a stagnant, murky place where you can’t be sure what will come up when you stick in a net.

Avoiding data swamps is a must to truly capitalize on increasing volumes of data and generate new business intelligence that propels growth.

Fortunately, there are ways to keep data lakes dynamic, pristine and viable business assets.

Save the lakes 

The rise of data lakes is the result of the sheer amount of information available today.

Technologies like the Internet of Things (IoT) and its billions of global sensors stream out data that’s never been collected before, promising the discovery of insights that just a few years ago weren’t knowable and the monetization of data flows that we didn’t imagine existed.

Today, for instance, agriculture companies can crunch centuries of crop data to better predict weather patterns and yields.

Transportation firms can turn to big data to optimize traffic routes by combining past and current records about vehicle speeds, weather, road conditions and fuel consumption. It’s exciting, but this kind of information must live somewhere where it’s useful, accessible and safe.

Data lakes that can’t offer those things are a waste of money and a lost opportunity to capitalize on today’s unbelievably rich data resources. Here are a few quick tips for companies looking to avoid data swamps:

  • Be selective

Information overload isn’t a new problem, but it takes on new dimensions for data lakes in an age when Cisco says global big data volumes are soaring toward 402 exabytes (1 exabyte = 1 billion gigabytes) by 2021, an eightfold increase from 2016.
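To put that figure in perspective, here is a rough back-of-the-envelope sketch. It uses only the numbers quoted above and assumes that "eightfold increase" means the 2021 volume is eight times the 2016 volume; the function names are illustrative, not from any Cisco material.

```python
# Figures quoted in the article (attributed there to Cisco).
VOLUME_2021_EB = 402          # projected exabytes by 2021
GROWTH_FACTOR = 8             # assumed reading: 2021 volume = 8 x 2016 volume
GB_PER_EB = 1_000_000_000     # 1 exabyte = 1 billion gigabytes


def implied_2016_volume_eb(volume_2021_eb=VOLUME_2021_EB, factor=GROWTH_FACTOR):
    """Volume in 2016 implied by the 2021 projection and the growth factor."""
    return volume_2021_eb / factor


def exabytes_to_gigabytes(exabytes):
    """Convert exabytes to gigabytes using the article's definition."""
    return exabytes * GB_PER_EB


def implied_annual_growth(factor=GROWTH_FACTOR, years=5):
    """Compound annual growth rate that yields `factor` over `years` years."""
    return factor ** (1 / years) - 1


if __name__ == "__main__":
    print(f"Implied 2016 volume: {implied_2016_volume_eb():.2f} EB")
    print(f"2021 volume in GB:   {exabytes_to_gigabytes(VOLUME_2021_EB):,}")
    print(f"Implied yearly rate: {implied_annual_growth():.1%}")
```

Under that reading, the projection implies roughly 50 exabytes in 2016 and growth of about half again every year.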

In the face of all that information, companies need to resist the temptation to over-collect data just because it’s available.

Companies need to know exactly what business problem they are trying to address and precisely what they hope to achieve with the data they’re gathering.

This can help them avoid filling data lakes with volumes of information that do nothing but bury them in the muck and prevent them from taking advantage of what their data offers.
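One way to picture this kind of selectivity is an admission check at the edge of the lake. The sketch below is purely illustrative: the `Dataset` fields, the `admit` rule and the sample entries are all hypothetical, and a real ingestion pipeline would apply its own criteria.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """A candidate data source (hypothetical structure for illustration)."""
    name: str
    business_question: str  # the problem this data is meant to address
    owner: str              # who is accountable for the data


def admit(dataset: Dataset) -> bool:
    """Admit only data that has a stated purpose and a named owner."""
    has_purpose = bool(dataset.business_question.strip())
    has_owner = bool(dataset.owner.strip())
    return has_purpose and has_owner


def triage(candidates):
    """Split candidates into (admitted, rejected) lists."""
    admitted = [d for d in candidates if admit(d)]
    rejected = [d for d in candidates if not admit(d)]
    return admitted, rejected


if __name__ == "__main__":
    candidates = [
        Dataset("truck_gps", "Which routes waste the most fuel?", "logistics"),
        Dataset("misc_sensor_dump", "", ""),  # collected "just because"
    ]
    admitted, rejected = triage(candidates)
    print("admitted:", [d.name for d in admitted])
    print("rejected:", [d.name for d in rejected])
```

The point of the design is that the question comes first: data with no stated purpose never reaches the lake, so it cannot turn into muck later.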

  • Automate

To truly make sense of the data filling their data lakes, companies need to take advantage of emerging technologies like artificial intelligence (AI) and machine learning that can help them sort, analyze and learn from the data with superhuman efficiency.

These capabilities help companies spot patterns, create hypotheses and find value in their data lakes that might otherwise go unnoticed.

Companies are increasingly learning this. In NewVantage Partners’ annual executive survey, 76.5% of executives indicate that the proliferation and greater availability of data are empowering AI and cognitive initiatives in their organizations.

“The survey results make clear that executives now see a direct correlation between big data capabilities and AI initiatives,” according to the MIT Sloan Management Review.

In short, more automation means fewer data swamps.

  • Keep it close

Distance matters because it delays many of the functions that prevent data lakes from devolving into data swamps.

The farther data lakes are from where data is created or needs to be accessed and analyzed, the greater the chance that latency will slow analytics engines or the processes that drive AI, such as the interconnection of cloud apps, data sources and users.

Creating data lakes in proximity to where data is stored, produced or needed by users and applications maximizes security and optimizes the functions powered by the data the lakes contain, which keeps the lakes fresh and productive.
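The latency point can be made concrete with a rough estimate. The sketch below assumes light travels through optical fiber at about 200,000 km per second (roughly two-thirds of its speed in a vacuum) and ignores routing, queuing and processing delays, so real round trips will be slower. The distances and request counts are made-up examples.

```python
# Approximate signal speed in optical fiber: ~200,000 km/s, i.e. ~200 km per ms.
# This is a physical lower bound, not a measurement of any real network.
FIBER_KM_PER_MS = 200.0


def min_round_trip_ms(distance_km):
    """Best-case round-trip time over fiber for a given one-way distance."""
    return 2 * distance_km / FIBER_KM_PER_MS


def total_wait_ms(distance_km, sequential_requests):
    """Best-case time spent waiting when requests must run one after another."""
    return sequential_requests * min_round_trip_ms(distance_km)


if __name__ == "__main__":
    for label, km in [("same metro", 50), ("cross-continent", 5000)]:
        rtt = min_round_trip_ms(km)
        wait_s = total_wait_ms(km, 1000) / 1000
        print(f"{label:>15}: {rtt:5.1f} ms per round trip, "
              f"{wait_s:5.1f} s for 1,000 sequential requests")
```

Even in this best case, a job that makes 1,000 back-to-back requests spends about half a second waiting across a metro area and about 50 seconds across a continent, which is why placement matters for analytics that touch the lake repeatedly.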

Data lakes thrive here

A global interconnection platform is a place where a data lake can thrive.

It provides the proximity to various sources, data stores, analytics, and cloud and network partners that’s so crucial to keeping data lakes healthy.

Platform Equinix spans 48 markets on five continents, so companies can create data lakes close to almost anywhere.

The network- and cloud-density on Platform Equinix (1,700+ networks, 2,900+ cloud and IT service providers) is also a huge benefit because it enables interconnection to the cloud and network services needed to fully exploit a company’s data assets.

In addition, Equinix Data Hub is a solution deployed on Platform Equinix that’s designed to enable companies worldwide to store vast amounts of data at a local level, for quick access by the people and applications that need it.

That’s a data swamp preventative if there ever was one.

Article by Jim Poole, Equinix Blog Network
