July 06 2020 – Varun Sharma
It wasn’t so long ago that big data was a shiny new phenomenon promising a take-no-prisoners business takeover. Now that data and analytics are vital to business and deeply entrenched, the question is whether the technology will grow and mature in 2021, be replaced by something else, or continue to reshape businesses in new and interesting ways.
The answer may be some of each, as it turns out. Here are the top predictions of what we can expect next year.
Big data is dead, long live big data
Back in 2017, Mark Beyer, an analyst at Gartner, wrote big data’s obituary: “Rest in pieces…in all those other technologies where you belong.” Svetlana Sicular, another Gartner analyst, reiterated Beyer’s assessment in a blog post, saying the information behemoth, or at least its moniker, is indeed dead, “because we still get questions about what Gartner thinks about the big data fate. Big data became the new normal, and now it is just data.”
But whatever term you use, big data still places huge burdens on infrastructure and data wranglers. Huge datasets are still retrieved, stored, sorted, and analyzed to fuel everything from software, machine learning, and a range of innovations to the CEO’s daily decisions and reports to stockholders. Just because it’s now ubiquitous and working silently in the background does not mean that its size has shrunk or that its velocity has slowed.
Data is everywhere: It is clogging networks, ballooning storage requirements, crossing borders in multicloud and hybrid environments, and generally placing huge demands on infrastructure, networks, and tools.
But, sure, it’s dead. Long live big data.
In 2021, big data will remain challenging because, whether we call it “big” or not, data is freaking huge and still growing. However, the actual challenges and opportunities will be different next year, as will be the ways we deal with them.
Finally, enough computing power to do what we predicted earlier
Cloud cannibalism will not be a thing in 2021, despite earlier predictions that the cloud (and its bandwidth) would be consumed by the Internet of Things. Instead, both cloud and edge computing will continue to grow fast, thanks to the hair-raising speeds of big data growth.
“Since shipping data [from the edge of the infrastructure to the cloud] can take weeks, months, or even years, it’s important to have something in between the cloud and the devices to fill this gap. It also addresses issues around data privacy and regulation when data is unable to be shipped to the cloud,” explains Duncan Pauly, CTO at Edge Intelligence, a provider of analytics for edge computing.
But edge computing also helps organizations manage unwieldy piles of data.
“The infrastructure edge is key, as it helps store and manage the data in a cost-efficient way while decisions are made as to what data is trimmed, reduced, and ultimately shipped to the cloud,” says Pauly.
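To make the idea of trimming and reducing data at the edge concrete, here is a minimal Python sketch. It assumes a stream of numeric sensor readings and collapses each fixed-size window into a small summary record, so that only the summaries would be shipped to the cloud. The function names, the window size, and the sample readings are hypothetical and are not taken from Edge Intelligence or any other product.

```python
from statistics import mean


def summarize_window(readings):
    """Reduce one window of raw readings to a compact summary record."""
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": mean(readings),
    }


def reduce_at_edge(readings, window_size):
    """Split raw readings into fixed-size windows and summarize each one."""
    summaries = []
    for start in range(0, len(readings), window_size):
        window = readings[start:start + window_size]
        summaries.append(summarize_window(window))
    return summaries


# Hypothetical raw temperature readings collected at the edge.
raw = [20.0, 20.5, 21.0, 21.5, 30.0, 30.5, 31.0, 31.5]

# Eight raw values become two summary records bound for the cloud.
to_cloud = reduce_at_edge(raw, window_size=4)
```

In this sketch the raw readings stay at the edge, where a retention decision can be made later, while the cloud receives a fraction of the original volume. Real deployments would pick the summary statistics and window length to suit the analysis they need downstream.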
That’s a long way from Tom Bittman’s 2017 post in which he declared that “the edge will eat the cloud.” He generally meant that edge computing—computing close to the IoT device or gateway—soon would overtake cloud computing in usage rates.
His concern was not without merit. Problems happen when you consolidate all things big data-related in the cloud. Primarily, it creates an enormous conflict with the laws of physics in transferring huge amounts of data at the speed of light—or at least fast enough to prevent a self-driving car from crashing—over existing and often bottlenecked networks. But edge computing had problems then, too.
“[Edge computing] requires processing power and storage, lots of it. And data analytics tools. And tools to push software and data to the edge. And tools to federate across the edge, and with the centralized cloud. And machine learning on the edge itself,” Bittman wrote. “The edge will need some serious muscle.”
Today, edge computing has that muscle. It powers everything from connected cars and autonomous vehicles to automated customer service on devices, covering everything from activation to repair and plenty in between.
But it’s not just new tech like self-driving vehicles that leans heavily on edge computing. Everyday things, like all modern-day cars, do as well.
“For example, modern fuel injection systems will observe the car’s driving patterns in order to optimize for power and fuel efficiency. The real-time nature of this data would make it impossible to process anywhere other than at the edge,” said Johnathan Vee Cree, PhD, in a PC Magazine article. Vee Cree is an embedded and wireless systems scientist and engineer at the U.S. Department of Energy’s Pacific Northwest National Laboratory.
You can expect edge computing adoption to rise, given the nature and growth of the Internet of Things and the mind-boggling demands for faster analytics. Data and analytics usage will thus lean more toward a distributed model than a centralized one.
However, use of the cloud will not actually diminish. Value-added services and economies of scale will continue to be huge inducements for less time-sensitive data analysis and storage.
Additionally, analytics will move to the cloud (or the edge) and will no longer be grounded in data centers, not even in the traditional high-performance computing centers where machine learning training and intensive computing reign supreme at the moment.
“Edge and fog computing will coexist with the cloud and be used for different types of analytical processing,” explains Pauly.
He says distinct types of analytical processing will be required in these areas:
- Inside of devices and gateways for complex real-time event processing
- At the infrastructure edge, in aggregated, distributed analytics for enormous volumes of data
- In the cloud, for data reduction, deep learning, and integration with other on-demand cloud services
“In fact, the edge should be thought of as an extension of the cloud into geographical extremities rather than an incursion into existing cloud territory,” Pauly adds.
You’re expecting changes in machine learning and analytics. Turns out, not so much
Machine learning isn’t quite that big a glutton—yet.
“[Machine learning and analytics] are merging,” says Pauly. “But there continues to be a need for traditional analytics. Machine learning algorithms are best suited for certain use cases and predictive purposes, whereas traditional analytics are useful for time-series analysis, ad hoc analysis, forensic analysis, business dashboards, etc.”
Since it is established that knowledge is power, does this mean that the machines will take over the world because they ate the analytics and ingested the knowledge?
Nope, we’re safe. Machine learning works very well, but it neither possesses self-awareness nor has access to a data singularity from which to learn the meaning of life. In other words, machine learning is focused on learning and doing specific tasks, none of which is taking over the world.
That’s not to say there isn’t a battle brewing somewhere else.
A CIO vs. CDO showdown
It’s probably not going to be pretty, but it had to happen eventually.
“The days of forgetting that the ‘I’ in CIO stands for ‘information’ are over,” declares James Markarian, CTO of SnapLogic, a provider of integration platform-as-a-service tools and connectors.
Markarian says the CIO role will become identified more with leading a company’s data and information strategy than with infrastructure and security.
“Much as digitization and data transformed the CMO role, the CIO role will be unrecognizable from its current form in a few years. We can expect this process to pick up steam in 2021,” he adds.
Despite the constant evolution of roles, the important thing is that knowledge is power. Those who can best pull knowledge from information will win the prize. The title won’t much matter.
And at the end of the day, it is totally about amassing and wielding knowledge, because all the mundane stuff will be automated next year.
Improving data retention policies
“While you need to hold onto your data for backup for some time, you don’t need to store it forever,” says Carlos M. Meléndez, chief operating officer at AI consulting firm Wovenware.
Machine learning can be taught to clean and protect the integrity of stored data and even to dispose of data that is out of date or no longer needed. You can think of it as an automated courtesy data flush.
But don’t worry: That flushed information is not lost forever. “The algorithms themselves can serve as your backup, and if need be, you can use the algorithms to re-create some of your data,” says Meléndez.
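One way to picture an algorithm serving as a backup is a fitted model that stands in for the raw records it was trained on. The Python sketch below is only an illustration under simple assumptions: it fits a straight line to a handful of hypothetical hourly load measurements, discards the raw values, and later regenerates approximations from the two stored parameters. It uses ordinary least squares rather than any specific method described by Meléndez or Wovenware, and the re-created values are estimates, not exact copies.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def recreate(model, xs):
    """Regenerate approximate y values from the stored model parameters."""
    slope, intercept = model
    return [slope * x + intercept for x in xs]


# Hypothetical measurements: hour of day and observed system load.
hours = [0, 1, 2, 3, 4]
load = [10.0, 12.1, 13.9, 16.0, 18.0]

# Keep only the two model parameters, then drop the raw values.
model = fit_line(hours, load)
del load

# Later, approximate the discarded values from the model alone.
approx = recreate(model, hours)
```

The trade-off is visible in the word “some” in the quote: the model keeps the overall trend but loses the point-to-point variation, so this kind of backup suits data whose fine detail no longer matters.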
And there’s your glimpse of data and analytics in 2021. Yes, it is going to be an interesting year.