Today, many companies face a "Big" data quagmire. Questions abound and expectations run huge as the search continues for a silver bullet. The early adopters followed a familiar path: initial exuberance and rushed investments in so-called solutions, as many hurried in to solve "Big" data. Now many are looking for answers to questions such as "What is the return on investment?", "Is it viable, and how?", and "What is the right roadmap?" Many have reached a moment of introspection; the one certainty is that "Big" data is here to stay.
As organizations embrace "Big" data, they start facing reality:
- The burden of supersized business expectations
- New solution paradigms for new business challenges
- Many disparate technologies – cloud, mobile, social media, and "Big" data – competing for the same mindshare and resources
- Huge investments in the legacy enterprise data platforms
- Availability of skilled resources lagging the pace of technology evolution
In the midst of such uncertainty, Apache Hadoop, dubbed the "Big" data operating system, has emerged as an attractive alternative. It offers many tools delivering targeted capabilities beyond data storage. At its core, it addresses the key problem of storage, offering unconstrained storage for a wide variety of data. While that solves the storage problem, many organizations are struggling with how to leverage this data effectively. The Apache ecosystem continues to evolve rapidly, adding new tools and capabilities that complement the Hadoop storage platform. Amidst the surrounding swirl, it is certain that Hadoop is the platform of choice for "Big" data storage.
The current marketplace clearly indicates that Hadoop is here to stay; the elephant has arrived.
Nobody said it was an elephant!
The early adopters rushed in to solve the storage issue, with most focusing on rapid capture of nontraditional data using Hadoop as the repository. The new "Big" data repositories grew at a rapid pace, as did the existing legacy data stores. The availability of new "Big" data fueled newer, nontraditional business demands requiring integration across the legacy and "Big" data repositories.
It fueled growth in data volumes not seen in the past; it was "Big" and growing fast.
Is it an elephant?
The advent of data science, machine learning, and cognitive computing fueled newer business demands such as advanced analytics, predictions, and intuitive visualizations. These demands required newer technology solutions to manage the entire data value chain. The technology offered many newer storage alternatives – NoSQL, graph, columnar, and in-memory databases – along with many other tools to improve data access, computation, and presentation.
The real challenge is that the business problem and the solution are evolving simultaneously, blurring the edges.
It does move slowly!
This is a perfect storm: new technologies and new solution paradigms requiring newer skills, more time, and more money. It meant the road to "Big" data maturity has a steep learning curve. Organizations lacked, or lagged in, the needed experience and expertise, as mobile, cloud, and social computing competed simultaneously with "Big" data for the same mindshare and limited resources. The collective scope and magnitude of these related but separate undertakings is complex and large.
For many reasons beyond its size, it moves slowly!
Can it co-exist?
The right answer is "Yes," though practitioners differ in their opinions. "Big" data was added to the existing systems landscape and accepted as a newcomer, with little understanding of its risks or consequences. As it takes root, the conversations are focusing on the role and purpose of the platform. The discussion is dominated by two approaches: one is to move and store all the data in Hadoop; the other preaches using Hadoop selectively, for only certain data. Either approach involves co-existence, as organizations attempt to preserve sizable legacy investments and to ensure a smoother transition into the new world.
Co-existence is mandatory. The rapidly evolving technology may fulfill this need, though the question remains:
How to nurture it?
Let us start by defining it: what is "Big" data? It is not Hadoop, social media, the Internet of Things (IoT), internet logs, or any single technology. A lesson from building enterprise Business Intelligence (BI) is that poor adoption rates are directly tied to a lack of clarity about business value. User buy-in is a function of how well the "Big" data paradigm is understood and how quickly it can deliver tangible business value to end users. With increasing "time to market" pressures and an emphasis on quick return on investment, the need is to deliver tangible business results quickly.
All the "Big" data questions remind us of the old story of the blind men and the elephant. One day, a group of blind men came upon an elephant; as they tried to identify the animal by touching and feeling it, each reached a different conclusion. The one who touched the tail thought it was a broom, the one who touched the legs thought it was a pillar, the one who touched the belly thought it was a big, wide drum, another thought the trunk was a long tube or pipe, and so it goes.
With “Big” data, are we on the same road as the blind men identifying an elephant?