The last post discussed the missed side of Big Data – the context.
Let us start with a brief look at our data legacy. Disappointment with efforts to leverage data has existed for years. In the past, growing data volumes and their management were relegated to the back office. Today, we face new “Big” data disillusionments as we struggle to harness the data or lack the know-how to leverage it. These challenges are similar to the legacy ones, except that in the new paradigm the data is the most visible, customer-facing facet. This, coupled with fast-changing user needs, poses a few interesting business challenges of its own.
As we know, understanding the problem context is the first critical step to solving it. This is true even for the ever-persistent data issues, specifically in navigating the right path to a desirable solution. A typical context provides insights into how the problem manifested, its implications, risks, assumptions, constraints and expected outcomes. Taking a holistic view of the complete picture right at the start is a must for any team attempting to solve a problem.
The question is how to capture such contexts. The simple answer is by using “data”. This means we use data to describe, or paint, the context of the problem. In the past, the limited availability of quality data and a lack of computing power constrained our ability to use data in this manner. Today, Big Data offers many rich alternatives: we can collect large amounts of varied data at a rapid pace, and newer technologies can handle complex problems of wider scope in great detail. Such powerful capabilities, coupled with new business demands and aggressive go-to-market pursuits, could be a recipe for disaster. We must frame the problem context correctly, right at the start.
Let us take the following as example problems –
• Detect supply chain risks using Twitter sentiment analysis.
• Predict disease onset within a hospital community/region.
• Provide holistic caregiving through continuous patient engagement.
• Improve health diagnostic accuracy through machine learning.
These are a few of the typical problems for a healthcare provider or service company. A typical solution for such problems will share a few of the traditional data subjects such as patient, hospital, doctor, case history, billing history and others. In the “Big” data paradigm we capture much more data, such as social media, third-party data sources, videos, and mobile data, to complement the existing enterprise data, which is spread across many different enterprise applications and repositories and has been captured over many years. This enterprise data landscape is riddled with legacy issues as well as the additional complexity of managing new data while fulfilling new business demands. The only way to resolve this is by focusing on the right data (colors) to paint the context (picture) with the right scope.
Cueris’ WeaveXT framework offers a way to frame contexts using context strands.
What are context strands?
In the WeaveXT framework, context strands are logical structures, or architecture components, that act as a context-framing vehicle focused on using the right and relevant data. A typical context strand identifies what is needed to describe the context: the data subjects, their formats, the period of time covered, the purpose of the context (outcomes), and the expected quality. The key premise is that a context strand evolves, or is enriched, incrementally along with the underlying data. It is designed to string together related and relevant data stores from within the enterprise data landscape, including old and new data. The context strand gives direct, specific access to each data store, making retrieval efficient and providing the shortest possible path to an outcome. It is an atomic, independent structure that operates without interfering with the underlying physical data store(s) or data. The noninvasive buildup of context strands leverages legacy data sources in concert with newer data repositories such as Hadoop, NoSQL and others. This noninvasive functioning allows consumption, collection and storage activities to run concurrently yet independently in response to the ever-changing context.
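To make the description above concrete, here is a minimal Python sketch of what a context strand might record. It is not WeaveXT's actual design: the class names `ContextStrand` and `DataStoreRef`, the `enrich` method, the store labels, and the dates are all hypothetical, chosen only to mirror the elements named in the text (data subjects, formats, time period, purpose, expected quality, and linked data stores).

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class DataStoreRef:
    """Pointer to an existing store; the strand never copies or alters its data."""
    name: str  # hypothetical label, e.g. "patient_rdbms"
    kind: str  # e.g. "relational", "hadoop", "nosql"


@dataclass
class ContextStrand:
    """Logical description of the data needed to frame one problem context."""
    purpose: str              # the expected outcome of the context
    data_subjects: list[str]
    formats: list[str]
    period_start: date
    period_end: date
    expected_quality: str
    stores: list[DataStoreRef] = field(default_factory=list)

    def enrich(self, subject: str, store: DataStoreRef) -> None:
        """Grow the strand incrementally as new relevant data appears."""
        if subject not in self.data_subjects:
            self.data_subjects.append(subject)
        if store not in self.stores:
            self.stores.append(store)


# Illustrative values only; none of these come from a real deployment.
strand = ContextStrand(
    purpose="Predict disease onset within a hospital region",
    data_subjects=["patient", "case history"],
    formats=["relational rows"],
    period_start=date(2010, 1, 1),
    period_end=date(2015, 12, 31),
    expected_quality="deduplicated patient records",
    stores=[DataStoreRef("patient_rdbms", "relational")],
)

# Enrichment adds a new subject and store without touching existing ones.
strand.enrich("social media", DataStoreRef("tweets_hadoop", "hadoop"))
```

The strand here holds only references to stores, which is one way to read the "noninvasive" property: enriching it changes the strand's description, not the underlying data.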
In a nutshell, context strands are logical or floating structures spanning the enterprise data landscape. A few critical features are –
• Connect disparate, heterogeneous data stores to facilitate easy and efficient access.
• Offer flexibility to connect or disconnect from a data store in response to the change of context.
• Reuse the same data store as a source for different context strands.
• Support a hybrid architecture that combines computing and/or storage as needed.
• Allow use of legacy data sources while seamlessly integrating leading edge data stores.
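The connect, disconnect and reuse features listed above could be pictured with a small registry that only tracks links between strands and stores. This is a sketch under assumptions, not the framework's API: `StrandRegistry`, its methods, and the strand and store names are hypothetical.

```python
from __future__ import annotations

from collections import defaultdict


class StrandRegistry:
    """Tracks which context strands are linked to which data stores."""

    def __init__(self) -> None:
        # Maps strand name -> names of the stores it is connected to.
        self._links: dict[str, set[str]] = defaultdict(set)

    def connect(self, strand: str, store: str) -> None:
        """Link a strand to a store; the store itself is not modified."""
        self._links[strand].add(store)

    def disconnect(self, strand: str, store: str) -> None:
        """Drop a link when the context changes; other strands are unaffected."""
        self._links[strand].discard(store)

    def stores_for(self, strand: str) -> set[str]:
        return set(self._links.get(strand, set()))

    def strands_using(self, store: str) -> set[str]:
        return {s for s, stores in self._links.items() if store in stores}


registry = StrandRegistry()
registry.connect("disease_onset", "patient_rdbms")
registry.connect("disease_onset", "tweets_hadoop")
registry.connect("supply_chain_risk", "tweets_hadoop")
registry.connect("supply_chain_risk", "orders_legacy_files")

# The same store serves two strands at once.
shared = registry.strands_using("tweets_hadoop")

# One strand disconnects; the other keeps its link to the shared store.
registry.disconnect("disease_onset", "tweets_hadoop")
```

Keeping the links outside the stores is the design choice that lets one strand change its sources without affecting another strand that reads the same store.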
As context strands mature, they can be used to weave an enterprise information fabric capable of acting as an enterprise data backdrop to serve additional business needs. This approach deviates from the big bang approach of uninhibited, relentless data population without a specific purpose. Individual context strand design emphasizes choosing the right-fit repository, such as Hadoop, relational, NoSQL, in-memory, or files. This may help protect legacy investments while creating a resilient information fabric that provides the shortest and quickest access paths to the desired data.
Each of the problems listed above is a good candidate for a context strand. Even if these context strands share data, they remain independent, as each presents the best possible access path to the most relevant data for its own context.
The following are a few critical considerations in looking at each of the problems as an independent context –
• Supply chain risk detection is helpful as it points to a possible shortage of a specific medicine or type of supply required for treatment.
• Predicting the possible onset of a disease helps in developing plans to manage such outbreaks proactively, including mitigation of supply chain risks.
• Holistic caregiving through continuous patient engagement helps doctors support a larger number of patients without in-person visits.
• Using machine learning to improve diagnostic accuracy raises both the quality and the efficiency of diagnosis.
Looking at these problems at the enterprise level provides a clear picture of their interdependencies and how they can be woven into a fabric that offers a bigger-picture view. The incremental build and context enrichment is a continuous process in which you can build and expand your data outreach independent of the existing IT landscape.
Do context strands offer possibilities that were unheard of and unavailable before? The answer is an emphatic “YES”. The key to WeaveXT’s context-driven prescription is to wield the “Right” data, not necessarily Big Data.