May 20, 2013

Four starting questions about Information - Big Data Plan part 1

Big Data - The New Information

Before asking the crystal ball what Big Data can do for you, sit back and think about these four questions:

  1. Where’s the new information?

  2. Where could it be?

  3. If it was in the right place, what could happen? (Challenges of the main industries)

  4. What are the New Technologies and the role of big IT vendors?

These questions should be asked before you actually think about building the plan that McKinsey suggested back in their March Quarterly article entitled "Big Data - What's Your Plan?". In this study the authors bluntly advise readers to "create a simple plan for how data, analytics, frontline tools, and people come together to create business value". Well, who would have thought of that? They even go further and explain what a plan is and what its components should be: Data, Analytic Models, and Tools. Nice plan to have a plan, but some more fundamental questions should be asked first, and those are the four I propose above.

The most inquisitive girl!


Curiosity is the healthiest quality to have these days, when everything seems to be served on a platter. Being inquisitive about the surrounding reality can only be good, and it can start a discovery exercise that shows the world new and interesting breakthroughs. Curiosity feeds on information: the more, the better. This sets the stage for our first question: Where is the New Information?

Where is the New Information?

New information has different types of sources, and sometimes it ends up thrown away. I would like to propose these three types of sources for (new) information:


Information that is Born Digital comes directly from digital devices and already brings a certain type of limited context and understanding, but by itself does not carry much value outside that context. Examples are shown below:


More and more data is being captured this way. Before we go deep on the implications of having everything capturing data, or start babbling about Machine to Machine or the new world of the Internet of Things, let's just face the fact that most of these challenges have been around for a while, but confined to industries such as manufacturing, defense, telco or utilities. Born Digital will become more and more ubiquitous, so it is only logical for people to expect it to be contextualized with other data, within reasonable limits of privacy and personal space.

The other type of data being generated, also in considerable amounts, has a manual birth. Humans generate it, using either digital or analog interfaces. Here the dichotomy between "atom" and "bit" is not the relevant one; what matters is the format and the level of freedom given when capturing that data. This is where data can have more or less structure depending on the rigidity of the interface capturing it. The more rigid the interface, the more structured the data will be; the looser the interface, the less structured, but also the more subjective.


The other type of "Born Analog" data is the one that has no metadata, sits in piles of paper, or has been exposed as Open Data but with little or no contextual format. This is actually the most exciting and relevant use for technology, since we humans have been unable to make sense of it in a timely fashion since Johannes Gutenberg or even earlier. When a piece of information was captured in one of these formats, it had been preceded by a lengthy process of word of mouth and discussions that were never captured, so the rationale behind this analog information, and the context in which it was captured, cannot be reconstructed. This is why most of this data is, simply put, not trustworthy.

Lastly, information that was captured but simply thrown away at some point can also be an interesting source, as most of the time the reason why it was discarded makes no sense after a while. In other words, what we find useless today may prove to be very valuable some time in the future. Are we talking about information hoarding? Yes we are, with all its magic and creepy dimensions.

Where could it be?

Now that we have identified where it comes from, what about making sure we have it when needed? This new information, after being captured, processed and thrown into a pig pile, needs to be in the "right place", otherwise it does not make sense to store it in the first place. I have three examples for you:

  • Border Control or Extreme Homeland Security

  • Healthcare (Public mainly)

  • Vision 360: social uprising, citizen, customer, etc.


Every time those guys at passport control in Dubai or Riyadh look at me with a straight face, I wonder what kind of info is popping up on their screens, which I can't see. Great poker faces, though, and I guess this is one of the examples where "no news is good news". If, on the other hand, my personal beliefs about the specific country I'm about to enter came popping up on the screen at passport control, this could be seen as a creepy example of what could be done if the new information were in the (supposed) right place.

A more interesting and useful example is remote medicine. The fact that hospitals are a bad place to hang out, even when you're sick, makes remote medicine an attractive alternative from both a social comfort and public health cost savings perspective. This is an area where technology alone can't solve all the issues, but it can certainly move mankind into a new era of healthcare.

The third example is the need, in every sector that deals with people, to have a 360 view of that customer, person, citizen, etc. Social uprising prediction is a very complex example, but simpler ones include the need to tailor a set of product offerings to the ever-shrinking groups of segmented populations. The challenge is not to segment people; the problem is keeping the system dynamic enough to update its segmentation assumptions as behavior changes. Some people make it easier for you by leaving a digital trail with all the social tools, but others are much harder to track and segment.
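To make the point about updating segmentation assumptions a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the names `segment` and `SegmentTracker`, the spend thresholds and the customer "ana" are hypothetical and do not come from any real system. The idea is simply that a segment is re-evaluated every time new data arrives, instead of being assigned once and left alone.

```python
from collections import defaultdict


def segment(total_spend):
    """Map cumulative spend to a segment label (hypothetical thresholds)."""
    if total_spend >= 1000:
        return "high"
    if total_spend >= 100:
        return "medium"
    return "low"


class SegmentTracker:
    """Keeps a running total per customer and re-segments on every event."""

    def __init__(self):
        self.spend = defaultdict(float)  # customer -> cumulative spend
        self.segments = {}               # customer -> current segment

    def record(self, customer, amount):
        """Add a purchase and return (previous_segment, current_segment)."""
        previous = self.segments.get(customer)
        self.spend[customer] += amount
        current = segment(self.spend[customer])
        self.segments[customer] = current
        return previous, current


tracker = SegmentTracker()
tracker.record("ana", 50)           # first purchase: lands in "low"
moved = tracker.record("ana", 200)  # cumulative 250: re-evaluated on arrival
```

After the second purchase the customer moves from "low" to "medium" without anyone re-running a batch segmentation. A real system would use far richer signals than cumulative spend, but the re-evaluate-on-arrival pattern is the part that matters here.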

If it was in the right place, what could happen? (Challenges of the main industries)

The definition of "right place" here refers to business value and applicable use cases. If we start pounding the data bag, all sorts of current and new use cases can come out, but for the purpose of this exercise I will just use the most relevant ones for each industry:


Don't get hung up on the fact that some industries are missing, or that in some industries not all use cases are covered. Instead, accept that using more data and new data is a challenge that crosses all industries, so there is no sense in turning away from it. Unless you're one of those people who think this is "for others, not us". But even if you are, there is no sense in looking at systems that mimic reality in a deterministic way. Let's step back and look at a graphic that correlates Systems Maturity Level and Business Value to better understand where all this Data Driven madness sits:


So if you're looking at these New Information challenges from an operational perspective, you will never get your head around them. The operational world is about looking at red lights and at what failed and what did not. As the involvement in planning gets deeper, future decisions need to be based on a better understanding of past events. Questions that start with "Why" are therefore generally what needs to be answered by bigger systems like Data Warehouses, along with some crippled data discovery through ad hoc queries against the available data.

The bar rises when you need to incorporate a prediction model to guide strategic decisions about the future. Not just financially speaking, but at all levels of the organization. This is the world of "What if...". A dream come true for decision makers: to be able to dig deeper, today, into the future impacts of each possible scenario. To accomplish part of this, one needs to have all the data available, adaptable and self-learning analytic models, and the most flexible tools to make this process pain-free and enjoyable.

Data, Analytical Models, Tools... this is strangely familiar.

The fourth question has been discussed ad nauseam in previous posts on this blog and on other blogs. But again: it's not about the means, it's about the end.

