Cluster your data like the CIA
Once you have gathered enough data from experiments, you can deploy a radically new tool for innovation management we call clustering, based on the technology the CIA used to find Osama bin Laden and Netflix still uses to recommend movies to you.
Now what on Earth do Osama bin Laden and Netflix have in common?
Back in 2011, we were wondering how to develop the first hypothesis in a large and deeply complex assignment. We decided to look at other trades and industries to see how they approached tasks like these. After investigation, we found that the CIA used “grounded theory.” This turned out to be a vital element in their successful location of Osama bin Laden.
We decided to learn more about grounded theory. After a while, we found out that it was also used in research on finding relatives through DNA matching. A number of scientists had developed a piece of open-source software called Cytoscape. This caught our attention, and we started to experiment with this software and grounded theory in order to analyze the huge amount of data that we were collecting as part of a highly complex innovation assignment.
Using these tools, we were able to draw out from the massive amount of data some really good hypotheses. After a while, we also realized that grounded theory is a perfect methodology for codifying both quantitative and qualitative information—and identifying relationships, and clusters of relationships. This was how the CIA used grounded theory and a large volume of data to figure out where Osama bin Laden was hiding:
The connections between data points were visualized as lines, with shorter lines representing stronger connections. The CIA has released to the public nearly 470,000 additional files recovered in the May 2011 raid on Osama bin Laden’s compound in Abbottabad, Pakistan. Former CIA Director Mike Pompeo authorized the release in the interest of transparency and to enhance public understanding of al-Qa‘ida and its former leader.
What does all this have to do with running your business? The answer is in using data connectivity to make better decisions.
In terrorist networks, high degree centrality, or connectedness, may identify influential actors who are most at risk of detection by law enforcement due to redundant ties. The centrality measures for individual network members reveal a notable amount of variability. At the lowest end of the continuum, Salem Sa’ed Salem bin-Suweid is connected to only two actors, which gives him a 3.7 percent connectedness rating.
Conversely, Said Bahaji is linked with 48 others, an 88.9 percent connectedness rating. These relative differences in connectedness have advantages and disadvantages. Bahaji may be most able to significantly influence the network, but he also has the greatest exposure and so is most vulnerable to detection. On the other hand, bin-Suweid is least susceptible to detection, but he is also the most isolated and therefore the least able to exert leverage.
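The percentages quoted here are consistent with normalized degree centrality, in which an actor's number of direct ties is divided by the number of other actors in the network. What follows is a minimal sketch in Python, not the analysts' actual tooling. The network size of 55 is an inference from the quoted figures (2 of 54 possible ties is about 3.7 percent, 48 of 54 about 88.9 percent) and is not stated in the source; the function names and the toy network are hypothetical.

```python
def degree_centrality(edges, actors):
    """Normalized degree centrality: direct ties / (number of actors - 1)."""
    neighbors = {actor: set() for actor in actors}
    for source, target in edges:
        if source != target:  # ignore self-ties
            neighbors[source].add(target)
            neighbors[target].add(source)
    others = len(actors) - 1
    return {actor: len(neighbors[actor]) / others for actor in actors}


def percent_connected(ties, network_size):
    """Connectedness as a percentage, rounded to one decimal place."""
    return round(100 * ties / (network_size - 1), 1)


# Hypothetical toy network: "A" is tied to everyone, "D" and "E" to one actor each.
actors = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("B", "C")]
scores = degree_centrality(edges, actors)

# Assumed network size of 55, inferred from the percentages in the text.
ASSUMED_NETWORK_SIZE = 55
lowest = percent_connected(2, ASSUMED_NETWORK_SIZE)
highest = percent_connected(48, ASSUMED_NETWORK_SIZE)
print(scores)
print(lowest, highest)
```

In the toy network, "A" plays the Bahaji role: maximum influence and maximum exposure. "D" and "E" sit at the isolated end of the continuum.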
We learned from this data that we could take this technique in a new direction—not using just source, target, and relationship, which is the cornerstone of grounded theory, but rather taking perspectives on clustering on several levels.
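To make the source, target, and relationship idea concrete, here is a minimal sketch in Python. It is one possible reading of clustering on several levels, not our production method: each record is a triple, and clusters are simply groups of items connected by ties, computed once over all relationships and once over a single relationship type. The interview data, the relationship labels, and the function name are hypothetical.

```python
from collections import defaultdict


def clusters(triples, relationship=None):
    """Group nodes into connected clusters, optionally using one relationship type only."""
    graph = defaultdict(set)
    for source, target, rel in triples:
        if relationship is None or rel == relationship:
            graph[source].add(target)
            graph[target].add(source)

    seen, result = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        # Depth-first walk collecting everything reachable from this node.
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(graph[node] - seen)
        result.append(sorted(group))
    return result


# Hypothetical coded data: (source, target, relationship)
triples = [
    ("interview_1", "pricing", "mentions"),
    ("interview_2", "pricing", "mentions"),
    ("interview_3", "onboarding", "mentions"),
    ("pricing", "onboarding", "contradicts"),
]

all_clusters = clusters(triples)                  # every relationship counts as a tie
mention_clusters = clusters(triples, "mentions")  # one "level": mentions only
print(all_clusters)
print(mention_clusters)
```

Looking at all ties yields one large cluster, while restricting the view to "mentions" splits it into a pricing group and an onboarding group. Changing the perspective changes which clusters appear.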
The enhanced clustering technique is now widely used by cutting-edge technology companies like Netflix, which is most interested in discovering what customers like to watch and what is likely to keep them glued to the platform. Netflix can only justify its prices by keeping the user base enormous and engaged.
Netflix monitors which scenes make viewers hit the pause button, which scenes they like to replay, what they skip over, and more. Using that data, Netflix designs new shows around crowdsourced preferences. This is really digital anthropology on a massive scale, all to keep viewers engaged and binge-watching.
It reminds me of the film Scrooged, in which a TV executive discovers that a majority of his viewers have cats, so he creates a detective program where the protagonist dangles a string everywhere he goes.
Grounded theory is just one of the many ways my team and I have been handling, codifying, comparing, and contrasting vast amounts of complex quantitative and qualitative data to generate hypotheses and form tested theories. Those theories can be used on a deeper level to enhance understanding and to make recommendations that are tangible, measurable, trackable, and actionable.
The secret is no secret at all. It is the same method that science has used for centuries to find verifiable truths about the world. Bit by bit, this original research forms the underpinning of every technology we use to navigate through our daily lives, from cloud-based apps that tell us where we need to go next, to the satellite-based GPS that tells us how to get there. The mechanism science uses is the Universal Undo Button, in the form of hypothesis-based experimentation and data analysis.
A single success tells you nothing. Only repeated successes or failures can tell your company which is the best way to proceed. You don’t need to be stuck in an eternal loop of trial and error like Groundhog Day. Instead, a sophisticated data analysis of crafted experiments can provide a deeper understanding of the nature of whatever you are testing. Successful innovation requires you to hit Undo over and over to discover the precise combination of capabilities that will bring your next innovation to life and thereby prolong the lifespan of your company.