Data Mining & The Cognitive Hierarchy

Data mining is something of a misnomer: the term describes the discovery of patterns within a dataset, as opposed to finding the actual data in a dataset. “The types of patterns decision makers try to identify include associations, sequences, clustering, and trends” (Kendall, 2013). One way to understand how this works is to look at the role data plays in the cognitive hierarchy of an organization.

Although the cognitive hierarchy was pioneered prior to automation and is primarily concerned with delivering knowledge to decision makers, it still has value in today’s big data discussions. In the first layer, data exists but is not processed until it moves to the next layer. This initial processing can include basic database sorting and the application of metadata.

In the second layer, information is analyzed, and this is the layer where data mining takes place. Data mining provides the analysis that can be shared as knowledge among stakeholders and decision makers so that they can make decisions. The hierarchy concludes when the decision has been made and is transmitted to all the stakeholders.
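To make the second layer a little more concrete, here is a minimal sketch of mining one of the pattern types named earlier, associations. It simply counts how often two items appear together in the same purchase record. The function name `pair_counts` and the sample records are my own illustrative assumptions, not something taken from Kendall and Kendall, and real data mining tools use far more sophisticated methods.

```python
from collections import Counter
from itertools import combinations


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so each pair has one canonical form.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts


# Hypothetical purchase records (the raw data in the first layer).
baskets = [
    ["domain name", "web hosting"],
    ["web hosting", "ssl certificate", "domain name"],
    ["ssl certificate", "web hosting"],
    ["domain name", "web hosting"],
]

# The pair that co-occurs most often is a candidate association
# to pass up the hierarchy as knowledge for a decision maker.
top_pair, top_count = pair_counts(baskets).most_common(1)[0]
print(top_pair, top_count)
```

The raw records correspond to the first layer, the counting step to the analysis in the second layer, and the most frequent pair is the kind of finding that would be handed to a decision maker.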

Since this pyramid was designed prior to today’s automation, it has generally fallen out of vogue. I believe it remains a good reference point for organizations that focus heavily on delivering meaningful content to a decision maker.

According to Kendall and Kendall, modern data mining emerged “from the desire to use a database for more selective targeting of customers.” The decision techniques used for presenting this information to customers have also become automated, and that automation necessitates further refinement. In relation to the cognitive hierarchy, a customer is a decision maker who must pass judgment on the knowledge presented to them. In online advertising, the decision techniques that present ads to customers produce plenty of hits and misses. Let me share an example.

I listened to a podcast on my phone (local audio content downloaded without going through any analytical tools). Because the content was local, no metadata was created linking the topic of the show to my listening. I didn’t see any ads based upon the topic of the show. One of the topics discussed was web hosting using DigitalOcean. A few days later I went to their site using Google Chrome and signed up. After that I started seeing ads for DigitalOcean on other websites.

This is a hit because the system recognized that I am someone interested in web hosting, but it’s also a miss because it didn’t identify that I had already purchased the service from the advertised company. This meant the advertised company spent money on ads designed for acquiring new customers that were displayed on an existing customer’s screen. This inefficiency translates to a lower ROI for the advertised company and, if the ROI falls low enough, can cause it to take its business elsewhere. If the advertising company loses enough revenue this way, it will likely adjust the decision techniques built into its algorithms to ensure it is delivering the most competitive product possible for its customers.
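The kind of adjustment I have in mind can be sketched in a few lines. This is only an illustration under my own assumptions: the `Visitor` record, its fields, and the `acquisition_audience` function are hypothetical and do not describe how any real ad network works. The idea is simply to exclude existing customers before an acquisition ad is shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Visitor:
    visitor_id: str
    interests: frozenset   # topics inferred from browsing (assumed field)
    is_customer: bool      # already signed up with the advertised company (assumed field)


def acquisition_audience(visitors, topic):
    """Return IDs of visitors interested in `topic` who are not yet customers."""
    return [
        v.visitor_id
        for v in visitors
        if topic in v.interests and not v.is_customer
    ]


# Hypothetical visitors: v1 mirrors my case (interested, but already signed up).
visitors = [
    Visitor("v1", frozenset({"web hosting"}), is_customer=True),
    Visitor("v2", frozenset({"web hosting", "podcasts"}), is_customer=False),
    Visitor("v3", frozenset({"gardening"}), is_customer=False),
]

print(acquisition_audience(visitors, "web hosting"))
```

In this sketch the interest match alone would have targeted both v1 and v2 (the hit), while the customer check removes v1 and avoids the miss described above.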

The waste in this example is an obvious disadvantage of the decision techniques used in presenting the content. This disadvantage is offset by the advantages of the system as a whole. The automated system has a very low setup cost for both the advertising and advertised companies, and it is easy for both parties to use.

There is also the advantage of identifying the market of interested customers. Although broadcast media content such as the NFL’s Super Bowl reaches a large audience of general viewers, that large audience has less appeal for specialized products that have a narrower potential customer base. Having the space to advertise specialty products is an accomplishment brought about through effective data mining.

A final advantage of the automated decision techniques used today is the speed of adjustment. Although the system is complex, applying improvements across it at scale can be as easy as changing a few lines of code. Advertising channels with higher overhead, such as printed fliers, aren’t able to respond as rapidly to changes.

Although somewhat out of vogue, the cognitive hierarchy is still a valuable visual tool for understanding the processing and presentation of content relevant to decision makers, even in the fast-adjusting era of big data.