Even bigger troubles with big data
Troubles with big data are starting to emerge. I have written about these earlier, and warned clients to make sure their small data is working as intended before jumping into big data. If you cannot control your small data, your ERP system, your EIS or your cloud, then forget about BIG data. All you will have is a big mess instead of a small mess. Despite companies spending hundreds of millions of dollars on systems renewal efforts, user satisfaction remains stubbornly low. As I say in my book ‘The 5-STAR Business Network’:
Not too many years ago, a very large corporation operating worldwide made news by downgrading its earnings expectations due to setbacks in implementing its supply chain system. The expectation was that the new system would reduce the new production cycle from 1 month to 1 week. Furthermore, it would better match the demand and supply of its products to place the correct products in the right locations and quantities, all at the right time – a very lofty goal. The company spent an enormous amount of money, exceeding US $400 million, in order to achieve its aim. However, the software system ‘never worked right’. It caused the factories to crank out too many unpopular products and not enough of the trendier ones in high demand. While announcing the earnings downgrade, the CEO asked the rhetorical question, ‘Is this what we get for $400 million?’
The market analysts were not surprised. One respected market analyst [AMR] commented, ‘fiascos like this occur all the time but are usually kept quiet unless they seriously hurt the bottom line.’ Another respected market analyst commented that while the CEO made it sound as if it was a surprise to him, if he did not have checkpoints for the projects, he did not have control over his company. A third analyst commented that companies are confused by escalating market hype and too often underestimate the complexity and risks. Another [Forrester Research] commented, ‘when the software projects go bad companies are more likely going to scurry up and cover it up because they fear that they are the only ones having trouble. But far from it; our conversation and research reveals this company was not unique or the only one having this kind of trouble’.
Despite their lofty goals, many large information technology deployment projects derail. It takes time for the word to filter out because, in most cases, the executives involved are far too embarrassed to talk about what happened. They do mutter among themselves. After several similar instances the mutterings become more vocal, and a trend emerges in which a number of people start talking about the shortcomings of the system itself, of the implementation process, or of the time taken for implementation. Because the cost of this failure is so high – greater than $400 million in the above case – it is instructive to understand the real root causes of this failure.
All the above problems with small data are only multiplied big time when they apply to big data. However, this blog post is not about these small problems. Most companies survive these small problems by stumbling through them. Now even BIGGER problems are emerging with Big Data. Target was always one of the poster children of big data. Charles Duhigg’s book and several newspaper articles highlighted its capability for predictive behavioural scoring to maximise revenue.
Kashmir Hill, writing in Forbes magazine online in February 2012, cited the New York Times in an article titled ‘How Target Figured Out A Teen Girl Was Pregnant Before Her Father Did’:
Target assigns every customer a Guest ID number, tied to their credit card, name, or email address that becomes a bucket that stores a history of everything they’ve bought and any demographic information Target has collected from them or bought from other sources. [They] ran test after test, analyzing the data, and before long some useful patterns emerged. … Take a fictional Target shopper named Jenny Ward, who is 23, lives in Atlanta and in March bought cocoa-butter lotion, a purse large enough to double as a diaper bag, zinc and magnesium supplements and a bright blue rug. There’s, say, an 87 percent chance that she’s pregnant and that her delivery date is sometime in late August.
The anecdote quoted in the article by Kashmir Hill demonstrated the power of predictive business intelligence: a father storms angrily into Target demanding an apology for encouraging his teenage daughter to get pregnant by mailing her coupons for baby stuff, only to retract the demand later when he discovers that she was indeed already pregnant. The same story appears in the New York Times article by Charles Duhigg and in the book it is based on, The Power of Habit: Why We Do What We Do in Life and Business, also by Charles Duhigg. I quoted this example in my book as well, and cited Target’s ability cautiously. At the back of my mind were concerns about data integrity and security – which have now come true. This holiday season, Target was one of the two large retailers that bore the brunt of the hackers. According to a news report from NBC:
Target said Wednesday that the cyber criminals who breached its system used credentials they stole from one of the retailer’s vendors. “The ongoing forensic investigation has indicated that the intruder stole a vendor’s credentials, which were used to access our system,” Target spokeswoman Molly Snyder said in a statement. She declined to elaborate on what type of credentials were taken from the vendor. Meanwhile, the Justice Department is investigating the hacking, Attorney General Eric Holder said Wednesday.
While Target is not the only one to have suffered such lapses, its case is one of the most serious. It reminds me of the joke where a bank robber was asked why he always robbed banks, and he replied that it is because that is where the money is. The news report quoted above shows the magnitude of the theft:
Target has said a breach of its networks during the busy holiday shopping period resulted in the theft of about 40 million credit and debit card records and 70 million other records with customer information such as addresses and telephone numbers.
Target has not yet specified which vendor was responsible for the breach, or whether it was an IT vendor or a supply chain vendor. Target was not the only one, though. Neiman Marcus was another high-profile retailer in a similar situation, albeit on a smaller scale. In fact there were more; from the news report above:
Reuters reported Jan. 23 that the FBI has warned U.S. retailers to prepare for more cyber attacks after discovering about 20 hacking cases in the past year that involved the same kind of malicious software used against Target.
The final point this episode highlights is the axiom that you will always pay for your vendors’ sins. I use the example of BP’s oil rig in my book to illustrate that point. I will write on this aspect of the episode in a later blog post.