Following on from an animated discussion kicked off by Sameer Patel on Facebook, I’ve been pondering the problem of trusting algorithms and/or intuition in decision making. He says:
One of my favorite posts by Andrew McAfee, ever.
Humans face a huge conflict of interest when it comes to judging the validity of big data. Arguably the biggest conflict of interest in recent (information technology transformation) history. Big data and the algorithms that come with it threatens the very raison d’ etre for many that make a living off it.
The McAfee piece to which he is referring is a Harvard Business Review article entitled Big Data’s Biggest Challenge? Convincing People NOT to Trust Their Judgment.
Patel’s post and the ensuing debate have brought my background in accounting into fresh focus while allowing me the opportunity to dig into some fascinating academic research.
I’ve been data driven without appreciating it in a fully conscious fashion pretty much all my professional life. Training as an accountant does that to you. I learned early on that numbers hold magic but few people get that. Math geeks do, some stat geeks do. Most others find the topic too hard. Instead and to this day, most view the accountant in a pejorative sense as ‘the bean counter.’
Maybe that’s true from a pure, regulatory standpoint but then most of the people I know who form the new generation of accountants (and those who are re-inventing themselves) see those required tasks as merely a stepping stone to a future that engages clients/co-workers/partners/colleagues…pretty much everyone who asks a question in a way that positively shapes decision making. So what’s the fuss here and why should you care?
The McAfee problem
I generally find McAfee’s thinking to be problematic. He was the academic who provided legitimacy to much of what we termed Enterprise 2.0 and which for years I disavowed on the basis that ‘content without context in process is meaningless.’
It was a tough position to take as most others who liked the E2.0 idea thought I was being snarky for the sake of it. I kept coming back time and again to the same question: Where is the evidence for your assertions? And while some could show random success, I could see no discernible patterns to suggest that E2.0 was anything other than a marketer’s wet dream.
I took occasional swipes at McAfee because I saw his approach as being overly reliant on the presumed osmotic effect of technology and technology’s supremacy in ushering in a collaborative world. The theory was attractive but the presumptions were flawed since they effectively de-humanised the way technology is adopted in favor of focusing on sparkling outcomes.
Fast forward to today and in quiet moments, even some of the most ardent E2.0 fans will admit they got it wrong. It comes as no surprise then that when I first skim-read McAfee’s treatise I was more than skeptical, especially since he focuses on evidence that suggests algorithms beat out intuition on pretty much every occasion. The old machine v man thing kicked in pretty hard. I said:
Sorry Sameer but much of this is predicated on the wild assumption that algorithmically understanding BD provides better insights. Problem: we can’t even get little data right…predictive is mostly not…why even go there in this kind of academic masturbatory horseshit when the basics remain unsolved. Unless of course you enjoy these kinds of intellectually barren debates?
Even by my standards, that’s pretty rough talk and on reflection represents exactly the knee-jerk, intuition-driven response of which McAfee is critical.
Fortunately, I have learned that if something/someone annoys me at an intellectual level then it’s time to do my own work and discover whether the gut check (intuition – sic) is right or otherwise. And I was encouraged in that by something Rachel Happe said:
It makes me wonder if this is a discussion between people who would rather do things imperfectly and with a human intervention option or those who would like to maximize efficiency but with the risk of running even faster against the brick wall. That may be a bit too extreme but when we put a process into an algorithm we need to be very careful that we are optimizing what we wish for and that we understand the implications of it.
As I researched around the topic and thought more about what McAfee is trying to communicate I realised something that few people want to talk about but which I believe must become part of the open conversation in 2014. Patel says:
Big Data [because that's the tech locus for this topic] is a largely geeky conversation and we haven’t been able to humanize the value proposition. I’m starting to think that its us humans who as both judge and jury, that are the problem in the humanizing part.
He’s partly right. In conversations during a recent consulting engagement I encouraged the client to think first about case-study-driven outcomes rather than the technology. People get the former; the latter is always a learning issue. But that’s not enough either. So where is McAfee right and what else needs researching?
One piece of research he references caught my eye. The abstract from Chris Snijders concludes:
Our findings show that, contrary to what is often assumed, the OSCM-professionals (supply chain) with more expertise do not use less information while assessing, nor are they faster. Instead, our results show that specialized expertise goes with increased certainty about the assessments, and general expertise goes with an increased use of intuitive judgment. However, the net effects of these expertise characteristics on assessment performance are zero. In the case of specialized expertise this is because specialized expertise is itself negatively related to performance. In the case of general expertise this is because the net effects of the use of intuition on performance are zero.
Wow! In essence, the research, which is based upon previously verified decisions, suggests that experience and expertise are not strong indicators of successful outcomes. In fact, neither attribute appears to contribute to better decisions. That’s a shot straight across the bows of any person who has acquired and relies upon skills and expertise. That includes me.
As I read this I was taken back to the scene in Heat where the gang realise they’ve been discovered but have one last big-win heist in play. Robert De Niro’s character wants to go for the job and Tom Sizemore’s character is uncertain. Up to this point, everything has been planned and executed in meticulous fashion with risks assessed and factored into each job. In this scene, uncertainty and risk have become important talking points. Sizemore asks De Niro: “Is it worth the risk?” to which De Niro responds that while it might be worth it for him, Sizemore might want to step away. Eventually, Sizemore decides he’s in for the wholly irrational reason that “For me the action is the juice.” The outcome is catastrophic as both Sizemore’s and De Niro’s characters are eventually hunted down and killed.
The narrative may be a fiction but the broad strokes around how decisions are taken often ring true. In software assessment, for example, we routinely see cases where the objective facts, background and scene setting imply one decision and yet the customer takes what seems like an irrational decision that ignores a good chunk of the most impactful data. In almost every case I’ve seen, such decisions are suboptimal. The consulting flaw is that most of us never challenge those decisions. “We gave them the right advice, WTF if they choose some dipshit other way?” is how most would sum up their conclusions.
Snijders et al.’s research concludes with the suggestion that model-driven decisions need to be considered more closely but recognises the difficulties these face in light of established careers and the perceived value of experts. McAfee believes he has a solution:
So how, if at all, will this great inversion of experts and algorithms come about? How will our organizations, economies, and societies get better results by being more truly data-driven? It’s going to take transparency, time, and consequences: transparency to make clear how much worse “expert” judgment is, time to let this news diffuse and sink in, and consequences so that we care enough about bad decisions to go through the wrenching change needed to make better ones.
It’s pretty brutal but then I confess to the view that the only factor that provides the necessary conditions for transformational change in established business is institutional pain or what I term The IBM or Apple Moment.* You can get partial results with high quality change management but I suspect the outcomes are suboptimal.
Intuition – another word for good luck?
Starting from the obvious fact that professional intuition is sometimes marvelous and sometimes flawed, the authors attempt to map the boundary conditions that separate true intuitive skill from overconfident and biased impressions. They conclude that evaluating the likely quality of an intuitive judgment requires an assessment of the predictability of the environment in which the judgment is made and of the individual’s opportunity to learn the regularities of that environment. Subjective experience is not a reliable indicator of judgment accuracy.
It’s an interesting idea but one which appears to ignore one factor: data-driven track record assessment. How many times have we stood in awe of this or that individual, marveling at their apparent brilliance, only to discover that in truth they were lucky with the timing of a smart but not necessarily original insight? Isn’t the truth that we’d rather attribute some sort of Steve Jobs halo effect to most of what is considered success rather than objectively examine track record and outcomes?
But what about the algo?
In the same Patel driven thread, Adrian Bowles observes:
I’m surprised nobody has brought up evidence based probabilistic analysis of big data as an alternative to the type of algorithms being discussed. I used to teach algorithms & data structures in a comp sci curriculum, but I would rather trust a system like Watson than one that produces “the” answer based on the algorithm authors’ biases. Real cognitive systems are the future, the algorithmic systems discussed here are a painful phase we will get past eventually.
Yep – I’ll go with that. As we stand today, while it may appear that algorithms provide better outcomes than the intuitive hunch, they include bias because…they are constructed by humans. But the research equally shows that algorithms combined with intuitive insights provide even better results. Here I am mindful that McAfee’s approach sails perilously close to a situation where the machine is always right.
Our human imagination tells us repeatedly through sci-fi tales of woe how well that goes! And I only have to think about the hilarity that can ensue from autocorrect typing on today’s smartphones to recognise we are a long way from having ‘perfect’ algorithms.
Contrary to what I originally thought, I believe McAfee is on the right track but his tendency to defer to ‘the machine’ remains a roadblock I struggle to overcome.
On the other hand, Vinnie Mirchandani’s somewhat dismissive view in Predict and Prepare about the role of finance and the CFO is plain wrong-headed. My sense is that if we are looking for leadership in moving the decision making debate forward in the context of large amounts of diverse data then the accounting type is absolutely the person who should be center stage, helping and advising the lines of business on what the data means and how to use it.
Bean counter or not, the accounting mind is well used to abstracting itself from the emotion-driven intuition that is under such close scrutiny. I’m less clear on the role of the so-called data scientist who, as far as I can tell, is mostly a stats geek. Too slavish a view of statistical validity is fraught with danger yet I am far more likely to be persuaded by well reasoned and tested hypotheses when I can see how the results are contextually meaningful. David Dietrich has a more expansive view that I find attractive in this context.
Of course none of this occurs in a vacuum and I would suggest that companies test for themselves in which scenarios they achieve better outcomes as an adjunct to what they already do. There is, after all, nothing better than a parallel run for testing whether one ‘system’ is performing better than another.
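For anyone who wants to try that kind of side-by-side test, here is a minimal sketch in Python of how the scoring might look. It assumes you have logged, for each past decision, the scenario it belongs to and whether the expert call and the model call each turned out to be right. The record layout, the scenario names and the choice of an exact sign test on the cases where the two disagree are my own illustrative assumptions, not anything taken from McAfee or Snijders.

```python
from collections import defaultdict
from math import comb


def compare_in_parallel(records):
    """Tally outcomes per scenario.

    records: iterable of (scenario, expert_correct, model_correct),
    where the last two are booleans. This layout is hypothetical.
    """
    tallies = defaultdict(lambda: {
        "cases": 0, "expert": 0, "model": 0,
        "expert_only": 0, "model_only": 0,
    })
    for scenario, expert_ok, model_ok in records:
        t = tallies[scenario]
        t["cases"] += 1
        t["expert"] += int(expert_ok)
        t["model"] += int(model_ok)
        # Only cases where the two disagree tell us which one is better.
        if expert_ok and not model_ok:
            t["expert_only"] += 1
        elif model_ok and not expert_ok:
            t["model_only"] += 1
    return dict(tallies)


def sign_test_p(expert_only, model_only):
    """Two-sided exact sign test on the disagreeing cases."""
    n = expert_only + model_only
    if n == 0:
        return 1.0  # no disagreements, so no evidence either way
    k = min(expert_only, model_only)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Made-up records purely to show the call pattern.
    sample = [
        ("pricing", True, False),
        ("pricing", False, True),
        ("pricing", True, True),
        ("supplier choice", False, True),
        ("supplier choice", False, True),
    ]
    for scenario, t in compare_in_parallel(sample).items():
        p = sign_test_p(t["expert_only"], t["model_only"])
        print(scenario, t, "p =", round(p, 4))
```

With a handful of cases the test will rarely say anything decisive, which is rather the point: a track record needs enough decisions behind it before either the expert or the algorithm can claim to be the better ‘system’.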
Note: *The IBM or Apple Moment occurs when the top management of a business realise they are facing impending doom – not just a crisis – and that without clear, decisive action, explicitly communicated to the workforce, the business will collapse. Both IBM and Apple have faced those moments in the last 25 years.