Twitter’s algorithm brouhaha, and why enterprises should be algo-wary

SUMMARY:

Twitter’s algorithm controversy is theater of the absurd, but enterprises should still be wary of machine learning gone awry.

Twitter’s algorithmic kerfuffle can be viewed on three levels: a PR how-not-to, an overblown hypefest for a fading platform, and a cautionary tale.

All three are true. But for enterprises embroiled in data science initiatives, the cautionary tale carries the most weight. We hear stories of how algorithms, well-intentioned or not, shape reality in serious ways that raise design and process issues. Celebrities in bunny slippers bloviating about their Twitter timelines may be comical, but protecting ourselves from algorithmic filters is not trivial.

Let’s start with Twitter, whose PR timeline has been absurd.

To be fair, what Twitter is planning is not a massive overhaul – that’s what Dorsey’s calming tweets likely alluded to. For now, the option must be turned on in users’ settings. There is already an algorithmic feature all Twitter users see: the “While you were away” tweets. Both the new algo tweets and “While you were away” can be swiped aside to view the real-time stream, which appears chronologically as always.

For those not familiar with Twitter, chronological tweets are a big deal. Twitter’s main differentiator is the live immediacy it brings to events, and exact chronology is central to that. Ideally, citizens on the ground become reporters as they share their immediate reactions to events in progress. In practice, we also see horrifying herd-like behavior, such as the tarring and feathering of the wrong suspect during the Boston Marathon bombings (see the PDF link to a study of this debacle).

Enterprises rely on Twitter hashtags for social event streaming, so any shift in tweet chronology matters. I’d argue that enterprise event “hashtags” are already too polluted with marketing pimpage and peacock feathers to be useful, but those marketers who rely on event streams for KPIs or sentiment analysis should be tracking these changes closely.
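To make those stakes concrete, here is a minimal sketch of a hashtag sentiment pipeline in Python, using the TextBlob library for scoring. The hashtag and tweets are invented, and in production the tweets would come from the Twitter API; the point is that these KPIs implicitly assume a chronological, unranked stream.

```python
from textblob import TextBlob

# Invented sample tweets from a hypothetical event hashtag.
# In production these would arrive from the Twitter API, in the order
# the platform returns them, which is exactly what an algorithmic
# timeline could change.
tweets = [
    ("09:01", "Keynote at #HypotheticalConf was genuinely great"),
    ("09:14", "Wifi is down again at #HypotheticalConf, frustrating"),
    ("09:30", "Loving the product demos at #HypotheticalConf"),
]

# Sentiment polarity runs from -1 (negative) to +1 (positive).
for timestamp, text in tweets:
    polarity = TextBlob(text).sentiment.polarity
    print(f"{timestamp}  {polarity:+.2f}  {text}")
```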

The bigger fight is over what type of reality we want. Facebook already serves up an algorithmic reality, acceptable to many but disturbing to others. Yes, you can manipulate your Facebook algo and it “learns” from your behavior, but ultimately it’s your reality filtered through Facebook. Hardcore Twitter users see Twitter as the only major social media outlet where you still control what you see, and where, in theory, every voice has some level of equality in your stream.

Granted, these are tenuous ideals that influencer rankings like Klout have been undermining for years. But you can see how the prospect of a shift away from chronological purity would inflame passions.

Algorithms and surveillance – a controversial mix

Yesterday, Ars Technica UK published a sensational/disturbing piece, The NSA’s SKYNET program may be killing thousands of innocent people. The article appears credibly researched (based on a review of previously leaked Snowden documents), though I am not in a position to verify its claims. The piece alleges that the NSA uses algorithms to rank likely terrorists to be targeted in drone strikes in Pakistan, based on big data analysis of Pakistani mobile phone records. According to the documentation reviewed by Ars Technica, the NSA uses the “random forest” approach, assigning a numerical score to each person with a scoring method similar to spam filters:

The random forest method uses random subsets of the training data to create a “forest” of decision “trees,” and then combines those by averaging the predictions from the individual trees. SKYNET’s algorithm takes the 80 properties of each cellphone user and assigns them a numerical score—just like a spam filter.
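For a concrete picture of the mechanics that quote describes, here is a minimal sketch in Python with scikit-learn. The data and labels are entirely invented; only the shape of the approach, a random forest producing a per-person score, follows the Ars Technica description.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical stand-in data: 1,000 phone users, each described by
# 80 behavioral properties (the number Ars Technica reports).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 80))
y = rng.integers(0, 2, size=1000)  # invented labels, illustration only

# A random forest trains many decision trees on random subsets of the
# data and averages their predictions.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X, y)

# predict_proba yields a numerical score per user: the averaged vote
# of the trees, analogous to a spam-filter score.
scores = forest.predict_proba(X)[:, 1]
print(scores[:5])
```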

Then “extremists” are classified based on those scores:

SKYNET then selects a threshold value above which a cellphone user is classified as a “terrorist.” The slides present the evaluation results when the threshold is set to a 50 percent false negative rate. At this rate, half of the people who would be classified as “terrorists” are instead classified as innocent, in order to keep the number of false positives—innocents falsely classified as “terrorists”—as low as possible.
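That thresholding step can be sketched too. Continuing the invented example above, this picks the cutoff that yields a 50 percent false negative rate; again, an illustration of the tradeoff the slides describe, not the NSA’s actual procedure.

```python
# Pick the cutoff below which half of the true positives fall. That is
# a 50% false negative rate, trading missed positives for fewer false
# positives.
pos_scores = scores[y == 1]
threshold = np.median(pos_scores)

flagged = scores >= threshold
false_positives = int(np.sum(flagged & (y == 0)))
print(f"threshold={threshold:.3f}, false positives={false_positives}")
```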

Ars Technica was not able to determine if other considerations besides machine-generated scores were used to rank drone strike targets. My views against drone strikes aside, I hesitate to condemn any ranking method until I fully understand what other factors, including human judgment, were integrated into the process.

The piece quotes critics of the NSA’s use of algorithms. The SKYNET program uses metadata to determine the daily routines of people under surveillance, including shared contacts and traveling patterns. One potential flaw is the behavioral assumptions: it appears the NSA scores suspects against a persona it believes distinguishes extremists from ordinary citizens. Reviewing the NSA’s internal presentation slides, Ars Technica writes:

The program, the slides tell us, is based on the assumption that the behaviour of terrorists differs significantly from that of ordinary citizens with respect to some of these properties. However, as The Intercept’s exposé last year made clear, the highest rated target according to this machine learning program was Ahmad Zaidan, Al-Jazeera’s long-time bureau chief in Islamabad.

Another problem: machine learning depends on training data built from known target profiles, and in this case there are few “known terrorists” to feed into the algorithm. Plenty of ethical AND tactical questions are yet to be answered.
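This is a textbook class-imbalance failure, and it is easy to demonstrate in miniature. In the invented sketch below, a model evaluated on plain accuracy looks excellent while learning essentially nothing about the handful of positives.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Invented data: 10,000 users with only 7 labeled positives, roughly
# the scale mismatch the article describes.
rng = np.random.default_rng(1)
X = rng.normal(size=(10000, 80))
y = np.zeros(10000, dtype=int)
y[:7] = 1

# Plain accuracy rewards predicting "innocent" for everyone:
print(f"always-negative accuracy: {1 - y.mean():.4f}")  # 0.9993

# A real model's cross-validated accuracy is barely distinguishable
# from that baseline, so the headline score tells us almost nothing
# about how well the 7 positives are actually identified.
model = RandomForestClassifier(n_estimators=50, random_state=0)
print(f"model accuracy: {cross_val_score(model, X, y, cv=5).mean():.4f}")
```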

Why enterprises should be algo-wary – an HR example

For most enterprises, the algorithmic stakes aren’t this high, but they aren’t a small matter either. Our Brian Sommer penned a series of HR-related pieces showing how well-meaning algorithms can lead to ineffective – and sometimes discriminatory – hiring practices:

In “You’re not our kind of people,” Sommer cautions against confusing correlation with causation. He cites the hypothetical of someone from a “large, rural, devoutly religious, mixed race family that recently immigrated to the United States from Tongo,” who is applying for a restaurant position at a fast-food chain in mid-town Manhattan. He or she might be algorithmically excluded because their background characteristics do not correlate:

Chances are you would be the only person from Tongo that this firm has ever received an application from. Moreover, the Recruit Retention Algorithm tool will not score you well as your value-set, education, etc. are not correlating well against parameters often found in their other longer-tenured restaurant workers as those workers may have different values, different education, etc. Should this restaurant chain pass on your application? I would hope not but a low retention correlation score could keep you from even being considered.
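Sommer’s warning is easy to reproduce in miniature. Below is a hypothetical sketch, with invented features and data, of a retention model trained only on incumbents; an applicant whose profile sits far outside that training distribution gets scored by data that says little about them.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Invented incumbent data: five numeric features standing in for
# education, locality, values-survey answers, and so on.
rng = np.random.default_rng(2)
incumbents = rng.normal(size=(500, 5))
retention_years = 2 + incumbents[:, 0] + rng.normal(scale=0.5, size=500)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(incumbents, retention_years)

# An applicant unlike anyone in the training set, such as Sommer's
# hypothetical candidate, falls far outside the data. The forest can
# only score them via whichever incumbents land in the same leaves,
# which says little about this applicant's real prospects.
novel_applicant = np.full((1, 5), 4.0)
print(model.predict(novel_applicant))
```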

Sommer cites a real-world example of a vendor who designed a hiring solution based on the client’s investment banker profile (“finance majors who rowed crew at Harvard or Yale”). Sommer eviscerates this approach:

I did not react well to this… Just look at the successful IT business CEOs who didn’t even graduate college and you’ll see that a degree from a prestigious university is not necessarily a predictor for future success. I wonder how many non-standard people wouldn’t have had the careers they experienced if these algorithms (or the people who will use them) existed in prior years. Take the case of Chester Nimitz: “Despite being reared well away from any ocean in the hills of South Central Texas, he would go on to lead a great naval armada to victory, and become this country’s first five-star, fleet admiral.”

I don’t think Admiral Nimitz rowed crew at Harvard.

In the follow-on HR tech lessons post, Sommer posits that companies could face legal exposure through improper use of algorithmic hiring tools:

While on its surface that sounds admirable and cost-effective, the problem with these tools is that they rely on a test database which only includes existing employees. If a company has failed to hire many women or minorities in the past, then very few of them will appear in the solution’s data population and thus will generate a statistically insignificant subset of individuals to establish a meaningful pattern.
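A hedged sketch of the sanity check this implies: before trusting such a tool, count how many training examples each group actually contributes. The column names, counts, and cutoff below are all invented.

```python
import pandas as pd

# Hypothetical training data drawn only from existing employees.
employees = pd.DataFrame({
    "group":    ["A"] * 480 + ["B"] * 15 + ["C"] * 5,
    "retained": [1, 0] * 250,
})

# If a group barely appears in the data, any "pattern" a model finds
# for it is statistically meaningless, and potentially legal exposure.
MIN_EXAMPLES = 30  # invented cutoff, purely for illustration
for group, n in employees["group"].value_counts().items():
    verdict = "ok" if n >= MIN_EXAMPLES else "too few to infer anything"
    print(f"group {group}: {n} examples ({verdict})")
```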

My take

That’s plenty for companies to consider. Machine learning is an asset, but we’re only beginning to understand the ramifications of algo-reality. Ideally, such know-how is baked into the design, and proper human intervention is accounted for where needed.

As for the individual in despair over their lack of algorithmic control, I’ll make another pitch for the virtues of a lovely RSS reader (I’m a Newsblur fanboy; many of our readers use Feedly). The trusty RSS reader is one of the last places where you can build your content universe based on exactly the feeds you want. In a sense, you are building your own algorithm – as it should be. And if you can stomach a shameless plug, you can pull RSS feeds from all 18 of diginomica’s new topic areas. Diginomica won’t be taking away your RSS. What you do with it is, as always, your call. Algo on…
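For the curious, building that “own algorithm” takes only a few lines of Python with the feedparser library. The feed URLs below are placeholders, not diginomica’s actual endpoints.

```python
import feedparser

# Placeholder feed URLs; substitute whatever feeds you actually follow.
feeds = [
    "https://example.com/topic-one/feed",
    "https://example.com/topic-two/feed",
]

entries = []
for url in feeds:
    entries.extend(feedparser.parse(url).entries)

# Keep entries with a parseable date, then sort newest first:
# strictly chronological, no ranking. Your rules, not a platform's.
entries = [e for e in entries if e.get("published_parsed")]
entries.sort(key=lambda e: e["published_parsed"], reverse=True)
for e in entries[:20]:
    print(e.get("published", ""), "|", e.get("title", "untitled"))
```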

Image credit – Data mining concept with vintage businessman and calculator © scandinaviastock – Fotolia.com.

    Comments

    1. Many parallels between enterprise software and social software. Chronological MRP, the ERP order management core, evolved into algorithmically ranked production scheduling. Twitter’s chronological timeline, its core algorithm, is evolving into a ranked social timeline.

      Woe betide any enterprise system that doesn’t allow algorithmic filters to be tuned from chronological to ranked. So goes Twitter?

      1. Jon Reed says:

        Yes, some interesting parallels, Clive. Also some important differences. But I do think radical changes to the ERP core will come in time; some already have. Twitter is a bit of a unique case in that chronology is essential to the user experience, at least for many; Twitter has even affirmed this in its own statements. ERP users, by contrast, ALWAYS wanted more strategic value, real-time insight, and automated/intelligent processes than vendors could provide – even in the 90s. And ERP vendors of all flavors are doing their utmost to fulfill those demands now, via cloud, collaboration, and machine learning – with varying degrees of success. – Jon

        1. Jon, very inspiring to note the important differences.

          Twitter evolved the core. ERP avoided evolving the core (for a very long time).

          Twitter applied ML to its core. ERP is adding ML on top of its core.

          Twitter evolved from chronological to ranked (Google is ranked). ERP added features on top of its core G/L, MRP, and item master.

          Twitter learned to create its own demand. ERP stopped at forecasting demand.

          Or is ERP + ML a radical evolution of the core to create certainty of demand?