The dangers of machine-driven content publishing


Machine learning is opening up new vistas for media. But are we being deluded? That all depends on what you’re looking for in the first place.

Performance metrics on election content

As we at diginomica spend more time exploring how content performs and how best to optimize the reader experience, I’ve found myself dusting off old statistical skills I thought I had left behind many years ago.

It’s an interesting area of study that has implications for any kind of content. Speaking personally, I am fascinated by the interplay between machine learning techniques, emerging statistical models and the inferences that can be drawn from them.

It is very early days and there are plenty of ways to experiment. However, regardless of your approach to these topics, a few risks are emerging.

First, assigning cause and effect is a dangerous game at the best of times, but a couple of things are happening that give me pause. Consider this thought experiment, which looked at recent political media coverage:

In the midst of today’s 24-7 news cycle, most journalists … often find themselves choosing topics that are convenient to write about. Imagine you’re a journalist in front of a blank screen, thinking about your next story, and faced with intense pressure to pump out content. There may be no clear breaking news on Clinton, Sanders, Cruz, or Kasich — so writing about these candidates may require you to conduct research or reach out to voters. On the other hand, your Twitter feed is full of the ‘events’ that Trump so routinely creates and which feed personality-driven celebrity journalism: politically incorrect comments, bogus claims, and far-fetched promises.

In fact, one could argue that it’s not only more convenient to write about Trump, but that it’s more profitable. If you look at the problem from a return-on-investment perspective, you might argue along the following lines: it’s so much easier to write on Trump that I could write ten articles on him or five on other candidates. Even though Trump articles receive fewer page views per article, the difference is small enough that it still makes sense to just write on Trump, because I’ll end up with more page views overall.

The results of the thought experiment above lead me to believe that it’s exactly this line of thought that drives the media to write more articles on Trump than all other candidates combined. At the end of this investigation, we’re left wondering whether it’s really easier to write articles on Trump, or that’s just an attitude adopted by many journalists. After all, at any moment in time there are dozens of possible topics one could write about any candidate, and how “hard” it is to write about any of these topics depends largely on the creativity and resourcefulness of journalists, themselves.
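The return-on-investment argument in that thought experiment comes down to simple arithmetic. Here is a minimal sketch in Python. Only the ten-versus-five article counts come from the quote; the page views per article are invented purely for illustration, and the function names are my own.

```python
def total_page_views(articles: int, views_per_article: int) -> int:
    """Total page views earned from a batch of articles."""
    return articles * views_per_article


def break_even_ratio(easy_articles: int, hard_articles: int) -> float:
    """Fraction of the 'hard' article's views that each 'easy' article
    must earn for the two strategies to tie on total page views."""
    return hard_articles / easy_articles


# Article counts from the quote: ten Trump pieces in the time it takes
# to write five pieces on other candidates.
TRUMP_ARTICLES = 10
OTHER_ARTICLES = 5

# Hypothetical per-article page views (assumed, not measured).
TRUMP_VIEWS_PER_ARTICLE = 900
OTHER_VIEWS_PER_ARTICLE = 1000

trump_total = total_page_views(TRUMP_ARTICLES, TRUMP_VIEWS_PER_ARTICLE)
other_total = total_page_views(OTHER_ARTICLES, OTHER_VIEWS_PER_ARTICLE)
threshold = break_even_ratio(TRUMP_ARTICLES, OTHER_ARTICLES)

print(f"Trump strategy: {trump_total} views")
print(f"Other candidates: {other_total} views")
print(f"Trump wins while each article earns more than {threshold:.0%} of the alternative")
```

Under these assumed numbers, each Trump article could lose up to half its audience relative to the alternative before the volume strategy stops paying off on page views alone. That is exactly why a raw page view count says so little about quality.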

It seems we live at a time when the ability to apply critical thinking to important topics is being lost in the mêlée to succeed in what we at diginomica see as the zero-sum game of chasing page views. That in turn means that much of what passes as media is not being mediated based upon content quality or the application of facts. Instead we see an ongoing skewing towards headline-grabbing clickbait. Machines can readily take over that process. It is already happening.

Worse still, we observe a degree of homogeneity that almost guarantees mutually assured destruction where only the biggest, loudest foghorns ‘win.’ So of course that analysis makes sense both empirically and at the gut level. Here is a perfect example of what I mean as it relates to the new Kindle. Out of 18 headlines, only one offers a substantial variation of the PR. (see image left)

Elsewhere, I have been told that some Wall Street quants are getting inventive in their mining of headlines in an effort to look for positive signals in the marketplace. The theory goes that feel-good sentiment can be used to drive market pricing up. In those terms, you could almost argue that this is a form of price fixing. The fact that so many analysts maintain bull positions, often on the flimsiest of reasoning, and then present them with rosy headlines, has the feel of an engineered self-fulfilling prophecy. It is deeply worrying.

From a consumer perspective it is not hard to see how this slots neatly into the Facebook feel-good narrative of wanting everyone to play nice with one another while displaying their kitty antics.

None of that feels real. It isn’t real. There is no such thing as a one way ticket to some imagined heaven that can be machine driven. Life consists of up and down cycles and that is equally true of media and content. You only need to observe the heartbeat cycles of daily viewing from any media property to recognize that.

But…if you’re in the business of media production that has some lead generation or other sales/marketing motive attached, then you can quickly see where this goes.

If I want to pimp a stock price, product, service or whatnot, then all I really need to do is deluge enough hard-pressed media people with slap-happy content and my job is done. Or is it? I don’t see that as sustainable any more than I see taking a 100% bearish view of the market as sustainable. There is a time and a place for everything, but at some point reality has to kick in or you end up with the equivalent of a sub-prime crisis.

Right now we are in a skewed and arguably false market where feel-good content, whether generated by machine or typed the good old-fashioned way, rules the roost most of the time. It is deeply unhealthy.

As an example, I listened in on a call the other week with a large company, knowing full well that there is some bad news wafting around in the form of widely touted lawsuits. The company headed it off right from the get-go with a blanket denial. Not one analyst chose to broach the topic when that should surely have been the elephant in the room. Instead, they preferred to congratulate the company on its rosy forward guidance.

Rather than mining for page views as the lowest common denominator, I am much more interested in mining to better understand how engagement works, regardless of the kind of content under review. That’s because engagement tells us much more about reality than any number of views or Twitter/Facebook/LinkedIn likes. It is, for example, telling that in the battle for eyeballs in the Trump election, wary media people have finally realized that concentrating upon facts is a much more potent weapon with which to defeat rhetoric than simply bloviating about whatever shocking utterance has come from The Donald.

The same holds true for any content topic, and we can apply that very well in the technology industry that we cover. What we can’t do, however, is mediate for controlled access. That is a secondary tactic that effectively excludes contrarian voices. Trump’s exclusion of certain media is a blunt and obvious example, but we know this happens regularly among technology media.

Got a nice story to write? You’re in. Heading down a critical path? You might not get a sniff. Get enough of those nice stories out there and you are a PR hero. In the long run though, exclusion doesn’t work and ends up being counterproductive. The world is just far too chatty, thriving on rumor and speculation until the facts come into plain view.

I look forward to the day when machine learning helps us develop balanced content. That doesn’t mean content becomes neutral; you can still have implied agendas. But what I fear, at least for the time being, is a continuation of the marketing-driven mania that is only interested in telling a story if it leaves that feel-good factor wafting in the air like some stale, cheap perfume, or which is predicated upon the idea that the only metrics that matter are those increasingly delusional page views.

Image credit - Screenshot via, shiny brain via fotolia.