NLP Augmented Analytics is the ability to have a conversation with the system, have it understand your questions, and even anticipate and enrich your queries as it learns about your interests, e.g., "Let me see more of that," or "Compare all three."
For more than twenty-five years, the standard for interactive analytics was Business Intelligence (BI) visualizations such as dashboards, but NLP is the next step in ease-of-use. NLP is a much richer capability with applications far beyond just enhancing analytics. Capabilities like Speech Recognition, Machine Translation, Natural Language Generation, Sentiment Analysis, and Automatic Report Generation are applied to extract intelligent information from every kind of data. On an unsupervised basis, NLP can even share exhaustive insights into market sentiment.
NLP has a limiting factor, however. It is not a database or a query processing engine, or a powerful calculation platform. It can understand your question and often provide stunning insight. It can generate a narrative of analysis in words, visualization, or generated speech, but it needs the intelligence of a powerful analytical engine to process the question. NLP can learn. It can extrapolate and provide the ability for you to be more creative, more expansive, and more dynamic in your explorations of data, but it can't count to ten. To complete the functionality, it needs an underlying mesh of analytics engines, databases, curated data storage, and an adequate metadata store to do the heavy lifting.
Requirements of augmented analytics
The whole gestalt of data analytics is "ask a question, get back an answer, and ask a follow-up question." In a data visualization tool, you click and drag. In NLP augmented analytics, you speak. Ultimately, both generate a query that the underlying analytical engine understands.
This experience of conversing with data depends on underlying query response speed. Suppose you ask a question in natural language, or even via visualization drill-down, and have to go to lunch for an hour before you get the answer. In that case, you have far less opportunity to formulate a follow-up and less inclination to pursue curiosity, knowing you could waste hours. The effectiveness of the conversational paradigm depends on a robust back end. The effect of this is even more pronounced in augmented analytics when speaking and listening to your data.
In a conversational mode, natural language processing accepts questions in your spoken language and directs the system to process your questions and even enrich your queries. This is the long-awaited next step of analytics ease-of-use.
Augmented analytics goes beyond that. It reaches out to data beyond the database, stored in other forms, and performs a much broader range of analytics, including statistics, data science, and AI (machine learning and neural networks). It presents the answers not just in interactive visualizations, but also in spoken language and all manner of publication.
To power augmented analytics, the underlying analytical processing engine must:
- Transparently analyze many kinds of data beyond the row and column structured format of the average OLAP database, so all data needed to answer a question can be accessed.
- Scale affordably to handle vast amounts of data, so all data needed to answer a question can be efficiently processed.
- Calculate using advanced analytics capabilities such as geospatial, machine learning, and time series analysis to answer questions about location, patterns, etc., and make recommendations.
- Provide rapid query response with high-performance to maintain a user-comfortable conversational speed.
- Handle many concurrent users efficiently. As NLP removes the barriers to broad data analytics use, the underlying engine's limitations shouldn't put those barriers back up.
Computer - talk to me!
How much non-productive time do we spend with the mouse, a necessary component of a GUI? If you're like most people, you spend many hours a day pushing your mouse around, positioning the pointer to highlight something to copy, getting too much or not enough and having to start over, or typing with both hands and having to stop to pick up the mouse to move it.
For NLP-enhanced business analytics, the conversation may be, "Run the latest pricing analysis and push the results to my phone." The critical thing to remember is that the computer does not understand what you are saying. It can process it and answer, but make no mistake: it's all done with math.
Organizations that offer NLP capabilities don't have to start from scratch. There are open-source Python libraries that software can integrate with, such as spaCy, textacy, and neuralcoref; Spark NLP (from John Snow Labs); and a few in other languages, such as CoreNLP in Java.
I recently did a proof of concept using Pavlov, an open-source library. To complete the picture, I added a data source. Now, we have a complete picture of how NLP Augmented Analytics works in practice.
- I ask, "Run the latest pricing model for product FFP."
- The answer comes back in either computer-generated speech or whatever form of communication desired.
- "Relax the capacity constraint and re-run."
- Answer: "There are currently three capacity constraints. Which do you want to relax?"
- "Domestic manufacturing."
- Answer: "That will stress the supply of component QFT223L9."
- "Relax that capacity constraint without exceeding supply of that component."
- Answer: "Done. Where do you want a copy?"
What happened here is that the system correctly interpreted "that capacity constraint" and "that component" from understanding the context of the conversation. That may seem simple, but it requires the application of some very serious AI, in particular, a Recurrent Neural Network (RNN). An RNN is a neural sequence model, an architecture specifically designed to take previous inputs into account. Plain RNNs are hard to train on long sequences because their gradients tend to vanish or explode, so a gated variant, Long Short-Term Memory (LSTM), is often used instead.
Unlike feedforward neural networks, RNNs can use their internal state (memory) to process input sequences, like the iterative query above. This makes them applicable to tasks such as unsegmented, connected handwriting recognition, or speech recognition.
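To make the idea of internal state concrete, here is a minimal sketch of a plain (Elman-style) RNN cell in pure Python. It is not the model from my proof of concept. The sizes, the random weights, and the one-hot "tokens" A, B, and C are all assumptions chosen for illustration, and nothing here is trained. It only shows that the hidden state after the same final input differs depending on what came before it.

```python
import math
import random


def rnn_step(x, h, W_xh, W_hh, b):
    """One recurrent step: h_new = tanh(W_xh @ x + W_hh @ h + b)."""
    h_new = []
    for i in range(len(h)):
        total = b[i]
        total += sum(W_xh[i][j] * x[j] for j in range(len(x)))
        total += sum(W_hh[i][k] * h[k] for k in range(len(h)))
        h_new.append(math.tanh(total))
    return h_new


def run_sequence(inputs, W_xh, W_hh, b):
    """Feed a sequence through the cell, carrying the hidden state forward."""
    h = [0.0] * len(b)  # the "memory" starts out empty
    for x in inputs:
        h = rnn_step(x, h, W_xh, W_hh, b)
    return h


# Assumed toy dimensions and untrained random weights (illustration only).
rng = random.Random(0)
INPUT_SIZE, HIDDEN_SIZE = 3, 4
W_xh = [[rng.uniform(-1, 1) for _ in range(INPUT_SIZE)] for _ in range(HIDDEN_SIZE)]
W_hh = [[rng.uniform(-1, 1) for _ in range(HIDDEN_SIZE)] for _ in range(HIDDEN_SIZE)]
b = [0.0] * HIDDEN_SIZE

# Hypothetical one-hot "tokens".
A = [1.0, 0.0, 0.0]
B = [0.0, 1.0, 0.0]
C = [0.0, 0.0, 1.0]

# Both sequences end with C, but the history differs.
state_after_a_c = run_sequence([A, C], W_xh, W_hh, b)
state_after_b_c = run_sequence([B, C], W_xh, W_hh, b)
print(state_after_a_c != state_after_b_c)
```

A feedforward network that saw only the final token C would produce the same output in both cases. The recurrent state is what lets a follow-up like "that component" be read in light of the earlier turns.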
Everyday examples that are a little more complicated:
"Show me the sales of accessories."
"Remind me at three o'clock to call Venessa."
"Give me a four-frame visualization of Sales of Accessories: multi-line, Pie, by month stacked column, and sales range."
"Did my propensity model finish?" Not yet.
"Take out Tires and Tubes from that other one."
"Send to Joe with heading Sales of Accessories."
How did the NLP know that I was asking about sales and accessories in a particular version? Because it learns your pattern, and it learns patterns in context from potentially millions of queries. Perhaps this dialogue happened at the end of the month, and the NLP assumed the most current analysis was needed. Notice the comment, "Remind me at three o'clock to call Venessa." It is an out-of-context request, but when it's followed by "Give me a four-frame visualization of sales of accessories: multi-line, Pie, by month stacked column, and sales range," it makes a clean context switch. This may be the essential aspect of Augmented Analytics.
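The context switch can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how a production assistant works: the keyword rules stand in for a trained intent classifier, and `INTENT_RULES`, `KNOWN_SUBJECTS`, `Session`, and `handle` are all hypothetical names invented for this example. The point is only that an out-of-context request gets handled without disturbing the analytics context, so the next follow-up still knows what it refers to.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical keyword rules standing in for a trained intent classifier.
# Order matters: the first matching intent wins.
INTENT_RULES = [
    ("reminder", ("remind me",)),
    ("share", ("send to",)),
    ("analytics", ("show me", "give me", "take out")),
]

# Hypothetical catalog of analysis subjects the system already knows about.
KNOWN_SUBJECTS = ("sales of accessories",)


@dataclass
class Session:
    subject: Optional[str] = None  # the active analytics subject
    reminders: List[str] = field(default_factory=list)


def classify(utterance: str) -> str:
    text = utterance.lower()
    for intent, phrases in INTENT_RULES:
        if any(phrase in text for phrase in phrases):
            return intent
    return "unknown"


def handle(session: Session, utterance: str) -> Tuple[str, Optional[str]]:
    """Route one utterance and return (intent, subject it applies to)."""
    intent = classify(utterance)
    if intent == "reminder":
        # Out-of-context request: record it, leave the analytics context alone.
        session.reminders.append(utterance)
        return intent, None
    text = utterance.lower()
    for subject in KNOWN_SUBJECTS:
        if subject in text:
            session.subject = subject  # an explicit mention updates the context
    # A follow-up that names no subject inherits the active one.
    return intent, session.subject


session = Session()
conversation = [
    "Show me the sales of accessories.",
    "Remind me at three o'clock to call Venessa.",
    "Take out Tires and Tubes from that other one.",
    "Send to Joe with heading Sales of Accessories.",
]
results = [handle(session, line) for line in conversation]
for line, result in zip(conversation, results):
    print(result, "<-", line)
```

In a real system, the learned model replaces the keyword rules, but the design choice is the same: keep the conversational context as explicit state and decide for every utterance whether it updates that state, uses it, or bypasses it.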
Augmented Analytics does require some training on your part, too. You need to learn how to phrase your conversation so the meaning is not ambiguous. This will get easier over time as the software learns how you phrase things, but you need to help initially. Consider this:
"Hey Siri, I'm bleeding bad. Can you call me an ambulance?"
"Neil, from now on, I'll call you 'Neil an Ambulance,' OK?"
This example is from 2016. Apple, getting hundreds of complaints about these misunderstandings, continuously improves the application with advances in NLP. If you spoke this request today, just the tone of your voice would clue Siri that you needed an ambulance.
To do that, Siri not only needs to understand that you need an ambulance, but it also needs to have the ability to dial your phone and communicate with the hospital.
In data analytics, NLP must communicate with data to resolve your question. Once the NLP processing has figured out what the question is, it must provide an answer. What does it need to do?
- Know where the data (or reference material) is.
- Know what format the data is in.
- Devise a way to retrieve the information.
- If an analytical engine can find the information, tell that engine what to do in its own dialect, SQL for example.
- Present the information to you as a verbal answer, a visualization, or other communication forms that make sense for that data.
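The steps above can be sketched end to end with nothing but Python's built-in `sqlite3` module. Treat this as a toy under stated assumptions rather than a real implementation: the `CATALOG` layout, the `parse_question` word matching (a stand-in for the actual NLP step), the table and column names, and the three sample rows are all invented for illustration.

```python
import sqlite3

# Hypothetical metadata catalog: where the data is and what format it is in.
CATALOG = {
    "sales": {
        "location": ":memory:",  # assumed: an in-memory SQLite database
        "format": "sql",
        "table": "sales",
        "measure": "amount",
        "filter_column": "category",
    },
}


def parse_question(question):
    """Toy stand-in for the NLP step: find a known metric and a category."""
    words = question.lower().rstrip(".?!").split()
    metric = next((w for w in words if w in CATALOG), None)
    if metric is None:
        raise ValueError(f"no known metric in: {question!r}")
    # Assumption: the last word of the question names the category.
    return {"metric": metric, "category": words[-1]}


def build_query(parsed):
    """Phrase the request in the engine's dialect (SQL here)."""
    entry = CATALOG[parsed["metric"]]
    # Identifiers come from the trusted catalog; the user's value is bound
    # as a parameter instead of being pasted into the SQL text.
    sql = (f"SELECT SUM({entry['measure']}) FROM {entry['table']} "
           f"WHERE {entry['filter_column']} = ?")
    return sql, (parsed["category"],)


def answer(conn, question):
    """Retrieve the information and present it as a sentence."""
    parsed = parse_question(question)
    sql, params = build_query(parsed)
    total = conn.execute(sql, params).fetchone()[0]
    if total is None:
        return f"I found no {parsed['metric']} for {parsed['category']}."
    return f"Total {parsed['metric']} of {parsed['category']}: {total:.2f}"


# Illustrative data only; these rows are made up for the example.
conn = sqlite3.connect(CATALOG["sales"]["location"])
conn.execute("CREATE TABLE sales (category TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("accessories", 100.0), ("accessories", 250.0), ("bikes", 900.0)],
)

reply = answer(conn, "Show me the sales of accessories.")
print(reply)
```

The split matters more than the details. The language layer only decides what is being asked, the catalog says where and how the data lives, and the analytical engine does the counting.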
NLP Augmented Analytics is poised for some giant leaps forward. NLP technology has grown from a handful of trained models (used for Transfer Learning) to dozens. I learned at the NLP Summit in October 2020 that each of these NLP subject areas has improved in accuracy in just four years from 50% to 90%:
- Sentiment Analysis
- Chatbots and Virtual Assistants
- Text Classification
- Text Extraction
- Machine Translation
- Text Summarization
- Market Intelligence
- Intent Classification
- Urgency Detection
- Speech Recognition
- Speech Generation
NLP can read complex text in many languages and sense your emotions from your voice's tenor or even your written questions. In addition to parsing your questions and comprehending what you ask, it can do all the hard work you would have done in the past: attaching the needed data sources, framing the queries in the dialect of those sources, and publishing the result in a format you prefer. It can even ask you follow-up questions.
NLP will become more and more common in corporate analytics applications.
If you're like me, you've been waiting for Conversational AI since the Jetsons. Augmented Analytics goes a step beyond Conversational and Search. Communicating your request to the resources that can perform complex queries and models is the end of annoying user interfaces.