The Challenges with Keywords…

Read our latest white paper on the challenges and constraints of using keywords to analyze large volumes of text or complex documents…

In late 2019, Google announced significant but not well-publicized changes to its search algorithm, adding a natural language processing model named “BERT” to make it better at understanding user search queries in the way a human would understand them. But given the ubiquity of Google search and most users’ familiarity with the concepts of keyword search, why would Google enhance its search algorithms with the ability to understand human expressions?

The answer is a simple one that most of us will recognize: using keywords to search content for what you’re looking for can yield large amounts of useless results. Ideally, a keyword search would return just the relevant information you need, with minimal input, in the shortest possible time. But as most of us realize, this is usually not the case.

In business, the situation grows more dire. Traditional media intelligence processes and workflows rely heavily on manual keyword searches or deeply nested Boolean statements. These lengthy and often complex queries can take huge amounts of time to create, update and maintain, which is not only frustrating but also inefficient. For many mid-market businesses whose slim resources are already stretched thin, it is simply not sustainable.

Fortunately, there is a better way. “Virtual assistants” built on Artificial Intelligence (AI), specifically Natural Language Processing (NLP) and Machine Learning (ML), use mathematical algorithms to “read” unstructured content such as news articles, emails and policy documents, which saves time and increases the precision of the results returned. These technologies are more effective because they use human language expression to analyze and understand the sense or context of content. But these AI-powered Virtual Assistants don’t just reduce the clutter of information and results returned; they can also provide a variety of insights and uses that previously weren’t feasible.

The Benefits of AI Virtual Assistants over Keywords

So, let’s look at an example of a problematic keyword search that illustrates the benefits of using AI Virtual Assistants. The most common challenge for users in finding the content they want is a topical subject area that carries some nuance and context. Say you’re searching for news articles, media or even research and policy documents on “river bank”. While the topic seems straightforward to the man on the street, a typical keyword search will produce all sorts of ambiguous and irrelevant results.

Users will quickly realize that results are returned for financial institutions and banks (“River Bank”, River Bank & Trust, Two Rivers Bank, …), but also for definitions of river banks, information on different types of bodies of water, advertisements and websites for towns named River Bank, translations of “river bank” into other languages, “River Bank” by Brad Paisley, and the list goes on…

A Boolean search could help you filter the results, but these types of keyword queries come with particular challenges. Firstly, they can get very long and complicated very quickly.

Here’s a short version of what a search string could look like to filter out some of the other main results for “river bank”:

“river bank” NOT (“financial services” OR “River Bank Wisconsin” OR “Brad Paisley” OR “personal banking” OR “wealth management” OR “flooding” OR “Definition” OR “Riverbanks Zoo & Garden” OR … you get the picture ;).

Now imagine doing that for every topical query, and having to update and maintain these queries as time goes by.

Secondly, and just as importantly, these types of negative keyword queries can inadvertently reduce the recall of the search, excluding results that are highly relevant to the topic the user is focused on. For example, the “flooding” exclusion in the string above would also drop an article about flooding along an eroding river bank. Increasing the precision of your results at the expense of recall can have dire consequences, because important and relevant content is excluded unintentionally.
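The trade-off between precision and recall can be made concrete with a small sketch. The five documents, their relevance labels and the exclusion list below are invented for illustration (the exclusion terms are borrowed from the sample string), and the matching is plain substring search, not the query syntax of any real search engine.

```python
# Toy corpus of (text, is_relevant) pairs. Every document here is invented.
DOCS = [
    ("Erosion is undercutting the river bank near the old mill", True),
    ("River bank restoration project reduces flooding downstream", True),
    ("Volunteers plant willows to stabilize the river bank", True),
    ("River Bank & Trust expands its wealth management services", False),
    ("Brad Paisley performs River Bank at summer festival", False),
]

# Hypothetical negative keywords, taken from the sample search string.
EXCLUDE = ["wealth management", "brad paisley", "flooding"]


def keyword_match(text, phrase="river bank"):
    """Plain keyword search: keep any document containing the phrase."""
    return phrase in text.lower()


def boolean_not_match(text, phrase="river bank", exclude=EXCLUDE):
    """Keyword search with NOT terms: drop documents containing any excluded term."""
    lowered = text.lower()
    return phrase in lowered and not any(term in lowered for term in exclude)


def precision_recall(matcher, docs=DOCS):
    """Return (precision, recall) for a matcher over the labeled corpus."""
    returned = [relevant for text, relevant in docs if matcher(text)]
    total_relevant = sum(1 for _, relevant in docs if relevant)
    hits = sum(returned)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / total_relevant if total_relevant else 0.0
    return precision, recall


# Plain keyword search returns all five documents: precision 0.6, recall 1.0.
print(precision_recall(keyword_match))

# The NOT terms remove both irrelevant documents, but "flooding" also removes
# the relevant restoration article: precision 1.0, recall about 0.67.
print(precision_recall(boolean_not_match))
```

In this toy setup, the exclusion list improves precision only by silently discarding one of the three relevant articles.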

In contrast, using the AI Virtual Assistant to “score” news articles, blog posts, policy documents or any other text items is strikingly simple when placed next to the keyword search above.

The AI Virtual Assistant assigns low NLP scores to the articles, news media and other results that a keyword search alone would return. The context of “river bank” is not present in the full text of these other articles, because their language deals with financial services, famous songwriters, or small cities across the country. So the articles relating to Brad Paisley and Two Rivers Bank would be filtered out, and the AI Virtual Assistant would return just the relevant results after comparing documents across the whole population of content under consideration.
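As a rough illustration of the idea of scoring documents against a topic rather than matching a phrase, here is a minimal sketch that uses bag-of-words cosine similarity. This is a deliberately simple stand-in and should not be read as the method used by any particular product or by BERT-style models, which capture context far more richly. The profile text, the sample documents and the 0.3 threshold are all assumptions chosen for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if not norm_a or not norm_b:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical content profile: words one might associate with the
# geographic sense of "river bank".
PROFILE = "river bank erosion sediment flooding shoreline stream habitat water"


def score(document, profile=PROFILE):
    """Score a document by its word overlap with the topic profile."""
    return cosine(Counter(tokenize(document)), Counter(tokenize(profile)))


# Invented sample documents, one per sense of the phrase.
docs = {
    "erosion": "Erosion along the river bank is washing sediment into the stream",
    "finance": "Two Rivers Bank offers personal banking and wealth management",
    "music": "Brad Paisley sings River Bank on his latest album",
}

scores = {name: score(text) for name, text in docs.items()}

# An arbitrary cut-off for this toy data; real systems would tune this.
THRESHOLD = 0.3
relevant = [name for name, s in scores.items() if s >= THRESHOLD]

for name, s in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {s:.2f}")
print("kept:", relevant)
```

All three documents would match a keyword search for “bank”, and two contain the exact phrase, yet only the erosion article shares enough surrounding vocabulary with the profile to clear the threshold. No exclusion list is needed, so nothing relevant is removed by accident.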

Artificial Intelligence and the Benefits of Document Scoring

This is not to say that keywords are obsolete. Far from it. They are an essential part of any search tool and are very helpful in building out a content profile for AI Virtual Assistants. For example, including the keywords, terms and concepts that are frequently associated with the topic of choice helps the AI Virtual Assistant hone its document scoring to deliver what you are looking for.

The incremental benefits of using AI and NLP over keywords extend beyond capturing documents that are responsive to the profile you provide to the AI Virtual Assistant. A range of possibilities opens up, including creating personalized summaries of documents or groups of documents, extracting highly specific or relevant passages from large multi-topic documents, navigating large repositories of documents without search, prioritizing documents for consideration based on relevance and, most importantly, creating personalized profiles for a recommendation engine that caters specifically to user needs.

Contexture.ai provides the latest AI-powered solutions for customers across multiple industries, with configurable virtual assistants in BillWatch and MediaWatch for the policy, regulatory and mixed media markets. Contact us today to learn more about how your organization can act on the information that matters most.

info@contexture.ai
https://contexture.ai
