Search is Not a Side Hustle

Searching for information online has become as instinctive to most people as tying their shoes. It stands to reason, therefore, that search is gaining ground as the preferred way for business people to find, view and analyse corporate, relational data sets, writes Amit Prakash, co-founder and CTO, ThoughtSpot. Indeed, several analytics vendors, including Tableau and Qlik, are scrambling to add or acquire a search capability.

However, as our new CEO Sudheesh Nair likes to say, “search is not a side hustle!” Just look at Google and Yahoo. Google built search from the ground up. Yahoo bolted it on. And look how that turned out.

Developing the capability to search on numerical data is complex. For it to work as users expect, it must be intrinsic to an analytics system’s architecture. If you’re a buyer, you need to know how to spot genuine relational search. This becomes even more important if you plan to introduce AI capabilities like natural language processing and voice.

All Search is Not Created Equal

First, let’s quickly review the relatively simple approaches that popular services like Google, Amazon, LinkedIn and Facebook use.


Google uses what’s known as “document search.” Web pages, after all, are essentially souped-up text documents. By giving more weight to pages that many other pages link to, Google’s PageRank algorithm helps the most relevant pages rise to the top.

Amazon’s searchable universe is a known set of objects (products) and their properties. Its search engine employs what’s known as “faceted” or “object” search to help customers find relevant products. LinkedIn works in a similar way, except that the objects in their universe are people, companies, and jobs.

Facebook uses yet another technique called “graph search,” which analyses the connections between friends. Facebook graph search lets users specify how broad a search should go within their friend network.

Introducing Relational Search

Relational search is a new approach designed for searching and analysing company data – financial, marketing, sales, supply chain, and other business data sources.

Relational search is a completely different and much harder kind of search problem than those described earlier. Company data, especially for large enterprises, is nothing like the Web documents people search with Google or a network of Facebook friends. The most valuable company data is typically stored in relational databases, private clouds, public clouds, Hadoop clusters, and even spreadsheets.

Let’s explore some of the reasons relational search is harder than other types of search.

Company Data is Complicated


A company’s data spans multiple databases, tables, columns, rows, and keys, with a complex web of relationships between them.

For large global enterprises, this data exists in multiple data centres spread around the world. These sources invariably live in different systems that were originally intended for a limited group of users. A relational search engine needs to crawl all these data sources, correctly identify the relationships, and allow the right people to search and analyse this data. All this must be done without losing the instant gratification people have come to expect from a search bar.
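To make "correctly identify the relationships" a little more concrete, here is a minimal sketch of one way a search engine could work out how two tables connect. It treats tables as nodes in a graph and key relationships as edges, then uses a breadth-first search to find the shortest chain of joins. The table names, column names and the FOREIGN_KEYS list are hypothetical, invented purely for illustration; this is not a description of how any particular product does it.

```python
from collections import deque

# Hypothetical key relationships: (table, column, referenced table, referenced column).
FOREIGN_KEYS = [
    ("orders", "customer_id", "customers", "id"),
    ("orders", "product_id", "products", "id"),
    ("customers", "region_id", "regions", "id"),
]


def build_graph(foreign_keys):
    """Build an undirected graph: tables are nodes, key relationships are edges."""
    graph = {}
    for left, left_col, right, right_col in foreign_keys:
        condition = f"{left}.{left_col} = {right}.{right_col}"
        graph.setdefault(left, []).append((right, condition))
        graph.setdefault(right, []).append((left, condition))
    return graph


def join_path(graph, start, goal):
    """Breadth-first search for the shortest chain of join conditions."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        table, conditions = queue.popleft()
        if table == goal:
            return conditions
        for neighbour, condition in graph.get(table, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, conditions + [condition]))
    return None  # the two tables are not related


graph = build_graph(FOREIGN_KEYS)
path = join_path(graph, "products", "regions")
for condition in path:
    print(condition)
```

A search such as "products by region" would need all three of the joins printed here, even though the user never mentions orders or customers. Real schemas add complications this sketch ignores, such as several possible paths between the same two tables.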

The Results Have to Be 100% Accurate

Sometimes Google gets it wrong. If you search for “cities that are not in Wales”, the first results are pages giving information about all the Welsh cities. But the implications of a bad Google search are trivial. Just reword your search and try again.

In the realm of relational search, there is no room for inaccurate answers. How much revenue you generated last quarter has one singular right answer. Not only is there just one correct answer, but mistakes can also be hard to spot. And they can lead to bad business decisions. The only thing worse than guessing is being convinced by bad insights.

Everyone isn’t Allowed to See Everything

Most of what people search online doesn’t have to be secure. The opposite is true in the enterprise. Security has to take into account the user’s role, department, and sometimes geographical location. It’s easy to hide a column, like salaries, but what if sales managers should only be shown information on their regions?

Add in the fact that most enterprises operate in multiple regions, each with its own privacy and compliance requirements, and it’s clear: row-level security is a must-have for relational search.

And that’s not easy to build.
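The difference between hiding a column and filtering rows can be shown with a small sketch. The User class, the SALES rows and the HIDDEN_COLUMNS policy below are all hypothetical, made up for this example. A production system would enforce these rules inside the query engine rather than in application code, so treat this only as an illustration of the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    role: str
    regions: frozenset  # regions this user may see rows for


# Hypothetical sales data.
SALES = [
    {"region": "EMEA", "rep": "Ana", "revenue": 120, "salary": 50},
    {"region": "APAC", "rep": "Ben", "revenue": 90, "salary": 45},
    {"region": "EMEA", "rep": "Cat", "revenue": 75, "salary": 40},
]

# Hypothetical column-level policy: columns each role may NOT see.
HIDDEN_COLUMNS = {"sales_manager": {"salary"}, "finance": set()}


def visible_rows(user, rows):
    """Apply row-level security (by region) and column-level security (by role)."""
    if user.role not in HIDDEN_COLUMNS:
        return []  # default deny: unknown roles see nothing
    hidden = HIDDEN_COLUMNS[user.role]
    return [
        {col: val for col, val in row.items() if col not in hidden}
        for row in rows
        if row["region"] in user.regions
    ]


manager = User("Dee", "sales_manager", frozenset({"EMEA"}))
result = visible_rows(manager, SALES)
print(result)
```

The EMEA sales manager gets back only the two EMEA rows, with the salary column stripped out. The hard part in practice is that the same rules must also apply to aggregates and search suggestions, so a total or an autocomplete hint never leaks a row the user could not see directly.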

It Must be Fast

Search engines must respond to users instantly (often measured in milliseconds). This means a relational search engine has to do the following in the blink of an eye:

  1. Read multiple keystrokes
  2. Mine a huge morass of data, correctly interpreting the relationships between data elements
  3. Apply security rules
  4. Make relevant suggestions
  5. Translate searches into queries that can be applied to users’ data
  6. Return accurate results

People might be willing to wait two weeks for a report, but they expect search suggestions to be instant.
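As a rough illustration of steps 1, 3 and 4 above, the sketch below keeps a sorted vocabulary of searchable terms and uses a binary search to jump straight to the entries that match what the user has typed so far, dropping any term the user is not allowed to see. The VOCABULARY entries and role names are hypothetical, and a real engine would index data values as well as column names and would rank the suggestions rather than return them alphabetically.

```python
import bisect

# Hypothetical searchable terms, each paired with the role needed to see it
# ("any" means every user may see the term).
VOCABULARY = sorted([
    ("region", "any"),
    ("revenue", "any"),
    ("returns", "any"),
    ("rep salary", "hr"),
])


def suggest(prefix, user_roles, limit=5):
    """Return up to `limit` permitted terms that start with `prefix`."""
    prefix = prefix.lower()
    # Binary search for the first entry that is >= the typed prefix.
    start = bisect.bisect_left(VOCABULARY, (prefix,))
    results = []
    for term, required_role in VOCABULARY[start:]:
        if not term.startswith(prefix):
            break  # sorted order means no later entry can match
        if required_role == "any" or required_role in user_roles:
            results.append(term)
            if len(results) == limit:
                break
    return results


print(suggest("re", {"sales"}))
print(suggest("re", {"hr"}))
```

The two calls return different lists for the same keystrokes, because only the user with the hr role is offered "rep salary". The lookup cost depends on the number of matches rather than the size of the vocabulary, which is the kind of property that makes instant suggestions feasible.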

There’s a Lot to Search

Searching enterprise data isn’t like searching the file system on your laptop. We’re talking about terabytes of data, spread across multiple databases that include thousands of tables.

In Web search, the data volume is huge, but a single search only touches a tiny fraction of that data. If you type “bumfuzzle,” only the Web pages that include that term are included in the search (about 75,500).

In relational search, every data element has to be considered for matches with each search term. And often, large amounts of data must be aggregated, as in the search “revenue last 5 years.”
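To show what a search like “revenue last 5 years” implies, here is a minimal sketch that translates it into an aggregate SQL query and runs it against an in-memory SQLite database. The sales table, its columns, the MEASURES mapping and the tiny search grammar are all assumptions made for this example; a real engine would handle far richer searches and far more data.

```python
import re
import sqlite3

# Hypothetical mapping from search words to aggregate expressions.
MEASURES = {"revenue": "SUM(amount)"}


def translate(search, current_year):
    """Translate '<measure> last N years' into a parameterised SQL query."""
    match = re.fullmatch(r"(\w+) last (\d+) years?", search.strip().lower())
    if not match or match.group(1) not in MEASURES:
        raise ValueError(f"cannot interpret search: {search!r}")
    measure, years = match.group(1), int(match.group(2))
    # The measure name comes from the MEASURES whitelist, so it is safe to inline.
    sql = (
        f"SELECT year, {MEASURES[measure]} AS {measure} FROM sales "
        "WHERE year > ? GROUP BY year ORDER BY year"
    )
    return sql, (current_year - years,)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (year INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(2013, 5.0), (2015, 10.0), (2015, 2.5), (2018, 7.0), (2019, 1.0), (2019, 3.0)],
)

sql, params = translate("revenue last 5 years", current_year=2019)
rows = conn.execute(sql, params).fetchall()
print(rows)
```

Every row from the last five years has to be read and summed to answer the question, whereas a Web search for a rare word only touches the handful of pages that contain it. That is why aggregation performance matters so much here.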

It May Need to Interpret Natural Language

Thanks to the explosion in voice-activated searching via Google, Alexa, Siri and others, there is growing demand for voice-activated analytic queries. This means that relational search will need AI capabilities that accurately interpret and act on natural language commands issued by non-technical business people. More importantly, it needs to parse the analytic intent of a question in order to deliver the singular right answer.

Behind relational search’s deceptively simple experience lies a massive amount of complexity. To test whether a BI system you’re evaluating addresses these complexities, first find out what type of search it employs. Then it’s imperative to test search accuracy, speed, security and scalability on real data sets with real business users via text and voice, if applicable. Putting a system through this rigorous evaluation process will soon reveal whether it’s the ‘real deal’ or if its search capabilities were rushed to market as a side hustle.


 
This article is from the CBROnline archive: some formatting and images may not be present.

CBR Staff Writer

CBR Online legacy content.