A Major Machine Learning Update for Search

Related products: Navigation and Search

As many of you know, search has been a top priority of ours for a while. You might have noticed this from our roadmap updates (shared here quarterly!) or past announcements like the new algorithm and snippets and highlighting. We keep coming back to this theme because we know that search is critical to Guru serving you well, and it’s a top area of feedback. In light of this, we’re delighted to share our latest improvement to search for two reasons. First, it’s already having a positive impact on search performance. Second, the technology powering it sets us up really well for improvements in the future, even beyond search!

Cool, what’s this fancy new tech?

This improvement takes advantage of transformers, one of the fastest developing areas within machine learning, and specifically within deep learning. Adding this technology means that Guru can now interpret the meaning of content. 🤯 If you’re following the search industry, this area is called “vector” or “neural” search, and it’s a big step forward. Transformers can interpret meaning because they learn how words are used in various contexts by paying close attention to other words and their order. Once they capture the meaning of a sequence of words, they represent that sequence as a vector (a list of numbers). For example, “bank” has a different vector in “take money out of a bank” than in “along the Schuylkill river bank.” Representing text as vectors then makes it possible to use math to reason about different sets of text: these two pieces of content are similar, these two are not similar, this text is often found alongside this text, etc. AI-powered search is pretty cool, right?
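For the curious, here is a minimal sketch of what “using math to reason about text” can look like. It compares vectors with cosine similarity, one common choice for this kind of comparison. The three-number vectors are made up by hand for illustration (real transformer vectors have hundreds of dimensions), the “withdraw cash” sentence is a hypothetical extra example, and nothing here is Guru’s actual model or code.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical vectors, invented for this example only.
# A real system would get these from a transformer model.
toy_vectors = {
    "take money out of a bank": [0.9, 0.1, 0.0],
    "withdraw cash from my account": [0.8, 0.2, 0.1],
    "along the Schuylkill river bank": [0.1, 0.2, 0.9],
}

money = toy_vectors["take money out of a bank"]
cash = toy_vectors["withdraw cash from my account"]
river = toy_vectors["along the Schuylkill river bank"]

# Two sentences about money should score closer to each other
# than a money sentence and a river sentence.
sim_money_cash = cosine_similarity(money, cash)
sim_money_river = cosine_similarity(money, river)
print(sim_money_cash > sim_money_river)
```

With these toy numbers, the two money-related sentences end up far more similar to each other than either is to the river sentence, even though two of the three contain the word “bank.”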

How does this help Guru’s search?

We know from our data and your feedback that it’s not unusual for someone to search for something, not see what they need at the top of the list, search again, perhaps with slightly different wording, and then choose a Card that was in their original results list. Since most people focus on just the top few search results, this isn’t surprising. If a Card is not in the top five results, it often doesn’t get much attention (more than 90% of actions on our search results page happen within the first five results). For that reason, we know it’s critical that the most relevant results are as high in the list as possible. We also know from feedback, and personal experience, that having to remember keywords is not an easy way to locate the information you need.   

With this improvement, Guru uses its new ability to infer the meaning of search terms and Card content to re-order the results of each search so that the most relevant Cards are as high in the list as possible. This helps reduce the need to use the “right” terminology for a search. What does this look like in practice? Within our own company we see many searches for “wfh” and “work from home,” which intuitively, as humans, we know are related phrases. Before this update, these two searches yielded very differently ordered results lists. Now, regardless of which is provided as a search term, at least two Cards related to our employee policies, which is usually what employees want, are in the top four results. In some cases that is a jump of multiple spots up the results list.
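To make the re-ordering idea concrete, here is a rough sketch under some loud assumptions: the Card titles, the vectors, and the `rerank` function are all hypothetical, and Guru’s real pipeline combines more signals than this. The point is only that two differently worded queries with similar vectors end up with the same ordering.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rerank(query_vector, candidates):
    """Return Card titles sorted by similarity to the query, most similar first."""
    scored = [(cosine_similarity(query_vector, vec), title) for title, vec in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored]

# Hypothetical Cards with hand-made vectors (illustration only).
candidates = [
    ("Office Wi-Fi setup", [0.2, 0.9, 0.1]),
    ("Remote work policy", [0.9, 0.1, 0.2]),
    ("Expense reports", [0.1, 0.2, 0.9]),
    ("Home office stipend", [0.7, 0.3, 0.3]),
]

# Hypothetical query vectors: different wording, similar meaning.
query_vectors = {
    "wfh": [0.85, 0.15, 0.2],
    "work from home": [0.9, 0.2, 0.15],
}

order_wfh = rerank(query_vectors["wfh"], candidates)
order_long = rerank(query_vectors["work from home"], candidates)
print(order_wfh == order_long)
```

In this toy setup both queries put the two policy-style Cards first, which mirrors the “wfh” and “work from home” behavior described above.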

In addition to using the meaning distilled from Cards and search terms, Guru now also takes into account the collective intelligence of your team as it searches for knowledge. Specifically, Guru pays attention to the meaning of all the different search terms used by your team and to how users interact with results, and it uses what it learns from those actions as another factor when determining relevance. This means less time figuring out which words to search with to get to the content you need, because your teammates have already done that work and Guru has learned from it.
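One simple way to picture “another factor when determining relevance” is a weighted blend of a meaning score and an interaction rate. This is purely an illustrative sketch: the `blended_score` function, the 0.3 weight, and the smoothing are assumptions made up for this example, not a description of how Guru actually combines its signals.

```python
def blended_score(semantic_score, interactions, impressions, weight=0.3):
    """Mix a semantic similarity score with a team interaction rate.

    `weight` is a hypothetical knob: 0 ignores team behavior entirely,
    1 ignores meaning entirely.
    """
    # Smoothed rate so a Card shown only a few times is not over-rewarded.
    interaction_rate = (interactions + 1) / (impressions + 2)
    return (1 - weight) * semantic_score + weight * interaction_rate

# Two hypothetical Cards with nearly identical semantic scores.
# Teammates opened or copied Card A often, and Card B almost never.
card_a = blended_score(0.80, interactions=45, impressions=50)
card_b = blended_score(0.82, interactions=2, impressions=50)

print(card_a > card_b)
```

Here Card A ends up ranked above Card B despite a slightly lower meaning score, because the team’s past behavior suggests it is the more useful result.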

Keep in mind that this new machine learning technology is one piece of a growing search architecture, and many things go into relevance.


Illustration of an example search and result.


Neat! You said this release has already improved search performance. How so?

This enhancement finished rolling out on February 24th, so we now have several weeks of data to review. We keep track of quite a few metrics for search, most based on click data and the type of interactions users have with results: do they open them, copy them, favorite them, etc. One of our main metrics focuses on how often someone interacts with a Card in the top five results, and what rank it had (was it first, second, third, etc.). This is an industry-standard way of measuring search performance (for the search aficionados among you, this metric is MAP, or mean average precision).
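For the aficionados, here is a small sketch of one common way to compute mean average precision over the top five results. It assumes each search is recorded as a list of true/false flags saying whether the user interacted with the result at each rank. The function names and sample data are hypothetical, and Guru’s exact calculation may differ.

```python
def average_precision_at_k(interacted_flags, k=5):
    """Average precision over the top k results of a single search.

    interacted_flags[i] is True if the user interacted with the result
    at rank i + 1. Interactions at higher ranks earn more credit.
    """
    hits = 0
    precision_sum = 0.0
    for rank, interacted in enumerate(interacted_flags[:k], start=1):
        if interacted:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0

def mean_average_precision(searches, k=5):
    """Mean of the per-search average precision values."""
    if not searches:
        return 0.0
    return sum(average_precision_at_k(s, k) for s in searches) / len(searches)

# Three hypothetical searches (illustration only).
searches = [
    [True, False, False, False, False],   # interacted with the first result
    [False, False, True, False, False],   # interacted with the third result
    [False, False, False, False, False],  # no interaction in the top five
]

map_at_5 = mean_average_precision(searches)
print(round(map_at_5, 3))
```

The first search scores 1.0, the second scores 1/3, and the third scores 0, so moving the right Card from third place to first is exactly the kind of change that lifts this metric.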

Thus far we’ve seen the median of this metric improve more than 6%, which makes this one of the more impactful improvements of the last year. Because this is a median, some customers have seen significantly more improvement, some well over 10%! We’ve also seen more than 8.5% improvement in searches where three or more words were provided (about 30% of searches). That makes sense: the more words there are, the more content Guru has to glean meaning from. TL;DR: the numbers are looking good, and we believe there will be an additional positive shift as the model gets to see more activity over time.

What are the additional potential improvements you mentioned?

It took a significant effort to put this technology in place but we decided it was worth it because it has great potential, even beyond search. With this new ability to relate content based on meaning, Guru could do things like form a picture of what content you tend to look at, and provide suggestions for content that’s similar in meaning. Additionally, since meaning is mostly language agnostic, Guru could draw relationships between content that is in different languages. Imagine searching in French and getting back the most relevant content that happens to be in English (which you also speak)! The possibilities are nearly endless and we’re so optimistic about them, we could write a whole blog post on the topic (actually we have, look for it later this week). That said, we’ll have to see where the road takes us – these are ideas, not a roadmap :)

All this is cool, but this is a long post.

Totally, this was a big update and there was a lot to share, so we appreciate you sticking with us! Also, while we’re pleased with the impact this update has had so far, we know there is still a lot of room to improve search, so this is by no means the end of the road. There are many more algorithm, machine learning, and user experience updates to come. As always, if you have questions about this improvement, or search generally, please let us know in the comments below or reach out to our Support team.