About

Trey is SVP of Engineering @ Lucidworks, co-author of Solr in Action, founder of Celiaccess.com, and a researcher and public speaker on search, analytics, recommendation systems, and natural language processing.

This year’s Lucene/Solr Revolution was held in Las Vegas, and was a blast as always. I had the good fortune to present on the Apache Solr Semantic Knowledge Graph. The Semantic Knowledge Graph is a project that I worked on with my team while I was at CareerBuilder, and which CareerBuilder subsequently agreed to let us open source, both as a standalone project and as a patch back to the Apache Solr project.

Slides:

https://www.slideshare.net/treygrainger/the-apache-solr-semantic-knowledge-graph

Talk Abstract:
What if, instead of returning documents, a query could return the other keywords most related to it? For example, given a search for “data science”, you would get back results like “machine learning”, “predictive modeling”, “artificial neural networks”, etc. Solr’s Semantic Knowledge Graph does just that. It leverages the inverted index to automatically model the significance of relationships between every term in the index (even across multiple fields), allowing real-time traversal and ranking of any relationship within your documents. Use cases for the Semantic Knowledge Graph include disambiguating multiple meanings of a term (does “driver” mean truck driver, printer driver, a type of golf club, etc.?), searching on vectors of related keywords to form a conceptual search (versus just a text match), powering recommendation algorithms, ranking lists of keywords by conceptual cohesion to reduce noise, summarizing documents by extracting their most significant terms, and numerous other applications involving anomaly detection, significance/relationship discovery, and semantic search. In this talk, we’ll do a deep dive into the internals of how the Semantic Knowledge Graph works and walk you through getting up and running with an example dataset to explore the meaningful relationships hidden within your data.
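To make the core idea in the abstract a bit more concrete, here is a minimal Python sketch of scoring term relationships straight off an inverted index. It is a toy illustration, not the Semantic Knowledge Graph's actual code: the function names (`build_inverted_index`, `related_terms`), the tiny dataset, and the choice of a plain z-score as the significance measure are all assumptions made for this example. The idea is to treat the documents matching the query as a "foreground" set, then rank every other term by how much more often it appears in that foreground than its frequency across all documents would predict.

```python
from collections import defaultdict
from math import sqrt


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, terms in enumerate(docs):
        for term in terms:
            index[term].add(doc_id)
    return index


def related_terms(index, num_docs, query_term):
    """Rank other terms by how over-represented they are in the query's documents.

    The score is a simple z-score (an assumption for this sketch): observed
    foreground count versus the count expected from the background probability.
    """
    foreground = index.get(query_term, set())
    fg_size = len(foreground)
    if fg_size == 0:
        return []

    scores = {}
    for term, postings in index.items():
        if term == query_term:
            continue
        fg_count = len(postings & foreground)   # docs with both terms
        bg_prob = len(postings) / num_docs      # how common the term is overall
        if fg_count == 0 or bg_prob >= 1.0:
            continue  # never co-occurs, or occurs everywhere (no signal)
        expected = fg_size * bg_prob
        scores[term] = (fg_count - expected) / sqrt(expected * (1.0 - bg_prob))

    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy dataset: each document is a list of already-extracted terms.
docs = [
    ["data science", "machine learning", "python"],
    ["data science", "machine learning", "statistics"],
    ["data science", "predictive modeling", "python"],
    ["truck driver", "logistics"],
    ["truck driver", "logistics", "python"],
    ["printer driver", "software"],
]

index = build_inverted_index(docs)
ranked = related_terms(index, len(docs), "data science")

for term, score in ranked:
    print(f"{term}: {score:.3f}")
```

On this toy data, “machine learning” ranks first for “data science”, while “python” scores lower even though it co-occurs just as often, because it is also common outside the foreground set. Terms that never co-occur with the query, such as “truck driver”, drop out entirely.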