Trey is SVP of Engineering @ Lucidworks, co-author of Solr in Action, founder of Celiaccess.com, and a researcher/public speaker on search, analytics, recommendation systems, and natural language processing.

I was honored to be invited to present today, along with Khalifeh AlJadda, Lead Data Scientist at CareerBuilder, to a group of nearly 200 Georgia Tech graduate students and other members of the larger Georgia Tech community. We thank Dr. Polo Chau and his Data and Visual Analytics class for inviting us and sponsoring the event, and we appreciate the many others who heard about and attended the presentation!

Khalifeh and I worked closely together while I was at CareerBuilder, evolving their semantic search engine and recommendation engine into self-learning data systems. It was great to present some of the similar work I am now doing at Lucidworks alongside Khalifeh, who presented much of the work we had done at CareerBuilder as well as some of the newer techniques they are now applying.



Talk Abstract:
In the big data era, search and recommendation engines have become the primary mechanisms through which users both actively find and passively discover useful information. As such, it has never been more critical for these data systems to be able to deliver targeted, relevant results that fully match a user’s intent.

In this presentation, we’ll talk about evolving self-learning search and recommendation systems which are able to accept user queries, deliver relevance-ranked results, and iteratively learn from the users’ subsequent interactions to continually deliver a more relevant experience. Such a self-learning system leverages reflected intelligence to consistently improve its understanding of the content (documents and queries), the context of specific users, and the collective feedback from all prior user interactions with the system. Through iterative feedback loops, such a system can leverage user interactions to learn the meaning of important phrases and topics within a domain, identify alternate spellings and disambiguate multiple meanings of those phrases, learn the conceptual relationships between phrases, and even learn the relative importance of features to automatically optimize its own ranking algorithms on a per-query, per-category, or per-user/group basis.
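
To make the feedback loop concrete, here is a minimal sketch in Python of the general idea: rank by a base relevance score, record which results users click for a query, and fold those clicks back into later rankings. It is only an illustration and not the system described in the talk. The class name `SelfLearningSearch`, the term-overlap scoring, and the logarithmic click boost are all assumptions made for this example; a real deployment would sit on top of an engine such as Solr.

```python
import math
from collections import defaultdict


class SelfLearningSearch:
    """Toy in-memory engine (hypothetical): term overlap plus a learned click boost."""

    def __init__(self, docs):
        self.docs = docs  # doc_id -> document text
        # query -> doc_id -> number of recorded clicks (the "signals")
        self.clicks = defaultdict(lambda: defaultdict(int))

    def _base_score(self, query, text):
        # Assumed base relevance: count of distinct query terms found in the document.
        return len(set(query.lower().split()) & set(text.lower().split()))

    def search(self, query):
        key = query.lower()
        scored = []
        for doc_id, text in self.docs.items():
            base = self._base_score(query, text)
            if base == 0:
                continue  # only documents matching at least one term are returned
            # Assumed boost: dampened click count, so popularity helps without dominating.
            boost = math.log1p(self.clicks[key][doc_id])
            scored.append((base + boost, doc_id))
        # Highest score first; ties are broken by doc_id for a stable order.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [doc_id for _, doc_id in scored]

    def record_click(self, query, doc_id):
        # Feedback step: remember that a user chose this result for this query.
        self.clicks[query.lower()][doc_id] += 1


engine = SelfLearningSearch({
    "a": "java developer",
    "b": "java barista coffee",
})
before = engine.search("java")
for _ in range(3):
    engine.record_click("java", "b")
after = engine.search("java")
print(before, after)
```

In this sketch both documents start with the same base score for the query "java", so the order is decided by the tie-break. Once users repeatedly click document "b", it moves ahead of "a" for that query only, which is the per-query learning the abstract refers to.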

We will cover some of the core technologies that enable such a system to be built (Apache Lucene/Solr, Apache Spark, Apache Hadoop, cloud computing), and will walk through some practical examples of how such a reflected intelligence system has been built and is being leveraged in a real-world implementation.
