About

Trey leads the Search Technology Development group at CareerBuilder, developing web-scale search & recommendation engine technologies.

I just got back from a fantastic trip to Dublin, Ireland for last week’s Lucene/Solr Revolution EU. I was privileged this year to present a 75-minute deep-dive session on “Enhancing Relevancy through Personalization & Semantic Search.” I appreciate all the great questions and feedback from everyone who attended.

Video:

Slides:
http://www.slideshare.net/treygrainger/enhancing-relevancy-through-personalization-semantic-search-28741313

Talk Summary: Matching keywords is just step one in the effort to maximize the relevancy of your search platform. In this talk, you’ll learn how to implement advanced relevancy techniques which enable your search platform to “learn” from your content and users’ behavior.

Topics will include automatic synonym discovery, latent semantic indexing, payload scoring, document-to-document searching, foreground vs. background corpus analysis for interesting term extraction, collaborative filtering, and mining user behavior to drive geographically and conceptually personalized search results.

You’ll learn how CareerBuilder has enhanced Solr (also utilizing Hadoop) to dynamically discover relationships between data and behavior, and how you can implement similar techniques to greatly enhance the relevancy of your search platform.

I just made it back from the beautiful, sunny city of San Diego where LucidWorks hosted another fantastic Lucene/Solr Revolution conference this week. I was invited back this year to present on “Building a Real-time, Big Data Analytics Platform with Solr.” Thank you to everyone who came and packed out the room, especially those who provided great feedback afterward and asked all of the terrific questions!

Video:

Slides: http://www.slideshare.net/treygrainger/building-a-real-time-big-data-analytics-platform-with-solr

Talk Summary: Having “big data” is great, but turning that data into actionable intelligence is where the real value lies. This talk will demonstrate how you can use Solr to build a highly scalable data analytics engine to enable customers to engage in lightning fast, real-time knowledge discovery.

At CareerBuilder, we utilize these techniques to report the supply and demand of the labor force, compensation trends, customer performance metrics, and many live internal platform analytics. You will walk away from this talk with an advanced understanding of faceting, including pivot-faceting, geo/radius faceting, time-series faceting, function faceting, and multi-select faceting. You’ll also get a sneak peek at some new faceting capabilities just wrapping up development, including distributed pivot facets and percentile/stats faceting, which will be open-sourced.

The presentation will be a technical tutorial, along with real-world use-cases and data visualizations. After this talk, you’ll never see Solr as just a text search engine again.
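To make a few of the faceting types named in the summary above more concrete, here is a minimal sketch of the request parameters for a single Solr query that combines multi-select, pivot, time-series (range), and geo/radius faceting. It only builds the query string; it does not contact a server. The field names (`state`, `city`, `posted_date`, `location`), the filter tag `stateFilter`, and the coordinates are hypothetical and are not taken from the talk or from CareerBuilder's schema.

```python
from urllib.parse import urlencode, parse_qs


def build_facet_params(query="*:*"):
    """Return Solr request parameters as (name, value) pairs.

    All field names and values below are illustrative assumptions.
    """
    return [
        ("q", query),
        ("rows", "0"),  # only facet counts are wanted, not documents
        ("facet", "true"),
        # Multi-select faceting: tag the filter, then exclude that tag when
        # counting the same field so the other states keep their counts.
        ("fq", "{!tag=stateFilter}state:CA"),
        ("facet.field", "{!ex=stateFilter}state"),
        # Pivot faceting: counts for each city nested under each state.
        ("facet.pivot", "state,city"),
        # Time-series faceting: one bucket per day over the last 30 days.
        ("facet.range", "posted_date"),
        ("facet.range.start", "NOW/DAY-30DAYS"),
        ("facet.range.end", "NOW/DAY"),
        ("facet.range.gap", "+1DAY"),
        # Geo/radius faceting: count documents within 10 km of a point.
        ("facet.query", "{!geofilt sfield=location pt=37.77,-122.42 d=10}"),
    ]


# URL-encode the parameters; this string would follow "/select?" in a request.
query_string = urlencode(build_facet_params())
print(query_string)
```

The parameters are kept as a list of pairs rather than a dict because Solr allows some of them, such as `fq` and `facet.field`, to be repeated in one request.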

Solr in Action

I’m excited to announce early access availability of Solr in Action, a book on Apache Solr 4 which I am co-authoring with Timothy Potter. The MEAP (Manning Early Access Program) edition was released today, which means that you can purchase the book early and receive new chapters as they are written, without having to wait for the final release. Three chapters are currently available (“Introduction to Solr”, “Key Solr Concepts”, and “Indexing”), and we expect a new chapter to be released every few weeks.

Please consider heading over to solrinaction.com and picking up a copy today!

Book Summary:
Whether you’re handling big data, building cloud-based services, or developing multi-tenant web applications, it’s vital to have a fast, reliable search solution. Apache Solr is a scalable and ready-to-deploy open-source full-text search engine powered by Lucene. It offers key features like multi-lingual keyword searching, faceted search, intelligent matching, and relevancy weighting right out of the box. Solr 4 provides new features to enable large-scale distributed search solutions that can be deployed as an elastically scaling cloud-based service and can provide additional intelligence to other big data technologies like Hadoop and Mahout.

Solr in Action is a comprehensive guide to implementing scalable search using Apache Solr 4. This clearly-written book walks you through well-documented examples ranging from basic keyword searching to scaling a system for billions of documents and queries. You’ll gain a deep understanding of how to implement core Solr capabilities such as faceted navigation through search results, matched snippet highlighting, field collapsing and search results grouping, spell checking, query auto-complete, querying by functions, and geo-spatial searching.

Along the way, you’ll discover more advanced topics, such as scaling Solr for large production environments, best practices and strategies for handling multi-lingual content, building a Solr-powered recommendation engine, performing complex data analytics, and integrating Solr with other big data technologies for machine learning and knowledge discovery.

You will also learn how Solr’s relevancy algorithm works, best practices and tricks for tuning and measuring your search relevancy, and even how to write and integrate Solr plugins and patches to introduce your own great new search features.

After a wonderful time last year, I was able to present yet again at this year’s Lucene Revolution conference in Boston. Lucene Revolution is a yearly conference put on by Lucid Imagination, a company focused on supporting and commercializing the open source Apache Lucene and Solr search technologies (and integrating them with related technologies). My talk was entitled “Building a Real-time, Solr-powered Recommendation engine.”

Video:

Slides: http://www.slideshare.net/treygrainger/building-a-real-time-solrpowered-recommendation-engine

Talk Summary: Searching text is what Solr is known for, but did you know that many companies receive an equal or greater business impact through implementing a recommendation engine in addition to their text search capabilities? With a few tweaks, Solr (or Lucene) can also serve as a full featured recommendation engine. Machine learning libraries like Apache Mahout provide excellent behavior-based, off-line recommendation algorithms, but what if you want more control? This talk will demonstrate how to effectively utilize Solr to perform collaborative filtering (users who liked this also liked…), categorical classification and subsequent hierarchical-based recommendations, as well as related-concept extraction and concept based recommendations. Sound difficult? It’s not. Come learn step-by-step how to create a powerful real-time recommendation engine using Apache Solr and see some real-world examples of some of these strategies in action.
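As a rough illustration of the “users who liked this also liked…” idea mentioned in the summary, here is a minimal in-memory sketch of collaborative filtering based on overlapping likes. It is plain Python rather than Solr, and it is not the method presented in the talk; the function name, user names, and job titles are all hypothetical.

```python
from collections import Counter


def recommend(likes_by_user, target_user, top_n=3):
    """Suggest items liked by users whose likes overlap the target user's.

    Each candidate item is scored by the summed overlap of the users who
    liked it, so items from more similar users rank higher.
    """
    liked = likes_by_user.get(target_user, set())
    scores = Counter()
    for user, items in likes_by_user.items():
        if user == target_user:
            continue
        overlap = len(liked & items)  # how similar this user is to the target
        if overlap == 0:
            continue  # ignore users with nothing in common
        for item in items - liked:  # only suggest items the target lacks
            scores[item] += overlap
    # Sort by score (descending), then by name so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical data: which job postings each user has liked.
likes = {
    "alice": {"java developer", "search engineer"},
    "bob": {"search engineer", "data scientist"},
    "carol": {"java developer", "search engineer", "solr consultant"},
    "dave": {"nurse"},
}

print(recommend(likes, "alice"))  # ['solr consultant', 'data scientist']
```

Carol shares two likes with Alice and Bob shares one, so Carol's extra item ranks first. A search engine can do the same kind of matching at query time by treating a user's liked items as query terms against an index of other users' behavior, though how the talk does this is not shown here.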

I recently gave a presentation at Lucene Revolution 2011 out in San Francisco.  The title of my topic was “Extending Solr: Building a Cloud-like Knowledge Discovery Platform.”

Video:

Slides: http://www.slideshare.net/treygrainger/extending-solr-building-cloudlike-knowledge-discovery-platform

Talk Summary: For CareerBuilder, a 1% deviance in search relevancy can mean millions of missed job opportunities for our users. When CareerBuilder moved to Solr from an expensive, proprietary search vendor, our top priorities were maintaining the quality of our search results and drastically improving our agility. This talk will describe how we addressed both needs. For search quality, we’ll cover some of our internal studies and resulting methods for dealing with multi-lingual content across dozens of languages, as well as customizing and experimenting with relevancy calculations. For platform agility, we’ll discuss CareerBuilder’s cloud-like search API framework which seamlessly handles millions of searches an hour, processes hundreds of millions of documents, and is powered by hundreds of globally-distributed servers. Come hear the results of our studies and some best practices for quality and performance. Learn how our framework has led to staggering improvements in both maintainability and technology innovation, allowing us to learn from our content, not just find it.

Check out my interview, published today, with Mitch Pronschinske from DZone:  The Solr Conversion at CareerBuilder.com: Lower Costs, Greater Agility

Questions from the Interview:

  • Jobs are one of the most important things we search for on the web.  What are some of the major challenges for search technology on a jobs site?
  • What are some of CareerBuilder’s unique challenges in search?
  • You led the conversion of CareerBuilder’s search platform from FAST ESP to Apache Solr.  Why did you think this was necessary and how did you convince upper management to make the change?
  • What benefits have resulted from the switch to Solr?
  • What were some of your search experiences related to genetic algorithms?
  • Can you tell us about the cloud-like search API you created for CareerBuilder?
  • Tell us about your side project, Celiaccess.com.

Celiaccess.com Launches!

October 22nd, 2009

On October 17th, my wife Lindsay and I launched a new website for the gluten-free community. Celiaccess.com is a gluten-free search engine & networking site. It is community supported: users can add, update, and search through any products or restaurants to determine whether or not they contain gluten. The idea is that instead of spending countless hours searching through Google/Bing/Yahoo results and paging through outdated gluten-free forums, users can find the research that other users and organizations have compiled within a matter of seconds.

Please try out the site or read our press release to find out more.

Celiaccess.com is a free resource provided to the gluten-free community - make it your own by contributing back!

My Professional Profile

June 3rd, 2009

For more information about my experience and for networking opportunities, you can view my professional profile at http://www.linkedin.com/in/treygrainger.