Matching Engine

Introduction

Tinder, Google AdWords and Pandora are some examples of businesses whose core is based on matching engine. Matching elements from one set to another has lot of business value, lets consider a few of them:

  1. Matching resources to jobs
  2. Matching a product to a potential customer
  3. Matching customer service representative to a complain customer might have

In this article we explore different matching problems, how to abstract them into a form that modern algorithms can solve.

Motivating Examples

There are three projects in distinct domains where we’ve built custom matching engine. Each of the domain had special domain requirements which will be covered in detail

  1. Travel: Matching destinations to traveller
  2. Recruiting: Matching job seeker with recruiting company based both on preference and location proximity
  3. Legal: Matching technical expert that an Attorney can hire for help with Patent Litigation they might be fighting

Travel

For a Mobile Application that our client was building they wanted to build a matching engine which will enable users to make informed decisions about their itinerary.

Depending on a brief questionnaire users filled in, we build a profile of what profile category the user might fall into. Based on this we look at recommending activities they might do at a location of their choice.

In computer science this problem is often reffered to as Collaborative Filtering There are two variants of this algorithms namely Item Based and User Based. For this project we’ve implemented User Based. The core idea of Collaborative Filtering is to

  1. Capture preference of the user
  2. Find set of users that have similar preference to the requestor
  3. Recommend items based on preference of similar users

At high level it be expressed as:

from collections import defaultdict

# this function is responsible for generating recommendations
def get_suggestions(current_user_preference, similar_users):
    # this dictionary will have lookup of item to score
    global all_preferences

    results = defaultdict(float)
    for candidate_user  in similar_users:
        for candidate_item in all_preferences[candidate_user]:
            # make sure we're not suggesting item that has
            # already been rated by current user
            if candidate_item not in preference:
                # what is similarity between current user and candidate
                similarity_score = similar_users[candidate_user]
                candidate_pref_score = all_preferences[candidate_user][candidate_item]
                results[candidate_item] += similarity_score * candidate_pref_score
    return results

if __name__ == "__main__":
    preference = capture_user_preference()
    similar_users = get_similar_users(preference, all_preferences)
    suggestions = get_suggestions(preferences, similar_users)
    sorted_suggestions = reverse_sort_by_score(suggestions, k=10)

Few things to note in the code listing above:

  1. capture_user_preference: function is the one that gets some baseline information from current user
  2. get_similar_users: is used to get list of similar users sorted by similarity score. Similarity score can be Jaccard score
  3. all_preferences: has ratings from reference users whom we’re using to profile current user
  4. sorted_suggestions: will have at most k suggestions sorted by score

Recruiting

Finding the right candidate is often difficult, specially if you’re a HR professional dealing with stream of resumes on daily basis. One of our Clients asked us to develop a technical solution for this problem. We build a location based mobile web application that automated matching and documented whole interaction {for algorithmic improvements}.

We had implemented an algorithm similar to one mentioned above with an extension to filter results by location proximity. Apache Solr’s Geo features were used, upon completion of the process the Interviewer can fill out an evaluation form which would get sent to recruiter instantly.

Outcome of this tool were screening time for resumes was reduced, allowing recruiters to focus on important tasks.

../../_images/xc.png

Other Variations

There are other variations of Matching algorithms that are used in our daily lives:

  1. Microsoft’s TrueSkill
  2. NRMP Algorithm