Introduction to Recommender Systems Week 1

Basics of recommender system conceptually; learning from signals from other actors. In this sense, a movie review in a newspaper is a recommender system.

Can also be seen in a group of ants spreading out. Once one find a piece of food, they will lay a trail that will be followed by other ants and soon all will group to the piece of food.

Information retrieval

  • Answer questions about large selection of documents Like TFIDF, good for finding things in static content base. Dynamic information need (search query).

Information filtering

  • Rapidly changing streams of information, dynamic content base. Focuses instead on modelling user need and then filtering new incoming content.

Collaborative filtering

Idea that the filtering needs to be better than just high level topics or keywords.

Manual

  • Community of users that recommend things to each other and to distribution groups that people can subscribe to.

Automatic

  • K-Nearest Neighbour for example.
  • For early approaches here were the phrase recommender systems originally coined.

Types of Recommenders

  • Filtering interfaces (e.g. e-mail)
  • Recommendation interfaces (“recommended for you”)
  • Prediction interfaces (predict review score)

Vocabulary

  • Rating - Expression of preference
    • Explicit
    • Implicit
  • Prediction - Estimate of preference
  • Recommendation - Selected items for user
  • Content - Attributes, text etc.
  • Collaborative - Using data from other users

Recommendation approaches

  • Non-personalised and stereotyped
    • Popularity, group preference
  • Product association
    • Association rules
  • Content-based
    • Learn what I like (in terms of attributes)
  • Collaborative
    • Learn what I like, use other similar users experiences for recommendations

Designing a Recommender

  • Collect Opinion and Experience data
  • Find relevant data for your purpose
  • Compute recommendations
  • Present data in useful way

Preferences and Ratings

Types of explicit ratings

  • Consumption - Fresh after consumption
  • Memory - Some time after consumption
  • Expectation - Before consumption
    • For seldom purchases (house for example), this is the most common …and sometimes;
  • Joke ratings - Amazon joke reviews for example

Implicit data

  • Data collected from user actions
  • User is acting with another purpose besides expressing their (conscious) preference
  • Binary and continuous metrics
  • Might be creepy
    • Be nice; allow users to opt-out

Learning Objectives

  • Predictions: Estimate how much you will like an item or buy an item.
    • A numeric value on some scale.
    • Positive is that it gives a strength to the recommendation.
    • Negative is that it provides something falsifiable.
  • Recommendations: Selects items for things you might like from a larger set.
    • Oftentimes a list of things you might like.
    • Positive is that it provides good choices as a default.
    • Negative can be that if it’s perceived as a top N list of highest predicted, people might not explore further if none matches their taste assuming that no later results will match better (think second page of Google)

Explicit vs organic recommendations - “Top recommendations for you” vs. “Things you might like”.

Different types of recommender systems

Analytical Framework

  • Domain
    • What is being recommended? Items, bundles, sequences?
    • New items or old items?
  • Purpose
    • Sales, connecting people, education, etc…
  • Recommendation context
    • What is the user doing at the time of recommendation? Shopping, listening to music, hanging out with other people etc…
    • How does the context constrain the recommendation?
  • Whose opinions
    • Experts, ordinary people, people like you?
  • Personalisation level
    • Non-personalised
    • Demographic
    • Ephemeral (matching current activity)
    • Persistent (matching long-term interests)
  • Privacy and trustworthiness
    • There are personal information and preferences people don’t want revealed
    • Transparency and opt-out
    • Vulnerability to external recommendations
  • Interfaces
    • What does the output look like? Predications, recommendations, filtering?
    • Types of input? Explicit, implicit?
  • Recommendation algorithms
    • Non-personalised summary statistics
    • Content-based
      • Information filtering
      • Knowledge-based
    • Collaborative filtering
      • User-User
      • Item-Item
      • Dimensionality Reduction
    • Others
      • Critique / Interview based recommendations
      • Hybrid techniques

Recommendation algorithms

Basic model consists of Users, Items and Ratings.

The users and items together forms a community, which is the context within where the recommendations make sense.

Further, the entities can contain attributes;

  • User attributes (demographics)
  • Item attributes (properties, genres etc.)

Non-personalised summary statistics

Uses external community data such as best-seller, most popular, trending, average rating etc…

Content-based filtering

Users gets a taste vector of attributes via their interest signals. Items are then compared to the user vector and a vector . product is made for similarity.

  • User ratings x Item attributes => model
  • Model applies to new items via attributes
  • Alternative is knowledge based where item attributes form a (vector?) space, which the user can navigate.

Collaborative filtering

  • Uses opinions of others to predict/recommend
  • User model - set of ratings
  • Item model - set of ratings
  • Common score: sparse matrix of ratings
    • Fill in missing values (predict)
    • Select promising cells (recommend)
  • Several different techniques

  • User-user. Find neighbourhood of similar-taste people (or let user explicitly select people they trust) and use their opinions.
  • Item-item. Pre-compute similarity among items via ratings. Use own ratings to triangulate for recommendations.
  • Dimensionality reduction. Create lower-dimensionality taste-vector via vector compression.

Notes on evaluation

Evaluation is hard. Accuracy of predictions, usefulness of recommendations (correctness, non-obvious, diversity), computational performance.

Meta questions

  • To what extent are recommenders just recommending what people would have done anyways?
  • Can we teach people to give more consistent signals to move beyond the ‘magic barrier’ of custom inconsistency between previous and future actions.

State of the art (2016 :D)

We have great well-known algorithms, but they still need to be contextualised by exploring the use-case, data and users. In that sense, it’s still very much a craft to device a good recommender system.

Looking forward (from 2016)

  • Many hard problems remaining
    • Temporal Recommendations (/sequences)
    • Recommendations for education
    • Low frequency, high-stakes recommendations
    • Explore vs. exploit
  • Recognised specialty that brings together multiple disciplines
    • Machine learning / data mining
    • Business / Marketing
    • HCI / consumer understanding

Comments