Matrix Factorization and Advanced Techniques - Week 6

Interview with Francesco Ricci

Context is, in general, any kind of information or condition that might influence the user's perception of an item.

Examples of such factors:

  • Distance
  • Temperature
  • Weather
  • Season
  • Weekday
  • Time of day
  • Crowdedness
  • Companion
  • Mood
  • Sleepiness

…and so on…

Adding context can mean, for example, adding extra dimensions to the classic 2-dimensional user-item model. Ratings can then be grouped by context, which means less data for each condition, but hopefully more relevant data.

Following on from this, you can define a similarity measure between contexts to determine which data can be shared between them.

Paradigms for incorporating context

  • Contextual pre-filtering
    • Limit the data for the model based on context (as mentioned above) and train one 2D model per context; see the sketch after this list
  • Contextual post-filtering
    • Train the model on all data, then filter out items post-recommendation based on context
  • Contextual modeling
    • True multidimensional modeling, using tensor factorization instead of matrix factorization
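
To make pre-filtering concrete, here is a minimal sketch: ratings are split on a single context attribute, and one ordinary 2D model is trained per slice. The column names and the train_2d_model stand-in (a per-item mean baseline) are illustrative assumptions, not the course's implementation.

```python
# Minimal sketch of contextual pre-filtering: split ratings by a context
# attribute and train one plain 2D (user x item) model per context slice.
import pandas as pd

def train_2d_model(ratings: pd.DataFrame):
    """Stand-in for any classic user-item recommender; here, a
    per-item mean-rating baseline."""
    return ratings.groupby("item")["rating"].mean()

def contextual_prefilter(ratings: pd.DataFrame, context_col: str = "weekday"):
    # One independent 2D model per context condition: each model sees
    # less data, but hopefully data more relevant to that condition.
    return {ctx: train_2d_model(part) for ctx, part in ratings.groupby(context_col)}

ratings = pd.DataFrame({
    "user": [1, 1, 2, 2],
    "item": ["a", "b", "a", "b"],
    "rating": [5, 3, 4, 2],
    "weekday": ["sat", "mon", "sat", "mon"],
})
models = contextual_prefilter(ratings)
print(models["sat"])  # the model for the Saturday context
```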

A lot of applications for contextual recommendation come from mobile, as the device has many sensors and follows the user throughout their day.

Context is used automatically by our minds to decode ambiguous messages (an example by Kahneman: the same handwritten symbol is read as either B or 13 depending on the surrounding text).

The music you’ve listened to recently will influence the music you want to hear next.

More …

Matrix Factorization and Advanced Techniques - Week 5

Learning Recommenders

  • θ: the parameters and/or the recommendation model
  • f(θ): the error or utility of the predictions or recommendations computed with model θ
    • When using utility, maximize instead of minimize.
  • An optimization algorithm is used to find the best parameters (e.g. gradient descent or expectation maximization); see the sketch below
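
As a minimal sketch of this setup, assuming for illustration that the parameters θ are latent user and item factors, f(θ) is squared error on the observed ratings, and plain stochastic gradient descent is the optimizer (all sizes, data and the learning rate are made up):

```python
# theta = the latent user/item factors; f(theta) = squared prediction
# error on the observed ratings; optimizer = stochastic gradient descent.
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 5, 4, 2
P = rng.normal(scale=0.1, size=(n_users, k))  # user factors (part of theta)
Q = rng.normal(scale=0.1, size=(n_items, k))  # item factors (part of theta)
observed = [(0, 1, 4.0), (2, 3, 1.0), (4, 0, 5.0)]  # (user, item, rating)

lr = 0.05
for _ in range(200):
    for u, i, r in observed:
        err = r - P[u] @ Q[i]    # prediction error for this rating
        P[u] += lr * err * Q[i]  # gradient step on the user factors
        Q[i] += lr * err * P[u]  # gradient step on the item factors

print(P[0] @ Q[1])  # should now be close to the observed 4.0
```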

Interview with Xavier Amatriain

For Netflix, recommendations are considered more important than search; search is what happens when recommendations fail. Most videos watched come from people following recommendations rather than from explicit search.

They work with rows and rankings, i.e. a row is a category and the order of movies within the row is the ranking. Recommendations thus work on two levels: the order of the categories and the order of the videos within each row.

The basic idea behind their recommendations is to optimize the number of hours people engage with the service, as that corresponds to actual business objectives.

Their recommendations look at implicit signals rather than explicit ratings, as these correspond better to actual behavior. This is also because users have, over time, begun giving less explicit feedback. A further complication is that accounts are often shared by several people in a household, which is another reason why current context plus implicit feedback works better.

When starting on a ranking problem, popularity is a good baseline. The question is then how to add personalization into the mix; a linear model combining the two scores is a good starting point.

The weights for the two parts can be learned with logistic regression, treating ranking as a classification problem that tries to correctly classify the items the user watched.
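
A minimal sketch of that idea with scikit-learn, using made-up popularity and personalization scores as the two features and "watched or not" as the label (this illustrates the approach only, not Netflix's actual system):

```python
# Learn blend weights for popularity vs. personalization scores by
# treating "did the user watch this item?" as a classification problem.
import numpy as np
from sklearn.linear_model import LogisticRegression

# columns: [popularity_score, personalization_score], one row per candidate
X = np.array([
    [0.9, 0.2],
    [0.8, 0.9],
    [0.1, 0.8],
    [0.2, 0.1],
])
watched = np.array([1, 1, 1, 0])  # label: did the user watch it?

clf = LogisticRegression().fit(X, watched)
print(clf.coef_)  # learned weights for the two score components

# Rank new candidates by predicted watch probability:
print(clf.predict_proba([[0.5, 0.7]])[:, 1])
```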

There are many more advanced ways to learn a ranking of items that corresponds more closely to the objectives you might have (learning to rank, listwise approaches, etc.), but these are often not differentiable and so require more complex approaches.

More …

DeepFM Paper - Notes

Notes from reading DeepFM: A Factorization-Machine based Neural Network for CTR Prediction

Abstract

The proposed model, DeepFM, combines the power of factorization machines for recommendation and deep learning for feature learning in a new neural network architecture.

The key challenge is in effectively modeling feature interactions.

Historically, models have tended to be biased towards either low-order or high-order interactions between features. Starting with the Wide & Deep model, the two have begun to be combined in a single model. DeepFM builds on the Wide & Deep model as a further improvement.
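
As a rough illustration of the architecture (a simplified sketch, not the paper's exact model), here an FM component for low-order interactions and a small MLP for high-order interactions share the same feature embeddings, and their outputs are summed into one logit; the field count, embedding size and random weights are arbitrary:

```python
# Simplified DeepFM forward pass: FM part (first-order + pairwise
# interactions) and deep part (MLP) share one embedding table.
import numpy as np

rng = np.random.default_rng(0)
n_features, n_fields, k = 100, 3, 4
emb = rng.normal(scale=0.1, size=(n_features, k))   # shared embeddings
w = rng.normal(scale=0.1, size=n_features)          # first-order weights
W1 = rng.normal(scale=0.1, size=(n_fields * k, 8))  # MLP hidden layer
w2 = rng.normal(scale=0.1, size=8)                  # MLP output layer

def deepfm_logit(feat_idx):
    v = emb[feat_idx]                             # (n_fields, k)
    first_order = w[feat_idx].sum()
    # FM pairwise term via the (sum-square minus square-sum) identity:
    fm = 0.5 * ((v.sum(0) ** 2) - (v ** 2).sum(0)).sum()
    hidden = np.maximum(v.reshape(-1) @ W1, 0.0)  # ReLU MLP on embeddings
    deep = hidden @ w2
    return first_order + fm + deep

x = [7, 42, 93]  # one active feature index per field (one-hot fields)
p_click = 1.0 / (1.0 + np.exp(-deepfm_logit(x)))
print(p_click)
```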

More …

Matrix Factorization and Advanced Techniques - Week 4

Since week 3 was only a very short assignment, there are no notes for it. Week 4 deals with hybrid recommender systems.

In practice, the strengths of different algorithms are often combined.

Example: Popularity

  • Collaborative filters trained on ratings predict whether a user would like a movie if they saw it.
  • Overall popularity, though, may be a better predictor of whether they will watch it at all.
  • Blending overall popularity with collaborative filtering might produce a good list of movies that the user will both watch and like, as sketched below.
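
A minimal sketch of such a blend, with made-up scores and a hand-picked weight:

```python
# Blend a collaborative-filtering "will they like it" score with a
# popularity "will they watch it" score via a weighted combination.
cf_score = {"movie_a": 0.9, "movie_b": 0.4, "movie_c": 0.7}
popularity = {"movie_a": 0.2, "movie_b": 0.9, "movie_c": 0.6}

alpha = 0.6  # weight on the collaborative-filtering score
blended = {m: alpha * cf_score[m] + (1 - alpha) * popularity[m] for m in cf_score}

for movie, score in sorted(blended.items(), key=lambda kv: -kv[1]):
    print(movie, round(score, 2))
```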

Means of Hybridization

Most Common:

  • Combine item scores
  • Combine item ranks (see the sketch after this list)
  • Integrated models
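
One concrete way to combine item ranks (an assumed choice for illustration, not necessarily the course's) is reciprocal rank fusion; a minimal sketch with two made-up rankings:

```python
# Combine two rankings by reciprocal rank fusion: items ranked highly
# in either list accumulate a larger fused score.
from collections import defaultdict

cf_ranking = ["movie_a", "movie_c", "movie_b"]
popularity_ranking = ["movie_b", "movie_c", "movie_a"]

fused = defaultdict(float)
for ranking in (cf_ranking, popularity_ranking):
    for rank, movie in enumerate(ranking, start=1):
        fused[movie] += 1.0 / (60 + rank)  # 60 is a conventional constant

print(sorted(fused, key=fused.get, reverse=True))
```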

Many others possible:

  • Conditionally switch algorithms
  • Deep integration (e.g. putting content-based computations inside a collaborative filter)

More …

Matrix Factorization and Advanced Techniques - Week 1-2

Dimensionality reduction is a way to use efficient machine-learning methods to distill the essential signal out of high-dimensional vectors.

Matrix factorization can be done in multiple ways, from algebraic SVD to representations learned with methods like gradient descent or probabilistic approaches.
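
A minimal sketch of the algebraic route, using numpy's SVD to keep a rank-k approximation of a small, fully observed rating matrix (the matrix and k are made up; real rating matrices are sparse and incomplete, which is why the learned methods exist):

```python
# Truncated SVD: keep only the top-k singular values/vectors as the
# "essential signal" of the rating matrix.
import numpy as np

R = np.array([[5., 4., 1.],
              [4., 5., 1.],
              [1., 1., 5.]])
U, s, Vt = np.linalg.svd(R, full_matrices=False)

k = 2
R_approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]  # best rank-k approximation
print(np.round(R_approx, 2))
```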

More …