What's in a Number - Aggregating Item Reviews

When comparing items at sites that gather reviews, a common problem is that the scores are plain averages of the reviews. This doesn’t feel fair when items with only a few, possibly biased, reviews get a very high ranking - app store developers asking their friends to rate their new app upon release being a prime example. This feeling of unfairness has been shown in usability studies as well: an item with a 4.5 average and 12 reviews is generally preferred over an item with a 5.0 average but only 2 reviews.

Another way to think of review scores, rather than as simple averages, is as an expression of the aggregated wisdom of the crowd. From this it follows that, in order not to unfairly assign too much weight to the reviews of items that have few of them, we need to weight the scores based on how many reviews there are.
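One common way to do this is a damped (Bayesian) average, where a prior mean counts as a number of virtual reviews that pull low-volume items towards it. A minimal sketch, where the prior mean m and prior weight C are illustrative assumptions, not values from the post:

```python
# Damped (Bayesian) average: items with few reviews are pulled towards
# the prior mean m; C is how many "virtual" reviews the prior counts as.
def weighted_score(ratings, m=3.5, C=10):
    return (C * m + sum(ratings)) / (C + len(ratings))

print(weighted_score([5, 5]))      # 3.75 - two perfect reviews barely move it
print(weighted_score([4.5] * 12))  # ~4.05 - twelve reviews carry more weight
```

Note how this reproduces the preference from the usability studies above: the item with twelve 4.5 reviews outranks the one with two perfect reviews.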

More …

RNN - Week 3

Sequence-to-sequence models; the prime example is translating sentences from one language to another. These use an encoder network followed by a decoder network, so that you predict the most likely sentence in the output language conditioned on the sentence in the input language.
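As a rough illustration of the encoder/decoder split, here is a minimal sketch in PyTorch; the vocabulary sizes, dimensions and the Seq2Seq class itself are made up for the example rather than taken from the course:

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, hidden=64):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, hidden)
        self.tgt_emb = nn.Embedding(tgt_vocab, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src, tgt):
        # Encode the source sentence into a single context vector.
        _, context = self.encoder(self.src_emb(src))
        # Decode conditioned on the context; tgt is the shifted target
        # sentence (teacher forcing during training).
        states, _ = self.decoder(self.tgt_emb(tgt), context)
        return self.out(states)  # logits over the target vocabulary

model = Seq2Seq(src_vocab=1000, tgt_vocab=1200)
src = torch.randint(0, 1000, (2, 7))  # batch of 2 source sentences, length 7
tgt = torch.randint(0, 1200, (2, 9))  # corresponding target sentences
logits = model(src, tgt)              # shape (2, 9, 1200)
```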

More …

RNN - Week 2

This week covers word embeddings, with the popular algorithms Word2Vec and GloVe. Good basic coverage of the topic, but it didn’t present much that was new to me personally, since I’ve already read the original papers in question and used word embeddings a bit.

In essence:

  1. Learn word embeddings from a big corpus (alternatively get a pre-trained model).
  2. Transfer the embeddings to a new task with a smaller training set.
  3. Optionally fine-tune the embeddings with the new data (see the sketch below).
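A minimal PyTorch sketch of these three steps; the GloVe file path and the tiny vocabulary are made-up examples, not from the course:

```python
import numpy as np
import torch
import torch.nn as nn

# 1. Load pre-trained embeddings (here: GloVe vectors from a text file;
#    the path is a hypothetical example).
glove = {}
with open("glove.6B.100d.txt", encoding="utf-8") as f:
    for line in f:
        word, *vec = line.split()
        glove[word] = np.asarray(vec, dtype="float32")

# 2. Transfer: build an embedding matrix over the new task's smaller vocabulary.
vocab = {"the": 0, "movie": 1, "was": 2, "great": 3}  # toy vocabulary
matrix = np.zeros((len(vocab), 100), dtype="float32")
for word, idx in vocab.items():
    if word in glove:
        matrix[idx] = glove[word]

# 3. Optionally fine-tune: freeze=False lets gradients update the vectors.
embedding = nn.Embedding.from_pretrained(torch.from_numpy(matrix), freeze=False)
```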
More …

Recommendations at Amazon and YouTube

Main takeaways from the papers Two Decades of Recommender Systems at Amazon.com and Deep Neural Networks for YouTube Recommendations, plus a bunch of insightful quotes included below.

  • Recommender systems usually involve a surrogate problem, where you take interest signals and transfer them to a new context. It thus becomes important to have good evaluation metrics, to avoid overfitting the model to the surrogate problem rather than to what you actually want to optimise.
  • There are complex relations between human behaviour, objects, and the objects’ internal relations. User intent, product features (frequently vs. seldom bought items, expensive vs. cheap => window shopping, etc.) and objects with built-in sequential consumption (series) change behaviour radically.
  • (Relative) time is very important in order to make sense of historic data, and for making future predictions based on it.
  • When you have a huge set of items that could be recommended, one way to scope the problem is to have one system that generates a limited set of candidates, and another that then ranks this subset to produce the top recommendations (see the sketch after this list).
  • Despite the promises of deep learning, for recommendations a big part of the work is usually still feature engineering.
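A toy sketch of that two-stage pattern: a cheap model scores the full catalogue to produce candidates, and a heavier model ranks only those. The catalogue and both scoring functions are stand-ins, not the actual models from either paper:

```python
import heapq

catalogue = range(100_000)  # toy item ids

def cheap_score(user, item):
    # Stand-in for a fast approximate scorer (e.g. an embedding dot
    # product served from an approximate nearest-neighbour index).
    return (user * 31 + item * 7) % 1000

def expensive_score(user, item):
    # Stand-in for a heavier ranking model with many more features.
    return (user * 17 + item * 13) % 997

def recommend(user, k=100, n=10):
    # Stage 1: candidate generation over the whole catalogue.
    candidates = heapq.nlargest(k, catalogue, key=lambda i: cheap_score(user, i))
    # Stage 2: rank only the k candidates to get the final top n.
    return heapq.nlargest(n, candidates, key=lambda i: expensive_score(user, i))

print(recommend(user=42))
```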
More …