Collaborative Filtering - UUCF - Week 1

Collaborative filtering ignores the attributes of users and items and instead only looks at the interactions between them. The course begins by looking at user-user collaborative filtering (UUCF).

User-User Collaborative Filtering

Naive approach: predict a user’s score for an item as the average score given by all other users.

A more nuanced approach might weight the scores of other users based on their similarity to you.

One challenge here is that users can use the rating scale very differently. Normalisation techniques can be used to address this. For the naive case:

p(u, i) = \bar{r}_u + \frac{1}{|U|} \sum_{v \in U} (r_{v,i} - \bar{r}_v)

…where \bar{r}_u is the average rating of user u, r_{v,i} is user v’s rating of item i, and U is the set of other users who have rated i.

It’s worth noting that this normalisation differs from how it’s normally done, as we don’t scale by the standard deviation. This is because we usually have limited data and can’t estimate the standard deviation well. In practice, skipping this step usually works better.

Similarly, in the more nuanced case:

p(u, i) = \bar{r}_u + \frac{\sum_{v \in U} w_{uv} (r_{v,i} - \bar{r}_v)}{\sum_{v \in U} |w_{uv}|}

…where w_{uv} is the similarity weight between users u and v.

Still, we need a way to compute the weight. The most common technique is a correlation measure, typically the Pearson correlation, defined below:

w_{uv} = \frac{\sum_{i \in I_u \cap I_v} (r_{u,i} - \bar{r}_u)(r_{v,i} - \bar{r}_v)}{\sqrt{\sum_{i \in I_u \cap I_v} (r_{u,i} - \bar{r}_u)^2} \, \sqrt{\sum_{i \in I_u \cap I_v} (r_{v,i} - \bar{r}_v)^2}}

…where I_u \cap I_v is the set of items rated by both users.

This generally works quite well. In a sparse dataset with small overlap between users, however, it might not yield good results. Similarly, unary or binary datasets need to be handled differently.
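
As a rough sketch of this weight computation, assuming ratings are held as per-user dictionaries mapping item ids to scores (the data layout and the pearson() name are illustrative, not from the course):

```python
import math

def pearson(ratings_u, ratings_v):
    """Pearson correlation between two users, summed over co-rated items,
    with each user mean-centred by their own overall average rating."""
    common = set(ratings_u) & set(ratings_v)
    if len(common) < 2:
        return 0.0  # too little overlap to estimate a correlation
    mean_u = sum(ratings_u.values()) / len(ratings_u)
    mean_v = sum(ratings_v.values()) / len(ratings_v)
    num = sum((ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common)
    den_u = math.sqrt(sum((ratings_u[i] - mean_u) ** 2 for i in common))
    den_v = math.sqrt(sum((ratings_v[i] - mean_v) ** 2 for i in common))
    if den_u == 0 or den_v == 0:
        return 0.0  # one user has no variance over the overlap
    return num / (den_u * den_v)
```

Returning 0 when the overlap is tiny is one crude way of handling the sparse case mentioned above.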

Neighbourhood

In practice, you generally don’t want to compare each user to every other user in the whole space. Instead, there is usually a first filtering step that selects the subset of users you compare against.

You might first filter out people who are too dissimilar to be likely to provide good recommendations for each other (or who are negatively correlated). With sparse clusters, a person from a different cluster is also unlikely to be able to yield good recommendations for you.

This pre-filtering step is also useful for limiting the computation time of the scoring step.

For these cases, the global population U in the formulas is replaced with the filtered neighbourhood N(u).
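
A sketch of such a filtering step, reusing the pearson() helper above and keeping only the top-n positively correlated users (the function name and parameters are illustrative):

```python
def neighbourhood(target, all_ratings, n=30, min_sim=0.0):
    """Select the n users most similar to `target`, dropping anyone whose
    similarity is not above `min_sim` (e.g. negatively correlated users)."""
    scored = []
    for other, ratings in all_ratings.items():
        if other == target:
            continue
        w = pearson(all_ratings[target], ratings)
        if w > min_sim:
            scored.append((other, w))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]
```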

Implementation Issues

Computation can be a bottleneck here, as comparing every pair of users in a large dataset has high complexity. Mitigations include limiting and caching the neighbourhood, and using caching and/or incremental adjustments to the correlations. Still, for large datasets computation remains a challenge.
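
One simple way to pay that cost up front is to precompute and cache the pairwise similarities, again reusing the pearson() sketch above; a real system would also update this cache incrementally as ratings change rather than rebuilding it:

```python
from itertools import combinations

def precompute_similarities(all_ratings):
    """Precompute every pairwise similarity once, so that neighbourhood
    selection at request time is a lookup rather than a recomputation."""
    cache = {}
    for u, v in combinations(all_ratings, 2):
        w = pearson(all_ratings[u], all_ratings[v])
        cache[(u, v)] = cache[(v, u)] = w
    return cache
```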

Core Assumptions/Limitations

  1. Our core assumption is that agreement in the past predicts future agreement. This is generally true if our preferences stay the same, or develop in the same way. In some cases, this might not hold.
  2. The items being recommended lie within a domain of consistent agreement. This might not hold for very diverse items: just because you share someone’s sense of humour, you might not share their political views, etc… For retailers, finding these domains is a challenge.

Configuring UUCF

Selecting Neighbourhoods

  • All neighbours
  • Threshold similarity or distance
  • Random neighbours
  • Top-N neighbours by similarity or distance
  • Neighbours in a cluster

How many neighbours?

In theory, the more the better. In practice, noise from dissimilar users often worsens predictions. Commonly, somewhere between 25 and 100 neighbours are used.

Scoring from Neighbourhoods

  • Average
  • Weighted average
  • Multiple linear regressions

Weighted averages are simple but generally work well.
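
A sketch of the weighted-average scorer, over the same per-user dictionaries and the (user, weight) pairs returned by the neighbourhood() sketch above (the predict() name is illustrative):

```python
def predict(target, item, all_ratings, neighbours):
    """Predict `target`'s rating for `item`: the target's mean rating plus the
    similarity-weighted average of the neighbours' mean-centred ratings."""
    mean_t = sum(all_ratings[target].values()) / len(all_ratings[target])
    num, den = 0.0, 0.0
    for other, w in neighbours:
        ratings = all_ratings[other]
        if item not in ratings:
            continue  # this neighbour has no opinion on the item
        mean_o = sum(ratings.values()) / len(ratings)
        num += w * (ratings[item] - mean_o)
        den += abs(w)
    if den == 0.0:
        return mean_t  # no neighbour rated the item; fall back to the user's mean
    return mean_t + num / den
```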

Challenges

  • Users rate differently
  • Some rate high, others low
  • Some use more of the scale
  • Averaging ignores these differences
  • Normalisation compensates for them

Normalizing Data

  • Subtract user mean ratings
  • Convert to z-score (1 = 1 standard deviation above mean)
  • Subtract item or item-user mean

The normalisation must be reversed after computing predictions, to get scores back onto the original rating scale.
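
A sketch of the two per-user normalisations and of reversing them (function names are illustrative; whether to use the sample or population standard deviation is a detail the notes don’t pin down):

```python
import statistics

def mean_centre(ratings):
    """Subtract the user's mean rating from each of their ratings."""
    mean = statistics.mean(ratings.values())
    return {item: r - mean for item, r in ratings.items()}

def to_z_scores(ratings):
    """Convert a user's ratings to z-scores: (rating - mean) / std."""
    mean = statistics.mean(ratings.values())
    std = statistics.pstdev(ratings.values()) or 1.0  # guard against zero variance
    return {item: (r - mean) / std for item, r in ratings.items()}

def from_z_score(score, ratings):
    """Reverse the z-score normalisation to get back onto the rating scale."""
    mean = statistics.mean(ratings.values())
    std = statistics.pstdev(ratings.values()) or 1.0
    return mean + score * std
```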

Computing Similarities

  • Pearson correlation
  • Usually only over ratings that users have in common
  • User normalization not needed
  • Spearman rank correlation is Pearson applied to ranks
    • Hasn’t been found to work as well

What if users have few ratings in common? In the correlation formula, you can sum over only the common items in the numerator, but use all items each user has rated in the denominator. For users with very little overlap, the denominator will then be much larger than the numerator, giving much smaller weights. This is equivalent to cosine similarity over mean-centred user rating vectors.
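
A sketch of that variant: cosine similarity over mean-centred rating vectors, where the numerator can only pick up co-rated items but each norm runs over everything that user has rated (the centred_cosine() name is illustrative):

```python
import math

def centred_cosine(ratings_u, ratings_v):
    """Cosine similarity over mean-centred rating vectors. Users with little
    overlap get norms much larger than the numerator, hence small weights."""
    mean_u = sum(ratings_u.values()) / len(ratings_u)
    mean_v = sum(ratings_v.values()) / len(ratings_v)
    common = set(ratings_u) & set(ratings_v)
    num = sum((ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common)
    norm_u = math.sqrt(sum((r - mean_u) ** 2 for r in ratings_u.values()))
    norm_v = math.sqrt(sum((r - mean_v) ** 2 for r in ratings_v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return num / (norm_u * norm_v)
```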

Good Baseline Configuration

  • Top N neighbours (~30)
  • Weighted averaging
  • User-mean or z-score normalisation
  • Cosine similarity over normalised ratings
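
Tying the sketches above together on toy data (the user and item ids are made up; swapping pearson() for centred_cosine() inside the neighbourhood() sketch gives the cosine-over-normalised-ratings flavour of this baseline):

```python
# Toy ratings: per-user dictionaries of item id -> score.
ratings = {
    "alice": {"m1": 5.0, "m2": 3.0, "m3": 4.0, "m5": 4.0},
    "bob":   {"m1": 4.0, "m2": 2.0, "m4": 5.0, "m5": 3.0},
    "carol": {"m2": 4.0, "m3": 5.0, "m4": 3.0, "m5": 5.0},
}

# Build alice's neighbourhood once, then score an item she hasn't rated.
neighbours = neighbourhood("alice", ratings, n=30)
print(predict("alice", "m4", ratings, neighbours))
```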

Influence Limiting and Attack Resistance

Generally, recommender systems are targets for gaming. People might create fake accounts to post more positive ratings for items they want to push up. You often want some fraud detection in place to detect such users and filter out their impact on the system.

Trust-Based Recommendations

Trust is often modelled with network graphs, where the trust that the people you trust place in other people determines those people’s trust scores for you.

An example of where this is useful is a context where a lot of opinions are of less value than a few that the person cares strongly about, in which case naive collaborative filtering might not yield very good results. In such a case, rather than looking for the people most similar to you overall, looking for trustworthy people who are likely to agree with you within a specific domain of importance to you might yield better results.

One of the drawbacks is that it’s often not straightforward to model trust implicitly, unless users in a network are willing to provide explicit trust ratings.
