Freddie Karlbom · Random Tech Ramblings

Evaluating Recommendations

16 Jun 2018

Putting some thoughts into words while trying to wrap my head around how to measure whether a recommendation is good or bad, and how to formulate the problem so that models can be trained.

As food for thought, I’m reading through Common pitfalls in training and evaluating recommender systems, which is where the block quotes are taken from. It kinda leads me off in a different direction and makes the post something of a stream of consciousness.

More …

Downloading Kaggle Datasets to SageMaker

10 Jun 2018

Generate an API key in the Kaggle UI and upload it to your root folder via the web interface.
Accept the competition rules in the Kaggle UI if you haven’t already.
Start a notebook. Execute the following commands in a notebook cell, replacing the competition name with the competition you are interested in:

%%bash
pip install kaggle

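# Move API key to where Kaggle expects it (~/.kaggle/kaggle.json)
mkdir -p /home/ec2-user/.kaggle
mv /home/ec2-user/SageMaker/kaggle.json /home/ec2-user/.kaggle/kaggle.json
# Make the key readable only by the current user
chmod 600 /home/ec2-user/.kaggle/kaggle.json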

# Download datasets, optionally specify destination folder using --path
kaggle competitions download -c planet-understanding-the-amazon-from-space

Realise that the instance you started doesn’t have enough space for the datasets. Oh, crap. Anyhow, if you were more forward-thinking than me, you should be good to go.
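To avoid that last step, it can help to check free disk space before starting the download. Below is a minimal sketch using Python's standard library. The `has_enough_space` helper and the 40 GiB figure are assumptions for illustration only; check the competition's data page for the real size.

```python
import shutil


def has_enough_space(path, required_bytes):
    """Return True if the filesystem holding `path` has at least `required_bytes` free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_bytes


# Hypothetical requirement of 40 GiB; replace with the actual dataset size.
required = 40 * 1024**3

# On a SageMaker notebook instance you would pass "/home/ec2-user/SageMaker";
# "." keeps the sketch runnable anywhere.
print(has_enough_space(".", required))
```

If this prints `False`, the instance needs a larger volume before the `kaggle competitions download` step will succeed.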

AWS Summit Berlin 2018

06 Jun 2018

A few quick notes from AWS Summit Berlin of things that I found interesting.

More …

Food Recommendations at Delivery Hero

05 Jun 2018

Notes from an excellent talk by Gugulethu Ncube from Delivery Hero at the Berlin RecSys Meetup tonight on how they work with food recommendations. Rephrased in my words and with my thoughts interspersed, so anything crazy-sounding should in all likelihood be attributed to me.

Goals for recommendations:

Provide users with new restaurants to order from, as ordering from multiple restaurants makes customers more loyal.

Collaborative vs. Content-based:

“Collaborative filtering produces very strange results”

Seems to end up giving quite uninteresting recommendations, such as very popular chains.
The reason is the extreme sparsity of the data, and a distribution that is closely tied to the geographical location of the restaurants.
This also explains why chains end up at the top, as they have so much more data from their different locations.
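To make the sparsity and chain effect concrete, here is a toy sketch. The data is entirely made up and is not Delivery Hero's data or method. It assumes a chain with outlets in two cities, while every other restaurant serves only one city.

```python
from collections import Counter

# Made-up orders as (user, restaurant) pairs. "chain" stands for a chain with
# outlets in both cities; every other restaurant has a single location.
orders = [
    ("u1", "chain"), ("u1", "berlin_thai"),
    ("u2", "chain"), ("u2", "berlin_pizza"),
    ("u3", "chain"), ("u3", "hamburg_sushi"),
    ("u4", "chain"), ("u4", "hamburg_curry"),
]

users = {u for u, _ in orders}
restaurants = {r for _, r in orders}

# Sparsity: share of user-restaurant cells with no observed order.
observed = len(set(orders))
sparsity = 1 - observed / (len(users) * len(restaurants))

# Popularity: number of distinct users per restaurant.
popularity = Counter(r for _, r in set(orders))
top_restaurant, top_count = popularity.most_common(1)[0]

print(f"sparsity: {sparsity:.2f}")
print(f"most popular: {top_restaurant} ({top_count} users)")
```

Even in this tiny example most user-restaurant pairs are unobserved, and the chain is the only restaurant that overlaps across users in different cities, so a purely collaborative signal has little else to latch onto.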

FoodRank - Content based

In a nutshell: ranking how good restaurants are at your favourite food.

More …

FastAI - Notes - Deep Learning week 1-2

02 Jun 2018

Some interesting tricks not mentioned in the deeplearning.ai course (based on recent papers) and, at the time of recording at least, apparently only implemented in the fastai library that sits on top of PyTorch.

More …