Jason Ball's TechBytes

Technology & Venture Capital. Early stage venture capital news mixed with personal views and comments

Feeds 2.0 and the Netflix Prize

I’ve blogged about my friends at Feeds 2.0 many times over the past few years, and I’ve followed the Netflix Prize with interest since it was covered in Wired back in 2007, mainly because I was rooting for the Feeds 2.0 team…

For those of you who may not spend your weekends and evenings reading up on artificial intelligence, here’s an overview of the Netflix competition:

Netflix released a large movie rating dataset and challenged the data mining, machine learning and computer science communities to develop systems that could beat the accuracy of their in-house recommendation system (Cinematch) by 10%. To make the challenge more interesting, the company will award a Grand Prize of $1M to the first team that attains this goal, and in addition, Progress Prizes of $50K have been awarded on the anniversaries of the Prize to teams that have made sufficient accuracy improvements. Apart from the financial incentive, however, the Netflix Prize contest is enormously useful for recommender system research, since the released Netflix dataset is by far the largest ratings dataset ever made available to the research community. Until now, most work on recommender systems outside of companies like Amazon or Netflix has had to make do with the relatively small 1M-rating MovieLens dataset or the 3M-rating EachMovie dataset.

Netflix provided 100,480,507 ratings (on a scale from 1 to 5 integral stars), along with their dates, from 480,189 randomly chosen, anonymous subscribers on 17,770 movie titles. The data were collected between October 1998 and December 2005 and reflect the distribution of all ratings received by Netflix during this period. Netflix withheld over 2M of the most recent ratings from those same subscribers over the same set of movies as a competition qualifying set, and contestants are required to make predictions for all 2M withheld ratings in the qualifying set.
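For anyone wondering what "beat the accuracy by 10%" means in practice: the contest scored entries by root mean squared error (RMSE) between predicted and actual star ratings, so a 10% improvement means an RMSE 10% lower than Cinematch's on the same withheld ratings. Here is a rough sketch of that arithmetic in Python. The ratings below are made up for illustration (they are not from the Netflix dataset), and the function names `rmse` and `improvement_pct` are my own, not anything from the contest.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists of ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(actual))


def improvement_pct(baseline_rmse, candidate_rmse):
    """Percentage by which the candidate's RMSE is lower than the baseline's."""
    return 100.0 * (baseline_rmse - candidate_rmse) / baseline_rmse


# Toy data, invented for this example: true star ratings (1 to 5) and
# two sets of predictions for the same five (user, movie) pairs.
actual = [4, 3, 5, 1, 2]
baseline = [3.5, 3.5, 4.0, 2.0, 3.0]   # stand-in for the in-house system
candidate = [3.8, 3.2, 4.5, 1.5, 2.4]  # stand-in for a contestant's entry

baseline_score = rmse(baseline, actual)
candidate_score = rmse(candidate, actual)
gain = improvement_pct(baseline_score, candidate_score)

print(f"baseline RMSE:  {baseline_score:.4f}")
print(f"candidate RMSE: {candidate_score:.4f}")
print(f"improvement:    {gain:.1f}%")
print("meets the 10% target" if gain >= 10.0 else "below the 10% target")
```

On a five-rating toy set a big gain is easy; the hard part of the real contest was squeezing out those last fractions of a percent across millions of withheld ratings.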

I was thrilled when I saw the final Leader Board: The Ensemble (Feeds 2.0 + others) was at the top.

The final winner will not be announced until next month; Netflix still has to decide which of the leading algorithms performs best and how they score on various tests…

So, as the latest Wired article says, it ain’t over til it’s over.

Good luck Nicholas!

Filed under: Artificial Intelligence
