I first learnt about the concept of using “the crowd” to analyse sets of data in the book Wikinomics, which tells the story of the “Goldcorp Challenge”: a struggling Canadian gold miner published its geological data online and offered $575,000 in prize money for the answer to the question “Where is the gold?” Over 1,000 entries led to the discovery of masses of gold, solving a problem that the in-house analysts couldn’t solve.
Kaggle, founded in April 2010, provides a platform for data analytics competitions.
Companies and researchers post their data. Statisticians and data miners from all over the world compete to produce the best models. This crowdsourcing approach relies on the fact that there are countless strategies that can be applied to any predictive modelling task, and it is impossible to know at the outset which technique or analyst will be most effective.
Entries are reviewed and ranked, with results published on a leaderboard.
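To make the ranking idea concrete, here is a minimal sketch of how a leaderboard could be built. It is an illustration only: the scoring metric (root mean squared error against a hidden answer set), the function names `rmse` and `leaderboard`, and the team names and numbers are all my assumptions, not a description of how Kaggle actually scores entries.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists (lower is better)."""
    if len(predicted) != len(actual):
        raise ValueError("prediction and answer lists must be the same length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))

def leaderboard(submissions, truth):
    """Score every entry against the hidden answers and sort best-first."""
    scores = [(name, rmse(preds, truth)) for name, preds in submissions.items()]
    return sorted(scores, key=lambda item: item[1])

# Hypothetical hidden answers held back by the competition host.
truth = [1.0, 0.0, 1.0, 1.0]

# Hypothetical entries from three made-up teams.
submissions = {
    "team_a": [0.9, 0.2, 0.8, 0.7],
    "team_b": [0.5, 0.5, 0.5, 0.5],
    "team_c": [1.0, 1.0, 1.0, 0.0],
}

board = leaderboard(submissions, truth)
for rank, (name, score) in enumerate(board, start=1):
    print(f"{rank}. {name}: {score:.4f}")
```

The point of holding the answers back is that nobody can tell in advance which approach will come out on top; the ranking only emerges once every entry is scored the same way.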
At the end of a competition, the competition host pays prize money in exchange for the intellectual property behind the winning model.
Kaggle competitions have achieved extraordinary results and draw on data scientists from over 100 countries and 200 universities to solve real-world problems using real-world data.
Brilliant, brilliant stuff.
At the time of writing, the top prize money is US$3 million for the Heritage Health Prize.
The goal of the prize is to develop a predictive algorithm that can identify patients who will be admitted to a hospital within the next year, using historical claims data.
Check it all out yourself at www.kaggle.com or watch the excellent clip below from ABC Australia’s Catalyst Programme.
I have long known the power of mathematical modelling. After all, AUFC Club Legend – Bob Neil – a mathematician who has worked in the Defence sector, has long been charged with “solving the simultaneous equations that defend the nation”.