DataScience SG Meetup - How we got top 3% in Kaggle


One Saturday afternoon, I volunteered to share my recent experience in Kaggle’s Otto competition, where my teammate Weimin and I placed 85th out of 3,514.

Given that it was a lazy Saturday afternoon, I did not expect the lecture room at SMU to be fully packed. The data science meetup scene in Singapore was more vibrant than I had thought.


In approximately 45 minutes, we shared our approach and had an in-depth discussion with the audience on the topics below:

  • The evaluation metric (multi-class log loss)
  • Validation approaches
  • Feature engineering and selection
  • Feature transformation (e.g., standardization, log-transformation, tf-idf)
  • Creating aggregate and t-SNE features
  • Machine learning techniques (trees and neural nets)
  • Ensembling techniques
  • Top solutions and architectures
  • A suggested framework for Kaggle competitions
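
For readers unfamiliar with the first topic, here is a minimal sketch of multi-class log loss in NumPy. It is an illustration, not the code from the talk: the function name `multiclass_log_loss`, the clipping constant `eps`, and the nine-class toy data are assumptions made for this example.

```python
import numpy as np

def multiclass_log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    y_true: integer class labels, shape (n_samples,)
    y_prob: predicted probabilities, shape (n_samples, n_classes)
    eps:    clipping constant (an assumption) to avoid log(0)
    """
    y_true = np.asarray(y_true)
    # Clip so that a confident wrong prediction gives a large but finite loss
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1 - eps)
    # Renormalize each row so the probabilities sum to 1 again
    y_prob = y_prob / y_prob.sum(axis=1, keepdims=True)
    n = y_true.shape[0]
    # Pick the predicted probability of the true class for every row
    true_class_prob = y_prob[np.arange(n), y_true]
    return -np.mean(np.log(true_class_prob))

# Toy example: three samples, nine classes, uniform predictions
labels = [0, 3, 8]
uniform = np.full((3, 9), 1 / 9)
print(multiclass_log_loss(labels, uniform))
```

Uniform predictions over K classes score log(K), which makes a handy baseline to beat. Because the metric punishes confident mistakes heavily, well-calibrated probabilities matter as much as accuracy.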

More details can be found in the slides below.

Questions? Want to follow my journey? Reach out on Twitter @eugeneyan!
