DataScience SG Meetup - How we got top 3% in Kaggle

[ machinelearning ] · 1 min read

One Saturday afternoon, I volunteered to share my recent effort in Kaggle's Otto competition, where my teammate Weimin and I placed 85th out of 3,514 teams.

Given that it was a lazy Saturday afternoon, I did not expect the lecture room at SMU to be fully packed. The data science meetup scene in Singapore was more vibrant than I had thought.

In approximately 45 minutes, we shared our approach and had an in-depth discussion with the audience on the topics below:

  • The evaluation metric, multi-class log loss (see the first sketch after this list)
  • Validation approaches
  • Feature engineering and selection
  • Feature transformation, e.g., standardization, log-transformation, tf-idf (second sketch below)
  • Creating aggregate and t-SNE features (third sketch below)
  • Machine learning techniques (trees and neural nets)
  • Ensembling techniques (fourth sketch below)
  • Top solutions and architectures
  • A suggested framework for Kaggle competitions
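
For context, multi-class log loss averages the negative log of the probability a model assigns to each sample's true class, so confident wrong predictions are punished heavily. A minimal NumPy sketch, with the clipping constant my own illustrative choice rather than Kaggle's exact implementation:

```python
import numpy as np

def multiclass_log_loss(y_true, y_pred, eps=1e-15):
    """Multi-class log loss: mean negative log of the probability
    predicted for each sample's true class.

    y_true: (n,) integer labels in 0..K-1
    y_pred: (n, K) predicted class probabilities
    """
    p = np.clip(y_pred, eps, 1 - eps)      # guard against log(0)
    p = p / p.sum(axis=1, keepdims=True)   # re-normalize rows after clipping
    rows = np.arange(len(y_true))
    return -np.mean(np.log(p[rows, y_true]))
```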
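
The feature transformations above are standard scikit-learn fare. A sketch assuming an Otto-style matrix of non-negative count features (the random stand-in data and variable names are mine):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.feature_extraction.text import TfidfTransformer

# Stand-in for the Otto count features: non-negative integer counts
X = np.random.randint(0, 10, size=(100, 93))

X_log = np.log1p(X)                            # log-transformation: log(1 + x)
X_std = StandardScaler().fit_transform(X_log)  # standardize to zero mean, unit variance
X_tfidf = TfidfTransformer().fit_transform(X)  # treat counts as term frequencies
```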
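
For aggregate and t-SNE features, one common pattern is to append row-level summaries and a low-dimensional embedding as extra columns. A sketch using scikit-learn's TSNE (the aggregate choices here are illustrative; in practice t-SNE on a full training set is slow, and faster implementations are often preferred):

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in for the Otto count features
X = np.random.randint(0, 10, size=(500, 93)).astype(float)

# Simple aggregate features: row totals and counts of non-zero features
aggregates = np.column_stack([X.sum(axis=1), (X > 0).sum(axis=1)])

# 2-D t-SNE embedding appended as two extra features
embedding = TSNE(n_components=2, random_state=42).fit_transform(X)

X_augmented = np.hstack([X, aggregates, embedding])
```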
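
And for ensembling, a common baseline is a weighted average of each model's class-probability matrix, with weights tuned on a validation set. A minimal sketch (the model names and weights in the usage comment are placeholders, not the ones we used):

```python
import numpy as np

def blend(prob_matrices, weights):
    """Weighted arithmetic mean of (n_samples, n_classes) probability matrices."""
    w = np.asarray(weights, dtype=float)
    w /= w.sum()  # normalize weights to sum to 1
    blended = sum(wi * p for wi, p in zip(w, prob_matrices))
    return blended / blended.sum(axis=1, keepdims=True)  # rows stay valid distributions

# e.g., blend predictions from gradient-boosted trees and a neural net:
# blended = blend([preds_trees, preds_nn], weights=[0.6, 0.4])
```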

More details can be found in the slides below.

Questions? Want to follow my journey? Reach out on Twitter @eugeneyan!


If you found this useful, please cite this write-up as:

Yan, Ziyou. (Jun 2015). DataScience SG Meetup - How we got top 3% in Kaggle. eugeneyan.com. https://eugeneyan.com/speaking/dssg-kaggle-top-3-percent-talk/.

or

@article{yan2015kaggle,
  title   = {DataScience SG Meetup - How we got top 3% in Kaggle},
  author  = {Yan, Ziyou},
  journal = {eugeneyan.com},
  year    = {2015},
  month   = {Jun},
  url     = {https://eugeneyan.com/speaking/dssg-kaggle-top-3-percent-talk/}
}
