DataScience SG Meetup - How we got top 3% in Kaggle

[ machinelearning ] · 1 min read

One Saturday afternoon, I volunteered to share my recent effort in Kaggle's Otto competition, where my teammate Weimin and I placed 85th out of 3,514 teams.

Given that it was a lazy Saturday afternoon, I did not expect the lecture room at SMU to be fully packed. The data science meetup scene in Singapore was more vibrant than I had thought.

In approximately 45 minutes, we shared our approach and had an in-depth discussion with the audience on the topics below:

  • The evaluation metric, multi-class log loss (see the first sketch after this list)
  • Validation approaches
  • Feature engineering and selection
  • Feature transformation, e.g., standardization, log-transformation, tf-idf (second sketch below)
  • Creating aggregate and t-SNE features (third sketch below)
  • Machine learning techniques (trees and neural nets)
  • Ensembling techniques (fourth sketch below)
  • Top solutions and architectures
  • A suggested framework for Kaggle competitions
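
For context, multi-class log loss averages the negative log of the probability a model assigns to each sample's true class, so confident wrong predictions are punished heavily. A minimal NumPy sketch, with the clipping constant my own illustrative choice rather than Kaggle's exact implementation:

```python
import numpy as np

def multiclass_log_loss(y_true, y_pred, eps=1e-15):
    """Multi-class log loss: mean negative log of the probability
    predicted for each sample's true class.

    y_true: (n,) integer labels in 0..K-1
    y_pred: (n, K) predicted class probabilities
    """
    p = np.clip(y_pred, eps, 1 - eps)      # guard against log(0)
    p = p / p.sum(axis=1, keepdims=True)   # re-normalize rows after clipping
    rows = np.arange(len(y_true))
    return -np.mean(np.log(p[rows, y_true]))
```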
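
The feature transformations above are standard scikit-learn fare. A sketch assuming an Otto-style matrix of non-negative count features (the random stand-in data and variable names are mine):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.feature_extraction.text import TfidfTransformer

# Stand-in for the Otto count features: non-negative integer counts
X = np.random.randint(0, 10, size=(100, 93))

X_log = np.log1p(X)                            # log-transformation: log(1 + x)
X_std = StandardScaler().fit_transform(X_log)  # standardize to zero mean, unit variance
X_tfidf = TfidfTransformer().fit_transform(X)  # treat counts as term frequencies
```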
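
For aggregate and t-SNE features, one common pattern is to append row-level summaries and a low-dimensional embedding as extra columns. A sketch using scikit-learn's TSNE (the aggregate choices here are illustrative; in practice t-SNE on a full training set is slow, and faster implementations are often preferred):

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in for the Otto count features
X = np.random.randint(0, 10, size=(500, 93)).astype(float)

# Simple aggregate features: row totals and counts of non-zero features
aggregates = np.column_stack([X.sum(axis=1), (X > 0).sum(axis=1)])

# 2-D t-SNE embedding appended as two extra features
embedding = TSNE(n_components=2, random_state=42).fit_transform(X)

X_augmented = np.hstack([X, aggregates, embedding])
```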
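
And for ensembling, a common baseline is a weighted average of each model's class-probability matrix, with weights tuned on a validation set. A minimal sketch (the model names and weights in the usage comment are placeholders, not the ones we used):

```python
import numpy as np

def blend(prob_matrices, weights):
    """Weighted arithmetic mean of (n_samples, n_classes) probability matrices."""
    w = np.asarray(weights, dtype=float)
    w /= w.sum()  # normalize weights to sum to 1
    blended = sum(wi * p for wi, p in zip(w, prob_matrices))
    return blended / blended.sum(axis=1, keepdims=True)  # rows stay valid distributions

# e.g., blend predictions from gradient-boosted trees and a neural net:
# blended = blend([preds_trees, preds_nn], weights=[0.6, 0.4])
```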

More details can be found in the slides below.

Questions? Want to follow my journey? Reach out on Twitter @eugeneyan!


If you found this useful, please cite this write-up as:

Yan, Ziyou. (Jun 2015). DataScience SG Meetup - How we got top 3% in Kaggle. eugeneyan.com. https://eugeneyan.com/speaking/dssg-kaggle-top-3-percent-talk/.

or

@article{yan2015kaggle,
  title   = {DataScience SG Meetup - How we got top 3% in Kaggle},
  author  = {Yan, Ziyou},
  journal = {eugeneyan.com},
  year    = {2015},
  month   = {Jun},
  url     = {https://eugeneyan.com/speaking/dssg-kaggle-top-3-percent-talk/}
}
