Why did I start writing? What's my writing process? What's the writing culture at Amazon like?
Why real-time? How have China and US companies built them? How to design and build an MVP?
A public roadmap to track and share my progress; nothing mission or work-related.
Wrapping up 2020 with writing and site statistics, graphs, and a word cloud.
"When can I get my own daggers?", he asked. "Catch the daggers I throw and they are yours", she replied.
Time to clear the cache, evaluate existing processes, and start new threads.
Learn how he switched from engineering to data science, what "senior" means, and how writing helps him.
Data cleaning, transfer learning, overfitting, ensembling, and more.
Interview questions you should ask and how to evolve your job scope.
A personal take on their deliverables and skills, and what it means for the industry and your team.
Chip shares openly about the setbacks she faced, overcoming them, and how writing changed her life.
What questions do they answer? How do they compare? What open-source solutions are available?
DNS server snafus leading to missing email and security issues. Also, limited free build minutes monthly.
21 Oct 2020  ·  3 min  ·  misc
Instead of "How to build a data science portfolio", we'll discuss the "Whys" and "Whats" around a portfolio.
Step-by-step walkthrough on the environment, compilers, and installation for ScaNN.
Building prototypes helped get buy-in on data science efforts when roadmaps & design docs failed.
As our careers grow, how does the balance between writing & coding change? Hear from 4 tech leaders.
Emphasis on bias, more sequential models & bandits, robust offline evaluation, and recsys in the wild.
What if the alternative was nothingness?
26 Sep 2020  ·  1 min  ·  life
What's an average day like? What's great about the role? How's working in Amazon?
For years I've refined my routines and found tools to manage my time. Here I share them with readers.
My tools for organization and creation, autopilot routines, and Maker's schedule.
A step-by-step guide on how to migrate from JSON comments to Utterances.
Checking for correct implementation, expected learned behaviour, and satisfactory performance.
My chat with James Le about my experience, leadership, agile, ML in production, writing, and more.
Why read papers, what papers to read, and how to read them.
How not to become an expert beginner and to progress through beginner, intermediate, and so on.
Examining the broad strokes of NLP progress and comparing between models.
Why (and why not) be more end-to-end, how to, and Stitch Fix and Netflix's experience.
Updating our FastAPI app to let users select options and download results.
Surprising lessons I picked up from the best books, essays, and videos on writing non-fiction.
Why OMSCS? How can I get accepted? How much time is needed? Did it help your career? And more...
I couldn't find any guides on serving HTML with FastAPI, thus I wrote this to plug the hole on the internet.
Ever revisit a project & replicate the results the first time round? Me neither. Thus I adopted these habits.
It's not enough to have a good strategy and plan. Execution is just as important.
I wanted to add my recent writing to my GitHub Profile README but was too lazy to do manual updates.
I thought giving it my all led to maximum outcomes; then I learnt about the 85% rule.
Part II of the previous write-up, this time on applications and frameworks of Spark in production.
Sharing my notes & practical knowledge from the conference for people who don't have the time.
After this article, we'll have a workflow of tests and checks that run automatically with each git push.
A curious discussion made me realize my expert blind spot. And no, Airflow is not late.
Haste makes waste. Diving into a data science problem may not be the fastest route to getting it done.
Initially, I didn't like it. But over time, it grew on me. Here's why.
Crocker's Law, cognitive dissonance, and how to receive (uncomfortable) feedback better.
Can maintaining machine learning in production be easier? I go through some practical tips.
I thought deploying machine learning was hard. Then I had to maintain multiple systems in prod.
An expansion of my Twitter thread that went viral.
09 May 2020  ·  4 min  ·  writing
What I learnt about evaluating ideas from first-hand participation in a hackathon.
What I learned about measuring diversity, novelty, surprise, and serendipity from 10+ papers.
Why you should give a talk and some tips from five years of speaking and hosting meet-ups.
Should I join a start-up? Which offer should I accept? A simple metaphor to guide your decisions.
12 Apr 2020  ·  6 min  ·  career
Using a Zettelkasten helps you make connections between notes, improving learning and memory.
Writing begins before actually writing; it's a cycle of reading -> note-taking -> writing.
Automate your experimentation workflow to minimize effort and iterate faster.
How hard work, many failures, and a bit of luck got me into the field and up the ladder.
Comparing baselines (matrix factorization) against novel approaches using graphs & NLP.
Beating the baseline using Graph & NLP techniques on PyTorch, AUC improvement of ~21% (Part 2 of 2).
Building a baseline recsys based on data scraped off Amazon. Warning - Lots of charts! (Part 1 of 2).
OMSCS CS6200 (Introduction to OS) - Moving data from one process to another, multi-threaded.
In-depth sharing on how to put machine learning systems into production.
Keynote on how Asia's tech giants scale and their SuperApp strategy.
OMSCS CS6750 (Human Computer Interaction) - You are not your user! Or how to build great products.
Moving off WordPress and hosting for free on GitHub. And gaining full customization!
25 Aug 2019  ·  1 min  ·  misc
OMSCS CS6440 (Intro to Health Informatics) - A primer on key tech and standards in healthtech.
OMSCS CS7646 (Machine Learning for Trading) - Don't sell your house to trade algorithmically.
No, you don't need a PhD or 10+ years of experience.
How we built an ML system to predict hospitalization costs at admission; sharing at DATAx Conference.
Taking the best from agile and modifying it to fit the data science process (Part 2 of 2).
A deeper look into the strengths and weaknesses of Agile in Data Science projects (Part 1 of 2).
What's the difference between a data scientist, data engineer, and ML engineer? A panel at Google.
OMSCS CS6601 (Artificial Intelligence) - First, start with the simplest solution, and then add intelligence.
Yes, Agile can be adopted by data science teams. Moderating a panel at GovTech STACK.
OMSCS CS6460 (Education Technology) - How to scale education widely through technology.
OMSCS CS7642 (Reinforcement Learning) - Landing rockets (fun!) via deep Q-Learning (and its variants).
Technical challenges are easy compared to business and people issues. Sharing at the BDA Summit.
Culture >> Hierarchy, Process, Bureaucracy.
And my idiosyncratic journey to VP of Data Science at Lazada (Alibaba). A Lunchtime chat at INSEAD.
OMSCS CS7641 (Machine Learning) - Revisiting the fundamentals and learning new techniques.
How being a Lead / Manager is different from being an individual contributor.
What is data science, how to pick it up, and how to enter the field? A discussion with SMU undergrads.
OMSCS CS6300 (Software Development Process) - Java and collaboratively developing an Android app.
Sharing about why data science, data science myths, a typical day, and more with TIA.
Tools and skills to pick up and how to practice them. An Invited Talk with Masters in IT candidates.
Tools and skills to pick up, and how to practice them.
OMSCS CS6476 (Computer Vision) - Performing computer vision tasks with ONLY NumPy.
If things are not failing, you're not innovating enough. - Elon Musk
Or how to put machine learning models into production.
A web app to find similar products based on image.
Cleaning up text and messing with ASCII (urgh!).
How Lazada ranks products to improve customer experience and conversion at Strata 2016.
A simple web app to classify fashion images into Amazon categories.
Got accepted into Georgia Tech's Computer Science Masters!
A card sorting game to discover your passion by identifying skills you like and dislike.
23 Oct 2016  ·  3 min  ·  misc
Parsing JSON and formatting product titles and categories.
Learning Scala from Martin Odersky, father of Scala.
31 Jul 2016  ·  3 min  ·  learning
Guest post on how DataKind SG worked with NGOs to frame their problems and suggest solutions.
17 Sep 2015  ·  8 min  ·  datascience
Sharing about my first data science competition at DataScience SG.
20 Jun 2015  ·  1 min  ·  machinelearning