Hi, I'm Eugene Yan. I design, build, and operate machine learning systems that serve customers at scale. I also write and speak about ML, RecSys, LLMs, and engineering.
I'm currently a Senior Applied Scientist at Amazon where I focus on helping customers read more. Here, I built systems including real-time retrieval, bandit-based ranking, and recsys in search (see RecSys 2022 keynote). More recently, I'm exploring how LLMs can help us serve customers better.
Previously, I led machine learning at Lazada (acquired by Alibaba in 2016). Here, I built the product ranking system (conversion & revenue up 5-20%), smart push-notifications (CTR & add-to-cart up 10%), and automated product & review classification (cost down 90%). Hypergrowth's fun!
As an early hire in a healthtech Series A, I led the team to ship an ML system for Southeast Asia's largest healthcare provider. At IBM, I built job forecasts & recommendations.
Outside of work, I share the ghost knowledge of applying ML via curated papers, guides, and practitioner interviews, and host a monthly meetup on ML in industry. I also angel invest in early-stage data/ML, infra, and devtool startups.
2020-now: Senior Applied Scientist @ Amazon (ML, RecSys, LLMs)
2018-2019: ML Lead @ Healthtech Series A (Disease Detection, Cost Estimation)
2017-2018: VP, Data Science @ Lazada (Alibaba) (E-Commerce ML Systems)
2015-2017: Data Scientist @ Lazada (Alibaba) (Ranking, ML Automation, MLOps)
2013-2015: Data Scientist @ IBM (Workforce Analytics, Fraud Detection)
AIterate Labs • How I can help • README • I read everything but receive too much to respond to all of it.
Eugene Yan designs, builds, and operates machine learning systems that serve customers at scale. He's currently a Senior Applied Scientist at Amazon. Previously, he led machine learning at Lazada (acquired by Alibaba) and a Healthtech Series A. He writes & speaks about ML, RecSys, LLMs, and engineering at eugeneyan.com and ApplyingML.com.
Images: 960 x 960, 738 x 738, 200 x 200
I enjoy sharing my experiences and what I’ve learned, so others can avoid my mistakes and build on my lessons. I’ve been fortunate to work in awesome teams to solve challenging data & ML problems and ship to customers worldwide. Here, I hope to share some lessons and perspectives, with a pragmatic and product slant.
Also, as I continue to explore and learn, writing helps me learn better. And when I share my thoughts and writing online, they attract like-minded people I can discuss with and learn from. Read something that you would like to share, ask about, or discuss? Tweet me at @eugeneyan or reach out via email.
I’ll write about what I’ve learned or thought about. It’s usually related to data science, ML in production, or career. Here’s a word cloud based on the 55 posts I wrote in 2020. We can see common themes in (i) data & ML, (ii) problem, product, user & people, and (iii) writing, coding & learning.
Sometimes, I also write to answer questions I get from readers. This includes questions about my productivity habits, why read papers, the importance of writing for tech roles, and the difference between data/ML roles.
There will also be writing that might not be related to data science or ML. This includes posts such as “Commando, Soldier, Police”, Beginner’s mind, and the 85% rule. Nonetheless, I hope you’ll also find them valuable.
Truth be told, I’m still figuring it out. Don’t be surprised if I write about completely different topics in future. If you want to keep in touch, subscribe to my newsletter.
I mainly write for two individuals and a group.
The first person I write for is myself. I try to write things that, when I revisit them in a year or two, are (still) interesting; hopefully, this makes them useful for others too.
The second person I write for is my wife. She's always the first reader of my drafts (at least, until she gets sick of it). If she can understand what I write, especially on some of the more technical topics, mission accomplished.
The group of people I write for is my previous, current, and future teams. I pen my views on topics such as machine learning in production, data science & agile, and end-to-end data science. This way, when the need arises, I've already thought them through.