Evals, retrieval-augmented generation, guardrails, and collecting feedback; all that good stuff.
Evals, RAG, fine-tuning, caching, guardrails, defensive UX, and collecting user feedback.
Writing drafts via retrieval-augmented generation. Also reflecting on the week's journal entries.
9 patterns including HITL, hard mining, reframing, cascade, data flywheel, business rules layer, and more.
My three favorite papers, 17 paper summaries, and ML and non-ML lessons.
Invited keynote at the Workshop on Online Recommender Systems and User Modeling (ORSUM).
Or why I should write fewer integration tests.
Pushing back on the cult of complexity.
Some off-the-beaten-path uses of Python learned from reading libraries.
Understanding and spotting patterns to use code and components as intended.
How they differ and why they work better in different situations.
Hard-won lessons on how to start data science projects effectively.
Breaking it into offline vs. online environments, and candidate retrieval vs. ranking steps.
Pointers to think through your methodology and implementation, and the review process.
Three documents I write (one-pager, design doc, after-action review) and how I structure them.
Access, serving, integrity, convenience, autopilot; use what you need.
What the top teams did to win the 36-hour data hackathon. No, not machine learning.
A personal take on their deliverables and skills, and what it means for the industry and your team.
What questions do they answer? How do they compare? What open-source solutions are available?
Checking for correct implementation, expected learned behaviour, and satisfactory performance.
Updating our FastAPI app to let users select options and download results.
I couldn't find any guides on serving HTML with FastAPI, so I wrote this one to fill the gap.
I wanted to add my recent writing to my GitHub Profile README but was too lazy to do manual updates.
After this article, we'll have a workflow of tests and checks that run automatically with each git push.
A curious discussion made me realize my expert blind spot. And no, Airflow is not late.
Can maintaining machine learning in production be easier? I go through some practical tips.
I thought deploying machine learning was hard. Then I had to maintain multiple systems in prod.
OMSCS CS6200 (Introduction to OS) - Moving data from one process to another, multi-threaded.
Keynote on how Asia's tech giants scale and their SuperApp strategy.
OMSCS CS6300 (Software Development Process) - Java and collaboratively developing an Android app.