For a data scientist, what does an average (and non-average) day look like? What’s great (and not so great) about the role? What’s working at Amazon like?
I recently had an interview with CareerFair where we discussed these questions and more. (They also have other interviews with professionals in marketing ops, customer success, and engineering.) Here’s the interview write-up.
Hi, I’m Eugene Yan. I work at the intersection of consumer data & tech to build machine learning systems to help customers. I also write about how to be effective in data science, learning, and career.
Currently, I’m an Applied Scientist at Amazon helping users read more, and get more out of reading. We build book recommendation systems and contribute to efforts in discovery (e.g., search). Previously, I led the data science team at Lazada (acquired by Alibaba in 2016) and worked on e-commerce ML systems (e.g., ranking, automation, fraud detection).
First, a disclaimer: Even for people with the same title (i.e., applied scientist), the average day will look different. It will also vary with the project’s lifecycle, such as research, prototyping, development, and maintenance. My role mostly involves prototyping and development.
My day usually has these buckets of activities:
I’m struggling with this question as most days don’t seem average. Nonetheless, here are some exceptional events that may come up:
I really enjoy working with data. Through data (e.g., search logs, clickstreams, transactions), we understand our customers and how they interact with our platform and products. The data reveals interesting patterns in human behaviour. For example, consumption changes due to life-stages (e.g., becoming a parent) and socio-economic events (e.g., COVID-19, work from home). By understanding our customers better, we can serve them better.
I also enjoy applying machine learning. While data helps us understand customers, there’s far too much for a person (or even a large team) to process. Machine learning (algorithms) can help with this. For example, machine learning helps (i) automatically classify products and audit product reviews, (ii) identify fraudulent sellers and products, (iii) recommend products to customers given their historical preferences, etc. Data and machine learning help us write software 2.0.
Another aspect I enjoy is the amount of leverage working in a consumer tech company (e.g., Lazada, Amazon) provides. Our team can build and deploy machine learning systems to help customers around the world. It scales well too. Most of the system doesn’t need to change from country to country. Some necessary changes include using local data and adapting to local regulations (e.g., privacy). I get a huge kick from seeing customers benefit from our work (we see this through metrics and anecdotes).
I’m still learning about how to manage this, but sometimes, I spend more time than I would like writing documents and in meetings. Nonetheless, it’s essential for socialising ideas and getting buy-in and feedback. I just wish I was more effective and faster at it.
Occasionally, stakeholders suggest solutions that are far more complex than they need to be. I blame the overhyping of tech and machine learning in the media. When this happens, our team patiently tries to understand their perspective and educate them. Nonetheless, it takes considerable time and effort and distracts us from work that helps customers.
Lastly, because my work revolves around data, I’m also constrained by access to high-quality data. Delays happen now and then. Sometimes, it’s a minor lack of permissions which takes a few hours to a few days to resolve. Other times, we find that our system isn’t tracking a specific field and we need to update our trackers and wait a few months, or backfill the data.
While there are some cultural differences, they don’t affect the day-to-day. For example (and this is likely a stereotype), Asians are more reserved while Americans are more outgoing—something like this doesn’t affect my work. I think the organization and team culture matters a lot more—this is independent of the country. Before deciding to join a team, it’s important to interact with them to get a feel of the culture.
Having a humanities background is associated with certain traits: open-mindedness, critical thinking, better problem framing, research skills, and the ability to communicate with laymen. I think such traits would benefit everyone, not just tech folks. While a humanities degree helps with cultivating these traits, there are plenty of other ways to develop them: having the opportunity to work on diverse, challenging problems, good role models, and work experience.
Other than the traits mentioned above, my Psychology degree taught me how to analyse qualitative and quantitative data. It also taught me about statistics (and how to be skeptical of it). In addition, I learned about how people perceive, think, and behave; this helps when I’m building customer-facing machine learning features.
I’m still fairly new at Amazon. Nonetheless, I think Amazon’s more like a group of start-ups (rather than a big company). For example, each AWS service seems to operate like a start-up. In that sense, my experience so far has been similar to working at Lazada. We’re constantly experimenting, shipping, and getting feedback from customers. That said, being a global company, Amazon does provide slightly more leverage (see “What’s your favourite part about the job”).
I enjoy, and work best in, a role that’s between commando and soldier. My experience at both Lazada and Amazon has allowed me to do this, which plays to my strengths.
Nope. I really enjoy working with data and machine learning to build useful systems and products for customers. The Master’s in CS was essential to improve my understanding of the fundamentals so I could be a better data scientist, especially when developing and maintaining production systems.
Communication is one of the most important skills, if not the most important, for an effective data scientist. Initially, I didn’t think this way. But I reached out to several mentors asking what the most important skill for a data scientist was and guess what: it was communication. Thus, I focused on improving my communication and saw gains in my effectiveness within a year. (Ahmed wrote a great thread summarising my views.)
I think the best way to improve communication is through practice. At the start, it’s useful to read about the fundamentals of good writing and speaking—this arms us with knowledge from the experts. But to really get better, we need to practice.
How can we practice? Offer to write documents at work. This can be in the form of proposals, design documents, or internal newsletters. Or write about personal projects or what we learn on a blog. To practice speaking, offer to share at meet-ups or conferences about work-related or personal projects. With everything online now, it’s much easier.