Alexey Grigorev on His Career, Data Science, and Writing

[ informalmentors career datascience writing ] · 22 min read

This week, we chat with Alexey Grigorev, a lead data scientist from OLX Group. We actually met in person last year at OLX Group’s Prod Tech Conference, where he presented how to deduplicate images. However, we didn’t recognize each other online, and only found out when I asked him about it right before this chat!

Alexey has an interesting career. He started as a software engineer focused on Java. Then, he stumbled upon something that made him want to switch to data science. He did this by taking a Master’s in Business Intelligence (which he will share was not the best time investment).

After that, in a single interview, he secured his first data science role, and has since grown in his career. In this chat, I probe him about his thought process, as well as his advice for others looking to emulate his career.

Prefer to watch us banter instead? Here’s the video.


• • •

First, let’s have Alexey introduce himself…

Alexey: Hi, I’m Alexey, a lead data scientist at OLX. My role includes too many things and involves overseeing anything related to machine learning. I try to help where help is needed, and stay on top of everything.

I also work a lot with infrastructure, such as making sure that once we have a model, we can serve it to real users. I also mentor a lot of people, such as data analysts or engineers who want to get into machine learning. I help them with training their first model and then deploying it.

In 2010, you graduated with a degree in IT and became a software engineer, mostly focused on Java. Then, in 2013, you pursued a Master’s degree in BI. What was your thinking behind it? Why switch from software engineering to data? What did you see in 2013?

Alexey: It’s funny that you asked what I saw. I actually saw a video, or rather, a video course by Andrew Ng (laughs). I saw that course, and I thought, okay, this is what I want to do. This led me to taking more courses, also on Coursera, and chatting with a couple of companies who were looking for data scientists.

However, everyone was telling me that I didn’t have enough education and the background wasn’t a good fit. This was how I decided that I needed to get a Master’s.

Back then, I didn’t know that BI is not really data science (laughs). But, I still had a couple of courses on machine learning, and the BI courses were also helpful.

For people looking to emulate your journey, how should they think about it? Should they take on a Master’s or further education, like you did?

Alexey: I think the path I took wasn’t the most optimal. I spent two years doing the Master’s, but I think I could have done the switch in like 6 months. Just watching courses, doing projects, and trying to find a job. But back then, it wasn’t easy to get a job without “proper” education; the field was still new, and people didn’t know who to hire, what kind of background was needed.

It’s easier now. People kinda know more or less what they want to see in a data scientist. For example, a PhD is a nice to have, but it’s not a showstopper.

Thus, I wouldn’t advise doing a Master’s—you’ll spend two years doing it, and not all the courses are needed. While I did study useful things, not all of them turned out to be useful for me.

If someone doesn’t have a Master’s or PhD, would you hire them?

Alexey: Haha, of course! (Smiles). Yes, I know that people can acquire useful skills from a Master’s or PhD. But, I don’t think it’s the only way of doing this.

Eugene: There you have it, everyone. A lead data scientist, a hiring manager himself, is saying that it’s not necessary to have a Master’s. If any recruiters turn you down, you can show them this video (haha).

After completing your Master’s in 2015, you were a research assistant for a while. Then, you secured a role as a data scientist at Searchmetrics, focused on SEO, text mining, etc. How did you do that? What did you have to demonstrate to the hiring manager and team?

Alexey: I remember that interview. It was a long interview, just one interview. Usually these days, you have a series of interviews. But that was just one interview, 2.5 hours; it was pretty tough.

Most of the time, we talked about my thesis. It was about math information retrieval. Basically, trying to work with formulas in Wikipedia. Let’s say we have an expression, e = mc**2; we want to know that e is referring to energy.

My takeaway is that it’s good to have a project to talk about. It doesn’t have to be a thesis, you don’t really have to do a Master’s. But you should have a real-life project, a real application of machine learning.

This is especially important if you don’t have a lot of experience in your CV, if it’s your first full-time job. It can be a thesis, a course project, a side project, a Kaggle competition. If you have something to talk about, it’s a big plus; it makes the conversation smoother.

One other thing the hiring manager was interested in was my Java experience. The company had a lot of services written in Java, and they needed somebody who can help integrate machine learning into these services.

So knowing Java, and having a project (I could talk about), made the interview go well.

Eugene: That’s a great experience.

I had a similar experience where I attended a three-hour long interview which included lunch. The company, an e-commerce startup, was interested in my experience in a Kaggle competition on product classification. Turns out that they had a problem with product classification too, and needed someone to fix it. And eventually, I got hired.

But these are very unusual experiences, and I think I got lucky. Maybe you got lucky too. And so, for the benefit of our listeners…

How did you fail in your first application? What were some of the interviews you failed?

Alexey: (Laughs) Sorry, I’m going to disappoint you. After graduating, I just had one interview, got an offer after that, and accepted it.

Maybe it was a mistake actually. Maybe I shouldn’t have stopped at one interview. Maybe I should have interviewed with more companies.

Don’t get me wrong, I’m very happy with my experience at Searchmetrics. It was a really great job. I learned a lot and met a lot of great people, some of whom I’m still in touch with.

But now when I look back, I think it’s always good to have multiple opportunities to choose from. Maybe in some other instances, it’s good to have an offer, and continue to interview at other places. Maybe there’s something better out there. But if you don’t try, you don’t know.

From 2015 to 2018, you had two stints as a data scientist. Then, in 2018, you joined OLX as a senior data scientist, and now, you’re a lead data scientist. For people who’ve had a few years of experience, what does it mean to be senior? I know some people who get the title of senior and suddenly have imposter syndrome.

Alexey: First of all, let’s talk about the title itself; “senior”, what does it actually mean?

If you talk to people from different companies, maybe all the answers will be different. But I think there’s one common aspect in all the answers.

A senior is someone who can take end-to-end responsibility.

And the important word is not end-to-end, but responsibility. If you have a problem, a senior is someone who will find a way to get it done. Even if it’s a complex project, even if there are obstacles, they will figure it out. They will find the right people and remove the blockers. Instead of sitting there with the blocker, they will find the solution. This is the most important quality of a senior, in my opinion.

Specific to data science, I think a senior is someone who can do a project end-to-end. This includes talking to stakeholders, figuring out if ML is actually the right tool, translating requirements into the language of ML, understanding if the problem is worth solving, and breaking a big, ambiguous problem into smaller tasks for other members of the team.

It doesn’t mean that they are a rockstar who can do everything. They’ll work with other people, with product managers, with stakeholders. But they’ll need to assess the situation—is it really worth spending time working on this problem? Do we need ML, or is something simpler good enough? They might work with data engineers, or build the data pipelines themselves. Then, they’ll train the model, and serve it.

If someone can do all these things, they can 100% call themselves a senior data scientist. I think the important part here is the first part. A mid-level data scientist will also be able to train a model, or work with data engineers on pipelines. But to become a senior, the communication and problem framing aspects are essential.

Some people might call this position a lead data scientist. For me, the main distinction is that a senior is mostly involved in one project. They’re making all the decisions in one project. They’re still very hands-on, spending more than 50% of the time coding.

For a lead, it’s more projects, less hands-on. It’s more communication with multiple stakeholders.

Eugene: To summarize, a senior data scientist is someone you can trust. You give them a problem, they’ll be independent, they’ll run with it. (Alexey: Exactly)

And how do they do it? They need to have the end-to-end understanding. They might not be doing everything themselves. They can get other people to help them.

And it’s not just doing things the right way. It’s questioning even the decision of whether to do it or not. This is what makes a data scientist senior.

You’ve had a pretty varied experience, in both startups and big companies. For people out there looking for a role, they’re probably thinking—should I join a startup, or big company. What’s your take?

Alexey: What I liked at a startup was exposure to pretty much everything. I could do everything I wanted, there were no boundaries.

And there was a lot more work than everyone in the company could possibly do. Our data science team had only three people. When you only have three people, you really have to think carefully about what to do that would have the biggest impact on the customer. How would it affect our product? How does it help customers?

And we didn’t have anything, and had to build everything from scratch. How to do this so that everything doesn’t fall apart (i.e., sustainable), and yet still move fast? We had to make many trade-offs. In some areas, we decided to move faster. In other areas, we dedicated more time to make the product more robust and sustainable.

That was a fun experience.

I think it’s also possible to have such experience in a bigger company. But, a bigger company has more people. You don’t always have as much freedom as a startup, to make all these technical decisions. Especially in a corporation, you have all these existing infra and tools that you have to use, so you’ll have some limits.

Nonetheless, a start-up will also have such limits, such as time and money, which also forces you to be creative.

What I like about corporations is that there’s always someone who can teach you how to do these things. At a start-up, you’re mostly on your own. In corporations, you might find more senior colleagues, more mature processes, and also more resources.

What I would suggest to people who are just starting, and if they have two offers, is to go with a startup. It’s more likely that you’ll have a broader experience there. You’ll be exposed to many different things. You have to talk to sales, engineers, product. You’ll learn more.

Eugene: Haha people are going to listen to this, and just hear that Alexey said: “Startup” (Alexey: laughs).

I like the startup life too, but I’m going to add a bit of balance here.

I think that in a bigger company, you can work at scale (Alexey: nods). OLX is in 40, 45 marketplaces. You get to learn about the various cultural differences, regulation differences, and work at scale across multiple teams.

For those of you who are very early in your career, I think there’s no way you can make a mistake. Everything is just going to be a great learning experience.

Throughout your career, you’ve worked with many data scientists. Think about the most effective data scientists that you’ve worked with, or people you respect. What is it they do that’s effective? What is your definition of effective?

Alexey: Effective is somebody who gets the job done, in a reasonable amount of time. Also, pragmatic to some extent, not a perfectionist.

I remember my team lead in Searchmetrics. I learned a lot from him. He was really effective. He could just come with a problem, dig deep, and focus on solving this problem. After a couple of days, he would have a solution.

I found myself thinking, hey, I want to be like him. I want to quickly come up with an idea, and develop it. That was cool.

Also, being effective is being able to focus. I know it’s very difficult. You have so many things you have to do, and have to spend the day in meetings. Now that I know this, I admire his work even more.

I think what helps with effectiveness is to focus on the problem. When it comes to data science, what we often think of first is the solution.

Okay, I have this problem, and I’m going to hit it with this hammer of machine learning. We see this with people starting with the latest TensorFlow models that are huge.

It helps to take a step back, and think: Do we really need to solve this with the fanciest model out there? Or is there a simpler solution?

Just focus on solving the problem, not how to solve it. Usually, the simplest way to solve it is the right way.

Also, on learning effectively, I think it’s good to have the right balance of courses and hands-on practice. Don’t just do courses.

Sometimes, we follow a tutorial and come away thinking that we can now do everything. Usually, this is not true. For me, after watching a tutorial, and then trying to apply it to a project, I find that I don’t know the topic well (Eugene: exactly).

Then I need to do a lot of googling, a lot of research. I think this is what helps. After doing a course, do a project. Or just do projects and learn along the way.

Eugene: Alexey mentioned a great point and I just wanted to add to that. After you do a course, and you try to apply what you learned, you’re gonna realize that there are 1,001 things that you don’t know.

The same thing happens when you’re writing. You think you know something and try to write about it. Then you realize that your mind is a blank.

Doing projects and writing are a great way for you to consolidate your thinking and to fill the gaps.

When you interview and hire people, how do you think about it? What makes you go “We cannot hire this person?” And vice versa?

Alexey: Usually, when I interview, I ask a simple Python question. It’s a very simple problem, similar to fizz buzz. It requires one for-loop and an if-statement within it. Some people can solve it in 3 - 5 minutes.

If someone has Python in their CV, and they can’t solve it, it’s a no-go. If someone starts on the problem, and then says “I usually google how to do for-loops in Python”, then for me, it’s very suspicious (Eugene: yeah).

I mean, then why do you put Python in your CV, if you need to google how to do a for-loop? For me, this is a big red flag.

Apart from this Python thing, there are no other showstoppers. Failing it shows that the person claims they can program, but they actually cannot.
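For readers who haven't seen this style of question: Alexey doesn't share his actual problem, so the snippet below is only an assumed stand-in, the classic FizzBuzz exercise, to show the scale of what is being asked (one for-loop with an if-statement inside). The function name `fizz_buzz` and the 1-to-n range are illustrative choices, not his interview question.

```python
def fizz_buzz(n: int) -> list[str]:
    """Return the FizzBuzz sequence for 1..n as a list of strings.

    Illustrative stand-in only; not the question used in the interview.
    """
    result = []
    for i in range(1, n + 1):
        # Check the combined case first, otherwise 15 would match "Fizz" alone.
        if i % 15 == 0:
            result.append("FizzBuzz")
        elif i % 3 == 0:
            result.append("Fizz")
        elif i % 5 == 0:
            result.append("Buzz")
        else:
            result.append(str(i))
    return result


if __name__ == "__main__":
    print("\n".join(fizz_buzz(15)))
```

The point is less the exercise itself than the signal: someone who lists Python on their CV should be able to write a loop and a conditional like this without looking anything up.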

On the other hand, what stands out?

Having projects stands out. Having end-to-end responsibility, having delivered a project end-to-end, being able to talk about the trade-offs they made, all the decisions, why the project started. These stand out.

I understand that not everyone will work on this level to be able to answer this. But it’s good to ask, why are we doing that. And then be able to think through that question and answer it properly.

You’ve worked with many data scientists in the past. What are the common challenges and pitfalls that most people trip on?

Alexey: People often underestimate the amount of time it takes to deploy a model. Not just the deployment, but building data pipelines, etc.

Also, we don’t spend enough time making sure that we’re solving the right problem, making sure that what we do actually matters. If you spend half a year developing this great model that nobody cares about, then you’ve just wasted half a year.

Ask yourself, why are we doing this? What kind of problem are we trying to solve? Who is the user? How will they use it? Are they going to use it the way we imagine, or will they do something different? Having this conversation with the user is very important.

As data scientists get more experience, they get to a point where they can choose to continue building products and systems, as an IC, or build teams and engage with stakeholders, as a manager. How do you think about these two different paths?

Alexey: Currently, I’m an IC as lead data scientist. I haven’t been a manager, so I can’t fully answer this question.

Nonetheless, for the IC track, what I see is exposure to many projects, with less hands-on. I like this. I like to guide and mentor people. What should they watch out for? What should they make sure to understand? Why do we need to solve this? I like asking these questions, and engaging with stakeholders, and mentoring people.

Instead of solving the problem myself, I show others how to do it. This scales a lot better. As a lead, I can work on multiple projects, and teach people by showing them how to do it. While they may not be effective immediately, in a year, they’ll be able to work at the same level as I am now. This means I can scale out my skills, and in this way, be more effective.

That’s what I like about being an IC. It’s technical, still hands-on, I still do a lot of pair-programming.

For a manager, it’s pretty different. They need to think about things like performance reviews, and other things. (Eugene: budgeting, fighting for resources, politics).

But now that I’m thinking about this, I think it might not be a bad idea to take on the managerial path too. Maybe it’s a good idea to first have a small team, maybe 3 - 4 people, and test if you really enjoy the people management aspect of the job. If, after 3 months, you find that it’s not your cup of tea, you can safely backtrack.

Eugene: I think that’s a great idea, and companies should provide that option to their technical contributors. It can be overwhelming for people to become a manager, and they realize there’s no u-turn, and they burn out and leave.

Instead, we can give people the option of trying out management, with the option to backtrack. If, after the trial, they decide it’s not for them, they can stop. And we still retain them. It’s a win-win.

Looking back on your career, what are some things that were a waste of time? And what were some things that you think you should have done earlier?

Alexey: The Master’s, I don’t think it was really necessary. Maybe back then it kinda was, but now, it’s not necessary for sure. Just doing courses and projects should be enough.

One thing that helped was starting to freelance in parallel with my studies. That gave me a lot of projects, and gave me a great portfolio. That was helpful. While freelance is not for everyone, if you’re studying and have some free time, it’s probably a good idea to freelance a bit.

Eugene: Yes, that’s a great point. What’s difficult for people who just come out of school is that they don’t have much experience to show for it. Freelancing, internships, these are great ways to demonstrate your work. You’ll also learn a lot of stuff that you don’t learn in school.

Alexey: Another thing is this document I have for each project. From the very first project meeting, I capture everything in a document. What’s the problem? Why do they think ML is a good solution? What does success look like? What are the next steps? Every time we follow up on a topic, I capture it in the document. And over time, it captures the history of how this problem evolves.

I started doing this in OLX, I didn’t do this previously. But now thinking back, I should have started doing this even when I was freelancing.

Another thing that’s helpful is writing blog posts about things and sharing online. A blog post helps you consolidate what you learn, what is important. I don’t think I did this enough, and only started it at OLX.

You talk a lot about writing. Why do you think people should write?

Alexey: It’s difficult to write, and make it crisp on paper. All my thoughts, they seem really clear in my head. But when I’m trying to pull them out into a document, it’s super difficult. Just two minutes ago, when I was thinking about this, it was so clear in my head. But why doesn’t it come out on paper like that?!

I know why people don’t write. It’s difficult. They spend days, or weeks writing something. And when they ask for comments, they get a lot of comments. And they get … sigh… discouraged, and they give up.

I think it helps to do it more. The more you do this, the easier it becomes. Even though I’ve been writing for quite a while, it’s still very difficult for me. But when I force myself to write things, it becomes easier for me.

Writing helps with speaking as well. In future, when I have a conversation, or a podcast, like now, these things are clear in my head, because I’ve written them. And speaking about it becomes easier.

I think this is the power of writing, and everyone should do this to structure their thoughts and think clearly. Even if you just write your own internal notes, that nobody sees, it still helps to structure your thoughts.

More advice about writing from Alexey and other leaders here.

In our field, things progress very quickly. It’s impossible to keep up with all of it. How do you decide what to learn? How do you know if it’ll be relevant and worth your time?

Alexey: I remember trying. I had one RSS reader to subscribe to arXiv RSS. It was basically impossible to keep up. I also had a folder on Dropbox, and I called it “To Read”. At some point, it became half a gig. I remember that day, when I decided that I know I’m not going to read this, that was a good day (laughs).

At some point, I just thought to myself: Do I really need all this information? What am I going to do with it? Just realizing that there’s no way to digest all this information, it helps a lot.

How to choose what to learn? I don’t know, I just try to focus on the problem that I’m solving. Whatever works for that problem, I try to find.

Also, for the last 2 - 3 years, I’ve been trying to learn things outside of data science. A bit of marketing, how to speak, how to read. If you like something, you learn, and when you stop liking it, maybe you’ve learned enough, and you move on.

Eugene: What Alexey is sharing is just-in-time learning (Alexey: Yes).

You have a problem you need to solve, you try to learn about it, and immediately you get hands-on practice. That’s how it sticks.

Last question. Recently, you started the datatalks community. Why? How do you see it growing? How will it help people?

Alexey: It somehow happened naturally. I’m writing a book, and one of the readers asked, “Is there a place where I can talk about this book?” I realized that there’s actually no such place, and decided to create a place for that.

I also get a lot of questions on LinkedIn, Twitter, email, Quora. I try to answer these questions, but it doesn’t scale. That was another reason I started the community Slack. People can ask these questions in public, and we can answer them. Maybe if I start by showing an example, more people can help with the questions?

The community also hosts meetups. One reason I do this is that sometimes I want to talk at a conference, I submit a great proposal, and I’m rejected (laughs).

It’s disappointing, and I think, why do I have to submit a proposal? Why do I need a conference? Can’t I just talk about it myself? And this is how it happened, maybe I can just host meetups.

Also, a friend asked, do you know of a place where I can give a talk? Why yes I do! This was the SageMaker event that you attended. It was our first talk.

Eugene: Currently, Alexey hosts one talk a week. I don’t know how he keeps up with the cadence, but it’s a great time to join datatalks.club.

This was a great chat, thanks Alexey. I’ll share the videos on YouTube and write this up.

• • •

Thanks to Alexey Grigorev for the interview. Thanks to Yang Xinyi and Alexey Grigorev for reading drafts of this.

