I was recently invited by DataScience SG to join a panel discussing the various roles in data (e.g., data scientist, machine learning engineer, data engineer, data analyst). They were looking for someone with experience hiring across the different roles, and I was happy to share mine.
For a Thursday night, the turnout was great: more than 200 people showed up at Google’s auditorium to attend and ask great questions.
Ever wondered what the different data roles like AI researcher, data scientist, big data engineer, machine learning engineer, and data analyst entail? What skills are needed to join their ranks? What role is suitable for you? Let us kick off 2019 with a panel of great data people to answer your burning questions about the data industry and what it takes to have a successful data career.
We had an amazing group of panellists from different companies and roles:
Jaideep: An ML Engineer at NTUC, where he works with data scientists on productionising ML models. Prior to that, he was a data engineer at Lazada, where he built the data ingestion pipelines and later worked as an ML engineer on the onsite ad project.
Michael: A Business Intelligence and Data Analyst at Agilent Technologies, where he drives BI solutions to deliver insights to internal stakeholders. Prior to this, he worked in market research, providing consumer insights to tech clients worldwide.
Weina: A Research Scientist at Rakuten Institute of Technology Singapore with experience in recommender systems, customer churn prediction and retention improvement, data-based investment, and deep learning-based classification.
Koo (moderator): An experienced Data Scientist/Analytics Instructor with an MBA and 12+ years of relevant experience. Personal research interests include using data science to make organizations more efficient and effective, and helping more people understand and pursue data science.
Here’s a summary of the key points discussed:
Essential skills in data:
Logical thinking and understanding of how data flows
Communication (in terms of business impact)
Basic programming (SQL, Python)
Basic statistics and machine learning
Most panellists were motivated by being able to create an impact through their work, while tedious documentation was somewhat demotivating (lol).
Even after MOOCs, bootcamps, and formal education programs, continuous self-learning is essential to improve and keep up with rapid industry advancements. The internet abounds with resources for self-learning, such as MOOCs, YouTube videos, and great articles (some suggested resources here).
Personality traits associated with success in data (and arguably all roles):
Curiosity: About data, new methodologies and technologies, and various business problems. This drives learning and levelling up.
Persistence: Data science is partly research, so some experiments will fail. Anything involving technology and building software and systems will have bugs that need debugging. Grit helps push you through the tough times.
Humility: Knowing that you don’t know everything, and being willing to consult with data domain experts, research and apply existing literature, and accept feedback.
To get an interview:
A portfolio helps demonstrate that you can apply your knowledge and skills to build a useful data product.
You can write (this is partly why this site exists), share at meetups and conferences, or build your own website.
How do you know what you’re lacking / need to learn?
The most direct way is to find a mentor who is two or three steps ahead of you and trace their career path.
What steps did they take? What did they have to demonstrate?
Invite them for coffee and ask them about it (e.g., What did you demonstrate to land your first data role? Of the good data scientists you know, what key skills make them effective?).