4 Things You Need To Know About Data Science

Mavis Loh

1. You’ll likely be using SQL a lot more than you’d expect

tl;dr: Like it or not, SQL will always be here to haunt you, so make sure you take the time to become proficient in it.

It’s a myth that SQL is a skill only data engineers, data scientists or data analysts use. If you work in any data-related role, whether it’s a data science role or not, you’ll be exposed to SQL. As a data scientist, you need data to build machine learning models. This means you’ll either have to query that data from existing data banks, or build pipelines if the data doesn’t exist yet. It’s therefore extremely important to know SQL well, so that the queries and pipelines you build are robust and scalable.
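To make the "query your data" step concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` table, its columns, and the `customer_order_totals` helper are hypothetical and exist only for illustration; a real data bank will have its own schema and its own database driver.

```python
import sqlite3


def customer_order_totals(conn):
    """Aggregate order counts and amounts per customer with plain SQL."""
    query = """
        SELECT customer_id,
               COUNT(*)    AS n_orders,
               SUM(amount) AS total_amount
        FROM orders
        GROUP BY customer_id
        ORDER BY customer_id
    """
    return conn.execute(query).fetchall()


# Hypothetical in-memory table standing in for an existing data bank.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10.0), (1, 5.5), (2, 20.0)],
)

rows = customer_order_totals(conn)
for customer_id, n_orders, total_amount in rows:
    print(customer_id, n_orders, total_amount)
```

Pushing the aggregation into SQL, rather than pulling every row into Python first, is usually what keeps this kind of query workable as the table grows.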

2. Data in the real world is a lot messier than you’d imagine it to be

  1. Differently spelled data entries, e.g. United States, USA, US, United States of America
  2. Incomplete or missing data entries
  3. Inconsistent data where numbers or logic do not tally

To manage your expectations, you should bear in mind that the majority of your time is going to be spent on cleaning your data. It’s very unlikely that you’ll be able to jump straight into modelling.
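The three kinds of mess listed above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the alias table, the record fields (`country`, `quantity`, `unit_price`, `total`) and both helper functions are hypothetical, and real cleaning rules depend on your own data.

```python
# Hypothetical mapping of spelling variants to one canonical label.
COUNTRY_ALIASES = {
    "united states": "United States",
    "usa": "United States",
    "us": "United States",
    "united states of america": "United States",
}


def normalize_country(raw):
    """Return a canonical country name, or None if the entry is missing."""
    if raw is None or not raw.strip():
        return None
    value = raw.strip()
    # Unknown spellings are passed through unchanged rather than guessed.
    return COUNTRY_ALIASES.get(value.lower(), value)


def check_record(record):
    """Return a list of problems found in one record."""
    problems = []
    if normalize_country(record.get("country")) is None:
        problems.append("missing country")
    # Inconsistent data: the total should equal quantity * unit_price.
    expected = record["quantity"] * record["unit_price"]
    if abs(expected - record["total"]) > 1e-9:
        problems.append("total does not tally")
    return problems


# Made-up records, one per kind of mess.
records = [
    {"country": "USA", "quantity": 2, "unit_price": 5.0, "total": 10.0},
    {"country": "  ", "quantity": 1, "unit_price": 3.0, "total": 3.0},
    {"country": "United States of America", "quantity": 3, "unit_price": 2.0, "total": 7.0},
]

cleaned = [normalize_country(r["country"]) for r in records]
issues = [check_record(r) for r in records]
```

Here the checks only flag problems instead of silently fixing them, since deciding how to repair a missing or inconsistent entry usually needs domain knowledge.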

3. A vague term like “data science” equates to vague responsibilities

4. Communication skills are essential

Working in a data science-related role doesn’t mean you simply work with data to build models all day long. Rather, you’ll be required to collaborate and communicate with cross-functional stakeholders. Even if you’re a team of one, you’re going to have to communicate with leadership about the work that you’re doing and its tangible business impact. You’ll also likely have to collaborate with other teams and business analysts to build domain knowledge. So yes, communication skills are instrumental in helping you become successful in your data science career!