When I started learning data science, I tried to learn the most complicated concepts without first learning the basics.
After years of experience, I’ve realized that the basics are enough to get you over 80% of the way in your career. Why? Simpler solutions almost always win. They’re easier to understand, easier to implement, and easier to maintain. Only once a simple solution has demonstrated its value to the company should you look into more complex ones.
So what exactly are the fundamentals?
A) SQL
After 3 years of work, I am convinced that mastering SQL is pivotal to a successful career. SQL is not a hard skill to learn (e.g. SELECT, FROM, WHERE), but it is certainly a hard skill to perfect. SQL is essential for data wrangling, data exploration, data visualization (building dashboards), building reports, and building data pipelines.
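To make the "easy to learn, hard to perfect" point concrete, here is a minimal sketch of how a basic SELECT, FROM, WHERE query grows into a small piece of data wrangling once GROUP BY and ORDER BY are added. The `orders` table, its columns, and the rows are hypothetical, made up purely for illustration, and the query is run through Python’s built-in `sqlite3` module so the example is self-contained.

```python
import sqlite3

# Hypothetical in-memory table of orders, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("alice", 120.0, "paid"),
        ("alice", 80.0, "paid"),
        ("bob", 50.0, "refunded"),
        ("bob", 150.0, "paid"),
        ("carol", 30.0, "paid"),
    ],
)

# The basic SELECT ... FROM ... WHERE, extended with GROUP BY and ORDER BY
# to summarize paid revenue per customer.
query = """
    SELECT customer, COUNT(*) AS n_orders, SUM(amount) AS revenue
    FROM orders
    WHERE status = 'paid'
    GROUP BY customer
    ORDER BY revenue DESC
"""
rows = conn.execute(query).fetchall()
conn.close()

for customer, n_orders, revenue in rows:
    print(customer, n_orders, revenue)
```

The same pattern (filter, group, aggregate, sort) sits behind most dashboards and reports, which is part of why time spent on SQL tends to pay off.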
B) Descriptive and Inferential Statistics
Having a good understanding of fundamental descriptive and inferential statistics is also very important.
Descriptive statistics allow you to summarize your data and make sense of it quickly.
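As a small sketch of what that looks like, the snippet below summarizes a list of numbers with Python’s built-in `statistics` module. The `daily_orders` values are hypothetical and chosen to include one unusually large day.

```python
import statistics

# Hypothetical daily order counts, made up for illustration.
daily_orders = [12, 15, 11, 14, 13, 40, 12]

summary = {
    "mean": statistics.mean(daily_orders),
    "median": statistics.median(daily_orders),
    "stdev": statistics.stdev(daily_orders),  # sample standard deviation
    "min": min(daily_orders),
    "max": max(daily_orders),
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Here the single large value pulls the mean above the median, which is the kind of thing a quick summary surfaces before any modeling starts.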
Inferential statistics allow you to draw conclusions about a whole population from a limited amount of data (a sample). This is essential for building explanatory models and A/B testing.
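For A/B testing, one common approach (not the only one) is a two-proportion z-test on conversion rates. The sketch below assumes large, independent samples, and the function name `two_proportion_z_test` and the counts passed to it are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 200 of 2000 users converted on A, 260 of 2000 on B.
z, p_value = two_proportion_z_test(200, 2000, 260, 2000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value suggests the difference between the variants is unlikely to be sampling noise alone. Whether the difference is large enough to matter for the business is a separate question.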
C) Python for EDA and Feature Engineering
Python is important mainly for performing EDA (exploratory data analysis) and feature engineering. That being said, these two steps can also be completed using SQL, so that’s something to keep in mind. I personally like to have Python in my tech stack because I find it easier to perform EDA in a Jupyter Notebook than in a SQL console or a dashboard.
Build, test, iterate, repeat.
As a rule, it’s better to spend less time on a model, get an initial version into production, and iterate from there. Why?
- Allocating less time to an initial model incentivizes you to come up with a simpler solution. And as I said earlier in this article, a simpler solution has several benefits.
- The faster you come up with a POC (proof of concept), the faster you can receive feedback from others to improve on it.
- Business needs constantly change, so you’re more likely to be successful if you can deploy your project sooner rather than later.
The point I’m trying to make is not that you should rush your projects, but that you should deploy them quickly so that you can receive feedback, iterate, and improve them.
I hope you found this insightful and that it helps you in your data science career! If…
Continue reading: https://towardsdatascience.com/3-most-important-lessons-ive-learned-from-3-years-into-my-data-science-career-acdf783d889c?source=rss—-7f60cf5620c9—4