I watched the video "Data Science Learning Roadmap for 2021" by Harshit Tyagi
https://youtu.be/nM_wZIzKEhc&t=220
together with the accompanying article
https://towardsdatascience.com/data-science-learning-roadmap-for-2021-84f2ba09a44f
and collected the topics below, which can be used as search terms and as a guide for anyone who wants to study Data Science.
1 Programming
1.1 Data Structures (Python/R)
1.2 SQL scripting
1.3 Conditionals, List/Dict comprehension
1.3.1 Conditionals List comprehension
1.3.2 Conditionals Dictionary comprehension
1.4 Object oriented programming
1.5 Working with external libraries
1.6 Fundamental algorithms - searching, sorting, trees, graphs, etc.
1.6.1 Fundamental algorithms – searching
1.6.2 Fundamental algorithms - sorting
1.6.3 Fundamental algorithms - trees
1.6.4 Fundamental algorithms - graphs
1.7 Advanced: Functional programming
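As a small illustration of topic 1.3, the following Python sketch shows list and dictionary comprehensions combined with conditionals. The `scores` data and the pass mark of 50 are made up for this example and do not come from the roadmap.

```python
# Hypothetical sample data, used only for illustration.
scores = {"ann": 82, "bob": 47, "cat": 65, "dan": 91}

# List comprehension with a filter condition: names of people who passed.
passed = [name for name, score in scores.items() if score >= 50]

# List comprehension with a conditional expression: label every score.
labels = ["pass" if score >= 50 else "fail" for score in scores.values()]

# Dictionary comprehension with a filter condition: keep only high scores.
high = {name: score for name, score in scores.items() if score >= 80}

print(passed)
print(labels)
print(high)
```

The filter form (`... if cond`) drops items, while the conditional expression (`a if cond else b`) keeps every item and only changes its value.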
--------------------------
Profile: Data Analyst (any department)
2 Data Extraction and Data Wrangling
2.1 Scripting - extracting data from websites, APIs, DBs
2.2 Data formatting (type conversion)
2.3 Libraries - Pandas and NumPy
2.3.1 Data Extraction Pandas
2.3.2 Data Wrangling Pandas
2.3.3 Data Extraction NumPy
2.3.4 Data Wrangling NumPy
2.4 Data transformation - joining, slicing, indexing
2.4.1 Data transformation - joining
2.4.2 Data transformation - slicing
2.4.3 Data transformation - indexing
2.5 Handling missing values - can use tools like Trifacta
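The wrangling steps listed above (type conversion, handling missing values, joining, slicing, indexing) can be sketched with the Python standard library alone. This is only an illustration under assumptions: the CSV text, the `cities` lookup table, and the choice to fill a missing age with the mean are all invented for this example. In practice Pandas covers the same steps with `astype`, `fillna`, and `merge`.

```python
import csv
import io
from statistics import mean

# Hypothetical raw data: every value arrives as text, and one age is missing.
raw = io.StringIO("id,name,age\n1,Ann,34\n2,Bob,\n3,Cat,28\n")
cities = {1: "Bangkok", 2: "Delhi", 3: "Tokyo"}  # hypothetical lookup table

rows = []
for rec in csv.DictReader(raw):
    rows.append({
        "id": int(rec["id"]),                            # type conversion: str -> int
        "name": rec["name"],
        "age": int(rec["age"]) if rec["age"] else None,  # empty string -> missing
    })

# Handling missing values: fill with the mean of the known ages (one option of many).
known_ages = [r["age"] for r in rows if r["age"] is not None]
fill_value = mean(known_ages)
for r in rows:
    if r["age"] is None:
        r["age"] = fill_value

# Joining: attach the city to each row by matching on id.
joined = [{**r, "city": cities[r["id"]]} for r in rows]

# Slicing and indexing: the first two rows, then one field of one row.
first_two = joined[:2]
print(first_two)
print(joined[2]["city"])
```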
-----------------------------------
Profiles: Data Analyst, Business Analyst, Marketing Analyst, Data Product Manager
3 EDA (Exploratory Data Analysis)
3.1 Defining business-focused questions
3.2 Studying data distribution – outliers
3.3 Univariate and multivariate analysis
3.4 Data Visualization
3.4.1 Data Visualization – matplotlib
3.4.2 Data Visualization – seaborn
3.4.3 Data Visualization - plotly
3.5 Building dashboards - Excel/Tableau, Jupyter
3.5.1 Building dashboards - Excel
3.5.2 Building dashboards - Tableau
3.5.3 Building dashboards - Jupyter
3.6 Writing concise and insightful reports
3.7 Business acumen
-----------------------------
Profiles: Data Engineer, DevOps Engineer, Data Architect
4 Data Engineering
4.1 Strong programming skills
4.2 Working with the CLI (Command Line Interface)
4.3 Building ETL (Extract-Transform-Load) pipelines
4.4 Data engineering tools - Spark, Kafka, Airflow, etc.
4.4.1 Data engineering tool - Spark
4.4.2 Data engineering tool - Kafka
4.4.3 Data engineering tool - Airflow
4.5 Cloud Services - AWS, GCP, Azure
4.5.1 Cloud Services
4.5.2 Cloud Services - AWS
4.5.3 Cloud Services – GCP Google Cloud Platform
4.5.4 Cloud Services - Azure
4.6 Algorithms - MapReduce, YARN
4.6.1 Algorithms - MapReduce
4.6.2 Algorithms – YARN (Yet Another Resource Negotiator)
4.7 Deploying ML models in production
------------------------------
Profiles: Data Scientist, Quantitative Analyst
5 Statistics and Mathematics
5.1 Descriptive - mean, median, mode, standard deviation, etc.
5.2 Inferential - hypothesis & A/B testing, CI, p-value
5.2.1 Inferential - hypothesis & A/B testing
5.2.2 Inferential - CI (Confidence Interval)
5.2.3 Inferential - p-value
5.3 Experiment Design
5.4 Probability - conditional, Bayes' theorem, etc.
5.4.1 Probability
5.4.2 Probability - conditional
5.4.3 Probability – Bayes’ theorem
5.5 ANOVA, Chi-Square test
5.5.1 ANOVA (Analysis of Variance)
5.5.2 Chi-Square test
5.6 Sampling, data distributions, t-tests
5.6.1 Sampling
5.6.2 Data distributions
5.6.3 t-tests
5.7 Linear Algebra
5.8 Single-variable and multivariable calculus
5.8.1 Single-variable calculus
5.8.2 Multivariate calculus
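Two of the statistics topics above, descriptive statistics (5.1) and Bayes' theorem (5.4.3), can be tried out with Python's built-in `statistics` module. This is a minimal sketch: the sample `data`, the helper name `bayes_posterior`, and the test probabilities are assumptions chosen for illustration.

```python
from statistics import mean, median, mode, stdev

# Hypothetical sample, used only for illustration.
data = [2, 4, 4, 5, 7, 9]

# Descriptive statistics: central tendency and spread.
print("mean:", mean(data))
print("median:", median(data))
print("mode:", mode(data))
print("sample standard deviation:", stdev(data))


def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(A|B) = P(B|A) * P(A) / P(B).

    P(B) is expanded with the law of total probability:
    P(B) = P(B|A) * P(A) + P(B|not A) * P(not A).
    """
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical screening test: 1% prevalence, 95% sensitivity, 5% false positives.
posterior = bayes_posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print("P(condition | positive test):", posterior)
```

With these made-up numbers the posterior comes out at roughly 0.16, which shows how a low prior keeps the probability small even when the test is fairly accurate.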
--------------------------------
Profiles: ML Engineer, Data Scientist
6 Machine Learning
6.1 Supervised - classification, regression
6.2 Unsupervised - clustering, dimensionality reduction
6.3 Reinforcement learning - TF-Agents, optimising rewards
6.3.1 Reinforcement learning - TF-Agents
6.3.2 Reinforcement learning - optimising rewards
6.4 Performance metrics - RMSE, accuracy, confusion matrix, AUC-ROC, etc.
6.4.1 Performance metrics - RMSE (Root-Mean-Square Error)
6.4.2 Performance metrics – accuracy
6.4.3 Performance metrics - confusion matrix
6.4.4 Performance metrics - AUC-ROC (Area Under the Curve - Receiver Operating Characteristic)
6.5 Hyperparameter tuning
6.6 Statistical ML - KNN, Decision trees, bagging, boosting
6.7 Ensemble Models - Random forests, voting classifiers, AdaBoost
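To make the performance metrics in 6.4 concrete, here is a small plain-Python sketch of a binary confusion matrix, accuracy, and the root-mean-square error. The function names and the label and prediction lists are hypothetical; libraries such as scikit-learn provide tested equivalents.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def rmse(y_true, y_pred):
    """Root-mean-square error for regression targets."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared) / len(squared))


# Hypothetical classification labels and predictions.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
preds = [1, 0, 0, 1, 0, 1, 1, 0]
counts = confusion_counts(labels, preds)
acc = accuracy(labels, preds)

# Hypothetical regression targets and predictions.
error = rmse([3.0, 5.0, 2.0], [2.0, 5.0, 4.0])

print(counts, acc, error)
```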
-------------------------
Harshit Tyagi profile
https://www.linkedin.com/in/tyagiharshit?originalSubdomain=in
Harshit Tyagi is an Indian Data Science Engineer. He holds a bachelor's degree in computing
from Bharati Vidyapeeth's College Of Engineering, New Delhi, India.
He has done research work with teams from Yale University, MIT, and UCLA,
and has produced a large body of writing and teaching material.
Harshit Tyagi Blog
https://muckrack.com/harshit-tyagi/articles
Harshit Tyagi Twitter
https://twitter.com/dswharshit
https://www.geeksforgeeks.org/how-to-become-data-scientist-a-complete-roadmap/
https://www.onlinemanipal.com/blogs/data-science-roadmap
https://github.com/MrMimic/data-scientist-roadmap
** Related links
Machine Learning
https://www.gotoknow.org/posts/711453
Learning Data Science on YouTube
https://www.gotoknow.org/posts/711726