Data Journey

Mission Statement

The Data Journey is a cycle of courses on Data Science offered by the Braskem Digital Factory, a team of scientists in computing, statistics and applied mathematics.

The objective of the journey is clear: change your professional life.

Here you will find, across four courses, the content you need to develop practical and theoretical skills in Data Science. You will learn about data collection, transport, protection, storage, processing, analysis, modeling and sharing, so that by the end of the journey you can offer solutions to problems in your own area of expertise.

Syllabus

Introduction to Data Science

Who should take it? Everyone! If you wish to become a Data Scientist, or if you just want a more comprehensive overview of the field, this should be your starting point.

Effort Required: Minimal; plan to attend a 2-hour workshop.

By the end of this track you will be able to:

  1. Understand what Data Science is and how you can derive value from it in your business;
  2. Have a basic intuition of how a Data Science process is conducted;
  3. Get familiar with Braskem's initiatives and other market benchmarks.

Content

  1. What is Data Science
  2. Braskem's Initiatives & Market Outlook
  3. Effort
  4. Workshop

Business Analyst

Who should take it? Analysts, Coordinators or Managers in business areas who wish to become familiar with solution development, and professionals seeking tools to enhance productivity or to help them develop business solutions with data.

Effort Required: Duration: TBD

By the end of this track you will be able to:

  1. Have practical knowledge of the tools used to develop business solutions;
  2. Ingest and process data in Python;
  3. Understand the concepts behind different databases and cloud solutions, such as data lakes and data pipelines;
  4. Have a basic understanding of the process involved in developing and deploying a digital product.

Content

  1. Developing Tools
     1. Anaconda Navigator
     2. Jupyter Notebook
     3. Spyder / Sublime / Visual Studio
  2. Introduction to Python
     1. Objects & Structure
     2. Logic Operators & Loops
     3. Classes, Methods & Functions
     4. Libraries (Pandas, NumPy & Matplotlib)
  3. Data Wrangling
     1. Importing Data
     2. Data Structure
     3. Tidying Data
     4. Combining Data
     5. Cleaning Data
  4. Software & Data Engineering | Azure
     1. Microsoft Azure Introduction
     2. SQL & NoSQL Databases
     3. Data Pipeline
     4. Standard Tables Architecture
     5. DevOps & Continuous Deployment
  5. Big Data with PySpark & Databricks (optional)

Data Analyst

Who should take it? Professionals seeking a technical approach to data-driven decisions, and business units undergoing digital transformation that require more data-driven reports and analysis.

Effort Required: Duration: TBD

By the end of this track you will be able to:

  1. Derive insights from data;
  2. Perform basic statistical analysis with available data;
  3. Build graphs and reports in Python;
  4. Transform and choose the best features to enhance the analysis of data;
  5. Provide your team with more data-driven insights.

Content

  1. Exploratory Data Analysis I
     1. Types of Variables
     2. Variable Summary | Statistical Moments
     3. Correlation Matrix & ANOVA
  2. Feature Engineering
     1. Feature Selection
     2. Feature Creation & Transformation
     3. Data Normalization & Balance
     4. Data Interpolation
  3. Data Visualization
     1. Matplotlib & Seaborn
     2. Graph & Figure Types
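
As a small taste of the "Variable Summary | Statistical Moments" topic, the sketch below computes the first four moments of a numeric variable using only the Python standard library. The function name `summarize` and the sample values are illustrative assumptions, not course material; in the track itself this kind of summary would more likely be produced with Pandas or NumPy.

```python
import statistics


def summarize(values):
    """Return population mean, variance, skewness and excess kurtosis.

    Hypothetical helper for illustration only; not part of the course code.
    """
    n = len(values)
    mean = statistics.fmean(values)
    variance = statistics.pvariance(values, mu=mean)
    std = variance ** 0.5
    # Skewness and kurtosis are the standardized third and fourth central moments.
    skewness = sum(((x - mean) / std) ** 3 for x in values) / n
    excess_kurtosis = sum(((x - mean) / std) ** 4 for x in values) / n - 3.0
    return {
        "mean": mean,
        "variance": variance,
        "skewness": skewness,
        "excess_kurtosis": excess_kurtosis,
    }


# Made-up sample data, chosen so the results are easy to check by hand.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
summary = summarize(sample)
print(summary)
```

A positive skewness, as in this sample, indicates a longer tail to the right of the mean; a negative excess kurtosis indicates lighter tails than a normal distribution.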

Data Scientist

Who should take it? Any professional aspiring to specialize in Data Science with development of Machine Learning algorithms and scientific data analysis.

Effort Required: Duration: TBD

By the end of this track you will be able to:

  1. Conduct scientific processes for data analysis;
  2. Develop Machine Learning algorithms;
  3. Deploy and deliver digital products based on Data Science;
  4. Provide the business with significant data-driven insights and strategies;
  5. Fully understand the intuition, math and code behind the main Machine Learning algorithms;
  6. Conduct explanatory data analysis to identify causal relationships that improve business decisions;
  7. Develop, deploy and maintain any Data Science project;
  8. Understand which approaches and algorithms are best suited to each business problem.

Content

  1. Linear Algebra
     1. Matrices & Vectors with NumPy
     2. Alternate Coordinate Systems
  2. Probability Theory
     1. Probability Models & Axioms
     2. Conditioning & Independence
     3. Counting
     4. Discrete & Continuous Variables Distributions
     5. Bayesian Inference
     6. The Monte Carlo Simulation with SciPy
  3. Statistical Inference
     1. Sample Means & Central Limit Theorem
     2. Confidence Intervals & Hypothesis Testing
  4. Causality Analysis
     1. Linear Modeling & Experimental Design
  5. Exploratory Data Analysis II
     1. Hypothesis Creation & Testing
     2. Endogenous Variables Transformation
     3. Time Series Analysis
  6. Calculus (optional)
     1. Limits & Continuity
     2. Differentials & Derivatives
     3. Univariate Integration
     4. Multivariate Calculus
  7. Learning Techniques
     1. Supervised & Unsupervised Learning
     2. Regressions, Classifications & Clusterings
     3. Train & Test Sets | Dimensionality
  8. Model Scoring
     1. Regression Metrics
     2. Classification Metrics
     3. Clustering Metrics
     4. Cross Validation
  9. Back Propagation & Hyper Parameters
     1. Parameters vs Hyper Parameters
     2. Grid Search & Hyper Parameters Settings
     3. Back Propagation
  10. Main Algorithms Intuition
      1. Linear Regression (Regression)
      2. SVM (Regression & Classification)
      3. Random Forest (Regression & Classification)
      4. Logistic Regression (Classification)
      5. K-Means (Clustering)
  11. Special Models
      1. Boosting
      2. Neural Networks (Deep Learning)
      3. Reinforcement Learning
      4. Natural Language Processing (NLP)
      5. Computer Vision
Disclaimer: The materials produced for the Data Journey do not contain sensitive Braskem data.


Content

Track 0 - Introduction to Data Journey

Track 0 - Introduction to the Data Journey (materials in Portuguese)

title video slides script exercises
Data Journey - T0V1 - Data Journey - Presentation
Data Journey - T0V2 - Data Journey - Presentation

Track 0 - Introduction to Data Journey

TBD

Track 1 - Introduction to Data Science

Track 1 - Introduction to Data Science (materials in Portuguese)

title video slides script exercises
Data Journey - T1V1 - Introduction to DS - Presentation
Data Journey - T1V2 - Introduction to DS - What is Data Science
Data Journey - T1V3 - Introduction to DS - Initiatives at Braskem
Data Journey - T1V9 - Introduction to DS - Exercise 1 - questions
Data Journey - T1V9 - Introduction to DS - Exercise 1 - answers

Track 1 - Introduction to Data Science

TBD

Track 2 - Business Analyst

Track 2 - Business Analyst (materials in Portuguese)

title video slides script exercises
Data Journey - T2V1 - Business Analyst - Presentation
Data Journey - T2V2 - Business Analyst - Development tools
Data Journey - T2V3 - Business Analyst - Anaconda
Data Journey - T2V4 - Business Analyst - Exercises 1

Track 2 - Business Analyst

TBD

Track 3 - Data Analyst

Track 3 - Data Analyst (materials in Portuguese)

TBD

Track 3 - Data Analyst

TBD

Track 4 - Data Scientist

Track 4 - Data Scientist (materials in Portuguese)

TBD

Track 4 - Data Scientist

TBD