Data Science
Computer Science
This is a certifier course in the Computer Science degree. Although there are no prerequisites for enrolling, a background in machine learning is highly recommended. During the course, you will learn the steps of a data science pipeline and apply them in a challenge.
The main topics discussed in this course are:
- Basic pandas
- Descriptive statistics
- Correlation analysis
- Univariate data analysis
- Multivariate data analysis
- Enhanced data visualization
- Missing data and imputation
- Data discretization, normalization, and standardization
- Outlier analysis (Tukey’s method and Isolation Forests)
- Dimensionality reduction: PCA and t-SNE
- Feature selection
- Hypothesis testing
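
As a small taste of the outlier analysis topic, Tukey's method flags values that fall outside the fences `Q1 - k*IQR` and `Q3 + k*IQR`. The sketch below is illustrative only and is not taken from the course material: the function name `tukey_outliers`, the sample values, and the conventional choice `k = 1.5` are assumptions, and it uses only the Python standard library rather than pandas.

```python
from statistics import quantiles


def tukey_outliers(values, k=1.5):
    """Return the values lying outside Tukey's fences [Q1 - k*IQR, Q3 + k*IQR]."""
    # method="inclusive" interpolates linearly between data points,
    # treating the data as the full population of interest.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical sample: one value is far from the rest.
sample = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]
print(tukey_outliers(sample))  # [102]
```

A larger `k` (for example 3.0) widens the fences and flags only more extreme values.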
Below you will find the datasets used throughout the course:
| Dataset | Link |
|---|---|
| Airports | Link |
| Bible | Link |
| Forest Fire | Link |
| California Housing | Link |
| Customers | Link |
| Enron | Link |
| RealEstate | Link |
| Imbalanced | Link |
| Iris | Link |
| Juice | Link |
| OMMLBD Familiar | Link |
| Salaries | Link |
| Titanic | Link |
| Tweets | Link |
| Book References (.zip) | Link |