This course teaches analysts, statisticians, data scientists, and students interested in data science how to analyze real-world data by creating professional-looking charts and using numerical descriptive statistics techniques in Python 3. You will learn how to use charting libraries in Python 3 to analyze real-world data on corruption perception, infant mortality, life expectancy, the Ebola virus, alcohol and liver disease, world literacy rates, violent crime in the USA, the soccer World Cup, migrant deaths, and more.
You will also learn how to use the statistical libraries in Python 3, such as numpy, scipy.stats, pandas, and statistics, to create the descriptive statistics summaries needed for analyzing real-world data.
In this course, you will understand how each library handles missing values and you will learn how to compute the various statistics properly when missing values are present in the data.
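As a rough illustration of why this matters, the sketch below computes the mean of the same small list with numpy, pandas, and the standard statistics module. The values are made up for the example and are not taken from the course datasets.

```python
import math
import statistics

import numpy as np
import pandas as pd

# Illustrative values only; NaN marks a missing observation.
values = [2.0, 4.0, float("nan"), 6.0]

# numpy: the plain mean propagates NaN, while nanmean ignores it.
numpy_mean = np.mean(values)
numpy_nanmean = np.nanmean(values)

# pandas: skipna=True is the default, so NaN is dropped unless you opt out.
pandas_mean = pd.Series(values).mean()
pandas_strict = pd.Series(values).mean(skipna=False)

# statistics: there is no skip-NaN option, so filter missing values first.
clean = [v for v in values if not math.isnan(v)]
stats_mean = statistics.mean(clean)

print(numpy_mean, numpy_nanmean, pandas_mean, pandas_strict, stats_mean)
```

The same data gives either a number or NaN depending on the library and its defaults, which is why it helps to check how each function treats missing values before trusting a summary.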
The course will teach you what you need to know to analyze real-world data hands-on using Python 3. You will be able to create appropriate visualizations using the seaborn, matplotlib, or pandas libraries.
Using a wide variety of world datasets, we will analyze each dataset with these tools from pandas, matplotlib, and seaborn:
- Correlation plots
- Box plots for comparing group distributions
- Time series and line plots
- Side by side comparative pie charts
- Area charts
- Stacked bar charts
- Histograms of continuous data
- Bar charts
- Regression plots
- Statistical measures of the center of the data
- Statistical measures of spread in the data
- Statistical measures of relative standing in the data
- Calculating Correlation coefficients
- Ranking and relative standing in data
- Determining outliers in datasets
- Binning data in tertiles, quartiles, quintiles, deciles, etc.
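The last few items in the list above (ranking, outliers, and binning) can be sketched in pandas as follows. This is a minimal example on made-up numbers; the 1.5 × IQR fence is one common outlier rule and is an assumption here, not necessarily the rule used in the lectures.

```python
import pandas as pd

# Made-up sample with one unusually large value.
data = pd.Series([10, 12, 13, 15, 16, 18, 19, 21, 22, 95])

# Relative standing: percentile rank of each observation (0 to 1].
pct_rank = data.rank(pct=True)

# Outliers: values outside the 1.5 * IQR fences (an assumed, common rule).
q1, q3 = data.quantile(0.25), data.quantile(0.75)
iqr = q3 - q1
lower_fence, upper_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = data[(data < lower_fence) | (data > upper_fence)]

# Binning: split the values into quartile groups.
# Use q=3, q=5, or q=10 for tertiles, quintiles, or deciles.
quartile = pd.qcut(data, q=4, labels=["Q1", "Q2", "Q3", "Q4"])

print(outliers.tolist())
print(quartile.value_counts().sort_index())
```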
The course is taught using Jupyter notebooks in Anaconda to support reproducible research: we use Markdown cells to clearly document the code so that it is easy to understand and share.
This is what some students are saying:
“I really like the tips that you share in every unit in the course sections. This was a well-delivered course.”
“I am a Data Scientist with many years using Python/Big Data. The content of this course provides a rich resource to students interested in learning hands-on data visualization in Python and the analysis of descriptive statistics. I will recommend this course to anyone trying to come into this domain.”
- Lectures 34
- Quizzes 0
- Duration 50 hours
- Skill level All levels
- Language English
- Students 5
- Assessments Yes
Section 1: Getting started with the Data Visualization and Descriptive Statistics course
Section 2: Exploratory data analysis using Python 3 graphical libraries.
In this section, students will learn how to use Python 3 graphical libraries such as matplotlib, seaborn, and pandas to create professional-looking charts of real-world data.
- Creating a Pie chart using Python 3 matplotlib graphical library
- Side by Side Pie charts using matplotlib library in Python 3
- Creating a stacked area plot using Python seaborn library
- Creating a scatter plot chart in Python 3 using seaborn library.
- Creating a pairplot using Python seaborn graphical library
- Using a Boxplot in Pandas seaborn library to compare groups in data
- Creating a line plot trend of the data using Python pandas library
- Creating a histogram using Python seaborn to analyze data
- Creating a Barplot using color palettes with Python seaborn library (Part 1)
- Creating a Barplot using color palettes with Python seaborn library (Part 2)
- Creating a Stacked bar of the missing migrants data using Python seaborn library
- Creating a Pareto type barchart using Python seaborn library
- Creating a heatmap plot using Python seaborn library
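To give a flavor of the charts listed in this section, here is a minimal sketch of side-by-side pie charts in matplotlib. The labels and shares are hypothetical placeholders, not data from the course.

```python
import matplotlib.pyplot as plt

# Hypothetical category shares for two periods (placeholder data).
labels = ["Group A", "Group B", "Group C"]
shares_year1 = [50, 30, 20]
shares_year2 = [40, 35, 25]

# One row, two columns: each pie gets its own axes for direct comparison.
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(10, 5))

ax_left.pie(shares_year1, labels=labels, autopct="%1.0f%%", startangle=90)
ax_left.set_title("Year 1")

ax_right.pie(shares_year2, labels=labels, autopct="%1.0f%%", startangle=90)
ax_right.set_title("Year 2")

fig.suptitle("Share by group, side by side")
plt.show()
```

In a Jupyter notebook the figure is displayed inline below the cell. Using the same label order and start angle in both pies keeps the slices aligned, which makes the comparison easier to read.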
Section 3: Projects and hands on applications
Section 4: Computing descriptive statistics in Python Pandas Part 1
In this section, we will learn how to use the Pandas library to compute descriptive statistics in Python.
- Analyzing descriptive statistics using Pandas library in Python 3
- Analyzing Baseball players data with Pandas in Python 3
- Computing descriptive statistics in Python Pandas Part 2
- Computing correlation coefficients with Python Scipy library
- Computing the coefficient of variation in Python scipy statistics library
- Classifying World literacy rate using Pandas libraries in Python
- Finding outliers in data using Python Pandas library with quantiles functions
- Using Python Scipy library to compute various measures of center of the data
- Computing the Z score using Python Scipy library
- Computing percentiles of scores and IQR using Python Scipy library
- Computing trimmed statistics using Python 3 scipy statistics library
- Computing statistics with missing values using the statistics library in Python
- Handling missing values using the statistics library in Python
- Computing various medians using the Statistics library in Python
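Several of the measures named in this section can be sketched with scipy.stats and the standard statistics module. The scores below are invented for illustration, and the 10% trimming proportion is an arbitrary choice for the example.

```python
import statistics

import numpy as np
from scipy import stats

# Invented exam scores with one extreme value at the top.
scores = np.array([55.0, 60.0, 65.0, 70.0, 75.0, 80.0, 85.0, 90.0, 95.0, 300.0])

# Z-scores: distance from the mean in standard deviations (ddof=0 by default).
z_scores = stats.zscore(scores)

# Trimmed mean: drop 10% of the observations from each tail before averaging.
trimmed_mean = stats.trim_mean(scores, proportiontocut=0.1)

# Coefficient of variation: standard deviation divided by the mean.
coef_variation = stats.variation(scores)

# Interquartile range: 75th percentile minus 25th percentile.
iqr = stats.iqr(scores)

# The statistics module offers several medians for an even-sized sample.
even_sample = [1, 3, 5, 7]
median_mid = statistics.median(even_sample)        # average of the two middle values
median_low = statistics.median_low(even_sample)    # lower of the two middle values
median_high = statistics.median_high(even_sample)  # higher of the two middle values

print(scores.mean(), trimmed_mean, iqr)
print(median_mid, median_low, median_high)
```

Comparing the plain mean with the trimmed mean shows how much a single extreme value can pull the average.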
Section 5: Computing Descriptive Statistics using the Numpy library in Python
Students will learn how to use the Numpy library to compute descriptive statistics in Python. In particular, they will learn how to handle missing values when using that library.
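As a small sketch of what that looks like in practice, numpy provides NaN-aware counterparts of its summary functions. The heights below are illustrative values only.

```python
import numpy as np

# Illustrative measurements with one missing value.
heights = np.array([150.0, 160.0, np.nan, 170.0, 180.0])

# Count the missing entries before summarizing.
n_missing = int(np.isnan(heights).sum())

# NaN-aware versions skip missing values instead of returning NaN.
center_mean = np.nanmean(heights)
center_median = np.nanmedian(heights)
spread_std = np.nanstd(heights, ddof=1)  # ddof=1 gives the sample standard deviation
q25, q75 = np.nanpercentile(heights, [25, 75])

print(n_missing, center_mean, center_median, spread_std, q25, q75)
```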
Section 6: Hands on analysis of Descriptive statistics data in Python 3
Practical applications of the Data Visualization and Descriptive Statistics course
A serious deal of statistical modelling taught with perfect content. I really appreciate the effort put into not being "hard-to-understand" while still finding a way to teach complex statistics. You will gain very useful knowledge of statistical modelling without getting lost in too many Greek symbols and long explanations. Recommended course for understanding how to do data analysis using Python. Thank you so much!