# Foundations of Data Sciences with Python

## Dates for Open Courses

This course is only available as in-house training. Please contact us at info@python-academy.de.

## Intended Audience

(Aspiring) data scientists with good knowledge of Python. This course can be combined with introductory courses (see Recommended Module Combinations) to achieve appropriate Python skills.

## Motivation

The importance of working with potentially large amounts of data is ever increasing. Data is collected everywhere. Making sense of this data can be a time-consuming and challenging task. Python offers many tools to work with data. Being a widespread general-purpose programming language that is comparatively easy to learn, Python is used for many data-related tasks.

This course gives an overview of basic Python libraries for data science. A good understanding of these libraries is important for more advanced data science topics such as machine learning.

## Course Content

### Introduction to NumPy, SciPy, and pandas

#### NumPy

The NumPy library is the de facto standard for working with arrays. It is used by many other data science libraries. The course introduces the main working principles of NumPy.
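
As a rough illustration of these working principles, the following minimal sketch shows vectorized arithmetic, boolean masking, and a reduction. The temperature values and variable names are made up for this example and are not taken from the course material.

```python
import numpy as np

# Hypothetical sample data: daily temperatures in degrees Celsius.
celsius = np.array([18.5, 21.0, 19.5, 23.0])

# Vectorized arithmetic is applied to every element without an explicit loop.
fahrenheit = celsius * 9 / 5 + 32

# A boolean mask selects the elements that satisfy a condition.
warm_days = celsius[celsius > 20]

# Reductions summarize an array in a single call.
mean_celsius = celsius.mean()

print(fahrenheit)
print(warm_days)
print(mean_celsius)
```

The same pattern of whole-array expressions instead of element-wise loops carries over to most libraries that build on NumPy.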

#### SciPy

SciPy is a collection of scientific modules covering special functions, integration, optimization, interpolation, Fourier transforms, signal processing, linear algebra, statistics, and file I/O.

The course provides an overview of those modules that are important for data science.

#### pandas

pandas is a very powerful Python library for the effective analysis of large amounts of data. The course introduces the basic features and workflows of pandas. The core of the course is the pandas-specific data structures and the data analysis operations they support.
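
To give an impression of such a data structure and a typical analysis operation, here is a minimal sketch that builds a `DataFrame` and aggregates it by group. The column names and figures are invented for illustration only.

```python
import pandas as pd

# Hypothetical sales records; cities and numbers are illustrative only.
df = pd.DataFrame(
    {
        "city": ["Leipzig", "Berlin", "Leipzig", "Berlin"],
        "sales": [120, 200, 80, 150],
    }
)

# Split the rows by city, then sum the sales within each group.
total_per_city = df.groupby("city")["sales"].sum()

print(total_per_city)
```

The result is a `Series` indexed by city, which can be passed on to further pandas operations.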

### SQLAlchemy and SQL in pandas

Relational databases are a very common data store. The course shows how to work with such databases, which can be queried with SQL. The SQLAlchemy library and SQL in pandas provide very convenient interfaces for these tasks.
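
The following sketch hints at what querying a database from pandas can look like. To stay self-contained, it uses an in-memory SQLite database through the standard library module `sqlite3` rather than an SQLAlchemy engine, which is a substitution made for this example. The table and column names are hypothetical.

```python
import sqlite3

import pandas as pd

# An in-memory SQLite database stands in for a real database server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (sensor TEXT, value REAL)")
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?)",
    [("a", 1.5), ("b", 2.5), ("a", 3.5)],
)
conn.commit()

# The SQL query runs in the database; pandas turns the result into a DataFrame.
query = """
    SELECT sensor, AVG(value) AS mean_value
    FROM measurements
    GROUP BY sensor
    ORDER BY sensor
"""
result = pd.read_sql_query(query, conn)
conn.close()

print(result)
```

With a database server, the connection object would typically be replaced by an SQLAlchemy engine, while the call to `pd.read_sql_query` stays the same.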

### A Deep Dive into Time Series in pandas

A lot of data comes in the form of time series. pandas is especially useful for time series processing. The course covers the wide range of possibilities pandas offers here.
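
A few of these possibilities can be sketched as follows: resampling to a coarser frequency, a rolling window, and selecting a date range by label. The dates and values are made up for this example.

```python
import pandas as pd

# Hypothetical daily measurements over six days.
index = pd.date_range("2024-01-01", periods=6, freq="D")
series = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=index)

# Downsample to two-day bins and average the values in each bin.
two_day_mean = series.resample("2D").mean()

# Moving average over three consecutive days; the first two entries are NaN.
rolling_mean = series.rolling(window=3).mean()

# Label-based slicing with date strings includes both end points.
subset = series.loc["2024-01-03":"2024-01-04"]

print(two_day_mean)
print(rolling_mean)
print(subset)
```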

### Time Series Forecasting: Seasonal ARIMA and Signal Processing

Forecasting with autoregressive integrated moving average (ARIMA) models is an established method for modeling time series. Another method is signal processing. The course shows how Python can be used for these tasks.

### Plotting in matplotlib, pandas, and seaborn

Python provides rich libraries for the visualization of data. The course introduces the matplotlib library, which produces many different types of diagrams from within Python with only a few lines of code.

In addition to using matplotlib directly, the course shows how matplotlib can be used via pandas and seaborn. These libraries provide high-level interfaces for efficient data visualization.

## Course Duration

5 days

## Exercises

The participants can follow all steps directly on their computers. There are exercises at the end of each unit providing ample opportunity to apply the freshly learned knowledge.

## Course Material

Every participant receives comprehensive printed materials that cover the whole course content, as well as all source code and the software used.

## Recommended Module Combinations

The modules Python Extensions with Other Languages and Optimizing of Python Programs cover supplementary topics.

The course may be combined with the course Python for Nonprogrammers or Python for Programmers.

