Important. Because of COVID-19, the course will be taught remotely from now on. We will continue to post material on this website and will set up a tool for holding the lectures remotely. More information soon.


Welcome to the webpage of the Big Data Technologies course. This course is taught by Professors Stéphane Boucheron and Stéphane Gaïffas. On this webpage you will find all the teaching material (mainly slides and Jupyter notebooks), as well as instructions for installing the tools required for the course.

Tentative agenda for the course

Learning resources

Here is a list of learning resources that can be useful for this course, among many others:


The course will focus mainly on big data processing and starts with a description and usage of the “Python stack” for data science. The course will use Python as its main programming language, even though the Scala API is better for Spark.

A tentative list of technologies used during the course is as follows:


Python stack

Data Visualization

Big data processing

Data storage / formats / querying