Activities of this lab:
- Learn how to manipulate semi-structured data with
Pandas
. - Learn what ETL processes are.
PistusResa
ETL module: clean and import data into a relational database.PistusResa
Student
module: read data from a relational database using Python code.
Introduction to Pandas
The data used in PistusResa
is stored in two CSV files.
We'll use Pandas
to read the data from the CSV files, clean them and load them into the Pistus database.
The goal of this section is to learn how to use Pandas
.
Download this notebook and import it into Visual Studio Code. Read the notebook and do the exercises before proceeding to the next section.
⏰ Expected duration of this part: at most 1.5 hours.
The PistusResa ETL module
The goal of this section is to extract the data from the input CSV files, transform and load them into the Pistus database.
Download this notebook and import it into Visual Studio Code. Read the notebook and do the exercises before proceeding to the next section.
⏰ Expected duration of this part: at most 30 minutes.
Implementation of PistusResa
You're now ready to work on the implementation of PistusResa.
- Download the skeleton of the application
PistusResa
and import it into Visual Studio Code. - Copy the database file
pistus.db
to the folderdata
.
⏰ Time to spend on this part: at most 1 hour.
The Student module
The Student
module contains the functions necessary to add and edit student information in the database.
The code of the Student
module is in file mstudent.py
.
In file mstudent.py
, the following functions are defined but not implemented.
get_student
, returns a student's data from the database.get_associations
, returns all associations stored in the database.get_roles
, returns the student roles in an association.get_memberships
, returns the associations of a specific student.add_email_address
, adds an email address to a student.add_student
, adds a student to the database.add_membership
, adds a student to an association in the database.delete_email_address
, removes the email address of a student.delete_membership
, deletes a student's membership in an association.update_first_name
, updates student's first name.update_last_name
, updates a student's last name.update_gender
, updates a student's gender.update_email_address
, updates the email address of a student.update_membership
, updates an existing membership of a student.
In file mstudent_test.py
we defined and implemented a test function for each of the functions listed above.
These test functions are called in the main section of file mstudent.py
: see the code after the line 487.
Here's what you have to do:
- Open file
mstudent.py
. - Implement the functions in the order in which they are defined: the order is important as the tests depend on that.
- Each function is commented in file
mstudent.py
. Read carefully all the comments to understand how to implement each function. - You'll have to use SQL queries in each function. Refer to the SQL cheat sheet if you have doubts about the SQL syntax.
- Whenever you finish the implementation of a function, run file
mstudent.py
as a Python script. This will trigger the execution of the tests. - If a test fails, you'll need to understand the error messages and fix the problem.
👉 Remember to remove the instruction raise NotImplementedError
when you implement a function. Otherwise, your function would always be considered as not implemented.
👉 The Registration
module (file mregistration.py
) is already implemented. Look at the code to understand how the Student
module should be implemented.
Next lab
In the next lab we will:
- code a login window for
PistusResa
. - code the deadline and authentication modules of
PistusResa
(if time allows it).
👉 If you reached the end of this lab, and you still have time, you can start the next lab.