Mandatory activity
Objective. This tutorial shows how to use Pandas to build a data pipeline and perform basic data analysis and visualization.
Preparation. Follow these instructions to prepare the working environment:
- Fork this project in GitLab.
- Clone the forked project using the git clone command.
- Open the project with Visual Studio Code.
- Open a terminal and create a Python virtual environment:
python3 -m venv .venv
- Activate the virtual environment:
source .venv/bin/activate
- Upgrade pip:
pip install --upgrade pip
- Install the required libraries in the virtual environment:
pip install -r requirements.txt
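Once the installation has finished, a quick check from Python can confirm that the virtual environment is usable. The sketch below is only an illustration: the helper name missing_packages is hypothetical, and the package list (pandas, geopandas, folium) is an assumption based on the libraries used in this tutorial, not the actual content of requirements.txt.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the names that cannot be imported in the current environment."""
    return [name for name in names if find_spec(name) is None]


# Assumed package list; adapt it to the actual content of requirements.txt.
REQUIRED = ["pandas", "geopandas", "folium"]

if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

If the script reports missing packages, the most likely cause is that the virtual environment was not activated in the terminal where pip was run.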
Work to do. Open notebook notebooks/td13-pandas-en.ipynb and follow the instructions.
ANSWER ELEMENTS
Solutions available on GitLab: git clone git@gitlab-research.centralesupelec.fr:sip/students/solutions/td13-pandas-solutions.git
Optional: interactive maps with Folium
For interactive maps, you can use the Folium library (user guide).
For this activity, it is better to write your code in a Python script instead of a notebook, because the Visual Studio Code notebook extension does not support Folium maps very well.
Creating a map in Folium is relatively simple:
import folium
m = folium.Map()
m.show_in_browser()
A map will be displayed in your web browser. Press Ctrl+C to exit the program.
Position and zoom
With the previous code, you display a full world map. You can, however, control both the portion of the world you want to visualize and the zoom level.
- Using the documentation, display a map centered on France with an appropriate zoom level.
ANSWER ELEMENTS
import folium
m = folium.Map([47, 2], zoom_start=6)
m.show_in_browser()
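An alternative to choosing a center and zoom level by hand is to derive the center from a bounding box and let Folium fit the view to it. The following is a minimal sketch and not part of the expected answer: the helper names bbox_center and france_map are hypothetical, and the default corners are the rough limits of metropolitan France used in the bounding box exercise on this page.

```python
def bbox_center(south_west, north_east):
    """Return the (lat, lon) midpoint of a bounding box given two corners."""
    lat = (south_west[0] + north_east[0]) / 2
    lon = (south_west[1] + north_east[1]) / 2
    return (lat, lon)


def france_map(south_west=(41.0, -5.5), north_east=(52.0, 10.0)):
    """Build a map whose view is fitted to the given bounding box."""
    import folium  # imported here so bbox_center works without Folium

    m = folium.Map(location=bbox_center(south_west, north_east))
    # fit_bounds adjusts the view so that both corners are visible.
    m.fit_bounds([list(south_west), list(north_east)])
    return m
```

Calling france_map().show_in_browser() should open a map framed on France, whatever the size of the browser window.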
Bounding boxes and markers
Bounding boxes and markers are useful to highlight or pinpoint precise locations on a map.
- Using the documentation, add a red bounding box to the map that encloses metropolitan France (including Corsica). You can refer to this source to get the coordinates of the bounding box.
- Add a marker at the location of Paris.
ANSWER ELEMENTS
import folium
m = folium.Map([47, 2], zoom_start=6)
folium.PolyLine(
    [
        [41.0, -5.5],
        [52.0, -5.5],
        [52.0, 10.0],
        [41.0, 10.0],
        [41.0, -5.5]
    ],
    color='red'
).add_to(m)
folium.Marker(
    [48.8575, 2.3514],
    popup="Paris, our beautiful capital!"
).add_to(m)
m.show_in_browser()
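The five points passed to PolyLine follow a simple pattern: the four corners of the box, with the first corner repeated to close the outline. As a sketch, a small helper can build that list from the south, west, north and east limits. The name bbox_ring is hypothetical; it is not a Folium function.

```python
def bbox_ring(south, west, north, east):
    """Return the closed outline of a bounding box as [lat, lon] points."""
    return [
        [south, west],
        [north, west],
        [north, east],
        [south, east],
        [south, west],  # repeat the first corner to close the outline
    ]


# Rough limits of metropolitan France, as used in the answer above.
ring = bbox_ring(41.0, -5.5, 52.0, 10.0)
print(ring)
```

The resulting list can be passed as the first argument of folium.PolyLine in place of the hand-written coordinates.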
If you followed the examples in the documentation, you may have added a simple popup to the Paris marker using the popup argument of the Marker constructor (if not, try it).
The popup text wraps over several lines because the default popup size is small.
- Use the same documentation page to find a way to adjust the popup size for the Paris marker.
ANSWER ELEMENTS
import folium
m = folium.Map([47, 2], zoom_start=6)
folium.PolyLine(
    [
        [41.0, -5.5],
        [52.0, -5.5],
        [52.0, 10.0],
        [41.0, 10.0],
        [41.0, -5.5]
    ],
    color='red'
).add_to(m)
paris_marker = folium.Marker([48.8575, 2.3514])
paris_popup = folium.Popup("Paris, our beautiful capital!", max_width=300)
paris_marker.add_child(paris_popup)
paris_marker.add_to(m)
m.show_in_browser()
In the previous exercises, we used a PolyLine to draw a rectangular bounding box.
Folium also provides more advanced vector layers, such as Rectangle, Polygon and Circle.
These layers let you overlay geometric shapes on a map.
- Overlay a lightly shaded hexagon over France.
ANSWER ELEMENTS
import folium
m = folium.Map([47, 2], zoom_start=6)
locations = [
    [48.4, -4.8],  # Brest
    [51.1, 2.4],   # Dunkirk
    [48.7, 7.8],   # Strasbourg
    [43.7, 7.5],   # Nice
    [42.5, 2.9],   # Perpignan
    [43.4, -1.8]   # Biarritz
]
folium.Polygon(
    locations=locations,
    color='red',
    weight=6,
    fill_color='red',
    fill_opacity=0.2,
    fill=True
).add_to(m)
m.show_in_browser()
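The answer above picks six cities as vertices. If you prefer a geometrically regular shape, the vertices can be computed instead. The sketch below is an approximation under stated assumptions: the helper name regular_hexagon, the center (46.5, 2.5) and the radius of 4 degrees are illustrative choices, and the longitude correction is a simple cosine factor rather than a geodesic computation.

```python
import math


def regular_hexagon(center_lat, center_lon, radius_deg):
    """Return six [lat, lon] vertices of a roughly regular hexagon.

    radius_deg is the center-to-vertex distance in degrees of latitude.
    Longitude offsets are stretched by 1 / cos(latitude) to compensate
    for meridians getting closer together away from the equator.
    """
    stretch = 1 / math.cos(math.radians(center_lat))
    vertices = []
    for k in range(6):
        angle = math.radians(60 * k)  # 0 degrees points north
        lat = center_lat + radius_deg * math.cos(angle)
        lon = center_lon + radius_deg * math.sin(angle) * stretch
        vertices.append([round(lat, 4), round(lon, 4)])
    return vertices


locations = regular_hexagon(46.5, 2.5, 4.0)
print(locations)
```

The resulting locations list can be given to folium.Polygon exactly like the list of cities.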
Choropleth maps
Now we want to draw an interactive choropleth map displaying the median house price in France.
For this, you will need to use the Folium Choropleth class.
You will need to look at the examples and the complete class reference in the Folium documentation.
Here are a few important points:
- Internally, the Choropleth class uses GeoJSON, an open standard for representing spatial features.
- The geo_data argument of the Choropleth constructor accepts a GeoPandas GeoDataFrame; internally, this is converted to GeoJSON. Therefore, any column in the GeoDataFrame is accessible through feature.properties.column_name. This is important when you specify the GeoDataFrame attribute used to merge the GeoDataFrame with the DataFrame containing the statistics to visualize (price per square meter).
- The Choropleth class applies a GeoJSON overlay to the map. Information about this overlay is kept in the geojson attribute of the Choropleth object. The constructor arguments allow you to modify the underlying GeoJSON without using it explicitly.
- For some tasks (for example, adding a popup), you will need to use the geojson object directly.
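To make the feature.properties.column_name notation concrete, the sketch below builds one GeoJSON feature by hand and follows a dotted path through it. It only illustrates the idea: the feature values are made up, resolve_key_on is a hypothetical helper, and Folium's own lookup is not necessarily implemented this way.

```python
# A hand-written GeoJSON feature, similar in shape to what a GeoDataFrame
# row becomes after conversion (values here are illustrative only).
feature = {
    "type": "Feature",
    "properties": {"dept_code": "75", "dept_name": "Paris"},
    "geometry": {
        "type": "Polygon",
        # GeoJSON coordinates are written as [longitude, latitude].
        "coordinates": [[
            [2.22, 48.81], [2.47, 48.81], [2.47, 48.90],
            [2.22, 48.90], [2.22, 48.81],
        ]],
    },
}


def resolve_key_on(feature, key_on):
    """Follow a dotted path such as 'feature.properties.dept_code'."""
    parts = key_on.split(".")
    if parts[0] != "feature":
        raise ValueError("key_on is expected to start with 'feature'")
    value = feature
    for part in parts[1:]:
        value = value[part]
    return value


print(resolve_key_on(feature, "feature.properties.dept_code"))  # prints 75
```

Each column of the GeoDataFrame ends up as one entry of the properties dictionary, which is why the merge key is written with the feature.properties prefix.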
- Using the examples and the Choropleth class specification, create a choropleth map of France showing the median house price per square meter, by department or by region.
- Complement the map with clickable popups displaying the department or region name. You can use GeoJsonPopup (documented here) and add it to the choropleth map's geojson object.
ANSWER ELEMENTS
import folium
import geopandas
import pandas as pd
# Read the DataFrames that we obtained from the first part of the tutorial.
cities_gdf = geopandas.read_file('./data/transformed/geo-communes-2026.zip')
transactions_df = pd.read_parquet('./data/transformed/valeursfoncieres-2025.parquet')
# Dissolve the geometry to obtain polygons for the departments
cities_gdf = cities_gdf.to_crs(2154).dissolve(by='dept_code').reset_index()
cities_gdf['geometry'] = cities_gdf['geometry'].buffer(0.0001)
# Select only the department code and the m2_price
transactions_df = transactions_df[['dept_code', 'm2_price']]
# Compute the median m2 price per department
transactions_df = (
    transactions_df
    .groupby('dept_code', as_index=False)
    .median()
)
# Initialize the map.
m = folium.Map([46, 2], zoom_start=6)
# Create the choropleth map.
choropleth = folium.Choropleth(
    geo_data=cities_gdf,  # The GeoDataFrame containing spatial data.
    data=transactions_df,  # The DataFrame containing the statistics to display.
    columns=["dept_code", "m2_price"],  # Exactly two columns: the key used to merge data with geo_data, and the value.
    key_on="feature.properties.dept_code",  # Refers to dept_code in the GeoJSON representation of the GeoDataFrame.
    fill_opacity=0.9,  # Opacity of the polygons.
    legend_name="Price per m2"  # Legend attached to the map.
)
# Add the choropleth to the map
choropleth.add_to(m)
# Create a popup, where the text is taken from the column dept_name
popup = folium.GeoJsonPopup(fields=["dept_name"])
# Add the popup to the geojson overlay.
popup.add_to(choropleth.geojson)
# Show the map
m.show_in_browser()
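The aggregation step in this answer can be checked on a small table before running it on the full dataset. The sketch below uses made-up prices and a made-up list of department codes; only the column names dept_code and m2_price come from the answer above. It also shows one way to spot departments that have a polygon but no statistics, a frequent cause of unexpected blank areas on a choropleth map.

```python
import pandas as pd

# Toy transactions: three in department 75, two in department 69.
toy = pd.DataFrame({
    "dept_code": ["75", "75", "75", "69", "69"],
    "m2_price": [10000.0, 11000.0, 12000.0, 4000.0, 5000.0],
})

# Same aggregation as in the answer: one median price per department.
medians = toy.groupby("dept_code", as_index=False).median()
print(medians)

# Hypothetical codes present in the geographic data.
geo_codes = ["69", "75", "2A"]

# Departments with a polygon but no price.
missing = sorted(set(geo_codes) - set(medians["dept_code"]))
print(missing)
```

Keeping the codes as strings on both sides matters: a code such as "01" or "2A" cannot be matched if one of the two tables stores the codes as integers.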
Now you are ready to explore more Folium features on your own. For example, you may want to create a timeline showing the evolution of house prices in France across months within a year, or across years, since you have access to five years of data.
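As a possible starting point for such a timeline, the sketch below computes a median price per month with Pandas. It rests on assumptions: the column name date is hypothetical (check the actual name in your transactions DataFrame), and the prices are made up.

```python
import pandas as pd

# Toy transactions with a hypothetical 'date' column.
toy = pd.DataFrame({
    "date": pd.to_datetime([
        "2025-01-10", "2025-01-20",
        "2025-02-05", "2025-02-15", "2025-02-25",
    ]),
    "m2_price": [3000.0, 5000.0, 4000.0, 4200.0, 6000.0],
})

# Group by calendar month and take the median price of each month.
monthly = toy.groupby(toy["date"].dt.to_period("M"))["m2_price"].median()
print(monthly)
```

Adding dept_code to the grouping keys would give one value per department and per month, which is the shape of data a map timeline needs.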

