https://github.com/user-attachments/assets/1b17d66f-3fa0-40dc-bfdc-031ca5b703aa
This project provides an in-depth analysis of voting patterns in the 2024 São Paulo municipal elections, with a focus on the first and second rounds of mayoral and city council races. It examines key aspects such as voter behavior, shifts between rounds, and regional variations in voter turnout.
The dataset, **manually compiled from official sources**, includes over 15,000 entries. The data were gathered with web scraping techniques, then cleaned and examined through exploratory data analysis (EDA). These methods surface electoral trends and help explain the political dynamics of São Paulo, which can inform future election strategies.
This work was developed as part of the Integrated Project and Storytelling course in the second semester of the undergraduate program in Data Science and Artificial Intelligence at PUC-SP in 2024, under the mentorship of the renowned Professor ✨ Rooney Ribeiro Albuquerque Coelho.
His expertise and unwavering dedication to teaching played a crucial role in deepening our understanding of both data science and the art of storytelling.
To access the full Map, click the Map below:
Access the dataset and explore the interactive dashboard via the Power BI link below, where you can use dynamic filters for detailed insights and visualizations.
## 1. Introduction
This report presents a detailed analysis of the data from São Paulo’s 2024 municipal elections, focusing on vote distribution, voter behavior, and the performance of mayoral and councilor candidates. Various visualizations and dashboards are used to explore voting patterns, emerging trends, and the factors influencing electoral outcomes.
The study aims to understand electoral dynamics in São Paulo’s urban and peripheral areas by identifying the factors that shape voter preferences, such as the most-voted parties, candidate profiles, and voting behavior.
Analyzing electoral data is crucial for understanding voter behavior, party preferences, and political trends across different regions. Data visualization offers a clear and efficient way to identify patterns that can inform future campaigns.
The data used in this study were extracted from public sources, providing information on votes by municipality, electoral zone, and political party. The dataset includes details about mayoral and councilor candidates in São Paulo, including the number of votes received by each candidate.
👉🏻 Access Here All Processed Files
The following CSV files were processed:
- address_Mayor.csv
- Mayor_by_city.csv
- Mayor_by_city_round_2.csv
- Mayor.csv
- address_Councilor.csv
- Councilor_by_city.csv
- councilor.csv
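Files like the ones listed above can be loaded and checked with a small helper before any analysis. The following is a minimal sketch, not the project's actual loading code: the function name `load_votes`, the cleaning choices, and the sample rows are assumptions made for illustration, while the column names come from the column overview in this README.

```python
import io
import pandas as pd

# Columns described in this README's column overview
REQUIRED_COLUMNS = ["NM_MUNICIPIO", "NR_ZONA", "DS_CARGO_PERGUNTA",
                    "NM_VOTAVEL", "SG_PARTIDO", "QT_VOTOS"]

def load_votes(source, encoding="latin-1"):
    """Read a vote CSV, verify the expected columns, and tidy the values.

    Hypothetical helper: the cleaning rules below are assumptions.
    """
    df = pd.read_csv(source, encoding=encoding)
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    # Unparseable or empty vote counts become 0
    df["QT_VOTOS"] = pd.to_numeric(df["QT_VOTOS"], errors="coerce").fillna(0).astype(int)
    # Remove stray whitespace from text columns
    for col in ["NM_MUNICIPIO", "NM_VOTAVEL", "SG_PARTIDO"]:
        df[col] = df[col].astype(str).str.strip()
    return df

# Made-up sample rows (not real results)
sample_csv = io.StringIO(
    "NM_MUNICIPIO,NR_ZONA,DS_CARGO_PERGUNTA,NM_VOTAVEL,SG_PARTIDO,QT_VOTOS\n"
    "SÃO PAULO,1,Prefeito,CANDIDATE A,AAA,120\n"
    "SÃO PAULO,2,Prefeito, CANDIDATE B ,BBB,n/a\n"
)
sample = load_votes(sample_csv)
print(sample[["NM_VOTAVEL", "QT_VOTOS"]])
```

Failing early on a missing column avoids confusing `KeyError`s later in the filtering and grouping steps.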
Here is an overview of the main columns in the processed CSV files:
- **NM_MUNICIPIO**: Municipality name
- **NR_ZONA**: Electoral zone number
- **DS_CARGO_PERGUNTA**: Election role (Mayor or Councilor)
- **NM_VOTAVEL**: Candidate name
- **SG_PARTIDO**: Party acronym
- **QT_VOTOS**: Number of votes received

The methodology was divided into several steps:
The exploratory analysis uncovered several interesting trends, such as:
The vote distribution revealed a large concentration in São Paulo and neighboring urban areas. The analysis indicated the need for specific strategies for peripheral areas.
```python
import plotly.express as px
import pandas as pd
# Reading the dataset
election = pd.read_csv('/path/to/your/data.csv', encoding='latin-1')
# Plotting vote distribution by municipality
fig = px.histogram(election, x="NM_MUNICIPIO", y="QT_VOTOS",
                   title="Votes by Municipality",
                   color_discrete_sequence=["#1f77b4"])
fig.update_layout(bargap=0.2)
fig.show()
```
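The concentration described above can also be expressed as a single number: the share of all votes cast in the top `n` municipalities. This is a minimal sketch under stated assumptions; the helper name `top_share` and the toy figures are hypothetical and do not reproduce the project's results.

```python
import pandas as pd

def top_share(df, group_col="NM_MUNICIPIO", votes_col="QT_VOTOS", n=1):
    """Share of total votes held by the n groups with the most votes."""
    totals = df.groupby(group_col)[votes_col].sum().sort_values(ascending=False)
    return totals.head(n).sum() / totals.sum()

# Toy data for illustration only (not real vote counts)
toy = pd.DataFrame({
    "NM_MUNICIPIO": ["SÃO PAULO", "SÃO PAULO", "GUARULHOS", "OSASCO"],
    "QT_VOTOS": [600, 200, 150, 50],
})
share = top_share(toy, n=1)
print(f"Top municipality share: {share:.0%}")
```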
### 7.2 Most Voted Mayoral Candidates
Ricardo Nunes (MDB) stood out in central zones, while Guilherme Boulos (PSOL) had strong support in the peripheries.
```python
# Filtering mayoral candidates in São Paulo, excluding null votes
mayor = election[(election["DS_CARGO_PERGUNTA"] == "Prefeito") &
                 (election["NM_MUNICIPIO"] == "SÃO PAULO") &
                 (election["SG_PARTIDO"] != "#NULO#")].copy()
# Grouping and ordering candidates by votes (summing only the vote column)
mayor = (mayor.groupby(["NM_VOTAVEL", "SG_PARTIDO"], as_index=False)["QT_VOTOS"]
              .sum()
              .sort_values("QT_VOTOS", ascending=False)
              .reset_index(drop=True))
# Calculating each candidate's share of the total
total_votes = mayor["QT_VOTOS"].sum()
mayor["PERCENTAGE"] = mayor["QT_VOTOS"] / total_votes
# Bar chart
fig = px.bar(mayor, x="NM_VOTAVEL", y="QT_VOTOS", color="SG_PARTIDO",
             title="Most Voted Mayoral Candidates",
             color_discrete_sequence=px.colors.qualitative.Dark24)
fig.show()
```
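Because the study covers both rounds, the vote shares above connect to the runoff rule: in Brazilian municipalities with more than 200,000 voters, a second round is held when no candidate wins an absolute majority of valid votes (blank and null votes excluded). The sketch below illustrates that check; the function name `needs_runoff` and the vote counts are made up for illustration.

```python
def needs_runoff(valid_votes):
    """Return (runoff_needed, leader, leader_share).

    valid_votes maps candidate name -> valid votes, with blank and
    null votes already excluded by the caller.
    """
    total = sum(valid_votes.values())
    if total <= 0:
        raise ValueError("no valid votes")
    leader = max(valid_votes, key=valid_votes.get)
    # Absolute majority means strictly more than half of the valid votes
    runoff_needed = 2 * valid_votes[leader] <= total
    return runoff_needed, leader, valid_votes[leader] / total

# Toy numbers, not real results
first_round = {"CANDIDATE A": 45, "CANDIDATE B": 40, "CANDIDATE C": 15}
runoff, leader, share = needs_runoff(first_round)
print(runoff, leader, round(share, 2))
```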
### 7.3 Most Voted Councilor Candidates
Vote distribution showed a concentration among local candidates, with highlights for Tabata Amaral (PSB) and Renato Sorriso (PL) in peripheral zones.
```python
# Filtering councilor candidates in São Paulo, excluding null votes
councilor = election[(election["DS_CARGO_PERGUNTA"] == "Vereador") &
                     (election["NM_MUNICIPIO"] == "SÃO PAULO") &
                     (election["SG_PARTIDO"] != "#NULO#")].copy()
# Grouping and ordering candidates by votes (summing only the vote column)
councilor = (councilor.groupby(["NM_VOTAVEL", "SG_PARTIDO"], as_index=False)["QT_VOTOS"]
                      .sum()
                      .sort_values("QT_VOTOS", ascending=False)
                      .reset_index(drop=True))
# Calculating each candidate's share of the total
total_votes = councilor["QT_VOTOS"].sum()
councilor["PERCENTAGE"] = councilor["QT_VOTOS"] / total_votes
# Bar chart
fig = px.bar(councilor, x="NM_VOTAVEL", y="QT_VOTOS", color="SG_PARTIDO",
             title="Most Voted Councilor Candidates",
             color_discrete_sequence=px.colors.qualitative.Dark24)
fig.show()
```
### 7.4 Most Voted Mayors by Electoral Zone
Central zones favored Ricardo Nunes, while peripheral zones were dominated by Guilherme Boulos.
```python
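# Zone-to-neighborhood lookup. Both lists must have the same length, so ZONE
# is limited to the 15 entries that pair with the neighborhoods listed here.
areas = pd.DataFrame({
    "ZONE": [1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5],
    "NEIGHBORHOOD": ["BELA VISTA", "CONSOLACAO", "LIBERDADE", "REPUBLICA", "SE", "BARRA FUNDA", "PERDIZES", "SANTA CECILIA", "BOM RETIRO", "BRAS", "PARI", "AGUA RASA", "BELEM", "MOOCA", "JD PAULISTA"]
})
# Votes per zone and party. The `mayor` frame from 7.2 is aggregated per
# candidate and has no zone column, so this starts from the raw data.
mayor_zone = (election[(election["DS_CARGO_PERGUNTA"] == "Prefeito") &
                       (election["NM_MUNICIPIO"] == "SÃO PAULO") &
                       (election["SG_PARTIDO"] != "#NULO#")]
              .groupby(["NR_ZONA", "SG_PARTIDO"], as_index=False)["QT_VOTOS"].sum())
# Merging with the lookup (a zone's votes repeat for each of its neighborhoods)
merged = mayor_zone.merge(areas, left_on="NR_ZONA", right_on="ZONE")
# Bar chart
fig = px.bar(merged, x="NEIGHBORHOOD", y="QT_VOTOS", color="SG_PARTIDO", title="Most Voted Mayor by Zone")
fig.show()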
```
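To name the single most-voted candidate in each zone, rather than plot every party's votes, one option is to take the row with the maximum vote count per zone. The following is a sketch on made-up data, assuming the column names from the column overview; it is not the project's own implementation.

```python
import pandas as pd

# Toy data for illustration only (not real vote counts)
toy = pd.DataFrame({
    "NR_ZONA": [1, 1, 2, 2, 2],
    "NM_VOTAVEL": ["CANDIDATE A", "CANDIDATE B", "CANDIDATE A", "CANDIDATE B", "CANDIDATE C"],
    "SG_PARTIDO": ["AAA", "BBB", "AAA", "BBB", "CCC"],
    "QT_VOTOS": [300, 250, 100, 180, 90],
})

# Total votes per zone and candidate
per_zone = toy.groupby(["NR_ZONA", "NM_VOTAVEL", "SG_PARTIDO"], as_index=False)["QT_VOTOS"].sum()

# idxmax returns, for each zone, the row label of the highest vote count
winners = per_zone.loc[per_zone.groupby("NR_ZONA")["QT_VOTOS"].idxmax()].reset_index(drop=True)
print(winners)
```

Ties are resolved by `idxmax` in favor of the first matching row, so a real analysis may want an explicit tie check.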
### 7.5 Most Voted Councilors by Electoral Zone
The analysis revealed candidates like Márcio Chagas (PSOL) and Luana Almeida (PL) performing well in suburban areas.
```python
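# Zone-to-neighborhood lookup for the councilor analysis
areas = pd.DataFrame({
    "ZONE": [1, 1, 1, 2, 2, 3, 3, 4, 5, 6],
    "NEIGHBORHOOD": ["BELA VISTA", "CONSOLACAO", "LIBERDADE", "MOOCA", "CAMPO BELO", "ITAQUERA", "CID DUTRA", "PIRITUBA", "VILA PRUDENTE", "TATUAPE"]
})
# Votes per zone and party. The `councilor` frame from 7.3 is aggregated per
# candidate and has no zone column, so this starts from the raw data.
councilor_zone = (election[(election["DS_CARGO_PERGUNTA"] == "Vereador") &
                           (election["NM_MUNICIPIO"] == "SÃO PAULO") &
                           (election["SG_PARTIDO"] != "#NULO#")]
                  .groupby(["NR_ZONA", "SG_PARTIDO"], as_index=False)["QT_VOTOS"].sum())
# Merging with the lookup
councilor_merged = councilor_zone.merge(areas, left_on="NR_ZONA", right_on="ZONE")
# Bar chart
fig = px.bar(councilor_merged, x="NEIGHBORHOOD", y="QT_VOTOS", color="SG_PARTIDO", title="Most Voted Councilor by Zone")
fig.show()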
```
### 7.6 Most Voted Mayors by Municipality
The municipality-level analysis confirmed Ricardo Nunes' dominance in urban areas and Boulos’ strength in peripheral zones.
```python
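# Grouping mayoral votes by municipality, starting from the raw data
# (the `mayor` frame from 7.2 is already aggregated per candidate)
municipality = (election[election["DS_CARGO_PERGUNTA"] == "Prefeito"]
                .groupby("NM_MUNICIPIO")["QT_VOTOS"].sum()
                .sort_values(ascending=False)
                .reset_index())
# Bar chart
fig = px.bar(municipality, x="NM_MUNICIPIO", y="QT_VOTOS", title="Most Voted Mayor by Municipality")
fig.show()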
```
### 7.7 Most Voted Councilors by Municipality
The analysis showed a strong presence of candidates like Eduardo Suplicy (PT) across several municipalities, reflecting broad political support.
```python
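# Grouping councilor votes by municipality, starting from the raw data
# (the `councilor` frame from 7.3 is already aggregated per candidate)
municipality_councilor = (election[election["DS_CARGO_PERGUNTA"] == "Vereador"]
                          .groupby("NM_MUNICIPIO")["QT_VOTOS"].sum()
                          .sort_values(ascending=False)
                          .reset_index())
# Bar chart
fig = px.bar(municipality_councilor, x="NM_MUNICIPIO", y="QT_VOTOS", title="Most Voted Councilor by Municipality")
fig.show()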
```
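A claim about presence "across several municipalities" can be checked by counting, for each candidate, the distinct municipalities in which they received at least one vote. This is a hedged sketch on invented rows; the variable name `breadth` and the placeholder municipality names are assumptions for illustration.

```python
import pandas as pd

# Toy data for illustration only (not real vote counts)
toy = pd.DataFrame({
    "NM_VOTAVEL": ["CANDIDATE A", "CANDIDATE A", "CANDIDATE A", "CANDIDATE B", "CANDIDATE B"],
    "NM_MUNICIPIO": ["CITY X", "CITY Y", "CITY X", "CITY X", "CITY X"],
    "QT_VOTOS": [10, 20, 5, 100, 50],
})

# Keep only rows with votes, then count distinct municipalities and total votes
breadth = (toy[toy["QT_VOTOS"] > 0]
           .groupby("NM_VOTAVEL")
           .agg(municipalities=("NM_MUNICIPIO", "nunique"),
                votes=("QT_VOTOS", "sum"))
           .sort_values("municipalities", ascending=False))
print(breadth)
```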
### 7.8 Distribution of Votes by Political Party
The vote distribution charts confirmed the dominance of MDB and PSOL, with PSOL's support growing in peripheral zones.
```python
# Analyzing distribution of votes by party (summing only the vote column)
party_votes = (election.groupby("SG_PARTIDO")["QT_VOTOS"].sum()
               .sort_values(ascending=False)
               .reset_index())
# Bar chart
fig = px.bar(party_votes, x="SG_PARTIDO", y="QT_VOTOS", title="Distribution of Votes by Political Party")
fig.show()
```
## 8. Interactive Power BI Dashboards: [Click to access the link](https://app.powerbi.com/view?r=eyJrIjoiNTNmY2Y2YzgtODY3Yy00M2ViLWI0NDItMTdiZDJlNTg4Zjk2IiwidCI6IjhlYjI5MjAxLWEyN2QtNDMwMi04NDczLWM5ODJlYjViZTkzNSJ9)
### 8.1 Dashboard 1: Geographic Distribution of Votes
This dashboard provided a detailed view of electoral preferences by region, highlighting the polarization between urban and peripheral areas.
```python
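import pandas as pd
import plotly.express as px
# Map chart of vote distribution by municipality
df = pd.read_csv('distribution_votes.csv')
# Note: to draw municipalities, px.choropleth also needs boundary shapes passed
# via `geojson` and `featureidkey`; by default `locations` is read as country codes.
fig = px.choropleth(df, locations="municipality", color="votes", hover_name="municipality", title="Geographic Distribution of Votes")
fig.show()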
```
### 8.2 Dashboard 2: Candidate Performance by Region
This dashboard was essential for understanding candidate performance across regions, using heatmaps and bar charts.
```python
import pandas as pd
import plotly.express as px
# Scatter plot of candidate performance by electoral zone
df = pd.read_csv('candidates_performance.csv')
fig = px.scatter(df, x="zone", y="votes", color="party", title="Candidate Performance by Electoral Zone")
fig.show()
```
### 8.3 Dashboard 3: Voting Analysis by Party
This visualization shows the vote distribution by party and the electoral preferences of each zone.
```python
import pandas as pd
import plotly.express as px
# Bar chart for vote analysis by party
df = pd.read_csv('votes_by_party.csv')
fig = px.bar(df, x="party", y="votes", color="party", title="Vote Analysis by Party")
fig.show()
```
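Party preferences by zone can be tabulated as a zone-by-party matrix of vote shares, which is the kind of table a heatmap is drawn from. The sketch below uses invented numbers and assumes the column names from the column overview; it is not taken from the Power BI dashboards.

```python
import pandas as pd

# Toy data for illustration only (not real vote counts)
toy = pd.DataFrame({
    "NR_ZONA": [1, 1, 2, 2],
    "SG_PARTIDO": ["AAA", "BBB", "AAA", "BBB"],
    "QT_VOTOS": [60, 40, 25, 75],
})

# Rows are zones, columns are parties, cells are summed votes
matrix = toy.pivot_table(index="NR_ZONA", columns="SG_PARTIDO",
                         values="QT_VOTOS", aggfunc="sum", fill_value=0)

# Divide each row by its total so every zone sums to 1
shares = matrix.div(matrix.sum(axis=1), axis=0)
print(shares)
```

Normalizing by row makes zones of different sizes comparable, since each cell is then the party's share within that zone.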
### 8.4 Dashboard 4: Voting by Demographic Profile
This dashboard analyzed voting by age, gender, and social class, highlighting preferences of younger voters and lower social classes for progressive candidates.
```python
import pandas as pd
import plotly.express as px
# Pie chart of voting by age group
df = pd.read_csv('votes_by_age_group.csv')
fig = px.pie(df, names="age_group", values="votes", title="Voting by Age Group")
fig.show()
```
### 8.5 Dashboard 5: Voting Comparison Between 2020 and 2024 Elections
The comparison between the two elections revealed significant changes in electoral preferences, with PSOL gaining ground in the peripheries.
```python
import pandas as pd
import plotly.express as px
# Scatter plot comparing mayoral and councilor votes
df = pd.read_csv('candidates_comparison.csv')
fig = px.scatter(df, x="votes_mayor", y="votes_councilor", color="party", title="Comparison of Mayoral and Councilor Candidates")
fig.show()
```
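A comparison between two elections can be quantified as the change in each party's vote share, measured in percentage points. This is a minimal sketch under stated assumptions: the helper `share_change`, the party labels, and all numbers are hypothetical, and the 2020 data itself is not among the files listed in this README.

```python
import pandas as pd

def share_change(prev, curr, key="SG_PARTIDO", votes="QT_VOTOS"):
    """Per-party vote share in two elections and the change in percentage points."""
    prev_share = prev.groupby(key)[votes].sum() / prev[votes].sum()
    curr_share = curr.groupby(key)[votes].sum() / curr[votes].sum()
    # Outer join on party; a party absent from one election gets share 0
    out = pd.concat([prev_share.rename("share_prev"),
                     curr_share.rename("share_curr")], axis=1).fillna(0.0)
    out["change_pp"] = (out["share_curr"] - out["share_prev"]) * 100
    return out.sort_values("change_pp", ascending=False)

# Toy data for illustration only (not real vote counts)
earlier = pd.DataFrame({"SG_PARTIDO": ["AAA", "BBB"], "QT_VOTOS": [500, 500]})
later = pd.DataFrame({"SG_PARTIDO": ["AAA", "BBB", "CCC"], "QT_VOTOS": [400, 500, 100]})
change = share_change(earlier, later)
print(change)
```

Working in shares rather than raw votes keeps the comparison meaningful when turnout differs between the two elections.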
## 9. Conclusion
The analysis of the 2024 São Paulo municipal election data provided valuable insights into voter behavior and emerging trends. We observed increasing political polarization, with PSOL gaining strength in peripheral areas and MDB maintaining a solid base in central urban areas. Additionally, the analysis revealed a shift in electoral preferences, with growing support for more progressive parties, especially among younger voters and lower social classes.
The analysis of charts and dashboards enabled a more detailed understanding of vote distribution by geography, candidate performance by electoral zone, and vote segmentation by party and demographic profile. The trends observed suggest that future electoral campaigns should focus on more segmented strategies, considering the social and economic characteristics of each region.
### Recommendations for future campaigns:
- **Personalize electoral communication** for different regions, considering demographic and socioeconomic profiles.
- **Leverage the growth of social media** and other digital platforms to connect with younger voters and those with limited access to traditional media.
- **Tailor campaign proposals** according to local issues such as security, health, and education, which were decisive factors for votes in various peripheral zones.
## 10. Extra Material
- **🇺🇸 Data Analysing Report**: [Click 🔗](https://github.com/Mindful-AI-Assistants/SP2024-Election-Analysis/blob/77ee8d3319a14c05ae6d3b023e0a4101ec5e2943/Data%20Analysing%20Report/%F0%9F%87%BA%F0%9F%87%B8Data%20Analysing%20Report.pdf)
- **🇧🇷 Data Analysing Report**: [Click 🔗](https://github.com/Mindful-AI-Assistants/SP2024-Election-Analysis/blob/9ab39e27ff0f2e8444b7c773ec309986d073ad92/Data%20Analysing%20Report/%F0%9F%87%A7%F0%9F%87%B7Analise%20do%20Dados%20Relatoirio.pdf)
- **Power BI Access Link**: [Click 🔗](https://app.powerbi.com/view?r=eyJrIjoiNTNmY2Y2YzgtODY3Yy00M2ViLWI0NDItMTdiZDJlNTg4Zjk2IiwidCI6IjhlYjI5MjAxLWEyN2QtNDMwMi04NDczLWM5ODJlYjViZTkzNSJ9)
- **Power BI File**: [Click 🔗](https://github.com/Mindful-AI-Assistants/SP2024-Election-Analysis/blob/8c71e68c34ccfd2c14ff3ecb8d0f7558bcbe109d/Power%20B%20I%20Files/DashBoard.pbix)
- **QR Code**:
Scan the code to access the data and visualizations on Power BI.
Copyright 20245 Mindful-AI-Assistants. Code released under the [MIT license](https://github.com/Mindful-AI-Assistants/.github/blob/ad6948fdec771e022d49cd96f99024fcc7f1106a/LICENSE).