SP2024-Election-Analysis


[🇧🇷 Português] [🇺🇸 English]


<p align="center"> 📊 Data Analysis and Visualization: São Paulo Municipal Elections - 1st and 2nd Round (2024)


📺 Watch in Full HD on YouTube


https://github.com/user-attachments/assets/1b17d66f-3fa0-40dc-bfdc-031ca5b703aa


#### <p align="center"> Sponsor Mindful AI Assistants


This project provides an in-depth analysis of voting patterns in the 2024 São Paulo municipal elections, with a focus on the first and second rounds of mayoral and city council races. It examines key aspects such as voter behavior, shifts between rounds, and regional variations in voter turnout.

The dataset was **manually compiled from official sources** , includes over 15,000 entries. To gather relevant data, the project employed web scraping techniques, followed by data cleaning and exploratory data analysis (EDA). These methods uncover valuable insights into electoral trends and provide strategic guidance for understanding the political dynamics of São Paulo, which can inform future election strategies

This work was developed as part of the Integrated Project and Storytelling course in the second semester of the undergraduate program in Data Science and Artificial Intelligence at PUC-SP in 2024, under the mentorship of the renowned Professor ✨ Rooney Ribeiro Albuquerque Coelho

His expertise and unwavering dedication to teaching played a crucial role in deepening our understanding of both data science and the art of storytelling.


Developed by:



Interactive Map

To access the full Map, click the Map below:


</a>


Power BI Dashboard

Access the dataset and explore the interactive dashboard via the Power BI link below, where you can use dynamic filters for detailed insights and visualizations.

📈 Power BI Dashboard


Table of Contents

## 1. Introduction

This report presents a detailed analysis of the data from São Paulo’s 2024 municipal elections, focusing on vote distribution, voter behavior, and the performance of mayoral and councilor candidates. Various visualizations and dashboards are used to explore voting patterns, emerging trends, and the factors influencing electoral outcomes.

2. Study Objectives

The study aims to understand electoral dynamics in São Paulo’s urban and peripheral areas, identifying factors determining voter preferences, such as the most-voted parties, candidate profiles, and voting behavior.

3. Theoretical Background

Analyzing electoral data is crucial for understanding voter behavior, party preferences, and political trends across different regions. Data visualization offers a clear and efficient way to identify patterns that can inform future campaigns.

4. Dataset Description

Access Dataset:

The data used in this study were extracted from public sources, providing information on votes by municipality, electoral zone, and political party. The dataset includes details about mayoral and councilor candidates in São Paulo, including the number of votes received by each candidate.

TSE Documentatiuon

Processed Files

👉🏻 Access Here All Processed Files

The following CSV files were processed:

Sample Column Structure

Here is an overview of the main columns in the processed CSV files:

5. Methodology

The methodology was divided into several steps:

6. Exploratory Data Analysis

The exploratory analysis uncovered several interesting trends, such as:

7. Charts and Dashboards

7.1. Vote Distribution by Municipality

The votes distribution revealed a large concentration in São Paulo and neighboring urban areas. The analysis indicated the need for specific strategies for peripheral areas.

import plotly.express as px
import pandas as pd

# Reading the dataset
election = pd.read_csv('/path/to/your/data.csv', encoding='latin-1')

# Plotting vote distribution by municipality
fig = px.histogram(election, x="NM_MUNICIPIO", y="QT_VOTOS", 
                   title="Votes by Municipality", 
                   color_discrete_sequence=["#1f77b4"])
fig.update_layout(bargap=0.2)
fig.show()



### 7.2. Most Voted Mayoral Candidates Ricardo Nunes (MDB) stood out in central zones, while Guilherme Boulos (PSOL) had strong support in the peripheries. ```python # Filtering mayoral candidates mayor = election[(election["DS_CARGO_PERGUNTA"] == "Prefeito") & (election["NM_MUNICIPIO"] == "SÃO PAULO") & (election["SG_PARTIDO"] != "#NULO#")].copy() # Grouping and ordering candidates by votes mayor = mayor.groupby(['NM_VOTAVEL', 'SG_PARTIDO']).sum().sort_values("QT_VOTOS", ascending=False)["QT_VOTOS"].reset_index() # Calculating vote percentages total_votes = mayor["QT_VOTOS"].sum() mayor["PERCENTAGE"] = mayor["QT_VOTOS"] / total_votes # Bar chart fig = px.bar(mayor, x="NM_VOTAVEL", y="QT_VOTOS", color="SG_PARTIDO", title="Most Voted Mayoral Candidates", color_discrete_sequence=px.colors.qualitative.Dark24) fig.show() ```


7.3. Most Voted Councilor Candidates Vote distribution showed a concentration among local candidates, with highlights for Tabata Amaral (PSB) and Renato Sorriso (PL) in peripheral zones. ```python # Filtering councilor candidates councilor = election[(election["DS_CARGO_PERGUNTA"] == "Vereador") & (election["NM_MUNICIPIO"] == "SÃO PAULO") & (election["SG_PARTIDO"] != "#NULO#")].copy() # Grouping and ordering candidates by votes councilor = councilor.groupby(['NM_VOTAVEL', 'SG_PARTIDO']).sum().sort_values('QT_VOTOS', ascending=False)["QT_VOTOS"].reset_index() # Calculating vote percentages total_votes = councilor["QT_VOTOS"].sum() councilor["PERCENTAGE"] = councilor["QT_VOTOS"] / total_votes # Bar chart fig = px.bar(councilor, x="NM_VOTAVEL", y="QT_VOTOS", color="SG_PARTIDO", title="Most Voted Councilor Candidates", color_discrete_sequence=px.colors.qualitative.Dark24) fig.show() ```


### 7.4 Most Voted Mayors by Electoral Zone Central zones favored Ricardo Nunes, while peripheral zones were dominated by Guilherme Boulos. ```python # Data of zones and neighborhoods areas = pd.DataFrame({ "ZONE": [1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 6, 246, 246, 247, 247, 248, 248, 249, 250, 250, 250, 251, 251, 252], "NEIGHBORHOOD": ["BELA VISTA", "CONSOLACAO", "LIBERDADE", "REPUBLICA", "SE", "BARRA FUNDA", "PERDIZES", "SANTA CECILIA", "BOM RETIRO", "BRAS", "PARI", "AGUA RASA", "BELEM", "MOOCA", "JD PAULISTA"] }) # Merging with mayor data merged = mayor.merge(areas, left_on="NR_ZONE", right_on="ZONE") # Bar chart fig = px.bar(merged, x="NEIGHBORHOOD", y="QT_VOTES", color="SG_PARTY", title="Most Voted Mayor by Zone") fig.show() ```


### 7.5 Most Voted Councilors by Electoral Zone The analysis revealed candidates like Márcio Chagas (PSOL) and Luana Almeida (PL) performing well in suburban areas. ```python # Analyzing most voted councilors by electoral zone areas = pd.DataFrame({ "ZONE": [1, 1, 1, 2, 2, 3, 3, 4, 5, 6], "NEIGHBORHOOD": ["BELA VISTA", "CONSOLACAO", "LIBERDADE", "MOOCA", "CAMPO BELO", "ITAQUERA", "CID DUTRA", "PIRITUBA", "VILA PRUDENTE", "TATUAPE"] }) # Merging councilor data councilor_merged = councilor.merge(areas, left_on="NR_ZONE", right_on="ZONE") # Bar chart fig = px.bar(councilor_merged, x="NEIGHBORHOOD", y="QT_VOTES", color="SG_PARTY", title="Most Voted Councilor by Zone") fig.show() ```


### 7.6 Most Voted Mayors by Municipality The municipality-level analysis confirmed Ricardo Nunes' dominance in urban areas and Boulos’ strength in peripheral zones. ```python # Grouping mayors by municipality municipality = mayor.groupby("NM_MUNICIPIO").sum().sort_values("QT_VOTES", ascending=False) # Bar chart fig = px.bar(municipality, x=municipality.index, y="QT_VOTES", title="Most Voted Mayor by Municipality") fig.show() ```


### 7.7 Most Voted Councilors by Municipality The analysis showed a strong presence of candidates like Eduardo Suplicy (PT) across several municipalities, reflecting broad political support. ```python # Grouping councilors by municipality municipality_councilor = councilor.groupby("NM_MUNICIPIO").sum().sort_values("QT_VOTES", ascending=False) # Bar chart fig = px.bar(municipality_councilor, x=municipality_councilor.index, y="QT_VOTES", title="Most Voted Councilor by Municipality") fig.show() ```


### 7.8 Distribution of Votes by Political Party The vote distribution charts confirmed the dominance of MDB and PSOL, with PSOL's support growing in peripheral zones. ```python # Analyzing distribution of votes by party party_votes = election.groupby("SG_PARTIDO").sum().sort_values("QT_VOTES", ascending=False) # Bar chart fig = px.bar(party_votes, x=party_votes.index, y="QT_VOTES", title="Distribution of Votes by Political Party") fig.show() ```


## 8. Interactive Power BI Dashboards: [Click to access the link](https://app.powerbi.com/view?r=eyJrIjoiNTNmY2Y2YzgtODY3Yy00M2ViLWI0NDItMTdiZDJlNTg4Zjk2IiwidCI6IjhlYjI5MjAxLWEyN2QtNDMwMi04NDczLWM5ODJlYjViZTkzNSJ9) ### 8.1 Dashboard 1: Geographic Distribution of Votes This dashboard provided a detailed view of electoral preferences by region, highlighting the polarization between urban and peripheral areas. ```python import plotly.express as px # Gráfico de mapa para distribuição de votos por município df = pd.read_csv('distribution_votes.csv') fig = px.choropleth(df, locations="municipality", color="votes", hover_name="municipality", title="Distribuição Geográfica de Votos") fig.show() ```


### Dashboard 2: Candidate Performance by Region This dashboard was essential for understanding candidate performance across regions, using heatmaps and bar charts. ```python import plotly.express as px # Bar chart for vote analysis by party df = pd.read_csv('votes_by_party.csv') fig = px.bar(df, x="party", y="votes", color="party", title="Vote Analysis by Party") fig.show() ```


### 8.3 Dashboard 3: Voting Analysis by Party The visualization allowed for identifying votes distribution by party and electoral preferences by zone. ```python # Dashboard for candidate performance df = pd.read_csv('candidates_performance.csv') fig = px.scatter(df, x="zone", y="votes", color="party", title="Candidate Performance by Electoral Zone") fig.show() ```


### 8.4 Dashboard 4: Voting by Demographic Profile This dashboard analyzed voting by age, gender, and social class, highlighting preferences of younger voters and lower social classes for progressive candidates. ```python # Dashboard for comparison between candidates df = pd.read_csv('candidates_comparison.csv') fig = px.scatter(df, x="votes_mayor", y="votes_councilor", color="party", title="Comparison of Mayoral and Councilor Candidates") fig.show() ```


### 8.5 Dashboard 5: Voting Comparison Between 2020 and 2024 Elections The comparison between the two elections revealed significant changes in electoral preferences, with PSOL gaining ground in the peripheries. ```python \# Dashboard for voting by age group df = pd.read_csv('votes_by_age_group.csv') fig = px.pie(df, names="age_group", values="votes", title="Voting by Age Group") fig.show() ```


## 9. Conclusion The analysis of the 2024 São Paulo municipal election data provided valuable insights into voter behavior and emerging trends. We observed increasing political polarization, with PSOL gaining strength in peripheral areas and MDB maintaining a solid base in central urban areas. Additionally, the analysis revealed a shift in electoral preferences, with growing support for more progressive parties, especially among younger voters and lower social classes. The analysis of charts and dashboards enabled a more detailed understanding of vote distribution by geography, candidate performance by electoral zone, and vote segmentation by party and demographic profile. The trends observed suggest that future electoral campaigns should focus on more segmented strategies, considering the social and economic characteristics of each region. ### Recommendations for future campaigns: - **Personalize electoral communication** for different regions, considering demographic and socioeconomic profiles. - **Leverage the growth of social media** and other digital platforms to connect with younger voters and those with limited access to traditional media. - **Tailor campaign proposals** according to local issues such as security, health, and education, which were decisive factors for votes in various peripheral zones. ## 10. Extra Material - **🇺🇸 Data Analysing Report**: [Click 🔗](https://github.com/Mindful-AI-Assistants/SP2024-Election-Analysis/blob/77ee8d3319a14c05ae6d3b023e0a4101ec5e2943/Data%20Analysing%20Report/%F0%9F%87%BA%F0%9F%87%B8Data%20Analysing%20Report.pdf) - **🇧🇷 Data Analysing Report**[Click 🔗](https://github.com/Mindful-AI-Assistants/SP2024-Election-Analysis/blob/9ab39e27ff0f2e8444b7c773ec309986d073ad92/Data%20Analysing%20Report/%F0%9F%87%A7%F0%9F%87%B7Analise%20do%20Dados%20Relatoirio.pdf) - **Power BI Access Link**: [Click 🔗](https://app.powerbi.com/view?r=eyJrIjoiNTNmY2Y2YzgtODY3Yy00M2ViLWI0NDItMTdiZDJlNTg4Zjk2IiwidCI6IjhlYjI5MjAxLWEyN2QtNDMwMi04NDczLWM5ODJlYjViZTkzNSJ9) - **Power BI File**: [Click 🔗](https://github.com/Mindful-AI-Assistants/SP2024-Election-Analysis/blob/8c71e68c34ccfd2c14ff3ecb8d0f7558bcbe109d/Power%20B%20I%20Files/DashBoard.pbix)
- **QR Code**: Scan the code to access the data and visualizations on Power BI.


## 11. References - **Superior Electoral Court (TSE)** - **[Electoral Data Source]** - Articles on electoral data analysis and data visualization ## 12. How to Run the Project This project was developed in Python and uses libraries like Pandas, Plotly, and Dash for data analysis and visualization. Follow the instructions below to set up the environment and run the code.
### 12.1 Requirements Before running the project, you need to have Python and Git installed on your system. Download Python Download Git Additionally, you will need the dependencies listed in the requirements.txt file: ```dash pandas plotly ```
### 12.2 Cloning the Repository To get started, clone the repository to your computer: ```bash git clone https://github.com/your_user/SP2024-Election-Analysis.git ```
### 12.3 Installing Dependencies Install the required dependencies by running the following command: ```python pip install -r requirements.txt ``` ### 12.4 Creating the Executable To create an executable of the project, you can use PyInstaller. Run the following command to generate the executable: ```python pyinstaller --onefile electoral_analysis.py ```
This will create an executable file in the dist/ folder, which can be run directly without needing to install Python. ### 12.5 Running the Code After installing the dependencies or creating the executable, run the main script to generate the analyses and visualizations: ```python python electoral_analysis.py ```
### 12.6 Running the Interactive Dashboard If you wish to view the interactive dashboards using Bash, run the following command: ```python python app.py ``` This will open the dashboard in your browser. ## 13. Contributing If you'd like to contribute to this project, feel free to fork it, make changes, and submit pull requests. Here are the steps to get started: 1. **Fork** this repository. 2. **Create a branch** for your feature: ```bash git checkout -b new-feature ``` 3. **Make the necessary changes** and commit: ```bash git commit -am 'Adds new feature' ``` 4. **Push** the branch to the remote repository: ```bash git push origin new-feature ``` 5. **Open a pull request** for review and integration. Make sure your changes do not break existing functionality and that the tests are up to date. ## 14- 💌 Team and Contacts ### Core Team: - **Fabiana 🚀 Campanari**i [GitHub](https://github.com/FabianaCampanari) - **Pedro 🛰️ Vyctor** - [Github](https://github.com/ppvyctor) ### Contact: - **Fabiana 🚀 Campanari** - [email me](mailto:fabicampanari@proton.me) - **Pedro 🛰️ Vyctor** - [email me](mailto:pedro.vyctor00@gmail.com)

Back to top # #####

Copyright 20245 Mindful-AI-Assistants. Code released under the [MIT license.]( https://github.com/Mindful-AI-Assistants/.github/blob/ad6948fdec771e022d49cd96f99024fcc7f1106a/LICENSE)