TFT Challenger League Data Analysis

The Project

This project provides a comprehensive dashboard for analyzing data from the TFT (Teamfight Tactics) Challenger League (the top ranked players) using Riot Games’ Developer API. The dashboard is built using Streamlit and includes various visualizations and metrics to help users gain insights into the top players, units, traits, and items used in the Challenger League.

Data Collection

Setting up the Database

In order to collect the necessary data from the API, I first wanted to set up a PostgreSQL database on a local server to house all of the data since the API limits how far back you can call the data. While I likely won’t get too much into database design here, I’ll say I created 5 separate tables to house the data:

challenger_league: all player info and win-loss data for players in the challenger league
matches: all match data
participants: all participants data from matches
traits: all traits data from participants
units: all unit data from participants

Data Pipeline

A few scripts are essential for feeding accurate data into the database:

challenger_search.py contains 2 key functions:
- get_challenger_league_puuid(): finds all the puuids for Challenger League players
  - a puuid is a player id necessary to link players to their in-game stats
- get_challenger_league_data(): finds all player info and win-loss data for challenger league players
challenger_league.py: uses challenger_search.py to send data to the local database
tftpal.py: inserts all challenger match data to local database

Data Processing

Now that the is stored properly, it needs to be retrieved from the database and processed before we can visualize it.

Retrieving the Data with SQLAlchemy

from sqlalchemy import create_engine
 
# Database connection parameters
db_params = config()
 
# Create SQLAlchemy engine
engine = create_engine(f"postgresql://{db_params['user']}:{db_params['password']}@{db_params['host']}:{db_params['port']}/{db_params['database']}")
 
# Function to get data from PostgreSQL and load into a pandas DataFrame
def get_data_from_db(query):
    try:
        # Execute the query and load data into a DataFrame
        df = pd.read_sql_query(query, engine)
        return df
    except Exception as e:
        st.error(f"Error: {e}")
        return pd.DataFrame()
 
units_query = '''
    select
        u.character_id,u.puuid,u.tier,
        u.match_id,u.itemnames,u.unit_index,
        m.game_datetime
    from units u
    join matches m on m.match_id=u.match_id
    where m.tft_set_number=13;
'''
 
traits_query = '''
    select
        t.puuid, t.trait_name, t.tier_current,
        t.tier_total, t.match_id, t.num_units,
        m.game_datetime
    from traits t
    join matches m on m.match_id=t.match_id
    where m.tft_set_number=13;
'''
 
challenger_query = '''
    select
        cl.puuid, p.riotidgamename, cl.leaguepoints,
        cl.rank, cl.wins, cl.losses,
        cl.veteran, cl.inactive, cl.freshblood,
        cl.hotstreak, cl.date
    from challenger_league cl
    join participants p on cl.puuid=p.puuid;
'''
 
# Get data from the database
units_data = get_data_from_db(units_query)
traits_data = get_data_from_db(traits_query)
challenger_data = get_data_from_db(challenger_query)

Cleaning the data

# Remove the 'TFT13_' prefix and capitalize the first letter of each character_id entry
units_data['character_id'] = units_data['character_id'].str.replace('TFT13_', '').str.capitalize()
units_data['character_id'] = units_data['character_id'].str.replace('Tft13_', '').str.capitalize()
 
traits_data['trait_name'] = traits_data['trait_name'].str.replace('TFT13_', '').str.capitalize()
 
# Ensure the 'date' column is in datetime format
challenger_data['date'] = pd.to_datetime(challenger_data['date'])
 
# Drop duplicates based on puuid and date
challenger_data = challenger_data.drop_duplicates(subset=['puuid', 'date'])
 
# Select the most recent date for each riotidgamename
most_recent_data = challenger_data.loc[challenger_data.groupby('riotidgamename')['date'].idxmax()]
 
# Create leaderboard for top 10 riotidgamenames by leaguepoints
leaderboard_data = most_recent_data.sort_values(by='leaguepoints', ascending=False).head(10)
leaderboard_data = leaderboard_data[['puuid', 'date', 'riotidgamename', 'leaguepoints', 'wins', 'losses']]
 
# Convert the 'date' column to a string format that includes only the date part
leaderboard_data['date'] = leaderboard_data['date'].dt.strftime('%Y-%m-%d')
 
# Calculate win percentage and format it
leaderboard_data['win_pct'] = (leaderboard_data['wins'] / (leaderboard_data['wins'] + leaderboard_data['losses'])) * 100
leaderboard_data['win_pct'] = leaderboard_data['win_pct'].round(1).astype(str) + '%'
 
leaderboard_data_display = leaderboard_data[['date', 'riotidgamename', 'leaguepoints', 'wins', 'losses']]
 
# Filter options for game date
min_date = min(units_data['game_datetime'].min(), traits_data['game_datetime'].min())
max_date = max(units_data['game_datetime'].max(), traits_data['game_datetime'].max())
selected_date_range = st.date_input('Select Game Date Range', [min_date, max_date])
 
# Filter data based on selected game date range
filtered_units_data = units_data[
    (units_data['game_datetime'] >= pd.to_datetime(selected_date_range[0])) &
    (units_data['game_datetime'] <= pd.to_datetime(selected_date_range[1]))
]
 
filtered_traits_data = traits_data[
    (traits_data['game_datetime'] >= pd.to_datetime(selected_date_range[0])) &
    (traits_data['game_datetime'] <= pd.to_datetime(selected_date_range[1]))
]
 
# Create a dropdown selection for the riotidgamename
selected_player = st.selectbox('Select Player', ['All'] + leaderboard_data['riotidgamename'].tolist())
 
# Filter the data based on the selected player
if selected_player != 'All':
    selected_puuid = leaderboard_data[leaderboard_data['riotidgamename'] == selected_player]['puuid'].values[0]
    filtered_units_data = filtered_units_data[filtered_units_data['puuid'] == selected_puuid]
    filtered_traits_data = filtered_traits_data[filtered_traits_data['puuid'] == selected_puuid]
 
# Count the occurrences of each character_id and tier
character_tier_counts = filtered_units_data.groupby(['character_id', 'tier']).size().reset_index(name='count')
 
# Aggregate the counts by character_id to get the total counts
total_counts = character_tier_counts.groupby('character_id')['count'].sum().reset_index(name='total_count')
 
# Merge the aggregated counts back with the original character_tier_counts DataFrame
character_tier_counts = character_tier_counts.merge(total_counts, on='character_id')
 
# Sort the character_tier_counts DataFrame by total_count in descending order and then by tier in ascending order
character_tier_counts = character_tier_counts.sort_values(by=['total_count', 'tier'], ascending=[False, True])
 
# Count the occurrences of each trait_name
trait_counts = filtered_traits_data['trait_name'].value_counts().reset_index()
trait_counts.columns = ['trait_name', 'count']
 
# Clean Trait names
trait_counts['trait_name'] = trait_counts['trait_name'].replace('Warband', 'Conqueror')
trait_counts['trait_name'] = trait_counts['trait_name'].replace('Cabal', 'Black Rose')

Data Visualization and Analysis

Now that the data has been processed, we can visual and explore. All code for these visualizations are in challenger_dashboard.py

Interaction with Plotly

Plotly is one of my favorite visualization libraries because of how many useful out of the box features it has. It allows the user to interact with the graphs in a variety of different ways and explore the data all on their own.

Top 10 Players

Here I display the top 10 players by League Points with their wins and losses.

Most Used Units and Traits

Next, the user can select the date range they want to view. They can also select All to view all players’ data from the Challenger League or one of the Top 10 Challenger players.

These views show that Elise is by far the most selected Unit while Conqueror is the most selected Trait.

Most Used Items and Team Composition Per Unit

The following charts show the most selected Items and Teammates for a specific Unit. In the pictured example, we see that Elise is most often equipped with the Protector’s Vow and most often paired with her other Black Rose teammates such as Sion (Trait Bonus Character), Vladimir**, Leblanc, and Swain.

What’s Next?

Now that Set 13 is ending on April 1st 2025, this information won’t be relevant after that but the structures in place can rollover directly to do the same analysis for Set 14.
Host the PostgreSQL database online so I can deploy the dashboard online to Streamlit.
Apply for upgraded developer license from Riot games for increased permissions and persistent API key.
Improved data cleaning. While the data is accurate and I’ve done some good cleaning to make it more readable, there are still some legacy terms that could be updated and I could make the Item text more readable by separating them into multiple words.

📈 Andrew Carver

Explorer