System Architecture Overview

Data Layer (Storage + Ingestion)

  • Database: SQL (Postgres via Neon in the tech stack below)
    • Stores structured data about players, injuries, and matches
    • Schema (simplified; see the SQLAlchemy sketch after this list):
      • Players(player_id, name, dob, position, height, weight)
      • Injuries(injury_id, player_id, type, severity, days_out, age_at_injury, minutes_before, minutes_total)
      • Matches(match_id, player_id, minutes_played, date, competition)
  • Data Ingestion:
    • From external sources, mainly each player's Transfermarkt injury history page
    • ETL pipelines clean and normalize the data
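
As a concrete starting point, here is a minimal SQLAlchemy sketch of the simplified schema above. Table and column names mirror the schema; the column types and units are assumptions to be checked against the real data.

    # Minimal SQLAlchemy models mirroring the simplified schema above.
    # Column types and units are assumptions.
    from sqlalchemy import Column, Date, Float, ForeignKey, Integer, String
    from sqlalchemy.orm import declarative_base

    Base = declarative_base()

    class Player(Base):
        __tablename__ = "players"
        player_id = Column(Integer, primary_key=True)
        name = Column(String, nullable=False)
        dob = Column(Date)
        position = Column(String)
        height = Column(Float)  # assumed cm
        weight = Column(Float)  # assumed kg

    class Injury(Base):
        __tablename__ = "injuries"
        injury_id = Column(Integer, primary_key=True)
        player_id = Column(Integer, ForeignKey("players.player_id"))
        type = Column(String)             # e.g., "hamstring strain"
        severity = Column(String)
        days_out = Column(Integer)
        age_at_injury = Column(Float)
        minutes_before = Column(Integer)  # minutes played before the injury
        minutes_total = Column(Integer)

    class Match(Base):
        __tablename__ = "matches"
        match_id = Column(Integer, primary_key=True)
        player_id = Column(Integer, ForeignKey("players.player_id"))
        minutes_played = Column(Integer)
        date = Column(Date)
        competition = Column(String)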

Model Layer (Machine Learning Pipeline)

  • Feature Engineering (sketched in code after this list):
    • Convert injury type → categorical encoding
    • Normalize days out, age, and minutes played
    • Derive features (injury frequency, recovery ratio)
  • Model Training:
    • Train an ML model (candidates: Random Forest, XGBoost, or a survival-analysis model)
    • Store the trained model in a model registry (MLflow, S3, or the DB)
  • Prediction API:
    • Input: player_id
    • Output: risk score (e.g., “Probability of injury in next 3 months: 33%”)
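
The following sketch ties the feature-engineering and training steps together with scikit-learn (the library named in the tech stack). The derived-feature definitions, the label column injured_next_3_months, and the choice of Random Forest from the candidates above are all assumptions.

    # Feature engineering + training sketch. Assumes injury records have been
    # pulled into a pandas DataFrame with the schema columns from the Data Layer.
    from pathlib import Path

    import joblib
    import pandas as pd
    from sklearn.compose import ColumnTransformer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import OneHotEncoder, StandardScaler

    injuries = pd.read_csv("injuries.csv")  # placeholder for a DB query

    # Derived features (assumed definitions):
    injuries["injury_frequency"] = injuries.groupby("player_id")["injury_id"].transform("count")
    injuries["recovery_ratio"] = injuries["days_out"] / injuries["minutes_before"].clip(lower=1)

    categorical = ["type"]  # injury type -> one-hot (categorical) encoding
    numeric = ["days_out", "age_at_injury", "minutes_total",
               "injury_frequency", "recovery_ratio"]  # normalized below

    model = Pipeline([
        ("prep", ColumnTransformer([
            ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
            ("num", StandardScaler(), numeric),
        ])),
        ("clf", RandomForestClassifier(n_estimators=200, random_state=42)),
    ])
    # "injured_next_3_months" is an assumed binary label built from later injury dates.
    model.fit(injuries[categorical + numeric], injuries["injured_next_3_months"])

    Path("models").mkdir(exist_ok=True)
    joblib.dump(model, "models/injury_risk.joblib")  # stand-in for an MLflow/S3 registry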

Application Backend

  • Framework: Flask or FastAPI
  • Responsibilities:
    • Serve REST API endpoints (a minimal sketch follows this list):
      • POST /predict → returns health prediction for a player
      • GET /player/{id} → fetch player profile + injury history
      • POST /player/{id}/injury → add injury record
    • Call ML model service for predictions
    • Manage user authentication
  • Integration with Database:
    • ORM (SQLAlchemy)
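
A minimal FastAPI sketch of the POST /predict route follows; the other routes are analogous. The feature lookup is stubbed out, the response field names are assumptions, and auth/error handling are omitted.

    # FastAPI prediction endpoint sketch. Loads the model artifact written
    # by the training sketch above.
    import joblib
    import pandas as pd
    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()
    model = joblib.load("models/injury_risk.joblib")  # loaded once at startup

    class PredictRequest(BaseModel):
        player_id: int

    def build_features(player_id: int) -> pd.DataFrame:
        # Stub: the real service queries the DB (via the SQLAlchemy models)
        # and derives the same features used at training time.
        return pd.DataFrame([{
            "type": "hamstring strain", "days_out": 14, "age_at_injury": 27.0,
            "minutes_total": 2400, "injury_frequency": 2, "recovery_ratio": 0.5,
        }])

    @app.post("/predict")
    def predict(req: PredictRequest):
        risk = float(model.predict_proba(build_features(req.player_id))[0, 1])
        return {"player_id": req.player_id,
                "probability_injury_next_3_months": round(risk, 2)}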

Frontend Layer

Options:

  • Web app (React or Vue) for production
  • Streamlit/Dash for quick prototyping and visualizations

Features:

  • Player Profile Dashboard:
    • Age, position, injury history timeline
    • Minutes played chart
  • Health Prediction:
    • Risk score visualization (e.g., gauge chart or risk heatmap)
    • Next expected downtime estimate
  • What-if Analysis (prototyped in the sketch below):
    • Simulate adding an injury and see how risk changes
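
Since Streamlit is already listed as the prototyping option, here is a sketch of the what-if flow against the backend above. The endpoint URL and payload fields are assumptions; a production what-if would score the hypothetical injury without persisting it.

    # Streamlit what-if prototype: add a hypothetical injury, re-request the
    # prediction, and show the change in risk.
    import requests
    import streamlit as st

    API = "http://localhost:8000"  # the FastAPI backend sketched earlier

    player_id = st.number_input("Player ID", min_value=1, value=1)
    base = requests.post(f"{API}/predict", json={"player_id": player_id}).json()
    st.metric("Current 3-month injury risk",
              f"{base['probability_injury_next_3_months']:.0%}")

    st.subheader("What if the player picks up a new injury?")
    injury_type = st.selectbox("Injury type", ["hamstring strain", "ankle sprain", "ACL tear"])
    days_out = st.slider("Expected days out", 1, 180, 21)
    if st.button("Simulate"):
        # Prototype shortcut: reuses the real injury endpoint, so the
        # simulated record is actually persisted.
        requests.post(f"{API}/player/{player_id}/injury",
                      json={"type": injury_type, "days_out": days_out})
        new = requests.post(f"{API}/predict", json={"player_id": player_id}).json()
        old_p = base["probability_injury_next_3_months"]
        new_p = new["probability_injury_next_3_months"]
        st.metric("Risk after simulated injury", f"{new_p:.0%}", delta=f"{new_p - old_p:+.0%}")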

Deployment Layer

  • Containerization:
    • Dockerize backend + ML model
  • Cloud Hosting:
    • AWS/GCP/Azure or a simple Heroku deployment
  • Monitoring (sketched after this list):
    • Track API usage and latency
    • Model drift monitoring (are predictions degrading?)
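
Both monitoring concerns can start small. Below is a standalone sketch of a latency-logging middleware plus a crude drift check that compares recently served risk scores against an assumed training-time mean; the threshold and window size are placeholders.

    # Latency logging + naive drift check for the prediction service.
    import logging
    import time
    from collections import deque
    from statistics import mean

    from fastapi import FastAPI, Request

    app = FastAPI()
    logger = logging.getLogger("monitoring")
    recent_scores: deque = deque(maxlen=500)  # rolling window of served risk scores
    TRAINING_MEAN_RISK = 0.18                 # assumed mean risk on the training set

    @app.middleware("http")
    async def log_latency(request: Request, call_next):
        start = time.perf_counter()
        response = await call_next(request)
        logger.info("%s %s took %.1f ms", request.method, request.url.path,
                    (time.perf_counter() - start) * 1000)
        return response

    def record_prediction(risk: float) -> None:
        # Call from the /predict handler; warn if the served risk distribution
        # drifts far from what the model saw in training.
        recent_scores.append(risk)
        if len(recent_scores) == recent_scores.maxlen:
            drift = abs(mean(recent_scores) - TRAINING_MEAN_RISK)
            if drift > 0.10:  # placeholder threshold
                logger.warning("Possible model drift: mean served risk %.2f",
                               mean(recent_scores))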

Data Flow

  1. Data Ingestion:
    • Load injury + match data into DB
  2. Model Training:
    • Offline batch jobs retrain the ML model weekly or monthly (see the versioning sketch after this list)
  3. Model Serving:
    • The prediction API loads the trained model into memory at startup
  4. User Interaction:
    • Frontend requests prediction → Backend → Model → Result shown
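
The earlier sketches use a single fixed model path; the retraining/serving handoff in steps 2 and 3 is easier to reason about with date-stamped artifacts, as sketched below. The directory layout and naming convention are assumptions.

    # Versioned handoff between the batch training job and the prediction API.
    from datetime import date
    from pathlib import Path

    import joblib

    def save_versioned(model, model_dir: str = "models") -> Path:
        # Batch-job side: write a date-stamped artifact instead of overwriting.
        out = Path(model_dir) / f"injury_risk_{date.today():%Y%m%d}.joblib"
        joblib.dump(model, out)
        return out

    def load_latest(model_dir: str = "models"):
        # Serving side: pick the newest artifact at startup.
        artifacts = sorted(Path(model_dir).glob("injury_risk_*.joblib"))
        if not artifacts:
            raise FileNotFoundError("no model artifact found; run the batch job first")
        return joblib.load(artifacts[-1])  # YYYYMMDD-stamped names sort chronologically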

High-Level Diagram

[ Data Sources ]  --->  [ ETL / Data Pipeline ]  --->  [ Database ]
                                    |
                                    V
                            [ Model Training ]
                                    |
                                    V
                          [ Prediction Service ]
                                    |
            ------------------------------------------------
            |                                              |
    [ Backend API ]                                [ Model Registry ]
            |
            V
     [ Frontend UI ]                             

Tech Stack Architecture

  • Neon → free hosted Postgres (connection sketch below)
  • FastAPI → backend REST API serving predictions and player data
  • scikit-learn → ML model training + inference (bundled in FastAPI)
  • React → Frontend UI for input + visualization
  • Hosting:
    • Neon (DB)
      • 500 MB Postgres DB with autoscaling up to 2 CU
    • Render (Backend)
      • 750 monthly compute hours
    • Vercel (Frontend)
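
To tie the stack together, here is a minimal sketch of pointing the FastAPI backend's SQLAlchemy engine at Neon. The connection string shown is a placeholder; the real one comes from the Neon dashboard (Neon requires sslmode=require) and should live in an environment variable, not in source.

    # Wiring SQLAlchemy to Neon Postgres. The URL below is a placeholder.
    import os

    from sqlalchemy import create_engine
    from sqlalchemy.orm import sessionmaker

    DATABASE_URL = os.environ.get(
        "DATABASE_URL",
        "postgresql://user:password@ep-example-123456.us-east-2.aws.neon.tech/neondb?sslmode=require",
    )

    # pool_pre_ping revalidates connections, useful when a serverless DB scales to zero
    engine = create_engine(DATABASE_URL, pool_pre_ping=True)
    SessionLocal = sessionmaker(bind=engine)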