This project is a sub-module of the Multiplayer Football Draft Simulator.
It is a web crawler that scrapes information on all football players from Sofifa and exports it to JSON format. Data cleaning and analytics are then performed on the obtained data.
- Crawler: built on Scrapy using Python 3
- Analytics: IPython notebook (.ipynb), Python 3
The data is then exported to the Football Draft Backend, which serves it from an endpoint.
chmod +x ./run.sh
./run.sh
- Set up a virtualenv (optional, but recommended)
  virtualenv -p python3.8 env
  source env/bin/activate
- Install project dependencies
  pip install -r requirements.txt
- Run the crawler with ./fifa-crawler as the current directory (this is the main Scrapy crawler directory)
  cd fifa_crawler
- First run the URL spider (to get all players' URLs)
  scrapy crawl players_urls
- After it completes successfully, run the stats spider (to get the players' statistics from the URLs collected above)
  scrapy crawl players_stats
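The two crawl steps above can also be chained from a small Python script that stops as soon as one spider fails. This is a minimal sketch, not part of the project: `run_spiders` and its `runner` parameter are hypothetical names, and it assumes `scrapy` is on the PATH and that the script is started from the crawler directory.

```python
import subprocess
from typing import Callable, List, Sequence

# Spider names as used in the steps above. Order matters because
# players_stats works from the URLs collected by players_urls.
SPIDERS = ("players_urls", "players_stats")


def run_spiders(
    spiders: Sequence[str] = SPIDERS,
    runner: Callable[..., object] = subprocess.run,
) -> List[str]:
    """Run each spider in order and return the names that completed.

    `runner` is injectable only so the function can be exercised
    without Scrapy installed; by default it is subprocess.run.
    """
    completed = []
    for name in spiders:
        # check=True makes subprocess.run raise CalledProcessError on a
        # non-zero exit code, so the stats spider is never started
        # after a failed URL crawl.
        runner(["scrapy", "crawl", name], check=True)
        completed.append(name)
    return completed
```

Calling `run_spiders()` with no arguments would run both spiders with the real `scrapy` command.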
- Update the crawler periodically to reflect changes on the Sofifa platform.
- Add analysis projects on the crawled data.
- Update the crawler to also scrape team data (currently only player data is scraped).
- Improve the speed of the crawler.
Meta view of the fields in each player record:
id
- type: string
- example: "158023"
name
- type: string
- example: "Lionel Andrés Messi Cuccittini"
short_name
- type: string
- example: "L. Messi"
photo_url
- type: string
- example: "https://cdn.sofifa.com/players/158/023/21_120.png"
primary_position
- type: string
- example: "RW"
positions
- type: string[]
- example: ["RW", "ST", "CF"]
age
- type: string
- example: "33"
birth_date
- type: string (date format is YYYY/MONTH_NAME_SHORT/DD)
- example: "1987/Jun/24"
height
- type: integer (in cm)
- example: 170
weight
- type: integer (in kg)
- example: 72
Overall Rating
- type: integer
- example: 93
Potential
- type: integer
- example: 93
Value
- type: string (in euros)
- example: "€103.5M"
Wage
- type: string (in euros)
- example: "€560K"
Preferred Foot
- type: enum["Left", "Right"]
- example: "Left"
Weak Foot
- type: integer (range 1-5)
- example: 4
Skill Moves
- type: integer (range 1-5)
- example: 4
International Reputation
- type: integer (range 0-5)
- example: 5
Work Rate
- type: enum["Medium/Low"]
- example: "Medium/Low"
Body Type
- type: enum["Unique", "Normal (170-185)", "Normal (185+)", "Lean (170-185)", "Lean (185+)", "Stocky (170-185)", "Normal (170-)", "Stocky (185+)", "Stocky (170-)"]
- example: "Unique"
Real Face
- type: enum["Yes", "No"]
- example: "Yes"
Release Clause
- type: string (in euros)
- example: "€212.2M"
teams
- type: map<string, integer> (including international and domestic clubs)
- example:
{
"FC Barcelona": 84,
"Argentina": 83
}
attacking
- type: map<attackOptions, integer>
attackOptions
- type: enum["Crossing", "Finishing", "HeadingAccuracy", "ShortPassing", "Volleys"]
- example:
{
"Crossing": 85,
"Finishing": 95,
"HeadingAccuracy": 70,
"ShortPassing": 91,
"Volleys": 88
}
skill
- type: map<skillOptions, integer>
skillOptions
- type: enum["Dribbling", "Curve", "FKAccuracy", "LongPassing", "BallControl"]
- example:
{
"Dribbling": 96,
"Curve": 93,
"FKAccuracy": 94,
"LongPassing": 91,
"BallControl": 96
}
movement
- type: map<movementOptions, integer>
movementOptions
- type: enum["Acceleration", "SprintSpeed", "Agility", "Reactions", "Balance"]
- example:
{
"Acceleration": 91,
"SprintSpeed": 80,
"Agility": 91,
"Reactions": 94,
"Balance": 95
}
power
- type: map<powerOptions, integer>
powerOptions
- type: enum["ShotPower", "Jumping", "Stamina", "Strength", "LongShots"]
- example:
{
"ShotPower": 86,
"Jumping": 68,
"Stamina": 72,
"Strength": 69,
"LongShots": 94
}
mentality
- type: map<mentalityOptions, integer>
mentalityOptions
- type: enum["Aggression", "Interceptions", "Positioning", "Vision", "Penalties", "Composure"]
- example:
{
"Aggression": 44,
"Interceptions": 40,
"Positioning": 93,
"Vision": 95,
"Penalties": 75,
"Composure": 96
}
defending
- type: map<defendingOptions, integer>
defendingOptions
- type: enum["DefensiveAwareness", "StandingTackle", "SlidingTackle"]
- example:
{
"DefensiveAwareness": 32,
"StandingTackle": 35,
"SlidingTackle": 24
}
goalkeeping
- type: map<goalkeepingOptions, integer>
goalkeepingOptions
- type: enum["GKDiving", "GKHandling", "GKKicking", "GKPositioning", "GKReflexes"]
- example:
{
"GKDiving": 6,
"GKHandling": 11,
"GKKicking": 15,
"GKPositioning": 14,
"GKReflexes": 8
}
player_traits
- type: string["Technical Dribbler (AI)","Long Shot Taker (AI)","Flair","Speed Dribbler (AI)","Injury Prone","Long Passer (AI)","Playmaker (AI)","Power Header","Dives Into Tackles (AI)","Outside Foot Shot","Team Player","Finesse Shot","Leadership","Solid Player","Early Crosser","Long Throw-in","Comes For Crosses","Power Free-Kick","GK Long Throw","Cautious With Crosses","Rushes Out Of Goal","Saves with Feet","Chip Shot (AI)","Giant Throw-in","One Club Player"]
- example:
[
"Finesse Shot",
"Long Shot Taker (AI)",
"Speed Dribbler (AI)",
"Playmaker (AI)",
"Outside Foot Shot",
"One Club Player",
"Team Player",
"Chip Shot (AI)"
]
player_hashtags
- type: string["#Strength","#Acrobat","#Engine","#Speedster","#Dribbler","#Aerial Threat","#Tactician","#FK Specialist","#Crosser","#Distance Shooter","#Clinical Finisher","#Playmaker","#Tackling","#Complete Midfielder","#Complete Forward","#Poacher","#Complete Defender"] (each tag starts with #)
- example:
[
"#Dribbler",
"#Distance Shooter",
"#FK Specialist",
"#Acrobat",
"#Clinical Finisher",
"#Complete Forward"
]
logos
- type: map<groupNames, logoAttributes>
groupNames
- type: enum["country", "club", "nationalClub"]
logoAttributes
- type: map<enum["name", "url"], string>
- logoAttributes example:
{
"name": "Argentina",
"url": "https://cdn.sofifa.com/flags/ar.png"
}
- examples:
{
"country": {
"name": "Argentina",
"url": "https://cdn.sofifa.com/flags/ar.png"
},
"club": {
"name": "FC Barcelona",
"url": "https://cdn.sofifa.com/teams/241/60.png"
},
"nationalClub": {
"name": "Argentina",
"url": "https://cdn.sofifa.com/teams/1369/60.png"
}
}
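Several fields above are stored as display strings rather than numbers or dates (`age`, `birth_date`, `Value`, `Wage`, `Release Clause`). A minimal sketch of how they could be converted during data cleaning follows. `parse_euros` and `parse_birth_date` are hypothetical helper names, not part of the crawler, and the sketch assumes that only the `K` and `M` suffixes seen in the examples occur.

```python
from datetime import date, datetime

# Multipliers for the suffixes seen in the examples ("€560K", "€103.5M").
# Other suffixes are not handled; this is an assumption of the sketch.
_SUFFIXES = {"K": 1_000, "M": 1_000_000}


def parse_euros(text: str) -> int:
    """Convert a string such as "€103.5M" to a whole number of euros."""
    amount = text.strip().lstrip("€")
    multiplier = 1
    if amount and amount[-1].upper() in _SUFFIXES:
        multiplier = _SUFFIXES[amount[-1].upper()]
        amount = amount[:-1]
    # round() absorbs floating-point noise from values such as 212.2.
    return round(float(amount) * multiplier)


def parse_birth_date(text: str) -> date:
    """Parse the YYYY/MONTH_NAME_SHORT/DD format, e.g. "1987/Jun/24"."""
    # %b matches the locale's abbreviated month name, so this assumes
    # an English (or default C) locale.
    return datetime.strptime(text, "%Y/%b/%d").date()


# A partial record built from the example values listed above.
record = {
    "age": "33",
    "birth_date": "1987/Jun/24",
    "Value": "€103.5M",
    "Wage": "€560K",
}
cleaned = {
    "age": int(record["age"]),
    "birth_date": parse_birth_date(record["birth_date"]),
    "Value": parse_euros(record["Value"]),
    "Wage": parse_euros(record["Wage"]),
}
```

Returning whole euros as an `int` keeps later comparisons and sorting free of float rounding surprises.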
We love your input! We want to make contributing to this project as easy and transparent as possible, whether it's:
- Reporting a bug
- Discussing the current state of the code
- Submitting a fix
- Proposing new features
- Fork the repo and clone it on your machine.
- Add an upstream remote pointing to the main repo in your cloned repo
  git remote add upstream https://github.com/sauravhiremath/fifa-stats-crawler.git
- Keep your cloned repo up to date by pulling from upstream (this also helps avoid merge conflicts when committing new changes)
  git pull upstream master
- Create your feature branch
  git checkout -b <feature-name>
- Commit all the changes
  git commit -am "Meaningful commit message"
- Push the changes for review
  git push origin <feature-name>
- Create a PR against our repo on GitHub.
- Code should be properly commented to ensure its readability.
- If you've added code that should be tested, add tests as comments.
- In Python, use docstrings to provide tests.
- Make sure your code is properly formatted.
- Issue that pull request!
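As an illustration of the docstring-test guideline above, a test can live in a function's docstring and be run with Python's standard `doctest` module. This is a sketch only: `height_in_metres` is a hypothetical function, not part of the crawler.

```python
import doctest


def height_in_metres(height_cm: int) -> float:
    """Convert a player's height from centimetres to metres.

    >>> height_in_metres(170)
    1.7
    >>> height_in_metres(200)
    2.0
    """
    return height_cm / 100


if __name__ == "__main__":
    # Runs every example found in this module's docstrings and
    # reports only the ones whose output differs.
    doctest.testmod()
```

Running the file directly prints nothing when all docstring examples pass. Running `python -m doctest -v <file>` shows each example as it is checked.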
When you create an issue, first make sure it is not already present. Furthermore, provide a proper description of the changes. If you are suggesting any code improvements, provide thorough details about the improvements.
Great issue suggestions tend to have:
- A quick summary of the changes.
- In case of a bug, steps to reproduce:
  - Be specific!
  - Give sample code if you can.
- What you expected would happen.
- What actually happens.
- Notes (possibly including why you think this might be happening, or things you tried that didn't work).
A more detailed step-by-step guide with pictures for creating a pull request can be found here.