A web-crawler to scrape FIFA 20 and 21 players' latest information from Sofifa Website

sauravhiremath/fifa-stats-crawler

Python supported versions GPLv3 license PRs Welcome

Football Players Statistics WebCrawler

This project is a sub-module for Multiplayer Football Draft Simulator.

About

A web crawler that scrapes information on all football players from Sofifa and exports it to JSON. The exported data is then cleaned and analyzed.

  • Crawler: built on Scrapy, using Python 3
  • Analytics: Jupyter (IPynb) notebook, using Python 3

The data is then exported to the Football Draft Backend, which serves it from an endpoint.

Steps to run the project

Easy Run

chmod +x ./run.sh
./run.sh

Manual Setup and Run

  • Setup virtualenv (optional, but recommended)

    virtualenv -p python3.8 env
    source env/bin/activate
    
  • Install project dependencies

    pip install -r requirements.txt
  • Run the crawler with ./fifa_crawler as the current directory (this is the main Scrapy crawler directory)

    cd fifa_crawler
    
  • First, run the URL spider (to collect all player URLs)

    scrapy crawl players_urls
  • Once it succeeds, run the stats spider (to get player statistics from the URLs collected above)

    scrapy crawl players_stats
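
The two crawl steps above can also be chained from Python. The following is only a minimal sketch, not part of the project: the helper name `run_spiders` and its `run` parameter are hypothetical, and it assumes `scrapy` is on the PATH and that the script is started from the repository root.

```python
import subprocess

# Spider names taken from the manual steps above, in the order they must run.
SPIDERS = ["players_urls", "players_stats"]


def run_spiders(project_dir="fifa_crawler", run=subprocess.run):
    """Run each spider in order from inside the Scrapy project directory.

    `run` is injectable (hypothetical parameter) so the sequencing can be
    checked without launching a real crawl. With check=True, a failing
    spider raises CalledProcessError, so the stats spider never starts
    unless the URL spider succeeded.
    """
    for spider in SPIDERS:
        run(["scrapy", "crawl", spider], cwd=project_dir, check=True)
```

Calling `run_spiders()` with the defaults is equivalent to the `cd fifa_crawler` step followed by the two `scrapy crawl` commands.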

Scope/Aim as an individual project

Future features

  • Add analysis projects on the crawled data.
  • Extend the crawler to scrape team data (it currently scrapes player data only)
  • Improve speed of the crawler

Metadata

Click here to expand meta view, or go-here for a detailed view
id
  • type: string

  • example: "158023"

name
  • type: string

  • example: "Lionel Andrés Messi Cuccittini"

short_name
  • type: string

  • example: "L. Messi"

photo_url
primary_position
  • type: string

  • example: "RW"

positions
  • type: string[]

  • example: ["RW", "ST", "CF"]

age
  • type: string

  • example: "33"

birth_date
  • type: string (DateFormat is YYYY/MONTH_NAME_SHORT/DD)

  • example: "1987/Jun/24"
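
Because birth_date is stored as a string, consumers will usually want to convert it. A minimal sketch, assuming every value follows the YYYY/MONTH_NAME_SHORT/DD format with English three-letter month names as in the example; `parse_birth_date` and `age_on` are hypothetical helper names, not part of the crawler.

```python
from datetime import date

# English month abbreviations, mapped by hand so the result does not
# depend on the system locale (which strptime's %b would).
_MONTHS = {
    name: number
    for number, name in enumerate(
        ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
         "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"],
        start=1,
    )
}


def parse_birth_date(text: str) -> date:
    """Convert a string such as "1987/Jun/24" into a datetime.date."""
    year, month_name, day = text.split("/")
    return date(int(year), _MONTHS[month_name], int(day))


def age_on(born: date, today: date) -> int:
    """Whole years between `born` and `today`."""
    had_birthday = (today.month, today.day) >= (born.month, born.day)
    return today.year - born.year - (0 if had_birthday else 1)
```

For the example value, `parse_birth_date("1987/Jun/24")` gives `date(1987, 6, 24)`.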

height
  • type: integer (in cms)

  • example: 170

weight
  • type: integer (in kg)

  • example: 72

Overall Rating
  • type: integer

  • example: 93

Potential
  • type: integer

  • example: 93

Value
  • type: string (in euros)

  • example: "€103.5M"

Wage
  • type: string (in euros)

  • example: "€560K"

Preferred Foot
  • type: enum["Left", "Right"]

  • example: "Left"

Weak Foot
  • type: integer (range 1-5)

  • example: 4

Skill Moves
  • type: integer (range 1-5)

  • example: 4

International Reputation
  • type: integer (range 0-5)

  • example: 5

Work Rate
  • type: enum["Medium/Low"]

  • example: "Medium/Low"

Body Type
  • type: enum["Unique", "Normal (170-185)", "Normal (185+)", "Lean (170-185)", "Lean (185+)", "Stocky (170-185)", "Normal (170-)", "Stocky (185+)", "Stocky (170-)"]

  • example: "Unique"

Real Face
  • type: enum["Yes", "No"]

  • example: "Yes"

Release Clause
  • type: string (in euros)

  • example: "€212.2M"
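
Value, Wage and Release Clause are all exported as display strings rather than numbers. A minimal sketch for converting them to whole euros, assuming the only suffixes are K (thousands) and M (millions) as in the examples; `parse_euros` is a hypothetical helper, not part of the crawler.

```python
from decimal import Decimal

# Suffix multipliers seen in the examples; other suffixes are not handled.
_MULTIPLIERS = {"K": 1_000, "M": 1_000_000}


def parse_euros(text: str) -> int:
    """Convert a string such as "€103.5M" or "€560K" into whole euros."""
    cleaned = text.strip().lstrip("€")
    multiplier = 1
    if cleaned and cleaned[-1].upper() in _MULTIPLIERS:
        multiplier = _MULTIPLIERS[cleaned[-1].upper()]
        cleaned = cleaned[:-1]
    # Decimal avoids binary floating-point rounding on values like 212.2.
    return int(Decimal(cleaned) * multiplier)
```

Decimal is used instead of float so that a value such as "€212.2M" converts to exactly 212200000.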

teams
  • type: map<string, integer> (including international and domestic clubs)

  • example:

{
"FC Barcelona": 84,
"Argentina": 83
}
attacking
  • type: map<attackOptions, integer>
attackOptions
  • type: enum["Crossing", "Finishing", "HeadingAccuracy", "ShortPassing", "Volleys"]
  • example:
{
    "Crossing": 85,
    "Finishing": 95,
    "HeadingAccuracy": 70,
    "ShortPassing": 91,
    "Volleys": 88
}
skill
  • type: map<skillOptions, integer>
skillOptions
  • type: enum["Dribbling", "Curve", "FKAccuracy", "LongPassing", "BallControl"]
  • example:
{
    "Dribbling": 96,
    "Curve": 93,
    "FKAccuracy": 94,
    "LongPassing": 91,
    "BallControl": 96
}
movement
  • type: map<movementOptions, integer>
movementOptions
  • type: enum["Acceleration", "SprintSpeed", "Agility", "Reactions", "Balance"]
  • example:
{
    "Acceleration": 91,
    "SprintSpeed": 80,
    "Agility": 91,
    "Reactions": 94,
    "Balance": 95
}
power
  • type: map<powerOptions, integer>
powerOptions
  • type: enum["ShotPower", "Jumping", "Stamina", "Strength", "LongShots"]
  • example:
{
    "ShotPower": 86,
    "Jumping": 68,
    "Stamina": 72,
    "Strength": 69,
    "LongShots": 94
}
mentality
  • type: map<mentalityOptions, integer>
mentalityOptions
  • type: enum["Aggression", "Interceptions", "Positioning", "Vision", "Penalties", "Composure"]
  • example:
{
    "Aggression": 44,
    "Interceptions": 40,
    "Positioning": 93,
    "Vision": 95,
    "Penalties": 75,
    "Composure": 96
}
defending
  • type: map<defendingOptions, integer>
defendingOptions
  • type: enum["DefensiveAwareness", "StandingTackle", "SlidingTackle"]
  • example:
{
    "DefensiveAwareness": 32,
    "StandingTackle": 35,
    "SlidingTackle": 24
}
goalkeeping
  • type: map<goalkeepingOptions, integer>
goalkeepingOptions
  • type: enum["GKDiving", "GKHandling", "GKKicking", "GKPositioning", "GKReflexes"]
  • example:
{
    "GKDiving": 6,
    "GKHandling": 11,
    "GKKicking": 15,
    "GKPositioning": 14,
    "GKReflexes": 8
}
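
The seven attribute groups (attacking through goalkeeping) share the same map<string, integer> shape, so they can be summarized uniformly. A minimal sketch, assuming a player record has already been loaded into a Python dict with the keys described above; `group_averages` is a hypothetical helper.

```python
from statistics import mean

# Attribute group keys as listed in the metadata above.
GROUPS = [
    "attacking", "skill", "movement", "power",
    "mentality", "defending", "goalkeeping",
]


def group_averages(player: dict) -> dict:
    """Average rating per attribute group, rounded to one decimal place.

    Groups that are missing or empty in the record are skipped.
    """
    return {
        group: round(mean(player[group].values()), 1)
        for group in GROUPS
        if player.get(group)
    }
```

Applied to the example attacking values (85, 95, 70, 91, 88), this yields an average of 85.8.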
player_traits
  • type: string["Technical Dribbler (AI)","Long Shot Taker (AI)","Flair","Speed Dribbler (AI)","Injury Prone","Long Passer (AI)","Playmaker (AI)","Power Header","Dives Into Tackles (AI)","Outside Foot Shot","Team Player","Finesse Shot","Leadership","Solid Player","Early Crosser","Long Throw-in","Comes For Crosses","Power Free-Kick","GK Long Throw","Cautious With Crosses","Rushes Out Of Goal","Saves with Feet","Chip Shot (AI)","Giant Throw-in","One Club Player"]

  • example:

[
    "Finesse Shot",
    "Long Shot Taker (AI)",
    "Speed Dribbler (AI)",
    "Playmaker (AI)",
    "Outside Foot Shot",
    "One Club Player",
    "Team Player",
    "Chip Shot (AI)"
]
player_hashtags
  • type: string["#Strength","#Acrobat","#Engine","#Speedster","#Dribbler","#Aerial Threat","#Tactician","#FK Specialist","#Crosser","#Distance Shooter","#Clinical Finisher","#Playmaker","#Tackling","#Complete Midfielder","#Complete Forward","#Poacher","#Complete Defender"] (Each tag starts with #)

  • example:

[
    "#Dribbler",
    "#Distance Shooter",
    "#FK Specialist",
    "#Acrobat",
    "#Clinical Finisher",
    "#Complete Forward"
]
logos
  • type: map<groupNames, logoAttributes>
groupNames
  • type: enum["country", "club", "nationalClub"]
logoAttributes
  • type: map<enum["name", "url"], string>

  • logoAttributes examples:

{
    "name": "Argentina",
    "url": "https://cdn.sofifa.com/flags/ar.png"
}
  • examples:
{
    "country": {
        "name": "Argentina",
        "url": "https://cdn.sofifa.com/flags/ar.png"
    },
    "club": {
        "name": "FC Barcelona",
        "url": "https://cdn.sofifa.com/teams/241/60.png"
    },
    "nationalClub": {
        "name": "Argentina",
        "url": "https://cdn.sofifa.com/teams/1369/60.png"
    }
}
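
To show how the fields above fit together, here is a minimal sketch of reading exported records and looking up a logo. It assumes the export is a single JSON array of player objects (the actual file layout is not specified here), and both `index_by_id` and `logo_url` are hypothetical helpers.

```python
import json
from typing import Optional


def index_by_id(raw_json: str) -> dict:
    """Parse a JSON array of player records and key them by their "id"."""
    players = json.loads(raw_json)
    return {player["id"]: player for player in players}


def logo_url(player: dict, group: str = "club") -> Optional[str]:
    """Return the logo URL for "country", "club" or "nationalClub".

    Returns None when the record has no logo for that group.
    """
    return player.get("logos", {}).get(group, {}).get("url")
```

Since id is a string in the metadata, lookups use string keys, for example `index_by_id(text)["158023"]`.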

Contributing to the Project

We love your input! We want to make contributing to this project as easy and transparent as possible, whether it's:

  • Reporting a bug
  • Discussing the current state of the code
  • Submitting a fix
  • Proposing new features

Making a PR

  • Fork the repo and clone it on your machine.

  • Add an upstream remote pointing to the main repository in your cloned repo

    git remote add upstream https://github.com/sauravhiremath/fifa-stats-crawler.git
    
    
  • Keep your cloned repo up to date by pulling from upstream (this also helps avoid merge conflicts when committing new changes)

    git pull upstream master
    
  • Create your feature branch

    git checkout -b <feature-name>
    
  • Commit all the changes

    git commit -am "Meaningful commit message"
    
  • Push the changes for review

    git push origin <feature-name>
    
  • Create a PR against our repo on GitHub.

Additional Notes

  • Code should be properly commented to ensure its readability.
  • If you've added code that should be tested, add tests as comments.
  • In Python, use docstrings to provide tests.
  • Make sure your code is properly formatted.
  • Issue that pull request!

Issue suggestions/Bug reporting

Before creating an issue, make sure it has not already been reported. Provide a proper description of the changes, and if you are suggesting code improvements, give thorough details about them.

Great issue suggestions tend to have:

  • A quick summary of the changes.
  • In case of a bug, steps to reproduce it
    • Be specific!
    • Give sample code if you can.
    • What you expected would happen
    • What actually happens
    • Notes (possibly including why you think this might be happening, or stuff you tried that didn't work)

Additional References:

A more detailed step-by-step guide, with pictures, for creating a pull request can be found here