Hands-On Exploratory Data Analysis with Python

Data is a collection of discrete objects, events out of context, and facts. Processing such data yields information, and processing that information in light of our experience, judgment, or jurisdiction yields knowledge as the result of learning. But the million-dollar question is: how do we get meaningful information from such data? The answer is Exploratory Data Analysis (EDA), a process for investigating datasets, elucidating subjects, and visualizing the outcomes. EDA applies a variety of techniques to maximize insight into a dataset: reveal underlying structure, extract significant variables, detect outliers and anomalies, test underlying assumptions, develop models, and determine the best parameters for future estimations.

This book, "Hands-On Exploratory Data Analysis with Python", provides practical knowledge of the main pillars of EDA, including data cleaning, data preparation, data exploration, and data visualization. Why visualization? Several research studies suggest that portraying data in graphical form is clearer and makes complex statistical analyses and business intelligence more marketable.

Readers will explore open-source datasets including healthcare data, demographics data, the Titanic dataset, the Wine Quality dataset, the Boston housing pricing dataset, and many others. Using these real-life datasets, readers get hands-on practice in understanding the data, summarizing its characteristics, and visualizing it for business intelligence. The book expects readers to use pandas, a powerful library for working with data, along with other core Python libraries: NumPy and SciPy, StatsModels for regression, and Matplotlib for visualization.
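To make the EDA steps described above concrete, here is a minimal sketch of summarizing a dataset, counting missing values, and flagging outliers with pandas. The toy data and the `iqr_outliers` helper are made up for illustration and are not taken from the book; the 1.5 × IQR rule is one common convention for outliers, not necessarily the one the chapters use.

```python
import pandas as pd

# Made-up toy data for illustration only (not one of the book's datasets).
df = pd.DataFrame({
    "age": [22, 25, 27, 30, 31, 95],
    "fare": [7.25, 8.05, None, 13.0, 26.0, 512.33],
})

# Summarize the characteristics of each numeric column.
summary = df.describe()

# Count missing values per column (a typical data cleaning check).
missing = df.isna().sum()


def iqr_outliers(series):
    """Return values lying more than 1.5 * IQR outside the quartiles.

    Hypothetical helper: missing values are ignored by quantile() and
    never flagged, because comparisons against NaN evaluate to False.
    """
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    mask = (series < q1 - 1.5 * iqr) | (series > q3 + 1.5 * iqr)
    return series[mask]


if __name__ == "__main__":
    print(summary)
    print(missing)
    print(iqr_outliers(df["age"]))
    print(iqr_outliers(df["fare"]))
```

On a real dataset you would replace the toy `DataFrame` with something like `pd.read_csv(...)` and inspect the flagged rows before deciding whether they are errors or genuine extremes.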

Chapters

Troubleshooting with the codebase

Please note that we tested the code presented in this book with specific versions of Python, pandas, Matplotlib, and other Python libraries. Running the code with a newer or older version might result in warnings or errors. If you encounter any errors, feel free to raise an issue here, and we will try our best to sort it out.
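When raising an issue, it helps to include the versions you have installed. The snippet below is a small sketch that uses only the standard library (Python 3.8+) to report them; the `installed_versions` helper and the package list are assumptions based on the libraries named in this README, not an official list of the tested versions.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    print("python", platform.python_version())
    # Assumed package list, taken from the libraries this README mentions.
    names = ["pandas", "numpy", "scipy", "statsmodels", "matplotlib"]
    for name, ver in installed_versions(names).items():
        print(name, ver or "not installed")
```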

Errata

  • Page 13 (Chapter 1): "and real estate industries storehouse..." should be "and real estate industries store house..."

  • Page 14 (Chapter 1): "... there are four observations (001, 002, 003, 004, 005)." should be "... there are five observations (001, 002, 003, 004, 005)."

Want to become an expert?

It is important to practice what you have learned from this book. Hence, we have created a comprehensive mobile app where you can create a simple account and practice Exploratory Data Analysis. Here are the links to the iOS and Android apps:

Contributors

Download a free PDF

If you have already purchased a print or Kindle version of this book, you can get a DRM-free PDF version at no cost.
Simply click on the link to claim your free PDF.

https://packt.link/free-ebook/9781789537253