Data Science Portfolio

Portfolio Information	Description
Language	Python
Libraries Used	sklearn, NLTK, statsmodels, NumPy, Pandas, re (Regex), Matplotlib, seaborn, wordcloud
Projects Count	6
Author	Ileana Cabada
Dataset	Electronic_Pricedataset, AB Testing dataset

About Portfolio - Product Price and Cross-Price Elasticities of Advertisement Demand, Feature Engineering for Machine Learning (Product Category Labelling) with Natural Language Processing, A/B Player Retention Testing and Games Played Probability, Price Exploratory Analysis

Content

Applying Econometrics for Pricing Strategies using Linear Modeling

Price Elasticity of Ad Impression Demand

In the following analysis, we select Best Buy products as the main data sample for our price elasticity analysis. For future reference, this model can be applied to any kind of vendor, e-commerce or brick-and-mortar, by measuring sales demand.

Hypothesis Proposed

Using Best Buy laptop sample data from 2017: is ad impression demand sensitive to changes in the product's own price? If so, how sensitive is ad impression demand to a price change?

Statistical Model

Linear Regression

Libraries

statsmodels, NumPy, Pandas, Matplotlib

Laptop, Desktop Price Elasticity

Cross-Price Elasticity of Ad Impression Demand

Hypothesis Proposed

How much is ad impression demand influenced by main competitors when they change their prices? This model helps us understand the nature of the competition between the price of our own advertised product and changes in the prices of main competitors' products.

Statistical Model

Multiple Linear Regression

Libraries

statsmodels, NumPy, Pandas, Matplotlib

Cross-Price Elasticity of 12 MacBook

A/B Testing for Consumer Retention

A/B Retention Testing

Statistical Model

Poisson Distribution, Bootstrap Distribution

Libraries

statsmodels, NumPy, Pandas, Matplotlib
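
A minimal sketch of how a bootstrap distribution and a Poisson model could be combined for a retention test. The data is simulated, and the column names (`version`, `retained`, `rounds_played`), the group labels and the threshold of 30 rounds are all assumptions for illustration; SciPy is used here for the Poisson tail probability even though it is not in the library list above.

```python
import numpy as np
import pandas as pd
from scipy.stats import poisson

# Simulated experiment (assumption): control retains 45% of players, treatment 40%.
rng = np.random.default_rng(2)
n = 2000
version = rng.choice(["control", "treatment"], size=n)
p_retain = np.where(version == "control", 0.45, 0.40)
df = pd.DataFrame({
    "version": version,
    "retained": rng.random(n) < p_retain,
    "rounds_played": rng.poisson(25, size=n),
})

retained = df["retained"].to_numpy()
is_control = (df["version"] == "control").to_numpy()
observed_diff = retained[is_control].mean() - retained[~is_control].mean()

# Bootstrap: resample players with replacement and recompute the retention gap.
boot_diffs = np.empty(2000)
for i in range(boot_diffs.size):
    idx = rng.integers(0, n, size=n)
    r, c = retained[idx], is_control[idx]
    boot_diffs[i] = r[c].mean() - r[~c].mean()

ci_low, ci_high = np.percentile(boot_diffs, [2.5, 97.5])
prob_control_better = (boot_diffs > 0).mean()  # share of resamples favouring control

# Poisson model for games played: P(X >= 30) given the sample mean as the rate.
lam = df["rounds_played"].mean()
p_at_least_30 = poisson.sf(29, lam)

print(f"retention gap: {observed_diff:.3f}, 95% CI [{ci_low:.3f}, {ci_high:.3f}]")
print(f"share of resamples where control retains better: {prob_control_better:.2%}")
print(f"P(at least 30 rounds): {p_at_least_30:.3f}")
```

If the bootstrap interval excludes zero, the retention difference between versions is unlikely to be due to sampling noise alone.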

A/B Testing Distribution	Poisson Distribution

Feature Engineering for Machine Learning and Natural Language Processing

Unsupervised Text Classification with TF-IDF Vectorizing and K-means

About Model Implemented

Because the dataset has no category labels for further price analysis between similar products (e.g. tablets, headphones), an unsupervised text clustering model was implemented to create product category labels. It uses text preprocessing techniques such as lemmatization, regex cleaning and tokenization, followed by TF-IDF vectorization and the K-means algorithm.

The Category_name and Cluster features were created from unique product names and their respective product descriptions.

Machine Learning Model

K-means

Libraries

NLTK, sklearn, re (Regex), WordCloud, Matplotlib, Pandas and NumPy

WordCloud	Electronic Category Label Clusters

Exploratory Data Analysis (EDA)

Price Exploratory Data Analysis

This price exploratory analysis was carried out ahead of calculating price elasticities with the multiple linear regression model, for the following reasons:

Product Condition Selection
Price Outlier Detection
Price Distribution Analysis
Discount Price Correlation with Impression Total Count per Category
Merchant (e-commerce) Impression Time Analysis

Libraries

seaborn, Matplotlib, Pandas and NumPy
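
Two of the steps listed above, price outlier detection and the discount-to-impressions correlation per category, can be sketched with Pandas alone. The data is synthetic, the column names are hypothetical, and the 1.5 x IQR rule is one common choice of outlier criterion rather than necessarily the one used in the notebook.

```python
import numpy as np
import pandas as pd

# Synthetic listing data (assumption): impressions rise with the discount.
rng = np.random.default_rng(3)
n = 500
df = pd.DataFrame({
    "category": rng.choice(["tablet", "headphones"], size=n),
    "price": rng.lognormal(mean=5.5, sigma=0.5, size=n),
    "discount_pct": rng.uniform(0, 40, size=n),
})
df["impressions"] = rng.poisson(20 + 0.5 * df["discount_pct"])
df.loc[0, "price"] = 50_000  # inject an obvious pricing error

# Flag prices outside 1.5 * IQR of the quartiles.
q1, q3 = df["price"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (df["price"] < q1 - 1.5 * iqr) | (df["price"] > q3 + 1.5 * iqr)
clean = df[~is_outlier]

# Pearson correlation between discount and impressions within each category.
corr_by_category = {
    category: group["discount_pct"].corr(group["impressions"])
    for category, group in clean.groupby("category")
}

print(f"removed {is_outlier.sum()} outlier rows")
print(corr_by_category)
```

The same per-category correlations could be arranged into a matrix and passed to `seaborn.heatmap` to produce a correlation heatmap.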

Price Distribution Plot	Price Discount Correlation Heatmap

Data Cleaning and Preprocessing

E-commerce Product Data Cleaning

Managing null values, dropping unused features, and text normalization.

Libraries

re (Regex), Matplotlib, Pandas and NumPy
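
A small sketch of the three cleaning steps named above, on a toy frame. The column names (`name`, `price`, `ean`), the median fill and the normalization rules are assumptions chosen for illustration; the notebook may handle nulls differently.

```python
import re

import numpy as np
import pandas as pd

# Toy product table (assumption) with missing values and messy text.
df = pd.DataFrame({
    "name": ["  Apple iPad PRO (64GB) ", "Sony  Headphones!!", None],
    "price": [649.0, np.nan, 199.0],
    "ean": [np.nan, np.nan, np.nan],
})

# Overview of nulls, unique values and dtypes per column.
summary = pd.DataFrame({
    "nulls": df.isna().sum(),
    "unique": df.nunique(),
    "dtype": df.dtypes.astype(str),
})
print(summary)

# Drop a feature that carries no information, then handle remaining nulls.
df = df.drop(columns=["ean"])
df = df.dropna(subset=["name"])                         # a product needs a name
df["price"] = df["price"].fillna(df["price"].median())  # simple median fill

def normalize_text(text):
    """Lowercase, strip punctuation, and collapse repeated whitespace."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()

df["name"] = df["name"].map(normalize_text)
print(df)
```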

Null, Unique and Datatype column values table

Contact Source	Information
e-mail	[email protected]
LinkedIn	https://www.linkedin.com/in/ileana-c-24666159/


ileanadatamania/Data-Science-Portfolio

Folders and files

Latest commit

History

Repository files navigation

Data Science Portfolio

Content

Applying Econometrics for Pricing Strategies using Linear Modeling

Statistical Model

Libraries

Statistical Model

Libraries

A/B Testing for Consumer Retention

Statistical Model

Libraries

Feature Engineering for Machine Learning and Natural Language Processing

About Model Implemented

Machine Learning Model

Libraries

Exploratory Data Analysis EDA

Libraries

Data Cleaning and Preprocessing

Libraries

About

Topics

Resources

Stars

Watchers

Forks

Languages