About

This repo contains a machine learning library written in Haskell. It exists mainly as a proof of concept for implementing machine learning in Haskell. Issues and limitations are discussed below. For installation instructions, see Installation.

Implementation Details

General

This library does not define any monads of its own. (Technically, that is not quite true: a single monad is defined here. However, that definition only type checks and does not produce any valid output.)

Every implementation is from scratch, i.e., no external machine learning libraries have been used. The State monad is used to keep track of the weights and biases while training linear and logistic regression. KNN uses Data.Ord and Data.List for comparisons between neighbours. Apart from the State monad (the sample .cabal file below lists mtl as a dependency), everything is defined on top of base.

Linear and Logistic Regression

A new datatype called Model stores the weights and biases of the model. Training uses stochastic gradient descent, run for a fixed number of iterations that is passed as a parameter to the training function.
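As a rough illustration of the idea, here is a minimal sketch of per-sample gradient descent for a linear model, with the parameters threaded through the State monad from mtl. It is not the library's code: Params, predictOne, sgdStep, train and fitLinear are hypothetical names, the loss is assumed to be squared error, and samples are visited in a fixed order rather than being shuffled.

```haskell
import Control.Monad (forM_, replicateM_)
import Control.Monad.State (State, execState, modify')

-- Hypothetical stand-in for the library's Model type: weights plus one bias.
data Params = Params { weights :: [Double], bias :: Double }
  deriving (Show)

-- Prediction of the linear model for a single input vector.
predictOne :: Params -> [Double] -> Double
predictOne (Params ws b) xs = sum (zipWith (*) ws xs) + b

-- One gradient step on a single (input, target) pair,
-- assuming the loss 0.5 * (prediction - target)^2.
sgdStep :: Double -> ([Double], Double) -> Params -> Params
sgdStep lr (xs, y) p@(Params ws b) =
  let err = predictOne p xs - y
  in Params (zipWith (\w x -> w - lr * err * x) ws xs) (b - lr * err)

-- Run a fixed number of passes over the samples, updating the
-- parameters held in the State monad after every sample.
train :: Double -> Int -> [([Double], Double)] -> State Params ()
train lr iterations samples =
  replicateM_ iterations (forM_ samples (modify' . sgdStep lr))

-- Start from all-zero parameters and return the trained ones.
fitLinear :: Double -> Int -> [([Double], Double)] -> Params
fitLinear lr iterations samples =
  execState (train lr iterations samples) (Params (replicate dim 0) 0)
  where
    dim = case samples of
      []          -> 0
      (xs, _) : _ -> length xs
```

For example, fitLinear 0.05 2000 [([x], 2 * x + 1) | x <- [0, 1, 2, 3]] should recover a weight close to 2 and a bias close to 1. The learning rate and iteration count here are arbitrary choices for the sketch.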

KNN

The distance metric used is Euclidean distance. A future version might upgrade to Minkowski distance with an appropriate p value.
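The sketch below shows how the two metrics relate and how a k-nearest-neighbour vote can be built from Data.List and Data.Ord alone. The names euclidean, minkowski and classify are hypothetical and are not the library's API; the majority vote does not treat ties specially.

```haskell
import Data.List (maximumBy, sortBy)
import Data.Ord (comparing)

-- Euclidean distance between two points of equal dimension.
euclidean :: [Double] -> [Double] -> Double
euclidean xs ys = sqrt (sum (zipWith (\x y -> (x - y) ^ (2 :: Int)) xs ys))

-- Minkowski distance of order p (p >= 1).
-- p = 2 gives the Euclidean distance, p = 1 the Manhattan distance.
minkowski :: Double -> [Double] -> [Double] -> Double
minkowski p xs ys = sum (zipWith (\x y -> abs (x - y) ** p) xs ys) ** (1 / p)

-- Label a query point by majority vote among its k nearest training points.
-- Returns Nothing when k is not positive or there is no training data.
classify :: Eq label => Int -> [([Double], label)] -> [Double] -> Maybe label
classify k training query
  | k <= 0 || null training = Nothing
  | otherwise = Just (majority nearestLabels)
  where
    -- Sort the training points by distance to the query and keep k labels.
    nearestLabels =
      map snd (take k (sortBy (comparing (euclidean query . fst)) training))
    -- Pair every label with its number of occurrences and pick the largest.
    majority labels =
      fst (maximumBy (comparing snd)
             [(l, length (filter (== l) labels)) | l <- labels])
```

Swapping euclidean for minkowski p inside classify is the only change needed to experiment with other values of p.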

Naive Bayes

The Data datatype keeps track of corresponding input and output data points in order to calculate the requisite probabilities. Each probability is calculated as a discrete value. While the general idea is to use a distribution (Gaussian, for example), for present purposes the discrete calculation serves the same purpose in theory.
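To make "calculated as a discrete value" concrete, here is a minimal sketch that estimates the probabilities by counting, treating each input as a single discrete value. Observations, count, prior, likelihood and classifyNB are hypothetical names for illustration and may not match how the library's Data type is organised.

```haskell
import Data.List (maximumBy, nub)
import Data.Ord (comparing)

-- Hypothetical stand-in for the library's Data type: paired inputs and outputs.
type Observations x y = [(x, y)]

-- Number of elements satisfying a predicate, as a Double for division.
count :: (a -> Bool) -> [a] -> Double
count p = fromIntegral . length . filter p

-- Empirical prior P(y): the fraction of observations with output y.
-- Only meaningful for a non-empty list of observations.
prior :: Eq y => Observations x y -> y -> Double
prior obs y = count ((== y) . snd) obs / fromIntegral (length obs)

-- Empirical likelihood P(x | y): among observations with output y,
-- the fraction whose input is exactly x. Zero when y never occurs.
likelihood :: (Eq x, Eq y) => Observations x y -> x -> y -> Double
likelihood obs x y
  | nY == 0   = 0
  | otherwise = count (== (x, y)) obs / nY
  where
    nY = count ((== y) . snd) obs

-- Pick the output y that maximises P(x | y) * P(y).
classifyNB :: (Eq x, Eq y) => Observations x y -> x -> Maybe y
classifyNB [] _ = Nothing
classifyNB obs x = Just (maximumBy (comparing score) (nub (map snd obs)))
  where
    score y = likelihood obs x y * prior obs y
```

Because the likelihood is an exact-match count, an input that never appeared in the training data scores zero for every output, which is one way the discrete approach can give unhelpful probabilities.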

Installation

This library is not available through stack, cabal, or Hackage. Clone the repo into the folder of your choice using git clone. Then add the modules you wish to use to your .cabal file and import them in the file where you need them. Check Usage for more details.

A sample .cabal file is shown below, with only the relevant portions. The module files must be at the correct path within the folder of the cabal project for this to work.

    other-modules: Model,
                   Logistic,
                   KNN,
                   Naive

    build-depends: base >=4.14.3.0, 
                   mtl

The project was built with base >= 4.14.3.0, but you can use any version as long as it doesn't have breaking changes from this version.

Usage

Linear Regression

    import Model

    -- initialize the model
    linReg = linearReg xsTrs ysTrs
    -- train with xsTrs and ysTrs as the training set for 1000 iterations
    -- this returns a trained model instance
    linReg' = converge linReg xsTrs ysTrs 1000
    -- predict with xsTe as the test set
    ysTe = predict xsTe linReg'

KNN

    -- assumes the KNN module listed under other-modules in the sample .cabal file
    import KNN

    -- initialize the model
    knn' = knn xsTrs ysTrs
    -- predict on new instances
    ys = knnFit knn' xsTest 5 -- the last parameter is k

Issues and Limitations

The original conception of the library included a monad for the Model datatype, which would allow for iterative computation/training. My knowledge of Haskell was not sufficient for me to do this.
The Naive Bayes method in this library does not treat the data points as having parameters; instead it works off the assumption that each input point is a unique value in itself. This leads to wonky values for the calculated probabilities.
