XGBoost.jl

eXtreme Gradient Boosting Package in Julia

Abstract

This package is a Julia interface to XGBoost, short for eXtreme Gradient Boosting, an efficient and scalable implementation of the gradient boosting framework. It includes an efficient linear model solver and tree learning algorithms. The library is parallelized using OpenMP and can be more than 10 times faster than some existing gradient boosting packages. It supports various objective functions, including regression, classification, and ranking. The package is also designed to be extensible, so users can easily define their own objectives.

Features

Sparse feature format: allows easy handling of missing values and improves computational efficiency.
Advanced features, such as customized loss functions and cross-validation; see the demo folder for walkthrough examples.
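As a rough illustration of the sparse format mentioned above, the sketch below builds a SparseMatrixCSC with Julia's SparseArrays standard library. The values are made up, and the example only shows how unobserved entries are left unstored; it does not call XGBoost itself.

```julia
using SparseArrays

# Three samples, four features; only five entries are actually observed.
# All values here are made up for illustration.
rows = [1, 1, 2, 3, 3]
cols = [1, 3, 2, 1, 4]
vals = [1.0, 2.0, 3.0, 4.0, 5.0]

# Entries that are not listed are simply not stored.
X = sparse(rows, cols, vals, 3, 4)

println("stored entries: ", nnz(X), " of ", length(X))
```

A matrix built this way is one of the input types the package lists as supported.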

Installation

] add XGBoost

or

] develop "https://github.com/dmlc/XGBoost.jl.git"
] build XGBoost

By default, the package builds the latest stable version of the XGBoost library. To build the latest master, set the environment variable XGBOOST_BUILD_VERSION to "master" prior to installing or building the package (e.g. ENV["XGBOOST_BUILD_VERSION"] = "master").

Minimal examples

To show how XGBoost works, here is an example using the Mushroom dataset.

Prepare Data

XGBoost supports Julia Array, SparseMatrixCSC, libSVM-format text, and XGBoost binary files as input. Here is an example of Mushroom classification. It uses the function readlibsvm from basic_walkthrough.jl, which loads libSVM-format text into a dense Julia matrix.
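To make the loading step concrete, here is a minimal sketch of a libSVM-format parser. The name parse_libsvm is hypothetical and this is not the demo's readlibsvm; it assumes each line has the form "label index:value ..." and that feature indices in the file are 0-based.

```julia
# Hypothetical helper, not the demo's readlibsvm: parse libSVM-format lines
# ("label index:value index:value ...") into a dense matrix and a label vector.
function parse_libsvm(lines, shape::Tuple{Int,Int})
    X = zeros(Float32, shape)   # features that do not appear on a line stay 0
    y = Float32[]
    for (row, line) in enumerate(lines)
        tokens = split(line)
        push!(y, parse(Float32, tokens[1]))
        for token in tokens[2:end]
            idx, val = split(token, ":")
            # Assumption: indices in the file are 0-based, so shift to
            # Julia's 1-based columns.
            X[row, parse(Int, idx) + 1] = parse(Float32, val)
        end
    end
    return X, y
end

# Two made-up lines with four features each.
lines = ["1 0:1 2:1", "0 1:1 3:0.5"]
X, y = parse_libsvm(lines, (2, 4))
```

Under the same assumptions, passing eachline("data/agaricus.txt.train") together with the shape (6513, 126) would read a file line by line instead of an in-memory vector.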

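using XGBoost

# readlibsvm comes from basic_walkthrough.jl and must be defined before this runs.
# The tuple is the (rows, columns) shape of the dense matrix to fill.
train_X, train_Y = readlibsvm("data/agaricus.txt.train", (6513, 126))
test_X, test_Y = readlibsvm("data/agaricus.txt.test", (1611, 126))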

Fit Model

num_round = 2
bst = xgboost(train_X, num_round, label = train_Y, eta = 1, max_depth = 2)

Predict

pred = predict(bst, test_X)
println("test-error=", sum((pred .> 0.5) .!= test_Y) / length(pred))
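The error computation above thresholds each prediction at 0.5 and counts the mismatches against the 0/1 labels. The sketch below walks through the same arithmetic on made-up numbers; the values are illustrative only and are not output from the Mushroom model.

```julia
# Made-up values for illustration only; not output from the Mushroom model.
pred = [0.9, 0.2, 0.6, 0.4]      # predicted scores
labels = [1.0, 0.0, 0.0, 0.0]    # true 0/1 labels

predicted_class = pred .> 0.5                # true where class 1 is predicted
n_wrong = sum(predicted_class .!= labels)    # true == 1.0 and false == 0.0 in Julia
error_rate = n_wrong / length(pred)

println("test-error=", error_rate)
```

Only the third sample is misclassified here, so one prediction out of four is wrong.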

Cross-Validation

nfold = 5
param = ["max_depth" => 2,
         "eta" => 1,
         "objective" => "binary:logistic"]
metrics = ["auc"]
nfold_cv(train_X, num_round, nfold, label = train_Y, param = param, metrics = metrics)
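For readers unfamiliar with n-fold cross-validation, the sketch below shows the general idea of partitioning rows into folds. The helper kfold_indices is hypothetical and is not a description of how nfold_cv is implemented internally; it only illustrates that each row is held out for validation exactly once.

```julia
using Random

# Hypothetical helper: split row indices 1:n into nfold shuffled validation folds.
function kfold_indices(n::Integer, nfold::Integer; rng = MersenneTwister(0))
    idx = shuffle(rng, collect(1:n))
    # Fold i takes every nfold-th shuffled index, starting at position i.
    return [idx[i:nfold:n] for i in 1:nfold]
end

n = 10
folds = kfold_indices(n, 5)
for (k, valid_idx) in enumerate(folds)
    # Every row outside the validation fold is used for training.
    train_idx = setdiff(1:n, valid_idx)
    println("fold ", k, ": train on ", length(train_idx),
            " rows, validate on ", length(valid_idx))
end
```

A metric such as "auc" would then be computed on each validation fold and summarized across folds.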

Feature Walkthrough

See the demo folder.

Model Parameter Setting

See the XGBoost Wiki.
