Engine API
The engine starts up with a single model that it maintains, with defaults in place. Thus, there is no need to provide an initial set of arguments to instantiate the model (as was the case in previous versions).
Communication with the engine can be viewed as a socket-like connection with two-way messaging. This documentation is split into the messages that the engine accepts and the messages that it emits.
- app ⇨ engine
- engine ⇨ app
Messages understood and received by the engine. If the engine receives a message that does not match any of the following, it will emit an error.
Assigns a dataset to the model for the given label. This clears all cached information and terms in the model for the given label.
{
type: 'setData',
data: {
data: [[number], [number], ...], // MxN matrix
label: string
}
}
Response: candidates, model:{label}
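As a minimal sketch of how an app might build this message, the helper below checks that the matrix is rectangular before wrapping it. The function name makeSetDataMessage and the validation rules are assumptions for illustration; only the message shape and the 'fit' label come from this documentation.

```javascript
// Hypothetical helper (not part of the engine): builds a 'setData' message
// after checking that every row has the same number of columns.
function makeSetDataMessage(matrix, label) {
  if (!Array.isArray(matrix) || matrix.length === 0) {
    throw new Error('data must be a non-empty array of rows');
  }
  const width = Array.isArray(matrix[0]) ? matrix[0].length : -1;
  for (const row of matrix) {
    if (!Array.isArray(row) || row.length !== width) {
      throw new Error('every row must be an array with the same number of columns');
    }
  }
  return { type: 'setData', data: { data: matrix, label: label } };
}

// A 2x3 matrix assigned to the 'fit' dataset.
const setDataMessage = makeSetDataMessage([[1, 2, 3], [4, 5, 6]], 'fit');
console.log(JSON.stringify(setDataMessage));
```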
Sets the list of exponents to consider for candidate terms. A candidate term can potentially contain a column raised to any power specified in this list. An exponent can be any real number.
{
type: 'setExponents',
data: [number, number, ...] // list of exponents
}
Response: candidates, model:{label}
Sets the maximum number of multiplicands that can appear in any candidate term. A candidate term cannot be composed of more than n columns multiplied together.
{
type: 'setMultiplicands',
data: number // n - the maximum number of multiplicands
}
Response: candidates, model:{label}
Sets the column in the dataset which is to be considered the dependent variable (or the variable which is to be predicted, a.k.a. y). Must be a valid index within the provided data set.
{
type: 'setData',
data: +integer // index of dependent column
}
Response: candidates, model:{label}
Sets the potential lag values to consider when computing candidate terms. A lag value is the number of rows by which a column is shifted down (for instance, lag = 2 shifts a column down 2 rows). This is useful for time-dependent data in which previous recordings may help predict later values. A lag value can only be positive (otherwise this would mean using future data to predict the present).
{
type: 'setLags',
data: [+integer, +integer, ...] // list of lag values
}
Response: candidates, model:{label}
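To make the "shift down" description concrete, here is a small sketch of lagging a single column. The function name lagColumn is hypothetical, and filling the first rows with null is an assumption; this documentation does not say how the engine treats rows that have no earlier value.

```javascript
// Hypothetical illustration: shift a column down by `lag` rows.
// Rows with no earlier value available are filled with null (an assumption).
function lagColumn(column, lag) {
  if (!Number.isInteger(lag) || lag < 0) {
    throw new Error('lag must be a non-negative integer');
  }
  const shifted = [];
  for (let i = 0; i < column.length; i++) {
    shifted.push(i < lag ? null : column[i - lag]);
  }
  return shifted;
}

const laggedColumn = lagColumn([10, 20, 30, 40], 2);
console.log(laggedColumn); // → [null, null, 10, 20]
```

With lag = 2, row 2 sees the value recorded at row 0, so only past data is ever used for the current row.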
Restricts the dataset specified by label to a subset of its rows, given either as a range (start, end) or as a list of row indices.
{
type: 'subset',
data: {
start: +integer, // index of row to start at
end: +integer // index of row to end at
}
}
OR
{
type: 'subset',
data: {
rows: [+integer, +integer, ...] // arbitrary row indices to use
}
}
Response: candidates, model:{label}
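The two accepted shapes can be produced from one helper, sketched below. The name makeSubsetMessage and the validation are assumptions for illustration; the sketch follows only the fields shown in the two message shapes above.

```javascript
// Hypothetical helper: builds either form of the 'subset' message.
// Pass an array for arbitrary row indices, or { start, end } for a range.
function makeSubsetMessage(selection) {
  const isIndex = (n) => Number.isInteger(n) && n >= 0;
  if (Array.isArray(selection)) {
    if (!selection.every(isIndex)) {
      throw new Error('rows must be non-negative integers');
    }
    return { type: 'subset', data: { rows: selection } };
  }
  const { start, end } = selection;
  if (!isIndex(start) || !isIndex(end) || end < start) {
    throw new Error('start and end must be non-negative integers with start <= end');
  }
  return { type: 'subset', data: { start: start, end: end } };
}

const rangeSubset = makeSubsetMessage({ start: 0, end: 99 });
const rowsSubset = makeSubsetMessage([0, 2, 4, 8]);
console.log(JSON.stringify(rangeSubset));
console.log(JSON.stringify(rowsSubset));
```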
Adds a term to the model. A term is defined by a set of (column, exponent, lag) triples.
For instance, [[1,1,0], [2,3,0], [3,1,1]] = X1 * X2^3 * X3 (lag 1).
A term does not necessarily need to be a candidate term in order to be added.
{
type: 'addTerm',
data: [ [col,exp,lag], [col,exp,lag], ... ] // set of column, exponent, lag triples that define a term
}
Response: candidates, model:{label}
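The following sketch shows one way to read a term definition: each triple contributes one factor, and the factors are multiplied row by row. The function evaluateTerm is hypothetical and is not the engine's implementation. It assumes zero-based column indices into each row and returns null for rows where a lagged value is unavailable; neither detail is specified in this documentation.

```javascript
// Hypothetical illustration: compute the value of a term for every row.
// rows: MxN matrix; term: list of [col, exp, lag] triples.
function evaluateTerm(rows, term) {
  return rows.map((_, r) => {
    let product = 1;
    for (const [col, exp, lag] of term) {
      const source = r - lag; // a lagged factor reads from an earlier row
      if (source < 0) return null; // no earlier row exists (assumption: null)
      product *= Math.pow(rows[source][col], exp);
    }
    return product;
  });
}

const sampleRows = [[1, 2], [3, 4], [5, 6]];
// Column 0 to the power 1 (no lag) times column 1 squared (lag 1).
const termValues = evaluateTerm(sampleRows, [[0, 1, 0], [1, 2, 1]]);
console.log(termValues); // → [null, 12, 80]
```

Row 1 evaluates to 3 * 2^2 = 12 and row 2 to 5 * 4^2 = 80; row 0 has no earlier row for the lagged factor.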
Removes a term from the model. Terms are specified in the same way as they are for addTerm.
{
type: 'removeTerm',
data: [ [col,exp,lag], [col,exp,lag], ... ] // set of column, exponent, lag triples that define a term
}
Response: candidates, model:{label}
Removes all terms from the model and wipes the cache clean, like a self-cleaning oven! Magic!
{
type: 'clear'
}
Response: candidates, model:{label}
Gets the metadata for each statistic, including sort order and other information.
{
type: 'getStatisticsMetadata'
}
Response: statisticsMetadata
Turns off notifications about the model and candidates whenever a parameter changes. Ordinarily the model and candidates are sent automatically after every change; turning this off is ideal during loading, when many settings are adjusted in a short amount of time.
{
type: 'unsubscribeToChanges'
}
Turns on notifications about the model and candidates whenever a parameter changes. This is on by default. If unsubscribeToChanges was called at any point, this turns updates back on.
{
type: 'subscribeToChanges'
}
Response: candidates, model:{label}
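A typical use of this pair is to wrap a burst of settings so the engine recomputes only once. The sketch below assumes a send function supplied by the app (hypothetical; this documentation does not define how messages are transported), and sendBatch is likewise an illustrative name.

```javascript
// Hypothetical helper: send several settings without triggering an update
// after each one. `send` is whatever function the app uses to post a
// message to the engine (an assumption here).
function sendBatch(send, messages) {
  send({ type: 'unsubscribeToChanges' });
  try {
    for (const message of messages) {
      send(message);
    }
  } finally {
    // Re-subscribe even if a send throws, so updates are not left off.
    send({ type: 'subscribeToChanges' });
  }
}

// Stand-in transport that just records what was sent.
const sentMessages = [];
sendBatch((message) => sentMessages.push(message), [
  { type: 'setExponents', data: [1, 2] },
  { type: 'setMultiplicands', data: 2 },
  { type: 'setLags', data: [1, 2] },
]);
console.log(sentMessages.map((message) => message.type).join(' -> '));
```

Since subscribeToChanges itself responds with candidates and model:{label}, the app receives one fresh update at the end of the batch.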
Messages emitted by the engine. Be sure to have handlers for each event if applicable in order to stay up-to-date with the engine.
Contains a list of candidate terms, along with statistics for each candidate that reflect how the model would look if that term were included.
{
type: 'candidates',
data: [ {
coeff: number,
term: [ [col,exp,lag], ... ],
stats: {
AIC: number,
BIC: number,
Rsq: number,
t: number,
F: number,
...
}
}, ... ] // for each candidate
}
Contains statistics for the model as well as for each term within the model. label is replaced with the dataset label (fit/test/validation).
{
type: 'model:{label}', // e.g. 'model:fit' | 'model:test' | 'model:validation'
data: {
stats: {
AIC: number,
BIC: number,
Rsq: number,
...
},
terms: [ {
coeff: number,
term: [ [col,exp,lag], ... ],
stats: {
AIC: number,
BIC: number,
Rsq: number,
t: number,
F: number,
...
}
}, ... ], // for each term in the model
predicted: [ number ], // predicted values for the response variable
residuals: [ number ], // residual errors
graphdata: [ [x, y], [x, y], ... ] // list of coordinates to plot
}
}
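Because the dataset label is embedded in the message type, a handler usually needs to split it out. The sketch below shows one way to do that; parseModelType is a hypothetical name, not an engine function.

```javascript
// Hypothetical helper: extract the dataset label from a 'model:{label}'
// message type. Returns null for any other message type.
function parseModelType(type) {
  const prefix = 'model:';
  if (typeof type !== 'string' || !type.startsWith(prefix)) {
    return null;
  }
  const label = type.slice(prefix.length);
  return label.length > 0 ? label : null;
}

console.log(parseModelType('model:fit'));        // → fit
console.log(parseModelType('model:validation')); // → validation
console.log(parseModelType('candidates'));       // → null
```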
Indicates that some long-running process is just beginning. It might be a good idea to bring up a progress bar for the user.
{
type: 'progress.start'
}
Indicates an update on some long-running process and provides values that reflect how far along the process is towards completion.
{
type: 'progress',
data: {
curr: number, // curr <= total
total: number
}
}
Indicates that the long-running process announced by progress.start has just finished.
{
type: 'progress.end'
}
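The three progress messages can drive a progress bar with a small state holder, sketched below. The name createProgressTracker and the shape of the returned state are assumptions for illustration; only the message types and the curr/total fields come from this documentation.

```javascript
// Hypothetical helper: tracks progress state from the engine's
// 'progress.start', 'progress' and 'progress.end' messages.
function createProgressTracker() {
  let state = { active: false, percent: 0 };
  return function handle(message) {
    switch (message.type) {
      case 'progress.start':
        state = { active: true, percent: 0 };
        break;
      case 'progress': {
        const { curr, total } = message.data;
        // Guard against total = 0 to avoid dividing by zero.
        const percent = total > 0 ? Math.round((curr / total) * 100) : 0;
        state = { active: true, percent: percent };
        break;
      }
      case 'progress.end':
        state = { active: false, percent: 100 };
        break;
      default:
        break; // not a progress message; state is unchanged
    }
    return state;
  };
}

const trackProgress = createProgressTracker();
trackProgress({ type: 'progress.start' });
const midway = trackProgress({ type: 'progress', data: { curr: 1, total: 4 } });
const finished = trackProgress({ type: 'progress.end' });
console.log(midway.percent, finished.active); // → 25 false
```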
Contains metadata about each statistic passed through with a term or with the model, including sort order and other relevant information. Note that some statistics may not have some of these fields defined, in which case the defaults should be used.
- id: Identifier of the statistic. This is the property that will appear in stats objects attached to candidate terms or the model.
- sort: Sort order for the statistic:
  - > : ascending order (1 before 2)
  - < : descending order (2 before 1)
  - |>| : ascending order of absolute values (1 before -2, 2 before -3)
  - |<| : descending order of absolute values (-2 before 1, -3 before 2)
- globalOnly: Whether or not the statistic should be displayed only on the global statistics table. If true, the statistic is considered unimportant for evaluating candidate terms.
- candidateOnly: Whether or not this statistic is to be displayed next to candidates only and ignored for the model.
- Name to use for display purposes.
- default: If true, this statistic should be selected by default when displaying candidates. Sort by whichever.
{
type: 'statisticsMetadata',
data: [
{
id: string, // e.g. "Rsq", "t", "pt", ...
sort: '>' | '<' | '|>|' | '|<|', // (default: '>')
globalOnly: boolean, // (default: false)
candidateOnly: boolean, // (default: false)
format: 'int' | 'float', // (default: 'float')
default: boolean // (default: false)
},
},
...
]
}
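The sort field maps naturally onto a comparator. The sketch below follows the four orderings exactly as described above and falls back to '>' when the field is missing. The names comparatorFor and sortCandidates are hypothetical, and the candidate values are made-up illustrative data.

```javascript
// Hypothetical helper: turn a metadata `sort` value into a comparator.
// A missing value falls back to the documented default, '>'.
function comparatorFor(sort = '>') {
  switch (sort) {
    case '>': return (a, b) => a - b;                        // 1 before 2
    case '<': return (a, b) => b - a;                        // 2 before 1
    case '|>|': return (a, b) => Math.abs(a) - Math.abs(b);  // 1 before -2
    case '|<|': return (a, b) => Math.abs(b) - Math.abs(a);  // -2 before 1
    default: throw new Error('unknown sort order: ' + sort);
  }
}

// Sort a copy of the candidates by the statistic the metadata entry describes.
function sortCandidates(candidates, meta) {
  const compare = comparatorFor(meta.sort);
  return [...candidates].sort(
    (x, y) => compare(x.stats[meta.id], y.stats[meta.id])
  );
}

// Illustrative data only.
const sampleCandidates = [
  { term: [[0, 1, 0]], stats: { t: 1 } },
  { term: [[1, 1, 0]], stats: { t: -3 } },
  { term: [[2, 1, 0]], stats: { t: 2 } },
];
const byAbsT = sortCandidates(sampleCandidates, { id: 't', sort: '|<|' });
console.log(byAbsT.map((c) => c.stats.t)); // → [-3, 2, 1]
```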
This is emitted whenever the engine receives an invalid message. The data field will contain a message describing the error.
{
type: 'error',
data: string // error message
}
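On the app side, one way to keep a handler for each emitted message, including error, is a small dispatcher keyed by message type. This is a sketch under assumptions: createDispatcher is a hypothetical name, and how messages actually arrive from the engine is not covered by this documentation.

```javascript
// Hypothetical helper: route engine messages to handlers by their type.
// Returns true if a handler ran, false if the type had no handler.
function createDispatcher(handlers) {
  return function dispatch(message) {
    const hasHandler = Object.prototype.hasOwnProperty.call(handlers, message.type);
    if (!hasHandler) {
      return false;
    }
    handlers[message.type](message.data, message.type);
    return true;
  };
}

const receivedLog = [];
const dispatch = createDispatcher({
  candidates: (data) => receivedLog.push('candidates: ' + data.length),
  'model:fit': (data) => receivedLog.push('fit Rsq: ' + data.stats.Rsq),
  error: (data) => receivedLog.push('engine error: ' + data),
});

dispatch({ type: 'candidates', data: [] });
dispatch({ type: 'error', data: 'unknown message type' });
const progressHandled = dispatch({ type: 'progress.start' }); // no handler registered
console.log(receivedLog, progressHandled);
```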