[DOCS] Renamed ML files to match anchor ID
lcawl committed Jul 20, 2020
1 parent 1e6d8c2 commit 3b796e2
Showing 18 changed files with 24 additions and 164 deletions.
52 changes: 0 additions & 52 deletions docs/reference/ml/anomaly-detection/configuring.asciidoc

This file was deleted.

@@ -1,6 +1,6 @@
[role="xpack"]
[[ml-configuring-aggregation]]
-=== Aggregating data for faster performance
+= Aggregating data for faster performance

By default, {dfeeds} fetch data from {es} using search and scroll requests.
It can be significantly more efficient, however, to aggregate data in {es}
@@ -17,7 +17,7 @@ search and scroll behavior.

[discrete]
[[aggs-limits-dfeeds]]
-==== Requirements and limitations
+== Requirements and limitations

There are some limitations to using aggregations in {dfeeds}. Your aggregation
must include a `date_histogram` aggregation, which in turn must contain a `max`
@@ -48,7 +48,7 @@ functions, set the interval to the same value as the bucket span.
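
To make the required shape concrete, here is a minimal sketch of a {dfeed} whose
aggregation nests a `max` on the time field inside a `date_histogram`. The index,
job, and field names (`farequote`, `time`, `airline`, `responsetime`) are
illustrative assumptions, not part of this change:

[source,console]
----
PUT _ml/datafeeds/datafeed-farequote
{
  "job_id": "farequote",
  "indices": ["farequote"],
  "aggregations": {
    "buckets": {
      "date_histogram": {
        "field": "time",
        "fixed_interval": "360s",
        "time_zone": "UTC"
      },
      "aggregations": {
        "time": {
          "max": { "field": "time" } <1>
        },
        "airline": {
          "terms": { "field": "airline", "size": 100 },
          "aggregations": {
            "responsetime": {
              "avg": { "field": "responsetime" }
            }
          }
        }
      }
    }
  }
}
----
// TEST[skip:needs-licence]

<1> The `max` aggregation on the time field is nested directly inside the
`date_histogram` aggregation, as this section requires.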

[discrete]
[[aggs-include-jobs]]
-==== Including aggregations in {anomaly-jobs}
+== Including aggregations in {anomaly-jobs}

When you create or update an {anomaly-job}, you can include the names of
aggregations, for example:
@@ -134,7 +134,7 @@ that match values in the job configuration are fed to the job.
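
As a hedged sketch of that naming relationship, the corresponding {anomaly-job}
below uses `doc_count` as its `summary_count_field_name` and detector field names
that match the aggregation names in the {dfeed} sketch above; all concrete names
remain illustrative:

[source,console]
----
PUT _ml/anomaly_detectors/farequote
{
  "analysis_config": {
    "bucket_span": "60m",
    "summary_count_field_name": "doc_count",
    "detectors": [
      {
        "function": "mean",
        "field_name": "responsetime",
        "by_field_name": "airline"
      }
    ]
  },
  "data_description": {
    "time_field": "time"
  }
}
----
// TEST[skip:needs-licence]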

[discrete]
[[aggs-dfeeds]]
-==== Nested aggregations in {dfeeds}
+== Nested aggregations in {dfeeds}

{dfeeds-cap} support complex nested aggregations. This example uses the
`derivative` pipeline aggregation to find the first order derivative of the
@@ -180,7 +180,7 @@ counter `system.network.out.bytes` for each value of the field `beat.name`.

[discrete]
[[aggs-single-dfeeds]]
-==== Single bucket aggregations in {dfeeds}
+== Single bucket aggregations in {dfeeds}

{dfeeds-cap} support not only multi-bucket aggregations, but also single bucket
aggregations. The following shows two `filter` aggregations, each gathering the
@@ -232,7 +232,7 @@ number of unique entries for the `error` field.

[discrete]
[[aggs-define-dfeeds]]
-==== Defining aggregations in {dfeeds}
+== Defining aggregations in {dfeeds}

When you define an aggregation in a {dfeed}, it must have the following form:

@@ -1,7 +1,7 @@
[role="xpack"]
[testenv="platinum"]
[[ml-configuring-categories]]
-=== Detecting anomalous categories of data
+= Detecting anomalous categories of data

Categorization is a {ml} process that tokenizes a text field, clusters similar
data together, and classifies it into categories. It works best on
@@ -100,7 +100,7 @@ SQL statement from the categorization algorithm.
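
For orientation, a minimal sketch of a categorization job that counts log
messages by `mlcategory`; the job ID and the `message` field name are
illustrative assumptions:

[source,console]
----
PUT _ml/anomaly_detectors/it_ops_logs
{
  "analysis_config": {
    "bucket_span": "30m",
    "categorization_field_name": "message", <1>
    "detectors": [
      {
        "function": "count",
        "by_field_name": "mlcategory" <2>
      }
    ]
  },
  "data_description": {
    "time_field": "time"
  }
}
----
// TEST[skip:needs-licence]

<1> The text field that is tokenized and clustered into categories.
<2> Splitting the detector on the keyword `mlcategory` produces per-category counts.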

[discrete]
[[ml-configuring-analyzer]]
-==== Customizing the categorization analyzer
+== Customizing the categorization analyzer

Categorization uses English dictionary words to identify log message categories.
By default, it also uses English tokenization rules. For this reason, if you use
@@ -1,6 +1,6 @@
[role="xpack"]
[[ml-configuring-detector-custom-rules]]
-=== Customizing detectors with custom rules
+= Customizing detectors with custom rules

<<ml-rules,Custom rules>> enable you to change the behavior of anomaly
detectors based on domain-specific knowledge.
@@ -15,7 +15,7 @@ scope and conditions. For the full list of specification details, see the
{anomaly-jobs} API.

[[ml-custom-rules-scope]]
-==== Specifying custom rule scope
+== Specifying custom rule scope

Let us assume we are configuring an {anomaly-job} in order to detect DNS data
exfiltration. Our data contain fields "subdomain" and "highest_registered_domain".
@@ -131,7 +131,7 @@ Such a detector will skip results when the values of all 3 scoped fields
are included in the referenced filters.
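
A reduced sketch of the scope mechanism with a single scoped field; the filter
ID, job ID, and field names are assumptions chosen to mirror the DNS example
above:

[source,console]
----
PUT _ml/filters/safe_domains
{
  "description": "Domains we consider safe",
  "items": ["safe.com", "trusted.com"]
}
----
// TEST[skip:needs-licence]

[source,console]
----
PUT _ml/anomaly_detectors/dns_exfiltration_with_rule
{
  "analysis_config": {
    "bucket_span": "5m",
    "detectors": [
      {
        "function": "high_info_content",
        "field_name": "subdomain",
        "over_field_name": "highest_registered_domain",
        "custom_rules": [
          {
            "actions": ["skip_result"],
            "scope": {
              "highest_registered_domain": { <1>
                "filter_id": "safe_domains",
                "filter_type": "include"
              }
            }
          }
        ]
      }
    ]
  },
  "data_description": {
    "time_field": "timestamp"
  }
}
----
// TEST[skip:needs-licence]

<1> Results are skipped whenever `highest_registered_domain` matches an item in
the `safe_domains` filter.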

[[ml-custom-rules-conditions]]
-==== Specifying custom rule conditions
+== Specifying custom rule conditions

Imagine a detector that looks for anomalies in CPU utilization.
Given a machine that is idle for long enough, small movement in CPU could
@@ -211,7 +211,7 @@ PUT _ml/anomaly_detectors/rule_with_range
// TEST[skip:needs-licence]
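
Alongside the `rule_with_range` example, a minimal sketch of a single numeric
condition that skips results while the actual CPU utilization stays below an
assumed threshold; the job ID, field name, and threshold are illustrative:

[source,console]
----
PUT _ml/anomaly_detectors/cpu_with_low_activity_rule
{
  "analysis_config": {
    "bucket_span": "5m",
    "detectors": [
      {
        "function": "high_mean",
        "field_name": "cpu_utilization",
        "custom_rules": [
          {
            "actions": ["skip_result"],
            "conditions": [
              {
                "applies_to": "actual",
                "operator": "lt",
                "value": 0.20
              }
            ]
          }
        ]
      }
    ]
  },
  "data_description": {
    "time_field": "timestamp"
  }
}
----
// TEST[skip:needs-licence]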

[[ml-custom-rules-lifecycle]]
-==== Custom rules in the lifecycle of a job
+== Custom rules in the lifecycle of a job

Custom rules only affect results created after the rules were applied.
Let us imagine that we have configured an {anomaly-job} and it has been running
@@ -222,7 +222,7 @@ rule we added will only be in effect for any results created from the moment we
added the rule onwards. Past results will remain unaffected.
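
To sketch what adding a rule to a job that already exists looks like (assuming a
job such as the `cpu_with_low_activity_rule` sketch above, with the rule attached
to its first detector), the update API accepts detector-level `custom_rules`:

[source,console]
----
POST _ml/anomaly_detectors/cpu_with_low_activity_rule/_update
{
  "detectors": [
    {
      "detector_index": 0, <1>
      "custom_rules": [
        {
          "actions": ["skip_result"],
          "conditions": [
            { "applies_to": "actual", "operator": "lt", "value": 0.20 }
          ]
        }
      ]
    }
  ]
}
----
// TEST[skip:needs-licence]

<1> The rule is attached to the detector at index 0; results the job has already
written are not changed retrospectively.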

[[ml-custom-rules-filtering]]
-==== Using custom rules vs. filtering data
+== Using custom rules vs. filtering data

It might appear that using rules is just another way of filtering the data
that feeds into an {anomaly-job}. For example, a rule that skips results when
@@ -1,6 +1,6 @@
[role="xpack"]
-[[ml-configuring-pop]]
-=== Performing population analysis
+[[ml-configuring-populations]]
+= Performing population analysis

Entities or events in your data can be considered anomalous when:

@@ -1,6 +1,6 @@
[role="xpack"]
[[ml-configuring-transform]]
-=== Transforming data with script fields
+= Transforming data with script fields

If you use {dfeeds}, you can add scripts to transform your data before
it is analyzed. {dfeeds-cap} contain an optional `script_fields` property, where
@@ -190,7 +190,7 @@ the **Edit JSON** tab. For example:
image::images/ml-scriptfields.jpg[Adding script fields to a {dfeed} in {kib}]

[[ml-configuring-transform-examples]]
-==== Common script field examples
+== Common script field examples

While the possibilities are limitless, there are a number of common scenarios
where you might use script fields in your {dfeeds}.
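
As one common pattern, a hedged sketch of a {dfeed} that sums two numeric fields
into a new `total_error_count` field before analysis; the index, job, and field
names are illustrative, and the job's detector would reference
`total_error_count` as its `field_name`:

[source,console]
----
PUT _ml/datafeeds/datafeed-test1
{
  "job_id": "test1",
  "indices": ["my-index"],
  "query": {
    "match_all": {}
  },
  "script_fields": {
    "total_error_count": { <1>
      "script": {
        "lang": "painless",
        "source": "doc['error_count'].value + doc['aborted_count'].value"
      }
    }
  }
}
----
// TEST[skip:needs-licence]

<1> The scripted field exists only in the {dfeed}, not in the source index, so
the job configuration must refer to it by this name.
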
@@ -1,6 +1,6 @@
[role="xpack"]
[[ml-configuring-url]]
-=== Adding custom URLs to machine learning results
+= Adding custom URLs to machine learning results

When you create an advanced {anomaly-job} or edit any {anomaly-jobs} in {kib},
you can optionally attach one or more custom URLs.
@@ -49,7 +49,7 @@ You can also specify these custom URL settings when you create or update

[float]
[[ml-configuring-url-strings]]
-==== String substitution in custom URLs
+== String substitution in custom URLs

You can use dollar sign ($) delimited tokens in a custom URL. These tokens are
substituted for the values of the corresponding fields in the anomaly records.
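
A minimal sketch of the API form, using the `$earliest$` and `$latest$` tokens
plus an assumed `$customer_id$` field token; the job ID and target URL are
illustrative:

[source,console]
----
POST _ml/anomaly_detectors/sample_job/_update
{
  "custom_settings": {
    "custom_urls": [
      {
        "url_name": "Raw documents",
        "url_value": "http://my.dashboard.example/lookup?from=$earliest$&to=$latest$&customer=$customer_id$"
      }
    ]
  }
}
----
// TEST[skip:needs-licence]
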
@@ -1,6 +1,6 @@
[role="xpack"]
[[ml-delayed-data-detection]]
-=== Handling delayed data
+= Handling delayed data

Delayed data are documents that are indexed late; that is, data related to a
time that the {dfeed} has already processed.
@@ -15,7 +15,7 @@ if it is set too high, analysis drifts farther away from real-time. The balance
that is struck depends upon each use case and the environmental factors of the
cluster.

-==== Why worry about delayed data?
+== Why worry about delayed data?

This is a particularly pertinent question. If data are delayed randomly (and
consequently are missing from analysis), the results of certain types of
@@ -27,7 +27,7 @@ however, {anomaly-jobs} with a `low_count` function may provide false positives.
In this situation, it would be useful to see if data comes in after an anomaly is
recorded so that you can determine a next course of action.

-==== How do we detect delayed data?
+== How do we detect delayed data?

In addition to the `query_delay` field, there is a delayed data check config,
which enables you to configure the datafeed to look in the past for delayed data.
@@ -41,7 +41,7 @@ arrived since the analysis. If there is indeed missing data due to their ingest
delay, the end user is notified. For example, you can see annotations in {kib}
for the periods where these delays occur.
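
For reference, a hedged sketch of a {dfeed} with the delayed data check enabled
and an explicit window; the job ID, index, query delay, and `2h` window are
illustrative assumptions:

[source,console]
----
PUT _ml/datafeeds/datafeed-sample
{
  "job_id": "sample_job",
  "indices": ["my-index"],
  "query_delay": "90s", <1>
  "delayed_data_check_config": {
    "enabled": true,
    "check_window": "2h" <2>
  }
}
----
// TEST[skip:needs-licence]

<1> Documents that arrive within the query delay are still picked up by the
normal searches.
<2> The check looks back over this window for documents that were indexed after
the corresponding buckets were analyzed.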

-==== What to do about delayed data?
+== What to do about delayed data?

The most common course of action is simply to do nothing. For many functions
and situations, ignoring the data is acceptable. However, if the amount of
88 changes: 0 additions & 88 deletions docs/reference/ml/anomaly-detection/stopping-ml.asciidoc

This file was deleted.

2 changes: 1 addition & 1 deletion docs/reference/ml/ml-shared.asciidoc
@@ -1120,7 +1120,7 @@ tag::over-field-name[]
The field used to split the data. In particular, this property is used for
analyzing the splits with respect to the history of all splits. It is used for
finding unusual values in the population of all splits. For more information,
-see {ml-docs}/ml-configuring-pop.html[Performing population analysis].
+see {ml-docs}/ml-configuring-populations.html[Performing population analysis].
end::over-field-name[]
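
As a brief illustration of an `over_field_name` split (the job and field names
are assumptions), a population job compares each entity against the behavior of
all entities:

[source,console]
----
PUT _ml/anomaly_detectors/population_sketch
{
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [
      {
        "function": "mean",
        "field_name": "bytes",
        "over_field_name": "clientip" <1>
      }
    ]
  },
  "data_description": {
    "time_field": "timestamp"
  }
}
----
// TEST[skip:needs-licence]

<1> Each `clientip` is modeled against the population of all client IPs rather
than against its own history.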

tag::partition-field-name[]
