docs: Port Conduit documentation from Indexer repo. (#9)
winder authored Mar 14, 2023
1 parent dfab75d commit 79b8292
Showing 16 changed files with 838 additions and 0 deletions.
60 changes: 60 additions & 0 deletions README.md
@@ -0,0 +1,60 @@
<div style="text-align:center" align="center">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="./docs/assets/algorand_logo_mark_white.svg">
<source media="(prefers-color-scheme: light)" srcset="./docs/assets/algorand_logo_mark_black.svg">
<img src="./docs/assets/algorand_logo_mark_black.svg" alt="Algorand" width="400">
</picture>

[![CircleCI](https://img.shields.io/circleci/build/github/algorand/indexer/develop?label=develop)](https://circleci.com/gh/algorand/indexer/tree/develop)
[![CircleCI](https://img.shields.io/circleci/build/github/algorand/indexer/master?label=master)](https://circleci.com/gh/algorand/indexer/tree/master)
![Github](https://img.shields.io/github/license/algorand/indexer)
[![Contribute](https://img.shields.io/badge/contributor-guide-blue?logo=github)](https://github.com/algorand/go-algorand/blob/master/CONTRIBUTING.md)
</div>

# Algorand Conduit

Conduit is a framework for ingesting blocks from the Algorand blockchain into external applications. It is designed as a modular plugin system that allows users to configure their own data pipelines for filtering, aggregation, and storage of transactions and accounts on any Algorand network.

# Getting Started

See the [Getting Started](./docs/GettingStarted.md) page.

## Building from source

Development is done using the [Go Programming Language](https://golang.org/); the version is specified in the project's [go.mod](go.mod) file. This document assumes that you have a functioning
environment set up. If you need assistance setting up an environment, please visit
the [official Go documentation website](https://golang.org/doc/).

Run `make` to build Conduit; the binary is located at `cmd/conduit/conduit`.

# Configuration

See the [Configuration](./docs/Configuration.md) page.

# Development

See the [Development](./docs/Development.md) page for building a plugin.

# Plugin System
A Conduit pipeline is composed of three components: [Importers](./conduit/plugins/importers/), [Processors](./conduit/plugins/processors/), and [Exporters](./conduit/plugins/exporters/).
Every pipeline must define exactly one Importer, exactly one Exporter, and zero or more Processors.

# Contributing

Contributions are welcome! Please refer to our [CONTRIBUTING](https://github.com/algorand/go-algorand/blob/master/CONTRIBUTING.md) document for general contribution guidelines, and individual plugin documentation for contributing to new and existing Conduit plugins.

# Common Setups

The most common usage of Conduit is to get validated blocks from a local `algod` Algorand node and add them to a database (such as [PostgreSQL](https://www.postgresql.org/)).
Users can separately (outside of Conduit) serve that data via an API to make a variety of prepared queries available; this is what the Algorand Indexer does.

Conduit works by fetching blocks one at a time via the configured Importer, sending the block data through the configured Processors, and terminating block handling via an Exporter (traditionally a database).
For a step-by-step walkthrough of a basic Conduit setup, see [Writing Blocks To Files](./docs/tutorials/WritingBlocksToFile.md).
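As an illustrative sketch of such a setup, a configuration could look roughly like the following. The plugin names are real, but the node address, token, and connection string are placeholders, and exact option names may differ; see each plugin's documentation:

```yaml
# Hypothetical conduit.yml: algod importer feeding a postgresql exporter.
importer:
  name: algod
  config:
    netaddr: "http://localhost:8080"   # placeholder node address
    token: "<algod api token>"         # placeholder token

# No processors: blocks flow straight from importer to exporter.
processors: []

exporter:
  name: postgresql
  config:
    connection-string: "host=localhost user=conduit dbname=conduit"  # placeholder
```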

# Migrating from Indexer

Indexer was built in a way that strongly coupled it to PostgreSQL and the defined REST API. We've built Conduit in a way that is backward compatible with the preexisting Indexer application. Running the `algorand-indexer` binary will use Conduit to construct a pipeline that replicates the Indexer functionality.

Going forward we will continue to maintain the Indexer application, but our main focus will be enabling and optimizing a multitude of use cases through the Conduit pipeline design rather than the singular Indexer pipeline.

For a more detailed look at the differences between Conduit and Indexer, see [our migration guide](./docs/tutorials/IndexerMigration.md).
54 changes: 54 additions & 0 deletions docs/Configuration.md
@@ -0,0 +1,54 @@
# Configuration

Configuration is stored in a file in the data directory named `conduit.yml`.
Use `./conduit -h` for command options.

## conduit.yml

There are several top-level configurations for controlling the behavior of the conduit process. Most detailed configuration is made on a per-plugin basis, split between the `Importer`, `Processor`, and `Exporter` plugins.

Here is an example configuration which shows the general format:
```yaml
# optional: hide the startup banner.
hide-banner: true|false

# optional: level to use for logging.
log-level: "INFO, WARN, ERROR"

# optional: path to log file.
log-file: "<path>"

# optional: if present, perform runtime profiling and put results in this file.
cpu-profile: "<path to cpu profile file>"

# optional: maintain a pid file for the life of the conduit process.
pid-filepath: "<path to pid file>"

# optional: settings to turn on the Prometheus metrics server.
metrics:
  mode: "ON, OFF"
  addr: ":<server-port>"
  prefix: "prometheus_metric_prefix"

# Define one importer.
importer:
  name:
  config:

# Define one or more processors.
processors:
  - name:
    config:
  - name:
    config:

# Define one exporter.
exporter:
  name:
  config:
```
## Plugin configuration
See [plugin list](plugins/home.md) for details.
Each plugin is identified by a `name`, and provided the `config` during initialization.
80 changes: 80 additions & 0 deletions docs/Development.md
@@ -0,0 +1,80 @@
# Creating A Plugin

There are three different interfaces to implement, depending on what sort of functionality you are adding:
* Importer: for sourcing data into the system.
* Processor: for manipulating data as it goes through the system.
* Exporter: for sending processed data somewhere.

All plugins should be implemented in the respective `importers`, `processors`, or `exporters` package.

# Registering a plugin

## Register the Constructor

The constructor is registered to the system by name in the package's `init` function; this is how the configuration is able to dynamically create pipelines:
```go
func init() {
	exporters.RegisterExporter(noopExporterMetadata.ExpName, exporters.ExporterConstructorFunc(func() exporters.Exporter {
		return &noopExporter{}
	}))
}
```

There are similar interfaces for each plugin type.

## Load the Plugin

Each plugin package contains an `all.go` file. Add your plugin to the import statement; this causes the `init` function to be called and ensures the plugin is registered.
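As a self-contained sketch of this registration pattern (the types and function names below are illustrative, not the real Conduit API):

```go
package main

import "fmt"

// Exporter and the registry below mimic the pattern Conduit uses;
// the real interfaces live in the conduit/plugins packages.
type Exporter interface {
	Receive(round uint64) error
}

// ExporterConstructor builds a fresh plugin instance.
type ExporterConstructor func() Exporter

var registry = map[string]ExporterConstructor{}

// RegisterExporter is called from each plugin's init function.
func RegisterExporter(name string, c ExporterConstructor) {
	registry[name] = c
}

type noopExporter struct{}

func (e *noopExporter) Receive(round uint64) error { return nil }

func init() {
	RegisterExporter("noop", func() Exporter { return &noopExporter{} })
}

func main() {
	// The pipeline builder looks up the plugin named in conduit.yml.
	constructor, ok := registry["noop"]
	fmt.Println(ok)
	exp := constructor()
	fmt.Println(exp.Receive(1))
}
```

Because registration happens in `init`, simply importing the plugin package (as the `all.go` files do) is enough to make it available by name.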

# Implement the interface

Generally speaking, you can follow the code in one of the existing plugins.

# Lifecycle

## Init

Each plugin will have its `Init` function called once as the pipeline is constructed.

The context provided to this function should be saved, and used to terminate any long-running operations if necessary.

## Per-round function

Each plugin type has a function which is called once per round:
* Importer: `GetBlock` is called when a particular round is required. Generally the requested round will increase over time.
* Processor: `Process` is called to process a round.
* Exporter: `Receive` is called to consume a round.

## Close

Called during a graceful shutdown. We make every effort to call this function, but it is not guaranteed.

## Hooks

There are special lifecycle hooks that can be registered on any plugin by implementing additional interfaces.

### Completed

When all processing has completed for a round, the `OnComplete` function is called on any plugin that implements it.

```go
// Completed is called by the conduit pipeline after every exporter has
// finished. It can be used for things like finalizing state.
type Completed interface {
	// OnComplete will be called by the Conduit framework when the pipeline
	// finishes processing a round.
	OnComplete(input data.BlockData) error
}
```

### PluginMetrics

After the pipeline has been initialized, and before it has been started, plugins may provide Prometheus metric handlers. The subsystem is a configurable value that should be passed into the Prometheus metric constructors.
The `ProvideMetrics` function will only be called once.

```go
// PluginMetrics is for defining plugin-specific metrics.
type PluginMetrics interface {
	ProvideMetrics(subsystem string) []prometheus.Collector
}
```
42 changes: 42 additions & 0 deletions docs/GettingStarted.md
@@ -0,0 +1,42 @@
# Getting Started


## Installation

### Install from Source

1. Check out the repo, or download the source: `git clone https://github.com/algorand/indexer.git && cd indexer`
2. Run `make conduit`.
3. The binary is created at `cmd/conduit/conduit`.

### Go Install

`go install` of the indexer repo does not currently work because it uses the `replace` directive to support the
go-algorand submodule.

**In Progress**
There is ongoing work to remove go-algorand entirely as a dependency of indexer/conduit. Once
that work is complete, users should be able to use `go install` to install binaries for this project.

## Getting Started

Conduit requires a configuration file to set up and run a data pipeline. To generate an initial skeleton for a conduit
config file, you can run `./conduit init`. This will set up a sample data directory with a config located at
`data/conduit.yml`.

You will need to manually edit the data in the config file, filling in a valid configuration for conduit to run.
You can find a valid config file in [Configuration.md](Configuration.md) or via the `conduit init` command.

Once you have a valid config file in a directory, `config_directory`, launch conduit with `./conduit -d config_directory`.


# Configuration and Plugins
Conduit comes with an initial set of plugins available for use in pipelines. For more information on the possible
plugins and how to include these plugins in your pipeline's configuration file see [Configuration.md](Configuration.md).

# Tutorials
For more detailed guides, walkthroughs, and step-by-step writeups, take a look at some of our
[Conduit tutorials](./tutorials). Here are a few of the highlights:
* [How to write block data to the filesystem](./tutorials/WritingBlocksToFile.md)
* [A deep dive into the filter processor](./tutorials/FilterDeepDive.md)
* [The differences and migration paths between Indexer & Conduit](./tutorials/IndexerMigration.md)
1 change: 1 addition & 0 deletions docs/assets/algorand_logo_mark_black.svg
1 change: 1 addition & 0 deletions docs/assets/algorand_logo_mark_white.svg
16 changes: 16 additions & 0 deletions docs/plugins/algod.md
@@ -0,0 +1,16 @@
# Algod Importer

Fetch blocks one by one from the [algod REST API](https://developer.algorand.org/docs/rest-apis/algod/v2/). The node must be configured as an archival node in order to
provide old blocks.

Block data from the Algod REST API contains the block header, transactions, and a vote certificate.

# Config
```yaml
importer:
  name: algod
  config:
    netaddr: "algod URL"
    token: "algod REST API token"
```
20 changes: 20 additions & 0 deletions docs/plugins/file_writer.md
@@ -0,0 +1,20 @@
# Filewriter Exporter

Write the block data to a file.

Data is written to one file per block in JSON format.

By default data is written to the filewriter plugin directory inside the indexer data directory.

# Config
```yaml
exporter:
  name: file_writer
  config:
    # override the default block data location.
    block-dir: "override default block data location."
    # override the filename pattern.
    filename-pattern: "%[1]d_block.json"
    # exclude the vote certificate from the file.
    drop-certificate: false
```
66 changes: 66 additions & 0 deletions docs/plugins/filter_processor.md
@@ -0,0 +1,66 @@
# Filter Processor

This is used to filter transactions to include only the ones that you want. This may be useful for some deployments
which only require specific applications or accounts.

## any / all
One or more top-level operations should be provided.
* `any`: transactions are included if they match any of the nested sub-expressions.
* `all`: transactions are included if they match all of the nested sub-expressions.

If `any` and `all` are both provided, the transaction must pass both checks.

## Sub expressions

Parts of an expression:
* `tag`: the transaction field being considered.
* `expression-type`: the type of expression.
* `expression`: the input to the expression.

### tag
The full path to a given field. Uses the MessagePack-encoded names of a canonical transaction. For example:
* `txn.snd` is the sender.
* `txn.amt` is the amount.

For information about the structure of transactions, refer to the [Transaction Structure](https://developer.algorand.org/docs/get-details/transactions/) documentation. For detail about individual fields, refer to the [Transaction Reference](https://developer.algorand.org/docs/get-details/transactions/transactions/) documentation.

**Note**: The "Apply Data" information is also available for filtering. These fields are not well documented. Advanced users can inspect raw transactions returned by algod to see what fields are available.

### expression-type

What type of expression to use for filtering the tag.
* `exact`: exact match for string values.
* `regex`: applies regex rules to the matching.
* `less-than`: applies a numerical less-than expression.
* `less-than-equal`: applies a numerical less-than-or-equal expression.
* `greater-than`: applies a numerical greater-than expression.
* `greater-than-equal`: applies a numerical greater-than-or-equal expression.
* `equal`: applies a numerical equality expression.
* `not-equal`: applies a numerical inequality expression.

### expression

The input to the expression. A number or string depending on the expression type.
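Putting these parts together, a filter that keeps payments over 1 Algo (1,000,000 microAlgos) from a particular sender might look like this sketch, where the sender address is a placeholder:

```yaml
processors:
  - name: filter_processor
    config:
      filters:
        - all:
          - tag: txn.snd
            expression-type: exact
            expression: "SENDER_ADDRESS"
          - tag: txn.amt
            expression-type: greater-than
            expression: 1000000
```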

# Config
```yaml
processors:
  - name: filter_processor
    config:
      filters:
        - any:
          - tag:
            expression-type:
            expression:
          - tag:
            expression-type:
            expression:
        - all:
          - tag:
            expression-type:
            expression:
          - tag:
            expression-type:
            expression:
```
18 changes: 18 additions & 0 deletions docs/plugins/home.md
@@ -0,0 +1,18 @@
# Plugin Configuration

Each plugin is identified by a `name`, and provided the `config` during initialization.

## Importers

* [algod](algod.md)
* [file_reader](file_reader.md)

## Processors
* [filter_processor](filter_processor.md)
* [noop_processor](noop_processor.md)

## Exporters
* [file_writer](file_writer.md)
* [postgresql](postgresql.md)
* [noop_exporter](noop_exporter.md)

11 changes: 11 additions & 0 deletions docs/plugins/noop_exporter.md
@@ -0,0 +1,11 @@
# Noop Exporter

For testing purposes, the noop exporter discards any data it receives.

# Config
```yaml
exporter:
  name: noop
  config:
```
11 changes: 11 additions & 0 deletions docs/plugins/noop_processor.md
@@ -0,0 +1,11 @@
# Noop Processor

For testing purposes, the noop processor simply passes the input to the output.

# Config
```yaml
processors:
  - name: noop
    config:
```