RFC: Add exporter interface RFC #1061

Merged

Eric-Warehime merged 19 commits into algorand:develop from Eric-Warehime:rfc-0001

Jul 8, 2022

Contributor

Eric-Warehime commented Jun 23, 2022

Summary

Add an RFC for the initial exporter interface definition. Please add general comments to the thread, and specific comments inline.


          Add exporter interface RFC

091651e

Eric-Warehime requested review from winder, chaihoang, shiqizng, AlgoStephenAkiki and algoganesh

June 23, 2022 23:02

codecov bot commented Jun 23, 2022 (edited)

Codecov Report

Merging #1061 (f657a2a) into develop (8501907) will increase coverage by 0.17%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           develop    #1061      +/-   ##
===========================================
+ Coverage    59.54%   59.72%   +0.17%     
===========================================
  Files           45       48       +3     
  Lines         8353     8390      +37     
===========================================
+ Hits          4974     5011      +37     
  Misses        2920     2920              
  Partials       459      459

Impacted Files	Coverage Δ
config/config.go	`0.00% <ø> (ø)`
exporters/exporter.go	`100.00% <100.00%> (ø)`
exporters/exporter_factory.go	`100.00% <100.00%> (ø)`
exporters/noop/noop_exporter.go	`100.00% <100.00%> (ø)`

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 8501907...f657a2a.

winder reviewed

docs/rfc/0001-exporter-interface.md Outdated

Comment on lines 75 to 76

		// Round returns the next round to be processed. Atomically updated when Recv successfully completes.
		Round() uint64

Contributor

winder Jun 24, 2022

This is worded a little strongly. It suggests that the exporter could choose to re-process old rounds. That may be a feature we want to support eventually, but I'm not sure it's part of v1.

Suggested change

-		// Round returns the next round to be processed. Atomically updated when Recv successfully completes.
-		Round() uint64
+		// Round returns the next round that this exporter has not yet successfully processed.
+		Round() uint64

Contributor Author

Eric-Warehime Jun 24, 2022

That's a very good point. We don't necessarily want to commit to adding re-processing of rounds, but I do think it's important to ensure that this is atomically updated... i.e. Round will be guaranteed to stay in sync w/ the exported data.
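
As a rough illustration of the guarantee discussed here, the sketch below advances the round counter in the same critical section that records the exported data, so `Round` cannot drift from what was actually exported. The `memExporter` and `BlockData` types are hypothetical and exist only for this example; the interface in the RFC may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// BlockData is a hypothetical stand-in for the exported block payload.
type BlockData struct {
	Round uint64
}

// memExporter is an illustrative exporter that keeps exported rounds in memory.
type memExporter struct {
	mu       sync.Mutex
	round    uint64   // next round not yet successfully processed
	exported []uint64 // rounds written so far
}

// Recv stores the block and advances the round under a single lock, so a
// caller never observes exported data and Round out of sync.
func (e *memExporter) Recv(b BlockData) error {
	e.mu.Lock()
	defer e.mu.Unlock()
	if b.Round != e.round {
		// Nothing is written and the round is left unchanged on failure.
		return fmt.Errorf("unexpected round %d, want %d", b.Round, e.round)
	}
	e.exported = append(e.exported, b.Round)
	e.round++
	return nil
}

// Round returns the next round that this exporter has not yet successfully processed.
func (e *memExporter) Round() uint64 {
	e.mu.Lock()
	defer e.mu.Unlock()
	return e.round
}

func main() {
	e := &memExporter{}
	fmt.Println(e.Recv(BlockData{Round: 0}), e.Round())
	fmt.Println(e.Recv(BlockData{Round: 5}), e.Round())
}
```

A real exporter backed by a database would more likely get the same effect by writing the data and the round in one transaction; the mutex here only models that idea in memory.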

winder reviewed

docs/rfc/0001-exporter-interface.md Outdated

Comment on lines 81 to 85

+              * A config file that defines all parameters that can be supplied to the plugin, and which provides the default values
+              that will be used for each parameter. A toml file stored inside the Indexer data directory stores all of the config data
+              for a given plugin. The Indexer will use the type specified in the Indexer config to look for a plugin config which
+              satisfies that class, and then load that as the selected plugin. Supplying multiple plugin configs for the selected type
+              of exporter will result in undefined behavior--a random config will be chosen.

Contributor

winder Jun 24, 2022

TOML: I don't have a strong opinion about these formats, but we already use YML, so I'd opt to stick with it.
Undefined behavior: let's generate an error, or support multiple instances of the same plugin.
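
One way the "generate an error" option could look is sketched below: config resolution returns an error when more than one plugin config matches the selected exporter type, instead of picking one arbitrarily. `PluginConfig`, `resolveConfig`, and the file names are assumptions made for this example, not part of the RFC.

```go
package main

import (
	"errors"
	"fmt"
)

// PluginConfig is a hypothetical parsed plugin config file.
type PluginConfig struct {
	Type string // exporter type the config applies to
	Path string // file the config was loaded from
}

var (
	errNoConfig        = errors.New("no plugin config found")
	errAmbiguousConfig = errors.New("multiple plugin configs found")
)

// resolveConfig returns the single config matching exporterType. It fails
// loudly when zero or several configs match, rather than choosing one.
func resolveConfig(configs []PluginConfig, exporterType string) (PluginConfig, error) {
	var matches []PluginConfig
	for _, c := range configs {
		if c.Type == exporterType {
			matches = append(matches, c)
		}
	}
	switch len(matches) {
	case 0:
		return PluginConfig{}, fmt.Errorf("%w for exporter type %q", errNoConfig, exporterType)
	case 1:
		return matches[0], nil
	default:
		return PluginConfig{}, fmt.Errorf("%w for exporter type %q (%d files)",
			errAmbiguousConfig, exporterType, len(matches))
	}
}

func main() {
	configs := []PluginConfig{
		{Type: "postgresql", Path: "a.yml"},
		{Type: "postgresql", Path: "b.yml"},
		{Type: "noop", Path: "noop.yml"},
	}
	_, err := resolveConfig(configs, "postgresql")
	fmt.Println(err)
	cfg, err := resolveConfig(configs, "noop")
	fmt.Println(cfg.Path, err)
}
```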

winder reviewed

docs/rfc/0001-exporter-interface.md Outdated

winder reviewed

docs/rfc/0001-exporter-interface.md

Comment on lines +99 to +104

+              The Indexer's config will need to be changed to incorporate plugins. Future RFCs can decide how to string multiple
+              pipelines and plugins together in interesting ways. For now, we will have a single pipeline per indexer--i.e. running the
+              indexer will only run a single data pipeline and therefore have a single exporter. In that way the initial changes will
+              be easy to make--one new field in the config which provides an internal plugin name. Config files for the selected
+              plugin will then be resolved and parsed at runtime. The default configuration for Indexer will select the Postgresql
+              plugin in order to maintain backwards compatibility with the existing Indexer pipeline.

Contributor

winder Jun 24, 2022

I really like this idea, nice way to phase in plugins. Maybe we can call this setting "exporter override"?
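
A minimal sketch of the single new config field, assuming it is named after the "exporter override" suggestion and falls back to Postgresql when unset. `IndexerConfig` and `ExporterName` are hypothetical names used only for illustration.

```go
package main

import "fmt"

// defaultExporter preserves the existing Postgresql pipeline when no
// exporter is named in the config.
const defaultExporter = "postgresql"

// IndexerConfig is a hypothetical fragment of the Indexer config; only the
// new field is shown.
type IndexerConfig struct {
	ExporterOverride string
}

// ExporterName returns the configured plugin name, or the default.
func (c IndexerConfig) ExporterName() string {
	if c.ExporterOverride == "" {
		return defaultExporter
	}
	return c.ExporterOverride
}

func main() {
	fmt.Println(IndexerConfig{}.ExporterName())
	fmt.Println(IndexerConfig{ExporterOverride: "noop"}.ExporterName())
}
```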

winder reviewed

docs/rfc/0001-exporter-interface.md

Comment on lines +119 to +126

+              Though not being defined here, we can imagine that intermediate plugins in our execution framework might perform common
+              data operations such as aggregation, filtering, annotation, etc. The result of some of these operations will be a subset
+              of full block data, while others may result in block data which does not conform to the block specification at all.
+              In order to accommodate customization of the data, we will provide multiple data formats--the standard Block interface
+              used today, as well as more generic forms that plugins can deserialize based on their needs. A given plugin will need to
+              specify the data format it expects for both input data and output if it provides any (exporters obviously don't have
+              output data in this sense). The system will ensure that the constructed pipeline has compatible data formats during
+              initialization.

Contributor

winder Jun 24, 2022

I think it would be good to define an initial format, and extend it in the future.

Today, we have "Block" and "StateDelta". We're missing "Certificate", but we could add that in as well. I think they would all be nullable.

Contributor Author

Eric-Warehime Jun 24, 2022

Noted. I will add an example here.
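
For discussion purposes, an initial format with nullable parts could be modeled with pointer fields, as sketched below. The `Block`, `StateDelta`, and `Certificate` types here are empty placeholders, and `ExportData` is a hypothetical name; the definition that ends up in the RFC may look different.

```go
package main

import "fmt"

// Placeholder payload types; the real ones would come from the Algorand
// libraries and are not reproduced here.
type Block struct{ Round uint64 }
type StateDelta struct{ Accounts int }
type Certificate struct{ Votes int }

// ExportData is a hypothetical initial data format. Every field is a
// pointer so that a plugin can leave any part unset (nil).
type ExportData struct {
	Block       *Block
	Delta       *StateDelta
	Certificate *Certificate
}

// describe lists which parts of the export data are present.
func describe(d ExportData) []string {
	var parts []string
	if d.Block != nil {
		parts = append(parts, "block")
	}
	if d.Delta != nil {
		parts = append(parts, "delta")
	}
	if d.Certificate != nil {
		parts = append(parts, "certificate")
	}
	return parts
}

func main() {
	d := ExportData{Block: &Block{Round: 7}}
	fmt.Println(describe(d))
}
```

Pointer fields keep the format extensible: adding a new optional part later does not break plugins that ignore it.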


          Update docs/rfc/0001-exporter-interface.md

484e6ae

Co-authored-by: Will Winder

AlgoStephenAkiki reviewed

docs/rfc/0001-exporter-interface.md

+                - Telegraf uses stdin/stdout to communicate w/ processes. Kafka is mostly built around APIs, similarly things like
+              docker/linkerd/containerd use HTTP endpoints to standardize communication (though they mostly use this for config data
+              instead of streaming data).
+                - I support using sockets--very standard and well known programming interface, and has useful libraries built

Contributor

AlgoStephenAkiki Jun 24, 2022

Sockets are a good cross-platform way of communication. You'll need some way of serializing, even if it is something simple like a type byte, 4 length bytes and the payload.

Contributor Author

Eric-Warehime Jul 7, 2022

I've left this in for now. I think the socket libraries are going to be easy to integrate into a cross-process communication framework, but obviously that needs to be given more detail and thought, and it will evolve a bit as we develop.

We can always add another RFC for this and keep this one for historical purposes, or decide that we want to update old RFCs to reflect the actual state of things in the future.

algobarb and others added 2 commits

June 28, 2022 08:16


          DevOps: Add labels to Github Actions PR label check (algorand#1060)

bba0f2e


          Bump version to 2.12.1

9718fcb

Eric-Warehime mentioned this pull request

RFC-0001: Rfc 0001 impl #1069

Merged


          Update to rfc

c504900

fionnachan reviewed

docs/rfc/0001-exporter-interface.md Outdated

Eric-Warehime added 4 commits

June 29, 2022 08:38


          Add language syntax keywords

7eb4f79


          Fix PR link

7c1a846


          Update interface

11abcd2


          Add BlockExportData definition

27bbdd4

Eric-Warehime requested a review from winder

July 5, 2022 16:58

winder reviewed

docs/rfc/0001-exporter-interface.md Outdated

Comment on lines 60 to 61

		// Name is a UID for each Exporter
		Name() string

Contributor

winder Jul 6, 2022

There are some mixed tabs and spaces.

Contributor Author

Eric-Warehime Jul 6, 2022

This should be replaced by a Metadata object containing name, description, deprecation status as we discussed.
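
A Metadata object along these lines might look like the sketch below. The type name, field names, and the example exporter are all assumptions; only the three pieces of information (name, description, deprecation status) come from the comment above.

```go
package main

import "fmt"

// ExporterMetadata is a hypothetical replacement for the Name() method.
type ExporterMetadata struct {
	Name        string // unique identifier for the exporter
	Description string // short human-readable summary
	Deprecated  bool   // true if the exporter should no longer be used
}

// Exporter shows only the metadata accessor; other methods are omitted.
type Exporter interface {
	Metadata() ExporterMetadata
}

// exampleExporter is an illustrative implementation.
type exampleExporter struct{}

func (exampleExporter) Metadata() ExporterMetadata {
	return ExporterMetadata{
		Name:        "example",
		Description: "Illustrative exporter that does nothing",
		Deprecated:  false,
	}
}

func main() {
	var e Exporter = exampleExporter{}
	m := e.Metadata()
	fmt.Printf("%s: %s (deprecated: %t)\n", m.Name, m.Description, m.Deprecated)
}
```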

winder reviewed

docs/rfc/0001-exporter-interface.md Outdated

+                  // Connect will be called during initialization, before block data starts going through the pipeline.
+                  // Typically used for things like initializing network connections.
+                  // The ExporterConfig passed to Connect will contain the unmarshalled config file specific to this plugin.
+                  // Should return an error if it fails--this will result in the Indexer process terminating.

Contributor

winder Jul 6, 2022

"return an error" is self evident, I'm not sure you need the last line on some of these. What happens when an error is received is a property of the framework so I think we'll have more thoughts on these comments later.

Contributor Author

Eric-Warehime Jul 6, 2022

Sure. In most cases I've been attempting to be as verbose as possible just for the purposes of discussion/expectations.

AlgoStephenAkiki and others added 4 commits

July 7, 2022 09:28


          Bug Fix: Fix auto-loading search (algorand#1065)

f75a083


          update ubuntu from 18.04 to 20.04 (algorand#1072)

d2e3a6d


          Bump version to 2.12.2

d3343d2


          Revert: "Update ubuntu from 18.04 to 20.04 (algorand#1072)" (algorand#1080)

00da043

This reverts commit 03b140c.

algojack and others added 6 commits

July 7, 2022 09:28


          Github-Actions: Updating pr label check (algorand#1078)

26ae629


          Bump version to 2.12.3-rc1

815271f


          Bump version to 2.12.3-rc2

c211a20


          Bump version to 2.12.3

3a63a47


          RFC-0001: Rfc 0001 impl (algorand#1069)

e923b9c

Adds an Exporter interface and a noop exporter implementation with factory methods for construction


          Update interface

f657a2a

Eric-Warehime added the documentation label

Eric-Warehime requested a review from winder

July 7, 2022 16:37

Eric-Warehime added the Not-Yet-Enabled label

winder approved these changes

Eric-Warehime merged commit fddfde6 into algorand:develop

Eric-Warehime deleted the rfc-0001 branch

July 8, 2022 18:27

algobarb mentioned this pull request

FOR REVIEW ONLY: indexer 2.13.0 into master #1137

Merged

Reviewers

AlgoStephenAkiki left review comments

fionnachan left review comments

winder approved these changes

Awaiting requested review from chaihoang

Awaiting requested review from shiqizng

Awaiting requested review from algoganesh

Labels

documentation Not-Yet-Enabled