Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use the error-free msgp encoding #1881

Merged
merged 1 commit into from
Feb 1, 2021
Merged

Conversation

zeldovich
Copy link
Contributor

This PR uses msgp version 1.1.47, which eliminates the possibility of MarshalMsg returning an error.

Copy link
Contributor

@tsachiherman tsachiherman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good.

@tsachiherman tsachiherman merged commit df23283 into algorand:master Feb 1, 2021
tsachiherman added a commit that referenced this pull request Apr 19, 2021
* Eval prefetch (#1816)

Improve speed of BlockEvaluator eval() by pre-fetching account data, decreasing latency at various points during block accounting.

Co-authored-by: Tsachi Herman <[email protected]>

* Delete buildnumber.dat and genesistimestamp.dat files.

* Added Repeat configuration flag to support single invocation. (#1867)

Add configuration flag to loadgenerator utility to specify if it it should repeat after the first round.

* fix few formatting errors. (#1873)

Fix few minor log string formatting

* implement peer selector logic for catchup (#1865)

In order to support pseudo archival relays, we need to provide two-tier approach to the selection of peers in the catchup logic. This PR adds the `peerSelector` struct, which encapsulates this logic.

It also adds an important performance feedback loop - the result of the request is being fed back into the subsequent peer selection. A faster peer would be be favorable over a slower peer. A peer that failed providing a past block, would drop to the bottom of the list.

* Fix tealdbg Accounts array (#1872)

Currently the `Accounts` array is not displayed properly in tealdbg. The debugger does not account for the special case that index 0 of the `Accounts` array is always the sender. This means that if the ForeignAccounts txn field contains `n` accounts, the `Accounts` array in tealdbg contains the sender's account followed by `n-1` of the ForeignAccounts. The last account never gets displayed.

This PR fixes this problem by making tealdbg aware that the `Accounts` array has 1 more member (the sender's account) than `len(txn.Accounts)`. As a consequence of this, the sender's account will always appear at index 0 of the `Accounts` array in the debugger, even when the transaction's ForeignAccounts array is empty. This aligns with the behavior of the TEAL op `txna Accounts 0` always returning the sender's address, even in stateless TEAL.

* use the error-free msgp encoding (#1881)

Switch to msgp 47, which eliminates the possibility of MarshalMsg returning an error.

* Remove jq from check golang version script to fix deploy jobs in Travis (#1877)

A change to use go tooling to get go version from go.mod used jq to parse the results. This breaks on platforms that don't have jq. This change just removes the dependency (and replaces it with the more accessible awk).

* faster computation of transaction ID and length (#1880)

This change optimizes ID() and GetEncodedLength() for Transaction, SignedTxn, and SignedTxnInBlock.  There are two sources of overhead that this change avoids: interface conversions (to msgp.Marshaler) and memory allocations (for temporary slices to hold the encoding).

To avoid interface conversions, the code invokes the msgp-generated MarshalMsg() method directly.

To avoid memory allocation overhead, the code uses a sync.Pool to track existing dirty slices that can be used for encoding.  This seems to be more efficient than letting the Go GC manage slices, probably because the encoder is OK with dirty (non-zero) slices.

* speed up merklearray (#1882)

This PR speeds up merklearray, both with some small absolute performance wins (avoiding crypto.HashObj overheads), as well as parallelism in hashing the leaves and the internal nodes.

For tiny trees (10 elements where each element is something like 10-100 bytes), the parallelism adds a slight overhead (about 30-40 microseconds on my test machine, on top of a total runtime of 25-80 microseconds).  But for anything larger (either more elements or more complex-to-hash elements), this ends up being an improvement.

This might have some benefit for the current use of merklearray (namely, compact certificates), but should be even more useful for the upcoming use of merklearray for hashing all transactions in a block.

* Optimize LimitedReaderSlurper memory utilization (#1876)

The memory profiler shown that the `LimitedReaderSlurper` is one of the bigger memory consumer on a relay.
This behavior aligns well with the original intent - preallocate memory so that we won't need to reallocate it on every message. The unintended consequence of that was that a relay that has many incoming active connections might be using more memory than it really needs to.

To address this solution, I have used series of reading buffers. These buffers would be dynamically allocated on per-need basis. In order to retain original performance characteristics, I have set the size of the base buffer to be larger than the average message. Doing so would allow us to avoid allocation on *most* of the messages, and allocate buffers only when truly needed.

Another advantage is that the allocated memory is of a fixed, and small size. Allocating smaller buffers improves the performance when compared with larger buffers.

* Add asset close to amount field to transaction  (#1886)

This PR adds `AssetClosingAmount` to the `ApplyData`, conditioned by a new `ConsensusParam.EnableAssetCloseAmount`. A corresponding `AssetClosingAmount` was added to the REST API to align with the variable already presents on the indexer v2.

* Fix omitempty for AssetCloseAmount (#1891)

Adds "omitempty" to AssetCloseToAmount

* drop support for old-style Merkle txn root commitment (#1883)

This PR removes support for the Merkle transaction commitments that we used a long time ago (well before mainnet launch).

* systemd: run algod as a user service (#1776)

Add a user service for algod that is part of the tarball downloaded by updater. In addition, add functionality to update.sh to invoke systemctl with the --user flag to stop the user service.

Installing the service as usual has not changed:
sudo ./systemd-setup.sh $USER $GROUP

Installing as a user service:
./systemd-setup-user.sh $USER

* Create new consensus upgrade for AssetCloseAmount (#1888)

This PR creates a new consensus upgrade which enables the `EnableAssetCloseAmount` consensus parameter.

* goal: avoid creating invalid consensus.json file (#1895)

Using `goal network create` would create a network directories containing a `consensus.json` file. That file would contain the `null` string.

* ensure agreement service uses the correct protocol version for mainLoop (#1896)

The agreement mainLoop function was bootstrapping the Deadline with the parameters based on the `protocol.ConsensusCurrentVersion`. This might be incorrect when the binary supports protocols that haven't (yet) been adopted by the network.

* testing: put some unicode in an AsssetName to help test Indexer (#1899)

Improve the asset-misc e2e test by using a long unicode asset name.

* network: refactor broadcastThread (#1898)

The broadcastThread implementation was sub-optimal, and this PR addresses the issues we had there -
1. The broadcastThread is now using a single broadcasting go-routine rather than 4.
2. The broadcastThread/innerBroadcast used to drop queued messages just because there are no current peers. Instead, it will hold off until there is an available peer before queuing up the messages to the peers.
3. The peers array is being updated only if there is enough "wait time" between consecutive messages. In case of a high message burst, the peers would not get updated to avoid taking the peers lock.
4. The node's broadcast call was replaced with non-blocking to align with previous behavior.
5. The single thread implementation would ensure that queued messages would also be sent to the underlying peers at the order in which they were enqueued. This, in turn would ensure message id monotonicity.

* bugfix: prevent misleading agreement error message (#1900)

The agreement was printing out the following error message:
```
 (1) unable to retrieve consensus version for round 8764517, defaulting to binary consensus version
```

The culprit was a recent change to the [agreement initialization code](#1896) that would try to use the consensus version of the next round in order to find out some of the agreement runtime parameters.

This PR improves the data.Ledger.ConsensusVersion logic, allowing it to "predict" future protocol versions, based on known and previously agreed upon block headers.

* update e2e test (#1901)

Improve the reliability of the `TestApplicationsUpgradeOverGossip` e2e test by using a specific historical protocols and avoiding dynamically modifying current and/or future protocols.

* Remove data direcory from start/stop network expect tests. (#1902)

The expect tests were calling `goal network start -r <root dir> -d <data dir>` and `goal network stop -r <root dir> -d <data dir>`. While this is not harmful in any way, the neither of these commands is doing anything with the extra data directory parameter.

Given that this parameter is completely ignored, the is no point in passing it in. ( as it would only mislead the reader of these tests )

* Remove the network v1 fetcher via websocket connections. (#1903)

Remove the deprecated network v1 fetcher service over websocket connections.

* add a Merkle commitment to transactions in block header (#1879)

This will enable short proofs of transactions being present in a given
block.

The proof includes both the transaction ID (which is likely the most
useful part, knowing that some transaction ID was committed), as well as
the hash of the entire SignedTxn and ApplyData (in case some application
wants to know what LogicSig.Args were used, or what the ApplyData was
for rewards calculation).

The Merkle tree uses the same merklearray code that's already used for
compact certificates.

* Cleanup github templates. (#1894)

Attempt to make the GitHub integration a little friendlier.

1. There is a new links feature. Rather than using a comment in the questions template, give a link to the community forum.
2. There is a new security feature. Rather than using a comment in each template, give a link to the vulnerability disclosure page.
3. Use headers consistently instead of bold in some.
4. Comment out all descriptions, this way you don't have to delete anything when creating a new issue.

* Fix initial rewards rate calculation (#1908)

This is a leftover missed when implementing the `PendingResidueRewards` fix. The initial `RewardsRate` calculation also need to take into account the `MinBalance`, and avoid including these funds in the `RewardsRate`.

The workaround is to fund the rewards pool within the first 500,000 rounds.

* Deprecate support of reverse catchup protocol ( part 2 )  (#1914)

This PR is the second iteration of network v1 cleanup
- Moving RequestBlock from WsFetcherService to wsFetcherClient since RequestBlock is the only used function of WsFetcherService
- removing wsFetcherService which is no longer used. No messages are sent with the tags:
   - UniEnsBlockResTag  Tag = "US" 
   - UniCatchupResTag   Tag = "UT"

* Remove GOPATH dependenncies in integration tests (#1910)

Tests had been using GOPATH, but mostly just to find the top of the repo to
refer to tests. This breaks if doing module oriented stuff outside of
GOPATH. These changes make it so that e2e.sh uses the full name of the
e2e test files, and e2e_client_runner executes the e2e scripts with
cwd set to top of repo. That allowed removing the few places those
scripts used GOPATH, in favor of relative filenames.

* Handle #pragma errors more like all others (#1875)

This moves #pragma handling into the main loop of assembly, allowing
simpler error reporting.  Multiple errors that include pragma errors
and "normal" errors can both be reported.  Since this fixes #1874, I
want to get it into mainline, rather than waiting for the teal3
branch.

* Add security template (#1912)

* Deprecate support of reverse catchup protocol ( part 3 ) (#1916)

In this PR, network v1 cleanup 3rd iteration:
- Simplified requestBlock return
- Stopped the use of UniCatchupReqTag in the requests (to eventually eliminate it in the next version)
- Storing the tag in the fetcher structs and passing it via NewOverGossip is not needed, because the fetcher will not use only a single tag value, and have it hard coded in the request function.
   - Removed the tag parameter from NewOverGossip interface 
   - Removed the tag field from the structs WsFetcher and wsFetcherClient

* trim maps

* Clarify account status meaning in code-comment (#1921)

Clarify account status meaning in code-comment.

* move compression to finalValidation

* address comments

* fix bug

* change threshold

* try to preload all account data before evaluation.

* update

* Fix few minor typos (#1927)

Fix several typos.

* some more refactoring and error handling

* add fee sink

* reallocate based on memory savings

* format

* changes per reviewer feedback

* update per reviewer's requests.

* Address some of the reviewer's comments.

* use constants

* bug fix

* initial version

* fix datarace

* Integrate package signing into CI pipeline (#1931)

* Uncolorize mnemonic output on Windows

* Changed KMD directory safety check on Windows

* Improved process kill/existence check on Windows

* Add unit test

* Renamed file

* Avoid database timeout warning message + improved syncronous writings

* better implementation & faster test

* Add details on format of Transaction for kmd (#1949)

Update kmd in-code documentation.

See https://forum.algorand.org/t/php-algorand-sdk/2757/2?u=fabrice

* TEAL3 merge branch (#1946)

## Summary

This PR contains the changes needed to support TEAL 3:
1. Add consensus upgrade support for TEAL 3, Merkle root transaction hash and initial rewards calculation fix. This consensus upgrade would be combined with V25 so that mainnet would make a single upgrade from v24 -> v26.
2. Added support for the following new opcodes:
    * assert
    * min_balance
    * getbit
    * setbit
    * getbyte
    * setbyte
    * swap
    * select
    * dig
    * stxn
    * stxna
    * pushbytes
    * pushint

3. Added support for the following txn fields:
    * ForeignAssets
    * NumForeignAssets
    * ForeignApps
    * NumForeignApps
    * GlobalStateInts
    * GlobalStateByteslices
    * LocalStateInts
    * LocalStateByteslices
4. Added support for new global field: CreatorAddress

## Test Plan

Unit tests were updated.

* Fix parameter check in abort procedure

* Add appeal to kindness to dispenser to help prevent abuse. (#1953)

Add Appeal to kindness to dispenser HTML.
Expand the size of the address text field to 80 characters.

* Net tag stats (#1938)

* TagCounter for recording stats about network tags

* breaking up the block per review comment

* Split messages for ANSI and non-ANSI consoles

* Moved dir safety check logging to per-OS code

* Improved process tree kill in Windows

* Fixed wrong variable name

* Check several assembly/disassembly across versions

* Thinko, caught by Pavel

* Call ops.checkArgs() in asembly loop for consistency.

* clearer test, per code review

* Rewards unneeded for min balance calculation

* update per reviewer comments.

* Get opcodes from specs, not magic constants.

* updates per reviewer's request.

* Present costs more succintly when they repeat.

* allow struct field to be sync.Mutex if tagged `algofix:"allow sync.Mutex"`

* Use map literals to simplify life

* Typo-ish cleanups

* Add DisableLocalhostConnectionRateLimit to config

* add unit test

* Provide a fallback to use the default consensus parameters in case the agreement is running ahead of the ledger.

* Improve ConsensusVersion error output to include a correct Committed round.

* update per reviewer's comments.

* update comments

* Add dummy change to trigger github.

* make sanity

* Check compile and dryrun porgrams against consensus limits

* Fix random e2e test failuire

* GPATH-less running of expect tests

* Move messages into messages.go, add expect tests

* doc typo

* enum types for gtxns(a) op

* Optimize bloom filter memory allocations.

* Add unit test to confirm.

* Update tealdbg frontend for TEAL 3

* Add newline to pushbytes disassembly

* improve unit test performance

* tealdbg: listen on specified address

* update per review

* skip printing the logs when program is in terminated state.

* remove unneccessary lines

* Fixes for pending txn endpoint

* nil check fix
expand unit test

* fix

* bugfix

* Initial draft of universal fetcher impl.

* fixing lint

* simplifying the structures used by the fetcher

* Use the universal fetcher in fetcher service and fix the tests

* more cleanup

* Eliminate fetcher.go

* Removed the fetcher factories. Updated some of the catchup service tests.

* fix test failure

* test failure fix

* listExpiredParticipationKeyTest fail configuration

* Refactor peer selector and fetcher creation.

Based on review comments, pushing down these for more fine tuned configuration and to eliminate the reuse of the fetcher.

* minor changes

* Configuring the peer selections

* remove unneeded code

* updating error text

* remving the hard-coded version from the test

* Reverse the check for UnicastPeer HTTPPeer, remove network Ready()

* Revert Infof to Debugf

* Cleanup WebsocketNetwork Ready and eventualReady.
The functions are removed, and all supporting fields.
Tests that relied on them are updated to use alternative means for wating whenever necessary.

* Combine wsFetcher and httpFetcher with universalFetcher

* fix test race issue

* restore Ready and eventualReady in newtwork package

* stop pipelinedFetch if there are no peers. Message fixes

* Fix test failure. debug message if no peers found.

* Add feedback to peerselector. Update Catchpoint service fetcher time duration computation.

* remove unused argument

* Fix and resotre the catchup service tests

* Refactor createPeerSelector

* fix the benchmark test

* sync-cert RequestConnectOutgoing instead of quitting

* Simplify max check in loop

* optimize allocation.

* correct a bug and add a unit test.

* change

* copy GetPeerData()/SetPeerData() from txnsync branch

* comments. basic unit test of SetPeerData()/GetPeerData()

* add go-swagger response annotations

* wrap api response in swagger response

* Avoid compact cert validation on non-validate/non-generate path.

* Improve comments.

* copy byteslices that will be modified by opcode

* update test & fix bug.

* fix missing unit test

* add syncronization to dumpLogger

* disable few network function inlining to better understand failuire point.

* update response handlers.

* make deadlockLogger thread-safe.

* rollback deadlockLogger changes. Moved to it's own PR.

* rollback error code changes

* rollback unwanted change

* rollback network change.

* better error comment

* update per reviewer comments

* e2e tests can now run kmd with allow_unsafe_scrypt = true which will speed up tests.
this will be used on ARM machines.

* vars name change

* fixing typo

* fix linting errors

* fix key not found error in swagger validation

* Update test/scripts/e2e.sh

rewriting the comment

Co-authored-by: Brian Olson <[email protected]>

* add descriptions to swagger response

* update per peer review.

* typo fix

* swagger ignore RawBlockResponse

* Improve memory allocations in cloneAssetHoldings

before mem 1.83GB, allocs 1773556
after  mem 1.22GB, allocs 962952

* Add comments to releases page script. (#2003)

* Add pushing docker tag to betanet (#2004)

* update swagger annotation for APIV1RequestEnvelope

* Upgrade the websocket library to the latest one.

* Adjust the outgoing buffer sizes according to realistic agreement service needs.

* Presice calculation of paysetHint for block eval

* Adjust OptimizeAllocatedMemory

* Correct Creatables reallocation
* Fix other reallocation conditions for a case when actual length
* greater than capacity

* BlockService redirects when does not have the round (#2002)

Block service will redirect the http block request to another http peer if it does not have the round.

* Cache docker images in Travis and re-load them (#2012)

* fix typo (#2021)

The word received was misspelled multiple times in our codebase as recieved. This PR replaces all the typos with it's correct spelling.

* Better error testing. (#2025)

Improve TestCompactCerts e2e test error handling.

* clear unused test code (#2026)

Remove testing tools which are no longer used & fix bug in catchup unit test.

* fix ::AlgorandGoal::Abort (#2035)

testing: fix node shutdown on error

* Add comments to disassembly of constant load lines to show constant (#1970)

Add comments to disassembly of constant load lines to show constant

* adding extra docs for pushbytes and pushint (#2033)

teal documentation: Adding extra docs for pushbytes and pushint

* delete pid file on process killing. (#2032)

xisting code wait 30 seconds between sending SIGTERM and SIGKILL. When sending the SIGKILL, the process is deleted, but the pid file remains on disk.

Leaving the pid file on disk could cause subsequent failures - and we could easily avoid them by clearing this file if we SIGKILL'ed the process.

* Fix for wrong channel when packaging. (#1957)

* Fix testing bug : avoid datarace on a node crash during e2e test (#2031)

## Overview
Synchronize `testing.T` access across go-routines to avoid generating a data race in case of an asynchronous node crash feedback.

## Summary
When the node controller notice that the node exits, it reports it back to the test fixture.
However, the test fixture cannot report that to the underlying test since doing so would create a data race : the `testing.T` is not meant to support concurrency.

To address that, this PR provides an abstraction over the `testing.T` called `TestingTB`, which retain the same functionality, but uses a mutex to synchronize the access.

* Add networking infrastructure for cancelling sends (#1966)

## Summary

When we send a large message (ie a proposal payload), sometimes we realize after starting sending the message that we don't need to send the message (ie the proposal isn't the best one, the payload is malformed). Thus we wish to have support to be able to cancel sending the message after it is enqueued / starting to be sent.

This pr adds the functionality of broadcasting an array of messages, and the ability to cancel sending after any subset of messages in the array is sent.

* Pingpong nft (#2007)

Add pingpong mode to create lots of assets each with total amount 1 (aka NFTs)

* script to watch algod heap (#2011)

Script for repeatedly snapshotting algod heap profile, creates snapshot svg reports and differential svg reports.

* ledger: split committed round range by consensus protocol version (#2027)

ledger: split committed round range by consensus protocol version

* Fix ClearState program applying when it errs (#2038)

## Summary

* Apps logic needs to ignore result and error from logic evaluator but but fail on other errors
* The bug happened in app refactor PR
* Added unit and e2e test

* Fix for empty local state key value map assignment (#2037)

## Summary

* Opting in does not allocate key value map
* State delta merging code assumed the map in old AD was allocated
* Added that missed allocation, and a test verifying combinations
* Inspected and removed TODOs from applyStorageDelta

* Fix node panicing during shutdown. (#2043)

Fix node panicing during shutdown due to unsynchronized compactcert database access.

* bugfix: ensure loading of a merkle trie deferred page during commit (#2049)

When using the MerkleTrie, the trie avoid making loads from disk for pages that aren't needed. In particular, it won't load the latest page ( known as the deferred page ) until it needs to commit it.

The implementation had a bug in the `loadPage` method, where it would reset the loading page flag incorrectly.

* Fix requestNonce alignment on arm32 (#2051)

On ARM32, all 64 bit atomic operations need to be using a 64-bit aligned address.
This PR moves the requestNonce to a 64 bit aligned address.

* Filter out automated testing useless verbosed output (#2028)

Filter out automated testing useless verbosed output.

* Extend catchpointdump utility to verify merkle trie  (#2050)

This PR extends the `catchpointdump` utility and add the ability to scan a tracker database and retrieve the merkle trie statistics. The statistics themselves aren't very useful on their own, but the scan process verify the structure of the merkle trie, allowing us to verify the consistency of a given database.

* make the msgp less verbosed on the success case. (#2054)

The message pack generator is very noisy. It tends to emit lot of messages that aren't being used.
This PR cache the output of the message pack so that it would only present these in case of an error.

* ensure the participation key database is being correctly closed after being installed. (#2056)

Ensure the participation key database is being correctly closed after being installed.

* After merge fixes

* After merge fixes #2

Co-authored-by: Brian Olson <[email protected]>
Co-authored-by: Tsachi Herman <[email protected]>
Co-authored-by: John Jannotti <[email protected]>
Co-authored-by: John Lee <[email protected]>
Co-authored-by: egieseke <[email protected]>
Co-authored-by: John Lee <[email protected]>
Co-authored-by: Jason Paulos <[email protected]>
Co-authored-by: Nickolai Zeldovich <[email protected]>
Co-authored-by: algorotem <[email protected]>
Co-authored-by: btoll <[email protected]>
Co-authored-by: algonautshant <[email protected]>
Co-authored-by: Will Winder <[email protected]>
Co-authored-by: John Jannotti <[email protected]>
Co-authored-by: Nicholas Guo <[email protected]>
Co-authored-by: Mauro Leggieri <[email protected]>
Co-authored-by: Fabrice Benhamouda <[email protected]>
Co-authored-by: algonautshant <[email protected]>
Co-authored-by: Brian Olson <[email protected]>
Co-authored-by: Shiqi Zheng <[email protected]>
Co-authored-by: algoidan <[email protected]>
Co-authored-by: algoidan <[email protected]>
Co-authored-by: Ben Guidarelli <[email protected]>
Co-authored-by: nicholasguoalgorand <[email protected]>
tsachiherman pushed a commit to tsachiherman/go-algorand that referenced this pull request Jul 7, 2021
Switch to msgp 47, which eliminates the possibility of MarshalMsg returning an error.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants