Provide host functions for loading Wasm modules and calling to/from them #11

SamWilsn · 2019-11-05T22:20:29Z

This is kinda how I'm thinking of doing the "execute Wasm function" we've been discussing.

I've created a second runtime and resolver that won't have access to the same host functions as the "root" runtime.

Feedback very welcome!

New Host Functions

To call into a child module:

eth2_loadModule
eth2_callModule

To call back into the root module:

eth2_call

To get the most recent argument in both:

eth2_argument

To copy a value to the caller:

eth2_return
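The argument/return flow above can be sketched as a toy model in plain Rust. Everything here is an assumption made for illustration: `CallFrame`, `call_child`, and the closure standing in for child code are hypothetical names, and the real host functions operate on Wasm linear memory rather than Rust slices.

```rust
/// Toy model of one call between runtimes (hypothetical, not the PR's types).
struct CallFrame {
    /// Bytes the caller passed in; what `eth2_argument` would copy out.
    argument: Vec<u8>,
    /// Bytes the callee hands back; what `eth2_return` would copy in.
    return_value: Vec<u8>,
}

impl CallFrame {
    fn new(argument: &[u8]) -> Self {
        CallFrame {
            argument: argument.to_vec(),
            return_value: Vec::new(),
        }
    }

    /// Models `eth2_argument`: copy the most recent argument into the
    /// callee's buffer and report how many bytes were copied.
    fn argument(&self, dest: &mut [u8]) -> usize {
        let n = dest.len().min(self.argument.len());
        dest[..n].copy_from_slice(&self.argument[..n]);
        n
    }

    /// Models `eth2_return`: copy a value from the callee to the caller.
    fn set_return(&mut self, src: &[u8]) {
        self.return_value = src.to_vec();
    }
}

/// Run `child` against a fresh frame and give its return value to the caller.
fn call_child<F: FnOnce(&mut CallFrame)>(argument: &[u8], child: F) -> Vec<u8> {
    let mut frame = CallFrame::new(argument);
    child(&mut frame);
    frame.return_value
}
```

The point of the model is that the two sides only ever exchange copies of byte buffers; neither side sees the other's memory.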

Future Work

I haven't thought about how unloading a module will work, so there's no way to do that currently.

In the next commit I'm planning to associate child runtimes with an identifier, to permit calls from the EE back into smart contracts. That requires RootRuntime to own a Vec<ChildRuntime> (or similar); holding only a reference to the child would make that impossible.

Root runtimes can now load a Wasm module into a numbered "slot", instead of having to load and execute them at the same time as before. This arrangement permits calls from the root runtime back into the child runtimes.
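A minimal sketch of the slot arrangement, assuming the root owns its children in a `Vec` and uses the index as the slot number. The struct fields, method names, and the echo behaviour of `ChildRuntime::call` are hypothetical stand-ins; the real runtimes wrap a Wasm interpreter.

```rust
/// Stand-in for an instantiated child module (hypothetical fields).
struct ChildRuntime {
    code: Vec<u8>,
    calls: u8,
}

impl ChildRuntime {
    /// Toy "execution": prefix the argument with the number of calls so far,
    /// to show that a loaded module keeps its state between calls.
    fn call(&mut self, argument: &[u8]) -> Vec<u8> {
        self.calls += 1;
        let mut out = vec![self.calls];
        out.extend_from_slice(argument);
        out
    }
}

/// The root owns its children, so it can call back into any of them later.
#[derive(Default)]
struct RootRuntime {
    children: Vec<ChildRuntime>,
}

impl RootRuntime {
    /// Models `eth2_loadModule`: store the module and return its slot number.
    fn load_module(&mut self, code: &[u8]) -> usize {
        self.children.push(ChildRuntime {
            code: code.to_vec(),
            calls: 0,
        });
        self.children.len() - 1
    }

    /// Models `eth2_callModule`: run the module in `slot`, if one is loaded.
    fn call_module(&mut self, slot: usize, argument: &[u8]) -> Option<Vec<u8>> {
        let child = self.children.get_mut(slot)?;
        Some(child.call(argument))
    }

    /// Size in bytes of the code loaded into `slot`, if any.
    fn code_size(&self, slot: usize) -> Option<usize> {
        self.children.get(slot).map(|child| child.code.len())
    }
}
```

Because loading and calling are separate steps, the same slot can be called repeatedly without reloading the module.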

lightclient

@SamWilsn this is really awesome, you crushed it. I'm sure we'll continue to tweak and experiment with this, but I'm going to go ahead and merge this and release an updated crate because I'm opening up a new repo to play with the code execution functions. 😄

poemm · 2019-11-12T19:38:09Z

Feedback follows, all off the top of my head. Feel no obligation to respond, just want to give you some feedback to be aware of as you consider designs.

Wasm module communication. The Wasm spec allows modules to communicate directly through importing/exporting. Instead, you are creating a new abstraction layer for module communication -- each Wasm module is instantiated in a fresh Wasm VM, and modules communicate through this abstraction layer. So to be an Ewasm engine, one must take a Wasm engine and write this abstraction layer for it. Hera has a similar abstraction layer. I am curious whether Wasm importing/exporting is good enough, or too restrictive and this abstraction layer is required. Edit: also, tables can be used for this.
Metering. Each module instance must first be parsed, validated, and instantiated. The good news is that these can be done in a single pass over the Wasm bytecode. But there is still a cost. The code may also require injected metering, which can also be done in a single pass. An awkward case is when this startup cost may be greater than execution time, making this system naive, but that may be unavoidable -- Wasm may be naive for this use-case anyway.
Sharing memory. Some may prefer to share memory; otherwise they must copy their memory over, then copy it back. The Wasm spec does allow sharing a single memory across many modules. I do like the idea of segregating code from memory -- forcing people to specify both their code and memory could avoid bugs. And there should be restrictions on which memory one can import. Wasm also has a multi-memory proposal which may be interesting for exposing certain memories, but I don't want to base our design on something which is not yet specified.
EVM model. The EVM has four *CALL opcodes which do different things. We should consider lessons learned from EVM in designing how Eth2 contracts will communicate. For example one should ask: Why can't the called module call other modules directly?
Rest of system. How will the rest of the system solve the problem of module communication? Perhaps we should mimic that behaviour to simplify the entire system.
Overall design. The overall design should be driven by use-cases. Perhaps this design is perfect, but it would be nice to see concrete examples. This is a lot of work and difficult to do.

SamWilsn · 2019-11-15T22:15:27Z

Thanks for the feedback @poemm!

Wasm module communication. The Wasm spec allows modules to communicate directly through importing/exporting. Instead, you are creating a new abstraction layer for module communication -- each Wasm module is instantiated in a fresh Wasm VM, and modules communicate through this abstraction layer. [...]

Creating a new abstraction layer was exactly my intent. Sharing address space and host functions between EEs and child environments seems like a recipe for disaster. Other than creating a new interpreter, I didn't find any existing ways to isolate modules to the extent I wanted, though I didn't actually search that deeply.

Hera has a similar abstraction layer. I am curious whether Wasm importing/exporting is good enough, or too restrictive and this abstraction layer is required.

While I'm not even close to certain about eth2_expose, I would rather err on the side of caution for my first foray into this. If every exported function from an EE module is safe to call from a child runtime, eth2_expose is unnecessary.

Edit: also, tables can be used for this.

I took a quick look into tables after submitting this PR! They seem like a much better implementation: no strings, standardized, etc. There's currently a limit of one table per module, right?

Metering. Each module instance must first be parsed, validated, and instantiated. The good news is that these can be done in a single pass over the Wasm bytecode. But there is still a cost. The code may also require injected metering, which can also be done in a single pass. An awkward case is when this startup cost may be greater than execution time, making this system naive, but that may be unavoidable -- Wasm may be naive for this use-case anyway.

Metering is particularly difficult for child modules, but I think that applies to any implementation, not just this one.
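To make the startup-versus-execution concern concrete, here is a rough accounting sketch. The prices, the `CallCost` type, and the `estimate` function are all invented for illustration; nothing in this PR defines a metering scheme.

```rust
/// Hypothetical price for parsing, validating, and instantiating one byte.
const STARTUP_GAS_PER_BYTE: u64 = 2;
/// Hypothetical price for executing one instruction.
const EXEC_GAS_PER_INSTRUCTION: u64 = 1;

/// Cost of one call into a freshly instantiated child module.
struct CallCost {
    startup: u64,
    execution: u64,
}

impl CallCost {
    fn total(&self) -> u64 {
        self.startup + self.execution
    }

    /// True in the awkward case where setup costs more than the work itself.
    fn startup_dominates(&self) -> bool {
        self.startup > self.execution
    }
}

/// Estimate the cost from the module size and the instructions it executes.
fn estimate(module_len: usize, instructions_executed: u64) -> CallCost {
    CallCost {
        startup: module_len as u64 * STARTUP_GAS_PER_BYTE,
        execution: instructions_executed * EXEC_GAS_PER_INSTRUCTION,
    }
}
```

Under these made-up prices, a 10,000-byte module that runs 500 instructions pays far more for startup than for execution, which is the case @poemm describes.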

Sharing memory. Some may prefer to share memory; otherwise they must copy their memory over, then copy it back. The Wasm spec does allow sharing a single memory across many modules. I do like the idea of segregating code from memory -- forcing people to specify both their code and memory could avoid bugs. And there should be restrictions on which memory one can import. Wasm also has a multi-memory proposal which may be interesting for exposing certain memories, but I don't want to base our design on something which is not yet specified.

I like the Harvard architecture as well, especially for EEs & smart contracts. Polymorphic code scares me a little.

If we want to be super fancy, something like shared anonymous mmap regions would be the holy grail for parent-child communication. This is somewhat independent of, but complementary to, the multi-memory proposal.

EVM model. The EVM has four *CALL opcodes which do different things. We should consider lessons learned from EVM in designing how Eth2 contracts will communicate. For example one should ask: Why can't the called module call other modules directly?

100% agree! I took the most conservative approach in this PR, with basically no functionality exposed to the child code. This gives the EE the most freedom to define what child code is allowed to do. If my understanding is correct, this is similar to the STATICCALL opcode in EVM?

If the child code should be allowed to do so, the EE can expose functions.

I believe this is an efficiency vs. flexibility trade-off. In this implementation, there's more Wasm executing, but it gives a lot of freedom.
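As a sketch of the "nothing exposed by default" idea, the EE side could keep an allowlist of functions that child code may call back into. The `Exposed` type, its methods, and `sum_bytes` are hypothetical; the thread does not show the actual signature of eth2_expose, so the name-based lookup below is only an assumption.

```rust
use std::collections::HashMap;

/// Signature assumed for functions the EE is willing to expose.
type HostFn = fn(&[u8]) -> Vec<u8>;

/// Allowlist of EE functions that child code may call (hypothetical).
#[derive(Default)]
struct Exposed {
    functions: HashMap<String, HostFn>,
}

impl Exposed {
    /// The EE opts in, one function at a time.
    fn expose(&mut self, name: &str, f: HostFn) {
        self.functions.insert(name.to_string(), f);
    }

    /// A call from the child succeeds only for names the EE exposed.
    fn call(&self, name: &str, argument: &[u8]) -> Result<Vec<u8>, String> {
        match self.functions.get(name) {
            Some(f) => Ok(f(argument)),
            None => Err(format!("function `{}` is not exposed", name)),
        }
    }
}

/// Example EE function: wrapping sum of the argument bytes.
fn sum_bytes(argument: &[u8]) -> Vec<u8> {
    vec![argument.iter().fold(0u8, |acc, b| acc.wrapping_add(*b))]
}
```

With an empty allowlist the child can compute but cannot reach anything in the EE, which is the conservative default described above.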

Rest of system. How will the rest of the system solve the problem of module communication? Perhaps we should mimic that behaviour to simplify the entire system.

I'm not sure I understand. What other components of the system have intermodule communication?

Overall design. The overall design should be driven by use-cases. Perhaps this design is perfect, but it would be nice to see concrete examples. This is a lot of work and difficult to do.

You're absolutely correct. I'm new to the Ethereum space. I have no idea what is common in smart contracts today, or what will be common in the future. I went for maximum flexibility, but maybe something more restrictive is good enough.

Sam Wilson added 7 commits November 5, 2019 16:48

Stub host function

fc9c109

Differentiate between root and non-root runtimes

1c18cb7

Create a separate runtime for sub-executions

3c1b760

Implement callbacks from child runtimes into the root runtime.

cceebd2

Split eth2_exec into eth2_loadModule and eth2_callModule

8a1fbb0

Root runtimes can now load a Wasm module into a numbered "slot", instead of having to load and execute them at the same time as before. This arrangement permits calls from the root runtime back into the child runtimes.

Implement return/argument for child runtimes

446e68d

SamWilsn changed the title ~~[WiP] Provide a host function to execution environments to execute Wasm modules~~ Provide host functions for loading Wasm modules and calling to/from them Nov 10, 2019

lightclient self-requested a review November 10, 2019 17:37

lightclient approved these changes Nov 10, 2019

lightclient merged commit d53ff19 into quilt:impl-buffer Nov 10, 2019

This was referenced Nov 10, 2019

Abstract away wasmi #18

Open

Execution Toolchain for Ethereum 2.0 quilt/pm#2

Open
