Provide host functions for loading Wasm modules and calling to/from them #11

Merged: 7 commits, Nov 10, 2019

@SamWilsn SamWilsn commented Nov 5, 2019

This is kinda how I'm thinking of doing the "execute Wasm function" we've been discussing.

I've created a second runtime and resolver that won't have access to the same host functions as the "root" runtime.

Feedback very welcome!

New Host Functions

To call into a child module:

eth2_loadModule
eth2_callModule

To call back into the root module:

eth2_call

To get the most recent argument in both:

eth2_argument

To copy a value to the caller:

eth2_return

Future Work

I haven't thought about how unloading a module will work, so there's no way to do that currently.

Sam Wilson added 7 commits November 5, 2019 16:48
In the next commit I'm planning to associate child runtimes with an
identifier, to permit calls from the EE back into smart contracts. That
requires RootRuntime to own a Vec<ChildRuntime> (or similar.) Using a
reference would make that impossible.

Root runtimes can now load a Wasm module into a numbered "slot", instead
of having to load and execute them at the same time as before.

This arrangement permits calls from the root runtime back into the
child runtimes.
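
As a rough illustration of the slot arrangement described in these commit messages, here is a minimal sketch in which the root runtime owns its children in a `Vec` and hands back the index as the slot number. Only the names `RootRuntime` and `ChildRuntime` come from the commit text; the methods `load_module` and `call_module`, their signatures, and the placeholder "execution" are assumptions for illustration and are not the PR's actual code.

```rust
/// Hypothetical stand-in for a child runtime. A real one would wrap a Wasm
/// interpreter instance; this sketch only keeps the module bytes.
pub struct ChildRuntime {
    code: Vec<u8>,
}

impl ChildRuntime {
    /// Placeholder for real execution: echoes the argument back and appends
    /// the module size (truncated to one byte) so the result depends on the slot.
    fn call(&self, argument: &[u8]) -> Vec<u8> {
        let mut out = argument.to_vec();
        out.push(self.code.len() as u8);
        out
    }
}

/// The root runtime owns its children, so each one can be addressed by slot.
#[derive(Default)]
pub struct RootRuntime {
    children: Vec<ChildRuntime>,
}

impl RootRuntime {
    /// Loads a module without executing it and returns its slot number.
    pub fn load_module(&mut self, code: &[u8]) -> usize {
        self.children.push(ChildRuntime { code: code.to_vec() });
        self.children.len() - 1
    }

    /// Calls into a previously loaded module; `None` if the slot is empty.
    pub fn call_module(&self, slot: usize, argument: &[u8]) -> Option<Vec<u8>> {
        self.children.get(slot).map(|child| child.call(argument))
    }
}
```

Because loading and calling are separate steps, the same slot can be called more than once, and an unknown slot is reported as `None` rather than panicking.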
@SamWilsn SamWilsn changed the title [WiP] Provide a host function to execution environments to execute Wasm modules Provide host functions for loading Wasm modules and calling to/from them Nov 10, 2019
@lightclient lightclient self-requested a review November 10, 2019 17:37
Collaborator

@lightclient lightclient left a comment

@SamWilsn this is really awesome, you crushed it. I'm sure we'll continue to tweak and experiment with this, but I'm going to go ahead and merge this and release an updated crate because I'm opening up a new repo to play with the code execution functions. 😄

@lightclient lightclient merged commit d53ff19 into quilt:impl-buffer Nov 10, 2019

@poemm poemm commented Nov 12, 2019

Feedback follows, all off the top of my head. Feel no obligation to respond, just want to give you some feedback to be aware of as you consider designs.

  • Wasm module communication. The Wasm spec allows modules to communicate directly through importing/exporting. Instead, you are creating a new abstraction layer for module communication -- each Wasm module is instantiated in a fresh Wasm VM, and modules communicate through this abstraction layer. So to be an Ewasm engine, one must take a Wasm engine and write this abstraction layer for it. Hera has a similar abstraction layer. I am curious whether Wasm importing/exporting is good enough, or too restrictive and this abstraction layer is required. Edit: also, tables can be used for this.
  • Metering. Each module instance must first be parsed, validated, and instantiated. The good news is that these can be done in a single pass over the Wasm bytecode. But there is still a cost. The code may also require injected metering, which can also be done in a single pass. An awkward case is when this startup cost may be greater than execution time, making this system naive, but that may be unavoidable -- Wasm may be naive for this use-case anyway.
  • Sharing memory. Some may prefer to share memory; otherwise they must copy their memory over, then copy it back. The Wasm spec does allow sharing a single memory across many modules. I do like the idea of segregating code from memory -- forcing people to specify both their code and memory could avoid bugs. And there should be restrictions on which memory one can import. Wasm also has a multi-memory proposal which may be interesting for exposing certain memories, but I don't want to base our design on something which is not yet specified.
  • EVM model. The EVM has four *CALL opcodes which do different things. We should consider lessons learned from EVM in designing how Eth2 contracts will communicate. For example one should ask: Why can't the called module call other modules directly?
  • Rest of system. How will the rest of the system solve the problem of module communication? Perhaps we should mimic that behaviour to simplify the entire system.
  • Overall design. The overall design should be driven by use-cases. Perhaps this design is perfect, but would be nice to see concrete examples. This is a lot of work and difficult to do.

@SamWilsn SamWilsn (Author) commented Nov 15, 2019

Thanks for the feedback @poemm!

  • Wasm module communication. The Wasm spec allows modules to communicate directly through importing/exporting. Instead, you are creating a new abstraction layer for module communication -- each Wasm module is instantiated in a fresh Wasm VM, and modules communicate through this abstraction layer. [...]

Creating a new abstraction layer was exactly my intent. Sharing address space and host functions between EEs and child environments seems like a recipe for disaster. Other than creating a new interpreter, I didn't find any existing ways to isolate modules to the extent I wanted, though I didn't actually search that deeply.

Hera has a similar abstraction layer. I am curious whether Wasm importing/exporting is good enough, or too restrictive and this abstraction layer is required.

While I'm not even close to certain about eth2_expose, I would rather err on the side of caution for my first foray into this. If every exported function from an EE module is safe to call from a child runtime, eth2_expose is unnecessary.

Edit: also, tables can be used for this.

I took a quick look into tables after submitting this PR! They seem like a much better implementation: no strings, standardized, etc. There's currently a limit of one table per module, right?
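
To make the contrast with string-named functions concrete, here is a small sketch of table-style dispatch, where callees are looked up by integer index and a bad index traps instead of resolving a name. It is a simplified model written in plain Rust, not Wasm itself: the `Table`, `Trap`, and `TableFn` names are hypothetical, and every entry shares one signature, whereas a real `call_indirect` also checks the function type.

```rust
/// Hypothetical signature shared by every table entry in this sketch.
pub type TableFn = fn(i32) -> i32;

/// Simplified stand-ins for the traps an indirect call can raise.
#[derive(Debug, PartialEq)]
pub enum Trap {
    OutOfBounds,
    NullElement,
}

/// A fixed-size table of optional function references.
pub struct Table {
    elements: Vec<Option<TableFn>>,
}

impl Table {
    /// Creates a table whose elements all start out uninitialized.
    pub fn new(size: usize) -> Self {
        Table { elements: vec![None; size] }
    }

    /// Stores a function at `index`, failing if the index is out of range.
    pub fn set(&mut self, index: usize, f: TableFn) -> Result<(), Trap> {
        match self.elements.get_mut(index) {
            Some(slot) => {
                *slot = Some(f);
                Ok(())
            }
            None => Err(Trap::OutOfBounds),
        }
    }

    /// Looks up the entry by index and calls it, trapping on a bad index.
    pub fn call_indirect(&self, index: usize, arg: i32) -> Result<i32, Trap> {
        match self.elements.get(index) {
            Some(Some(f)) => Ok(f(arg)),
            Some(None) => Err(Trap::NullElement),
            None => Err(Trap::OutOfBounds),
        }
    }
}

/// Example callees used to populate the table.
pub fn double(x: i32) -> i32 {
    x * 2
}

pub fn negate(x: i32) -> i32 {
    -x
}
```

In this model the caller only ever holds an index, so nothing needs to be parsed or compared as a string at call time.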

  • Metering. Each module instance must first be parsed, validated, and instantiated. The good news is that these can be done in a single pass over the Wasm bytecode. But there is still a cost. The code may also require injected metering, which can also be done in a single pass. An awkward case is when this startup cost may be greater than execution time, making this system naive, but that may be unavoidable -- Wasm may be naive for this use-case anyway.

Metering is particularly difficult for child modules, but I think that applies to any implementation, not just this one.

  • Sharing memory. Some may prefer to share memory; otherwise they must copy their memory over, then copy it back. The Wasm spec does allow sharing a single memory across many modules. I do like the idea of segregating code from memory -- forcing people to specify both their code and memory could avoid bugs. And there should be restrictions on which memory one can import. Wasm also has a multi-memory proposal which may be interesting for exposing certain memories, but I don't want to base our design on something which is not yet specified.

I like the Harvard architecture as well, especially for EEs & smart contracts. Polymorphic code scares me a little.

If we want to be super fancy, something like shared anonymous mmap regions would be the holy grail for parent-child communication. This is somewhat independent of, but complementary to, the multi-memory proposal.

  • EVM model. The EVM has four *CALL opcodes which do different things. We should consider lessons learned from EVM in designing how Eth2 contracts will communicate. For example one should ask: Why can't the called module call other modules directly?

100% agree! I took the most conservative approach in this PR, with basically no functionality exposed to the child code. This gives the EE the most freedom to define what child code is allowed to do. If my understanding is correct, this is similar to the STATICCALL opcode in EVM?

If the child code should be allowed to do so, the EE can expose functions.

I believe this is an efficiency vs. flexibility trade-off. In this implementation, there's more Wasm executing, but it gives a lot of freedom.
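
The "nothing exposed by default" idea above can be sketched as an allow-list on the host side: a child's call back into the root only succeeds for names the EE has explicitly registered. This thread mentions `eth2_expose` but does not give its signature, so the `ExecutionEnvironment` type, the `expose` and `call_from_child` methods, and the bytes-in/bytes-out callback shape below are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical callback shape: argument bytes in, result bytes out.
pub type Exposed = Box<dyn Fn(&[u8]) -> Vec<u8>>;

/// Host-side view of an EE: starts with nothing callable by child code.
#[derive(Default)]
pub struct ExecutionEnvironment {
    exposed: HashMap<String, Exposed>,
}

impl ExecutionEnvironment {
    /// Opts a single function in to being callable from child code.
    pub fn expose(&mut self, name: &str, f: Exposed) {
        self.exposed.insert(name.to_string(), f);
    }

    /// Resolves a child's call back into the root. Anything that was not
    /// explicitly exposed is rejected instead of being executed.
    pub fn call_from_child(&self, name: &str, argument: &[u8]) -> Result<Vec<u8>, String> {
        match self.exposed.get(name) {
            Some(f) => Ok(f(argument)),
            None => Err(format!("function `{}` is not exposed", name)),
        }
    }
}
```

Each call crosses the root/child boundary through this lookup, which is where the extra execution cost comes from; in exchange, the EE decides exactly which capabilities a child receives.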

  • Rest of system. How will the rest of the system solve the problem of module communication? Perhaps we should mimic that behaviour to simplify the entire system.

I'm not sure I understand. What other components of the system have intermodule communication?

  • Overall design. The overall design should be driven by use-cases. Perhaps this design is perfect, but would be nice to see concrete examples. This is a lot of work and difficult to do.

You're absolutely correct. I'm new to the Ethereum space. I have no idea what is common in smart contracts today, or what will be common in the future. I went for maximum flexibility, but maybe something more restrictive is good enough.
