cc-proxy/cc-shim high availability #4
From @laijs on December 4, 2016 23:46: virtio-serial is not a packet-based transport, so it seems hard to find the message header when cc-proxy reconnects to hyperstart.
We close the shim connection when something bad happens:

- We receive an error trying to write to the socket (most likely because the shim died or exited).
- We have another kind of unrecoverable error for that client (we don't have one right now but we will in the near future).

We now want to progress a bit on the recovery side of things. If the shim dies, we want to allow it to reconnect and re-claim the session. This commit does just that.

This is tested by a subsequent unit test: TestShimSendStdinAfterExeccmd

Updates: clearcontainers#4

Signed-off-by: Damien Lespiau <[email protected]>
Hi @dlespiau, are you doing anything related to this feature? If not then I'd like to hack on this if you don't mind.
Hi, I'm doing the low-level part of this, framing on top of the Host<->VM serial link so the proxy can recover the start of a frame when reconnecting to a running VM. I haven't started on the task to save an on-disk state that the proxy can read from when starting again though. You could take that part.
Hi @dlespiau, That's awesome! Cheers.
Thanks @dvoytik! Feel free to create an issue and assign to yourself (and maybe reference this issue) so it's clear to the whole team that that is something you're working on.
@jodh-intel, done. Although I can't assign it to myself.
@dvoytik - thanks - assigned.
Introduce the high availability feature of cc-proxy by implementing store/restore of proxy's state to/from disk. This feature depends on the ability of shim to reconnect to cc-proxy if connection is lost. Fixes clearcontainers#4. Signed-off-by: Dmitry Voytik <[email protected]>
@dlespiau any chance you have left some work in progress about the re-sync of a lost frame between proxy and VM serial port?
Unfortunately, the work has been wiped out when I dd'ed /dev/urandom to my hard-drive :/
@dlespiau no worries, that's what I was expecting :p
@dlespiau BTW, we have a public IRC channel #clearcontainers on freenode. Come discuss containers if you're interested ;)
@sboeuf - could you outline what you know about this problem?
@jodh-intel I'll go further, trying to cover all the cases, and how our components should be modified. Here is what each component should do upon this detection:
@sameo @grahamwhaley @jodh-intel I might have missed a few corner cases, but I'd like to get your input on this. This is pretty important since we need to agree before we can open the corresponding issues and start the implementation.
Hi @sboeuf - thanks for this. If you don't mind, I'll merge the above with my notes and put it into a draft design doc (clearcontainers/runtime#683) showing (a) what we have today and (b) what we want in the future...
@sboeuf - I've now raised a doc PR including your comments above:
@jodh-intel great, thanks!
But I'd like to get some feedback about it too. Does that make sense to everyone?
From @sameo on December 2, 2016 17:36
If cc-proxy crashes:
We need to work on:
Copied from original issue: intel/cc-oci-runtime#505