Replies: 7 comments
-
@adrianreber @avagin PTAL 🙏🏻
-
I've retried this several times and I always end up with the same behavior (as described above), so I tend to think this must be a runc (or CRIU?) bug. Has any member of the runc community been able to reproduce this? Thx
-
Unfortunately, I have never used container checkpointing with MACVLAN. Sorry.
Using an external network namespace sounds like the best way to solve this. Maybe you just have some iptables rules left over from CRIU in the network namespace which block the traffic. Can you check whether there are any iptables rules in your network namespace?
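For example (a sketch, assuming the `redis` namespace name used elsewhere in this thread):

```
# List all rules, as iptables commands, inside the container's net namespace
$ sudo ip netns exec redis iptables -S
$ sudo ip netns exec redis ip6tables -S
```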
-
Adrian, thanks for your response. Indeed, the iptables rules added by CRIU to the network namespace are still there after restoring. Could it be that CRIU is unable to remove them because of the errors in the restore process described above?
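For reference, the rules CRIU installs to lock the network during checkpointing typically look like this (a sketch; the exact chain name and mark value may differ across CRIU versions — CRIU accepts packets carrying its own socket mark and drops the rest):

```
$ sudo ip netns exec redis iptables -S
-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT
-N CRIU
-A INPUT -j CRIU
-A OUTPUT -j CRIU
-A CRIU -m mark --mark 0xc114 -j ACCEPT
-A CRIU -j DROP
```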
-
You can try something like
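A minimal sketch along those lines, assuming the leftover rules are only CRIU's network-lock entries and that nothing else in the `redis` namespace needs to be preserved:

```
# Flush all rules, then delete the (now empty) user-defined chains
$ sudo ip netns exec redis iptables -F
$ sudo ip netns exec redis iptables -X
$ sudo ip netns exec redis ip6tables -F
$ sudo ip netns exec redis ip6tables -X
```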
-
Thanks, this seemed to work; at least now the restored container can be reached by ping. I have yet to check what happens to the established TCP connections. I assume this can be considered a workaround for the iptables errors during restore. However, the restore procedure still hangs at `Running post-resume scripts`.
-
That sounds correct. Using Podman to restore a container, the last messages in the log are also about the post-resume scripts.
That is how it is supposed to be.
-
CRIU claims to support checkpointing/restoring a net namespace with a MACVLAN device, so I assumed this would also work with runc containers. Nevertheless, so far I have failed to achieve it. I have tried to checkpoint/restore a runc container running a Redis server using both an external net namespace and a net namespace created by runc. Below I describe the steps taken and the outcome in both cases.
**Software versions**

```
$ runc --version
runc version 1.0.1
commit: v1.0.1-0-g4144b63
spec: 1.0.2-dev
go: go1.15.14
libseccomp: 2.5.1
$ criu --version
Version: 3.15
GitID: v3.14-441-g15266a4fe
```
**With external net namespace**

1. In `config.json`:

   ```
   { "type": "network", "path": "/var/run/netns/redis" },
   ```

2. Create the net namespace `/var/run/netns/redis` with a MACVLAN device and run the container:

   ```
   $ sudo ip netns add redis
   $ sudo ip link add link enp0s8 veth-redis-1 type macvlan mode bridge
   $ sudo ip link set veth-redis-1 netns /var/run/netns/redis
   $ sudo ip netns exec redis ip link set dev veth-redis-1 name eth0
   $ sudo ip netns exec redis ip addr add 192.168.1.17/32 dev eth0
   $ sudo ip netns exec redis ip link set dev eth0 up
   $ sudo ip netns exec redis ip route add 192.168.1.0/24 dev eth0
   $ sudo runc run -d redis &> /dev/null < /dev/null
   ```
3. The container with the MACVLAN device is created and can be reached successfully:

   ```
   $ sudo runc exec redis ip addr
   1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1000
       link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
   7: eth0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
       link/ether 3e:25:5f:25:0b:a4 brd ff:ff:ff:ff:ff:ff link-netnsid 0
       inet 192.168.1.17/32 scope global eth0
          valid_lft forever preferred_lft forever
       inet6 fe80::3c25:5fff:fe25:ba4/64 scope link
          valid_lft forever preferred_lft forever
   $ ping 192.168.1.17
   PING 192.168.1.17 (192.168.1.17) 56(84) bytes of data.
   64 bytes from 192.168.1.17: icmp_seq=1 ttl=64 time=0.025 ms
   ```
4. Checkpoint the container. It finishes correctly and recognizes the net namespace as external, even though I don't specify `external net[4026532237]:/var/run/netns/redis` in `/etc/criu/runc.conf`. There is no mention of the MACVLAN device in the log (is this the expected behavior, given that the net namespace is external?). See complete log: dump-external-1.log.

   ```
   $ sudo runc checkpoint --image-path $HOME/images/ --work-path $HOME/images/ --tcp-established redis
   ```
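   For reference, the numeric id in the `external net[...]` option is the inode of the namespace file; it can be looked up like this (a sketch, using the namespace created in step 2):

   ```
   # Print the inode of the netns file; CRIU uses it as the id
   # in "external net[<inode>]:<path>"
   $ sudo stat -L -c '%i' /var/run/netns/redis
   4026532237
   ```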
5. Specify `external macvlan[eth0]:enp0s8` in `/etc/criu/runc.conf` and try to restore the container:

   ```
   $ sudo runc restore --image-path $HOME/images/ --work-path $HOME/images/ --tcp-established redis
   ```
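   For reproducibility, one way to put that option in place (a sketch, assuming no other options are needed in the file; `tee` without `-a` overwrites any existing contents):

   ```
   $ echo 'external macvlan[eth0]:enp0s8' | sudo tee /etc/criu/runc.conf
   ```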
6. The container seems to restore correctly in the right namespace (it can see the MACVLAN device), but the restore procedure hangs at `Running post-resume scripts` (should I add the detach option?). Additionally, there are some errors in the log at `Running network-unlock scripts` (see complete log: restore-external-1.log). Probably because of this, the container cannot be reached through the MACVLAN device.

   ```
   iptables-restore: line 5 failed
   (00.072804) Error (criu/util.c:645): exited, status=1
   ip6tables-restore: line 5 failed
   (00.073749) Error (criu/util.c:645): exited, status=1
   ```
**With net namespace created by runc**

1. In `config.json`:

   ```
   { "type": "network" },
   ```

2. Run the container and create a MACVLAN device in the corresponding net namespace. As before (same output as step 3 above), the container can be reached successfully:

   ```
   $ sudo runc run -d redis &> /dev/null < /dev/null
   $ PID=$(sudo runc ps redis | sed '1d' | awk '{print $2}')
   $ NETNS=/proc/$PID/ns/net
   $ sudo mkdir -p /var/run/netns/
   $ sudo ln -sf $NETNS /var/run/netns/redis
   $ sudo ip link add link enp0s8 veth-redis-1 type macvlan mode bridge
   $ sudo ip link set veth-redis-1 netns /var/run/netns/redis
   $ sudo ip netns exec redis ip link set dev veth-redis-1 name eth0
   $ sudo ip netns exec redis ip addr add 192.168.1.17/32 dev eth0
   $ sudo ip netns exec redis ip link set dev eth0 up
   $ sudo ip netns exec redis ip route add 192.168.1.0/24 dev eth0
   ```
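   A quick sanity check that the `/var/run/netns/redis` entry really refers to the container's namespace (a sketch, assuming the `$PID` variable from above; `ip netns identify` matches a process against the entries under `/var/run/netns`):

   ```
   # Expected to print "redis" if the symlink points at the container's netns
   $ sudo ip netns identify $PID
   redis
   ```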
3. Checkpoint the container. It finishes correctly, but there is no mention of the MACVLAN device in the log (see complete log: dump-internal-1.log):

   ```
   $ sudo runc checkpoint --image-path $HOME/images/ --work-path $HOME/images/ --tcp-established redis
   ```
4. Specify `external macvlan[eth0]:enp0s8` in `/etc/criu/runc.conf` and try to restore the container:

   ```
   $ sudo runc restore --image-path $HOME/images/ --work-path $HOME/images/ --tcp-established redis
   ```
5. The container does not restore correctly, as it cannot see the MACVLAN device (it only sees localhost). Additionally, as before, the restore procedure hangs at `Running post-resume scripts` (see complete log: restore-internal-1.log):

   ```
   $ sudo runc exec redis ip addr
   1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1000
       link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
   ```
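   One more diagnostic that might help (a sketch; `enp0s8` as above): check whether the MACVLAN device ended up back on the host side after the failed restore:

   ```
   # Lists MACVLAN devices visible in the host namespace, if any
   $ sudo ip link show type macvlan
   ```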
Probably I'm doing something wrong. I'd appreciate any input.
Thx