
Graph deletes are non-atomic, db refs deleted without deleting on-disk entities #6354

Closed
ewindisch opened this issue Jun 11, 2014 · 38 comments
Labels: area/storage, kind/bug

@ewindisch
Contributor

In function graph.Delete,

func (graph *Graph) Delete(name string) error

The method graph.idIndex.Delete(id) is called before graph.driver.Delete(id)

This is non-atomic: if graph.driver.Delete(id) fails, the graph.idIndex.Delete has already happened and the image is now apparently gone, but its extents may still exist on disk, in whole or in part.
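
To make the failure mode concrete, here is a minimal sketch of the ordering described above (assumed types and signatures for illustration only, not the actual Docker source):

package graph

// Minimal stand-ins for the real components; the actual types and
// signatures in Docker may differ.
type Driver interface{ Delete(id string) error }
type IDIndex interface{ Delete(id string) error }

type Graph struct {
    driver  Driver
    idIndex IDIndex
}

// Delete mirrors the ordering described above: the reference is dropped
// first, then the on-disk data.
func (g *Graph) Delete(id string) error {
    // 1. Drop the reference from the id index / graph db.
    if err := g.idIndex.Delete(id); err != nil {
        return err
    }
    // 2. Remove the layer data on disk. If this fails (or the daemon crashes
    //    here), the image is no longer listed anywhere, yet its extents are
    //    still on disk with nothing left pointing at them.
    return g.driver.Delete(id)
}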

@LK4D4
Contributor

LK4D4 commented Jun 11, 2014

It seems that it needs a good old mutex :)

@ewindisch
Contributor Author

The right solution, I believe, is support for a transaction log in the graph driver. Other databases and filesystems solve this with transaction logs, and our graph driver is no different. I am prepared to begin working on this.
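
As a rough illustration of the idea (hypothetical names and on-disk layout only, nothing that exists in Docker today): the graph would persist an intent record, fsynced before the destructive steps, and replay any unfinished records on daemon start.

package graph

import (
    "encoding/json"
    "os"
    "path/filepath"
)

// journalEntry records an operation before it is carried out, so a crash
// mid-way leaves enough information to finish (or roll back) the operation
// on the next daemon start. Hypothetical format.
type journalEntry struct {
    Op string `json:"op"` // e.g. "delete"
    ID string `json:"id"`
}

// logIntent persists the entry under <root>/journal/<id>.<op> and syncs it
// before returning, so the record survives a crash during the operation.
func logIntent(root, op, id string) (string, error) {
    data, err := json.Marshal(journalEntry{Op: op, ID: id})
    if err != nil {
        return "", err
    }
    dir := filepath.Join(root, "journal")
    if err := os.MkdirAll(dir, 0700); err != nil {
        return "", err
    }
    path := filepath.Join(dir, id+"."+op)
    f, err := os.Create(path)
    if err != nil {
        return "", err
    }
    defer f.Close()
    if _, err := f.Write(data); err != nil {
        return "", err
    }
    return path, f.Sync()
}

// clearIntent removes the record once both the db reference and the on-disk
// data are gone; anything still present at startup gets replayed.
func clearIntent(path string) error {
    return os.Remove(path)
}

graph.Delete would then write the intent, perform both deletes, and clear the record; a startup scan of the journal directory would finish (or roll back) anything that was interrupted.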

Pinging maintainers: @shykes @vieux @crosbymichael

@shykes
Contributor

shykes commented Jul 10, 2014

Wouldn't it be easier to occasionally scan the graph for unreferenced dirs and just remove them? That was the thinking when we implemented this (i.e. "worst case we leave an unremoved dir after a crash; we can garbage collect it later").
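
Something along those lines could be as small as the sketch below (a hypothetical helper, not an existing Docker function): walk the storage root and remove any directory the graph no longer references.

package graph

import (
    "os"
    "path/filepath"
)

// CollectGarbage removes directories under root that are no longer
// referenced, e.g. leftovers from a delete (or create) interrupted by a
// crash. The known callback reports whether an ID is still referenced by
// the graph.
func CollectGarbage(root string, known func(id string) bool) error {
    entries, err := os.ReadDir(root)
    if err != nil {
        return err
    }
    for _, e := range entries {
        if !e.IsDir() || known(e.Name()) {
            continue
        }
        // Unreferenced directory: nothing in the graph points at it any
        // more, so the space can be reclaimed safely.
        if err := os.RemoveAll(filepath.Join(root, e.Name())); err != nil {
            return err
        }
    }
    return nil
}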

@ewindisch
Contributor Author

@shykes Fair point. You're suggesting we 'fsck' rather than having a filesystem journal. History shows where we've gone with that as an industry, but we also have far fewer updates than typical filesystems. I'd accept the argument that we should have both, but that 'fsck' gives us the best bang-for-our-time at the moment.

Also, I'll note that this can happen on create as well, although I haven't as thoroughly investigated how hard that is to reproduce and what the impacts would be.

@shykes
Contributor

shykes commented Jul 10, 2014

Yeah that's my argument - fsck as a pragmatic stopgap, and harden it later when it makes sense to make the investment. To my knowledge we haven't ever received a real-world issue that we could track back to this, so that's a datapoint :)

Re: create, we should double-check, but from memory it's the same thing in reverse. Worst case, we crash while creating a layer (which includes populating its content from a pull), and before referencing it. The result is the same.

@ewindisch
Contributor Author

It seems all Docker systems I run eventually run out of disk space, and simply deleting all images is not enough to return the entirety of the used space. I've certainly seen unreferenced layers stored in /var/lib/docker.

This problem was crushing for me when I attempted to run Docker in/for CI. In that case, I'll note that I was frequently crashing my hosts.

I have also on occasion had layers that could not, under any circumstances, be deleted. Those were likely a result of the create version of this bug.

@ixti

ixti commented Jul 21, 2014

I mentioned the same thing yesterday. After an hour of experiments building a test environment (for some of my stuff), I suddenly found that I had run out of disk space: /var/lib/docker/vfs was full while I had neither containers nor images (I had removed them all with rm and rmi).

@y3ddet

y3ddet commented Oct 22, 2014

This issue is still present, and occurs on multiple host configurations including Ubuntu 14.04 with LVM, Ubuntu 14.04 with ext4, boot2docker default ext4, etc. Cleaning up the leftover files requires parsing the list of valid containers and then manually invoking "/bin/rm -rf" on the stale directories. Is this bug going to be addressed in any planned work queue?

@adamhadani

@y3ddet could you share how exactly you go about recognizing stale / zombie layers and removing those? Based on the output of 'docker ps' / 'docker inspect' or so? I'm thinking of coming up with a watchdog script on cron for now, until this issue is resolved.

@y3ddet

y3ddet commented Oct 28, 2014

@adamhadani I have been able to locate the 'dangling' content directories with the following Ruby script (depends on https://github.com/swipely/docker-api, which is available as the Ruby gem 'docker-api'):

#!/usr/bin/ruby2.0

require 'docker'

# Collect the volume directories referenced by any container (running or stopped).
valid_vfsnames = {}
Docker::Container.all(:all => true).each do |c|
  (c.json['Volumes'] || {}).each do |volume, realpath|
    if realpath.include? "/var/lib/docker/vfs/dir"
      entry = realpath.match(/[\w\d]*$/)[0]
      valid_vfsnames[entry] ||= "exists"
    end
  end
end

# Anything left in the vfs directory that no container references is dangling.
Dir.foreach("/var/lib/docker/vfs/dir") do |e|
  next if e.match(/\./)
  unless valid_vfsnames.key?(e)
    puts "/var/lib/docker/vfs/dir/#{e} is DANGLING!"
  end
end

@adamhadani

@y3ddet got it, thanks. I created a similar Python version that achieves pretty much the same results, in case it's useful for anyone who can't introduce Ruby dependencies into their environment:

#!/usr/bin/env python
"""
Check all existing Docker containers for their mapped paths, and then purge any
zombie directories in docker's volumes directory which don't correspond to an
existing container.

"""
import logging
import os
import sys
from shutil import rmtree

import docker


DOCKER_VOLUMES_DIR = "/var/lib/docker/vfs/dir"


def get_immediate_subdirectories(a_dir):
    return [os.path.join(a_dir, name) for name in os.listdir(a_dir)
            if os.path.isdir(os.path.join(a_dir, name))]


def main():
    logging.basicConfig(level=logging.INFO)

    client = docker.Client()

    valid_dirs = []
    for container in client.containers(all=True):
        volumes = client.inspect_container(container['Id'])['Volumes']
        if not volumes:
            continue

        for _, real_path in volumes.iteritems():
            if real_path.startswith(DOCKER_VOLUMES_DIR):
                valid_dirs.append(real_path)

    all_dirs = get_immediate_subdirectories(DOCKER_VOLUMES_DIR)
    invalid_dirs = set(all_dirs).difference(valid_dirs)

    logging.info("Purging %s dangling Docker volumes out of %s total volumes found.",
                 len(invalid_dirs), len(all_dirs))
    for invalid_dir in invalid_dirs:
        logging.info("Purging directory: %s", invalid_dir)
        rmtree(invalid_dir)

    logging.info("All done.")


if __name__ == "__main__":
    sys.exit(main())

@ewindisch
Contributor Author

@crosbymichael - thoughts on adding an fsck util similar to the above to contrib? At least until we fix this in Docker proper with a transactional graph db?

@thaJeztah
Member

For volumes, there's this as well: https://github.com/cpuguy83/docker-volumes and of course this PR: #8484

@adamhadani

@thaJeztah thanks, that PR looks pretty great; it could very easily be used in some one-liner / cron job for getting rid of dangling dirs, among other things. Any idea what's the status of that as far as inclusion in a future release?

@thaJeztah
Member

@adamhadani I have absolutely no idea, I hope soon! But in the meantime, the docker-volumes tool I linked (which is by the same person who created the PR) is working fine for me.

@adamhadani

@thaJeztah got it, thanks again

@kumarharsh

@adamhadani Thanks for the Python script. Hoping a native solution to this problem lands soon... I'm frequently running out of space these days on my dev environment (which happens to be my laptop) :(

darthbinamira added a commit to darthbinamira/dotfiles that referenced this issue Nov 30, 2014
@potto007

@y3ddet Thanks for posting a solution. @adamhadani Thanks for the Python script. This rescued me from a very bad situation.

@vandekeiser

Still isn't fixed... (1.4.1 on Ubuntu 14.04)
docker rmi -f $(docker images -aq) says "Error: failed to remove one or more images".
But then I can see that disk space usage doesn't go back to baseline, and /var/lib/docker/vfs/dir/ is not empty.
If I rm /var/lib/docker/vfs/dir/, then df (almost) goes back to baseline.

@FelikZ

FelikZ commented Feb 1, 2015

@adamhadani could you post this script as a GitHub gist? It's still an issue, and while it is, this script is extremely useful.

@adamhadani

@soupdiver

I just came across the exact same problem and then discovered that there is the docker rm -v option. I had it in my memory that volumes would be deleted automatically when no container has a reference to them anymore. Is the problem discussed here related only to the automatic deletion of volumes, or also to explicit deletion?

@cpuguy83
Member

@soupdiver There is no auto deletion of volumes at all. The only time a volume is deleted is if you explicitly docker rm -v.

@soupdiver

@cpuguy83 Ah okay... I always thought this part from the docs, "Volumes persist until no containers use them", means that they will be deleted after the last container referencing a volume is gone.

@thaJeztah
Member

@soupdiver if that's how it's written in the docs, I agree that's confusing. Can you add a link to where in the docs you found that sentence? (Then I'll create a small PR to improve it.)

@soupdiver

@thaJeztah It can be found in the user guide: https://docs.docker.com/userguide/dockervolumes/
Last point under Data volumes.

@thaJeztah
Member

@soupdiver thanks for pointing that out, I'll see if I can come up with a better description there!

@soupdiver

@thaJeztah Just for clarification: has the current behaviour, where volume directories are not deleted, always been the normal behaviour? I always thought that they would be deleted after all references are gone, and I vaguely remember that the docs explicitly said this when I started using Docker a while ago. But I'm not 100% sure about this.

@thaJeztah
Member

@soupdiver yes, this is normal behavior and "by design". Docker should never remove your data unless you explicitly tell it to (data is important). There are some difficulties with this (i.e., "orphaned" volumes), which are being worked on in #8484.

I just created a PR that (hopefully) makes this a bit more clear, you can find it here: #10757

@soupdiver

@thaJeztah ok thanks! The PR really explains the behaviour more clearly

thaJeztah added a commit to thaJeztah/docker that referenced this issue Feb 14, 2015
A comment in moby#6354 (comment)
brought to light that the "Managing Data in containers" section contained an
incorrect (or confusing) line;

  "Volumes persist until no containers use them"

Which implies that volumes are automatically removed if they are no longer
referenced by a container.

This pull-request attempts to add some information explaining that volumes are
never automatically removed by Docker and adds some extra hints on working
with data volumes.

Signed-off-by: Sebastiaan van Stijn <[email protected]>
@dnephin
Member

dnephin commented Jun 24, 2015

We've run into this problem on our Jenkins CI server as well (Docker 1.5). I think the discussion about volumes is actually a tangent from the original issue. The issue is not really about volumes, it's about image layers being left on-disk after a docker rmi.

For now we're able to identify and reclaim this lost space using this bash:

#!/bin/bash

set -eu

cd /var/lib/docker/aufs/diff

# Walk the layer diff directories (largest first, ignoring the *-init dirs)
# and remove any directory whose ID the docker daemon no longer knows about.
for image in $(du -s * 2> /dev/null | sort -nr | grep -v -- '-init' | cut -f 2); do
    echo "Inspecting or removing $image"
    docker inspect "$image" > /dev/null || rm -r "$image"
done

@dnephin
Member

dnephin commented Sep 1, 2015

+kind/bug
+system/storage (I think)

@dalanlan
Contributor

dalanlan commented Sep 1, 2015

/cc @dalanlan

thaJeztah added the area/storage and kind/bug labels Sep 1, 2015
@vitalyisaev2

@adamhadani very nice job, thank you

@presidento

@dnephin We had the same issue.

But be careful: in Docker 1.10, these file names do not match container or image IDs, so your script will remove everything from the diff folder. I could purge everything from every Docker folder manually, but I don't know how we can do this cleanup with Docker >= 1.10.

@xiaods
Contributor

xiaods commented Sep 21, 2016

Is this bug out of date? Can anyone handle it?

@LK4D4
Contributor

LK4D4 commented Sep 21, 2016

I think it's a duplicate of another bug from @cpuguy83.

@vdemeester
Member

Given the activity level on this issue, I'm going to close it, as it's either fixed, a duplicate, or no longer a request. If you think I'm mistaken, feel free to discuss it here 😉
