ADD: method to minify containers gently and in-place #258
Codecov Report
@@ Coverage Diff @@
## master #258 +/- ##
==========================================
- Coverage 74.58% 74.55% -0.03%
==========================================
Files 33 35 +2
Lines 1853 1985 +132
Branches 241 260 +19
==========================================
+ Hits 1382 1480 +98
- Misses 371 395 +24
- Partials 100 110 +10
|
Hey @kaczmarj ! This looks awesome! Thanks so much! Quick question/confirmation: For my container to be minified, is there any advantage to installing/not installing reprozip in the original Will let you know how it goes :) |
Note 1: _trace.sh isn't currently included in the package, so I had to download it to the correct location manually. Note 2: When running, I get the following error immediately as the tool is being launched:
However, when I run the command |
@gkiar -
The script
Thanks for catching this! I fixed this in the
Is there more output above the trace that you pasted? The exact error could be in that output. For reference, the code that runs |
1- noted; great, thanks! If this all works, I think it'd be really nice (and you may have already done this) to set up a |
Hey @kaczmarj - quick update. The execution completed, but in the prune stages it seemed to collapse a bit. The following is the error I got from
|
@gkiar - which version of python are you using? it could be a unicode issue if you're using python2. if you're using python2, would you mind editing |
@gkiar - actually I don't think that will work. i've pushed a potential fix, so if you could please reinstall. i still suspect this is a python2 vs python3 issue. i wrote this with only python3 in mind. but please do let me know if you are in fact using python 3. |
Hey @kaczmarj - I am using Python |
Hmmm... have you tried the version with my most recent commit? |
Missing bracket on
|
But after fixing the typo, I still had an error. I think the issue is actually an index issue with the list of libraries you're trying to remove?
|
OK thank you. sorry for all the back and forth. let me create some paths with non-ascii characters in them and debug... it seems like it's expecting ascii but gets something else. i'll get back to you soon @gkiar |
btw the command I ran that last time was literally |
@gkiar - i think i was able to reproduce your issue, and i think i have fixed it. can you please reinstall and try again? the error before was that a file that is saved in the container with all of the files to be deleted was saved as ascii instead of utf-8. that's what i think at least. the commit that fixes this is 92848df. fyi in commit 2a2126b i added some protections to prevent users from pruning mounted directories. |
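For readers following along, the ascii-versus-utf-8 problem described above can be illustrated with a minimal sketch. It is not the code from commit 92848df; the function names, file name, and example paths are hypothetical. The point is only that passing `encoding="utf-8"` explicitly avoids depending on the container's locale.

```python
import os
import tempfile


def write_prune_list(paths, filename):
    """Write one path per line, explicitly encoded as UTF-8 (hypothetical helper)."""
    # Without encoding=..., open() typically falls back to the locale's
    # preferred encoding, which may be ASCII in a minimal container.
    with open(filename, "w", encoding="utf-8") as f:
        for path in paths:
            f.write(path + "\n")


def read_prune_list(filename):
    """Read the list back using the same explicit encoding."""
    with open(filename, "r", encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]


# Made-up paths containing non-ASCII characters.
paths = ["/opt/data/café.txt", "/opt/data/naïve/файл.dat"]

with tempfile.TemporaryDirectory() as tmp:
    list_file = os.path.join(tmp, "to_delete.list")
    write_prune_list(paths, list_file)
    roundtrip = read_prune_list(list_file)
```

Reading and writing with the same explicit encoding keeps the list identical on both sides, whatever `LANG` is set to inside the container.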
Hey @kaczmarj - unfortunately, an error persists:
|
ok, at least it's a different error :) is this container something you can share with me? i'd like to try to reproduce this myself |
Yep! Command I'm using to launch the container:
Commands I'm using to minify:
|
Ah! There are paths in the list of files to prune that have wonky characters, specifically in the directory:
I won't paste the filenames here because I have no idea whether there are security implications... But these filenames have characters that python does not want to write to file. I accounted for this in the most recent commit (4c8c177). |
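A sketch of one common way to handle file names that Python refuses to write, assuming the names contain bytes that are not valid UTF-8. This is not necessarily what commit 4c8c177 does, and the example path is made up. The `surrogateescape` error handler lets such names round-trip between `bytes` and `str` without loss.

```python
# A made-up path whose name contains a byte (0xff) that is invalid in UTF-8.
raw = b"/opt/pkg/bad\xffname"

# Decoding with surrogateescape maps the invalid byte to a lone surrogate
# instead of raising; this mirrors how Python decodes file names on POSIX.
path = raw.decode("utf-8", errors="surrogateescape")

# A strict encode (the default when writing text to a file) rejects the
# lone surrogate.
try:
    path.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Encoding with the same error handler restores the original bytes exactly.
encoded = path.encode("utf-8", errors="surrogateescape")
```

In practice this would mean opening the list file with `errors="surrogateescape"` in addition to `encoding="utf-8"`, or writing the paths as bytes.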
Progress! The container was indeed minified a LOT, which is awesome! When I save it, however, there are two notable missing pieces which I'm not entirely sure can be preserved (so at least deserve to be listed as a disclaimer, if not fixed), which are a) the entrypoint , and b) environment variables. In cases like FSL where binaries are placed in a non-standard location, and output file type is determined from a variable, it would be very useful if these could be preserved somehow. Maybe a flag in Thanks, @kaczmarj ! |
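One possible way to carry the entrypoint and environment variables over, sketched under assumptions: read the `Config` section of the original image (as reported by `docker inspect`, or via `client.images.get(name).attrs["Config"]` in docker-py) and generate a small Dockerfile on top of the minified image. The function name, image tag, and config values below are hypothetical, and nothing in this PR is known to work this way.

```python
import json


def bootstrap_dockerfile(minified_image, config):
    """Build Dockerfile text that restores Env and Entrypoint (hypothetical helper).

    `config` is assumed to look like the "Config" section of `docker inspect`:
    "Env" is a list of "KEY=value" strings, "Entrypoint" is a list or None.
    """
    lines = ["FROM {}".format(minified_image)]
    for item in config.get("Env") or []:
        key, _, value = item.partition("=")
        # Quote the value as a JSON string. Caveat: a literal "$" in the
        # value would still be subject to Dockerfile variable substitution.
        lines.append("ENV {}={}".format(key, json.dumps(value, ensure_ascii=False)))
    entrypoint = config.get("Entrypoint")
    if entrypoint:
        # The exec form of ENTRYPOINT is a JSON array.
        lines.append("ENTRYPOINT {}".format(json.dumps(entrypoint)))
    return "\n".join(lines) + "\n"


# Illustrative values only, loosely modelled on an FSL-style image.
config = {
    "Env": ["FSLDIR=/opt/fsl", "FSLOUTPUTTYPE=NIFTI_GZ"],
    "Entrypoint": ["/entrypoint.sh"],
}
dockerfile = bootstrap_dockerfile("minified:latest", config)
```

Building the generated file with `docker build` would give an image that has the minified filesystem plus the original variables and entrypoint.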
Yes, |
Gently? Yes, this command allows the user to choose which directories to prune, which makes the minified image still usable interactively (in most cases). The fmriprep image, for example, likely contains many files, let's say in `/opt`, that are never touched by any part of their software. This PR would allow one to minify the fmriprep container by removing files only in `/opt` that are not caught by reprozip. The rest of the container is untouched, so it will still be usable interactively and for other purposes. Of course, this does not offer the highest degree of minimization.

Example of use:
Start up the container. Be sure to add `--security-opt=seccomp:unconfined` or `--cap-add SYS_PTRACE` to the `docker run` call. Only the latter is necessary on the most recent Docker CE.

In another terminal window, run `ndminify`, given the name of the running container, the commands you want to minify and the directories you want to prune.

This will run the commands and then display all of the files that will be deleted in the container. BEWARE that data loss is possible. If you have mounted directories onto the container and try to prune those directories, the files within those mounted directories will be irreversibly REMOVED. Please exercise extreme caution when trying out this feature, and to be safe, if you need to mount directories, mount them as read-only.
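As an illustration of the kind of guard that could protect mounted directories, here is a minimal sketch. It is not the check added in commit 2a2126b, and all names and paths are hypothetical. It flags any directory to prune that lies inside a mount point or contains one. With docker-py, the mount destinations could presumably be taken from the `Destination` entries in `container.attrs["Mounts"]`.

```python
import posixpath


def _is_within(path, parent):
    """Return True if `path` equals `parent` or lies underneath it."""
    path = posixpath.normpath(path)
    parent = posixpath.normpath(parent)
    if parent == "/":
        return True
    return path == parent or path.startswith(parent + "/")


def unsafe_prune_dirs(prune_dirs, mount_destinations):
    """List the prune directories that overlap a mount point (hypothetical helper)."""
    unsafe = []
    for directory in prune_dirs:
        for mount in mount_destinations:
            # Unsafe if the directory is inside a mount, or a mount is inside it.
            if _is_within(directory, mount) or _is_within(mount, directory):
                unsafe.append(directory)
                break
    return unsafe


# Made-up example: /data and /opt/licenses are mounted from the host.
mounts = ["/data", "/opt/licenses"]
flagged = unsafe_prune_dirs(["/opt", "/usr/local", "/data/sub"], mounts)
```

A tool could refuse to continue when the returned list is non-empty, since pruning any of those directories would delete files on the host.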
If you choose to proceed, the files will be removed from the container. Then, create a new image using that container's minimized filesystem. The last line of the output of the command will explain how to do that. But in the process of creating the new image, the environment / metadata is lost. So for now, create the new image, and if you need environment variables set, I suggest creating a new `Dockerfile` that bootstraps the minified image and sets the appropriate variables.

The `ndminify` command-line is temporary to make debugging easier. It will end up being a sub-command of the main `neurodocker` command-line program.

cc: @gkiar
To try this out, install this branch of the project, and also install the `docker-py` Python package: