Official PyTorch implementation of Behavior-Driven Synthesis of Human Dynamics.
arXiv | Project Page | BibTeX
Andreas Blattmann*,
Timo Milbich*,
Michael Dorkenwald*,
Björn Ommer,
CVPR 2021
* equal contribution
TL;DR: Our approach for human behavior transfer: given a source sequence of human dynamics, our model infers a behavior encoding that is independent of posture. We can re-enact the behavior by combining it with an unrelated target posture and thus control the synthesis process. The resulting sequence is combined with an appearance to synthesize a video sequence.
After cloning the repository, a suitable conda environment named `behavior_driven_synthesis` can be created and activated with:
$ cd behavior-driven-video-synthesis
$ conda env create -f environment.yaml
$ conda activate behavior_driven_synthesis
The Human3.6m dataset is the main dataset for evaluating the capabilities of our model. Prior to downloading the data, you'll have to create an account. As soon as this is done, download the `.tar.gz` archives containing the videos and 3d pose annotations for each subject. You don't need to download all 3d pose annotations, only the ones named `D3_Positions_mono_universal`.
The root directory of the downloaded data will hereafter be referred to as `<DATADIR_H36M>`. In this directory, create a folder `archives`, save all the downloaded archives therein, and execute the extraction and processing scripts from the root of the repository:
$ python -m data.extract_tars --datadir <DATADIR_H36M>
# ... the script creates a subdirectory 'extracted' for the extracted archives ...
$ python -m data.process --datadir <DATADIR_H36M>
After that, the `archives` and `extracted` directories can be deleted. The data-containing directory `<DATADIR_H36M>` should then have the following structure:
$<DATADIR_H36M>
├── processed
│   ├── all
│   │   ├── S1                          # <subject_id>
│   │   │   ├── Directions-1            # <action_id>-<subaction_id>
│   │   │   │   ├── ImageSequence
│   │   │   │   │   ├── 54138969        # <camera_id>
│   │   │   │   │   │   ├── img_000001.jpg
│   │   │   │   │   │   ├── img_000002.jpg
│   │   │   │   │   │   ├── img_000003.jpg
│   │   │   │   │   │   ├── ...
│   │   │   │   │   ├── 55011271
│   │   │   │   │   │   ├── ...
│   │   │   ├── Directions-2
│   │   │   │   ├── ...
│   │   ├── S2
│   │   │   ├── ...
│   │   ├── ...
│   │   ├── ...
After that, download and extract the preprocessed annotation file to `<DATADIR_H36M>/processed/all`.
Download the archives `deepfashion.tar.gz` and `market.tar.gz` from here and unpack the datasets into two distinct directories (they will later be referred to as `<DATADIR_DEEPFASHION>` and `<DATADIR_MARKET>`).
To train our final behavior model from scratch, adjust the sub-field `data: datapath` in the accompanying file `config/behavior_net.yaml` such that it contains the path to `<DATADIR_H36M>`; otherwise, the data will not be found. Apart from this, you can change the name of your run by adapting the field `general: project_name`. Lastly, all logs, configs and checkpoints will be stored in the path specified in `general: base_dir`, which is by default the root of the cloned repository. We recommend using the same `base_dir` for all behavior models.
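For orientation, a minimal excerpt of the relevant fields could look like the sketch below; the path and run name are placeholders, and all other fields of `config/behavior_net.yaml` stay unchanged:

```yaml
general:
  project_name: my_behavior_run     # placeholder: any run name you like
  base_dir: .                       # default: root of the cloned repository
data:
  datapath: /path/to/DATADIR_H36M   # placeholder: replace with your <DATADIR_H36M>
```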
After adjusting the configuration file, you can start a training run via
$ python main.py --config config/behavior_net.yaml --gpu <gpu_id>
This will train our presented cVAE model in a first stage, prior to optimizing the parameters of the proposed normalizing flow model.
If you intend to use a pretrained cVAE model and train additional normalizing flow models, you can simply set the field `general: project_name` to the `project_name` of the pretrained cVAE and enable flow training via
$ python main.py --config config/behavior_net.yaml --gpu <gpu_id> --flow
To resume a cVAE model from the latest checkpoint, again specify the `project_name` of the run to restart and use
$ python main.py --config config/behavior_net.yaml --gpu <gpu_id> --restart
Depending on the dataset you want to use for training, some fields of the configuration file `config/shape_and_pose_net.yaml` have to be adjusted according to the following table:
Field Name | Human3.6m | DeepFashion | Market1501 |
---|---|---|---|
`data: dataset` | `Human3.6m` | `DeepFashion` | `Market` |
`data: datapath` | `<DATADIR_H36M>` | `<DATADIR_DEEPFASHION>` | `<DATADIR_MARKET>` |
`data: inplane_normalize` | `False` | `True` | `True` |
`data: spatial_size` | `256` | `256` | `128` |
`data: bottleneck_factor` | `2` | `2` | `1` |
`data: box` | `2` | `2` | `1` |
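As an illustration, training on Market1501 corresponds to a `data` block like the following sketch; the values are taken from the table above, the datapath is a placeholder, and all other fields remain untouched:

```yaml
data:
  dataset: Market
  datapath: /path/to/DATADIR_MARKET   # placeholder: replace with your <DATADIR_MARKET>
  inplane_normalize: True
  spatial_size: 128
  bottleneck_factor: 1
  box: 1
```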
After that, training can be started via
$ python main.py --config config/shape_and_pose_net.yaml --gpu <gpu_id>
Similar to the behavior model, a training run can be resumed by changing the value of `general: project_name` to the name of this run and then using
$ python main.py --config config/shape_and_pose_net.yaml --gpu <gpu_id> --restart
The weights of all our pretrained final models can be downloaded from this link. Save the checkpoints together with the respective hyperparameters (which are contained in the files `config.yaml`) in a separate directory for each model. Evaluation can then be started via the command
$ python main.py --pretrained_model <path_to_model_directory> --gpu <gpu_id> --config <config_file>
where `<config_file>` is `config/behavior_net.yaml` for the pretrained behavior model and `config/shape_and_pose_net.yaml` for one of the pretrained shape-and-posture models.
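Each model directory is thus expected to look roughly like the sketch below; the checkpoint filename is only an illustration and depends on the downloaded archive:

```
<path_to_model_directory>
├── config.yaml         # hyperparameters shipped with the pretrained model
└── <checkpoint file>   # the downloaded model weights
```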
To evaluate a model which was trained from scratch, simply set the field `project_name` in the respective `<config_file>` to the name of the model to be evaluated (similar to the procedure for resuming training) and start evaluation via
$ python main.py --gpu <gpu_id> --config <config_file> --mode infer
where `<config_file>` is again `config/behavior_net.yaml` for a behavior model and `config/shape_and_pose_net.yaml` for a shape-and-posture model.
@misc{blattmann2021behaviordriven,
title={Behavior-Driven Synthesis of Human Dynamics},
author={Andreas Blattmann and Timo Milbich and Michael Dorkenwald and Björn Ommer},
year={2021},
eprint={2103.04677},
archivePrefix={arXiv},
primaryClass={cs.CV}
}