And benchmarking it with respect to OpenPifPaf

Photo by ThisIsEngineering from Pexels

Abstract. NVIDIA recently announced the availability of a 2D Body Pose Estimation model with Transfer Learning Toolkit 3.0. In this article, we provide a detailed tutorial for training and optimizing the model. We also provide benchmarking results against OpenPifPaf, another widely used open-source model for perception tasks, so that you can decide which model to use and when.

In this section we go through all the steps necessary for training and optimizing the Body Pose Net model. The model is trained with TAO Toolkit, which uses pre-trained models and a custom dataset to build new AI models [1].

The first step is to install and run TAO Toolkit. Follow the steps in this guide:

TAO Toolkit Quick Start Guide — TAO Toolkit 3.0 documentation

After logging in to the NGC Docker registry, run the following commands:

mkdir Programs
cd Programs/
wget -O && unzip -o && chmod u+x ngc
mkdir ngccli_linux
cd ngccli_linux/
chmod u+x ngc
echo "export PATH=\"\$PATH:$(pwd)\"" >> ~/.bash_profile && source ~/.bash_profile
cd ..
ngc config set
ngc registry model list
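
Before moving on, it can help to confirm that the tools installed so far are visible on the PATH. The snippet below is a minimal sketch of such a check, not part of the official setup. The helper name `missing_tools` is ours, and the list of tool names is an assumption you should adjust to your machine.

```python
import shutil


def missing_tools(names):
    """Return the tool names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumed tool list for this tutorial: the NGC CLI and Docker.
    missing = missing_tools(["ngc", "docker"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All tools found.")
```

If `ngc` is reported as missing, the PATH line added to `~/.bash_profile` has most likely not been sourced in the current shell.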

We used a conda virtual environment (to install Anaconda, follow the steps in this link). To create and activate the virtual environment, run the following two commands:

conda create -n 'env_name' python=3.7 
conda activate 'env_name'

The Toolkit is now ready to use, and preparation for training can continue. The mandatory steps are explained on this page. The first step is environment setup: download the latest samples (which are used to set up the config files for training), set the environment variables, create the mount file, and finally install the required dependencies. The next step is to download the pre-trained model.
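
The mount file created during environment setup tells the launcher which host directories to expose inside the container. The snippet below is a rough sketch of how such a file could be generated. The helper `write_mounts_file` and all paths are hypothetical, and the example writes to a temporary directory so that nothing in your home directory is overwritten. To our knowledge the launcher reads the file from the home directory (`~/.tlt_mounts.json` for TLT 3.0, `~/.tao_mounts.json` for TAO), but check the launcher documentation for your version.

```python
import json
import os
import tempfile


def write_mounts_file(path, mounts):
    """Write a mounts file mapping host directories to container directories.

    `mounts` is a dict of {host_path: container_path}.
    """
    drive_map = {
        "Mounts": [
            {"source": os.path.abspath(src), "destination": dst}
            for src, dst in mounts.items()
        ]
    }
    with open(path, "w") as f:
        json.dump(drive_map, f, indent=4)
    return drive_map


# Hypothetical directories; replace them with your own experiment and data paths.
demo_dir = tempfile.mkdtemp()
demo_file = os.path.join(demo_dir, "mounts.json")
written = write_mounts_file(
    demo_file,
    {os.path.join(demo_dir, "bpnet"): "/workspace/tlt-experiments/bpnet"},
)
print(json.dumps(written, indent=4))
```

Paths passed to the training commands then refer to the `destination` side of each mapping, since the commands run inside the container.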

The model is trained on the COCO dataset, which can be downloaded from this link.

It should be organized like this:


The dataset needs to be prepared for training, so segmentation masks are generated along with tfrecords. Detailed explanations of the config files and commands used are available here.
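
A quick check of the dataset layout before running the preparation step can be sketched as follows. This is only an illustration: the helper `missing_entries` is ours, and the expected folder names are an assumption based on what the COCO 2017 archives unpack to, so adjust the list to the layout your config files actually reference.

```python
import os
import tempfile


def missing_entries(root, expected):
    """Return the expected sub-paths that do not exist under `root`."""
    return [rel for rel in expected if not os.path.exists(os.path.join(root, rel))]


# Assumed layout: the folder names produced by unpacking the COCO 2017 archives.
EXPECTED = ["annotations", "train2017", "val2017"]

# Demo on a throwaway directory that is deliberately missing val2017.
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, "annotations"))
os.makedirs(os.path.join(demo_root, "train2017"))
print(missing_entries(demo_root, EXPECTED))  # → ['val2017']
```

Running the same function on your real dataset root should return an empty list before you generate the masks and tfrecords.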

The next step is to configure the spec file for training, which contains six components: Trainer, Dataloader, Augmentation, Label Processor, Model and Optimizer. We…
