Are Pose Estimators Ready for the Open World? STAGE: Synthetic Data Generation Toolkit for Auditing 3D Human Pose Estimators

1University of Tübingen, 2Tübingen AI Center, 3Bosch Center of AI, 4Max Planck Institute for Informatics, Saarland Informatics Campus
Image

STAGE allows you to create custom benchmarks to stress test your pose estimator.

Abstract

The estimation of 3D human poses from images has progressed tremendously over the last few years, as measured on standard benchmarks. However, performance in the open world remains underexplored, because current benchmarks cannot capture its full extent. Especially in safety-critical systems, it is crucial that 3D pose estimators are audited before deployment and that their sensitivity to individual factors or attributes occurring in the operational domain is thoroughly examined. Nevertheless, we currently lack a benchmark that would enable such fine-grained analysis. We thus present STAGE, a GenAI data toolkit for auditing 3D human pose estimators. We enable a text-to-image model to control the 3D human body pose in the generated image. This allows us to create customized annotated data covering a wide range of open-world attributes. We leverage STAGE to generate a series of benchmarks that audit the sensitivity of popular pose estimators to attributes such as gender, ethnicity, age, clothing, location, and weather. Our results show that the presence of such naturally occurring attributes can cause severe degradation in the performance of pose estimators, and they lead us to question whether these estimators are ready for open-world deployment.
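As an illustration of what auditing sensitivity to a single attribute can look like, the sketch below compares an estimator's error on a base image set with its error on a set that shares the same ground-truth poses but differs in one attribute. This is a minimal sketch under stated assumptions, not the evaluation protocol used by STAGE: the choice of mean per-joint position error (MPJPE) as the metric, the function names `mpjpe` and `attribute_sensitivity`, and the toy data are all hypothetical.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error for pose arrays of shape (N, J, 3).

    Assumption: MPJPE is used here only as a common, simple metric;
    the page does not state which metric the benchmarks report.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    # Euclidean distance per joint, averaged over joints and samples.
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def attribute_sensitivity(pred_base, pred_attr, gt):
    """Error increase when only the attribute changes and the pose stays fixed.

    Hypothetical helper: both prediction sets are compared against the
    same ground-truth poses, so a positive value means the attribute
    made the estimator worse.
    """
    return mpjpe(pred_attr, gt) - mpjpe(pred_base, gt)


# Toy data: 2 images, 3 joints each, ground truth at the origin.
gt = np.zeros((2, 3, 3))
pred_base = gt + np.array([3.0, 4.0, 0.0])  # every joint is off by 5 units
pred_attr = gt + np.array([6.0, 8.0, 0.0])  # every joint is off by 10 units

print(attribute_sensitivity(pred_base, pred_attr, gt))  # 5.0
```

Because the generated images share the ground-truth pose, any difference between the two error values in this sketch can be attributed to the changed attribute rather than to a change in pose difficulty.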

Diverse Generation with 3D Pose Control

Images generated via STAGE. We can generate images of people with different body shapes and appearances, in different locations, that are well aligned with the given 3D ground-truth pose.

Image

Evaluation Results

Examined Estimators

Sensitivity towards clothing

Sensitivity towards clothing texture

Sensitivity towards outdoor locations

Sensitivity towards indoor locations

Sensitivity towards protected attributes

Sensitivity towards weather and lighting

Acknowledgement

We thank Riccardo Marin for proofreading and the whole RVH team for their support. Nikita Kister was supported by the Bosch Industry on Campus Lab at the University of Tübingen. Nikita Kister thanks the European Laboratory for Learning and Intelligent Systems (ELLIS) PhD program for support. István Sárándi and Gerard Pons-Moll were supported by the German Federal Ministry of Education and Research (BMBF): Tübingen AI Center, FKZ: 01IS18039A, and by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) -- 409792180 (Emmy Noether Programme, project: Real Virtual Humans). Gerard Pons-Moll is a member of the Machine Learning Cluster of Excellence, EXC number 2064/1 -- Project number 390727645, and is supported by the Carl Zeiss Foundation.

BibTeX

@article{stage,
  author    = {Kister, Nikita and Sárándi, István and Khoreva, Anna and Pons-Moll, Gerard},
  title     = {Are Pose Estimators Ready for the Open World? STAGE: Synthetic Data Generation Toolkit for Auditing 3D Human Pose Estimators},
  journal   = {arXiv},
  year      = {2024},
}