Shashank Tripathi

I am a PhD student (2021-) at the Max Planck Institute for Intelligent Systems, where I am advised by MPI Director Michael Black. Earlier, I worked as an Applied Scientist at Amazon (2019-2021). I earned my Master's (2017-2019) from the Robotics Institute, Carnegie Mellon University, working with Prof. Kris Kitani. I received the Meta Research PhD Fellowship in 2023.

At Amazon Lab126, I collaborated closely with Prof. James Rehg, Dr. Amit Agrawal and Dr. Ambrish Tyagi. In 2023, I also spent time at Epic Games as a research intern working with Dr. Carsten Stoll, Dr. Christoph Lassner and Dr. Daniel Holden. It has been my great fortune to have worked with excellent mentors and advisors.

Email / Google Scholar / GitHub / LinkedIn / Twitter / CV

Publications / Patents / Misc


My research lies at the intersection of machine learning, computer vision and computer graphics. Specifically, I am interested in 3D modeling of human bodies, modeling human-object interactions and physics-inspired human motion understanding. In the past, I have worked on synthetic data for applications like object detection and human pose estimation from limited supervision.

Before diving into human body research, I dabbled in visual servoing, medical image analysis, pedestrian detection and reinforcement learning.


DECO: Dense Estimation of 3D Human-Scene COntact in the Wild
Shashank Tripathi, Agniv Chatterjee, Jean-Claude Passy, Hongwei Yi, Dimitrios Tzionas, Michael J. Black
International Conference on Computer Vision (ICCV) 2023
(Oral presentation)

DECO estimates dense vertex-level 3D human-scene and human-object contact across the full body mesh and works on diverse and challenging human-object interactions in arbitrary in-the-wild images. DECO is trained on DAMON, a new and unique dataset with 3D contact annotations for in-the-wild images, manually annotated using a custom 3D contact labeling tool.

paper | abstract | project | dataset | video | bibtex | poster

EMOTE: Emotional Speech-Driven Animation with Content-Emotion Disentanglement
Radek Danecek, Kiran Chhatre, Shashank Tripathi, Yandong Wen, Michael J. Black, Timo Bolkart

Given audio input and an emotion label, EMOTE generates an animated 3D head that has state-of-the-art lip synchronization while expressing the emotion. The method is trained from 2D video sequences using a novel video emotion loss and a mechanism to disentangle emotion from speech.

paper | abstract | project | bibtex

3D Human Pose Estimation via Intuitive Physics
Shashank Tripathi, Lea Müller, Chun-Hao P. Huang, Omid Taheri, Michael J. Black, Dimitrios Tzionas
Computer Vision and Pattern Recognition (CVPR) 2023

IPMAN estimates a 3D body from a color image in a "stable" configuration by encouraging plausible floor contact and an overlapping center of pressure (CoP) and center of mass (CoM). It exploits interpenetration of the body mesh with the ground plane as a heuristic for pressure.

paper | abstract | project | dataset | video | bibtex | poster

BITE: Beyond Priors for Improved Three-D Dog Pose Estimation
Nadine Rüegg, Shashank Tripathi, Konrad Schindler, Michael J. Black, Silvia Zuffi
Computer Vision and Pattern Recognition (CVPR) 2023

BITE enables 3D shape and pose estimation of dogs from a single input image. The model handles a wide range of shapes and breeds, as well as challenging postures far from the available training poses, like sitting or lying on the ground.

paper | abstract | project | video | bibtex

MIME: Human-Aware 3D Scene Generation
Hongwei Yi, Chun-Hao P. Huang, Shashank Tripathi, Lea Hering, Justus Thies, Michael J. Black
Computer Vision and Pattern Recognition (CVPR) 2023

MIME takes 3D human motion capture and generates plausible 3D scenes that are consistent with the motion. Why? Most mocap sessions capture the person but not the scene.

paper | abstract | project | video | bibtex

PERI: Part Aware Emotion Recognition in the Wild
Akshita Mittel, Shashank Tripathi
European Conference on Computer Vision Workshops (ECCVW) 2022

An in-the-wild emotion recognition network that leverages both body pose and facial landmarks using a novel part-aware spatial (PAS) image representation and context infusion (Cont-In) blocks.

paper | abstract

Occluded Human Mesh Recovery
Rawal Khirodkar, Shashank Tripathi, Kris Kitani
Computer Vision and Pattern Recognition (CVPR) 2022

A novel top-down mesh recovery architecture capable of leveraging image spatial context for handling multi-person occlusion and crowding.

paper | abstract | project

AGORA: Avatars in Geography Optimized for Regression Analysis
Priyanka Patel, Chun-Hao P. Huang, Joachim Tesch, David T. Hoffmann, Shashank Tripathi and Michael J. Black
Computer Vision and Pattern Recognition (CVPR) 2021

A synthetic dataset with high realism and highly accurate ground truth containing 4240 textured scans and SMPL-X fits.

paper | abstract | project | video

PoseNet3D: Learning Temporally Consistent 3D Human Pose via Knowledge Distillation
Shashank Tripathi, Siddhant Ranade, Ambrish Tyagi and Amit Agrawal
International Conference on 3D Vision (3DV) 2020
(Oral presentation)

Temporally consistent recovery of 3D human pose from 2D joints without using 3D data in any form

paper | abstract | videos

Learning to Generate Synthetic Data via Compositing
Shashank Tripathi, Siddhartha Chandra, Amit Agrawal, Ambrish Tyagi, James Rehg and Visesh Chari
Computer Vision and Pattern Recognition (CVPR) 2019

Efficient, task-aware and realistic synthesis of composite images for training classification and object detection models

paper | abstract | poster

C2F: Coarse-to-Fine Vision Control System for Automated Microassembly
Shashank Tripathi, Devesh Jain and Himanshu Dutt Sharma
Nanotechnology and Nanoscience-Asia 2018

Automated, visual-servoing-based closed-loop system to perform 3D micromanipulation and microassembly tasks

paper | abstract | video

Sub-cortical Shape Morphology and Voxel-based Features for Alzheimer's Disease Classification
Shashank Tripathi, Seyed Hossein Nozadi, Mahsa Shakeri and Samuel Kadoury
IEEE International Symposium on Biomedical Imaging (ISBI) 2017

Alzheimer's disease patient classification using a combination of grey-matter voxel-based intensity variations and 3D structural (shape) features extracted from MRI brain scans

paper | abstract | poster

Deep Spectral-Based Shape Features for Alzheimer’s Disease Classification
Mahsa Shakeri, Hervé Lombaert, Shashank Tripathi and Samuel Kadoury
MICCAI Spectral and Shape Analysis in Medical Imaging (SeSAMI) 2016

Alzheimer's disease classification using a deep variational auto-encoder on shape-based features

paper | abstract


Generation of synthetic image data using three-dimensional models

Generation of synthetic image data for computer vision models


Some other unpublished work:

Learning Salient Objects in a Scene using Superpixel-augmented Convolutional Neural Networks

Moving object detection, tracking and classification from an unsteady camera

Towards integrating model dynamics for sample efficient reinforcement learning

adapted from Jon Barron's awesome webpage