Shashank Tripathi

I am a PhD student (2021-) at the Max Planck Institute for Intelligent Systems, where I am advised by MPI Director Michael Black. Earlier, I worked as an Applied Scientist at Amazon (2019-2021). I earned my Master's degree (2017-2019) from the Robotics Institute, Carnegie Mellon University, working with Prof. Kris Kitani. I am a recipient of the 2023 Meta Research PhD Fellowship.

At Amazon Lab126, I closely collaborated with Prof. James Rehg, Dr. Amit Agrawal and Dr. Ambrish Tyagi. In 2023, I also spent time at Epic Games as a research intern working with Dr. Carsten Stoll, Dr. Christoph Lassner and Dr. Daniel Holden. It has been my great fortune to have worked with excellent mentors and advisors.

Email / Google Scholar / GitHub / LinkedIn / Twitter / CV

Publications / Patents / Misc

Research

My research lies at the intersection of machine learning, computer vision, and computer graphics. Specifically, I am interested in 3D modeling of human bodies, modeling human-object interactions, and physics-inspired human motion understanding. In the past, I have worked on synthetic data for applications like object detection and human pose estimation from limited supervision.

Before diving into human body research, I dabbled in visual servoing, medical image analysis, pedestrian detection, and reinforcement learning.

Publications

DECO: Dense Estimation of 3D Human-Scene COntact in the Wild
Shashank Tripathi, Agniv Chatterjee, Jean-Claude Passy, Hongwei Yi, Dimitrios Tzionas, Michael J. Black
International Conference on Computer Vision (ICCV) 2023
(Oral presentation)

DECO estimates dense vertex-level 3D human-scene and human-object contact across the full body mesh and works on diverse and challenging human-object interactions in arbitrary in-the-wild images. DECO is trained on DAMON, a new and unique dataset with 3D contact annotations for in-the-wild images, manually annotated using a custom 3D contact labeling tool.

paper | abstract | project | dataset | video | bibtex | poster

EMOTE: Emotional Speech-Driven Animation with Content-Emotion Disentanglement
Radek Danecek, Kiran Chhatre, Shashank Tripathi, Yandong Wen, Michael J. Black, Timo Bolkart
SIGGRAPH ASIA 2023

Given audio input and an emotion label, EMOTE generates an animated 3D head that has state-of-the-art lip synchronization while expressing the emotion. The method is trained from 2D video sequences using a novel video emotion loss and a mechanism to disentangle emotion from speech.

paper | abstract | project | bibtex

3D Human Pose Estimation via Intuitive Physics
Shashank Tripathi, Lea Müller, Chun-Hao P. Huang, Omid Taheri, Michael Black, Dimitrios Tzionas
Computer Vision and Pattern Recognition (CVPR) 2023

IPMAN estimates a 3D body from a color image in a "stable" configuration by encouraging plausible floor contact and overlap between the center of pressure (CoP) and the center of mass (CoM). It exploits interpenetration of the body mesh with the ground plane as a heuristic for pressure.

paper | abstract | project | dataset | video | bibtex | poster

BITE: Beyond Priors for Improved Three-D Dog Pose Estimation
Nadine Rüegg, Shashank Tripathi, Konrad Schindler, Michael J. Black, Silvia Zuffi
Computer Vision and Pattern Recognition (CVPR) 2023

BITE enables 3D shape and pose estimation of dogs from a single input image. The model handles a wide range of shapes and breeds, as well as challenging postures far from the available training poses, like sitting or lying on the ground.

paper | abstract | project | video | bibtex

MIME: Human-Aware 3D Scene Generation
Hongwei Yi, Chun-Hao P. Huang, Shashank Tripathi, Lea Hering, Justus Thies, Michael J. Black
Computer Vision and Pattern Recognition (CVPR) 2023

MIME takes 3D human motion capture and generates plausible 3D scenes that are consistent with the motion. Why? Most mocap sessions capture the person but not the scene.

paper | abstract | project | video | bibtex

PERI: Part Aware Emotion Recognition in the Wild
Akshita Mittel, Shashank Tripathi
European Conference on Computer Vision Workshops (ECCVW) 2022

An in-the-wild emotion recognition network that leverages both body pose and facial landmarks using a novel part-aware spatial (PAS) image representation and context infusion (Cont-In) blocks.

paper | abstract

Occluded Human Mesh Recovery
Rawal Khirodkar, Shashank Tripathi, Kris Kitani
Computer Vision and Pattern Recognition (CVPR) 2022

A novel top-down mesh recovery architecture capable of leveraging image spatial context for handling multi-person occlusion and crowding.

paper | abstract | project

AGORA: Avatars in Geography Optimized for Regression Analysis
Priyanka Patel, Chun-Hao P. Huang, Joachim Tesch, David T. Hoffman, Shashank Tripathi and Michael J. Black
Computer Vision and Pattern Recognition (CVPR) 2021

A synthetic dataset with high realism and highly accurate ground truth, containing 4240 textured scans and SMPL-X fits.

paper | abstract | project | video

PoseNet3D: Learning Temporally Consistent 3D Human Pose via Knowledge Distillation
Shashank Tripathi, Siddhant Ranade, Ambrish Tyagi and Amit Agrawal
International Conference on 3D Vision (3DV), 2020
(Oral presentation)

Temporally consistent recovery of 3D human pose from 2D joints without using 3D data in any form

paper | abstract | videos

Learning to Generate Synthetic Data via Compositing
Shashank Tripathi, Siddhartha Chandra, Amit Agrawal, Ambrish Tyagi, James Rehg and Visesh Chari
Computer Vision and Pattern Recognition (CVPR) 2019

Efficient, task-aware and realistic synthesis of composite images for training classification and object detection models

paper | abstract | poster

C2F: Coarse-to-Fine Vision Control System for Automated Microassembly
Shashank Tripathi, Devesh Jain and Himanshu Dutt Sharma
Nanotechnology and Nanoscience-Asia 2018

Automated, visual-servoing-based closed-loop system for performing 3D micromanipulation and microassembly tasks

paper | abstract | video

Sub-cortical Shape Morphology and Voxel-based Features for Alzheimer's Disease Classification
Shashank Tripathi, Seyed Hossein Nozadi, Mahsa Shakeri and Samuel Kadoury
IEEE International Symposium on Biomedical Imaging (ISBI) 2017

Alzheimer's disease patient classification using a combination of grey-matter voxel-based intensity variations and 3D structural (shape) features extracted from MRI brain scans

paper | abstract | poster

Deep Spectral-Based Shape Features for Alzheimer’s Disease Classification
Mahsa Shakeri, Hervé Lombaert, Shashank Tripathi and Samuel Kadoury
MICCAI Spectral and Shape Analysis in Medical Imaging (SeSAMI) 2016

Alzheimer's disease classification using a deep variational auto-encoder on shape-based features

paper | abstract

Patents

Generation of synthetic image data using three-dimensional models

Generation of synthetic image data for computer vision models

Miscellaneous

Some other unpublished work:

Learning Salient Objects in a Scene using Superpixel-augmented Convolutional Neural Networks

Moving object detection, tracking and classification from an unsteady camera

Towards integrating model dynamics for sample efficient reinforcement learning


adapted from Jon Barron's awesome webpage