📣 News
[2024/07] 🎉 Omni6DPose, the largest and most diverse universal 6D object pose estimation benchmark, gets accepted to ECCV 2024. Omni6DPose makes the application of 6D pose estimation in various downstream tasks truly feasible.
[2024/07] 🎉 One paper gets accepted to RAL.
[2024/04] 🎉 One paper gets accepted to RAL.
[2024/02] 🎉 RoboKeyGen gets accepted to ICRA 2024.
[2023/09] 🎉 GenPose gets accepted to NeurIPS 2023. We introduce an innovative category-level object pose estimation paradigm, leveraging generative modeling to effectively address the multi-hypothesis issue.
[2023/09] 🎉 GraspGF gets accepted to NeurIPS 2023.
[2023/02] 🎉 SGTAPose gets accepted to CVPR 2023, enabling online hand-eye calibration.
[2022/07] 🎉 DREDS gets accepted to ECCV 2022, closing the depth sim2real gap through physics-based depth sensor simulation.
📕 Publications (*: equal contribution)
Omni6DPose: A Benchmark and Model for Universal 6D Object Pose Estimation and Tracking
Jiyao Zhang*, Weiyao Huang*, Bo Peng*, Mingdong Wu, Fei Hu, Zijian Chen, Bo Zhao, Hao Dong
[ECCV 2024] European Conference on Computer Vision, 2024
Paper / Project Page / Bibtex / Code
@inproceedings{zhang2024omni6dpose,
title={Omni6DPose: A Benchmark and Model for Universal 6D Object Pose Estimation and Tracking},
author={Zhang, Jiyao and Huang, Weiyao and Peng, Bo and Wu, Mingdong and Hu, Fei and Chen, Zijian and Zhao, Bo and Dong, Hao},
booktitle={European Conference on Computer Vision},
year={2024},
organization={Springer}
}
We introduce Omni6DPose, a substantial dataset characterized by its diversity in object categories, large scale, and variety in object materials. To address the substantial variations and ambiguities of Omni6DPose, we introduce GenPose++, a SOTA category-level pose estimation framework.
RoboKeyGen: Robot Pose and Joint Angles Estimation via Diffusion-based 3D Keypoint Generation
Yang Tian*, Jiyao Zhang*, Guowei Huang, Bin Wang, Ping Wang, Jiangmiao Pang, Hao Dong
[ICRA 2024] IEEE International Conference on Robotics and Automation, 2024
Paper / Project Page / Bibtex / Code
@inproceedings{tian2024robokeygen,
title={RoboKeyGen: Robot Pose and Joint Angles Estimation via Diffusion-based 3D Keypoint Generation},
author={Tian, Yang and Zhang, Jiyao and Huang, Guowei and Wang, Bin and Wang, Ping and Pang, Jiangmiao and Dong, Hao},
booktitle={2024 IEEE International Conference on Robotics and Automation (ICRA)},
year={2024}
}
We present a novel framework for predicting robot pose and joint angles that splits the high-dimensional prediction task into two manageable subtasks: 2D keypoint detection and lifting 2D keypoints to 3D.
RGBGrasp: Image-based Object Grasping by Capturing Multiple Views during Robot Arm Movement with Neural Radiance Fields
Chang Liu*, Kejian Shi*, Kaichen Zhou*, Haoxiao Wang, Jiyao Zhang, Hao Dong
[RAL 2024] IEEE Robotics and Automation Letters, 2024
Paper / Project Page / Bibtex / Code
@article{liu2024rgbgrasp,
title={RGBGrasp: Image-based Object Grasping by Capturing Multiple Views during Robot Arm Movement with Neural Radiance Fields},
author={Liu, Chang and Shi, Kejian and Zhou, Kaichen and Wang, Haoxiao and Zhang, Jiyao and Dong, Hao},
journal={IEEE Robotics and Automation Letters},
year={2024},
publisher={IEEE}
}
We introduce a pioneering approach called RGBGrasp. This method depends on a limited set of RGB views to perceive the 3D surroundings containing transparent and specular objects and achieve accurate grasping.
LVDiffusor: Distilling Functional Rearrangement Priors from Large Models into Diffusor
Yiming Zeng*, Mingdong Wu*, Long Yang, Jiyao Zhang, Hao Ding, Hui Cheng, Hao Dong
[RAL 2024] IEEE Robotics and Automation Letters, 2024
Paper / Project Page / Bibtex / Code
@article{zeng2023distilling,
title={Distilling Functional Rearrangement Priors from Large Models},
author={Zeng, Yiming and Wu, Mingdong and Yang, Long and Zhang, Jiyao and Ding, Hao and Cheng, Hui and Dong, Hao},
journal={IEEE Robotics and Automation Letters},
year={2024},
publisher={IEEE}
}
We propose a novel approach that leverages large models to distill functional rearrangement priors.
GenPose: Generative Category-level Object Pose Estimation via Diffusion Models
Jiyao Zhang*, Mingdong Wu*, Hao Dong
[NeurIPS 2023] Advances in Neural Information Processing Systems, 2023
Paper / Project Page / Bibtex / Code
@article{zhang2023genpose,
title = {GenPose: Generative Category-level Object Pose Estimation via Diffusion Models},
author = {Zhang, Jiyao and Wu, Mingdong and Dong, Hao},
journal = {Advances in Neural Information Processing Systems},
year = {2023}
}
We explore a purely generative approach to the multi-hypothesis issue in category-level 6D object pose estimation. The key idea is to generate pose candidates with a score-based diffusion model, then use an energy-based diffusion model to estimate their likelihoods and filter out outliers. Aggregating the remaining candidates yields a robust, high-quality output pose.
Learning Score-based Grasping Primitive for Human-assisting Dexterous Grasping
Tianhao Wu*, Mingdong Wu*, Jiyao Zhang, Yunchong Gan, Hao Dong
[NeurIPS 2023] Advances in Neural Information Processing Systems, 2023
Paper / Project Page / Bibtex / Code
@article{wu2023learning,
title = {Learning Score-based Grasping Primitive for Human-assisting Dexterous Grasping},
author = {Wu, Tianhao and Wu, Mingdong and Zhang, Jiyao and Gan, Yunchong and Dong, Hao},
journal = {Advances in Neural Information Processing Systems},
year = {2023}
}
We propose a novel task called human-assisting dexterous grasping that aims to train a policy for controlling a robotic hand's fingers to assist users in grasping objects.
SGTAPose: Robot Structure Prior Guided Temporal Attention for Camera-to-Robot Pose Estimation from Image Sequence
Yang Tian*, Jiyao Zhang*, Zekai Yin*, Hao Dong
[CVPR 2023] Conference on Computer Vision and Pattern Recognition, 2023
Paper / Project Page / Bibtex / Code
@inproceedings{tian2023robot,
title={Robot Structure Prior Guided Temporal Attention for Camera-to-Robot Pose Estimation From Image Sequence},
author={Tian, Yang and Zhang, Jiyao and Yin, Zekai and Dong, Hao},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={8917--8926},
year={2023}
}
We propose Structure Prior Guided Temporal Attention for online Camera-to-Robot Pose estimation (SGTAPose) from successive frames of an image sequence.
Domain Randomization-Enhanced Depth Simulation and Restoration for Perceiving and Grasping Specular and Transparent Objects
Qiyu Dai*, Jiyao Zhang*, Qiwei Li, Tianhao Wu, Hao Dong, Ziyuan Liu, Ping Tan, He Wang
[ECCV 2022] European Conference on Computer Vision, 2022
Paper / Project Page / Bibtex / Code
@inproceedings{dai2022domain,
title={Domain randomization-enhanced depth simulation and restoration for perceiving and grasping specular and transparent objects},
author={Dai, Qiyu and Zhang, Jiyao and Li, Qiwei and Wu, Tianhao and Dong, Hao and Liu, Ziyuan and Tan, Ping and Wang, He},
booktitle={European Conference on Computer Vision},
pages={374--391},
year={2022},
organization={Springer}
}
We propose the Domain Randomization-Enhanced Depth Simulation (DREDS) approach, which simulates an active stereo depth system using physically based rendering, and demonstrate that it bridges the sim-to-real domain gap.
🏅 Honors
Outstanding Student of Center on Frontiers of Computing Studies (CFCS), Peking University, 2023
National Scholarship, 2021
Merit Student, Xi'an Jiaotong University, 2021