Yang Li

I'm a research scientist at Tencent XR Vision Labs.

I completed my Ph.D. with Tatsuya Harada at The University of Tokyo. During my Ph.D., I did an internship at the Technical University of Munich with Matthias Nießner and an internship with Bo Zheng at Huawei Japan Research Center. Before that, I earned a master's degree in bioinformatics with Tetsuo Shibuya at The University of Tokyo.

My research interests lie at the intersection of 3D computer vision and artificial intelligence, with a particular focus on registration, 3D/4D reconstruction, and 3D AIGC, and with applications in VR/AR, robotics, and beyond.

We have research internship positions on 3D AIGC in Shanghai. Feel free to contact us!

Email / Scholar / GitHub

profile photo

Research

Sketch2Scene: Automatic Generation of Interactive 3D Game Scenes from User's Casual Sketches
Yongzhi Xu, Yonhon Ng, Yifu Wang, Inkyu Sa, Yunfei Duan, Yang Li, Pan Ji, Hongdong Li
arXiv preprint 2024
Project Page | Paper | Video

This paper proposes a novel approach for automatically generating interactive (i.e., playable) 3D game scenes from users' casual prompts, including hand-drawn sketches and text descriptions.

Frankenstein: Generating Semantic-Compositional 3D Scenes in One Tri-Plane
Han Yan, Yang Li, Zhennan Wu, Shenzhou Chen, Weixuan Sun, Taizhang Shang, Weizhe Liu, Tian Chen, Xiaqiang Dai, Chao Ma, Hongdong Li, Pan Ji
SIGGRAPH Asia 2024
Project Page | Paper | Video

We present Frankenstein, a diffusion-based framework that can generate semantic-compositional 3D scenes in a single pass. Unlike existing methods that output a single, unified 3D shape, Frankenstein simultaneously generates multiple separated shapes, each corresponding to a semantically meaningful part.

BlockFusion: Expandable 3D Scene Generation using Latent Tri-plane Extrapolation
Zhennan Wu, Yang Li, Han Yan, Taizhang Shang, Weixuan Sun, Senbo Wang, Ruikai Cui, Weizhe Liu, Hiroyuki Sato, Hongdong Li, and Pan Ji
ACM Transactions on Graphics 2024   (Selected as SIGGRAPH 2024 Trailer Video)
Project Page | Paper | Video | Code

We introduce the first 3D-diffusion-based approach for directly generating large, unbounded 3D scenes in both indoor and outdoor scenarios. At the core of this approach is a novel tri-plane diffusion and tri-plane extrapolation mechanism.

NeuSDFusion: A Spatial-Aware Generative Model for 3D Shape Completion, Reconstruction, and Generation
Ruikai Cui, Weizhe Liu, Weixuan Sun, Senbo Wang, Taizhang Shang, Yang Li, Xibin Song, Han Yan, Zhennan Wu, Shenzhou Chen, Hongdong Li, Pan Ji
ECCV 2024
Project Page | Paper

In this paper, we introduce a novel spatial-aware 3D shape generation framework that leverages 2D plane representations for enhanced 3D shape modeling.

3D Segmenter: 3D Transformer based Semantic Segmentation via 2D Panoramic Distillation
Zhennan Wu, Yang Li, Yifei Huang, Lin Gu, Tatsuya Harada, and Hiroyuki Sato
ICLR 2023
Paper | Code

We propose the first 2D-to-3D knowledge distillation strategy to enhance 3D semantic segmentation models with knowledge embedded in the latent space of powerful 2D models.

Non-rigid Point Cloud Registration with Neural Deformation Pyramid
Yang Li and Tatsuya Harada
NeurIPS 2022
Paper | Code

Neural Deformation Pyramid (NDP) breaks down the non-rigid point cloud registration problem via hierarchical motion decomposition, demonstrating advantages in both speed and registration accuracy.

Lepard: Learning partial point cloud matching in rigid and deformable scenes
Yang Li and Tatsuya Harada
CVPR 2022   (Oral Presentation)
Paper | Video | Code

We design Lepard, a novel partial point cloud matching method that exploits 3D positional knowledge. Lepard achieves state-of-the-art results on both rigid and deformable point cloud matching benchmarks.

4DComplete: Non-Rigid Motion Estimation Beyond the Observable Surface
Yang Li, Hiraki Takehara, Takafumi Taketomi, Bo Zheng, and Matthias Nießner
ICCV 2021
Paper | Video | Code

We introduce 4DComplete, the first method that jointly recovers the shape and motion field from partial observations. We also provide a large-scale non-rigid 4D dataset for training and benchmarking, consisting of 1,972 animation sequences and 122,365 frames.

SplitFusion: Simultaneous Tracking and Mapping for Non-Rigid Scenes
Yang Li, Tianwei Zhang, Yoshihiko Nakamura, and Tatsuya Harada
IROS 2020
Paper | Video

SplitFusion is a dense RGB-D SLAM framework that simultaneously performs tracking and dense reconstruction for both rigid and non-rigid components of the scene.

Learning to Optimize Non-Rigid Tracking
Yang Li, Aljaž Božič, Tianwei Zhang, Yanli Ji, Tatsuya Harada, and Matthias Nießner
CVPR 2020   (Oral Presentation)
Paper | Video

We learn to track non-rigid objects by differentiating through the underlying non-rigid solver. Specifically, we propose ConditionNet, which learns to generate a problem-specific preconditioner using a large number of training samples from the Gauss-Newton update equation. The learned preconditioner significantly increases the convergence speed of the preconditioned conjugate gradient (PCG) solver.

FlowFusion: Dynamic Dense RGB-D SLAM Based on Optical Flow
Tianwei Zhang, Huayan Zhang, Yang Li, Yoshihiko Nakamura, and Lei Zhang
ICRA 2020
Paper | Video

We present a novel dense RGB-D SLAM solution that simultaneously accomplishes dynamic/static segmentation, camera ego-motion estimation, and static background reconstruction.

Pose Graph Optimization for Unsupervised Monocular Visual Odometry
Yang Li, Yoshitaka Ushiku, and Tatsuya Harada
ICRA 2019
Paper

We propose to leverage graph optimization and loop closure detection to overcome the limitations of unsupervised learning-based monocular visual odometry.


This website is based on this source code.