Yang Li
I'm a research scientist at Tencent XR Vision Labs.
I finished my Ph.D. with Tatsuya Harada at The University of Tokyo.
During my Ph.D., I did an internship at the Technical University of Munich with Matthias Nießner,
and an internship with Bo Zheng at Huawei Japan Research Center.
I did a master's in bioinformatics with Tetsuo Shibuya at The University of Tokyo.
My research interests lie at the intersection of 3D computer vision and artificial intelligence, particularly focusing on registration, 3D/4D reconstruction, and 3D AIGC,
with applications in VR/AR, robotics, and beyond.
We have research internship positions on 3D AIGC in Shanghai; feel free to contact us!
Email / Scholar / GitHub
Sketch2Scene: Automatic Generation of Interactive 3D Game Scenes from User's Casual Sketches
Yongzhi Xu, Yonhon Ng, Yifu Wang, Inkyu Sa, Yunfei Duan, Yang Li, Pan Ji, Hongdong Li
arXiv preprint 2024
Project Page | Paper | Video
This paper proposes a novel approach for automatically generating interactive (i.e., playable) 3D game scenes from users' casual prompts, including hand-drawn sketches and text descriptions.
Frankenstein: Generating Semantic-Compositional 3D Scenes in One Tri-Plane
Han Yan, Yang Li, Zhennan Wu, Shenzhou Chen, Weixuan Sun, Taizhang Shang, Weizhe Liu, Tian Chen, Xiaqiang Dai, Chao Ma, Hongdong Li, Pan Ji
SIGGRAPH Asia 2024
Project Page | Paper | Video
We present Frankenstein, a diffusion-based framework that can
generate semantic-compositional 3D scenes in a single pass. Unlike existing
methods that output a single, unified 3D shape, Frankenstein simultaneously
generates multiple separated shapes, each corresponding to a semantically
meaningful part.
BlockFusion: Expandable 3D Scene Generation using Latent Tri-plane Extrapolation
Zhennan Wu, Yang Li†, Han Yan, Taizhang Shang, Weixuan Sun, Senbo Wang, Ruikai Cui, Weizhe Liu, Hiroyuki Sato, Hongdong Li, and Pan Ji
ACM Transactions on Graphics 2024
  (Selected as SIGGRAPH 2024 Trailer Video)
Project Page | Paper | Video | Code
We introduce the first 3D diffusion-based approach for directly generating large, unbounded 3D scenes in both indoor and outdoor scenarios.
At the core of this approach are a novel tri-plane diffusion model and a tri-plane extrapolation mechanism.
NeuSDFusion: A Spatial-Aware Generative Model for 3D Shape Completion, Reconstruction, and Generation
Ruikai Cui, Weizhe Liu, Weixuan Sun, Senbo Wang, Taizhang Shang, Yang Li, Xibin Song, Han Yan, Zhennan Wu, Shenzhou Chen, Hongdong Li, Pan Ji
ECCV 2024
Project Page | Paper
In this paper, we introduce a novel spatial-aware 3D shape generation framework that leverages 2D plane representations for enhanced 3D shape modeling.
3D Segmenter: 3D Transformer based Semantic Segmentation via 2D Panoramic Distillation
Zhennan Wu, Yang Li, Yifei Huang, Lin Gu, Tatsuya Harada, and Hiroyuki Sato
ICLR 2023
Paper | Code
We propose the first 2D-to-3D knowledge distillation strategy to enhance a 3D semantic segmentation model with knowledge embedded in the latent space of powerful 2D models.
Non-rigid Point Cloud Registration with Neural Deformation Pyramid
Yang Li and Tatsuya Harada
NeurIPS 2022
Paper | Code
Neural Deformation Pyramid (NDP) breaks down the non-rigid point cloud registration problem via hierarchical motion decomposition.
NDP demonstrates advantages in both speed and registration accuracy.
Lepard: Learning partial point cloud matching in rigid and deformable scenes
Yang Li and Tatsuya Harada
CVPR 2022
  (Oral Presentation)
Paper | Video | Code
We design Lepard, a novel partial point cloud matching method that exploits 3D positional knowledge.
Lepard reaches state-of-the-art performance on both rigid and deformable point cloud matching benchmarks.
4DComplete: Non-Rigid Motion Estimation Beyond the Observable Surface
Yang Li, Hiraki Takehara, Takafumi Taketomi, Bo Zheng, and Matthias Nießner
ICCV 2021
Paper | Video | Code
We introduce 4DComplete, the first method that jointly recovers the shape and motion field from partial observations. We also provide a large-scale non-rigid 4D dataset for training and benchmarking, consisting of 1,972 animation sequences and 122,365 frames.
SplitFusion: Simultaneous Tracking and Mapping for Non-Rigid Scenes
Yang Li, Tianwei Zhang, Yoshihiko Nakamura, and Tatsuya Harada
IROS 2020
Paper | Video
SplitFusion is a dense RGB-D SLAM framework that simultaneously performs tracking and dense reconstruction for both rigid and non-rigid components of the scene.
Learning to Optimize Non-Rigid Tracking
Yang Li, Aljaž Božič, Tianwei Zhang, Yanli Ji, Tatsuya Harada, and Matthias Nießner
CVPR 2020
  (Oral Presentation)
Paper | Video
We learn the tracking of non-rigid objects by differentiating through the underlying non-rigid
solver. Specifically, we propose ConditionNet, which learns to generate a problem-specific
preconditioner using a large number of training samples from the Gauss-Newton update equation. The
learned preconditioner increases PCG's convergence speed by a significant margin.
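For context, here is a minimal NumPy sketch of preconditioned conjugate gradient (PCG), the solver whose convergence the learned preconditioner accelerates. In this sketch the preconditioner is a fixed matrix M_inv (a Jacobi preconditioner in the usage lines), whereas in the paper it is predicted by ConditionNet; the function name and the dense-matrix setup are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def preconditioned_cg(A, b, M_inv, tol=1e-8, max_iter=100):
    """Solve the SPD system A x = b using PCG with preconditioner M_inv."""
    x = np.zeros_like(b)
    r = b - A @ x          # residual
    z = M_inv @ r          # preconditioned residual
    p = z.copy()           # search direction
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_inv @ r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# Usage: small SPD system with a Jacobi (diagonal) preconditioner.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
M_inv = np.diag(1.0 / np.diag(A))
x = preconditioned_cg(A, b, M_inv)
```

A better preconditioner makes M_inv @ A closer to the identity, which is exactly the quantity a learned, problem-specific preconditioner can improve over generic choices such as the Jacobi one above.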
FlowFusion: Dynamic Dense RGB-D SLAM Based on Optical Flow
Tianwei Zhang, Huayan Zhang, Yang Li, Yoshihiko Nakamura, and Lei Zhang
ICRA 2020
Paper | Video
We present a novel dense RGB-D SLAM solution that simultaneously accomplishes dynamic/static segmentation, camera ego-motion estimation, and static background reconstruction.
Pose Graph Optimization for Unsupervised Monocular Visual Odometry
Yang Li, Yoshitaka Ushiku, and Tatsuya Harada
ICRA 2019
Paper
We propose to leverage pose graph optimization and loop closure detection to overcome limitations of unsupervised-learning-based monocular visual odometry.