Zhenqiang Li

I am Zhenqiang Li, a computer vision researcher and engineer working in both academic research and industrial system development. My interests include multimodal perception, video understanding, and efficient deployment on platforms ranging from GPU/CUDA to edge and embedded systems.

I received my Ph.D. in Information Science and Technology from The University of Tokyo under the supervision of Prof. Yoichi Sato.

Current Focus

Computer vision and machine learning
Vision-language models
Model deployment and inference acceleration
Efficient computer vision systems

Research Themes

Video understanding. Learning from video for performance assessment, temporal reasoning, and structured prediction.
Explainable video models. Model-agnostic explanation, attribution, and interpretable intermediate representations.
Semantic and temporal structure. Methods that expose meaningful abstractions for analysis, supervision, and improved transparency.
Geometry-aware generation. Collaborative work on 3D-consistent video generation using GANs and diffusion models.

See research projects

Engineering Themes

Real-time vision systems. SDKs, real-time demos, and perception pipelines designed for practical deployment.
Multimodal perception. RGB, event-based, stereo, and cross-modal pipelines for tracking, gaze, interpolation, and scene understanding.
Deployment-aware engineering. Optimization and integration under constraints spanning GPU/CUDA, mobile, SoC, and MCU targets.

See engineering projects

Education

Ph.D. in Information Science and Technology
The University of Tokyo, 2019-2022
Dissertation: Interpretable Neural Networks for Human Activity Understanding

M.Sc. in Information Science and Technology
The University of Tokyo, 2017-2019

B.Eng. in Control Engineering
Huazhong University of Science and Technology, 2012-2016