Zhenqiang Li
I am Zhenqiang Li, a computer vision researcher and engineer working across academic research and industrial system development. My interests include multimodal perception, video understanding, and efficient deployment on platforms ranging from GPU/CUDA to edge and embedded systems.
I received my Ph.D. in Information Science and Technology from The University of Tokyo under the supervision of Prof. Yoichi Sato.
Current Focus
- Computer vision and machine learning
- Vision-language models
- Model deployment and inference acceleration
- Efficient computer vision systems
Research Themes
- Video understanding. Learning from video for performance assessment, temporal reasoning, and structured prediction.
- Explainable video models. Model-agnostic explanation, attribution, and interpretable intermediate representations.
- Semantic and temporal structure. Methods that expose meaningful abstractions for analysis, supervision, and improved transparency.
- Geometry-aware generation. Collaborative work on 3D-consistent video generation using GANs and diffusion models.
Engineering Themes
- Real-time vision systems. SDKs, real-time demos, and perception pipelines designed for practical deployment.
- Multimodal perception. RGB, event-based, stereo, and cross-modal pipelines for tracking, gaze, interpolation, and scene understanding.
- Deployment-aware engineering. Optimization and integration under constraints spanning GPU/CUDA, mobile, SoC, and MCU targets.
Education
Ph.D. in Information Science and Technology
Dissertation: Interpretable Neural Networks for Human Activity Understanding
M.Sc. in Information Science and Technology
B.Eng. in Control Engineering