I am Zhenqiang Li, a computer vision researcher and engineer working across both academic research and industrial system development. My interests include multimodal perception, video understanding, and efficient deployment across platforms from GPU/CUDA to edge and embedded systems.

I received my Ph.D. in Information Science and Technology from The University of Tokyo under the supervision of Prof. Yoichi Sato.

Current Focus

  • Computer vision and machine learning
  • Vision-language models
  • Model deployment and inference acceleration
  • Efficient computer vision systems

Research Themes

  • Video understanding. Learning from video for performance assessment, temporal reasoning, and structured prediction.
  • Explainable video models. Model-agnostic explanation, attribution, and interpretable intermediate representations.
  • Semantic and temporal structure. Methods that expose meaningful abstractions for analysis, supervision, and improved transparency.
  • Geometry-aware generation. Collaborative work on 3D-consistent video generation using GANs and diffusion models.

See research projects

Engineering Themes

  • Real-time vision systems. SDKs, real-time demos, and perception pipelines designed for practical deployment.
  • Multimodal perception. RGB, event-based, stereo, and cross-modal pipelines for tracking, gaze, interpolation, and scene understanding.
  • Deployment-aware engineering. Optimization and integration under constraints spanning GPU/CUDA, mobile, SoC, and MCU targets.

See engineering projects

Education

Ph.D. in Information Science and Technology

Dissertation: Interpretable Neural Networks for Human Activity Understanding

M.Sc. in Information Science and Technology

B.Eng. in Control Engineering