My engineering work focuses on real-time computer vision systems, multimodal perception, and efficient deployment across NVIDIA GPU/CUDA, mobile, SoC, and MCU platforms.

Overview

  • System types: C++ SDKs, real-time demos, embedded perception pipelines, and validation platforms
  • Technical themes: multimodal perception, geometry-aware vision, on-device deployment, and inference acceleration
  • Deployment targets: NVIDIA GPU/CUDA platforms, mobile, SoC, and MCU

Multimodal Pose and Hand Tracking Demo

Designed lightweight multimodal models for pose and hand tracking from grayscale and event-based input, then built real-time PC demos to showcase the system. Extended model outputs with motion cues and implemented an asynchronous pipeline to improve tracking behavior and throughput.

Focus

  • Lightweight multimodal detection and tracking
  • Real-time demo system implementation
  • Tracking-oriented output design and asynchronous execution
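The asynchronous split between a slow detector and a fast per-frame tracker can be sketched as below. This is a minimal illustration, not the demo's actual code: `detect` and `track` are hypothetical callables, and the queue-based worker stands in for whatever threading model the real pipeline uses.

```python
import queue
import threading

def run_async_pipeline(frames, detect, track, detect_every=4):
    """Run a slow detector in a worker thread on every `detect_every`-th
    frame, while the tracking loop consumes every frame and picks up fresh
    detections whenever they are ready, never blocking on the detector."""
    results = queue.Queue()

    def detector_worker():
        for idx in range(0, len(frames), detect_every):
            results.put(detect(frames[idx]))  # slow path, off the main loop

    worker = threading.Thread(target=detector_worker)
    worker.start()

    state, outputs = None, []
    for frame in frames:
        while True:  # drain any finished detections without blocking
            try:
                state = results.get_nowait()
            except queue.Empty:
                break
        outputs.append(track(state, frame))

    worker.join()
    return outputs

# Toy usage: the detector echoes its frame; the tracker records
# (latest detection it has seen, current frame) for every frame.
frames = list(range(8))
outputs = run_async_pipeline(frames, detect=lambda f: f,
                             track=lambda s, f: (s, f))
```

The key property this sketch demonstrates is that the tracking loop's throughput is decoupled from detector latency: every input frame produces a tracker output, regardless of how many detections have completed.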

Real-Time Gaze Tracking SDK

Built a real-time C++ SDK for face detection and gaze tracking supporting both RGB and event-based input. Integrated calibration, screen-space mapping, and runtime inference into a real-time PC validation platform.

Focus

  • RGB and event-based multimodal input
  • Real-time PC validation platform
  • C++ SDK interface and runtime integration
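One common way to realize calibration-driven screen-space mapping is to fit an affine map from gaze features to screen coordinates by least squares over the calibration pairs. The sketch below assumes a 2-D gaze feature per sample; the function names and feature choice are illustrative, not the SDK's API.

```python
import numpy as np

def fit_screen_mapping(gaze_feats, screen_pts):
    """Fit an affine map from 2-D gaze features to screen coordinates
    using calibration pairs, via least squares: [x y 1] @ A ~= [sx sy]."""
    g = np.asarray(gaze_feats, dtype=float)
    s = np.asarray(screen_pts, dtype=float)
    X = np.hstack([g, np.ones((len(g), 1))])   # homogeneous features
    A, *_ = np.linalg.lstsq(X, s, rcond=None)  # (3, 2) affine parameters
    return A

def map_to_screen(A, gaze_feat):
    x, y = gaze_feat
    return np.array([x, y, 1.0]) @ A

# Toy calibration: the true mapping is sx = 100*x + 10, sy = 50*y + 5.
feats = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5)]
pts = [(100 * x + 10, 50 * y + 5) for x, y in feats]
A = fit_screen_mapping(feats, pts)
pred = map_to_screen(A, (0.25, 0.75))
```

An affine fit is the simplest calibration model; real systems often layer per-user polynomial or homography refinements on top, but the fit-then-map structure stays the same.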

Mobile Multi-Frame Denoising SDK

Implemented a C++ SDK for mobile-oriented multi-frame image denoising from an existing Python reference. Designed a reusable tile-based computation abstraction to support a stable alignment-and-merge pipeline and future SIMD optimization.

Focus

  • Python-to-C++ translation for deployment
  • Reusable tile-based computation abstraction
  • Mobile-oriented SDK design
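The tile-based computation idea can be sketched as follows: each tile is processed with a halo of surrounding pixels so that neighborhood operations stay correct at tile borders, while only the inner region is written back. This is a simplified Python illustration of the abstraction, not the C++ SDK's actual interface.

```python
def iter_tiles(width, height, tile, halo):
    """Yield (outer, inner) tile bounds: `inner` is the region a tile is
    responsible for writing; `outer` adds a halo, clamped to the image."""
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            inner = (tx, ty, min(tx + tile, width), min(ty + tile, height))
            outer = (max(tx - halo, 0), max(ty - halo, 0),
                     min(tx + tile + halo, width), min(ty + tile + halo, height))
            yield outer, inner

def process_tiled(img, op, tile=4, halo=1):
    """Apply `op` tile by tile: op sees the halo-padded crop, and only the
    inner region of its result is copied back, so tiles stay independent."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for (ox0, oy0, ox1, oy1), (ix0, iy0, ix1, iy1) in iter_tiles(w, h, tile, halo):
        crop = [row[ox0:ox1] for row in img[oy0:oy1]]
        res = op(crop)  # result is in the crop's local coordinates
        for y in range(iy0, iy1):
            for x in range(ix0, ix1):
                out[y][x] = res[y - oy0][x - ox0]
    return out

# Toy usage: an identity op reassembles the image exactly.
img = [[x + 10 * y for x in range(6)] for y in range(5)]
out = process_tiled(img, op=lambda crop: crop)
```

Because each tile's working set is small and independent, this structure maps naturally onto cache-friendly loops and later SIMD or multi-threaded variants of `op`.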

On-Device Speaker-Awareness Pipeline

Designed a deployable perception pipeline for a wearable dual-camera prototype that detects whether any visible person is speaking toward the wearer. Translated the product goal into a real-time pipeline combining calibration, face analysis, speaking detection, and multi-view fusion under low-power device constraints.

Focus

  • Product requirement to perception formulation
  • Real-time stereo calibration and fusion
  • Low-power wearable deployment constraints
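The multi-view fusion step can be illustrated with a deliberately simple scoring rule: each camera contributes a "speaking toward the wearer" score that gates mouth motion by how frontal the face is, and the views are fused by taking the best one. The thresholds and the max-fusion rule here are assumptions for illustration, not the deployed logic.

```python
import math

def view_score(mouth_motion, yaw_deg, max_yaw=45.0):
    """Per-view evidence that a face is speaking *toward* the wearer:
    a mouth-motion score in [0, 1], gated by head yaw (degrees)."""
    if abs(yaw_deg) > max_yaw:
        return 0.0  # face turned away: motion is not directed at the wearer
    facing = math.cos(math.radians(yaw_deg))
    return mouth_motion * facing

def is_speaking_toward_wearer(views, threshold=0.5):
    """Fuse per-camera observations of the same person by taking the best
    view: either camera seeing a frontal, moving mouth is enough."""
    return max((view_score(m, y) for m, y in views), default=0.0) >= threshold

# Toy usage: (mouth_motion, yaw_deg) per camera.
left_cam = (0.9, 10.0)    # strong mouth motion, nearly frontal
right_cam = (0.8, 60.0)   # oblique view: gated out by yaw
```

Max-fusion is the cheapest choice under low-power constraints; a calibrated probabilistic combination of views is the natural upgrade when the power budget allows it.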

Ultra-Low-Power Embedded Detection Models

Developed compact embedded models for presence classification and object detection on a highly constrained low-power platform. Combined hardware-aware architecture design, profiling, quantization, and distillation to meet strict latency, memory, and power budgets.

Focus

  • TinyML-style model delivery
  • Hardware-aware model architecture refinement
  • Quantization, distillation, and profiling
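The quantization step can be sketched with the standard asymmetric affine scheme: map the observed float range onto signed integers via a scale and zero point. This is a generic post-training sketch, not the project's specific toolchain.

```python
def quantize_affine(weights, bits=8):
    """Asymmetric affine quantization: map the float range [min, max] onto
    signed `bits`-bit integers, returning (q, scale, zero_point)."""
    lo, hi = min(weights), max(weights)
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    scale = (hi - lo) / (qmax - qmin) or 1.0  # guard degenerate constant range
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(w / scale + zero_point))) for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return [(qi - zero_point) * scale for qi in q]

# Toy usage: round-trip error stays within one quantization step.
w = [-1.0, -0.5, 0.0, 0.25, 1.0]
q, s, zp = quantize_affine(w)
w_hat = dequantize(q, s, zp)
```

In practice per-channel scales, calibration data for activation ranges, and quantization-aware fine-tuning or distillation determine how much accuracy survives the 8-bit (or lower) budget; the affine mapping itself is the common core.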

Event-Based Motion Estimation and Tracking

Advanced an exploratory research effort from low-level optical flow optimization to a lightweight motion-estimation redesign and multimodal tracking. Combined event-derived motion cues with RGB tracking to improve temporal association quality.

Focus

  • Event-based optical flow optimization
  • Lightweight motion-estimation redesign
  • Multimodal tracking with RGB plus event motion cues
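The benefit of event-derived motion cues for temporal association can be sketched as follows: shift each track's box by its motion cue before IoU matching, so fast motion between frames still produces overlap. The greedy matcher and thresholds here are illustrative, not the project's tracker.

```python
def iou(a, b):
    """IoU of axis-aligned boxes (x0, y0, x1, y1)."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x1 - x0) * max(0, y1 - y0)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def associate(tracks, detections, flows, min_iou=0.3):
    """Greedy association: shift each track's box by its event-derived
    motion cue (dx, dy) before matching, so fast-moving targets still
    overlap their new detection."""
    matches, used = [], set()
    for ti, (box, (dx, dy)) in enumerate(zip(tracks, flows)):
        pred = (box[0] + dx, box[1] + dy, box[2] + dx, box[3] + dy)
        best, best_iou = None, min_iou
        for di, det in enumerate(detections):
            if di in used:
                continue
            v = iou(pred, det)
            if v > best_iou:
                best, best_iou = di, v
        if best is not None:
            used.add(best)
            matches.append((ti, best))
    return matches
```

With a large inter-frame displacement, the raw boxes no longer overlap and RGB-only IoU association fails, while the motion-compensated prediction recovers the match, which is exactly the temporal-association gain the event cues provide.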

Event-Assisted Stereo Frame Interpolation

Designed a multi-stage pipeline for reconstructing high-frame-rate, high-resolution RGB video from stereo RGB, grayscale, and event inputs. Combined geometry, motion estimation, and interpolation to outperform a strong RGB-only baseline in internal evaluation.

Focus

  • Multi-rate stereo and event-based input fusion
  • Geometry-aware warp inference
  • High-frame-rate RGB reconstruction
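The motion-compensated core of frame interpolation can be illustrated in one dimension: warp the earlier frame forward and the later frame backward along the estimated motion, then blend at the target timestamp. This toy sketch uses a constant integer flow and ignores the stereo geometry and learned components of the full pipeline.

```python
def warp_row(row, shift):
    """Shift a 1-D signal right by `shift` pixels, replicating the edge."""
    n = len(row)
    return [row[min(max(i - shift, 0), n - 1)] for i in range(n)]

def interpolate_mid(frame0, frame1, flow, t=0.5):
    """Warp frame0 forward by t*flow and frame1 backward by (1-t)*flow,
    then blend; both warps land the moving content at time t's position."""
    fwd = warp_row(frame0, round(t * flow))
    bwd = warp_row(frame1, -round((1 - t) * flow))
    return [(1 - t) * a + t * b for a, b in zip(fwd, bwd)]

# Toy usage: an impulse moves 4 px right between frames; the interpolated
# midpoint frame should place it 2 px along.
frame0 = [0, 0, 1, 0, 0, 0, 0, 0]
frame1 = [0, 0, 0, 0, 0, 0, 1, 0]
mid = interpolate_mid(frame0, frame1, flow=4)
```

In the full system, per-pixel flow from the event stream replaces the constant shift, and stereo geometry constrains the warp; the warp-then-blend structure is what this sketch isolates.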