Yilong Zhu

Hello, I'm Yilong Zhu

I am a Ph.D. candidate in the Aerial Robotics Group at the Hong Kong University of Science and Technology (HKUST), under the supervision of Prof. Shaojie Shen. My research centers on navigation systems for autonomous platforms, with emphasis on robust localization, mapping, and cross-view perception.

Currently, I am a VLA Research Intern at China Merchants Lion Rock AI Lab, working on Vision-Language-Action models for robotic systems.

Prior to pursuing my Ph.D., I served as Algorithm Leader in the Mapping and Localization Group at Unity Drive Innovation, where I led the development of multi-sensor localization systems integrating LiDAR, UWB, and inertial measurements.

My work has been published in premier robotics journals including T-RO and IJRR. I hold multiple patents in localization technologies and serve as a reviewer for ICRA, IROS, and T-RO.

Email | GitHub | Google Scholar

News

[2025/12] Successfully defended my Ph.D. thesis: "Robot Navigation from Explicit Geometry to Implicit Models". Grateful to my committee: Prof. Kun Xu (Chair), Prof. Shaojie Shen (Supervisor), Prof. Wei Zhang, Prof. Zili Meng, Prof. Yang Gao, and external examiner Prof. Dimitrios Kanoulas (UCL, UK).
[2025/12] Released my iOS app RTK Helper, a professional high-precision positioning tool, on the App Store.
[2025/6] Invited to serve as Session Co-Chair for "Autonomous Vehicles 3" at IROS 2025. Looking forward to facilitating discussions in Hangzhou, China!
[2025/6] Paper accepted to IROS 2025 on Visual Localization using Novel Satellite Imagery. See you in Hangzhou, China!
[2025/2] Paper accepted to IEEE Transactions on Instrumentation and Measurement on Global Optimal Solutions to Scaled Quadratic Pose Estimation Problems.
[2025/2] Paper accepted to IEEE Transactions on Instrumentation and Measurement on Globally Optimal Estimation of Accelerometer-Magnetometer Misalignment.
[2024/12] Paper accepted to IEEE Robotics and Automation Letters on Efficient Camera Exposure Control for Visual Odometry via Deep Reinforcement Learning.

Research Interests

I am actively seeking industry positions or postdoctoral opportunities starting in Spring 2026. Please reach out!

My research bridges classical robotics and learning-based methods, focusing on robust, generalizable navigation systems. I am interested in augmenting physically grounded models with high-level semantic reasoning and generative priors.

Simultaneous Localization and Mapping (SLAM)

Developing tightly-coupled optimization frameworks using LiDAR, IMU, and UWB to achieve drift-free, real-time state estimation.
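To make the coupling concrete, here is a minimal sketch that jointly optimizes a short window of planar poses against odometry increments and UWB ranges to a single anchor via nonlinear least squares. The anchor location, measurements, and unit weighting are illustrative assumptions, not values from the actual system.

```python
# Toy tightly-coupled window: 5 planar poses, odometry + UWB range residuals.
import numpy as np
from scipy.optimize import least_squares

ANCHOR = np.array([5.0, 0.0])        # assumed known UWB anchor position
odom = [np.array([1.0, 0.0])] * 4    # relative translations between poses
ranges = [5.0, 4.1, 3.2, 2.4, 2.1]   # assumed UWB ranges, one per pose

def residuals(x):
    poses = x.reshape(-1, 2)         # five planar positions
    res = [poses[0]]                 # prior pinning the first pose at origin
    for i, d in enumerate(odom):     # odometry residuals between poses
        res.append(poses[i + 1] - poses[i] - d)
    for p, r in zip(poses, ranges):  # range residuals to the anchor
        res.append(np.atleast_1d(np.linalg.norm(p - ANCHOR) - r))
    return np.concatenate(res)

sol = least_squares(residuals, np.zeros(10))
print(sol.x.reshape(-1, 2))          # jointly estimated trajectory
```

A full system would also carry orientation, velocity, and IMU biases in the state and weight each residual by its sensor covariance; the sketch keeps only the coupling structure.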


Bird's Eye View (BEV) Localization

Exploring BEV-based geometric and semantic alignment between ego-view and satellite-view images to enable cross-view localization.
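One simple instance of such geometric alignment is template matching: score a small ego-BEV patch against every offset of a satellite tile and take the peak. The sketch below does this with zero-normalized cross-correlation on random stand-in rasters; a real pipeline would match learned or semantic BEV features rather than raw intensities.

```python
# Toy cross-view matching: slide a BEV patch over a satellite tile and
# score each offset with normalized cross-correlation (NCC).
import numpy as np

rng = np.random.default_rng(0)
sat = rng.random((64, 64))              # satellite raster (toy stand-in)
bev = sat[20:36, 30:46].copy()          # ego BEV patch cut from the tile
bz = bev - bev.mean()                   # zero-mean template

best_score, best_uv = -np.inf, None
for u in range(sat.shape[0] - 16 + 1):
    for v in range(sat.shape[1] - 16 + 1):
        win = sat[u:u+16, v:v+16]
        wz = win - win.mean()
        ncc = (wz * bz).sum() / (np.linalg.norm(wz) * np.linalg.norm(bz) + 1e-9)
        if ncc > best_score:
            best_score, best_uv = ncc, (u, v)
print(best_uv)  # (20, 30): NCC peaks at the true offset in this toy
```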


Robust LiDAR-Inertial Localization (RLIL)

A system integrating motion distortion correction, IMU bias estimation, and KD-tree-accelerated scan matching, RLIL achieves centimeter-level accuracy in GNSS-denied environments.
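A minimal sketch of the scan-matching ingredient, assuming point-to-point ICP with SciPy's cKDTree for nearest-neighbor search; the synthetic 2-D scan, the small initial misalignment, and the fixed iteration count are illustrative, not the system's actual pipeline.

```python
# Toy KD-tree-accelerated ICP: align a perturbed scan back onto the map.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(1)
map_pts = rng.random((300, 2)) * 10 - 5       # static map (toy 2-D cloud)
th = np.deg2rad(2.0)                          # small misalignment so the
Rt = np.array([[np.cos(th), -np.sin(th)],     # toy example converges
               [np.sin(th),  np.cos(th)]])
scan = map_pts @ Rt.T + np.array([0.2, -0.1]) # distorted incoming scan

tree = cKDTree(map_pts)                       # KD-tree over the map
R, t = np.eye(2), np.zeros(2)
for _ in range(20):
    cur = scan @ R.T + t
    _, idx = tree.query(cur)                  # nearest-neighbor matches
    p = cur - cur.mean(0)
    q = map_pts[idx] - map_pts[idx].mean(0)
    U, _, Vt = np.linalg.svd(p.T @ q)
    if np.linalg.det(Vt.T @ U.T) < 0:         # guard against reflections
        Vt[-1] *= -1
    dR = Vt.T @ U.T                           # Kabsch best-fit rotation
    R = dR @ R
    t = dR @ (t - cur.mean(0)) + map_pts[idx].mean(0)

err = np.linalg.norm(scan @ R.T + t - map_pts, axis=1).mean()
print(f"mean scan-to-map residual: {err:.4f} (toy units)")
```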


Dynamic-Aware Localization

Constructing static TSDF maps with dynamic object removal, and using scan-to-map deviations to filter dynamic points in real time.
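The sketch below shows the deviation-gating idea in its simplest form: scan points that lie far from the static map are flagged as dynamic. A KD-tree stands in for TSDF look-ups, and the map, scan, and 0.3 m gate are all assumed toy values.

```python
# Toy dynamic-point filtering via scan-to-map deviation.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(2)
static_map = rng.random((1000, 3)) * 20          # static structure points
scan_static = static_map[rng.choice(1000, 300)]  # re-observed static points
scan_static = scan_static + rng.normal(0, 0.02, scan_static.shape)
scan_dynamic = rng.random((40, 3)) * 20 + 30     # points off the static map
scan = np.vstack([scan_static, scan_dynamic])

dist, _ = cKDTree(static_map).query(scan)        # scan-to-map deviation
keep = dist < 0.3                                # assumed 0.3 m gate
print(f"kept {keep.sum()} static points, rejected {(~keep).sum()} dynamic")
```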


Generative Models for Cross-view Understanding

Applying diffusion models to generate semantically aligned BEV representations from monocular images.
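For reference, here is one reverse (denoising) step of a standard DDPM sampler, the basic operation such generators iterate. The linear noise schedule is the common textbook choice, and the zero-output eps_model is a stand-in for a trained network; nothing here is the paper's actual model.

```python
# One DDPM reverse step, iterated from pure noise x_T down to x_0.
import numpy as np

rng = np.random.default_rng(3)
T = 1000
betas = np.linspace(1e-4, 0.02, T)     # standard linear schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def eps_model(x_t, t):
    # stand-in for a trained UNet that predicts the injected noise
    return np.zeros_like(x_t)

def reverse_step(x_t, t):
    eps = eps_model(x_t, t)
    mean = (x_t - betas[t] / np.sqrt(1 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
    noise = rng.standard_normal(x_t.shape) if t > 0 else 0.0
    return mean + np.sqrt(betas[t]) * noise

x = rng.standard_normal((64, 64))      # start from noise (toy BEV grid)
for t in reversed(range(T)):
    x = reverse_step(x, t)
```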


DiffLoc: Semantic-Guided BEV Generation

Integrates inverse perspective mapping (IPM), Navier-Stokes inpainting, and CLIP-based semantic description to synthesize BEV representations for matching against satellite imagery.
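A minimal sketch of the first two stages, assuming a hand-picked ground-plane homography for the IPM warp and OpenCV's Navier-Stokes inpainting to fill pixels the camera never observed. The homography H and the synthetic image are illustrative assumptions, not calibrated values, and the CLIP description stage is omitted.

```python
# Toy IPM + Navier-Stokes inpainting: front view -> filled BEV image.
import cv2
import numpy as np

front = np.full((240, 320, 3), 80, np.uint8)
cv2.line(front, (140, 239), (158, 120), (255, 255, 255), 3)  # fake lane mark

# assumed ground-plane homography mapping image pixels to BEV pixels
src = np.float32([[100, 239], [220, 239], [190, 120], [130, 120]])
dst = np.float32([[100, 239], [220, 239], [220, 0], [100, 0]])
H = cv2.getPerspectiveTransform(src, dst)
bev = cv2.warpPerspective(front, H, (320, 240))

# pixels outside the warped footprint come out black; inpaint them
mask = (bev.sum(axis=2) == 0).astype(np.uint8)
bev_filled = cv2.inpaint(bev, mask, 3, cv2.INPAINT_NS)
```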


Long-Horizon & Cross-Modal Navigation

Integrating fast reactive modules with slow semantic planners that leverage satellite imagery and vision-language models (VLMs) to enable long-horizon navigation and early-warning instructions.
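Schematically, the fast/slow split can be pictured as two loops running at different rates, as in the sketch below. The 1 Hz / 50 Hz rates, the stubbed planner, and the toy point-mass controller are all illustrative assumptions, not the system's design.

```python
# Toy two-rate loop: a slow semantic planner feeds a fast reactive tracker.
import numpy as np

def slow_semantic_planner(pos):
    # stub: would query a VLM / satellite map for the next safe waypoint
    return pos + np.array([1.0, 0.5])

def fast_reactive_step(pos, waypoint, dt=0.02):
    # stub: would fuse local obstacle sensing; here, just servo to the goal
    direction = waypoint - pos
    dist = np.linalg.norm(direction)
    return pos if dist < 1e-6 else pos + min(dist, 1.0 * dt) * direction / dist

pos, waypoint = np.zeros(2), np.zeros(2)
for step in range(250):        # 5 s of simulated time at 50 Hz
    if step % 50 == 0:         # slow planner ticks at 1 Hz
        waypoint = slow_semantic_planner(pos)
    pos = fast_reactive_step(pos, waypoint)
print(pos)
```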