Trung Thanh Nguyen

🔬 PhD Candidate @ Nagoya University | Student Researcher @ RIKEN 🇯🇵 | Visiting Researcher @ Universität Freiburg 🇩🇪


I am a PhD Candidate in the Department of Intelligent Systems at Nagoya University. My research focuses on vision-language models, multimodal recognition, and video captioning, with applications to real-world problems.

PhD Candidate
Nagoya University — Graduate School of Informatics, Japan
Student Researcher
RIKEN National Science Institute — Guardian Robot Project, Japan
Visiting Researcher
University of Freiburg — Excellence Cluster Future Forests, Germany
Higher Education — Industry Collaboration

news

May 27, 2026 Our paper, “PRIMS: Physics-guided Representation for Fluid Identification in Multimodal Sensing,” has been accepted to ECML PKDD 2026, Naples 🇮🇹.
Apr 19, 2026 Our paper on the MultiSensor-Home dataset was accepted in Pattern Recognition (IF: 7.6).
Apr 17, 2026 I was selected to present my PhD research at the Doctoral Consortium of IEEE FG2026, Kyoto 🇯🇵.
Mar 31, 2026 🇩🇪 Universität Freiburg: “International researchers are networking at the Freiburg Rising Stars Academy”
Feb 16, 2026 My interview in a special feature “The Reality of the Doctoral Program” (in Japanese) by Nagoya University is now published on Tamatebako (玉手箱).
Feb 03, 2026 I received a Certificate of Completion from the ACM Asia School on HPC and AI, Japan.
Dec 18, 2025 I was invited by RIKEN R-CSS to attend SCA/HPC Asia 2026 and the ACM Asia School on HPC and AI, Japan.
Dec 12, 2025 Our paper Q-Adapter won the Best Oral Award at ACM MMAsia, Malaysia.
Dec 03, 2025 I have successfully completed my PhD pre-defense. Onward to the final defense!
Nov 11, 2025 Two papers, ViCoKD and PADM, have been accepted to IEEE/CVF WACV 2026, United States.

selected publications

  1. Pattern Recognit.
    MultiSensor-Home: Multi-modal multi-view dataset and benchmarks for action recognition in home environments
    Trung Thanh Nguyen, Yasutomo Kawanishi, Vijay John, Takahiro Komamizu, and Ichiro Ide
    Pattern Recognition, 2026
  2. ACM TOMM
    Action Selection Learning for Weakly Labeled Multi-modal Multi-view Action Recognition
    Trung Thanh Nguyen, Yasutomo Kawanishi, Vijay John, Takahiro Komamizu, and Ichiro Ide
    ACM Transactions on Multimedia Computing, Communications, and Applications, 2026
  3. ACM TOMM
    Hierarchical Local-Global Fusion for One-stage Open-vocabulary Temporal Action Detection
    Trung Thanh Nguyen, Yasutomo Kawanishi, Takahiro Komamizu, and Ichiro Ide
    ACM Transactions on Multimedia Computing, Communications, and Applications, 2026
  4. IEEE/CVF WACV
    View-aware Cross-modal Distillation for Multi-view Action Recognition
    Trung Thanh Nguyen, Yasutomo Kawanishi, Vijay John, Takahiro Komamizu, and Ichiro Ide
    In Proceedings of the 2026 IEEE/CVF Winter Conference on Applications of Computer Vision, 2026
  5. IEEE FG
    MultiSensor-Home: A Wide-area Multi-modal Multi-view Dataset for Action Recognition and Transformer-based Sensor Fusion
    Trung Thanh Nguyen, Yasutomo Kawanishi, Vijay John, Takahiro Komamizu, and Ichiro Ide
    In Proceedings of the 19th IEEE International Conference on Automatic Face and Gesture Recognition, 2025