Trung Thanh Nguyen

🔬 PhD Candidate @ Nagoya University | Student Researcher @ RIKEN


I am a PhD candidate in the Department of Intelligent Systems at Nagoya University. My research focuses on vision-language models, multimodal recognition, and video captioning, with applications to real-world problems.

Currently, I am a student researcher at RIKEN, working on the Guardian Robot Project. My research there covers open-world action detection and multi-view, multi-modal action recognition from multimodal sensory data.

Additionally, I work at the Center for Artificial Intelligence, Mathematical and Data Science, collaborating with Japanese corporations to develop practical AI solutions.

📩 Contact: nguyent (at) cs.is.i.nagoya-u.ac.jp

Google Scholar   LinkedIn

news

Dec 03, 2025 I have successfully completed my PhD pre-defense. Onward to the final defense!
Nov 11, 2025 Two papers — “View-aware Cross-modal Distillation for Multi-view Action Recognition” and “PADM: A Physics-aware Diffusion Model for Attenuation Correction” — have been accepted to IEEE/CVF WACV 2026, United States.
Oct 21, 2025 Our paper “Hierarchical Global-Local Fusion for One-stage Open-vocabulary Temporal Action Detection” has been accepted to ACM TOMM (IF: 6.0) journal.
Oct 08, 2025 I was selected as a Rising Star for the Freiburg Rising Stars Academy, Universität Freiburg, Germany.
Oct 03, 2025 I was selected to present my PhD research at the Doctoral Symposium of ACM MMAsia, Malaysia.
Oct 01, 2025 Our paper “Q-Adapter: Visual Query Adapter for Extracting Textually-related Features in Video Captioning” has been accepted to ACM MMAsia, Malaysia.
Sep 18, 2025 Our paper “Multimodal Dataset and Benchmarks for Vietnamese PET/CT Report Generation” has been accepted to NeurIPS, United States.
Aug 25, 2025 I was awarded a research grant from Murata Foundation (est. 1970), Japan.
Aug 01, 2025 I was awarded a research grant from THERS (National University Corporation), Japan.
Aug 01, 2025 We presented two papers (IS3-038, IS3-148) at MIRU2025, Japan.

selected publications

  1. IEEE/CVF WACV
    View-aware Cross-modal Distillation for Multi-view Action Recognition
    Trung Thanh Nguyen, Yasutomo Kawanishi, Vijay John, and 2 more authors
    In Proceedings of the 2026 IEEE/CVF Winter Conference on Applications of Computer Vision, 2026
  2. ACM TOMM
    Hierarchical Local-Global Fusion for One-stage Open-vocabulary Temporal Action Detection
    Trung Thanh Nguyen, Yasutomo Kawanishi, Takahiro Komamizu, and 1 more author
    ACM Transactions on Multimedia Computing, Communications, and Applications, 2025
  3. ACM TOMM
    Action Selection Learning for Weakly Labeled Multi-modal Multi-view Action Recognition
    Trung Thanh Nguyen, Yasutomo Kawanishi, Vijay John, and 2 more authors
    ACM Transactions on Multimedia Computing, Communications, and Applications, 2025
  4. IEEE FG
    MultiSensor-Home: A Wide-area Multi-modal Multi-view Dataset for Action Recognition and Transformer-based Sensor Fusion
    Trung Thanh Nguyen, Yasutomo Kawanishi, Vijay John, and 2 more authors
    In Proceedings of the 19th IEEE International Conference on Automatic Face and Gesture Recognition, 2025
  5. IEEE Access
    Zero-shot Pill-Prescription Matching with Graph Convolutional Network and Contrastive Learning
    Trung Thanh Nguyen, Phi Le Nguyen, Yasutomo Kawanishi, and 2 more authors
    IEEE Access, 2024
  6. IEEE TNSM
    Fuzzy Q-Learning-Based Opportunistic Communication for MEC-Enhanced Vehicular Crowdsensing
    Trung Thanh Nguyen, Truong Thao Nguyen, Thanh-Hung Nguyen, and 1 more author
    IEEE Transactions on Network and Service Management, 2022