Trung Thanh Nguyen
🔬 PhD Candidate @ Nagoya University | Student Researcher @ RIKEN

I am a PhD candidate in the Department of Intelligent Systems at Nagoya University. My research focuses on vision-language models, multimodal recognition, and video captioning, with applications to real-world problems.
Currently, I am a student researcher at RIKEN, Japan's national research institute, working on the Guardian Robot Project, where I study open-world action detection and multi-view multimodal action recognition from multimodal sensory data.
Additionally, I work at the Center for Artificial Intelligence, Mathematical and Data Science, collaborating with Japanese corporations to develop practical AI solutions.
📩 Contact: nguyent (at) cs.is.i.nagoya-u.ac.jp
news
| Date | News |
| --- | --- |
| Oct 08, 2025 | I was selected as a Rising Star for the Freiburg Rising Stars Academy, Universität Freiburg, Germany. |
| Oct 03, 2025 | I was selected to present my PhD research at the Doctoral Symposium of ACM MMAsia, Malaysia. |
| Oct 01, 2025 | Our paper “Q-Adapter: Visual Query Adapter for Extracting Textually-related Features in Video Captioning” has been accepted to ACM MMAsia, Malaysia. |
| Sep 18, 2025 | Our paper “Multimodal Dataset and Benchmarks for Vietnamese PET/CT Report Generation” has been accepted to NeurIPS, United States. |
| Aug 25, 2025 | I was awarded a research grant from the Murata Foundation (est. 1970), Japan. |
| Aug 01, 2025 | I was awarded a research grant from THERS (National University Corporation), Japan. |
| Aug 01, 2025 | We presented two papers (IS3-038, IS3-148) at MIRU2025, Japan. |
| Jul 17, 2025 | I received a Letter of Appreciation from RIKEN in recognition of outstanding research achievements. |
| Jul 16, 2025 | I received a Certificate of Achievement from Academia Sinica, Taiwan. |
| Jun 24, 2025 | I am on a research visit to Academia Sinica, Taiwan, until Jul 16. |
selected publications
- MultiSensor-Home: A Wide-area Multi-modal Multi-view Dataset for Action Recognition and Transformer-based Sensor Fusion. In Proceedings of the 19th IEEE International Conference on Automatic Face and Gesture Recognition (FG), 2025.