Patient safety is paramount in the healthcare industry, and reducing medication errors is essential to improving it. A promising solution to this problem involves the development of automated systems capable of helping patients verify their pill intake. This paper investigates a Pill-Prescription matching task that seeks to associate pills in a multi-pill photo with their corresponding names in the prescription. We specifically aim to overcome the limitations of existing pill detection methods when faced with unseen pills, a situation characteristic of zero-shot learning. We propose a novel method named Zero-PIMA (Zero-shot Pill-Prescription Matching), designed to match pill images with prescription names effectively, even for pills not included in the training dataset. Zero-PIMA is an end-to-end model that includes an object localization module to detect and extract features of pill images and a graph convolutional network to capture the spatial relationships of the pill names' text in the prescription. We then leverage the contrastive learning paradigm to increase the distance between mismatched pill image and pill name pairs while minimizing the distance between matched pairs. In addition, to deal with the zero-shot pill detection problem, we leverage pill metadata retrieved from the DrugBank database to fine-tune a pre-trained text encoder, thereby incorporating visual information about pills (e.g., shape, color) into their names, making them more informative and ultimately enhancing the pill image-name matching accuracy. Extensive experiments are conducted on our collected real-world VAIPE-PP dataset of multi-pill photos and prescriptions. Across a series of comprehensive experiments, the proposed method outperforms other methods for both seen and unseen pills in terms of mean average precision. These results indicate that the proposed method could reduce medication errors and improve patient safety.
@article{Nguyen2024IEEEAccess,note={International-Journal},author={Nguyen, Trung Thanh and Nguyen, Phi Le and Kawanishi, Yasutomo and Komamizu, Takahiro and Ide, Ichiro},title={Zero-shot Pill-Prescription Matching with Graph Convolutional Network and Contrastive Learning},journal={IEEE Access},publisher={IEEE},year={2024},doi={10.1109/ACCESS.2024.3390153},}
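As an illustrative sketch of the contrastive learning paradigm described in the abstract (not the Zero-PIMA implementation; the symmetric InfoNCE-style form, the names `img_emb`/`txt_emb`, and the `temperature` value are assumptions):

```python
import numpy as np

def contrastive_matching_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE-style loss: pull matched pill-image/name pairs
    together and push mismatched pairs apart (row i matches column i)."""
    # L2-normalize so the dot product becomes cosine similarity
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature          # (N, N) similarity matrix

    def xent(l):
        # cross-entropy with the diagonal (matched pairs) as ground truth
        l = l - l.max(axis=1, keepdims=True)    # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(logp))

    # average of image-to-text and text-to-image directions
    return 0.5 * (xent(logits) + xent(logits.T))
```

Minimizing this loss raises the similarity of matched image-name pairs relative to all mismatched pairs in the batch.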
2022
IEEE TNSM
Fuzzy Q-Learning-Based Opportunistic Communication for MEC-Enhanced Vehicular Crowdsensing
Trung Thanh Nguyen, Truong Thao Nguyen, Thanh-Hung Nguyen, and 1 more author
IEEE Transactions on Network and Service Management, 2022
This study focuses on MEC-enhanced, vehicle-based crowdsensing systems that rely on devices installed on automobiles. We investigate an opportunistic communication paradigm in which devices can transmit measured data directly to a crowdsensing server over a 4G communication channel, or via Wi-Fi to nearby devices or so-called Road Side Units positioned along the road. We tackle a new problem: how to reduce the 4G cost while preserving latency. We propose an offloading strategy that combines a reinforcement learning technique known as Q-learning with Fuzzy logic to accomplish this purpose. Q-learning assists devices in learning to decide the communication channel, while Fuzzy logic is used to optimize the reward function in Q-learning. The experiment results show that our offloading method cuts the 4G communication cost by around 30-40% while keeping the latency of 99% of packets below the required threshold.
@article{9841517,note={International-Journal},author={Nguyen, Trung Thanh and Thao Nguyen, Truong and Nguyen, Thanh-Hung and Nguyen, Phi Le},journal={IEEE Transactions on Network and Service Management},title={Fuzzy Q-Learning-Based Opportunistic Communication for MEC-Enhanced Vehicular Crowdsensing},year={2022},volume={19},number={4},pages={5021-5033},doi={10.1109/TNSM.2022.3192397},}
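A minimal sketch of the core idea of shaping a Q-learning reward with fuzzy logic (the triangular memberships, equal weighting, and action names `"4g"`/`"wifi"` are illustrative assumptions, not the paper's fuzzy rule base):

```python
def triangular(x, a, b, c):
    """Simple triangular fuzzy membership function on [a, c] peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzy_reward(cost, latency):
    """Blend the two objectives via fuzzy memberships of 'low cost'
    and 'low latency' (ranges here are illustrative)."""
    cheap = triangular(cost, -1.0, 0.0, 1.0)
    fast = triangular(latency, -1.0, 0.0, 1.0)
    return 0.5 * cheap + 0.5 * fast

def q_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update used to choose the channel."""
    best_next = max(Q[next_state].values())
    Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])
```

Each device would call `q_update` after every transmission, with the fuzzy reward trading off 4G cost against observed latency.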
international conference
2025
IEEE FG
MultiSensor-Home: A Wide-area Multi-modal Multi-view Dataset for Action Recognition and Transformer-based Sensor Fusion
Trung Thanh Nguyen, Yasutomo Kawanishi, Vijay John, and 2 more authors
In Proceedings of the 19th IEEE International Conference on Automatic Face and Gesture Recognition, 2025
@inproceedings{nguyen2025multisensor,note={International-Conference},title={MultiSensor-Home: A Wide-area Multi-modal Multi-view Dataset for Action Recognition and Transformer-based Sensor Fusion},author={Nguyen, Trung Thanh and Kawanishi, Yasutomo and John, Vijay and Komamizu, Takahiro and Ide, Ichiro},booktitle={Proceedings of the 19th IEEE International Conference on Automatic Face and Gesture Recognition},year={2025},primaryclass={cs.CV},}
IEEE/CVF WACV
CT to PET Translation: A Large-scale Dataset and Domain-Knowledge-Guided Diffusion Approach
Dac Thai Nguyen, Trung Thanh Nguyen, Huu Tien Nguyen, and 5 more authors
In Proceedings of the 2025 IEEE/CVF Winter Conference on Applications of Computer Vision, 2025
Positron Emission Tomography (PET) and Computed Tomography (CT) are essential for diagnosing, staging, and monitoring various diseases, particularly cancer. Despite their importance, the use of PET/CT systems is limited by the necessity for radioactive materials, the scarcity of PET scanners, and the high cost associated with PET imaging. In contrast, CT scanners are more widely available and significantly less expensive. In response to these challenges, our study addresses the issue of generating PET images from CT images, aiming to reduce both the medical examination cost and the associated health risks for patients. Our contributions are twofold: First, we introduce a conditional diffusion model named CPDM, which, to our knowledge, is one of the initial attempts to employ a diffusion model for translating from CT to PET images. Second, we provide the largest CT-PET dataset to date, comprising 2,028,628 paired CT-PET images, which facilitates the training and evaluation of CT-to-PET translation models. For the CPDM model, we incorporate domain knowledge to develop two conditional maps: the Attention map and the Attenuation map. The former helps the diffusion process focus on areas of interest, while the latter improves PET data correction and ensures accurate diagnostic information. Experimental evaluations across various benchmarks demonstrate that CPDM surpasses existing methods in generating high-quality PET images in terms of multiple metrics.
@inproceedings{Nguyen2025CTPET,note={International-Conference},author={Nguyen, Dac Thai and Nguyen, Trung Thanh and Nguyen, Huu Tien and Nguyen, Thanh Trung and Pham, Huy Hieu and Nguyen, Thanh Hung and Nguyen, Truong Thao and Nguyen, Phi Le},title={CT to PET Translation: A Large-scale Dataset and Domain-Knowledge-Guided Diffusion Approach},booktitle={Proceedings of the 2025 IEEE/CVF Winter Conference on Applications of Computer Vision},year={2025},publisher={IEEE},}
2024
ACM MMAsia
Action Selection Learning for Multi-label Multi-view Action Recognition
Trung Thanh Nguyen, Yasutomo Kawanishi, Takahiro Komamizu, and 1 more author
In Proceedings of the 6th ACM International Conference on Multimedia in Asia, 2024
Multi-label multi-view action recognition aims to recognize multiple concurrent or sequential actions from untrimmed videos captured by multiple cameras. Existing work has focused on multi-view action recognition in a narrow area with strong labels available, where the onset and offset of each action are labeled at the frame level. This study focuses on real-world scenarios where cameras are distributed to capture a wide area with only weak labels available at the video level. We propose a method named Multi-view Action Selection Learning (MultiASL), which leverages action selection learning to enhance view fusion by selecting the most useful information from different viewpoints. The proposed method includes a Multi-view Spatial-Temporal Transformer video encoder to extract spatial and temporal features from multi-viewpoint videos. Action Selection Learning is employed at the frame level, using pseudo ground truth obtained from weak video-level labels, to identify the most relevant frames for action recognition. Experiments in a real-world office environment using the MM-Office dataset demonstrate the superior performance of the proposed method compared to existing methods.
@inproceedings{10.1145/3696409.3700211,note={International-Conference},author={Nguyen, Trung Thanh and Kawanishi, Yasutomo and Komamizu, Takahiro and Ide, Ichiro},title={Action Selection Learning for Multi-label Multi-view Action Recognition},year={2024},isbn={9798400712739},doi={10.1145/3696409.3700211},booktitle={Proceedings of the 6th ACM International Conference on Multimedia in Asia},}
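A minimal sketch of deriving pseudo frame-level ground truth from per-view frame scores (the max-over-views fusion and top-k selection are illustrative assumptions, not the exact MultiASL procedure):

```python
import numpy as np

def pseudo_frame_labels(view_scores, k):
    """view_scores: (num_views, num_frames) per-frame action scores.
    Fuse views by keeping the most confident view per frame, then
    select the top-k frames as pseudo frame-level ground truth."""
    fused = view_scores.max(axis=0)          # best view for each frame
    topk = np.argsort(fused)[::-1][:k]       # k most confident frames
    return np.sort(topk)                     # frame indices in order
```

The selected frame indices can then supervise frame-level action selection even though only video-level labels exist.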
IEEE NCA
FedMAC: Tackling Partial-Modality Missing in Federated Learning with Cross-Modal Aggregation and Contrastive Regularization
Manh Duong Nguyen, Trung Thanh Nguyen, Huy Hieu Pham, and 3 more authors
In Proceedings of the 22nd IEEE International Symposium on Network Computing and Applications, 2024
@inproceedings{Nguyen2024FedMAC,note={International-Conference},author={Nguyen, Manh Duong and Nguyen, Trung Thanh and Pham, Huy Hieu and Hoang, Trong Nghia and Nguyen, Phi Le and Huynh, Thanh Trung},title={FedMAC: Tackling Partial-Modality Missing in Federated Learning with Cross-Modal Aggregation and Contrastive Regularization},booktitle={Proceedings of the 22nd IEEE International Symposium on Network Computing and Applications},year={2024},publisher={IEEE},}
IEEE NCA
FedCert: Federated Accuracy Certification
Minh Hieu Nguyen, Huu Tien Nguyen, Trung Thanh Nguyen, and 4 more authors
In Proceedings of the 22nd IEEE International Symposium on Network Computing and Applications, 2024
@inproceedings{Nguyen2024FedCert,note={International-Conference},author={Nguyen, Minh Hieu and Nguyen, Huu Tien and Nguyen, Trung Thanh and Nguyen, Manh Duong and Hoang, Trong Nghia and Nguyen, Truong Thao and Nguyen, Phi Le},title={FedCert: Federated Accuracy Certification},booktitle={Proceedings of the 22nd IEEE International Symposium on Network Computing and Applications},year={2024},publisher={IEEE},}
IEEE FG
One-Stage Open-Vocabulary Temporal Action Detection Leveraging Temporal Multi-Scale and Action Label Features
Trung Thanh Nguyen, Yasutomo Kawanishi, Takahiro Komamizu, and 1 more author
In Proceedings of the 18th IEEE International Conference on Automatic Face and Gesture Recognition, 2024
Open-vocabulary Temporal Action Detection (Open-vocab TAD) is an advanced video analysis approach that expands Closed-vocabulary Temporal Action Detection (Closed-vocab TAD) capabilities. Closed-vocab TAD is typically confined to localizing and classifying actions based on a predefined set of categories. In contrast, Open-vocab TAD goes further and is not limited to these predefined categories. This is particularly useful in real-world scenarios where the variety of actions in videos can be vast and not always predictable. The prevalent methods in Open-vocab TAD typically employ a 2-stage approach, which involves generating action proposals and then identifying those actions. However, errors made during the first stage can adversely affect the subsequent action identification accuracy. Additionally, existing studies face challenges in handling actions of different durations owing to the use of fixed temporal processing methods. Therefore, we propose a 1-stage approach consisting of two primary modules: Multi-scale Video Analysis (MVA) and Video-Text Alignment (VTA). The MVA module captures actions at varying temporal resolutions, overcoming the challenge of detecting actions with diverse durations. The VTA module leverages the synergy between visual and textual modalities to precisely align video segments with corresponding action labels, a critical step for accurate action identification in Open-vocab scenarios. Evaluations on the widely recognized THUMOS14 and ActivityNet-1.3 datasets showed that the proposed method achieved superior results compared to the other methods in both Open-vocab and Closed-vocab settings. This serves as a strong demonstration of the effectiveness of the proposed method in the TAD task.
@inproceedings{10581896,note={International-Conference},author={Nguyen, Trung Thanh and Kawanishi, Yasutomo and Komamizu, Takahiro and Ide, Ichiro},booktitle={Proceedings of the 18th IEEE International Conference on Automatic Face and Gesture Recognition},title={One-Stage Open-Vocabulary Temporal Action Detection Leveraging Temporal Multi-Scale and Action Label Features},year={2024},doi={10.1109/FG59268.2024.10581896},}
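A minimal sketch of the video-text alignment idea behind a VTA-style module, scoring temporal segments against action-label embeddings by cosine similarity (the function name, shapes, and argmax decision are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def align_segments_to_labels(segment_feats, label_feats):
    """segment_feats: (num_segments, d) visual features; label_feats:
    (num_labels, d) text embeddings of action names. Returns the cosine
    similarity matrix and the best-matching label per segment."""
    seg = segment_feats / np.linalg.norm(segment_feats, axis=1, keepdims=True)
    lab = label_feats / np.linalg.norm(label_feats, axis=1, keepdims=True)
    sim = seg @ lab.T                  # (num_segments, num_labels)
    return sim, sim.argmax(axis=1)
```

Because labels are matched in a shared embedding space rather than through a fixed classifier head, unseen action names can be scored at test time simply by encoding their text.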
2022
PRICAI
A Novel Approach for Pill-Prescription Matching with GNN Assistance and Contrastive Learning
Trung Thanh Nguyen, Hoang Dang Nguyen, Thanh Hung Nguyen, and 3 more authors
In Proceedings of the 2022 Pacific Rim International Conference on Artificial Intelligence, 2022
@inproceedings{nguyen2022novel,note={International-Conference},title={A Novel Approach for Pill-Prescription Matching with GNN Assistance and Contrastive Learning},author={Nguyen, Trung Thanh and Nguyen, Hoang Dang and Nguyen, Thanh Hung and Pham, Huy Hieu and Ide, Ichiro and Nguyen, Phi Le},booktitle={Proceedings of the 2022 Pacific Rim International Conference on Artificial Intelligence},pages={261--274},year={2022},organization={Springer},doi={10.1007/978-3-031-20862-1_19},}
ICPP
FedDRL: Deep Reinforcement Learning-based Adaptive Aggregation for Non-IID Data in Federated Learning
Nang Hung Nguyen, Phi Le Nguyen, Thuy Dung Nguyen, and 5 more authors
In Proceedings of the 51st International Conference on Parallel Processing, 2022
@inproceedings{nguyen2022feddrl,note={International-Conference},title={FedDRL: Deep Reinforcement Learning-based Adaptive Aggregation for Non-IID Data in Federated Learning},author={Nguyen, Nang Hung and Nguyen, Phi Le and Nguyen, Thuy Dung and Nguyen, Trung Thanh and Nguyen, Duc Long and Nguyen, Thanh Hung and Pham, Huy Hieu and Truong, Thao Nguyen},booktitle={Proceedings of the 51st International Conference on Parallel Processing},year={2022},doi={10.1145/3545008.3545085},}
IEEE WCNC
Deep Reinforcement Learning-based Offloading for Latency Minimization in 3-tier V2X Networks
Hieu Dinh, Nang Hung Nguyen, Trung Thanh Nguyen, and 3 more authors
In Proceedings of the 2022 IEEE Wireless Communications and Networking Conference, 2022
@inproceedings{9771583,note={International-Conference},author={Dinh, Hieu and Nguyen, Nang Hung and Nguyen, Trung Thanh and Nguyen, Thanh Hung and Nguyen, Truong Thao and Le Nguyen, Phi},booktitle={Proceedings of the 2022 IEEE Wireless Communications and Networking Conference},title={Deep Reinforcement Learning-based Offloading for Latency Minimization in 3-tier V2X Networks},year={2022},pages={1803-1808},doi={10.1109/WCNC51071.2022.9771583}}
2021
KSE
Data Augmentation for Small Face Datasets and Face Verification by Generative Adversarial Networks Inversion
Dinh Tan Nguyen, Cao Truong Tran, Trung Thanh Nguyen, and 4 more authors
In Proceedings of the 13th International Conference on Knowledge and Systems Engineering, 2021
@inproceedings{9648720,note={International-Conference},author={Nguyen, Dinh Tan and Tran, Cao Truong and Nguyen, Trung Thanh and Hoang, Cao Bao and Luu, Van Phu and Nguyen, Ba Ngoc and Cheong, Pou Ian},booktitle={Proceedings of the 13th International Conference on Knowledge and Systems Engineering},title={Data Augmentation for Small Face Datasets and Face Verification by Generative Adversarial Networks Inversion},year={2021},pages={1-6},doi={10.1109/KSE53942.2021.9648720},}
IEEE IPCCC
Q-learning-based Opportunistic Communication for Real-time Mobile Air Quality Monitoring Systems
Trung Thanh Nguyen, Truong Thao Nguyen, Tuan Anh Nguyen Dinh, and 2 more authors
In Proceedings of the 2021 IEEE International Performance, Computing, and Communications Conference, 2021
In this research, we focus on real-time air quality monitoring systems that rely on devices installed on automobiles. We investigate an opportunistic communication model in which devices can send the measured data directly to the air quality server through a 4G communication channel, or via Wi-Fi to adjacent devices or so-called Road Side Units deployed along the road. We aim to reduce the 4G cost while assuring data latency, where data latency is defined as the amount of time it takes for data to reach the server. We propose an offloading scheme that leverages Q-learning to accomplish this purpose. The experiment results show that our offloading method cuts the 4G communication cost by around 40-50% while keeping the latency of 99.5% of packets below the required threshold.
@inproceedings{9679398,note={International-Conference},author={Nguyen, Trung Thanh and Thao Nguyen, Truong and Nguyen Dinh, Tuan Anh and Nguyen, Thanh Hung and Nguyen, Phi Le},booktitle={Proceedings of the 2021 IEEE International Performance, Computing, and Communications Conference},title={Q-learning-based Opportunistic Communication for Real-time Mobile Air Quality Monitoring Systems},year={2021},doi={10.1109/IPCCC51483.2021.9679398},}
domestic conference
2025
PRMU - Japan
広域多視点マルチモーダル行動認識のためのセンサ統合手法とMultiSensor-Homeデータセットの提案
Trung Thanh Nguyen, Yasutomo Kawanishi, Vijay John, and 2 more authors
In Proceedings of the 2025 Pattern Recognition and Media Understanding, 2025
@inproceedings{Nguyen2025PRMU,note={Domestic-Conference},author={Nguyen, Trung Thanh and Kawanishi, Yasutomo and John, Vijay and Komamizu, Takahiro and Ide, Ichiro},title={広域多視点マルチモーダル行動認識のためのセンサ統合手法とMultiSensor-Homeデータセットの提案},booktitle={Proceedings of the 2025 Pattern Recognition and Media Understanding},year={2025},}
2024
PRMU - Japan
大域・局所特徴統合埋め込みに基づくオープン語彙時系列行動検出
Trung Thanh Nguyen, Yasutomo Kawanishi, Takahiro Komamizu, and 1 more author
In Proceedings of the 2024 Pattern Recognition and Media Understanding, 2024
@inproceedings{Nguyen2024PRMU,note={Domestic-Conference},author={Nguyen, Trung Thanh and Kawanishi, Yasutomo and Komamizu, Takahiro and Ide, Ichiro},title={大域・局所特徴統合埋め込みに基づくオープン語彙時系列行動検出},booktitle={Proceedings of the 2024 Pattern Recognition and Media Understanding},year={2024},}
2023
DBWS - Japan
医療分野におけるゼロショット錠剤‐処方箋対応付け
Trung Thanh Nguyen, Phi Le Nguyen, Yasutomo Kawanishi, and 2 more authors
In Proceedings of the 2023 Tokai-Kansai Database Workshop, 2023
To prevent medication intake errors, we proposed a method that matches photos of medication with their prescriptions, associating the pills in the images with the names listed in the prescription. The presentation was evaluated through participant voting and received the Best Presentation Award.
@inproceedings{Nguyen2023DBWS,note={Domestic-Conference},author={Nguyen, Trung Thanh and Nguyen, Phi Le and Kawanishi, Yasutomo and Komamizu, Takahiro and Ide, Ichiro},title={医療分野におけるゼロショット錠剤‐処方箋対応付け},booktitle={Proceedings of the 2023 Tokai-Kansai Database Workshop},year={2023},}
MIRU - Japan
時間マルチスケール特徴と行動ラベル特徴によるオープンボキャブラリ行動区間認識
Trung Thanh Nguyen, Yasutomo Kawanishi, Takahiro Komamizu, and 1 more author
In Proceedings of the 26th Meeting on Image Recognition and Understanding, 2023
@inproceedings{Nguyen2023,note={Domestic-Conference},author={Nguyen, Trung Thanh and Kawanishi, Yasutomo and Komamizu, Takahiro and Ide, Ichiro},title={時間マルチスケール特徴と行動ラベル特徴によるオープンボキャブラリ行動区間認識},booktitle={Proceedings of the 26th Meeting on Image Recognition and Understanding},year={2023}}
IEICE - Japan
PiDP: 処方箋に対する GCN の対照学習に基づく錠剤検出
Trung Thanh Nguyen, Yasutomo Kawanishi, Takahiro Komamizu, and 1 more author
In Proceedings of the 2023 IEICE General Conference, 2023
@inproceedings{2023pidp,note={Domestic-Conference},title={PiDP: 処方箋に対する GCN の対照学習に基づく錠剤検出},author={Nguyen, Trung Thanh and Kawanishi, Yasutomo and Komamizu, Takahiro and Ide, Ichiro},booktitle={Proceedings of the 2023 IEICE General Conference},year={2023},organization={The Institute of Electronics, Information and Communication Engineers},}
2022
HUST
Tối ưu truyền tin trong mạng điện toán biên di động bằng thuật toán Fuzzy Q-learning
Trung Thanh Nguyen, Nang Hung Nguyen, Manh Cuong Dao, and 2 more authors
Scientific Research Student Conference, Hanoi University of Science and Technology, 2022
Applying Mobile Cloud Computing (MCC) and Mobile Edge Computing (MEC) to offload computation tasks is a promising approach that enables low-latency applications to run on intelligent vehicles. Most existing offloading studies focus on a single model, either MEC or MCC. In this report, we study the combined use of these technologies in a 3-tier intelligent V2X transportation network, in which vehicles can offload computation tasks via both MEC and MCC. Specifically, we consider the optimal probabilities for routing transmissions over three channels: V2I, V2N, and I2N. Our study makes three main contributions. First, we propose a mathematical model that accurately computes the processing latency of tasks, together with a formulation for finding the latency-minimizing solution. Second, we propose an optimization method based on a Genetic Algorithm (GA). Third, observing the weaknesses of GA, we propose a reinforcement learning algorithm, multi-agent multi-armed bandits (MAB), with an improved Sigmoid-based exploration mechanism (SEM) to minimize latency. Experimental results show that the proposed GA in the 3-tier model reduces latency by up to 99.9% compared with existing 2-tier models, the proposed SEM exploration mechanism reduces average task latency by 35% compared with the basic MAB, and the improved MAB shortens latency by 18.5% on average and by 56.9% in the best case compared with the GA.
@article{JSTIC2021,note={Domestic-Journal},title={Mô hình mạng giao thông thông minh ba tầng và giải pháp giảm thiểu độ trễ truyền tin ứng dụng phương pháp học tăng cường},author={Nguyen, Nang Hung and Nguyen, Trung Thanh and Pham, Minh Khiem and Dinh, Van Hieu and Nguyen Dinh, Tuan Anh and Nguyen, Thanh Hung and Nguyen, Phi Le},year={2021},journal={Journal of Science and Technology on Information and Communications},}
HUST
Mô hình mạng giao thông thông minh ba tầng và giải pháp giảm thiểu độ trễ truyền tin ứng dụng phương pháp học tăng cường
Minh Khiem Pham, Nang Hung Nguyen, Trung Thanh Nguyen, and 2 more authors
Scientific Research Student Conference, Hanoi University of Science and Technology, 2021
This thesis was awarded the Best Thesis Award by the School of Information and Communications Technology, Hanoi University of Science and Technology, Vietnam.
miscellaneous (article, arXiv, etc.)
2025
arXiv
MultiTSF: Transformer-based Sensor Fusion for Human-Centric Multi-view and Multi-modal Action Recognition
Trung Thanh Nguyen, Yasutomo Kawanishi, Vijay John, and 2 more authors