Chihan Huang, Xiaobo Shen
AAAI Conference on Artificial Intelligence (AAAI) 2025 Poster
Deep hashing models have achieved great success in retrieval tasks owing to their powerful representation and strong information compression capabilities. However, they inherit the vulnerability of deep neural networks to adversarial perturbations. Attackers can severely degrade the retrieval capability of hashing models by adding subtle, carefully crafted adversarial perturbations to benign images, turning them into adversarial images. Most existing adversarial attacks target image classification models, and few focus on retrieval models. We propose HUANG, the first targeted adversarial attack algorithm to leverage a diffusion model for hashing retrieval in black-box scenarios. In our approach, adversarial denoising uses adversarial perturbations and a residual image to guide the shift from the benign to the adversarial distribution. Extensive experiments demonstrate the superiority of HUANG across different datasets, achieving state-of-the-art performance in black-box targeted attacks. Additionally, the dynamic interplay between denoising and adding adversarial perturbations in adversarial denoising endows HUANG with exceptional robustness and transferability.
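For readers unfamiliar with hashing retrieval, the following is a minimal sketch of what a targeted attack tries to change: the ranking produced by Hamming distance between binary codes. It does not implement HUANG or any diffusion model. The 8-bit codes, the class names, and the helper names `hamming` and `retrieve` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions where two equal-length binary codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def retrieve(query: Sequence[int], database: List[Sequence[int]], k: int) -> List[int]:
    """Indices of the k database codes closest to the query in Hamming distance."""
    order = sorted(range(len(database)), key=lambda i: hamming(query, database[i]))
    return order[:k]


# Toy database (assumed): items 0-1 belong to class "cat", items 2-3 to class "car".
database = [
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 1],
]

benign_code = [1, 1, 1, 1, 0, 0, 0, 1]       # assumed hash of a benign "cat" image
adversarial_code = [0, 0, 0, 0, 1, 1, 1, 0]  # assumed hash after a targeted attack

benign_top2 = retrieve(benign_code, database, k=2)            # "cat" items
adversarial_top2 = retrieve(adversarial_code, database, k=2)  # "car" items
```

In this toy setting the attack counts as successful because the adversarial query retrieves only items of the target class, even though the underlying image would still look like the benign one.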
Chihan Huang, Xiaobo Shen
International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025 Poster
Deep hashing models have achieved exceptional performance in retrieval tasks owing to their robust representational capabilities. However, they inherit the vulnerability of deep neural networks to adversarial attacks: finely crafted adversarial perturbations can lead them to return incorrect retrieval results. Although numerous adversarial attack methods have been proposed, little research has focused on targeted black-box attacks against deep hashing models. We introduce the Efficient Multi-branch Black-box Semantic-aware Targeted Attack against Deep Hashing Retrieval (EmbSTar), which is capable of executing targeted black-box attacks on hashing models. First, we distill the target model to create a knockoff model. Next, we devise novel Target Fusion and Target Adaptation modules to integrate and enhance the semantic information of the target label and image. The knockoff model is then used to align the adversarial image more closely with the target image semantically. With the knockoff model, we can obtain powerful targeted attacks with few queries. Extensive experiments demonstrate that EmbSTar significantly surpasses previous models in its targeted attack capabilities, achieving state-of-the-art (SOTA) performance for targeted black-box attacks.
Chihan Huang, Xiaobo Shen
International Conference on Computational Linguistics (COLING) 2025 Poster
Ancient Chinese poetry is a treasure of Chinese culture. To address the absence of pre-trained models for ancient poetry, we introduce PoemBERT, a BERT-based model pre-trained on a corpus of classical Chinese poetry. Recognizing the unique emotional depth and linguistic precision of poetry, we incorporate sentiment and pinyin embeddings into the model, which enhances its sensitivity to emotional information and addresses the challenge that a single Chinese character can have multiple pronunciations. Additionally, we propose Character Importance-based masking and dynamic masking strategies, which significantly improve the model's ability to extract imagery-related features and handle poetry-specific information. Fine-tuning PoemBERT on various downstream tasks, including poem generation and sentiment classification, yields state-of-the-art performance in both automatic and manual evaluations. We also explain the choice of the dynamic masking rate strategy and propose a solution to the problem of small dataset size.
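The abstract does not say how the dynamic masking rate is scheduled, so the following is only a sketch of the general idea of a masking rate that changes over training. The linear schedule, the start and end rates, and the names `mask_rate` and `mask_tokens` are assumptions, not PoemBERT's actual procedure. Character Importance-based masking is not modelled here; positions are chosen uniformly at random.

```python
import random
from typing import List


def mask_rate(step: int, total_steps: int, start: float = 0.1, end: float = 0.3) -> float:
    """Linearly interpolate the masking rate from `start` to `end` (assumed schedule)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return start + (end - start) * progress


def mask_tokens(tokens: List[str], rate: float, rng: random.Random) -> List[str]:
    """Replace roughly `rate` of the tokens with [MASK], at least one token."""
    n_mask = max(1, round(rate * len(tokens)))
    positions = set(rng.sample(range(len(tokens)), n_mask))
    return [("[MASK]" if i in positions else tok) for i, tok in enumerate(tokens)]


# Two lines from a well-known Li Bai poem, split into characters.
tokens = list("床前明月光疑是地上霜")
rng = random.Random(0)

early = mask_tokens(tokens, mask_rate(0, 10), rng)   # rate 0.1 at the first step
late = mask_tokens(tokens, mask_rate(10, 10), rng)   # rate 0.3 at the last step
```

Because the mask positions are re-sampled on every call, the same line is seen with different masks as training proceeds.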
Chihan Huang
European Conference on Artificial Intelligence (ECAI) 2024 Poster
The translation of artistic style is a challenging yet crucial task for both computer vision and the arts, and the unique attributes of Chinese ink painting (such as its use of negative space, brushwork, and ink diffusion) present significant challenges to the application of existing style transfer algorithms. In response to these distinctive characteristics, we propose a progressive artistic aesthetic ink painting style transfer method. The progressive multi-scale aesthetic style attention module in the network leverages the complementary benefits of shallow and deep style information to progressively fuse style features across multiple scales. The covariance transform fusion module addresses stylistic disharmony and enhances the aesthetic quality of the style transfer while effectively preserving the content structure. Additionally, we develop an adaptive spatial interpolation module for fine-tuning detailed information. Finally, we conduct comparative experiments against previous studies as well as ablation studies, and invite 30 experts in art and design to perform manual evaluations. The results demonstrate that our method achieves more aesthetically pleasing Chinese ink painting style transfer, confirming its effectiveness and artistic integrity.
Chihan Huang
International Journal of Crashworthiness 2024
Reducing the weight of the bumper helps to reduce fuel consumption and pollutant emissions. How to design a lightweight bumper, and which method to choose for doing so, is a problem that researchers urgently need to solve. This paper establishes a finite element model of an automobile bumper and uses LS-DYNA for collision simulation. First, the shape of the bumper is optimised. Then, bumper beam thickness, energy absorber thickness, thickened area width and thickened area thickness are selected as design variables to construct response surface models, and the coefficient of determination (R²) of each model is larger than 0.9. Afterwards, NSGA-II with Bayesian optimisation is adopted to realise the lightweight design. Finally, the optimised design is simulated and compared with the original bumper. The results show that the optimised bumper achieves a weight reduction of 53.96% while improving crashworthiness and energy absorption.
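The R² threshold quoted above is the usual check that a response surface model reproduces the simulation results well enough to be used in place of them. As a small sketch of that check, the function below computes R² = 1 - SS_res / SS_tot; the sample values are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical responses: finite element results vs. response surface predictions.
simulated = [1.0, 2.0, 3.0, 4.0]
surrogate = [1.1, 1.9, 3.2, 3.8]

fit_quality = r_squared(simulated, surrogate)
acceptable = fit_quality > 0.9  # the acceptance threshold mentioned in the abstract
```

A model that fails this check would normally be refitted with more sample points or a higher-order surface before being handed to the optimiser.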
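Chihan Huang (黄驰涵), Xiaobo Shen (沈肖波)
Journal of Nanjing University of Information Science & Technology (南京信息工程大学学报) 2024
Cross-modality person re-identification is a challenging task that aims to match pedestrian images between the visible and infrared modalities, and it plays an important role in criminal investigation and intelligent video surveillance. To address the weak fine-grained feature extraction in cross-modality person re-identification, this paper proposes a person re-identification model based on fused attention and feature enhancement. First, automatic data augmentation is used to mitigate differences in viewpoint and scale across cameras, and a cross-attention multi-scale Vision Transformer processes multi-scale features to generate more discriminative feature representations. Next, channel attention and spatial attention mechanisms are proposed to learn the information that matters for discriminative features when fusing visible and infrared image features. Finally, a loss function is designed that adopts a hard triplet loss with adaptive weights, which strengthens the correlation between samples and improves the ability to distinguish different pedestrians across visible and infrared images. Extensive experiments on the SYSU-MM01 and RegDB datasets show that the proposed method reaches mAP of 68.05% and 85.19%, respectively, improving on previous work, and ablation experiments and comparative analysis verify the advancement and effectiveness of the model.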
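The hard triplet loss mentioned in this abstract is commonly computed in its batch-hard form: for each anchor, take the farthest sample of the same identity and the nearest sample of a different identity. The sketch below shows that standard form only. The adaptive weighting is not specified in the abstract and is therefore omitted, and the margin, the 2-D features, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def batch_hard_triplet_loss(
    features: List[Sequence[float]], labels: List[int], margin: float = 0.3
) -> float:
    """Mean over anchors of max(0, margin + hardest_positive - hardest_negative)."""
    losses = []
    for i, (f_i, y_i) in enumerate(zip(features, labels)):
        pos = [euclidean(f_i, f_j)
               for j, (f_j, y_j) in enumerate(zip(features, labels))
               if j != i and y_j == y_i]
        neg = [euclidean(f_i, f_j)
               for f_j, y_j in zip(features, labels) if y_j != y_i]
        if not pos or not neg:
            continue  # anchor has no valid triplet in this batch
        losses.append(max(0.0, margin + max(pos) - min(neg)))
    return sum(losses) / len(losses) if losses else 0.0


# Hypothetical batch: two identities, two images each (e.g. one visible, one infrared).
labels = [0, 0, 1, 1]
well_separated = [[0.0, 0.0], [0.0, 1.0], [3.0, 0.0], [3.0, 1.0]]
poorly_separated = [[0.0, 0.0], [0.0, 2.0], [1.0, 0.0], [1.0, 2.0]]

loss_good = batch_hard_triplet_loss(well_separated, labels)
loss_bad = batch_hard_triplet_loss(poorly_separated, labels)
```

The loss is zero when every identity is already separated by more than the margin, and grows when images of the same person lie farther apart than images of different people.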
Chihan Huang
International Journal of Crashworthiness 2024
To improve the safety of children in child seats during a vehicle collision, a finite element model of a child safety seat is established, and LS-DYNA is used for crash simulation. Backrest inclination, headrest inclination, and seat belt centre height are selected as design variables to establish response surface models. The application of chaotic mutation theory to particle swarm optimisation (PSO) is proposed, and the optimal child seat parameters are obtained using chaotic-mutation-based grouped multi-objective particle swarm optimisation (MOPSO). The optimal solution is then simulated again and compared with the original design. The Head Performance Criterion (HPC) is 47.08% better than that of the original child seat. In addition, rib deformation is reduced by 21.30%, neck shear stress by 55.74%, and thigh stress by 48.55%. The results suggest that this parameter combination gives the best overall performance in child injury prevention, with reduced injury measures across all the body regions assessed.
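The abstract does not state which chaotic map drives the mutation. As a sketch of the general idea, the code below uses the logistic map, a common choice in chaotic PSO variants, to perturb a particle's position within its bounds. The map, the seed, the mutation strength, the variable bounds, and the function names are all assumptions rather than the paper's settings.

```python
from typing import List, Sequence, Tuple


def logistic_map(x: float, r: float = 4.0) -> float:
    """One iteration of the logistic map, chaotic on (0, 1) for r = 4."""
    return r * x * (1.0 - x)


def chaotic_mutation(
    position: Sequence[float],
    bounds: Sequence[Tuple[float, float]],
    seed: float = 0.3,
    strength: float = 0.1,
) -> List[float]:
    """Shift each coordinate by a chaotic step, then clip it to its bounds."""
    c = seed
    mutated = []
    for value, (low, high) in zip(position, bounds):
        c = logistic_map(c)                                # next chaotic value in (0, 1)
        step = strength * (2.0 * c - 1.0) * (high - low)   # step in [-strength, strength] * range
        mutated.append(min(high, max(low, value + step)))
    return mutated


# Hypothetical design point: backrest inclination, headrest inclination,
# seat belt centre height (units and ranges assumed).
position = [25.0, 10.0, 500.0]
bounds = [(20.0, 30.0), (5.0, 15.0), (450.0, 550.0)]

mutated = chaotic_mutation(position, bounds)
```

In a PSO loop, a mutation of this kind would typically be applied to particles that have stagnated, so that the swarm can leave a local optimum without losing feasibility.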