Bridging the gap: dual perception attention and local-global similarity fusion for cross-modal image-text matching
Bridging modal gaps: A Cross-Modal Feature Complementation and Feature Projection Network for visible-infrared person re-identification