Deep canonical correlation analysis, ICML, 2013. ,
, , vol.1, p.6, 2016.
Large scale online learning of image similarity through ranking, JMLR, vol.11, issue.2, pp.1109-1135, 2010. ,
Empirical evaluation of gated recurrent neural networks on sequence modeling, NIPS w. on Deep Learning, vol.2, p.4, 2014. ,
R-FCN: Object detection via region-based fully convolutional networks, NIPS, 2016. ,
Wildcat: Weakly supervised learning of deep convnets for image classification, pointwise localization and segmentation, CVPR, vol.3, p.7, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01515640
Weldon: Weakly supervised learning of deep convolutional neural networks, CVPR, 2008. ,
URL : https://hal.archives-ouvertes.fr/hal-01343785
Linking image and text with 2-way nets, vol.6, p.7, 2016. ,
Linking image and text with 2-way nets, CVPR, 2017. ,
VSE++: Improved visual-semantic embeddings, vol.6, p.7, 2005. ,
DeViSE: A deep visual-semantic embedding model, NIPS, vol.1, 2013. ,
Learning globallyconsistent local distance functions for shape-based image retrieval and classification, ICCV, 2007. ,
Deep residual learning for image recognition, CVPR, vol.2, p.6, 2016. ,
Long short-term memory, Neural computation, vol.9, issue.8, pp.1735-1780, 1997. ,
Deep metric learning using triplet network, ICLRw, 2015. ,
Relations between two sets of variates, Biometrika, issue.2, 1936. ,
Deep visual-semantic alignments for generating image descriptions, CVPR, 2005. ,
Convolutional neural networks for sentence classification, EMNLP, 2014. ,
Adam: A method for stochastic optimization, ICLR, 2014. ,
Unifying visual-semantic embeddings with multimodal neural language models, 2005. ,
Skip-thought vectors, NIPS, vol.2, p.4, 2015. ,
Visual genome: Connecting language and vision using crowdsourced dense image annotations, 2016. ,
Imagenet classification with deep convolutional neural networks, NIPS, 2012. ,
Kernel and nonlinear canonical correlation analysis, Int. J. Neural Syst, issue.2, 2000. ,
, Training RNNs as fast as CNNs, p.6, 2004.
Microsoft COCO: Common objects in context, ECCV, vol.1, p.5, 2014. ,
Multimodal convolutional neural networks for matching image and sentence, ICCV, 2015. ,
PCCA: A new approach for distance learning from sparse pairwise constraints, CVPR, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00806007
Efficient estimation of word representations in vector space, vol.2, p.4, 2013. ,
Dual attention networks for multimodal reasoning and matching, 2017. ,
Hierarchical multimodal LSTM for dense visual-semantic embedding, CVPR, 2017. ,
GloVe: Global vectors for word representation, EMNLP, 2014. ,
Joint imagetext representation by gaussian visual-semantic embedding, ACMMM, 2016. ,
Learning cross-modal embeddings for cooking recipes and food images, CVPR, 2017. ,
Grad-CAM: Visual explanations from deep networks via gradient-based localization, 2017. ,
Very deep convolutional networks for large-scale image recognition, 2014. ,
Learning a similarity metric discriminatively, with application to face verification, CVPR, 2005. ,
Learning deep structurepreserving image-text embeddings, CVPR, 2016. ,
Learning two-branch neural networks for image-text matching tasks, vol.6, p.7, 2003. ,
Distance metric learning for large margin nearest neighbor classification, JMLR, vol.10, issue.2, pp.207-244, 2009. ,
Wsabie: Scaling up to large vocabulary image annotation, IJCAI, 2011. ,
Weakly-supervised visual grounding of phrases with linguistic structures, CVPR, vol.6, p.7, 2017. ,
Distance metric learning, with application to clustering with side-information, NIPS, 2002. ,
Deep correlation for matching images and text, CVPR, 2015. ,
Learning deep features for discriminative localization, CVPR, 2008. ,