Don't just assume; look and answer: Overcoming priors for visual question answering, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2008. ,
Dont just assume; look and answer: Overcoming priors for visual question answering, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.4971-4980, 2018. ,
Bottom-up and top-down attention for image captioning and visual question answering, IEEE Conference on Computer Vision and Pattern Recognition CVPR, vol.6, p.7, 2005. ,
VQA: Visual Question Answering, International Conference on Computer Vision (ICCV), vol.1, 2015. ,
Neural machine translation by jointly learning to align and translate ,
Deep attention neural tensor network for visual question answering, The European Conference on Computer Vision (ECCV), 2018. ,
Relational inductive biases, deep learning, and graph networks, vol.2, p.4, 2018. ,
Mutan: Multimodal tucker fusion for visual question answering, vol.6, p.7, 2005. ,
URL : https://hal.archives-ouvertes.fr/hal-02073637
Iterative visual reasoning beyond convolutions, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol.1, 2018. ,
Structured attentions for visual question answering, IEEE International Conference on Computer Vision (ICCV, 2002. ,
Multimodal compact bilinear pooling for visual question answering and visual grounding, EMNLP. The Association for Computational Linguistics, 2007. ,
Making the V in VQA matter: Elevating the role of image understanding in Visual Question Answering, IEEE Conference on Computer Vision and Pattern Recognition CVPR, 2008. ,
Deep residual learning for image recognition, IEEE Conference on Computer Vision and Pattern Recognition CVPR, 2016. ,
Learning to reason: End-to-end module networks for visual question answering, Proceedings of the IEEE International Conference on Computer Vision (ICCV, 2002. ,
Compositional attention networks for machine reasoning, International Conference on Learning Representations, vol.2, p.3, 2018. ,
CLEVR: A diagnostic dataset for compositional language and elementary visual reasoning, IEEE Conference on Computer Vision and Pattern Recognition CVPR, 2005. ,
Inferring and executing programs for visual reasoning, ICCV, vol.1, 2017. ,
An analysis of visual question answering algorithms, The IEEE International Conference on Computer Vision (ICCV), 2007. ,
, , vol.2, p.7, 2018.
Hadamard Product for Low-rank Bilinear Pooling, The 5th International Conference on Learning Representations, 2007. ,
Adam: A method for stochastic optimization, ICLR, 2015. ,
Skip-thought vectors, Proceedings of the 28th International Conference on Neural Information Processing Systems, vol.2, pp.3294-3302, 2015. ,
,
Visual genome: Connecting language and vision using crowdsourced dense image annotations, International Journal of Computer Vision, vol.123, issue.1, pp.32-73, 2017. ,
Imagenet classification with deep convolutional neural networks, Advances in Neural Information Processing Systems, vol.25, pp.1097-1105, 2012. ,
Discovering causal signals in images, IEEE Conference on Computer Vision and Pattern Recognition CVPR, 2017. ,
Visual relationship detection with language priors, European Conference on Computer Vision, 2016. ,
Learning visual question answering by bootstrapping hard attention, The European Conference on Computer Vision (ECCV), 2008. ,
A multi-world approach to question answering about real-world scenes based on uncertain input, Advances in Neural Information Processing Systems, vol.27, pp.1682-1690, 2014. ,
Transparency by design: Closing the gap between performance and interpretability in visual reasoning, IEEE Conference on Computer Vision and Pattern Recognition CVPR, 2002. ,
Training recurrent answering units with joint loss minimization for vqa, 2016. ,
Learning conditioned graph structures for interpretable visual question answering, vol.6, p.7, 2003. ,
Film: Visual reasoning with a general conditioning layer, AAAI, vol.2, p.5, 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-01648685
Measuring abstract reasoning in neural networks, JMLR Workshop and Conference Proceedings, vol.80, pp.4477-4486, 2018. ,
A simple neural network module for relational reasoning, Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems, pp.4974-4983, 2017. ,
Question type guided attention in visual question answering, The European Conference on Computer Vision (ECCV), 2018. ,
Show, attend and tell: Neural image caption generation with visual attention, Proceedings of the 32Nd International Conference on International Conference on Machine Learning, vol.37, pp.2048-2057, 2015. ,
Stacked attention networks for image question answering, IEEE Conference on Computer Vision and Pattern Recognition CVPR, 2016. ,
Multi-modal factorized bilinear pooling with co-attention learning for visual question answering, IEEE International Conference on Computer Vision (ICCV), vol.2, p.3, 2017. ,
Beyond bilinear: Generalized multi-modal factorized high-order pooling for visual question answering, IEEE Transactions on Neural Networks and Learning Systems, vol.2, p.3, 2018. ,
Pythia v0.1: the winning entry to the vqa challenge, p.7, 2005. ,
Learning to count objects in natural images for visual question answering, International Conference on Learning Representations, vol.2, p.7, 2018. ,