Unsupervised Multiway Data Analysis: A Literature Survey, IEEE Transactions on Knowledge and Data Engineering, vol.21, issue.1, pp.6-20, 2009. ,
Modeling and Multiway Analysis of Chatroom Tensors, Proceedings of the 2005 IEEE International Conference on Intelligence and Security Informatics. ISI'05, pp.256-268, 2005. ,
TallyQA: Answering Complex Counting Questions, Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2019. ,
Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018. ,
Practical aspects of PARAFAC modeling of fluorescence excitation-emission data, Journal of Chemometrics, vol.17, 2003. ,
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018. ,
Learning to Compose Neural Networks for Question Answering, NAACL HLT 2016, pp.1545-1554, 2016. ,
Neural module networks, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.39-48, 2016. ,
VQA: Visual Question Answering, Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2015. ,
Factors of Transferability for a Generic ConvNet Representation, IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), vol.38, issue.9, 2016. ,
Temporal Analysis of Semantic Graphs Using ASALSAN, Seventh IEEE International Conference on Data Mining (ICDM 2007), pp.33-42, 2007. ,
Neural Machine Translation by Jointly Learning to Align and Translate, Proceedings of the International Conference on Learning Representations (ICLR), 2015. ,
Deep Attention Neural Tensor Network for Visual Question Answering, Proceedings of the IEEE European Conference on Computer Vision (ECCV), 2018. ,
Relational inductive biases, deep learning, and graph networks, 2018. ,
BLOCK: Bilinear Superdiagonal Fusion for Visual Question Answering and Visual Relationship Detection, Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2019. ,
URL : https://hal.archives-ouvertes.fr/hal-02073644
MUTAN: Multimodal Tucker Fusion for Visual Question Answering, Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-02073637
MUREL: Multimodal Relational Reasoning for Visual Question Answering, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019. ,
URL : https://hal.archives-ouvertes.fr/hal-02073649
Learning Long-Term Dependencies with Gradient Descent is Difficult, IEEE Transactions on Neural Networks, vol.5, issue.2, 1994. ,
A Guide to Recurrent Neural Networks and Backpropagation, 2001. ,
Optimization Methods for Large-Scale Machine Learning, SIAM Review, vol.60, 2016. ,
Review on Multiway Analysis in Chemistry, Critical Reviews in Analytical Chemistry, vol.36, 2000. ,
Analysis of individual differences in multidimensional scaling via an n-way generalization of "Eckart-Young" decomposition, Psychometrika, 1970. ,
Finding Frequent Items in Data Streams, International Colloquium on Automata, Languages and Programming, 2002. ,
Structured Attentions for Visual Question Answering, Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017. ,
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation, Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-01433235
Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling, Presented at the Deep Learning workshop at NIPS 2014, 2014. ,
Tensor decompositions for signal processing applications: From two-way to multiway component analysis, IEEE Signal Processing Magazine, vol.32, pp.145-163, 2015. ,
Tensors: A brief introduction, IEEE Signal Processing Magazine, vol.31, 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-00923279
Detecting Visual Relationships with Deep Relational Networks, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. ,
, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
Decompositions of a Higher-Order Tensor in Block Terms - Part II: Definitions and Uniqueness, SIAM J. Matrix Anal. Appl., vol.30, issue.3, 2008. ,
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2018. ,
Finding structure in time, Cognitive Science, vol.14, issue.2, 1990. ,
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding, Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), The Association for Computational Linguistics, 2016. ,
Neocognitron: a Self Organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position, Biological Cybernetics, vol.36, 1980. ,
Higher-Order Web Link Analysis Using Multilinear Algebra, pp.242-249, 2005. ,
Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question Answering, Advances in Neural Information Processing Systems (NIPS), NIPS'15, 2015. ,
Generative adversarial nets, Advances in Neural Information Processing Systems (NIPS), pp.2672-2680, 2014. ,
Detection of irregular heartbeats using tensors, 2015 Computing in Cardiology Conference (CinC), 2015. ,
Making the V in VQA matter: Elevating the role of image understanding in Visual Question Answering, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. ,
LSTM: A Search Space Odyssey, 2015. ,
VizWiz Grand Challenge: Answering Visual Questions from Blind People, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018. ,
Canonical Correlation Analysis: An Overview with Application to Learning Methods, Neural Computation, vol.16, issue.12, pp.2639-2664, 2004. ,
Foundations of the Parafac Procedure: Models and Conditions for an "explanatory" Multimodal Factor Analysis, 2001. ,
Mask R-CNN, Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017. ,
Deep Residual Learning for Image Recognition, 2015. ,
Deep Residual Learning for Image Recognition, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. ,
Long Short-Term Memory, Neural Computation, vol.9, issue.8, 1997. ,
Relations Between Two Sets of Variates, Biometrika, vol.28, issue.3-4, pp.321-377, 1936. ,
Attribute-Enhanced Face Recognition With Neural Tensor Fusion Networks, Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017. ,
Learning to Reason: End-to-End Module Networks for Visual Question Answering, Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017. ,
Revisiting Visual Question Answering Baselines, Computer Vision - ECCV 2016, 2016. ,
Pythia v0.1: The Winning Entry to the VQA Challenge, 2018. ,
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.1988-1997, 2017. ,
Inferring and Executing Programs for Visual Reasoning, Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017. ,
An Analysis of Visual Question Answering Algorithms, Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017. ,
DVQA: Understanding Data Visualizations via Question Answering, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018. ,
The Unreasonable Effectiveness of Recurrent Neural Networks, 2015. ,
Bilinear Attention Networks, Advances in Neural Information Processing Systems (NIPS), pp.1571-1581, 2018. ,
Multimodal Residual Learning for Visual QA, Advances in Neural Information Processing Systems (NIPS), pp.361-369, 2016. ,
Hadamard Product for Low-rank Bilinear Pooling, Proceedings of the International Conference on Learning Representations (ICLR), 2017. ,
Adam: A Method for Stochastic Optimization, 2014. ,
Multimodal Neural Language Models, Proceedings of Machine Learning Research, Beijing, pp.595-603, 2014. ,
Unifying visual-semantic embeddings with multimodal neural language models, 2014. ,
Skip-thought Vectors, Advances in Neural Information Processing Systems (NIPS), pp.3294-3302, 2015. ,
ImageNet Classification with Deep Convolutional Neural Networks, Advances in Neural Information Processing Systems (NIPS), 2012. ,
Visual genome: Connecting language and vision using crowdsourced dense image annotations, International Journal of Computer Vision, pp.32-73, 2017. ,
Speeding up convolutional neural networks using fine-tuned CP-decomposition, 2014. ,
Backpropagation Applied to Handwritten Zip Code Recognition, Neural Computation, vol.1, issue.4, 1989. ,
ViP-CNN: Visual Phrase Guided Convolutional Neural Network, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. ,
Deep Variation-Structured Reinforcement Learning for Visual Relationship and Attribute Detection, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. ,
Microsoft COCO: Common Objects in Context, Proceedings of the IEEE European Conference on Computer Vision (ECCV), 2014. ,
Bilinear CNN Models for Fine-grained Visual Recognition, Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2015. ,
Fully Convolutional Networks for Semantic Segmentation, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015. ,
Visual Relationship Detection with Language Priors, Proceedings of the IEEE European Conference on Computer Vision (ECCV), 2016. ,
Hierarchical Question-Image Co-Attention for Visual Question Answering, Advances in Neural Information Processing Systems (NIPS), pp.289-297, 2016. ,
Learning to Answer Questions from Image Using Convolutional Neural Network, Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), AAAI'16, 2016. ,
Learning Visual Question Answering by Bootstrapping Hard Attention, Proceedings of the IEEE European Conference on Computer Vision (ECCV), 2018. ,
A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input, Advances in Neural Information Processing Systems (NIPS). Ed. by Z. Ghahramani, pp.1682-1690, 2014. ,
Towards a Visual Turing Challenge, Learning Semantics, 2014. ,
Ask Your Neurons: A Deep Learning Approach to Visual Question Answering, 2016. ,
Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018. ,
Recurrent neural network based language model, 2010. ,
Distributed representations of words and phrases and their compositionality, Advances in Neural Information Processing Systems (NIPS), 2013. ,
Decomposing EEG data into space-time-frequency components using Parallel Factor Analysis, 2004. ,
Applications of tensor (multiway array) factorizations and decompositions in data mining, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol.1, issue.1, pp.24-40, 2011. ,
Dual Attention Networks for Multimodal Reasoning and Matching, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. ,
Training Recurrent Answering Units with Joint Loss Minimization for VQA, 2016. ,
Learning Conditioned Graph Structures for Interpretable Visual Question Answering, 2018. ,
The Building Blocks of Interpretability, 2018. ,
Understanding LSTM Networks, 2015. ,
On the difficulty of training recurrent neural networks, JMLR Proceedings, JMLR.org, vol.28, 2013. ,
Glove: Global vectors for word representation, Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014. ,
FiLM: Visual Reasoning with a General Conditioning Layer, Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-01648685
Improving the Fisher Kernel for Large-Scale Image Classification, Proceedings of the IEEE European Conference on Computer Vision (ECCV), 2010. ,
URL : https://hal.archives-ouvertes.fr/inria-00548630
Deep contextualized word representations, 2018. ,
Weakly-supervised learning of visual relations, Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01576035
Language Models are Unsupervised Multitask Learners, 2019. ,
CNN Features off-the-shelf: an Astounding Baseline for Recognition, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshop, 2014. ,
Generative Adversarial Text to Image Synthesis, Proceedings of Machine Learning Research, vol.48, pp.1060-1069, 2016. ,
Exploring Models and Data for Image Question Answering, Advances in Neural Information Processing Systems (NIPS), pp.2953-2961, 2015. ,
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, Advances in Neural Information Processing Systems (NIPS), 2015. ,
ImageNet Large Scale Visual Recognition Challenge, International Journal of Computer Vision, pp.211-252, 2015. ,
KVQA: Knowledge-Aware Visual Question Answering, Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2019. ,
A simple neural network module for relational reasoning, Advances in Neural Information Processing Systems (NIPS), 2017. ,
Question Type Guided Attention in Visual Question Answering, Proceedings of the IEEE European Conference on Computer Vision (ECCV), 2018. ,
Where To Look: Focus Regions for Visual Question Answering, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. ,
Grounded Compositional Semantics for Finding and Describing Images with Sentences, Transactions of the Association for Computational Linguistics (TACL), vol.2, pp.207-218, 2014. ,
CubeSVD: A Novel Approach to Personalized Web Search, Proceedings of the 14th International Conference on World Wide Web, WWW '05, 2005. ,
Tips and Tricks for Visual Question Answering: Learnings From the 2017 Challenge, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018. ,
Some mathematical notes on three-mode factor analysis, Psychometrika, vol.31, issue.3, 1966. ,
Multilinear Analysis of Image Ensembles: TensorFaces, Proceedings of the IEEE European Conference on Computer Vision (ECCV), 2002. ,
A Tensor Approximation Approach to Dimensionality Reduction, International Journal of Computer Vision, 2008. ,
Ask me anything: free-form visual question answering based on knowledge from external sources, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. ,
Dynamic Memory Networks for Visual and Textual Question Answering, JMLR.org, 2016. ,
Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering, Proceedings of the IEEE European Conference on Computer Vision (ECCV), pp.451-466, 2016. ,
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, ICML'15, 2015. ,
Deep Correlation for Matching Images and Text, 2015. ,
Multilinear Discriminant Analysis for Face Recognition, IEEE Transactions on Image Processing, 2007. ,
Deep Multi-task Representation Learning: A Tensor Factorisation Approach, Proceedings of the International Conference on Learning Representations (ICLR), 2017. ,
Unifying Multi-domain Multitask Learning: Tensor and Neural Network Perspectives, Domain Adaptation in Computer Vision Applications, 2017. ,
Stacked Attention Networks for Image Question Answering, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.21-29, 2016. ,
Learning Compact Recurrent Neural Networks With Block-Term Tensor Decomposition, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018. ,
Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding, Advances in Neural Information Processing Systems (NIPS), 2018. ,
Long-term forecasting using tensor-train RNNs, 2017. ,
Visual Relationship Detection With Internal and External Linguistic Knowledge Distillation, Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017. ,
Multi-modal Factorized Bilinear Pooling with Co-Attention Learning for Visual Question Answering, Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017. ,
Beyond Bilinear: Generalized Multi-modal Factorized High-order Pooling for Visual Question Answering, IEEE Transactions on Neural Networks and Learning Systems, 2018. ,
Visual Translation Embedding Network for Visual Relation Detection, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. ,
PPR-FCN: Weakly Supervised Visual Relation Detection via Parallel Pairwise R-FCN, Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017. ,
Learning to Count Objects in Natural Images for Visual Question Answering, Proceedings of the International Conference on Learning Representations (ICLR), 2018. ,
Simple baseline for visual question answering, 2015. ,