Y. Bengio, R. Ducharme, P. Vincent, and C. Janvin, A neural probabilistic language model, J. Mach. Learn. Res, vol.3, pp.1137-1155, 2003.

J. Bollen, H. Mao, and X. Zeng, Twitter mood predicts the stock market, J. Comput. Science, vol.2, issue.1, pp.1-8, 2011.

R. Collobert and J. Weston, A unified architecture for natural language processing: Deep neural networks with multitask learning, Proceedings of the 25th International Conference on Machine Learning, pp.160-167, 2008.

R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu et al., Natural language processing (almost) from scratch, J. Mach. Learn. Res, vol.12, pp.2493-2537, 2011.

A. Conneau, H. Schwenk, L. Barrault, and Y. Lecun, Very deep convolutional networks for natural language processing, 2016.

M. Costa, Probabilistic interpretation of feedforward network outputs, with relationships to statistical prediction of ordinal quantities, International Journal of Neural Systems, vol.7, pp.627-638, 1996.

G. M. Del Corso, A. Gulli, and F. Romani, Ranking a stream of news, Proceedings of the 14th International Conference on World Wide Web, pp.97-106, 2005.

B. Dhingra, H. Liu, W. W. Cohen, and R. Salakhutdinov, Gated-attention readers for text comprehension, 2016.

P. S. Dodds, K. D. Harris, I. M. Kloumann, C. A. Bliss, and C. M. Danforth, Temporal patterns of happiness and information in a global social network: Hedonometrics and Twitter, 2011.

J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang et al., DeCAF: A deep convolutional activation feature for generic visual recognition, ICML, 2014.

R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin, LIBLINEAR: A library for large linear classification, J. Mach. Learn. Res, vol.9, pp.1871-1874, 2008.

X. Glorot and Y. Bengio, Understanding the difficulty of training deep feedforward neural networks, Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS10). Society for Artificial Intelligence and Statistics, 2010.

K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, pp.770-778, 2016.

G. Hinton, L. Deng, D. Yu, G. Dahl, A. Mohamed et al., Deep neural networks for acoustic modeling in speech recognition, IEEE Signal Processing Magazine, 2012.

G. Huang, Z. Liu, and K. Q. Weinberger, Densely connected convolutional networks, 2016.

T. Joachims, Text categorization with support vector machines: Learning with many relevant features, Proceedings of the 10th European Conference on Machine Learning, ECML '98, pp.137-142, 1998.

A. Joulin, E. Grave, P. Bojanowski, and T. Mikolov, Bag of tricks for efficient text classification, 2016.

N. Kalchbrenner, E. Grefenstette, and P. Blunsom, A convolutional neural network for modelling sentences, Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, pp.655-665, 2014.

M. Kaya, G. Fidan, and I. H. Toroslu, Transfer learning using Twitter data for improving sentiment classification of Turkish political news, pp.139-148, 2013.

Y. Kim, Convolutional neural networks for sentence classification, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, pp.1746-1751, 2014.

Y. Kim, Y. Jernite, D. Sontag, and A. M. Rush, Character-aware neural language models, AAAI, pp.2741-2749, 2016.

D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, 2014.

A. Krizhevsky, I. Sutskever, and G. E. Hinton, ImageNet classification with deep convolutional neural networks, Advances in Neural Information Processing Systems 25, pp.1097-1105, 2012.

Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE, pp.2278-2324, 1998.

A. McCallum and K. Nigam, A comparison of event models for naive Bayes text classification, AAAI-98 Workshop on Learning for Text Categorization, pp.41-48, 1998.

T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, Distributed representations of words and phrases and their compositionality, Advances in Neural Information Processing Systems 26, pp.3111-3119, 2013.

Y. Miyamoto and K. Cho, Gated word-character recurrent language model, EMNLP, 2016.

P. Nakov, S. Rosenthal, S. Kiritchenko, S. M. Mohammad et al., Developing a successful SemEval task in sentiment analysis of Twitter and other social media texts, Lang. Resour. Eval, vol.50, issue.1, pp.35-65, 2016.

P. Nakov, S. Rosenthal, Z. Kozareva, V. Stoyanov, A. Ritter et al., SemEval-2013 task 2: Sentiment analysis in Twitter, Proceedings of the 7th International Workshop on Semantic Evaluation, pp.312-320, 2013.

M. A. Qureshi, C. O'Riordan, and G. Pasi, Clustering with error-estimation for monitoring reputation of companies on Twitter, Information Retrieval Technology - 9th Asia Information Retrieval Societies Conference, AIRS 2013, vol.8281, pp.170-180, 2013.

A. Radford, R. Józefowicz, and I. Sutskever, Learning to generate reviews and discovering sentiment, 2017.

S. Rosenthal, N. Farra, and P. Nakov, SemEval-2017 task 4: Sentiment analysis in Twitter, Proceedings of the 11th International Workshop on Semantic Evaluation, 2017.

S. Rosenthal, A. Ritter, P. Nakov, and V. Stoyanov, SemEval-2014 task 9: Sentiment analysis in Twitter, Proceedings of the 8th International Workshop on Semantic Evaluation, pp.73-80, 2014.

A. Severyn and A. Moschitti, Twitter sentiment analysis with deep convolutional neural networks, Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, vol.15, pp.959-962, 2015.

K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, 2014.

R. Socher, A. Perelygin, J. Wu, J. Chuang, C. D. Manning et al., Recursive deep models for semantic compositionality over a sentiment treebank, Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp.1631-1642, 2013.

C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed et al., Going deeper with convolutions, Computer Vision and Pattern Recognition (CVPR), 2015.

Y. Xiao and K. Cho, Efficient character-level document classification by combining convolution and recurrent layers, 2016.

Z. Yang, B. Dhingra, Y. Yuan, J. Hu, W. W. Cohen et al., Words or characters? Fine-grained gating for reading comprehension, ICLR, 2017.

D. Yogatama, C. Dyer, W. Ling, and P. Blunsom, Generative and discriminative text classification with recurrent neural networks, 2017.

J. Yosinski, J. Clune, Y. Bengio, and H. Lipson, How transferable are features in deep neural networks?, Proceedings of the 27th International Conference on Neural Information Processing Systems, pp.3320-3328, 2014.

M. D. Zeiler and R. Fergus, Visualizing and understanding convolutional networks, Computer Vision - ECCV 2014, 13th European Conference, pp.818-833, 2014.

X. Zhang, J. Zhao, and Y. Lecun, Character-level convolutional networks for text classification, Proceedings of the 28th International Conference on Neural Information Processing Systems, pp.649-657, 2015.

Y. Zhang and B. C. Wallace, A sensitivity analysis of (and practitioners' guide to) convolutional neural networks for sentence classification, 2015.