G. Adda, S. Stüker, M. Adda-decker, O. Ambouroue, L. Besacier et al., Breaking the unwritten language barrier: The Bulb project A case study on using speech-to-translation alignments for language documentation, Proceedings of SLTU (Spoken Language Technologies for Under-Resourced Languages), pp.170-178, 2016.

P. K. Austin and J. Sallabank, The Cambridge Handbook of Endangered Languages, 2011.

R. P. Beapami, R. Chatfield, G. Kouarata, and A. Embengue-waldschmidt, Dictionnaire Mbochi-Français, 2000.

P. Bedrosian, The Mboshi noun class system, Journal of West African Languages, vol.26, issue.1, pp.27-47, 1996.

D. Blachon, E. Gauthier, L. Besacier, G. Kouarata, M. Adda-decker et al., Parallel Speech Collection for Under-resourced Language Studies Using the Lig-Aikuma Mobile Device App, Proceedings of SLTU (Spoken Language Technologies for Under-Resourced Languages), 2016.
DOI : 10.1016/j.procs.2016.04.030

URL : https://hal.archives-ouvertes.fr/hal-01350065

L. Bouquiaux, Enquête et description des langues à tradition orale, 1976.

J. Cooper-leavitt, L. Lamel, G. Adda, M. Adda-decker, and A. Rialland, Corpus based linguistic exploration via forced alignments with a 'light-weight' ASR tool, Workshop on Language Technology for Less Resourced Languages (LT-LRL) at the 8th Language & Technology Conference, 2017.

J. Cooper-leavitt, L. Lamel, A. Rialland, M. Adda-decker, and G. Adda, Developing an Embosi (Bantu C25) Speech Variant Dictionary to Model Vowel Elision and Morpheme Deletion, Interspeech 2017, 2017.
DOI : 10.21437/Interspeech.2017-1280

L. Duong, A. Anastasopoulos, D. Chiang, S. Bird, and T. Cohn, An Attentional Model for Speech Translation Without Transcription, Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp.949-959, 2016.
DOI : 10.18653/v1/N16-1109

G. M. Embanga-aborobongui, Processus segmentaux et tonals en Mbondzi (variété de la langue embosi C25), 2013.

J. Franke, M. Müller, F. Hamlaoui, S. Stüker, and A. Waibel, Phoneme boundary detection using deep bidirectional LSTMs, Proceedings of the Speech Communication Symposium, pp.1-5, 2016.

J. Franke, M. Müller, S. Stüker, and A. Waibel, Phoneme boundary detection using deep bidirectional LSTMs, Proceedings of the Speech Communication ITG Symposium, VDE, 2016.

P. Godard, G. Adda, M. Adda-decker, A. Allauzen, L. Besacier et al., Preliminary Experiments on Unsupervised Word Discovery in Mboshi, Interspeech 2016, 2016.
DOI : 10.21437/Interspeech.2016-886

URL : https://hal.archives-ouvertes.fr/hal-01350119

S. Goldwater, T. L. Griffiths, and M. Johnson, Contextual dependencies in unsupervised word segmentation, Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the ACL, ACL '06, pp.673-680, 2006.
DOI : 10.3115/1220175.1220260

URL : http://dl.acm.org/ft_gateway.cfm?id=1220260&type=pdf

S. Goldwater, T. L. Griffiths, and M. Johnson, A Bayesian framework for word segmentation: Exploring the effects of context, Cognition, vol.112, issue.1, pp.21-54, 2009.
DOI : 10.1016/j.cognition.2009.03.008

URL : http://homepages.inf.ed.ac.uk/sgwater/papers/cognition-hdp.pdf

F. Hamlaoui, E. Makasso, M. Müller, J. Engelmann, G. Adda et al., BULBasaa: A bilingual Bàsàá-French speech corpus for the evaluation of language documentation tools, LREC 2018, 2018.

A. Jansen and B. Van-durme, Efficient spoken term discovery using randomized algorithms, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, pp.401-406, 2011.
DOI : 10.1109/ASRU.2011.6163965

URL : http://www.cs.jhu.edu/%7Evandurme/papers/JansenVanDurmeASRU11.pdf

G. Kouarata, Variations de formes dans la langue mbochi (Bantu C25), 2014.

L. Lamel and J. Gauvain, Speech Recognition, The Oxford Handbook of Computational Linguistics, 2015.

M. Lekakou, V. Baldissera, and A. Anastasopoulos, Documentation and analysis of an endangered language: aspects of the grammar of Griko, 2013.

B. Ludusan, M. Versteegh, A. Jansen, G. Gravier, X. Cao et al., Bridging the gap between speech technology and natural language processing: an evaluation toolbox for term discovery systems, Proceedings of LREC, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01026368

M. Müller, J. Franke, S. Stüker, and A. Waibel, Towards phoneme inventory discovery for documentation of unwritten languages, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017.
DOI : 10.1109/ICASSP.2017.7953148

A. Rialland and M. E. Aborobongui, How intonations interact with tones in Embosi (Bantu C25), a two-tone language without downdrift, Intonation in African Tone Languages, 2016.
DOI : 10.1515/9783110503524-007

A. Rialland, G. M. Embanga-aborobongui, M. Adda-decker, and L. Lamel, Dropping of the class-prefix consonant, vowel elision and automatic phonological mining in Embosi, Proceedings of the 44th ACAL meeting, pp.221-230, 2015.
URL : https://hal.archives-ouvertes.fr/halshs-01251202

S. Stüker, G. Adda, M. Adda-decker, O. Ambouroue, L. Besacier et al., Innovative technologies for under-resourced language documentation: The Bulb project, Proceedings of CCURL (Collaboration and Computing for Under-Resourced Languages: toward an Alliance for Digital Language Diversity), Portorož, Slovenia, 2016.

A. C. Woodbury, Defining documentary linguistics, Language Documentation and Description, pp.35-51, 2003.

M. Z. Boito, A. Berard, A. Villavicencio, and L. Besacier, Unwritten languages demand attention too! Word discovery with encoder-decoder models, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2017.
DOI : 10.1109/ASRU.2017.8268972

URL : https://hal.archives-ouvertes.fr/hal-01592091