Conference papers

Span-based discontinuous constituency parsing: a family of exact chart-based algorithms with time complexities from O(n^6) down to O(n^3)

Abstract : We introduce a novel chart-based algorithm for span-based parsing of discontinuous constituency trees of block degree two, including ill-nested structures. In particular, we show that we can build variants of our parser with smaller search spaces and time complexities ranging from O(n^6) down to O(n^3). The cubic time variant covers 98% of constituents observed in linguistic treebanks while having the same complexity as continuous constituency parsers. We evaluate our approach on German and English treebanks (Negra, Tiger, and DPTB) and report state-of-the-art results in the fully supervised setting. We also experiment with pre-trained word embeddings and BERT-based neural networks.
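
For context on the O(n^3) bound mentioned in the abstract, the following is a minimal sketch of a standard span-based CKY search for continuous binary trees. It is not the discontinuous algorithm of the paper, only an illustration of the cubic baseline it is compared against. The names cky_best_tree and span_score are hypothetical, and the sketch assumes unlabeled spans with a score that decomposes as a sum over spans.

```python
def cky_best_tree(span_score, n):
    """Find the highest-scoring binary bracketing of n words (n >= 1).

    span_score(i, j) is a hypothetical scorer for the span covering
    words i..j-1, with 0 <= i < j <= n. Returns (score, spans).
    """
    best = {}  # best[i, j]: score of the best subtree over span (i, j)
    back = {}  # back[i, j]: chosen split point, or None for one word

    # O(n^2) spans times O(n) split points gives O(n^3) time overall.
    for length in range(1, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            if length == 1:
                best[i, j] = span_score(i, j)
                back[i, j] = None
                continue
            # Pick the split that maximizes the sum of the two children.
            k_best = max(range(i + 1, j),
                         key=lambda k: best[i, k] + best[k, j])
            best[i, j] = span_score(i, j) + best[i, k_best] + best[k_best, j]
            back[i, j] = k_best

    # Follow back-pointers from the root span to recover the tree.
    spans = []
    stack = [(0, n)]
    while stack:
        i, j = stack.pop()
        spans.append((i, j))
        k = back[i, j]
        if k is not None:
            stack.append((k, j))
            stack.append((i, k))
    return best[0, n], spans


# Toy usage: three words, with a reward only for the span (0, 2).
toy_scores = {(0, 2): 1.0}
score, spans = cky_best_tree(lambda i, j: toy_scores.get((i, j), 0.0), 3)
```

A discontinuous parser of block degree two would additionally need chart items with a gap (two sub-spans), which is where the higher complexities quoted in the abstract come from; that extension is not shown here.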

https://hal.archives-ouvertes.fr/hal-03029253
Contributor : Caio Corro
Submitted on : Saturday, November 28, 2020 - 2:32:49 AM
Last modification on : Monday, February 22, 2021 - 4:21:17 PM

File

2020.emnlp-main.219.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id : hal-03029253, version 1

Citation

Caio Corro. Span-based discontinuous constituency parsing: a family of exact chart-based algorithms with time complexities from O(n^6) down to O(n^3). Empirical Methods in Natural Language Processing, Nov 2020, Punta Cana (virtual), Dominican Republic. ⟨hal-03029253⟩
