Efficient SIMD technique with parallel Max-Log-MAP Algorithm for Turbo Decoders

David Gnaedig; Mathias Lapeyre; Florent Mouchou; Emmanuel Boutillon

Communication Dans Un Congrès Année : 2004

Efficient SIMD technique with parallel Max-Log-MAP Algorithm for Turbo Decoders

(1) , (1) , (1) , (1)

David Gnaedig

Fonction : Auteur

Laboratoire d'Electronique des Systèmes TEmps Réel

Mathias Lapeyre

Fonction : Auteur

Laboratoire d'Electronique des Systèmes TEmps Réel

Florent Mouchou

Fonction : Auteur

Laboratoire d'Electronique des Systèmes TEmps Réel

Emmanuel Boutillon

Fonction : Auteur
PersonId : 745355
IdHAL : emmanuel-boutillon
ORCID : 0000-0003-2124-0786
IdRef : 031719716

Laboratoire d'Electronique des Systèmes TEmps Réel

Résumé

This paper proposes a SIMD technique to implement efficiently the Max-Log-MAP algorithm on a DSP [2]. The Max-Log-MAP algorithm is the key component of a turbo-decoder.It performs the soft decoding of a convolutional code using two finite Viterbi recursions on a trellis: the forward and the backward recursions. The basic computations involved in a trellis stage computation are Addition-Comparison-Selection (ACS) operations. Loo et al. proposed to speed up the decoding throughput of the Max-Log-MAP algorithm on a DPS by using its SIMD functionality [3]. For DSP offering SIMD operations, parallel (vectorized) computations can be achieved simultaneously, on a set of data called a vector. Their SIMD technique performs for a given trellis stage several ACS operations. This method improves the decoding throughput but is not optimal. In fact, because of the structure of the trellis, the state metrics needs to be reordered into a single vector at the end of each trellis stage. This operation requires 2 instructions and reduces the efficiency of the SIMD technique. The use of this technique is efficient when all operations can be vectorized, i.e. when there is no need to pack/unpack data into a single vector to execute a SIMD operation. The new and original method proposed in the present paper aims to overcome this problem by vectorizing all operations involved in the Max-Log-MAP decoding for turbo decoders. It consists in using SIMD instructions to perform several independent trellises in parallel. Hence, each variable in the Max-Log-MAP algorithm is vectorized as a vector of m components: each component of the vector is associated to one of the m trellises decoded. This trellis parallelization is made possible by the use of an adapted two-dimensional turbo code proposed in [4]. This adapted turbo-code is constructed, in each dimension, as the parallel concatenation of m independent Circular Recursive Systematic Convolutional (CRSC) codes, called slices. The m slices are then decoded in parallel with a Max-Log-MAP using vectorized operations. Moreover, an appropriate parallel interleaver is also proposed in [4]. Thanks to its structure, the interleaver maintains the vector organization between the natural and the interleaved order. Thus, there is no need to unpack and repack the vectors between half-iterations. It has also been shown that this adapted turbo code introduces no performance degradation compared to a conventional turbo code. In the present paper, the adapted turbo-code is described with its interleaver properties. After a brief description of the Max-Log-MAP algorithm, the key architectural features for implementing the algorithm with SIMD functionality are given (memory organization and SIMD operations). The proposed trellis parallelization technique has been implemented on a DSP TMS320C6416: four parallel Max-Log-MAP algorithms are performed on 4 independent trellises. Implementation results shows that for a 600 MHz clock frequency, the parallel Max-Log-MAP algorithm achieves a throughput of 2560 Mbits/s. An accurate estimation for the complete turbo decoder with 8 decoding iterations gives a decoding throughput of 160 Mbits/s.

Mots clés

Slice Turbo code DSP SIMD decoder

Domaines

Fichier principal

GSPx_04_TCR.pdf (239.76 Ko)

Emmanuel Boutillon : Connectez-vous pour contacter le contributeur

https://hal.science/hal-00068925

Soumis le : lundi 15 mai 2006-16:47:08

Dernière modification le : vendredi 24 mars 2023-14:52:47

Archivage à long terme le : samedi 3 avril 2010-23:23:00

Dates et versions

hal-00068925 , version 1 (15-05-2006)

Identifiants

HAL Id : hal-00068925 , version 1

Citer

David Gnaedig, Mathias Lapeyre, Florent Mouchou, Emmanuel Boutillon. Efficient SIMD technique with parallel Max-Log-MAP Algorithm for Turbo Decoders. GSPx Embedded Applications Software & Hardware, 2004, Santa Clara, United States. pp.27-30. ⟨hal-00068925⟩

Exporter

BibTeX XML-TEI Dublin Core DC Terms EndNote DataCite

Collections

CNRS LESTER

148 Consultations

162 Téléchargements

Efficient SIMD technique with parallel Max-Log-MAP Algorithm for Turbo Decoders

Résumé

Mots clés

Domaines

Dates et versions

Identifiants

Citer

Exporter

Collections

Partager