Hybrid-DBT: Hardware Accelerated Dynamic Binary Translation
Simon Rokicki, Erven Rohou, Steven Derrien

To cite this version:
Simon Rokicki, Erven Rohou, Steven Derrien. Hybrid-DBT: Hardware Accelerated Dynamic Binary Translation. RISC-V 2019 - Workshop Zurich, Jun 2019, Zurich, Switzerland. pp.1. hal-02155019

HAL Id: hal-02155019
https://hal.archives-ouvertes.fr/hal-02155019
Submitted on 13 Jun 2019

Univ Rennes, Inria, CNRS, IRISA

Context - Heterogeneous Multi-cores

Heterogeneous multi-core systems have key advantages over their homogeneous counterparts: they allow performance and energy efficiency to be balanced dynamically. Current implementations are limited to a single ISA so that tasks can migrate easily from one core to another.

To further improve current systems, specialized cores such as VLIWs can be added. To handle the ISA differences between cores, a Dynamic Binary Translation (DBT) layer is associated with the specialized cores. DBT translates instructions from one ISA to another as they are being executed on the target core. To reduce the overhead that DBT introduces, we present Hybrid-DBT: a HW/SW co-designed DBT system that uses several hardware accelerators to reduce the cost of DBT [3][4].
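The translate-as-you-execute idea described above can be sketched in a few lines. This is only an illustration under strong assumptions: the guest ISA is a toy one with four opcodes, the "host code" is a list of Python closures, and the names `translate_block`, `run` and `PROGRAM` are hypothetical. It is not the Hybrid-DBT implementation; it only shows why a translation cache pays the translation cost once per block rather than once per execution.

```python
# Toy guest ISA (hypothetical): each instruction is (opcode, x, y, z).
#   addi: regs[x] = regs[y] + z        add: regs[x] = regs[y] + regs[z]
#   bne : if regs[x] != regs[y] jump to instruction index z
#   halt: stop execution


def translate_block(guest_code, pc):
    """Translate the guest instructions from pc up to the next branch or halt."""
    ops = []
    while True:
        opcode, x, y, z = guest_code[pc]
        if opcode == "addi":
            ops.append(lambda r, x=x, y=y, z=z: r.__setitem__(x, r[y] + z))
        elif opcode == "add":
            ops.append(lambda r, x=x, y=y, z=z: r.__setitem__(x, r[y] + r[z]))
        elif opcode == "bne":
            # A branch ends the block; the block returns the next guest pc.
            next_pc = lambda r, x=x, y=y, z=z, ft=pc + 1: z if r[x] != r[y] else ft
            break
        elif opcode == "halt":
            next_pc = lambda r: None
            break
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
        pc += 1

    def host_block(regs):
        for op in ops:
            op(regs)
        return next_pc(regs)

    return host_block


def run(guest_code, regs):
    """Execute guest code, translating each block only on its first use."""
    translation_cache = {}  # guest pc -> translated block
    translations = 0
    pc = 0
    while pc is not None:
        block = translation_cache.get(pc)
        if block is None:  # cache miss: translate once, then reuse
            block = translate_block(guest_code, pc)
            translation_cache[pc] = block
            translations += 1
        pc = block(regs)
    return regs, translations


# Sums 5 + 4 + 3 + 2 + 1 into r1; r0 is kept at zero by convention.
PROGRAM = [
    ("addi", 1, 0, 0),
    ("addi", 2, 0, 5),
    ("add", 1, 1, 2),
    ("addi", 2, 2, -1),
    ("bne", 2, 0, 2),
    ("halt", 0, 0, 0),
]
```

Calling `run(PROGRAM, [0, 0, 0])` executes the loop body five times but translates only three blocks (the entry block, the loop body and the final halt), since every later iteration hits the cache.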

Translation flow

Opt. level 0
- Native Binaries
- Instruction Translation
- No ILP

Opt. level 1
- IR
- Solving branches
- IR Builder
- Insertion of profiling instructions
- Data flow graphs
- Building CFG
- Inter-block optimization

Opt. level 2
- IR Scheduler

Legend: software optimization, hardware optimization, in-memory data.

Details on the Intermediate Representation (IR): data and control flow graphs.
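To make the contrast between "No ILP" at level 0 and the IR Scheduler at higher levels more concrete, the sketch below packs the instructions of one block into VLIW bundles using a greedy list scheduler over the block's data-flow graph. It is a simplified software illustration, not the scheduling algorithm of the Hybrid-DBT hardware accelerator. It assumes unit latency for every instruction, an IR in which each destination is written only once (so only read-after-write dependencies exist), and a hypothetical `(dst, srcs)` instruction encoding.

```python
def schedule_block(instrs, issue_width):
    """Greedy list scheduling of one basic block into VLIW bundles.

    instrs: list of (dst, srcs) in program order, each dst written once.
    Returns a list of bundles; each bundle is a list of instruction indices.
    """
    # Build the data-flow graph: instruction i depends on the producers
    # of its source operands (read-after-write dependencies only).
    last_writer = {}
    deps = []
    for i, (dst, srcs) in enumerate(instrs):
        deps.append({last_writer[s] for s in srcs if s in last_writer})
        last_writer[dst] = i

    cycle_of = {}  # instruction index -> cycle (bundle) it was placed in
    bundles = []
    for i in range(len(instrs)):
        # Earliest legal cycle: one cycle after the latest producer.
        cycle = max((cycle_of[d] + 1 for d in deps[i]), default=0)
        # Skip bundles that already use every issue slot.
        while cycle < len(bundles) and len(bundles[cycle]) >= issue_width:
            cycle += 1
        while len(bundles) <= cycle:
            bundles.append([])
        bundles[cycle].append(i)
        cycle_of[i] = cycle
    return bundles


# Hypothetical five-instruction block; a, b, c, d are live-in values.
BLOCK = [
    ("t0", ["a", "b"]),
    ("t1", ["c", "d"]),
    ("t2", ["t0", "t1"]),
    ("t3", ["a", "c"]),
    ("t4", ["t2", "t3"]),
]
```

With `issue_width=1` the block takes five bundles, one instruction each, which mirrors a translation that exposes no ILP. With `issue_width=4` the same block fits in three bundles, because instructions 0, 1 and 3 are mutually independent and can issue together.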

Introduction of Hybrid-DBT

Overview of Hybrid-DBT

- The VLIW core executes applications.
- The L1 instruction cache stores RISC-V instructions.
- The HW Decoder decodes RISC-V into VLIW ISA.
- The DBT Processor analyses and optimizes RISC-V binaries, using the HW Accelerators.
- The optimized binaries are stored in the Translation Cache.
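One plausible way to connect the components above with the optimization levels of the translation flow is to let profiling counters decide when a block is worth re-optimizing. The sketch below is an assumption-laden illustration: the thresholds in `HOT_THRESHOLDS`, the class name `TranslationCache` and its method are hypothetical, and the poster does not state how Hybrid-DBT actually triggers each level.

```python
# Hypothetical thresholds: executions needed to leave level 0 and level 1.
HOT_THRESHOLDS = {0: 4, 1: 16}
MAX_LEVEL = 2


class TranslationCache:
    """Tracks, per block address, the optimization level of its translation."""

    def __init__(self):
        self.level = {}  # block address -> current optimization level
        self.count = {}  # block address -> executions since last promotion

    def record_execution(self, addr):
        """Count one execution; return the level the block is at afterwards."""
        if addr not in self.level:
            # First encounter: cheap first-pass translation (level 0).
            self.level[addr] = 0
            self.count[addr] = 0
        self.count[addr] += 1
        lvl = self.level[addr]
        if lvl < MAX_LEVEL and self.count[addr] >= HOT_THRESHOLDS[lvl]:
            # The block is hot enough: re-translate at the next level.
            self.level[addr] = lvl + 1
            self.count[addr] = 0
        return self.level[addr]
```

In this sketch a block that runs only a few times stays at level 0 and never pays for the more expensive passes, while a frequently executed block is promoted step by step to level 2.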

Experimental Results

- Bar charts showing improvement vs. software for First-Pass Translation, IR Generation, IR Scheduling.
- Line charts comparing Performance and Energy efficiency for In-Order, Hybrid-DBT, OoO.

References

[4] Rokicki et al. Supporting Runtime Reconfigurable VLIWs Cores through Dynamic Binary Translation

Contacts

simon.rokicki@irisa.fr
https://github.com/srokicki/HybridDBT

Experimental Machines

Previous work on HW/SW Co-Designed Machines:

- Transmeta Crusoe (x86 on VLIW) [1]
- NVIDIA Denver (Armv8 on VLIW) [2]
