Professional Associations: ACM | Information Processing Society of Japan (IPSJ)
- Code Patch to the STAMP Benchmarks for Hardware Transactional Memory (HTM)
- IISWC 2014 Paper
- PPoPP 2014 Paper
- IISWC 2013 Paper
- Research Report: Eliminating GIL in Ruby through HTM
- ASPLOS 2012 Paper
- CGO 2010 Paper
- VEE 2010 Paper
- ISCA 2015 Paper
- Code Patch to Eliminate Global Interpreter Lock (GIL) in Ruby through Hardware Transactional Memory
IISWC 2014 Paper
Thread-Level Speculation on Off-the-Shelf Hardware Transactional Memory
Rei Odaira and Takuya Nakaike.
In Proceedings of the 2014 IEEE International Symposium on Workload Characterization (IISWC), pp. 212-221, 2014.
Full text [PDF]: IISWC2014_TLSonHTM.pdf
Slides [PDF]: IISWC2014_TLSonHTM_Slides.pdf
Thread-level speculation can speed up a single-thread application by splitting its execution into multiple tasks and speculatively executing those tasks in multiple threads. Efficient thread-level speculation requires hardware support for memory conflict detection, store buffering, and execution rollback, and in addition, previous research has also proposed advanced optimization facilities, such as ordered transactions and data forwarding. Recently, implementations of hardware transactional memory (HTM) are coming into the market with minimal hardware support for thread-level speculation. However, few implementations offer advanced optimization facilities. Thus, it is important to determine how well thread-level speculation can be realized on the current HTM implementations, and what optimization facilities should be implemented in the future. In our research, we studied thread-level speculation on the off-the-shelf HTM implementation in Intel TSX. We manually modified potentially parallel benchmarks in SPEC CPU2006 for thread-level speculation. Our experimental results showed that thread-level speculation resulted in up to an 11% speed-up even without the advanced optimization facilities, but actually degraded the performance in most cases. In contrast to our expectations, the main reason for the performance loss was not the lack of hardware support for ordered transactions but the transaction aborts due to memory conflicts. Our investigation suggests that future hardware should support not only ordered transactions but also memory data forwarding, data synchronization, multi-version cache, and word-level conflict detection for thread-level speculation.
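The three mechanisms named in the abstract (memory conflict detection, store buffering, and execution rollback) together with ordered commit can be illustrated with a small software emulation. This is only a sketch under stated assumptions: it is not the paper's implementation, it does not use Intel TSX, and it runs tasks one after another rather than in real threads. The names `run_tls`, `_execute`, `batch`, and `make_increment` are hypothetical and chosen for illustration.

```python
def _execute(task, view):
    """Run one task speculatively against `view`, tracking reads and buffering stores."""
    reads, buffer = set(), {}

    def read(addr):
        if addr in buffer:            # a task sees its own buffered stores
            return buffer[addr]
        reads.add(addr)               # remember the address for conflict detection
        return view[addr]

    def write(addr, value):
        buffer[addr] = value          # store buffering: memory is not touched yet

    task(read, write)
    return reads, buffer


def run_tls(tasks, memory, batch=4):
    """Emulate ordered thread-level speculation (hypothetical helper).

    tasks  : callables task(read, write), listed in original program order
    memory : dict of address -> value, updated in place
    batch  : number of tasks assumed to speculate on the same memory snapshot
    Returns the number of aborts (rollbacks).
    """
    aborts = 0
    for start in range(0, len(tasks), batch):
        snapshot = dict(memory)       # state that every task in this batch speculates on
        dirty = set()                 # addresses committed since the snapshot was taken
        for task in tasks[start:start + batch]:
            reads, buffer = _execute(task, snapshot)
            if reads & dirty:         # memory conflict: the task read a stale value
                aborts += 1
                # Rollback: discard the buffer and re-run on up-to-date memory.
                reads, buffer = _execute(task, memory)
            memory.update(buffer)     # commit in program order
            dirty.update(buffer)
    return aborts


def make_increment(addr):
    """Build a task that increments the value stored at `addr`."""
    def task(read, write):
        write(addr, read(addr) + 1)
    return task


# Independent tasks touch disjoint addresses, so none of them abort.
independent = {i: 0 for i in range(8)}
independent_aborts = run_tls([make_increment(i) for i in range(8)], independent)

# Dependent tasks all update one shared counter, so most of them abort.
shared = {"counter": 0}
shared_aborts = run_tls([make_increment("counter") for _ in range(8)], shared)
```

In this emulation, tasks that share an address abort and re-execute serially, while independent tasks commit without rollback. That mirrors, in a very simplified form, the abstract's observation that aborts due to memory conflicts dominate the performance loss.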
IEEE - Copyright (C) 2014 by IEEE. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee.