Hardware performance figures
This page summarizes the performance figures of Keccak[r=1024,c=576] as given in various implementation reports produced in the scope of the SHA-3 contest.
All displayed throughput figures are for long messages. In this case, the speed is directly proportional to the bitrate r. The figures below have been normalized to the nominal value r=1024, even though some of the implementations referred to below used a different rate in their reports. To estimate the performance for another bitrate value r, one can simply multiply the throughput by r/1024.
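The rescaling rule can be sketched in a few lines of Python. The function names here are illustrative and do not come from any of the cited reports. The second helper rests on an assumption that the text does not state explicitly, namely that an implementation absorbs r bits every fixed number of clock cycles; that model appears consistent with the Frequency, Throughput and Clock cycles columns of the ASIC table.

```python
NOMINAL_RATE = 1024  # bits; the rate the tables on this page are normalized to


def scale_throughput(throughput_at_nominal: float, r: int) -> float:
    """Rescale a long-message throughput given for r = 1024 to another rate r.

    The unit of the result is the unit of the input (Mbps, Gbps, ...).
    """
    return throughput_at_nominal * r / NOMINAL_RATE


def throughput_from_clock(frequency_hz: float, cycles_per_block: int,
                          r: int = NOMINAL_RATE) -> float:
    """Assumed model: r bits are absorbed every `cycles_per_block` cycles.

    Returns bits per second.
    """
    return r * frequency_hz / cycles_per_block


if __name__ == "__main__":
    # A 44 Gbps figure at r = 1024, re-expressed for r = 1088.
    print(scale_throughput(44.0, 1088))  # 46.75

    # 1030 MHz with 24 cycles per block, expressed in Gbps.
    print(throughput_from_clock(1030e6, 24) / 1e9)
```

Under the assumed model, 1030 MHz at 24 cycles per block gives roughly 44 Gbps, which matches the first ASIC row after rounding.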
ASIC
| Reference | Technology | Synthesis | Area (kGE) | Frequency | Throughput | Clock cycles | Energy (mJ/Gbit) |
|---|---|---|---|---|---|---|---|
| [9] Sugawara | STM 90nm | Gate level | 55.9 | 1030 MHz | 44 Gbps | 24 | |
| [4] Henzen et al. | UMC 90nm | Place and route | 50.0 | 949 MHz | (*)40 Gbps | 24 | 2.4 |
| [9] AIST | STM 90nm | Gate level | 50.6 | 781 MHz | 33 Gbps | 24 | |
| [9] Sugawara | STM 90nm | Gate level | 26.5 | 553 MHz | 24 Gbps | 24 | |
| [9] AIST | STM 90nm | Gate level | 33.6 | 541 MHz | 23 Gbps | 24 | |
| Keccak team | STM 130nm | Gate level | 48.0 | 526 MHz | 22 Gbps | 24 | |
| [7] Tillich et al. | UMC 0.18μm | Gate level | 56.3 | 488 MHz | (*)20 Gbps | 25 | |
| [3] Guo et al. | UMC 130nm | Place and route | 47.4 | 377 MHz | 15 Gbps | 25 | |
| [9] Sugawara | STM 90nm | Gate level | 25.1 | 356 MHz | 15 Gbps | 24 | |
| [9] AIST | STM 90nm | Gate level | 29.5 | 355 MHz | 15 Gbps | 24 | |
| [8] Tillich et al. | UMC 0.18μm | Place and route | 56.7 | 267 MHz | (*)11 Gbps | 25 | |
| [3] Guo et al. | UMC 130nm | Place and route | 34.9 | 161 MHz | 6600 Mbps | 25 | |
| [4] Henzen et al. | UMC 90nm | Place and route | 27.5 | 149 MHz | (*)6362 Mbps | 24 | 5.5 |
| Keccak team | STM 130nm | Gate level | (a)9.3 | 200 MHz | 39 Mbps | 5160 | |
| [5] Kavun et al. | 130nm | Gate level | 20.0 | 100 kHz | (*)85 kbps | 1200 | |
FPGA
| Reference | Type | Area | Frequency | Throughput |
|---|---|---|---|---|
| [6] Strömbergson | Cyclone III | 2670 registers, 5842 LE | 123 MHz | 7000 Mbps |
| Keccak team | Cyclone III | 2670 registers, 5770 LE | 145 MHz | 6100 Mbps |
| Keccak team | Cyclone III | 242 registers, 1570 LE | 183 MHz | 39 Mbps |
| [6] Strömbergson | Cyclone III | 242 registers, 1769 LE | 85 MHz | 22 Mbps |
| Reference | Type | Area | Frequency | Throughput |
|---|---|---|---|---|
| [6] Strömbergson | Spartan 3A | 2780 registers, 3393 slices | 85 MHz | 4800 Mbps |
| [2] Gaj et al. | Spartan III | 3339 CLB | 83 MHz | (*)3161 Mbps |
| Reference | Type | Area | Frequency | Throughput |
|---|---|---|---|---|
| [2] Gaj et al. | Stratix III | 4458 ALUT | 296 MHz | (*)13 Gbps |
| [6] Strömbergson | Stratix III | 2670 registers, 4550 ALUT | 176 MHz | 10 Gbps |
| Keccak team | Stratix III | 2641 registers, 4684 ALUT | 206 MHz | 8700 Mbps |
| Keccak team | Stratix III | 242 registers, 855 ALUT | 359 MHz | 70 Mbps |
| [6] Strömbergson | Stratix III | 242 registers, 1026 ALUT | 133 MHz | 35 Mbps |
| Reference | Type | Area | Frequency | Throughput |
|---|---|---|---|---|
| [2] Gaj et al. | Virtex V | 1229 CLB | 238 MHz | (*)10 Gbps |
| [9] AIST | Virtex V | 2666 registers, 1433 slices | 205 MHz | 8397 Mbps |
| [2] Gaj et al. | Virtex V | 1412 CLB | 195 MHz | (*)7840 Mbps |
| [6] Strömbergson | Virtex V | 2669 registers, 1483 slices | 118 MHz | 6700 Mbps |
| [10] Guo et al. | Virtex V | 1556 slices | 154 MHz | 6570 Mbps |
| [1] Baldwin | Virtex V | 1117 slices | 189 MHz | (*)5895 Mbps |
| [1] Baldwin | Virtex V | 1971 slices | 195 MHz | (*)5895 Mbps |
| Keccak team | Virtex V | 2640 registers, 1330 slices | 122 MHz | 5200 Mbps |
| Keccak team | Virtex V | 244 registers, 448 slices | 265 MHz | 5 Mbps |
Notes
- Frequency: the frequency shown in the tables is the one given in the original report. In some cases, it is not the maximum frequency that the circuit can reach.
- (*): the number is extrapolated from a rate different from the default value 1024 (often from r=1088).
- (a): this value includes the area of the RAM. With external RAM, the coprocessor itself uses 5 kGE (as reported in the Keccak main document); including the RAM brings the total to 9.3 kGE.
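The extrapolation behind the (*) entries can be sketched as follows, assuming it is the same proportionality to the rate applied in reverse. The function name and the sample input are hypothetical and are not taken from any of the cited reports.

```python
def normalize_to_nominal(reported_throughput: float, reported_rate: int,
                         nominal_rate: int = 1024) -> float:
    """Convert a throughput reported at `reported_rate` to the nominal rate.

    Assumes long-message throughput is directly proportional to the rate,
    so the figure is scaled by nominal_rate / reported_rate.
    """
    return reported_throughput * nominal_rate / reported_rate


if __name__ == "__main__":
    # Hypothetical input: 42.5 Gbps measured with r = 1088.
    print(normalize_to_nominal(42.5, 1088))  # 40.0
```

A figure already reported at r=1024 is left unchanged by this conversion.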
References
[1] B. Baldwin, N. Hanley, M. Hamilton, L. Lu, A. Byrne, M. O’Neill and W. P. Marnane, FPGA Implementations of the Round Two SHA-3 Candidates, IEEE FPL 2010
[2] K. Gaj, E. Homsirikamol and M. Rogawski, Fair and Comprehensive Methodology for Comparing Hardware Performance of Fourteen Round Two SHA-3 Candidates using FPGAs, CHES 2010
[3] X. Guo, S. Huang, L. Nazhandali and P. Schaumont, Fair and Comprehensive Performance Evaluation of 14 Second Round SHA-3 ASIC Implementations, Second SHA-3 Candidate Conference, 2010
[4] L. Henzen, P. Gendotti, P. Guillet, E. Pargaetzi, M. Zoller and F. K. Gürkaynak, Developing a Hardware Evaluation Method for SHA-3 Candidates, CHES 2010
[5] E. B. Kavun and T. Yalcin, A Lightweight Implementation of Keccak Hash Function for Radio-Frequency Identification Applications, RFIDsec’10
[6] J. Strömbergson, Implementation of the Keccak Hash Function in FPGA Devices
[7] S. Tillich, M. Feldhofer, M. Kirschbaum, T. Plos, J.-M. Schmidt and A. Szekely, High-Speed Hardware Implementations of BLAKE, Blue Midnight Wish, CubeHash, ECHO, Fugue, Grostl, Hamsi, JH, Keccak, Luffa, Shabal, SHAvite-3, SIMD, and Skein, ePrint Archive, report 2009/510 v2.0, 2009
[8] S. Tillich, M. Feldhofer, M. Kirschbaum, T. Plos, J.-M. Schmidt and A. Szekely, Uniform Evaluation of Hardware Implementations of the Round-Two SHA-3 Candidates, Second SHA-3 Candidate Conference, 2010
[9] AIST RCIS, SHA-3 Hardware Project, 2010
[10] X. Guo, S. Huang, L. Nazhandali and P. Schaumont, On The Impact of Target Technology in SHA-3 Hardware Benchmark Rankings, ePrint Archive, report 2010/536, 2010