Hybrid PQC in Java: Common Pitfalls (Part 1)
As part of a collaboration project with CZERTAINLY, I've started a Java
repository to experiment hybrid PQC.
Together with Andrei and Danrlei
(some of our hero's undergrad students!), we developed some examples of hybrid PQC using Bouncy Castle (1.77).
In this post, I'll explain some possible problems you might encounter, in particular if you are planning a transition to PQC with java, such as:
- Design Architecture: how can I add hybrids in my Java application?
- Performance: it can get 3-12x worse! (comparing PQ-only vs Hybrid, depending on some choices)
Design alternatives
Recently, KeyFactor announced FIPS 140-3 compliance for Bouncy Castle, which includes post-quantum algorithms. Looking at their docs, one can plan
a migration as follows (considering KEMs):
- Key Agreement (older systems): using "UserKeyingMaterialSpec" class to supply the PQC KEM secret or "DEROtherInfoClass".
- Key Agreement (newer systems):"HybridValueParameterSpec", which uses a Two-step KDF (NIST.SP.800-56C).
- Key Transport: "PQCSecretKeyProcessor". Documentation states that it combines traditional and PQC shared secrets using a XOR function.
We could generalize each step as a different approach to the same end, which is, a way to secure our data.
With crypto agility in mind, a flexible but easy-to-switch API is desirable. In this case, the migration should not be hard to code.
Consider the array sorting as a quick example. The sorting strategy (quick-sort, merge-sort, etc.) is something that a computer-science student would like to switch in order to compare performance.
So, how can I code such a easy-to-switch API?
Well, there are some alternatives and in our repository we choose a pattern: Strategy.
Strategy is kind of a set of black-box methods; easy to switch between them.
I know, there are cons to that, but if we want to do something like traditional -> hybrid -> PQ-only
it means that there are three strategies to deal with.
It's not that simple but this is the general idea that we select for our implementation.
One caveat to the Strategy pattern is that if your application is not using it maybe the refactor efforts may slow down your migration.
Since hybrids can be used for a while (years, decades, forever? What if it is not Java anymore? Who knows...), maybe a pattern would not be necessary.
Anyway, our Strategy instance contains a KEM class, and a HybridKEM class, coded in a way that you can switch between them very easily, like this:
Such a design also helps to minimize the number of required configurations (we can establish defaults after the benchmarks).
There is also a security concern that hybrid keys should not be used outside the "hybrid context" (to avoid downgrade attacks).
This concern is more critical when talking about hybrid signatures (please refer to this talk).
Our design is aligned to this idea, because the operations are performed inside the hybrid class, but obviously we are dependent on the interfaces that Bouncy Castle gives and how the methods are actually used.
After some initial benchmarks, we saw a performance bootleneck and that lead us to a second HybridKEM class: HybridKEMxECDH (which does not fit well in the strategy concept but we left "as-is" just for reference, or for an easy copy-and-paste example).
Now let's see some results.
Performance
We selected Kyber as PQC algorithm and NIST P-curves to do the benchmark. Our setup was configured without Turbo Boost (from here) and disabling other unnecessary services in a Linux system. Flag -n is used to generate a million iterations. The table below shows the result (considering 1M KeyGens, 1M Encaps, 1M Decaps):
Parameter Set | Keygen/s (PQ-only) |
Keygen/s (hybrid) |
Hybrid Penalty |
Encaps/s (PQ-only) |
Encaps/s (hybrid) |
Hybrid Penalty |
Decaps/s (PQ-only) |
Decaps/s (hybrid) |
Hybrid Penalty |
---|---|---|---|---|---|---|---|---|---|
Kyber 512 | 12777.75 | 4008.30 | 3.18x | 12107.27 | 2394.96 | 5.05x | 9730.27 | 2315.45 | 4.20x |
Kyber 768 | 7895.65 | 1643.80 | 4.8x | 8092.77 | 810.05 | 9.99x | 6686.68 | 786.28 | 8.5x |
Kyber 1024 | 5150.84 | 828.53 | 6.21x | 5567.58 | 379.02 | 14.68x | 4717.69 | 374.40 | 12.6x |
By hybrid penalty I mean the slowdown when comparing PQ-only vs hybrid. The above table shows penalties from 3.18x to 12.6x. Ouch!
Literature shows that hybrid modes do not harm performance when considering algorithms at security level 1 (e.g., Kyber512), and reported some penalties when increasing security levels. Some examples
here and
here. However, our results are much worse from what I would expect for the hybrid mode. So what's wrong?
First I thought that something was wrong with the implementation, but after a quick profiling I saw this:
AbstractECMultiplier class seems to be the bootleneck (comparing Key Generation). I could not simply fix it by changing software providers (SunEC provider was slower!). So if that performance is unacceptable, then we should change our hybrid ECC component. By changing it to x25519 and X448 (curves described in RFC 7748), the hybrid penalties were reduced significantly. See below:
Parameter Set | Keygen/s (PQ-only) |
Keygen/s (hybrid) |
Hybrid Penalty |
Encaps/s (PQ-only) |
Encaps/s (hybrid) |
Hybrid Penalty |
Decaps/s (PQ-only) |
Decaps/s (hybrid) |
Hybrid Penalty |
---|---|---|---|---|---|---|---|---|---|
Kyber 512 | 12839.11 | 4943.30 | 2.59x | 12199.58 | 4028.76 | 3.02 | 9701.29 | 3771.96 | 2.57x |
Kyber 768 | 7982.05 | 3973.29 | 2.0x | 8102.80 | 3473.01 | 2.33x | 6770.70 | 3193.44 | 2.12x |
Kyber 1024 | 5188.92 | 1465.73 | 3.54x | 5550.62 | 1245.92 | 4.45x | 4715.06 | 1198.32 | 3.93x |
As expected, such a change improved performance for the hybrid mode. One can also use the x25519 interface directly,
perhaps the performance could be further improved that way.
Therefore, x25519 is a better choice for the hybrid.
Some details are missing, such as the KDF instances that we use (but I don't think they would change the results).
Also, we map x25519 with Kyber512 and Kyber768, because there is no level-3 equivalent (differently from NIST P-384).
Nevertheless, it seems that Kyber768 with x25519 will be used as default: examples
here and
here.
In conclusion, there are some pitfalls in the choices for the algorithms and the design architecture when you migrate your java application.
Hopefully this post might help you avoid some of them. For the next post, we'll see some performance results when using hybrid signatures. Thanks!