As part of a collaboration project with
CZERTAINLY, I've started a Java
repository
to experiment hybrid PQC. Together with
Andrei
and
Danrlei
(some of our hero's undergrad students!), we developed some
examples of hybrid PQC using Bouncy Castle (1.77).
In this post, I'll explain some possible problems you might
encounter, in particular if you are planning a transition to
PQC with java, such as:
-
Design Architecture: how can I add hybrids in my Java
application?
-
Performance: it can get
3-12x worse! (comparing
PQ-only vs Hybrid, depending on some choices)
Let's start on how to develop a Hybrid KEM in Java using
Bouncy Castle.
Design alternatives
Recently, KeyFactor announced
FIPS 140-3 compliance
for Bouncy Castle, which includes post-quantum algorithms.
Looking at their docs, one can plan a migration as follows
(considering KEMs):
-
Certified version of Bouncy Castle (docs
here).
-
Or use non-FIPS KEMs (docs
here).
Regardless of the version, the recommended approach is to
use a traditional algorithm (such as ECDH) with a PQC one
(e.g., Kyber) as follows:
-
Key Agreement (older systems): using
"UserKeyingMaterialSpec" class to supply the PQC KEM
secret or "DEROtherInfoClass".
-
Key Agreement (newer systems):"HybridValueParameterSpec",
which uses a Two-step KDF (NIST.SP.800-56C).
-
Key Transport: "PQCSecretKeyProcessor". Documentation
states that it combines traditional and PQC shared secrets
using a XOR function.
Ok, all of this are in the docs, but if we look from an
implementation perspective, there is much more to deal with.
Theoretically, the migration can be represented as a flow in
which we start from traditional cryptography towards PQC:
We could generalize each step as a different approach to the
same end, which is, a way to secure our data. With
crypto agility
in mind, a flexible but easy-to-switch API is desirable. In
this case, the migration should not be hard to code.
Consider the array sorting as a quick example. The sorting
strategy (quick-sort, merge-sort, etc.) is something that a
computer-science student would like to switch in order to
compare performance.
So, how can I code such a easy-to-switch API? Well, there
are some alternatives and in our repository we choose a
pattern:
Strategy.
Strategy is kind of a set of black-box methods; easy to
switch between them. I know, there are cons to that, but if
we want to do something like
traditional -> hybrid -> PQ-only it means that
there are three strategies to deal with. It's not that
simple but this is the general idea that we select for our
implementation.
One caveat to the Strategy pattern is that if your
application is not using it maybe the refactor efforts may
slow down your migration. Since hybrids can be used for a
while (years, decades, forever? What if it is not Java
anymore? Who knows...), maybe a pattern would not be
necessary.
Anyway, our Strategy instance contains a KEM class, and a
HybridKEM class, coded in a way that you can switch between
them very easily, like this:
Such a design also helps to minimize the number of required
configurations (we can establish defaults after the
benchmarks).
There is also a security concern that hybrid keys should not
be used outside the "hybrid context" (to avoid downgrade
attacks). This concern is more critical when talking about
hybrid signatures (please refer to this
talk). Our
design is aligned to this idea, because the operations are
performed inside the hybrid class, but obviously we are
dependent on the interfaces that Bouncy Castle gives and how
the methods are actually used.
After some initial benchmarks, we saw a performance
bootleneck and that lead us to a second HybridKEM class:
HybridKEMxECDH (which does not fit well in the strategy
concept but we left "as-is" just for reference, or for an
easy copy-and-paste example).
Now let's see some results.
Performance
We selected Kyber as PQC algorithm and NIST P-curves to do
the benchmark. Our setup was configured without Turbo Boost
(from
here) and
disabling other unnecessary services in a Linux system. Flag
-n is used to generate a million iterations. The
table below shows the result (considering 1M KeyGens, 1M
Encaps, 1M Decaps):
By hybrid penalty I mean the slowdown when comparing PQ-only
vs hybrid. The above table shows penalties from
3.18x to 12.6x. Ouch!
Literature shows that hybrid modes do not harm performance
when considering algorithms at security level 1 (e.g.,
Kyber512), and reported some penalties when increasing
security levels. Some examples
here
and
here. However, our results are much worse from what I would
expect for the hybrid mode. So what's wrong?
First I thought that something was wrong with the
implementation, but after a quick profiling I saw this:
AbstractECMultiplier class seems to be the bootleneck
(comparing Key Generation). I could not simply fix it by
changing software providers (SunEC provider was slower!). So
if that performance is unacceptable, then we should change
our hybrid ECC component. By changing it to x25519 and X448
(curves described in RFC 7748), the hybrid penalties were
reduced significantly. See below:
As expected, such a change improved performance for the
hybrid mode. One can also use the
x25519 interface
directly, perhaps the performance could be further improved
that way. Therefore, x25519 is a better choice for the
hybrid.
Some details are missing, such as the KDF instances that we
use (but I don't think they would change the results). Also,
we map x25519 with Kyber512 and Kyber768, because
there is no level-3 equivalent (differently from NIST
P-384). Nevertheless, it seems that Kyber768 with x25519
will be used as default: examples
here and
here.
In conclusion, there are some pitfalls in the choices for
the algorithms and the design architecture when you migrate
your java application. Hopefully this post might help you
avoid some of them. For the next post, we'll see some
performance results when using hybrid signatures. Thanks!