Daan Sprenkels <[email protected]>
"Shamir secret sharing is a method to split secrets into shares, and to later recombine them. However, it does not feature integrity protection of the secret. This article elaborates on Feldman VSS and Pederson VSS, which do protect the message integrity. Furthermore, we show how hashing the shares also protects the message integrity, but is vulnerable to a cheating dealer.
Markdown does not really support footnotes. In this document, I use blockquotes for this purpose.
A couple of weeks ago, Christopher Allen requested me to "summarize [my] post https://dsprenkels.com/sss-34c3.html" and submit it to the RWOT8 repo.
Threshold schemes are cryptographic schemes that allow a group of people to split a secret message
A prominent detail of threshold schemes is that the dealer can specify the threshold of the amount shares that needs to be reached for the message to be recombined. For instance, they can split a message into 5 shares, but allow the message to be recovered if only 4 shares are recombined. Should one share be lost by any of the participants, then the message can still be recovered.
In this article, I will try to set forth a general description on Shamir's secret sharing scheme in Section 2, and the two verifiable secret sharing schemes from Feldman and Pederson in Section 2. In Section 4 I will introduce a patched version of Shamir's scheme, using hash functions and analyze its security properties. In Section 5, I will briefly argue about some details that are important when choosing and implementing these kinds of schemes.
The first and still most relevant scheme is Shamir's secret sharing scheme (SSS). The idea is quite simple. Consider the dealer has a message
-
Distribution. The dealer:
- generates a random polynomial
$f(x)$ of order$t$ , i.e.$f(x) = a_{t-1}x^{t-1} + \ldots + a_{0}$ ; - sets the lowest order coefficient
$a_0$ to$m$ , s.t.$f(0) = m$ ; - evaluates
$n$ points on the polynomial:$s_{1} = (1, f(1)), \ldots, s_{n} = (n, f(n))$ , and distributes them among the participants.
- generates a random polynomial
-
Reconstruction. The participants:
- collect at least
$t$ shares; - fit a polynomial
$f'(x)$ between the points$s_{1}, \ldots, s_{t}$ , which can be done using Lagrange interpolation; and - they recover
$m = f'(0)$ .
- collect at least
Essentially, in Shamir's scheme the dealer creates a system of
Figure 1: In this case the dealer has their secret
Shamir's scheme is---what we call in crypto---information theoretically secure for confidentiality. In this case, this means that the
In the previous paragraph, I explicitly mention that SSS is secure for confidentiality. That is because SSS is very much insecure for integrity. If one of the participants knows something about the actual value of the secret, they can manipulate their share to force
If you are interested in the details of forging shares, I refer you to an explanation I wrote earlier on the Crypto StackExchange.
Of schoolbook SSS, we can write down some benefits and downsides.
- Information-theoretical security for confidentiality.
- Needs no mathematical assumptions.
- No integrity protection.
- Thus, the dealer can cheat.
We could quite easily patch SSS to secure the integrity of the secret, I will get back to this further in this article. However, before rolling our own crypto, I would like us to first take a brief look at the two most popular verifiable threshold schemes. Maybe we can take some inspiration from those, before trying to solve the integrity problem ourselves.
Note, these descriptions come from B. Schoenmakers' lecture notes on Cryptographic Protocols (Chapter 6). These notes contain more in depth security analyses on these threshold schemes.
In, Feldman VSS, let
-
Distribution. The dealer:
- runs Shamir's scheme on
$m$ and gets$f(x), s_{1}, \ldots, s_{n}$ ; and - computes and broadcasts commitments
$B_{j} = g^{a_j}$ for$0 \le j < t$ .
- runs Shamir's scheme on
-
Reconstruction. The participants:
- each verify all the shares by checking whether
$g^{s_i} = \prod_{j=0}^{t-1} B_j^{i^j}$ ; and - run Shamir's reconstruction protocol to recover
$m$ .
- each verify all the shares by checking whether
Benefits:
- DL-based security for confidentiality.
- Information-theoretical security for integrity.
- Dealer cannot cheat.
Downsides:
- Assumes DL assumption.
- Can only share uniformly random messages.
In Pederson VSS, let
-
Distribution. The dealer:
- generates two random polynomials
$f_a(x),f_b(x)$ of order$t$ , i.e.$f_a(x) = a_{t-1}x^{t-1} + \ldots + a_{0}$ and$f_b(x) = b_{t-1}x^{t-1} + \ldots + b_{0}$ ; - sets the lowest order coefficients
$a_0,b_0$ to$m$ , s.t.$f_a(0) = f_b(0) = m$ ; - evaluates
$n$ points on the polynomial$s_{1} = (1, f_a(1), f_b(1)), \ldots, s_{n} = (n, f_a(n), f_b(n))$ , distributes them among the participants; and - computes and broadcasts the commitments
$C_j = g^{a_j}h^{b_j}$ for =$0 \le j < t$ .
- generates two random polynomials
-
Reconstruction. The participants:
- collect at least
$t$ shares; - fit a polynomial
$f'(x)$ between the points$s_{1}, \ldots, s_{t}$ , which can be done using Lagrange interpolation; and - they recover
$m = f'(0)$ .
- collect at least
- Information theoretical security for confidentiality.
- DL-based security for integrity.
- Dealer cannot cheat.
- Assumes DL assumption.
We see that both schemes are based on broadcasting commitment values. These commitments force the shares to be some kind of value, without revealing that actual value. A participant can use these commitments to verify if their share is actually correct, but from these commitments, one cannot reproduce the shares.
From my experiences implementing crypto, I find that correctly implementing discrete logarithm-based schemes is very difficult. We do however know another building block which we can use to build a commitment scheme, that is the hash function. For RWOT8, this was first proposed by Peg in his article:
He says: "This is also something we considered, but feel that it gives custodians more unnecessary extra information and less accountability compared to other methods."
So, instead of committing to the polynomial coefficients, we commit to the values of the shares. I.e. we propose the following scheme:
-
Distribution. The dealer:
- runs Shamir's scheme on
$m$ and gets$f(x), s_{1}, \ldots, s_{n}$ ; and - computes and broadcasts commitments
$D_i = H(s_i)$ for$1 \le i \le n$ , where$H$ is a preimage resistant hash function.
- runs Shamir's scheme on
-
Reconstruction. The participants:
- each verify all the shares by checking whether
$H(s_i) = D_i$ ; and - run Shamir's reconstruction protocol to recover
$m$ .
- each verify all the shares by checking whether
Broadcasting the hashes of the shares makes sure that no participant can manipulate with the shares, because their share's hash would not verify in the reconstruction phase. Futhermore, the other participants will know if a specific share is invalid, instead of ending up with a garbled or different reconstructed message. This property is shown by the following proof:
Claim. Assume
Proof. Assume a participant is able to force the reconstruction of
So if this option was always, possible, why is it not very commonly described by cryptographers as a verifiable secret sharing scheme? Well, the reason is that in this hash-based scheme, the dealer can cheat.
In particular, the dealer can act as follows: They split a message into shares and distribute them to the participants. They compute the commitments
The participants will not know whether this participant was malicious, or if the dealer was malicious. In short, if the dealer is not completely trusted, we should not use this scheme. Instead we should probably use one of Section 3.
In conclusion, here is a listing of benefits and downsides, similar to the listings that were provided in the previous sections.
- Information theoretical security for confidentiality.
- Hash-based security for integrity.
- Dealer can cheat.
- Assumes hash function preimage resistance.
This section will be different from the previous ones. Where the previous sections are mainly about the theory, this will mainly be based on my personal experiences while implementing my personal secret sharing library and other cryptographic software.
As we all probably know, writing cryptographic software is hard. The main nemesis of a piece of cryptographic code is the side-channel attack. On regular PCs, the main attack that must be protected against is the (cache) timing attack. These attacks exploit the time that an algorithm takes to execute, which---on a computer---can be measured with a really precise resolution. A proposed solution to defend against this is to only run code on your computer that is trusted. However, we all know that this is nigh-on impossible to achieve, especially when we realize that web pages run tons of code that is not audited by the user.
Instead, we have to write all the software in a deterministic manner. That means we cannot use if-statements or table-lookups that depend on secret data.
Another consideration is the complexity of our cryptosystem. As a rule of thumb: Symmetric crypto and hash functions are relatively easy; Asymmetric crypto is generally very hard. Therefore, I would personally stay away from any discrete log based system, unless the requirements really ask for it. In my own library, I have opted to use the off-the-shelf AEAD "Salsa20/Poly1305" for integrity protection, and I have implemented all the arithmetic related to SSS in
An interesting case where we saw that the implementation of asymmetric crypto is hard where cryptonote-based currencies forgot to protect against small-subgroup attacks in Curve25519 in 2017.