We've learned how to collectively compute a sum in a multiparty computation setting (i.e. how to securely compute a sum) using Yao's technique[Y82] of masking a secret with a random value in such a way as to guarantee security, i.e. the correctness and privacy properties of the distributed computation. Unfortunately, we've also seen that security only held if all computing parties were honest. Even in the semi-honest (gossipy) setting, security could no longer be guaranteed, since neither the privacy nor the correctness property could be achieved. In the malicious setting, all bets were off.

Using Shamir's threshold scheme of secret sharing, it turns out that we can build an MPC protocol to securely compute any efficiently computable arithmetic function of additions and multiplications, even in the presence of (not too many) corrupted parties. In a groundbreaking 1988 paper[BGW88], Michael Ben-Or, Shafi Goldwasser, and Avi Wigderson described the BGW protocol that we'll explain below. It should be noted that this protocol was also independently discovered the same year by David Chaum, Claude Crépeau, and Ivan Damgård[CCD88], but the name BGW stuck.

What is so special about the BGW protocol, and why is it so famous? While the protocol per se is beautiful and worth studying on its own, it is the completeness theorems that were proven in [BGW88, AL11] which had a profound influence on Theoretical Cryptography. In a nutshell, they state that on a cluster of $n$ parties, out of which at most $t$ parties are corrupted by an adversary, every efficiently computable arithmetic function can be computed with perfect security (i.e. in the information-theoretic sense),

• so long as $t \lt n / 2$ and the corrupted parties are semi-honest (honest-but-curious)
• so long as $t \lt n / 3$ even when the corrupted parties are malicious.

### Prerequisites

To fully understand this crypto bite, you'll need

• Theory of Multiparty Computation to understand what MPC is all about and why we care. There, you'll learn what the terms semi-honest and malicious parties mean, what the difference between the information-theoretic and the computational sense is, and why it matters. You can safely skip the part about the Ideal vs. Real World and about simulations.
• Shamir's threshold scheme of secret sharing, the cryptographic primitive which we'll use to securely compute additions and multiplications.

### The Story So Far...

In Secure Multiparty Computation (MPC), also called Secure Function Evaluation (SFE), multiple parties $P_1, P_2, \dots, P_n$ want to collaboratively compute a function $y = f(x_1, x_2, \dots, x_n)$, where the input $x_i$ is only known by $P_i$ for all $i \in \{1, 2, \dots, n\}$. In other words: each party $P_i$ is totally oblivious to the other parties' inputs $x_j, j \ne i$. This is the privacy property. The correctness property is simply that the correct result is computed (even if some parties are corrupted). We'll use the term "to securely compute something" for this kind of computation, where both the privacy and the correctness properties are preserved.

We've noted that this problem has a trivial solution, if we allow for an additional trusted third party $P_\aleph$: each party $P_i, i \ne \aleph$ securely sends its input to the trusted third party $P_\aleph$, which collects the inputs of all $n$ parties. $P_\aleph$ then computes $y = f(x_1, x_2, \dots, x_n)$, and publicly announces the result $y$. While this doesn't require secure computing at all, it places the burden of computation on the trusted party, effectively negating the potential speedup and privacy advantages that distributed computing offers. Therefore, we want to do without a trusted third party altogether and have the "workers" $P_1, P_2, \dots, P_n$ compute (parts of) $f(x_1, x_2, \dots, x_n)$ in a secure manner, without them knowing the inputs of the other workers.

The concept of secure computing may seem weird at first: how can one compute a function without knowing all of its inputs? That it can be done in the general case is one of the most amazing facts that cryptologic research has ever discovered. Before tackling the general case, we started slowly, to get accustomed to this concept:

In The Game of Like, we've seen how to securely compute the AND function of two Boolean variables $y = f(x_1, x_2) = x_1 \wedge x_2$ in the analog physical world, using a Game of Cards.

In The Sum of Our Earnings, we've securely computed the Addition function $y = f(x_1, x_2, \dots, x_n) = \sum_{j=1}^n x_j$ using Yao's technique[Y82] called secret blinding or masking. However, the condition was that the parties were not only semi-honest (i.e. being curious, but following the protocol to the letter), they were also required to be non-gossipy, i.e. (completely) honest. Just one gossipy party, and the privacy property went out of the window.

Now, we want to securely compute not only the addition, but any efficiently computable arithmetic function consisting of additions and multiplications (such as, say, $y = f(x_1, x_2, x_3) = x_1 x_2 + x_3 - 3 x_1$), even when some parties are under the control of the adversary, i.e. semi-honest (gossipy, but otherwise still following the protocol to the letter) or even outright malicious (completely departing from the protocol and running arbitrary code). We can achieve this feat using Shamir's scheme, so long as not too many parties are corrupted.

### The BGW Protocol

In the following, we'll first look at the general setup, i.e. what will be sent to each party, and what each party is supposed to compute. Getting the setup right is very important.

We'll then tackle the simple basic task of securely adding two inputs $x_a + x_b$ on the cluster: all $n$ workers will collectively and securely add both inputs. Securely multiplying an input by a constant on the cluster, i.e. computing $c x_a$, comes for free once we've mastered additions. It will turn out that both computations don't require communication between the parties! This is a very desirable property of additions and multiplications with a constant, because the main bottleneck of MPC is the communications overhead between the parties.

Secure multiplication of two inputs $x_a x_b$ on the cluster will turn out to be trickier. Because multiplying two polynomials of degree $t$ results in a much larger polynomial of degree $2t$, we'll have to somehow reduce the degree back to $t$ without changing the result of the computation. Doing so turns out to be possible, but it is a little bit more complex than simple secure additions and secure multiplications with a constant.

A bigger issue with secure multiplications is the increased communications overhead in the complete cluster of $n$ parties. Indeed: neither the degree-$2t$ polynomial nor its reduced degree-$t$ version has uniformly random coefficients. This weakens the security of the scheme, since now an adversary may guess the coefficients with strictly more than negligible probability, and could therefore break the privacy property. To restore security after one (or maybe a few) multiplication(s), the whole cluster will need to refresh their shares using a mechanism called proactive secret sharing (PSS) [HJKY95, BD17]. Each invocation of this mechanism involves one party contacting all other $n-1$ parties in the cluster, resulting in an $O(n)$ communications overhead at each refreshing step. When and how often PSS has to be invoked depends on the level of security and on the number of multiplications that are to be computed.

Up until now, we've used the whole cluster of $n$ parties to securely compute binary additions, binary multiplications, and unary multiplications with a constant. But we want to compute $f(x_1, x_2, \dots, x_n)$, which involves a whole set of additions and multiplications. For example, computing $f(x_1, x_2, x_3) = x_1 + x_2 + x_3$ involves a ternary addition $+(x_1, x_2, x_3)$. But we don't know yet how to securely compute ternary additions. Fortunately, the problem can be reduced to binary and unary operations, since $x_1 + x_2 + x_3 = (x_1 + x_2) + x_3$. We'll therefore securely compute the binary addition $y_1 = x_1 + x_2$ on the cluster, and then securely compute the binary addition $y = y_1 + x_3$ on the cluster.

In the general case, each function consisting of additions, multiplications and constant-multiplications can be modeled as a circuit of cascading gates consisting of many 2-adders (adders which accept two inputs), 2-multipliers (multipliers that accept two inputs), and "1-constant-multipliers" (multipliers with an embedded constant which accept one input). The idea is that initiator $P_0$ and all parties will order the gates in a topological fashion, and, using the same circuit representation, will proceed to compute one gate after the other until they reach the output.

(... to be done: add a diagram with a sea of cascading gates)

So let's get started!

#### General Setup

We have two types of parties:

• An initiator $P_0$ which knows all $n$ inputs $x_1, x_2, \dots, x_n$ for the function $y = f(x_1, x_2, \dots, x_n)$ to be securely computed.
• A set of $n$ compute parties $P_1, P_2, \dots, P_n$ which will collectively compute each gate of the circuit representing $f$.

All parties are interconnected with perfectly secure bidirectional links, resulting in a full mesh of $O(n^2)$ links. In practice, the links will be authenticated and encrypted TLS 1.3 connections, but then, of course, we'll lose perfect security and fall back to computational security, since those real-world links rely on cryptographic assumptions. We may also assume the availability of a common broadcast channel (which, in practice, will usually be implemented by each broadcaster sending the message out on every point-to-point link).

The BGW protocol consists of three stages:

• the input stage, where the circuit description and the inputs are fed from $P_0$ to the compute parties $P_1, P_2, \dots, P_n$
• the computation stage, where the compute parties process each gate of the circuit while exchanging messages with shares of intermediate values
• the final stage, where parties pool their shares of the output gate, to reconstruct the computed final result.

#### The Input Stage

The input stage of the BGW protocol consists of two tasks to be run by $P_0$:

• $P_0$ compiles $f$ into a circuit $C_f$ of adders, constant-multipliers, and multipliers. It then sends a description $\langle C_f \rangle$ of this circuit to every compute party $P_1, P_2, \dots, P_n$.
• $P_0$ distributes the inputs $x_1, x_2, \dots, x_n$ of the function $f$ to the corresponding parties $P_1, P_2, \dots, P_n$ (i.e. it sends $x_i$ to $P_i$ and only to $P_i$ over a secure link).

Details follow.

At the beginning, $P_0$ compiles the arithmetic function $f$ into a circuit $C_f$ of cascading gates, which can be 2-adders, 1-constant-multipliers, and 2-multipliers. A description $\langle C_f \rangle$ of this circuit is then broadcast from $P_0$ to all compute parties $P_1, P_2, \dots, P_n$. Part of the circuit representation sent by $P_0$ to the compute parties is an ordered list $G$ of gates, obtained by topologically ordering the gates in such a way that gate $G_i$ follows all gates $G_j$ which directly or indirectly contribute to $G_i$'s input wires. Note that this is always possible, since we can always represent an arithmetic function with a DAG (a directed, acyclic graph)[PhD09], so we won't have any cycle where a later gate's output feeds back into an earlier gate's input.

Later, when evaluating the circuit in the computation phase, each compute party will synchronously evaluate each gate from this ordered list of gates, in this order. In other words: if the ordered list of gates happens to be $G = (G_1, G_2, \dots, G_{|G|})$, then all parties first collectively evaluate gate $G_1$, then collectively evaluate gate $G_2$, and so on. Note that this is well-defined, since evaluating gate $G_i$ is always possible: both inputs (or the single input in the case of constant-multipliers) will already have been computed by evaluating the previous gates. $G$ doesn't have to be explicit. It could just as well be implicit, e.g. by listing the gates in the circuit description in the already sorted order.

Noteworthy is that the length of the computation grows linearly with the size of the circuit. The more complex the formula for $f$, the more gates are to be evaluated, and the longer the computation. In other words: the number of rounds grows linearly with the size of the circuit. Other MPC protocols than BGW can run in a constant number of rounds[BiB89, DFKNT06], i.e. independent of the number of gates.

Example: to compute the previously mentioned function $y = f(x_1, x_2, x_3) = x_1 x_2 + x_3 - 3 x_1$, one possible circuit description could be (here, $y^{(i)}$ means the $i$-th intermediary output; it has nothing to do with exponentiation):

$\langle C_f \rangle = (\underbrace{(y^{(1)}, \times, x_1, x_2)}_{G_1}, \underbrace{(y^{(2)}, +, y^{(1)}, x_3)}_{G_2}, \underbrace{(y^{(3)}, \times_c, -3, x_1)}_{G_3}, \underbrace{(y^{(4)}, +, y^{(2)}, y^{(3)})}_{G_4})$

The list of gates $G = (G_1, G_2, G_3, G_4)$ is already ordered topologically.
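
To make this encoding concrete, here is a minimal Python sketch that evaluates such a gate list on plaintext values. It performs no secure computation whatsoever; it only illustrates the tuple representation and the topological evaluation order. The function name `eval_circuit` and the `"*c"` opcode spelling are my own illustrative choices, not part of [BGW88].

```python
# Plaintext evaluation of a gate list: each gate is (output wire, op, input 1, input 2).
# This is NOT secure computation; it only illustrates the encoding and gate order.
def eval_circuit(gates, inputs):
    wires = dict(inputs)  # wire name -> value
    for out, op, a, b in gates:
        if op == "+":       # 2-adder
            wires[out] = wires[a] + wires[b]
        elif op == "*":     # 2-multiplier
            wires[out] = wires[a] * wires[b]
        elif op == "*c":    # 1-constant-multiplier; here `a` is the constant
            wires[out] = a * wires[b]
    return wires[gates[-1][0]]  # output of the last, topologically sorted, gate

C_f = [("y1", "*",  "x1", "x2"),
       ("y2", "+",  "y1", "x3"),
       ("y3", "*c", -3,   "x1"),
       ("y4", "+",  "y2", "y3")]

# f(x1, x2, x3) = x1*x2 + x3 - 3*x1, so f(2, 5, 7) = 10 + 7 - 6 = 11
print(eval_circuit(C_f, {"x1": 2, "x2": 5, "x3": 7}))  # -> 11
```

Because the gate list is topologically sorted, the single left-to-right pass always finds its input wires already computed.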

(... to do: add diagram of this circuit)

##### Distributing the inputs

The next step is for $P_0$ to securely distribute the inputs $x_1, x_2, \dots, x_n$ to the corresponding compute parties. Therefore, $P_0$ securely sends $x_1$ to $P_1$; it securely sends $x_2$ to $P_2$, and so on, up to securely sending $x_n$ to $P_n$. By "$P_i$ securely sending $m$ to $P_j$", we mean that $P_i$ will send the message $m$ via the secure point-to-point link it shares with $P_j$: no other party or adversary can peek into this link (secrecy), nor can it interfere with its operations (integrity, guaranteed delivery).

In order to ensure that the privacy property holds, it must be guaranteed that no party $P_i$ sees or infers the input of other parties $P_j, j \ne i$. This also holds true for all intermediary values, i.e. output of some gates which become inputs to later gates. This is where Shamir's threshold scheme enters the stage!

Upon receiving its input $x_i$ from $P_0$, each compute party $P_i$ saves $x_i$ locally, and then uses Shamir's threshold scheme to "split" $x_i$ in a $t$-out-of-$n$ way into $n$ shares $x_{i_1}, x_{i_2}, \dots, x_{i_n}$. It then sends each share $x_{i_j}$ to party $P_j$ over its secure point-to-point link. At the end of this phase, each party $P_j$ not only possesses its own input $x_j$, it has also built an internal table of shares of other parties' inputs:

| Receiving from party $P_i$ | Input share from $P_i$ |
| --- | --- |
| $P_0$ (to myself) | $x_j$ |
| $P_1$ | $x_{1_j}$ |
| $P_2$ | $x_{2_j}$ |
| $\dots$ | $\dots$ |
| $P_n$ | $x_{n_j}$ |

##### How are shares computed?

Before we show how to compute the shares, we fix a number $E$ such that $E \gt n$ (in practice, a prime, so that the modular inverses needed for interpolation exist), and we'll compute everything modulo $E$. This constant will remain fixed throughout the whole BGW protocol, and is known by all parties.

Now, for each party $P_i$ to compute the shares $(x_{i_j})_{1 \le j \le n}$ from its input $x_i$, $P_i$ uses Shamir's secret sharing scheme as follows:

• $P_i$ randomly and uniformly samples $t-1$ coefficients $a_1, a_2, \dots, a_{t-1}$ from the set $\{0,1,2,\dots,E-1\}$, where $a_{t-1} \ne 0$.
• $P_i$ builds the polynomial $p_i(x) = a_{t-1} x^{t-1} + a_{t-2} x^{t-2} + \dots + a_1 x + x_i$. Note that the constant term is the input $x_i$. In other words, no matter what random coefficients we sample, it always holds that $p_i(0) = x_i$.
• $P_i$ computes the shares $x_{i_j}$ by evaluating $p_i(x)$ at position $j$, for each $j \in \{1, 2, \dots, n\}$, e.g. using Horner's method. In other words, $P_i$ sets $x_{i_j} := p_i(j)$ for each $j \in \{1, 2, \dots, n\}$. Recall that all computations are to be done modulo $E$.
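
The three steps above can be sketched in a few lines of Python. The function name `make_shares` and the modulus $E = 101$ are illustrative assumptions (a small example prime); for simplicity, the sketch doesn't enforce $a_{t-1} \ne 0$.

```python
import random

E = 101  # small example prime with E > n; public to all parties

def make_shares(secret, t, n):
    """Split `secret` into n Shamir shares, any t of which reconstruct it."""
    # p_i(x) = secret + a_1 x + ... + a_{t-1} x^{t-1}, coefficients mod E
    coeffs = [secret % E] + [random.randrange(E) for _ in range(t - 1)]
    def p(x):
        acc = 0                      # Horner's method, highest degree first
        for c in reversed(coeffs):
            acc = (acc * x + c) % E
        return acc
    return [p(j) for j in range(1, n + 1)]  # party P_j receives p(j)

shares = make_shares(42, 3, 5)  # five shares; any three reveal 42
```

Note that the share positions $1, 2, \dots, n$ are public; only the share values are secret.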

Any $t \le n$ parties can reconstruct the polynomial $p_i(x)$ at any position, including at the particularly interesting position 0, which corresponds to the shared secret $x_i = p_i(0) = x_i + \sum_{k=1}^{t-1} a_k 0^k$, even without knowing the coefficients $a_k$, using Lagrange interpolation or, equivalently, the formula shown in "Shamir's secret sharing scheme".
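
Reconstruction at position 0 can be sketched with Lagrange interpolation modulo a prime. The function name `reconstruct` is mine; the example points come from the hypothetical polynomial $p(x) = 42 + 7x + 3x^2 \bmod 101$, i.e. a 3-out-of-$n$ sharing of the secret 42.

```python
E = 101  # must be prime, so that pow(den, -1, E) exists

def reconstruct(points):
    """Recover p(0) mod E from t points (j, p(j)) via Lagrange interpolation."""
    secret = 0
    for j, y in points:
        num, den = 1, 1
        for k, _ in points:
            if k != j:
                num = num * -k % E       # numerator:   product of (0 - k)
                den = den * (j - k) % E  # denominator: product of (j - k)
        secret = (secret + y * num * pow(den, -1, E)) % E
    return secret

# Shares produced by p(x) = 42 + 7x + 3x^2 (mod 101), i.e. t = 3:
print(reconstruct([(1, 52), (2, 68), (3, 90)]))  # -> 42
```

Any three of the points suffice, and no coefficient $a_k$ is ever needed.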

(Exercise: do we need $t$ or $t-1$ random coefficients? Hint: a polynomial of degree $t-1$ is determined by $t$ points.)

After computing $x_i$'s shares $x_{i_j}, j \in \{1, 2, \dots, n\}$, party $P_i$ proceeds to send these shares (except for its own share $x_{i_i}$) to each of its peers. In other words, $P_i$ sends $x_{i_1}$ to $P_1$, it sends $x_{i_2}$ to $P_2$, and so on. Note that each party $P_j$ "knows" its index $j$, and this index is fixed for the whole run of the BGW protocol.

At the end of this phase, each party $P_j$ has built the table shown above. In other words, each party $P_j$ has not only its own input $x_j$, it also knows one share $x_{i_j}$ of every other party $P_i$'s input $x_i$. Note that so long as strictly fewer than $t$ parties collude, $P_j$ can't reconstruct the other parties' inputs, as discussed in Shamir's scheme. This ensures that the privacy property is preserved at this stage.

If each party is supposed to get multiple inputs, then each input is separately processed as above. For example, suppose that we want to compute $y = f(x_a, x_b)$. The initiator $P_0$ will split the first input $x_a$ in a t-out-of-n fashion into the shares $x_{a_1}, x_{a_2}, \dots, x_{a_n}$, and will send each $x_{a_i}$ to party $P_i$ over the secure link. Then, $P_0$ will split the second input $x_b$ in a t-out-of-n fashion into the shares $x_{b_1}, x_{b_2}, \dots, x_{b_n}$ using a fresh random polynomial, and will send each $x_{b_i}$ to party $P_i$ over the secure link.

Freshness: Note that since there are two inputs to share, $P_0$ uses two different random polynomials to split the inputs. In other words, it splits $x_a$ with a random polynomial $p_a(x)$, and it splits $x_b$ with another random polynomial $p_b(x)$. Both polynomials of degree $t-1$ should not only differ in the constant term $p_a(0) = x_a$ and $p_b(0) = x_b$, they also need to differ in all their coefficients. In other words: when choosing coefficients for $p_b(x)$ we need to invoke the pseudorandom generator again to obtain fresh random coefficients instead of reusing the coefficients from $p_a(x)$.

After receiving its input shares $x_{a_i}$ and $x_{b_i}$, party $P_i$ will split both of them in a t-out-of-n fashion (again using two fresh random polynomials of degree $t-1$), resulting in the shares $x_{{a_i}_j}$ and the shares $x_{{b_i}_j}$, which it will send to party $P_j$.

At the end of this phase, each party $P_j$ will have built the following table:

| Receiving from party $P_i$ | First input share from $P_i$ | Second input share from $P_i$ |
| --- | --- | --- |
| $P_0$ (to myself) | $x_{a_j}$ | $x_{b_j}$ |
| $P_1$ | $x_{a_{1_j}}$ | $x_{b_{1_j}}$ |
| $P_2$ | $x_{a_{2_j}}$ | $x_{b_{2_j}}$ |
| $\dots$ | $\dots$ | $\dots$ |
| $P_n$ | $x_{a_{n_j}}$ | $x_{b_{n_j}}$ |

#### The Computation Stage

Now, all compute parties $P_1, P_2, \dots, P_n$ compute the first gate $G_1$ of the list $G$ concurrently. Depending on the circuit description, they'll select the relevant input shares from their internal table and use them to securely compute additions, constant multiplications, and multiplications, as we'll show below. Each party $P_i$ will compute some intermediary output $y^{(1)}_i$. This output corresponds to $P_i$'s view/share of $G_1$'s output wire (or rather output register).

Because that intermediary output will be the input of some later gate, it will have to be shared in a t-out-of-n fashion with all other peers. Therefore, $P_i$ will use Shamir's secret sharing scheme to split $y^{(1)}_i$ (using a new random polynomial!) into the shares $y^{(1)}_{i_1}, y^{(1)}_{i_2}, \dots, y^{(1)}_{i_n}$. Then, $P_i$ will send each of these shares to the corresponding peer, i.e. it will send $y^{(1)}_{i_j}$ to peer $P_j$ for every $j \ne i$.

The peers $P_j$ will collect these intermediary results, and will store them in their internal table, either in the column of the first or second input, depending on the wiring of the circuit being computed.

Then, when all shares of the intermediary result $y^{(1)}$ have been computed and disseminated across the whole cluster of $n$ parties, the time has come for them to compute gate $G_2$, exactly repeating the computation stage just shown, but possibly using the intermediary shares instead of the original input shares from $P_0$. The computation then proceeds gate by gate until the last gate $G_{|G|}$.

Freshness: Every time a party $P_i$ needs to share some intermediary output, it does so using a fresh (new) random polynomial. In other words, the random coefficients are not reused; they are uniformly randomly sampled again for every value that is to be shared.

Example (continued): To resume the previous example, each of the three computing parties $P_i, i = 1, 2, 3$ will first securely evaluate the gate $(y^{(1)}, \times, x_1, x_2)$ by invoking the secure multiplication algorithm shown below for the multiplication gate on their input shares $x_{1_{i_1}}, x_{1_{i_2}}, x_{1_{i_3}}$ and $x_{2_{i_1}}, x_{2_{i_2}}, x_{2_{i_3}}$, resulting in an internal output $y^{(1)}_i$. You may already guess where this is going: each party $P_i$ will then split this internal output $y^{(1)}_i$ using Shamir's scheme into $n$ shares $y^{(1)}_{i_1}, y^{(1)}_{i_2}, \dots, y^{(1)}_{i_n}$ which will be distributed to $P_1, P_2, \dots, P_n$ respectively, either as first or as second input, depending on the (commonly known) circuit description. Each party updates its table of shares accordingly by overwriting its first input or second input column with the newly received shares.

The next step would be for all parties to compute the second gate $(y^{(2)}, +, y^{(1)}, x_3)$. They already have their shares of $x_3$, i.e. $P_i$ has $x_{3_i}$. And they now have their shares of the intermediary wire $y^{(1)}$, i.e. $P_i$ has $y^{(1)}_{1_i}, y^{(1)}_{2_i}, \dots, y^{(1)}_{n_i}$. Using the algorithm for secure addition (see below), they can now compute their shares of the new intermediary value $y^{(2)}$. In other words, $P_i$ computes its share $y^{(2)}_i$, which it'll split using Shamir's scheme into shares $y^{(2)}_{i_1}, y^{(2)}_{i_2}, \dots, y^{(2)}_{i_n}$, which it will send to its corresponding peers...

The computation proceeds gate by gate in exactly the same manner, until the last gate is evaluated by each $P_i$, and the output $y^{(4)}_i$ has been Shamir-split into $y^{(4)}_{i_1}, y^{(4)}_{i_2}, \dots, y^{(4)}_{i_n}$... but that last result is not yet distributed to the peers.

Just to emphasize it: the computation in the Computation Stage is "kind of synchronous", in the sense that all $n$ parties first compute $G_1$, then they all compute $G_2$, etc... if necessary, they have to wait until every party is ready to compute the next gate. There is some kind of implicit clock in there, because in order to start processing gate $G_i$, all parties have to wait until they get the shares of its first input and the shares of its second input from all other peers.

#### The Final Stage

At the end of the last gate's computation, all parties $P_i$ have computed shares $y_i$ of the last gate's output $y$. One last time, each party $P_i$ will Shamir-split its output share $y_i$ in a t-out-of-n fashion, like this:

| Party $P_i$ | Computed output shares |
| --- | --- |
| $P_1$ | $y_1$ with shares $y_{1_1}, y_{1_2}, \dots, y_{1_n}$ |
| $P_2$ | $y_2$ with shares $y_{2_1}, y_{2_2}, \dots, y_{2_n}$ |
| $\dots$ | $\dots$ |
| $P_n$ | $y_n$ with shares $y_{n_1}, y_{n_2}, \dots, y_{n_n}$ |

In order to reconstruct the final output $y$, all it takes is $t \le n$ parties combining their output shares. If all parties are supposed to learn the final $y$, then, at this stage, every party $P_i$ simply sends its output share $y_{i_j}$ to party $P_j$, for all $j \ne i$, just as in the computation phase. Then, every $P_j$ has enough shares to calculate the coefficients of the final Shamir polynomial $p_o(x)$, and to reconstruct the output $y = p_o(0)$ at location $0$.

However, this scheme can do better than that! If we only want a subset of the $P_j$ to learn the final result (maybe we don't trust some compute servers that are located in doubtful jurisdictions), then we can achieve this as well: all $n$ parties $P_i$ will then simply send their output shares only to the $P_j$s that are supposed to learn the final result. If nobody cheats, then only these $P_j$s will be able to reconstruct $y$.

Example (continued): Recall that at the end of the computation for the last gate $G_4$, each party $P_i, i \in \{1,2,3\}$ has computed its share $y^{(4)}_i$ of the output wire $y^{(4)}$. It has also computed the shares $y^{(4)}_{i_1}, y^{(4)}_{i_2}, y^{(4)}_{i_3}$, but didn't distribute them. If all parties are to learn the final result $y = y^{(4)}$, then $P_1$ sends $y^{(4)}_{1_2}$ to $P_2$ and sends $y^{(4)}_{1_3}$ to $P_3$. Similarly, $P_2$ sends $y^{(4)}_{2_1}$ to $P_1$ and sends $y^{(4)}_{2_3}$ to $P_3$. Also, $P_3$ sends $y^{(4)}_{3_1}$ to $P_1$ and sends $y^{(4)}_{3_2}$ to $P_2$. Now, all parties have all the necessary shares to compute $y^{(4)}$. For example, $P_1$ computes $y^{(4)}$ by combining its own $y^{(4)}_{1_1}$ with the $y^{(4)}_{2_1}$ it got from $P_2$ and the $y^{(4)}_{3_1}$ it got from $P_3$.

If, however, only $P_2$ is supposed to learn the final result, then $P_1$, $P_2$, and $P_3$ will send their output shares only to $P_2$. Observe that $P_2$ will have all shares necessary to compute the output, but $P_1$ and $P_3$ will not have enough shares (if $t \gt 1$), and won't be able to learn the result.

### Securely Computing the Gates

All that remains to be shown is how to securely compute the gates of a 2-addition, of a multiplication with a constant, and of a 2-multiplication. This is, so to speak, the "hard core" of the BGW protocol. It is where the "magic" of secure multiparty computation actually occurs.

Everything so far was merely scaffolding: we went to great lengths to distribute the one or two inputs of a single gate $G_k$ to all $n$ computing parties $P_i, i \in \{1,2,\dots,n\}$. These inputs could have come from the initiator $P_0$, or from intermediary outputs of the (shared secure) computation of previous gates.

Now, each party $P_j, j \in \{1,2,\dots,n\}$ has the following table of input shares (repeated from above, assuming two input wires):

| Receiving from party $P_i$ | First input share from $P_i$ | Second input share from $P_i$ |
| --- | --- | --- |
| $P_1$ | $x_{a_{1_j}}$ | $x_{b_{1_j}}$ |
| $P_2$ | $x_{a_{2_j}}$ | $x_{b_{2_j}}$ |
| $\dots$ | $\dots$ | $\dots$ |
| $P_n$ | $x_{a_{n_j}}$ | $x_{b_{n_j}}$ |

(We've removed the first row, because we assume that each party has already split its inputs accordingly)

Now, the mission is to use these shares to either securely compute an addition $x_a + x_b$, or a multiplication with a constant $c x_a$ (where $c$ is hard-coded in the circuit description: note that $c$ is known in this case, which is okay, since it is not an input anyway), or a multiplication $x_a x_b$.

The crucial observation is that adding two Shamir polynomials $p_a(x) + p_b(x)$ of degree $t-1$ results in a new Shamir polynomial $p_o(x)$ of degree $t-1$ for the output wire, with the remarkable property

$p_o(0) = p_a(0) + p_b(0) = x_a + x_b$

Indeed:

$p_a(x) + p_b(x) = \sum_{i=0}^{t-1} a_i x^i + \sum_{i=0}^{t-1} b_i x^i = \sum_{i=0}^{t-1} (a_i + b_i) x^i$

Especially:

$p_a(0) + p_b(0) = \sum_{i=0}^{t-1} (a_i + b_i) 0^i = (a_0 + b_0) + \underbrace{\sum_{i=1}^{t-1} (a_i + b_i) 0^i}_{= 0} = x_a + x_b$

recalling that in Shamir's secret sharing, the constant terms are the secret that is to be shared, i.e. that $a_0 = x_a$ and $b_0 = x_b$.

But hold your applause for now, because we still have one important aspect to discuss: is the new polynomial $p_o(x) = p_a(x) + p_b(x)$ really a Shamir polynomial? In other words, it is not enough to make sure that $p_o(0)$ encodes the sum, we also need to show that all coefficients $o_1, o_2, \dots, o_{t-1}$ of $p_o(x)$ are also randomly and uniformly selected from the set $\{0, 1, 2, \dots, E-1\}$. This is indeed the case, since every coefficient $o_i = a_i + b_i \mod{E}$ is the sum of two uniformly randomly chosen elements from this same set. Exercise: Do we also need to make sure that $o_{t-1}$ isn't 0?

Note that computing modulo $E$ is necessary here to preserve the property that the resulting coefficients are still uniformly random over the set $\{0,1,2,\dots,E-1\}$. Discuss!

Therefore, to compute an addition gate, all we (say party $P_i$) need to do is compute the shares of $y_i = x_a + x_b$. We don't compute $y_i$ directly, though, because, of course, we don't know and are not supposed to know $x_a + x_b$. Instead, we proceed by computing the output shares $y_{i_1}, y_{i_2}, \dots, y_{i_n}$ directly, which we'll send to all peers. And this is trivial! We just add the shares we already have:

$y_{i_j} = x_{a_{i_j}} + x_{b_{i_j}}, \forall j \in \{1, 2, \dots, n\}$

We then distribute these shares to the corresponding peers as shown above, i.e. we send $y_{i_j}$ to $P_j$, for every $j \in \{1,2,\dots,n\}$.

Why is this correct? If $t$ or more parties pooled their summed shares, they would reconstruct a Shamir polynomial $p_{o'}(x)$ of degree $t-1$ with the desired property $p_{o'}(0) = x_a + x_b$. In other words, we simply add the polynomials $p_a(x)$ and $p_b(x)$ at each point in $\{1, 2, \dots, n\}$. The resulting $n$ points define a sum polynomial of degree $t-1$, which happens to evaluate to the desired sum at the point 0. That's it!
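
The whole secure-addition flow can be checked end to end in a few lines of Python. The helper names (`make_shares`, `reconstruct`) and the small prime modulus $E = 101$ are illustrative assumptions, not part of the protocol itself.

```python
import random

E = 101  # small example prime modulus, public to all parties

def make_shares(secret, t, n):
    """Shamir-split `secret` into n shares with threshold t."""
    coeffs = [secret % E] + [random.randrange(E) for _ in range(t - 1)]
    return [sum(c * pow(j, i, E) for i, c in enumerate(coeffs)) % E
            for j in range(1, n + 1)]

def reconstruct(points):
    """Lagrange-interpolate p(0) mod E from points (j, p(j))."""
    s = 0
    for j, y in points:
        num, den = 1, 1
        for k, _ in points:
            if k != j:
                num = num * -k % E
                den = den * (j - k) % E
        s = (s + y * num * pow(den, -1, E)) % E
    return s

t, n = 3, 5
x_a, x_b = 17, 25
shares_a = make_shares(x_a, t, n)
shares_b = make_shares(x_b, t, n)

# Each party P_j adds its two shares locally; no messages are exchanged
sum_shares = [(a + b) % E for a, b in zip(shares_a, shares_b)]

# Any t parties can now pool their summed shares and recover x_a + x_b
points = list(enumerate(sum_shares, start=1))[:t]
print(reconstruct(points))  # -> 42
```

No matter which random polynomials were sampled, the reconstructed value is always $x_a + x_b = 42$, because the sum polynomial's constant term is fixed.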

Communications overhead: Observe that all computations occur locally. Once party $P_i$ has got all shares for the input wires, it doesn't need to communicate with the other parties to compute the shares of the output wire. In other words, additions come for free. This is a very important property of the BGW protocol, because the bottleneck of MPC is communications overhead. Additions are "nice" operations, in the sense that they don't cost more than what the scaffolding, which can't be reduced, already incurs.

#### Secure Constant Multiplication

Let's now move to the computation of multiplication with a constant, i.e. of $c x_a$. We assume that the constant $c$ is already stored in the circuit description $\langle C_f \rangle$ and is therefore known by all parties, since all parties have that description. We are also in the situation where every party holds shares of the input $x_a$: in other words, $P_i$ has the shares $x_{a_{i_1}}, x_{a_{i_2}}, \dots, x_{a_{i_n}}$.

Just as with the addition, observe that the constant term of the polynomial $p_o(x) = c p_a(x)$ is $p_o(0) = c p_a(0) = c x_a$, and that the degree of $p_o(x)$ is $t-1$, unless $c = 0$. This is therefore a Shamir polynomial.

Indeed:

$c\, p_a(x) = c \sum_{i=0}^{t-1} a_i x^i = \sum_{i=0}^{t-1} (c\, a_i) x^i, \quad \text{so} \quad c\, p_a(0) = c\, a_0 + \underbrace{\sum_{i=1}^{t-1} (c\, a_i)\, 0^i}_{=0} = c\, x_a$

So, to compute its shares of $c x_a$, party $P_i$ merely multiplies every input share with $c$:

$y_{i_j} = c\, x_{a_{i_j}}, \quad \forall j \in \{1, 2, \dots, n\}$

As before, $P_i$ sends these output shares to every corresponding peer $P_j$.
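
The same locality can be checked numerically. The sketch below uses a fixed, hypothetical degree-2 polynomial hiding $x_a = 10$, the constant $c = -3$ from the running example, and the small example prime $E = 101$; three parties multiply their shares by $c$ locally, and interpolating the results at 0 yields $c \cdot x_a$.

```python
E = 101                                        # small example prime modulus
c = (-3) % E                                   # constant from the example gate
p_a = lambda x: (10 + 4 * x + 9 * x * x) % E   # hypothetical polynomial hiding x_a = 10

# Each party P_j multiplies its share p_a(j) by c locally; nothing is sent
out_shares = {j: c * p_a(j) % E for j in (1, 2, 3)}

# Lagrange-interpolate the three output shares at x = 0
secret = 0
for j, y in out_shares.items():
    num, den = 1, 1
    for k in out_shares:
        if k != j:
            num = num * -k % E
            den = den * (j - k) % E
    secret = (secret + y * num * pow(den, -1, E)) % E

print(secret)  # -> 71, which is c * x_a = -30 mod 101
```

The multiplied shares lie on the polynomial $c\, p_a(x)$, which is why the interpolation at 0 returns $c\, x_a$.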

Communications overhead: Here too, all computations occur locally and don't incur any additional communications overhead. Multiplication with a constant is a "nice" operation too.

#### Secure Multiplication

As already said, party $P_i$ already has shares for $x_a$ and $x_b$ from all parties: $x_{a_{i_1}}, x_{a_{i_2}}, \dots, x_{a_{i_n}}$ for $x_a$, and $x_{b_{i_1}}, x_{b_{i_2}}, \dots, x_{b_{i_n}}$ for $x_b$. The goal of $P_i$ is now to compute its share $y_i$ of the product $x_a x_b$. Or, more precisely, $P_i$ wants to compute the shares of $y_i$ that it will send to its peers.

##### First attempt

To securely compute the multiplication $x_a x_b$, observe, once again, that the constant term of the product $p_a(x) p_b(x)$ is $a_0 b_0$, i.e. $x_a x_b$. Indeed:

$p_o(x) := p_a(x) p_b(x) = (\sum_{i=0}^{t-1} a_i x^i) (\sum_{i=0}^{t-1} b_i x^i) = \sum_{i=1}^{2(t-1)} c_i x^i + a_0 b_0$

with some coefficients $c_i$ in $\{0, 1, 2, \dots, E-1\}$ which we'll ignore for now.

If we evaluate this product at point 0, hoping to reveal the shared secret, we get:

$p_o(0) = \underbrace{\sum_{i=1}^{2(t-1)} c_i 0^i}_{=0} + a_0 b_0 = x_a x_b$

So it looks great, doesn't it? Unfortunately, it isn't that simple, for two reasons:

• the polynomial $p_o(x)$ has degree $2(t-1)$, twice the original degree $t-1$ of $p_a(x)$ and of $p_b(x)$. While this may not seem like such a big deal (after all, we still get the result in the constant term), it actually is. Why do we care? Recall that in order to reconstruct the coefficients of a polynomial of degree $t-1$, we need at least $t$ points. But to reconstruct a polynomial of degree $2(t-1)$, we need $2(t-1)+1$ points, i.e. roughly twice as many. We may be lucky if $2(t-1)+1 \le n$, the number of shares (and parties), but with every additional multiplication down the road, the degree of the polynomial doubles again. Soon, we'll run out of points / shares, namely when $2^\ell(t-1)+1 \gt n$ for some $\ell \in \{1, 2, \dots, |G|\}$, where $|G|$ is the number of multiplication gates of the circuit $C_f$.
• the coefficients $c_i$, $i \in \{1,2,\dots,2(t-1)\}$, are not random anymore! Indeed, it can be shown that they are not uniformly distributed over the set $\{0,1,2,\dots,E-1\}$. This is particularly bad, because it weakens the security (privacy) of the scheme: an adversary may guess these coefficients with non-negligible probability, and could therefore also reconstruct the secret $p_o(0)$ with non-negligible probability.
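The degree problem is easy to demonstrate in a toy sketch (everything here is an illustrative assumption for demonstration purposes: the prime `P`, the helpers `share` and `reconstruct`, and the one-share-per-party simplification): the pointwise product of shares lies on a polynomial of degree $2(t-1)$, so $t$ shares no longer determine the secret.

```python
import random

P = 2_147_483_647  # illustrative field prime

def share(secret, t, n):
    """Split `secret` into n Shamir shares; any t of them reconstruct it."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    return [(x, sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]

def reconstruct(points):
    """Lagrange-interpolate the sharing polynomial at x = 0 over GF(P)."""
    secret = 0
    for xi, yi in points:
        lagrange = 1
        for xj, _ in points:
            if xj != xi:
                lagrange = lagrange * (-xj) % P * pow(xi - xj, -1, P) % P
        secret = (secret + yi * lagrange) % P
    return secret

t, n = 3, 5
xa, xb = 1234, 5678
shares_a, shares_b = share(xa, t, n), share(xb, t, n)

# Naive multiplication: each party multiplies its two shares pointwise.
shares_prod = [(x, ya * yb % P) for (x, ya), (_, yb) in zip(shares_a, shares_b)]

# The product lies on a polynomial of degree 2(t-1) = 4, so 2(t-1)+1 = 5
# shares are now needed to reconstruct it ...
assert reconstruct(shares_prod[:2 * (t - 1) + 1]) == xa * xb % P
# ... while t = 3 shares interpolate the wrong polynomial
# (except with negligible probability, about 1/P):
assert reconstruct(shares_prod[:t]) != xa * xb % P
```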

Both issues can be taken care of in one combined step of degree reduction and randomization:

##### One Step Degree Reduction and Randomization (GRR)

For lack of time, I haven't finished this section yet, so please have a look at

• Subsections "The Degree Reduction Step" and "The Randomization Step" of the [BGW88] paper.
• Tal Rabin's talk (see below) from 0:56:27 to 1:07:40 covering multiplication and "One Step Degree Reduction and Randomization (GRR)"

for details

(... section to be expanded)
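To give at least a flavor of the idea, here is a toy sketch of the resharing step, pieced together from the references above (my own illustrative reconstruction, with made-up helper names and a one-share-per-party simplification; see [BGW88] and Tal Rabin's talk for the real protocol): each party reshares its product share with a fresh random degree-$(t-1)$ polynomial, and the parties then take a public Lagrange-weighted linear combination of the subshares they received, which is a local operation.

```python
import random

P = 2_147_483_647  # illustrative field prime

def share(secret, t, n):
    """Split `secret` into n Shamir shares; any t of them reconstruct it."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    return [(x, sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]

def lagrange_at_zero(xs):
    """Public coefficients lam with f(0) = sum(lam[i] * f(xs[i]))."""
    lam = []
    for xi in xs:
        l = 1
        for xj in xs:
            if xj != xi:
                l = l * (-xj) % P * pow(xi - xj, -1, P) % P
        lam.append(l)
    return lam

def reconstruct(points):
    lam = lagrange_at_zero([x for x, _ in points])
    return sum(l * y for l, (_, y) in zip(lam, points)) % P

t, n = 3, 5
m = 2 * (t - 1) + 1   # parties needed to determine the product polynomial
xa, xb = 1234, 5678
shares_a, shares_b = share(xa, t, n), share(xb, t, n)
shares_prod = [(x, ya * yb % P) for (x, ya), (_, yb) in zip(shares_a, shares_b)]

# Step 1 (communication): party i reshares its product share with a fresh
# random degree-(t-1) polynomial, sending one subshare to each party.
subshares = [share(yi, t, n) for _, yi in shares_prod[:m]]

# Step 2 (local): party j combines the subshares it received, weighted by
# the public Lagrange coefficients for the points 1..m.
lam = lagrange_at_zero(list(range(1, m + 1)))
new_shares = [(j, sum(lam[i] * subshares[i][j - 1][1] for i in range(m)) % P)
              for j in range(1, n + 1)]

# The result is a fresh degree-(t-1) sharing of the product x_a * x_b:
assert reconstruct(new_shares[:t]) == xa * xb % P
```

The degree drops back to $t-1$ because the combined polynomial is a linear combination of the fresh degree-$(t-1)$ resharing polynomials, and its coefficients are re-randomized by the same resharing, which addresses both problems of the first attempt at once.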

### On the Malicious Setting

Everything that was shown previously works in the honest-but-curious setting. But when we admit malicious adversaries (parties that deviate from the protocol), we need additional countermeasures to make BGW more robust. Tal Rabin shows in her talk that we can in particular use

• Verifiable Secret Sharing (VSS) [TB89]
• Error Correction Codes in the reconstruction phase
• Proofs of Correct Actions via computation (zero-knowledge proofs) during the multiplication step

See that talk and the [BGW88] paper for further details.

(... section to be expanded)

#### Restoring randomness with Proactive Secret Sharing

This is covered in [HJKY95].

(... section to be written)

### An Open Question for 25 Years

We have seen in the Feasibility Results that

• information theoretic security is possible with BGW if $t \lt n/2$, but that the round complexity depends on the depth of the circuit
• computational security is possible for any $t$, even in a constant number of rounds using the BMR protocol (I may later cover BMR in another crypto bite of this MPC series)

But can we compute any function with information theoretic security in a constant number of rounds? This is (as of 2014) an open question that has been open for 25 years (see the talk below at 1:10:59).

### Sources

Tal Rabin's lecture at the Technion's 2014 Computer Engineering Summer School on Secure Multiparty Computation. Part 1, starting at 44:40.

Shafi Goldwasser, the 'G' of the BGW protocol, talks about it in Lecture 21, BGW MPC Malicious Adversaries of the advanced course MIT 6.875 (Cryptography), Spring 2018:

Benny Pinkas covered the BGW protocol in his talk "Unconditionally Secure Multiparty Computation" at the 1st BIU Winter School on Cryptography, Secure Computation and Efficiency, January 30-February 1st, 2011, organized by the Center for Research in Applied Cryptography and Cyber Security of the Bar-Ilan University. Slides.