Chung, Kai-Min, Michael Mitzenmacher, and Salil P. Vadhan. “
Why simple hash functions work: Exploiting the entropy in a data stream.”
Theory of Computing 9 (2013): 897–945.
Version History: Merge of conference papers from SODA ‘08 (with the same title) and RANDOM ‘08 (entitled “Tight Bounds for Hashing Block Sources”).
Hashing is fundamental to many algorithms and data structures widely used in practice. For the theoretical analysis of hashing, there have been two main approaches. First, one can assume that the hash function is truly random, mapping each data item independently and uniformly to the range. This idealized model is unrealistic because a truly random hash function requires an exponential number of bits (in the length of a data item) to describe. Alternatively, one can provide rigorous bounds on performance when explicit families of hash functions are used, such as 2-universal or \(O(1)\)-wise independent families. For such families, performance guarantees are often noticeably weaker than for ideal hashing.
In practice, however, it is commonly observed that simple hash functions, including 2-universal hash functions, perform as predicted by the idealized analysis for truly random hash functions. In this paper, we try to explain this phenomenon. We demonstrate that the strong performance of universal hash functions in practice can arise naturally from a combination of the randomness of the hash function and the data. Specifically, following the large body of literature on random sources and randomness extraction, we model the data as coming from a “block source,” whereby each new data item has some “entropy” given the previous ones. As long as the Rényi entropy per data item is sufficiently large, it turns out that the performance when choosing a hash function from a 2-universal family is essentially the same as for a truly random hash function. We describe results for several sample applications, including linear probing, chained hashing, balanced allocations, and Bloom filters.
Towards developing our results, we prove tight bounds for hashing block sources, determining the entropy required per block for the distribution of hashed values to be close to uniformly distributed.
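The 2-universal families in question can be illustrated concretely. Below is a minimal sketch (not the paper's construction or analysis) of the classic Carter–Wegman family \(h(x) = ((ax + b) \bmod p) \bmod m\) applied to chained hashing; the table size `m`, the Mersenne prime `p`, and the synthetic high-entropy key stream are illustrative choices only.

```python
import random

def make_hash(m, p=(1 << 61) - 1, rng=random):
    """Sample h from the Carter-Wegman 2-universal family
    h(x) = ((a*x + b) mod p) mod m, for a prime p larger than any key."""
    a = rng.randrange(1, p)   # a != 0
    b = rng.randrange(0, p)
    return lambda x: ((a * x + b) % p) % m

# Chained hashing at constant load: the idealized (truly random) analysis
# predicts O(1) expected chain length, and a 2-universal h on a
# high-entropy stream behaves similarly in this toy experiment.
m = 1024
h = make_hash(m)
keys = random.sample(range(10**9), m)   # stand-in for a high-entropy data stream
table = [[] for _ in range(m)]
for x in keys:
    table[h(x)].append(x)
max_chain = max(len(bucket) for bucket in table)
```

Resampling `h` is cheap (two field elements), which is exactly what makes such families practical compared with truly random functions.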
Haitner, Iftach, Omer Reingold, and Salil Vadhan. “
Efficiency improvements in constructing pseudorandom generators from one-way functions.”
SIAM Journal on Computing 42, no. 3 (2013): 1405–1430.
Version History: Special Issue on STOC ‘10.
We give a new construction of pseudorandom generators from any one-way function. The construction achieves better parameters and is simpler than that given in the seminal work of Håstad, Impagliazzo, Levin, and Luby [SICOMP ’99]. The key to our construction is a new notion of next-block pseudoentropy, which is inspired by the notion of “inaccessible entropy” recently introduced in [Haitner, Reingold, Vadhan, and Wee, STOC ’09]. An additional advantage over previous constructions is that our pseudorandom generators are parallelizable and invoke the one-way function in a non-adaptive manner. Using [Applebaum, Ishai, and Kushilevitz, SICOMP ’06], this implies the existence of pseudorandom generators in NC\(^0\) based on the existence of one-way functions in NC\(^1\).
Reshef, Yakir, and Salil Vadhan. “
On extractors and exposure-resilient functions for sublogarithmic entropy.”
Random Structures & Algorithms 42, no. 3 (2013): 386–401.
Version History: Preliminary version posted as arXiv:1003.4029 (Dec. 2010).
We study resilient functions and exposure-resilient functions in the low-entropy regime. A resilient function (a.k.a. deterministic extractor for oblivious bit-fixing sources) maps any distribution on \(n\)-bit strings in which \(k\) bits are uniformly random and the rest are fixed into an output distribution that is close to uniform. With exposure-resilient functions, all the input bits are random, but we ask that the output be close to uniform conditioned on any subset of \(n-k\) input bits. In this paper, we focus on the case that \(k\) is sublogarithmic in \(n\).
We simplify and improve an explicit construction of resilient functions for \(k\) sublogarithmic in \(n\) due to Kamp and Zuckerman (SICOMP 2006), achieving error exponentially small in \(k\) rather than polynomially small in \(k\). Our main result is that when \(k\) is sublogarithmic in \(n\), the short output length of this construction (\(O(\log k)\) output bits) is optimal for extractors computable by a large class of space-bounded streaming algorithms.
Next, we show that a random function is a resilient function with high probability if and only if \(k\) is superlogarithmic in \(n\), suggesting that our main result may apply more generally. In contrast, we show that a random function is a static (resp. adaptive) exposure-resilient function with high probability even if \(k\) is as small as a constant (resp. \(\log \log n\)). No explicit exposure-resilient functions achieving these parameters are known.
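As a toy illustration of the resilient-function definition (not the Kamp–Zuckerman construction), the parity function is a perfect one-output-bit resilient function: a single uniformly random input bit already makes the XOR of all bits exactly uniform. The brute-force checker below, with the hypothetical helper name `max_bias`, verifies this for small parameters.

```python
from itertools import combinations, product

def parity(bits):
    """XOR of all input bits: a classic one-output-bit resilient function."""
    acc = 0
    for b in bits:
        acc ^= b
    return acc

def max_bias(f, n, k):
    """Worst-case bias of Boolean f over all oblivious bit-fixing sources
    on n bits with exactly k uniformly random positions (brute force)."""
    worst = 0.0
    for free in combinations(range(n), k):
        fixed = [i for i in range(n) if i not in free]
        for fixing in product([0, 1], repeat=n - k):
            ones = 0
            for assignment in product([0, 1], repeat=k):
                x = [0] * n
                for i, v in zip(fixed, fixing):
                    x[i] = v
                for i, v in zip(free, assignment):
                    x[i] = v
                ones += f(x)
            worst = max(worst, abs(ones / 2 ** k - 0.5))
    return worst
```

For contrast, a non-resilient function such as the AND of two input bits has bias 0.5 for some bit-fixing source, since fixing either of those bits to 0 makes the output constant.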
Mahmoody, Mohammad, Tal Moran, and Salil Vadhan. “
Publicly verifiable proofs of sequential work.” In
Innovations in Theoretical Computer Science (ITCS ‘13), 373–388. ACM, 2013.
Version History: Preliminary version posted as Cryptology ePrint Archive Report 2011/553, under title “Non-Interactive Time-Stamping and Proofs of Work in the Random Oracle Model”.
We construct a publicly verifiable protocol for proving computational work based on collision-resistant hash functions and a new plausible complexity assumption regarding the existence of “inherently sequential” hash functions. Our protocol is based on a novel construction of time-lock puzzles. Given a sampled “puzzle” \(\mathcal{P} \overset{\$}{\gets} \mathbf{D}_n\), where \(n\) is the security parameter and \(\mathbf{D}_n\) is the distribution of the puzzles, a corresponding “solution” can be generated using \(N\) evaluations of the sequential hash function, where \(N > n\) is another parameter, while any feasible adversarial strategy for generating valid solutions must take at least as much time as \(\Omega(N)\) sequential evaluations of the hash function after receiving \(\mathcal{P}\). Thus, valid solutions constitute a “proof” that \(\Omega(N)\) parallel time elapsed since \(\mathcal{P}\) was received. Solutions can be publicly and efficiently verified in time \(\mathrm{poly}(n) \cdot \mathrm{polylog}(N)\). Applications of these “time-lock puzzles” include non-interactive time-stamping of documents (when the distribution over the possible documents corresponds to the puzzle distribution \(\mathbf{D}_n\)) and universally verifiable CPU benchmarks.
Our construction is secure in the standard model under complexity assumptions (collision-resistant hash functions and inherently sequential hash functions), and makes black-box use of the underlying primitives. Consequently, the corresponding construction in the random oracle model is secure unconditionally. Moreover, as it is a public-coin protocol, it can be made non-interactive in the random oracle model using the Fiat-Shamir Heuristic.
Our construction makes a novel use of “depth-robust” directed acyclic graphs—ones whose depth remains large even after removing a constant fraction of vertices—which were previously studied for the purpose of complexity lower bounds. The construction bypasses a recent negative result of Mahmoody, Moran, and Vadhan (CRYPTO ‘11) for time-lock puzzles in the random oracle model, which showed that it is impossible to have time-lock puzzles like ours in the random oracle model if the puzzle generator also computes a solution together with the puzzle.
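The sequential-work intuition can be sketched with an iterated hash chain. This toy (assuming SHA-256 as a stand-in for an inherently sequential hash) only illustrates why producing a solution seems to require \(N\) sequential evaluations; it omits the paper's depth-robust-graph machinery, and its naive verifier runs in \(\Theta(N)\) rather than the paper's \(\mathrm{poly}(n) \cdot \mathrm{polylog}(N)\).

```python
import hashlib

def solve(puzzle: bytes, N: int) -> bytes:
    """Iterate the hash N times; each step consumes the previous output,
    so roughly N sequential evaluations appear unavoidable (conjecturally)."""
    h = puzzle
    for _ in range(N):
        h = hashlib.sha256(h).digest()
    return h

def verify(puzzle: bytes, N: int, solution: bytes) -> bool:
    """Naive verifier: recomputes the whole chain. The paper's protocol
    instead verifies in poly(n)*polylog(N) time via commitments to a
    depth-robust graph of hash evaluations."""
    return solve(puzzle, N) == solution
```

Note that this chain is also trivially not publicly verifiable in sublinear time, which is precisely the gap the depth-robust-graph construction closes.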
Rothblum, Guy N., Salil Vadhan, and Avi Wigderson. “
Interactive proofs of proximity: delegating computation in sublinear time.” In
Proceedings of the 45th Annual ACM Symposium on Theory of Computing (STOC ‘13), 793–802. New York, NY: ACM, 2013.
We study interactive proofs with sublinear-time verifiers. These proof systems can be used to ensure approximate correctness for the results of computations delegated to an untrusted server. Following the literature on property testing, we seek proof systems where with high probability the verifier accepts every input in the language, and rejects every input that is far from the language. The verifier’s query complexity (and computation complexity), as well as the communication, should all be sublinear. We call such a proof system an Interactive Proof of Proximity (IPP).

On the positive side, our main result is that all languages in \(\mathcal{NC}\) have Interactive Proofs of Proximity with roughly \(\sqrt{n}\) query and communication complexities, and \(\mathrm{polylog}(n)\) communication rounds.
This is achieved by identifying a natural language, membership in an affine subspace (for a structured class of subspaces), that is complete for constructing interactive proofs of proximity, and providing efficient protocols for it. In building an IPP for this complete language, we show a tradeoff between the query and communication complexity and the number of rounds. For example, we give a 2-round protocol with roughly \(n^{3/4}\) queries and communication.

On the negative side, we show that there exist natural languages in \(\mathcal{NC}^1\), for which the sum of queries and communication in any constant-round interactive proof of proximity must be polynomially related to \(n\). In particular, for any 2-round protocol, the sum of queries and communication must be at least \(\tilde{\Omega}(\sqrt{n})\).

Finally, we construct much better IPPs for specific functions, such as bipartiteness on random or well-mixing graphs, and the majority function. The query complexities of these protocols are provably better (by exponential or polynomial factors) than what is possible in the standard property testing model, i.e., without a prover.
Vadhan, Salil, and Colin Jia Zheng. “
A uniform min-max theorem with applications in cryptography.” In
Ran Canetti and Juan Garay, editors, Advances in Cryptology—CRYPTO ‘13, Lecture Notes in Computer Science, 8042:93–110. Springer-Verlag, 2013.
We present a new, more constructive proof of von Neumann’s Min-Max Theorem for two-player zero-sum games — specifically, an algorithm that builds a near-optimal mixed strategy for the second player from several best responses of the second player to mixed strategies of the first player. The algorithm extends previous work of Freund and Schapire (Games and Economic Behavior ’99) with the advantage that the algorithm runs in poly\((n)\) time even when a pure strategy for the first player is a distribution chosen from a set of distributions over \(\{0,1\}^n\). This extension enables a number of additional applications in cryptography and complexity theory, often yielding uniform security versions of results that were previously only proved for non-uniform security (due to use of the non-constructive Min-Max Theorem).
We describe several applications, including a more modular and improved uniform version of Impagliazzo’s Hardcore Theorem (FOCS ’95), showing impossibility of constructing succinct non-interactive arguments (SNARGs) via black-box reductions under uniform hardness assumptions (using techniques from Gentry and Wichs (STOC ’11) for the non-uniform setting), and efficiently simulating high entropy distributions within any sufficiently nice convex set (extending a result of Trevisan, Tulsiani, and Vadhan (CCC ’09)).
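The constructive flavor of such min-max proofs can be illustrated, in the finite low-dimensional case only, by Freund–Schapire-style multiplicative-weights dynamics: the row player's average loss against repeated best responses converges to the value of the game. The payoff matrix, learning rate, and round count below are illustrative choices, not parameters from the paper.

```python
import math

def approx_game_value(A, T=2000, eta=0.05):
    """Multiplicative-weights play for a zero-sum game with payoff matrix A
    (row player minimizes; column player best-responds each round).
    Returns the row player's average loss, which approaches the game value."""
    m = len(A)
    w = [1.0] * m
    total = 0.0
    for _ in range(T):
        s = sum(w)
        p = [wi / s for wi in w]
        # Column player's best response to the current mixed strategy p.
        j = max(range(len(A[0])),
                key=lambda c: sum(p[i] * A[i][c] for i in range(m)))
        total += sum(p[i] * A[i][j] for i in range(m))
        # Multiplicative-weights update: downweight rows that lost.
        for i in range(m):
            w[i] *= math.exp(-eta * A[i][j])
    return total / T

# Matching pennies (value 1/2): the averaged play approaches 0.5.
A = [[1.0, 0.0],
     [0.0, 1.0]]
```

Averaging the column player's best responses over the rounds likewise yields a near-optimal mixed strategy, which is the constructive content the min-max theorem alone does not provide.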
Reingold, Omer, Thomas Steinke, and Salil Vadhan. “
Pseudorandomness for regular branching programs via Fourier analysis.” In
Sofya Raskhodnikova and José Rolim, editors, Proceedings of the 17th International Workshop on Randomization and Computation (RANDOM ‘13), Lecture Notes in Computer Science, 8096:655–670. Springer-Verlag, 2013.
Version History: Full version posted as ECCC TR13-086 and arXiv:1306.3004 [cs.CC].
We present an explicit pseudorandom generator for oblivious, read-once, permutation branching programs of constant width that can read their input bits in any order. The seed length is \(O(\log^2 n)\), where \(n\) is the length of the branching program. The previous best seed length known for this model was \(n^{1/2+o(1)}\), which follows as a special case of a generator due to Impagliazzo, Meka, and Zuckerman (FOCS 2012) (which gives a seed length of \(s^{1/2+o(1)}\) for arbitrary branching programs of size \(s\)). Our techniques also give seed length \(n^{1/2+o(1)}\) for general oblivious, read-once branching programs of width \(2^{n^{o(1)}}\), which is incomparable to the results of Impagliazzo et al.
Our pseudorandom generator is similar to the one used by Gopalan et al. (FOCS 2012) for read-once CNFs, but the analysis is quite different; ours is based on Fourier analysis of branching programs. In particular, we show that an oblivious, read-once, regular branching program of width \(w\) has Fourier mass at most \((2w^2)^k\) at level \(k\), independent of the length of the program.
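The level-wise Fourier mass in this claim can be checked by brute force for a tiny example. Below, a width-3 permutation branching program (a mod-3 counter over the ones in the input, where each bit applies either the identity or a cyclic shift to the 3 states) has its L1 Fourier mass computed at every level; the function names are illustrative, and the exhaustive computation is only feasible for small \(n\).

```python
from itertools import product

def mod3_accept(x):
    """Width-3 regular permutation branching program: accept iff the
    number of ones is 0 mod 3."""
    return 1 if sum(x) % 3 == 0 else 0

def fourier_mass_by_level(f, n):
    """L1 Fourier mass of Boolean f at each level k, by brute force
    over all 2^n inputs and all 2^n characters chi_S."""
    points = list(product([0, 1], repeat=n))
    mass = [0.0] * (n + 1)
    for S in product([0, 1], repeat=n):
        coeff = sum(f(x) * (-1) ** sum(a * b for a, b in zip(S, x))
                    for x in points) / 2 ** n
        mass[sum(S)] += abs(coeff)
    return mass
```

For instance, the level-0 mass is just the acceptance probability of the program, and the per-level masses stay bounded independently of how long the program is, consistent with the \((2w^2)^k\) bound.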