STOC 2021 – Author Index 

Aamand, Anders 
STOC '21: "Load Balancing with Dynamic ..."
Load Balancing with Dynamic Set of Balls and Bins
Anders Aamand, Jakob Bæk Tejs Knudsen, and Mikkel Thorup (University of Copenhagen, Denmark) In dynamic load balancing, we wish to distribute balls into bins in an environment where both balls and bins can be added and removed. We want to minimize the maximum load of any bin, but we also want to minimize the number of balls and bins that are affected when adding or removing a ball or a bin. We want a hashing-style solution where, given the ID of a ball, we can find its bin efficiently. We are given a user-specified balancing parameter c=1+ε, where ε∈(0,1). Let n and m be the current number of balls and bins. Then we want no bin with load above C=⌈cn/m⌉, referred to as the capacity of the bins. We present a scheme where we can locate a ball by checking 1+O(log(1/ε)) bins in expectation. When inserting or deleting a ball, we expect to move O(1/ε) balls, and when inserting or deleting a bin, we expect to move O(C/ε) balls. Previous bounds were off by a factor 1/ε. The above bounds are best possible when C=O(1), but for larger C we can do much better: we define f=εC when C ≤ log(1/ε), f=ε√C·√(log(1/(ε√C))) when log(1/ε) ≤ C < 1/(2ε²), and f=1 when C ≥ 1/(2ε²). We show that we expect to move O(1/f) balls when inserting or deleting a ball, and O(C/f) balls when inserting or deleting a bin. Moreover, when C ≥ log(1/ε), we can search for a ball by checking only O(1) bins in expectation. For the bounds with larger C, we first have to resolve a much simpler probabilistic problem. Place n balls in m bins of capacity C, one ball at a time. Each ball picks a uniformly random non-full bin. We show that in expectation and with high probability, the fraction of non-full bins is Θ(f). Then the expected number of bins that a new ball would have to visit to find one that is not full is Θ(1/f). As it turns out, this is also the complexity of an insertion in our more complicated scheme where both balls and bins can be added and removed. 
@InProceedings{STOC21p1262, author = {Anders Aamand and Jakob Bæk Tejs Knudsen and Mikkel Thorup}, title = {Load Balancing with Dynamic Set of Balls and Bins}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1262--1275}, doi = {10.1145/3406325.3451107}, year = {2021}, } 
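The simpler probabilistic process described at the end of the abstract is easy to simulate. The sketch below is an illustrative simulation only (the function name and parameters are mine, and this is not the authors' hashing scheme): it places n balls one at a time, each into a uniformly random non-full bin of capacity C, and reports the final fraction of non-full bins.

```python
import random

def fraction_nonfull(n, m, C, seed=0):
    """Place n balls into m bins of capacity C, one ball at a time,
    each ball choosing a uniformly random non-full bin.
    Returns the fraction of non-full bins at the end."""
    assert n <= m * C, "cannot place more balls than total capacity"
    rng = random.Random(seed)
    load = [0] * m
    nonfull = list(range(m))          # indices of bins with load < C
    for _ in range(n):
        i = rng.randrange(len(nonfull))
        b = nonfull[i]
        load[b] += 1
        if load[b] == C:              # bin became full: swap-remove it
            nonfull[i] = nonfull[-1]
            nonfull.pop()
    return len(nonfull) / m
```

Note the deterministic lower bound: since full bins hold exactly C balls, at least (mC−n)/C bins must remain non-full, which the simulation respects regardless of the random choices.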

Aamari, Eddie 
STOC '21: "Statistical Query Complexity ..."
Statistical Query Complexity of Manifold Estimation
Eddie Aamari and Alexander Knop (LPSM, France; Sorbonne University, France; University of Paris, France; CNRS, France; University of California at San Diego, USA) This paper studies the statistical query (SQ) complexity of estimating d-dimensional submanifolds in ℝ^{n}. We propose a purely geometric algorithm called Manifold Propagation, which reduces the problem to three natural geometric routines: projection, tangent space estimation, and point detection. We then provide constructions of these geometric routines in the SQ framework. Given an adversarial STAT(τ) oracle and a target Hausdorff distance precision ε = Ω(τ^{2/(d+1)}), the resulting SQ manifold reconstruction algorithm has query complexity O(n polylog(n) ε^{−d/2}), which is proved to be nearly optimal. In the process, we establish low-rank matrix completion results for SQs and lower bounds for randomized SQ estimators in general metric spaces. @InProceedings{STOC21p116, author = {Eddie Aamari and Alexander Knop}, title = {Statistical Query Complexity of Manifold Estimation}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {116--122}, doi = {10.1145/3406325.3451135}, year = {2021}, } 

Aaronson, Scott 
STOC '21: "Degree vs. Approximate Degree ..."
Degree vs. Approximate Degree and Quantum Implications of Huang’s Sensitivity Theorem
Scott Aaronson, Shalev Ben-David, Robin Kothari, Shravas Rao, and Avishay Tal (University of Texas at Austin, USA; University of Waterloo, Canada; Microsoft Quantum, USA; Microsoft Research, USA; Northwestern University, USA; University of California at Berkeley, USA) Based on the recent breakthrough of Huang (2019), we show that for any total Boolean function f: (1) deg(f) = O(adeg(f)^2), i.e., the degree of f is at most quadratic in the approximate degree of f; this is optimal, as witnessed by the OR function. (2) D(f) = O(Q(f)^4), i.e., the deterministic query complexity of f is at most quartic in the quantum query complexity of f; this matches the known separation (up to log factors) due to Ambainis, Balodis, Belovs, Lee, Santha, and Smotrovs (2017). We apply these results to resolve the quantum analogue of the Aanderaa–Karp–Rosenberg conjecture. We show that if f is a non-trivial monotone graph property of an n-vertex graph specified by its adjacency matrix, then Q(f) = Ω(n), which is also optimal. We also show that the approximate degree of any read-once formula on n variables is Θ(√n). @InProceedings{STOC21p1330, author = {Scott Aaronson and Shalev Ben-David and Robin Kothari and Shravas Rao and Avishay Tal}, title = {Degree vs. Approximate Degree and Quantum Implications of Huang’s Sensitivity Theorem}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1330--1342}, doi = {10.1145/3406325.3451047}, year = {2021}, } 
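For small n, the exact degree deg(f) appearing in the theorem can be computed directly from the multilinear (Möbius) expansion of f; a minimal sketch follows (the function name is mine; the approximate degree adeg(f), which requires solving a linear program, is not computed here).

```python
from itertools import product

def degree(f, n):
    """Degree of the unique multilinear real polynomial agreeing with the
    Boolean function f on {0,1}^n.  The coefficient of the monomial
    prod_{i in S} x_i is the Moebius sum
    sum_{T subseteq S} (-1)^{|S|-|T|} f(T)."""
    best = 0
    for S in product([0, 1], repeat=n):          # S = candidate monomial
        coeff = 0
        for T in product([0, 1], repeat=n):      # subsets T of S
            if all(t <= s for t, s in zip(T, S)):
                coeff += (-1) ** (sum(S) - sum(T)) * f(T)
        if coeff != 0:
            best = max(best, sum(S))
    return best
```

For instance, OR and parity on 3 bits both have full degree 3, while a dictator function has degree 1.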

Abboud, Amir 
STOC '21: "Subcubic Algorithms for Gomory–Hu ..."
Subcubic Algorithms for Gomory–Hu Tree in Unweighted Graphs
Amir Abboud, Robert Krauthgamer, and Ohad Trabelsi (Weizmann Institute of Science, Israel) Every undirected graph G has a (weighted) cut-equivalent tree T, commonly named after Gomory and Hu who discovered it in 1961. Both T and G have the same node set, and for every node pair s,t, the minimum (s,t)-cut in T is also an exact minimum (s,t)-cut in G. We give the first subcubic-time algorithm that constructs such a tree for a simple graph G (unweighted with no parallel edges). Its time complexity is Õ(n^{2.5}), for n=|V(G)|; previously, only Õ(n^{3}) was known, except for restricted cases like sparse graphs. Consequently, we obtain the first algorithm for All-Pairs Max-Flow in simple graphs that breaks the cubic-time barrier. Gomory and Hu compute this tree using n−1 queries to (single-pair) Max-Flow; the new algorithm can be viewed as a fine-grained reduction to Õ(√n) Max-Flow computations on n-node graphs. @InProceedings{STOC21p1725, author = {Amir Abboud and Robert Krauthgamer and Ohad Trabelsi}, title = {Subcubic Algorithms for Gomory–Hu Tree in Unweighted Graphs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1725--1737}, doi = {10.1145/3406325.3451073}, year = {2021}, } 
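For intuition, the classical construction with n−1 min-cut queries can be sketched at toy scale in Gusfield's simplified variant, substituting a brute-force min-cut for the Max-Flow subroutine (all names are mine; this is an illustration of the cut-equivalent tree, nowhere near the paper's Õ(n^{2.5}) algorithm).

```python
from itertools import combinations

def min_cut(adj, n, s, t):
    """Brute-force minimum (s,t)-cut in an unweighted graph on n nodes.
    Returns (cut value, vertex side containing s).  Exponential; toy-sized only."""
    best, side = float("inf"), None
    rest = [v for v in range(n) if v not in (s, t)]
    for k in range(len(rest) + 1):
        for extra in combinations(rest, k):
            S = {s, *extra}
            val = sum(1 for u, v in adj if (u in S) != (v in S))
            if val < best:
                best, side = val, S
    return best, side

def gomory_hu(adj, n):
    """Gusfield's variant of Gomory-Hu: n-1 min-cut calls, no contractions.
    Returns tree edges (i, parent[i], weight)."""
    parent = [0] * n
    weight = [0] * n
    for i in range(1, n):
        w, S = min_cut(adj, n, i, parent[i])
        weight[i] = w
        for j in range(i + 1, n):
            if parent[j] == parent[i] and j in S:
                parent[j] = i
    return [(i, parent[i], weight[i]) for i in range(1, n)]
```

The defining property can then be checked exhaustively: for every pair s,t, the minimum edge weight on the tree path between s and t equals the minimum (s,t)-cut in the graph.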

Aho, Alfred V. 
STOC '21: "Computational Thinking in ..."
Computational Thinking in Programming Language and Compiler Design (Keynote)
Alfred V. Aho (Columbia University, USA) Abstractions and algorithms are at the heart of computational thinking. In this talk I will discuss the evolution of the theory and practice of programming language and compiler design through the lens of computational thinking. Many of the key concepts in this area were introduced at the ACM Symposium on the Theory of Computing. @InProceedings{STOC21p1, author = {Alfred V. Aho}, title = {Computational Thinking in Programming Language and Compiler Design (Keynote)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1--1}, doi = {10.1145/3406325.3465350}, year = {2021}, } 

Alimohammadi, Yeganeh 
STOC '21: "Fractionally Log-Concave and ..."
Fractionally Log-Concave and Sector-Stable Polynomials: Counting Planar Matchings and More
Yeganeh Alimohammadi, Nima Anari, Kirankumar Shiragur, and Thuy-Duong Vuong (Stanford University, USA) We show fully polynomial-time randomized approximation schemes (FPRAS) for counting matchings of a given size, or more generally sampling/counting monomer-dimer systems in planar, not-necessarily-bipartite, graphs. While perfect matchings on planar graphs can be counted exactly in polynomial time, counting non-perfect matchings was shown by Jerrum (J Stat Phys 1987) to be #P-hard, who also raised the question of whether efficient approximate counting is possible. We answer this affirmatively by showing that the multi-site Glauber dynamics on the set of monomers in a monomer-dimer system always mixes rapidly, and that this dynamics can be implemented efficiently on downward-closed families of graphs where counting perfect matchings is tractable. As further applications of our results, we show how to sample efficiently using multi-site Glauber dynamics from partition-constrained strongly Rayleigh distributions, and nonsymmetric determinantal point processes. In order to analyze mixing properties of the multi-site Glauber dynamics, we establish two notions for generating polynomials of discrete set-valued distributions: sector-stability and fractional log-concavity. These notions generalize well-studied properties like real-stability and log-concavity, but, unlike them, degrade robustly under useful transformations applied to the distribution. We relate these notions to pairwise correlations in the underlying distribution and the notion of spectral independence introduced by Anari et al. (FOCS 2020), providing a new tool for establishing spectral independence based on geometry of polynomials. 
As a byproduct of our techniques, we show that polynomials avoiding roots in a sector of the complex plane must satisfy what we call fractional log-concavity; this generalizes a classic result established by Gårding (J Math Mech 1959), who showed that homogeneous polynomials with no roots in a half-plane must be log-concave over the positive orthant. @InProceedings{STOC21p433, author = {Yeganeh Alimohammadi and Nima Anari and Kirankumar Shiragur and Thuy-Duong Vuong}, title = {Fractionally Log-Concave and Sector-Stable Polynomials: Counting Planar Matchings and More}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {433--446}, doi = {10.1145/3406325.3451123}, year = {2021}, } 
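The objects being counted are easy to enumerate at toy scale. The recursion below computes the number of matchings of every size exactly; it takes exponential time in general, which is precisely the blow-up the paper's FPRAS avoids on large graphs (the function name is mine).

```python
def matchings_by_size(edges):
    """Count matchings of every size via the standard edge recursion:
    either the first edge is unused, or it is used and every edge
    touching its endpoints is removed.  Returns {size: count}."""
    if not edges:
        return {0: 1}
    (u, v), rest = edges[0], edges[1:]
    counts = dict(matchings_by_size(rest))                    # skip edge (u, v)
    reduced = [e for e in rest if u not in e and v not in e]  # use edge (u, v)
    for size, c in matchings_by_size(reduced).items():
        counts[size + 1] = counts.get(size + 1, 0) + c
    return counts
```

On the 4-cycle, for example, this yields one empty matching, four single edges, and two perfect matchings.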

Alman, Josh 
STOC '21: "Kronecker Products, Low-Depth ..."
Kronecker Products, Low-Depth Circuits, and Matrix Rigidity
Josh Alman (Harvard University, USA) For a matrix M and a positive integer r, the rank-r rigidity of M is the smallest number of entries of M which one must change to make its rank at most r. There are many known applications of rigidity lower bounds to a variety of areas in complexity theory, but fewer known applications of rigidity upper bounds. In this paper, we use rigidity upper bounds to prove new upper bounds in a few different models of computation. Our results include: (1) For any d>1, and over any field F, the N × N Walsh–Hadamard transform has a depth-d linear circuit of size O(d · N^{1 + 0.96/d}). This circumvents a known lower bound of Ω(d · N^{1 + 1/d}) for circuits with bounded coefficients over ℂ by Pudlák (2000), by using coefficients of magnitude polynomial in N. Our construction also generalizes to linear transformations given by a Kronecker power of any fixed 2 × 2 matrix. (2) The N × N Walsh–Hadamard transform has a linear circuit of size ≤ (1.81 + o(1)) N log_{2} N, improving on the bound of ≈ 1.88 N log_{2} N which one obtains from the standard fast Walsh–Hadamard transform. (3) A new rigidity upper bound, showing that the following classes of matrices are not rigid enough to prove circuit lower bounds using Valiant’s approach: (a) for any field F and any function f : {0,1}^{n} → F, the matrix V_{f} ∈ F^{2^n × 2^n} given by V_{f}[x,y] = f(x ∧ y) for x,y ∈ {0,1}^{n}, and (b) for any field F and any fixed-size matrices M_{1}, …, M_{n} ∈ F^{q × q}, the Kronecker product M_{1} ⊗ M_{2} ⊗ ⋯ ⊗ M_{n}. This generalizes recent results on non-rigidity, using a simpler approach which avoids needing the polynomial method. (4) New connections between recursive linear transformations like Fourier and Walsh–Hadamard transforms, and circuits for matrix multiplication. 
@InProceedings{STOC21p772, author = {Josh Alman}, title = {Kronecker Products, Low-Depth Circuits, and Matrix Rigidity}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {772--785}, doi = {10.1145/3406325.3451008}, year = {2021}, } 
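For reference, the standard fast Walsh–Hadamard transform that the abstract's ≈ 1.88 N log₂ N circuit-size bound refers to uses N log₂ N additions and subtractions. A minimal sketch of that textbook butterfly (not the paper's smaller circuit):

```python
def fwht(a):
    """Fast Walsh-Hadamard transform of a sequence whose length is a power
    of two: output[x] = sum_y (-1)^{<x,y>} a[y], computed with N log2 N
    additions/subtractions via the standard butterfly network."""
    a = list(a)
    n, h = len(a), 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                x, y = a[j], a[j + h]
                a[j], a[j + h] = x + y, x - y   # one butterfly: 2 ops
        h *= 2
    return a
```

Each of the log₂ N rounds performs N/2 butterflies of two operations each, giving the N log₂ N operation count against which the paper's (1.81 + o(1)) N log₂ N circuit improves.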

Alon, Noga 
STOC '21: "Boosting Simple Learners ..."
Boosting Simple Learners
Noga Alon, Alon Gonen, Elad Hazan, and Shay Moran (Princeton University, USA; Tel Aviv University, Israel; OrCam, Israel; Google AI, USA; Technion, Israel; Google Research, Israel) Boosting is a celebrated machine learning approach based on the idea of combining weak and moderately inaccurate hypotheses into a strong and accurate one. We study boosting under the assumption that the weak hypotheses belong to a class of bounded capacity. This assumption is inspired by the common convention that weak hypotheses are “rules of thumb” from an “easy-to-learn class” (Schapire and Freund ’12; Shalev-Shwartz and Ben-David ’14). Formally, we assume the class of weak hypotheses has bounded VC dimension. We focus on two main questions: (i) Oracle complexity: How many weak hypotheses are needed in order to produce an accurate hypothesis? We design a novel boosting algorithm and demonstrate that it circumvents a classical lower bound by Freund and Schapire (’95, ’12). Whereas the lower bound shows that Ω(1/γ^{2}) weak hypotheses with γ-margin are sometimes necessary, our new method requires only Õ(1/γ) weak hypotheses, provided that they belong to a class of bounded VC dimension. Unlike previous boosting algorithms, which aggregate the weak hypotheses by majority votes, the new boosting algorithm uses more complex (“deeper”) aggregation rules. We complement this result by showing that complex aggregation rules are in fact necessary to circumvent the aforementioned lower bound. (ii) Expressivity: Which tasks can be learned by boosting weak hypotheses from a bounded VC class? Can complex concepts that are “far away” from the class be learned? Towards answering the first question we identify a combinatorial-geometric parameter which captures the expressivity of base classes in boosting. As a corollary we provide an affirmative answer to the second question for many well-studied classes, including halfspaces and decision stumps. 
Along the way, we establish and exploit connections with Discrepancy Theory. @InProceedings{STOC21p481, author = {Noga Alon and Alon Gonen and Elad Hazan and Shay Moran}, title = {Boosting Simple Learners}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {481--489}, doi = {10.1145/3406325.3451030}, year = {2021}, } STOC '21: "Adversarial Laws of Large ..." Adversarial Laws of Large Numbers and Optimal Regret in Online Classification Noga Alon, Omri Ben-Eliezer, Yuval Dagan, Shay Moran, Moni Naor, and Eylon Yogev (Princeton University, USA; Tel Aviv University, Israel; Harvard University, USA; Massachusetts Institute of Technology, USA; Technion, Israel; Google Research, Israel; Weizmann Institute of Science, Israel; Boston University, USA) Laws of large numbers guarantee that given a large enough sample from some population, the measure of any fixed subpopulation is well-estimated by its frequency in the sample. We study laws of large numbers in sampling processes that can affect the environment they are acting upon and interact with it. Specifically, we consider the sequential sampling model proposed by Ben-Eliezer and Yogev (2020), and characterize the classes which admit a uniform law of large numbers in this model: these are exactly the classes that are online learnable. Our characterization may be interpreted as an online analogue of the equivalence between learnability and uniform convergence in statistical (PAC) learning. The sample-complexity bounds we obtain are tight for many parameter regimes, and as an application, we determine the optimal regret bounds in online learning, stated in terms of Littlestone’s dimension, thus resolving the main open question from Ben-David, Pál, and Shalev-Shwartz (2009), which was also posed by Rakhlin, Sridharan, and Tewari (2015). 
@InProceedings{STOC21p447, author = {Noga Alon and Omri Ben-Eliezer and Yuval Dagan and Shay Moran and Moni Naor and Eylon Yogev}, title = {Adversarial Laws of Large Numbers and Optimal Regret in Online Classification}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {447--455}, doi = {10.1145/3406325.3451041}, year = {2021}, } 

Alweiss, Ryan 
STOC '21: "Discrepancy Minimization via ..."
Discrepancy Minimization via a Self-Balancing Walk
Ryan Alweiss, Yang P. Liu, and Mehtaab Sawhney (Princeton University, USA; Stanford University, USA; Massachusetts Institute of Technology, USA) We study discrepancy minimization for vectors in ℝ^{n} under various settings. The main result is the analysis of a new simple random process in high dimensions through a comparison argument. As corollaries, we obtain bounds which are tight up to logarithmic factors for online vector balancing against oblivious adversaries, resolving several questions posed by Bansal, Jiang, Singla, and Sinha (STOC 2020), as well as a linear-time algorithm for logarithmic bounds for the Komlós conjecture. @InProceedings{STOC21p14, author = {Ryan Alweiss and Yang P. Liu and Mehtaab Sawhney}, title = {Discrepancy Minimization via a Self-Balancing Walk}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {14--20}, doi = {10.1145/3406325.3450994}, year = {2021}, } 
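The random process at the heart of the result is strikingly simple to state: bias each online sign choice against the current prefix sum. Below is a sketch of a self-balancing walk of this flavor (my own parameter choices and clamping; in the paper the parameter c is taken to be Θ(log(nT)) and the low-probability overflow event is analyzed carefully).

```python
import random

def self_balancing_walk(vectors, c, seed=0):
    """Online signing of vectors (each of norm <= 1): choose sign +1 with
    probability 1/2 - <w, v>/(2c), where w is the current signed prefix
    sum, so the walk is always biased back toward the origin.
    Returns (signs, final prefix sum)."""
    rng = random.Random(seed)
    w = [0.0] * len(vectors[0])
    signs = []
    for v in vectors:
        ip = sum(wi * vi for wi, vi in zip(w, v))
        ip = max(-c, min(c, ip))       # clamp; overflow is a rare failure event
        s = 1 if rng.random() < 0.5 - ip / (2 * c) else -1
        signs.append(s)
        w = [wi + s * vi for wi, vi in zip(w, v)]
    return signs, w
```

When the inputs are standard basis vectors, each coordinate performs a biased walk that is pushed back toward zero, so the prefix sum deterministically stays within c+1 in absolute value.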

Anari, Nima 
STOC '21: "Log-Concave Polynomials in ..."
Log-Concave Polynomials in Theory and Applications (Tutorial)
Nima Anari and Cynthia Vinzant (Stanford University, USA; North Carolina State University, USA; University of Washington, USA) Log-concave polynomials give rise to discrete probability distributions with several nice properties. In particular, log-concavity of the generating polynomial guarantees the existence of efficient algorithms for approximately sampling from a distribution and finding the size of its support. This class of distributions contains several important examples, including uniform measures over bases or independent sets of matroids, determinantal point processes and strongly Rayleigh measures, measures defined by mixed volumes in Minkowski sums, the random cluster model in certain regimes, and more. In this tutorial, we will introduce the theory and applications of log-concave polynomials and survey some of the recent developments in this area. @InProceedings{STOC21p12, author = {Nima Anari and Cynthia Vinzant}, title = {Log-Concave Polynomials in Theory and Applications (Tutorial)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {12--12}, doi = {10.1145/3406325.3465351}, year = {2021}, } STOC '21: "Fractionally Log-Concave and ..." Fractionally Log-Concave and Sector-Stable Polynomials: Counting Planar Matchings and More Yeganeh Alimohammadi, Nima Anari, Kirankumar Shiragur, and Thuy-Duong Vuong (Stanford University, USA) We show fully polynomial-time randomized approximation schemes (FPRAS) for counting matchings of a given size, or more generally sampling/counting monomer-dimer systems in planar, not-necessarily-bipartite, graphs. While perfect matchings on planar graphs can be counted exactly in polynomial time, counting non-perfect matchings was shown by Jerrum (J Stat Phys 1987) to be #P-hard, who also raised the question of whether efficient approximate counting is possible. 
We answer this affirmatively by showing that the multi-site Glauber dynamics on the set of monomers in a monomer-dimer system always mixes rapidly, and that this dynamics can be implemented efficiently on downward-closed families of graphs where counting perfect matchings is tractable. As further applications of our results, we show how to sample efficiently using multi-site Glauber dynamics from partition-constrained strongly Rayleigh distributions, and nonsymmetric determinantal point processes. In order to analyze mixing properties of the multi-site Glauber dynamics, we establish two notions for generating polynomials of discrete set-valued distributions: sector-stability and fractional log-concavity. These notions generalize well-studied properties like real-stability and log-concavity, but, unlike them, degrade robustly under useful transformations applied to the distribution. We relate these notions to pairwise correlations in the underlying distribution and the notion of spectral independence introduced by Anari et al. (FOCS 2020), providing a new tool for establishing spectral independence based on geometry of polynomials. As a byproduct of our techniques, we show that polynomials avoiding roots in a sector of the complex plane must satisfy what we call fractional log-concavity; this generalizes a classic result established by Gårding (J Math Mech 1959), who showed that homogeneous polynomials with no roots in a half-plane must be log-concave over the positive orthant. @InProceedings{STOC21p433, author = {Yeganeh Alimohammadi and Nima Anari and Kirankumar Shiragur and Thuy-Duong Vuong}, title = {Fractionally Log-Concave and Sector-Stable Polynomials: Counting Planar Matchings and More}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {433--446}, doi = {10.1145/3406325.3451123}, year = {2021}, } STOC '21: "Log-Concave Polynomials IV: ..." 
Log-Concave Polynomials IV: Approximate Exchange, Tight Mixing Times, and Near-Optimal Sampling of Forests Nima Anari, Kuikui Liu, Shayan Oveis Gharan, Cynthia Vinzant, and Thuy-Duong Vuong (Stanford University, USA; University of Washington, USA; North Carolina State University, USA) We prove tight mixing time bounds for natural random walks on bases of matroids, determinantal distributions, and more generally distributions associated with log-concave polynomials. For a matroid of rank k on a ground set of n elements, or more generally distributions associated with log-concave polynomials of homogeneous degree k on n variables, we show that the down-up random walk, started from an arbitrary point in the support, mixes in time O(k log k). Our bound has no dependence on n or the starting point, unlike the previous analyses of Anari et al. (STOC 2019) and Cryan et al. (FOCS 2019), and is tight up to constant factors. The main new ingredient is a property we call approximate exchange, a generalization of well-studied exchange properties for matroids and valuated matroids, which may be of independent interest. In particular, given a distribution µ over size-k subsets of [n], our approximate exchange property implies that a simple local search algorithm gives a k^{O(k)}-approximation of max_{S} µ(S) when µ is generated by a log-concave polynomial, and that greedy gives the same approximation ratio when µ is strongly Rayleigh. As an application, we show how to leverage down-up random walks to approximately sample random forests or random spanning trees in a graph with n edges in time O(n log^{2} n). The best known result for sampling random forests was an FPAUS with high polynomial runtime recently found by Anari et al. (STOC 2019) and Cryan et al. (FOCS 2019). For spanning trees, we improve on the almost-linear time algorithm by Schild (STOC 2018). Our analysis works on weighted graphs too, and is the first to achieve nearly-linear running time for these problems. 
Our algorithms can be naturally extended to support approximately sampling from random forests of size between k_{1} and k_{2} in time O(n log^{2} n), for fixed parameters k_{1}, k_{2}. @InProceedings{STOC21p408, author = {Nima Anari and Kuikui Liu and Shayan Oveis Gharan and Cynthia Vinzant and Thuy-Duong Vuong}, title = {Log-Concave Polynomials IV: Approximate Exchange, Tight Mixing Times, and Near-Optimal Sampling of Forests}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {408--420}, doi = {10.1145/3406325.3451091}, year = {2021}, } 
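The down-up random walk analyzed in the paper is easy to state for spanning trees (the bases of the graphic matroid, with k = n−1): drop a uniformly random edge of the current tree, then add back a uniformly random graph edge that reconnects the two components. A toy sketch follows (function names are mine; the connectivity checks here are brute force, unlike the paper's nearly-linear implementation).

```python
import random
from itertools import combinations

def is_spanning_tree(edges, n):
    """Check that `edges` is a spanning tree on vertices 0..n-1 (union-find)."""
    if len(edges) != n - 1:
        return False
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False        # cycle
        parent[ru] = rv
    return True

def down_up_walk(graph_edges, n, steps, seed=0):
    """Down-up walk on spanning trees: remove a uniformly random tree edge,
    then add a uniformly random graph edge reconnecting the two components
    (the removed edge is itself a candidate).  Yields the tree each step."""
    rng = random.Random(seed)
    # start from an arbitrary spanning tree (brute-force search; toy-sized)
    tree = next(frozenset(t) for t in combinations(graph_edges, n - 1)
                if is_spanning_tree(t, n))
    for _ in range(steps):
        e = rng.choice(sorted(tree))
        rest = tree - {e}
        candidates = [f for f in graph_edges
                      if is_spanning_tree(rest | {f}, n)]
        tree = rest | {rng.choice(candidates)}
        yield tree
```

Every state of the walk is again a spanning tree, and on a connected graph the walk explores the whole set of spanning trees.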

Arenas, Marcelo 
STOC '21: "A Polynomial-Time Approximation ..."
A Polynomial-Time Approximation Algorithm for Counting Words Accepted by an NFA (Invited Paper)
Marcelo Arenas, Luis Alberto Croquevielle, Rajesh Jayaram, and Cristian Riveros (PUC, Chile; IMFD, Chile; Carnegie Mellon University, USA) Counting the number of words of a certain length accepted by a nondeterministic finite automaton (NFA) is a fundamental problem, which has many applications in different areas such as graph databases, knowledge compilation, and information extraction. Along with this, generating such words uniformly at random is also a relevant problem, particularly in scenarios where returning varied outputs is a desirable feature. The previous problems are formalized as follows. The input of #NFA is an NFA N and a length k given in unary (that is, given as a string 0^k), and the task is to compute the number of strings of length k accepted by N. The input of GEN-NFA is the same as for #NFA, but now the task is to generate, uniformly at random, a string of length k accepted by N. It is known that #NFA is #P-complete, so an efficient algorithm to compute this function exactly is not expected to exist. However, this does not preclude the existence of an efficient approximation algorithm for it. In this talk, we will show that #NFA admits a fully polynomial-time randomized approximation scheme (FPRAS). Prior to our work, it was open whether #NFA admits an FPRAS; in fact, the best randomized approximation scheme known for #NFA ran in time n^{O(log n)}. Besides, we will mention some consequences and applications of our results. In particular, from well-known results on counting and uniform generation, we obtain that GEN-NFA admits a fully polynomial-time almost uniform generator. Moreover, as #NFA is SpanL-complete under polynomial-time parsimonious reductions, we obtain that every function in the complexity class SpanL admits an FPRAS. 
@InProceedings{STOC21p4, author = {Marcelo Arenas and Luis Alberto Croquevielle and Rajesh Jayaram and Cristian Riveros}, title = {A Polynomial-Time Approximation Algorithm for Counting Words Accepted by an NFA (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {4--4}, doi = {10.1145/3406325.3465353}, year = {2021}, } STOC '21: "When Is Approximate Counting ..." When Is Approximate Counting for Conjunctive Queries Tractable? Marcelo Arenas, Luis Alberto Croquevielle, Rajesh Jayaram, and Cristian Riveros (PUC, Chile; IMFD, Chile; Carnegie Mellon University, USA) Conjunctive queries are one of the most common classes of queries used in database systems, and the best studied in the literature. A seminal result of Grohe, Schwentick, and Segoufin (STOC 2001) demonstrates that for every class G of graphs, the evaluation of all conjunctive queries whose underlying graph is in G is tractable if, and only if, G has bounded treewidth. In this work, we extend this characterization to the counting problem for conjunctive queries. Specifically, for every class C of conjunctive queries with bounded treewidth, we introduce the first fully polynomial-time randomized approximation scheme (FPRAS) for counting answers to a query in C, and the first polynomial-time algorithm for sampling answers uniformly from a query in C. As a corollary, it follows that for every class G of graphs, the counting problem for conjunctive queries whose underlying graph is in G admits an FPRAS if, and only if, G has bounded treewidth (unless BPP is different from P). In fact, our FPRAS is more general, and also applies to conjunctive queries with bounded hypertree width, as well as unions of such queries. The key ingredient in our proof is the resolution of a fundamental counting problem from automata theory. 
Specifically, we demonstrate the first FPRAS and polynomial-time sampler for the set of trees of size n accepted by a tree automaton, which improves the prior quasi-polynomial time randomized approximation scheme (QPRAS) and sampling algorithm of Gore, Jerrum, Kannan, Sweedyk, and Mahaney (1997). We demonstrate how this algorithm can be used to obtain an FPRAS for many open problems, such as counting solutions to constraint satisfaction problems (CSPs) with bounded hypertree width, counting the number of error threads in programs with nested call subroutines, and counting valid assignments to structured DNNF circuits. @InProceedings{STOC21p1015, author = {Marcelo Arenas and Luis Alberto Croquevielle and Rajesh Jayaram and Cristian Riveros}, title = {When Is Approximate Counting for Conjunctive Queries Tractable?}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1015--1027}, doi = {10.1145/3406325.3451014}, year = {2021}, } 
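To make the counting problem concrete: the brute-force route to #NFA is a dynamic program over reachable state *sets* (on-the-fly subset construction), which is exact but exponential in the number of NFA states in the worst case; this blow-up is exactly what the FPRAS above circumvents. A small sketch, with names and the example automaton of my own choosing:

```python
from functools import lru_cache

def count_accepted(nfa, start, accepting, alphabet, k):
    """Count words of length k accepted by an NFA, via a DP over
    reachable sets of states.  `nfa` maps (state, symbol) to an
    iterable of successor states."""
    @lru_cache(maxsize=None)
    def count(states, remaining):
        if remaining == 0:
            return 1 if states & accepting else 0
        total = 0
        for a in alphabet:
            nxt = frozenset(q for p in states for q in nfa.get((p, a), ()))
            if nxt:                     # dead set contributes nothing
                total += count(nxt, remaining - 1)
        return total
    return count(frozenset([start]), k)
```

For example, for the two-state NFA accepting binary strings that end in '1', the number of accepted words of length k is 2^{k−1}.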

Argue, C. J. 
STOC '21: "Chasing Convex Bodies with ..."
Chasing Convex Bodies with Linear Competitive Ratio (Invited Paper)
C. J. Argue, Anupam Gupta, Guru Guruganesh, and Ziye Tang (Carnegie Mellon University, USA; Google Research, USA) The problem of chasing convex functions is easy to state: faced with a sequence of convex functions f_t over d-dimensional Euclidean space, the goal of the algorithm is to output a point x_t at each time, so that the sum of the function costs f_t(x_t), plus the movement costs ‖x_t − x_{t−1}‖, is minimized. This problem generalizes questions in online algorithms such as caching and the k-server problem. In 1994, Friedman and Linial posed the question of getting an algorithm with a competitive ratio that depends only on the dimension d. In this talk we give an O(d)-competitive algorithm, based on the notion of the Steiner point of a convex body. @InProceedings{STOC21p5, author = {C. J. Argue and Anupam Gupta and Guru Guruganesh and Ziye Tang}, title = {Chasing Convex Bodies with Linear Competitive Ratio (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {5--5}, doi = {10.1145/3406325.3465354}, year = {2021}, } 

Assadi, Sepehr 
STOC '21: "Graph Streaming Lower Bounds ..."
Graph Streaming Lower Bounds for Parameter Estimation and Property Testing via a Streaming XOR Lemma
Sepehr Assadi and Vishvajeet N (Rutgers University, USA) We study space-pass tradeoffs in graph streaming algorithms for parameter estimation and property testing problems such as estimating the size of maximum matchings and maximum cuts, the weight of minimum spanning trees, or testing if a graph is connected or cycle-free versus being far from these properties. We develop a new lower bound technique that proves that for many problems of interest, including all of the above, obtaining a (1+ε)-approximation requires either n^{Ω(1)} space or Ω(1/ε) passes, even on highly restricted families of graphs such as bounded-degree planar graphs. For several of these problems, this bound matches those of existing algorithms and is thus (asymptotically) optimal. Our results considerably strengthen prior lower bounds even for arbitrary graphs: starting from the influential work of [Verbin, Yu; SODA 2011], there has been a plethora of lower bounds for single-pass algorithms for these problems; however, the only multi-pass lower bounds, proven very recently in [Assadi, Kol, Saxena, Yu; FOCS 2020], rule out sublinear-space algorithms with exponentially smaller o(log(1/ε)) passes for these problems. One key ingredient of our proofs is a simple streaming XOR lemma, a generic hardness amplification result, that we prove: informally speaking, if a p-pass s-space streaming algorithm can only solve a decision problem with advantage δ > 0 over random guessing, then it cannot solve the XOR of ℓ independent copies of the problem with advantage much better than δ^{ℓ}. This result can be of independent interest and useful for other streaming lower bounds as well. @InProceedings{STOC21p612, author = {Sepehr Assadi and Vishvajeet N}, title = {Graph Streaming Lower Bounds for Parameter Estimation and Property Testing via a Streaming XOR Lemma}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {612--625}, doi = {10.1145/3406325.3451110}, year = {2021}, } 
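The idealized, information-theoretic version of such an XOR lemma is exact: for independent copies guessed independently, the correlation with the correct answer multiplies. A small numeric check of that baseline (the function name is mine; the paper's streaming lemma is the far harder statement that a p-pass s-space algorithm cannot beat roughly δ^ℓ):

```python
from math import comb, isclose

def xor_advantage(delta, ell):
    """Advantage over 1/2 of guessing the XOR of `ell` independent copies,
    when each copy is guessed correctly with probability 1/2 + delta and
    the XOR guess is the XOR of the per-copy guesses.  The XOR guess is
    correct iff an even number of copies are guessed wrong, so we sum the
    binomial distribution over even counts."""
    p, q = 0.5 + delta, 0.5 - delta
    correct = sum(comb(ell, j) * q**j * p**(ell - j)
                  for j in range(0, ell + 1, 2))
    return correct - 0.5
```

A short calculation shows this advantage equals (2δ)^ℓ / 2: the correlation 2δ of each copy multiplies across independent copies.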

Azar, Yossi 
STOC '21: "Flow Time Scheduling with ..."
Flow Time Scheduling with Uncertain Processing Time
Yossi Azar, Stefano Leonardi, and Noam Touitou (Tel Aviv University, Israel; Sapienza University of Rome, Italy) We consider the problem of online scheduling on a single machine in order to minimize weighted flow time. The existing algorithms for this problem (STOC ’01, SODA ’03, FOCS ’18) all require exact knowledge of the processing time of each job. This assumption is crucial, as even a slight perturbation of the processing time would lead to a polynomial competitive ratio. However, this assumption very rarely holds in real-life scenarios. In this paper, we present the first algorithm for weighted flow time which does not require exact knowledge of the processing times of jobs. Specifically, we introduce the Scheduling with Predicted Processing Time (SPPT) problem, where the algorithm is given a prediction for the processing time of each job, instead of its real processing time. For the case of a constant-factor distortion between the predictions and the real processing times, our algorithms match all the best known competitiveness bounds for weighted flow time – namely O(log P), O(log D), and O(log W), where P, D, W are the maximum ratios of processing times, densities, and weights, respectively. For larger errors, the competitiveness of our algorithms degrades gracefully. @InProceedings{STOC21p1070, author = {Yossi Azar and Stefano Leonardi and Noam Touitou}, title = {Flow Time Scheduling with Uncertain Processing Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1070--1080}, doi = {10.1145/3406325.3451023}, year = {2021}, } 

Babichenko, Yakov 
STOC '21: "Settling the Complexity of ..."
Settling the Complexity of Nash Equilibrium in Congestion Games
Yakov Babichenko and Aviad Rubinstein (Technion, Israel; Stanford University, USA) We consider (i) the problem of finding a (possibly mixed) Nash equilibrium in congestion games, and (ii) the problem of finding an (exponential precision) fixed point of the gradient descent dynamics of a smooth function f:[0,1]^{n} → ℝ. We prove that these problems are equivalent. Our result holds for various explicit descriptions of f, ranging from (almost general) arithmetic circuits to degree-5 polynomials. By a very recent result of [Fearnley et al., STOC 2021], this implies that these problems are PPAD ∩ PLS-complete. As a corollary, we also obtain the following equivalence of complexity classes: CCLS = PPAD ∩ PLS. @InProceedings{STOC21p1426, author = {Yakov Babichenko and Aviad Rubinstein}, title = {Settling the Complexity of Nash Equilibrium in Congestion Games}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1426--1437}, doi = {10.1145/3406325.3451039}, year = {2021}, }
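The hardness result above concerns finding equilibria in general congestion games; by contrast, a pure Nash equilibrium always exists via Rosenthal's potential argument, and on tiny instances best-response dynamics finds one quickly. A toy sketch of that classical background (the singleton game and its costs are invented here for illustration, not taken from the paper):

```python
def loads(strategy, m):
    """Number of players on each of the m resources."""
    ld = [0] * m
    for r in strategy:
        ld[r] += 1
    return ld

def best_response_dynamics(n, c):
    """Singleton congestion game: player i on resource r pays c[r][k-1],
    where k is the load of r. Each improving move strictly decreases
    Rosenthal's potential sum_r sum_{k <= load(r)} c[r][k-1], so the
    dynamics terminate in a pure Nash equilibrium."""
    m = len(c)
    strategy = [0] * n            # everyone starts on resource 0
    moved = True
    while moved:
        moved = False
        ld = loads(strategy, m)
        for i in range(n):
            cur = strategy[i]
            cur_cost = c[cur][ld[cur] - 1]
            for r in range(m):
                # c[r][ld[r]] is the cost player i would pay after joining r
                if r != cur and c[r][ld[r]] < cur_cost:
                    strategy[i] = r
                    ld[cur] -= 1
                    ld[r] += 1
                    moved = True
                    break
            if moved:
                break
    return strategy

# Three players, two resources with increasing per-load costs.
c = [[1, 3, 6], [2, 4, 7]]
eq = best_response_dynamics(3, c)
assert sorted(loads(eq, 2)) == [1, 2]   # equilibrium splits the players 2/1
```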

Bădescu, Costin 
STOC '21: "Improved Quantum Data Analysis ..."
Improved Quantum Data Analysis
Costin Bădescu and Ryan O'Donnell (Carnegie Mellon University, USA) We provide more sample-efficient versions of some basic routines in quantum data analysis, along with simpler proofs. In particular, we give a quantum “Threshold Search” algorithm that requires only O((log^{2} m)/ε^{2}) samples of a d-dimensional state ρ. That is, given observables 0 ≤ A_{1}, A_{2}, …, A_{m} ≤ 1 such that tr(ρ A_{i}) ≥ 1/2 for at least one i, the algorithm finds j with tr(ρ A_{j}) ≥ 1/2−ε. As a consequence, we obtain a Shadow Tomography algorithm requiring only O((log^{2} m)(log d)/ε^{4}) samples, which simultaneously achieves the best known dependence on each parameter m, d, ε. This yields the same sample complexity for quantum Hypothesis Selection among m states; we also give an alternative Hypothesis Selection method using O((log^{3} m)/ε^{2}) samples. @InProceedings{STOC21p1398, author = {Costin Bădescu and Ryan O'Donnell}, title = {Improved Quantum Data Analysis}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1398--1411}, doi = {10.1145/3406325.3451109}, year = {2021}, }

Bafna, Mitali 
STOC '21: "Playing Unique Games on Certified ..."
Playing Unique Games on Certified Small-Set Expanders
Mitali Bafna, Boaz Barak, Pravesh K. Kothari, Tselil Schramm, and David Steurer (Harvard University, USA; Carnegie Mellon University, USA; Stanford University, USA; ETH Zurich, Switzerland) We give an algorithm for solving unique games (UG) instances whenever low-degree sum-of-squares proofs certify good bounds on the small-set expansion of the underlying constraint graph via a hypercontractive inequality. Our algorithm is in fact more versatile, and succeeds even when the constraint graph is not a small-set expander as long as the structure of non-expanding small sets is (informally speaking) “characterized” by a low-degree sum-of-squares proof. Our results are obtained by rounding low-entropy solutions — measured via a new global potential function — to sum-of-squares (SoS) semidefinite programs. This technique adds to the (currently short) list of general tools for analyzing SoS relaxations for worst-case optimization problems. As corollaries, we obtain the first polynomial-time algorithms for solving any UG instance where the constraint graph is either the noisy hypercube, the short code, or the Johnson graph. The prior best algorithm for such instances was the eigenvalue enumeration algorithm of Arora, Barak, and Steurer (2010), which requires quasi-polynomial time for the noisy hypercube and nearly exponential time for the short code and Johnson graphs. All of our results achieve an approximation of 1−ε vs δ for UG instances, where ε>0 and δ>0 depend on the expansion parameters of the graph but are independent of the alphabet size. @InProceedings{STOC21p1629, author = {Mitali Bafna and Boaz Barak and Pravesh K. Kothari and Tselil Schramm and David Steurer}, title = {Playing Unique Games on Certified Small-Set Expanders}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1629--1642}, doi = {10.1145/3406325.3451099}, year = {2021}, }

Bakshi, Ainesh 
STOC '21: "Robust Linear Regression: ..."
Robust Linear Regression: Optimal Rates in Polynomial Time
Ainesh Bakshi and Adarsh Prasad (Carnegie Mellon University, USA) We obtain robust and computationally efficient estimators for learning several linear models that achieve statistically optimal convergence rates under minimal distributional assumptions. Concretely, we assume our data is drawn from a k-hypercontractive distribution and an ε-fraction is adversarially corrupted. We then describe an estimator that converges to the optimal least-squares minimizer for the true distribution at a rate proportional to ε^{2−2/k}, when the noise is independent of the covariates. We note that no such estimator was known prior to our work, even with access to unbounded computation. The rate we achieve is information-theoretically optimal, and thus we resolve the main open question in Klivans, Kothari and Meka [COLT’18]. Our key insight is to identify an analytic condition that serves as a polynomial relaxation of independence of random variables. In particular, we show that when the moments of the noise and covariates are negatively correlated, we obtain the same rate as for independent noise. Further, when the condition is not satisfied, we obtain a rate proportional to ε^{2−4/k}, and again match the information-theoretic lower bound. Our central technical contribution is to algorithmically exploit independence of random variables in the “sum-of-squares” framework by formulating it as the aforementioned polynomial inequality. @InProceedings{STOC21p102, author = {Ainesh Bakshi and Adarsh Prasad}, title = {Robust Linear Regression: Optimal Rates in Polynomial Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {102--115}, doi = {10.1145/3406325.3451001}, year = {2021}, }

Balcan, Maria-Florina 
STOC '21: "How Much Data Is Sufficient ..."
How Much Data Is Sufficient to Learn High-Performing Algorithms? Generalization Guarantees for Data-Driven Algorithm Design
Maria-Florina Balcan, Dan DeBlasio, Travis Dick, Carl Kingsford, Tuomas Sandholm, and Ellen Vitercik (Carnegie Mellon University, USA; University of Texas at El Paso, USA; University of Pennsylvania, USA) Algorithms often have tunable parameters that impact performance metrics such as runtime and solution quality. For many algorithms used in practice, no parameter settings admit meaningful worst-case bounds, so the parameters are made available for the user to tune. Alternatively, parameters may be tuned implicitly within the proof of a worst-case guarantee. Worst-case instances, however, may be rare or nonexistent in practice. A growing body of research has demonstrated that data-driven algorithm design can lead to significant improvements in performance. This approach uses a training set of problem instances sampled from an unknown, application-specific distribution and returns a parameter setting with strong average performance on the training set. We provide a broadly applicable theory for deriving generalization guarantees that bound the difference between the algorithm’s average performance over the training set and its expected performance on the unknown distribution. Our results apply no matter how the parameters are tuned, be it via an automated or manual approach. The challenge is that for many types of algorithms, performance is a volatile function of the parameters: slightly perturbing the parameters can cause a large change in behavior. Prior research (e.g., Gupta and Roughgarden, SICOMP’17; Balcan et al., COLT’17, ICML’18, EC’18) has proved generalization bounds by employing case-by-case analyses of greedy algorithms, clustering algorithms, integer programming algorithms, and selling mechanisms. We uncover a unifying structure which we use to prove extremely general guarantees, yet we recover the bounds from prior research. 
Our guarantees, which are tight up to logarithmic factors in the worst case, apply whenever an algorithm’s performance is a piecewise-constant, linear, or—more generally—piecewise-structured function of its parameters. Our theory also implies novel bounds for voting mechanisms and dynamic programming algorithms from computational biology. @InProceedings{STOC21p919, author = {Maria-Florina Balcan and Dan DeBlasio and Travis Dick and Carl Kingsford and Tuomas Sandholm and Ellen Vitercik}, title = {How Much Data Is Sufficient to Learn High-Performing Algorithms? Generalization Guarantees for Data-Driven Algorithm Design}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {919--932}, doi = {10.1145/3406325.3451036}, year = {2021}, }

Bansal, Nikhil 
STOC '21: "k-Forrelation Optimally Separates ..."
k-Forrelation Optimally Separates Quantum and Classical Query Complexity
Nikhil Bansal and Makrand Sinha (CWI, Netherlands; Eindhoven University of Technology, Netherlands) Aaronson and Ambainis (SICOMP ’18) showed that any partial function on N bits that can be computed with an advantage δ over a random guess by making q quantum queries, can also be computed classically with an advantage δ/2 by a randomized decision tree making O_{q}(N^{1−1/2q}δ^{−2}) queries. Moreover, they conjectured the k-Forrelation problem — a partial function that can be computed with q = ⌈ k/2 ⌉ quantum queries — to be a suitable candidate for exhibiting such an extremal separation. We prove their conjecture by showing a tight lower bound of Ω(N^{1−1/k}) for the randomized query complexity of k-Forrelation, where δ = 2^{−O(k)}. By standard amplification arguments, this gives an explicit partial function that exhibits an O_{ε}(1) vs Ω(N^{1−ε}) separation between bounded-error quantum and randomized query complexities, where ε>0 can be made arbitrarily small. Our proof also gives the same bound for the closely related but non-explicit k-Rorrelation function introduced by Tal (FOCS ’20). Our techniques rely on classical Gaussian tools, in particular Gaussian interpolation and Gaussian integration by parts, and in fact give a more general statement. We show that to prove lower bounds for k-Forrelation against a family of functions, it suffices to bound the ℓ_{1}-weight of the Fourier coefficients between levels k and (k−1)k. We also prove new interpolation and integration by parts identities that might be of independent interest in the context of rounding high-dimensional Gaussian vectors. @InProceedings{STOC21p1303, author = {Nikhil Bansal and Makrand Sinha}, title = {k-Forrelation Optimally Separates Quantum and Classical Query Complexity}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1303--1316}, doi = {10.1145/3406325.3451040}, year = {2021}, }

Barak, Boaz 
STOC '21: "Playing Unique Games on Certified ..."
Playing Unique Games on Certified Small-Set Expanders
Mitali Bafna, Boaz Barak, Pravesh K. Kothari, Tselil Schramm, and David Steurer (Harvard University, USA; Carnegie Mellon University, USA; Stanford University, USA; ETH Zurich, Switzerland) We give an algorithm for solving unique games (UG) instances whenever low-degree sum-of-squares proofs certify good bounds on the small-set expansion of the underlying constraint graph via a hypercontractive inequality. Our algorithm is in fact more versatile, and succeeds even when the constraint graph is not a small-set expander as long as the structure of non-expanding small sets is (informally speaking) “characterized” by a low-degree sum-of-squares proof. Our results are obtained by rounding low-entropy solutions — measured via a new global potential function — to sum-of-squares (SoS) semidefinite programs. This technique adds to the (currently short) list of general tools for analyzing SoS relaxations for worst-case optimization problems. As corollaries, we obtain the first polynomial-time algorithms for solving any UG instance where the constraint graph is either the noisy hypercube, the short code, or the Johnson graph. The prior best algorithm for such instances was the eigenvalue enumeration algorithm of Arora, Barak, and Steurer (2010), which requires quasi-polynomial time for the noisy hypercube and nearly exponential time for the short code and Johnson graphs. All of our results achieve an approximation of 1−ε vs δ for UG instances, where ε>0 and δ>0 depend on the expansion parameters of the graph but are independent of the alphabet size. @InProceedings{STOC21p1629, author = {Mitali Bafna and Boaz Barak and Pravesh K. Kothari and Tselil Schramm and David Steurer}, title = {Playing Unique Games on Certified Small-Set Expanders}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1629--1642}, doi = {10.1145/3406325.3451099}, year = {2021}, }

Bartal, Yair 
STOC '21: "Near-Linear Time Approximation ..."
Near-Linear Time Approximation Schemes for Steiner Tree and Forest in Low-Dimensional Spaces
Yair Bartal and Lee-Ad Gottlieb (Hebrew University of Jerusalem, Israel; Ariel University, Israel) We give an algorithm that computes a (1+ε)-approximate Steiner forest in near-linear time n · 2^{(1/ε)^{O(ddim^{2})} (log log n)^{2}}, where ddim is the doubling dimension of the metric space. This improves upon the best previous result due to Chan et al. (SIAM J. Comput., 2018), who gave a runtime of about n^{2^{O(ddim)}} · 2^{(ddim/ε)^{O(ddim)} √(log n)}. For Steiner tree our methods achieve an even better runtime n (log n)^{(1/ε)^{O(ddim^{2})}}. @InProceedings{STOC21p1028, author = {Yair Bartal and Lee-Ad Gottlieb}, title = {Near-Linear Time Approximation Schemes for Steiner Tree and Forest in Low-Dimensional Spaces}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1028--1041}, doi = {10.1145/3406325.3451063}, year = {2021}, }

Ben-David, Shai 
STOC '21: "Learnability Can Be Independent ..."
Learnability Can Be Independent of Set Theory (Invited Paper)
Shai Ben-David, Pavel Hrubes, Shay Moran, Amir Shpilka, and Amir Yehudayoff (University of Waterloo, Canada; Czech Academy of Sciences, Czechia; Technion, Israel; Tel Aviv University, Israel) A fundamental result in statistical learning theory is the equivalence of PAC learnability of a class with the finiteness of its Vapnik-Chervonenkis dimension. However, this clean result applies only to binary classification problems. In search of a similar combinatorial characterization of learnability in a more general setting, we discovered a surprising independence from set theory for some basic general notion of learnability. Consider the following statistical estimation problem: given a family F of real-valued random variables over some domain X and an i.i.d. sample drawn from an unknown distribution P over X, find f in F such that its expectation w.r.t. P is close to the supremum expectation over all members of F. This Expectation Maximization (EMX) problem captures many well-studied learning problems. Surprisingly, we show that the EMX learnability of some simple classes depends on the cardinality of the continuum and is therefore independent of the ZFC axioms of set theory. Our results imply that there exists no “finitary” combinatorial parameter that characterizes EMX learnability in a way similar to the VC-dimension characterization of binary classification learnability. @InProceedings{STOC21p11, author = {Shai Ben-David and Pavel Hrubes and Shay Moran and Amir Shpilka and Amir Yehudayoff}, title = {Learnability Can Be Independent of Set Theory (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {11--11}, doi = {10.1145/3406325.3465360}, year = {2021}, }

Ben-David, Shalev 
STOC '21: "Degree vs. Approximate Degree ..."
Degree vs. Approximate Degree and Quantum Implications of Huang’s Sensitivity Theorem
Scott Aaronson, Shalev Ben-David, Robin Kothari, Shravas Rao, and Avishay Tal (University of Texas at Austin, USA; University of Waterloo, Canada; Microsoft Quantum, USA; Microsoft Research, USA; Northwestern University, USA; University of California at Berkeley, USA) Based on the recent breakthrough of Huang (2019), we show that for any total Boolean function f: (1) deg(f) = O(adeg(f)^2), i.e., the degree of f is at most quadratic in the approximate degree of f; this is optimal as witnessed by the OR function. (2) D(f) = O(Q(f)^4), i.e., the deterministic query complexity of f is at most quartic in the quantum query complexity of f; this matches the known separation (up to log factors) due to Ambainis, Balodis, Belovs, Lee, Santha, and Smotrovs (2017). We apply these results to resolve the quantum analogue of the Aanderaa–Karp–Rosenberg conjecture. We show that if f is a nontrivial monotone graph property of an n-vertex graph specified by its adjacency matrix, then Q(f)=Ω(n), which is also optimal. We also show that the approximate degree of any read-once formula on n variables is Θ(√n). @InProceedings{STOC21p1330, author = {Scott Aaronson and Shalev Ben-David and Robin Kothari and Shravas Rao and Avishay Tal}, title = {Degree vs. Approximate Degree and Quantum Implications of Huang’s Sensitivity Theorem}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1330--1342}, doi = {10.1145/3406325.3451047}, year = {2021}, }

Ben-Eliezer, Omri 
STOC '21: "Adversarial Laws of Large ..."
Adversarial Laws of Large Numbers and Optimal Regret in Online Classification
Noga Alon, Omri Ben-Eliezer, Yuval Dagan, Shay Moran, Moni Naor, and Eylon Yogev (Princeton University, USA; Tel Aviv University, Israel; Harvard University, USA; Massachusetts Institute of Technology, USA; Technion, Israel; Google Research, Israel; Weizmann Institute of Science, Israel; Boston University, USA) Laws of large numbers guarantee that given a large enough sample from some population, the measure of any fixed subpopulation is well-estimated by its frequency in the sample. We study laws of large numbers in sampling processes that can affect the environment they are acting upon and interact with it. Specifically, we consider the sequential sampling model proposed by Ben-Eliezer and Yogev (2020), and characterize the classes which admit a uniform law of large numbers in this model: these are exactly the classes that are online learnable. Our characterization may be interpreted as an online analogue to the equivalence between learnability and uniform convergence in statistical (PAC) learning. The sample-complexity bounds we obtain are tight for many parameter regimes, and as an application, we determine the optimal regret bounds in online learning, stated in terms of Littlestone’s dimension, thus resolving the main open question from Ben-David, Pál, and Shalev-Shwartz (2009), which was also posed by Rakhlin, Sridharan, and Tewari (2015). @InProceedings{STOC21p447, author = {Noga Alon and Omri Ben-Eliezer and Yuval Dagan and Shay Moran and Moni Naor and Eylon Yogev}, title = {Adversarial Laws of Large Numbers and Optimal Regret in Online Classification}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {447--455}, doi = {10.1145/3406325.3451041}, year = {2021}, }

Beniamini, Gal 
STOC '21: "Bipartite Perfect Matching ..."
Bipartite Perfect Matching as a Real Polynomial
Gal Beniamini and Noam Nisan (Hebrew University of Jerusalem, Israel) We obtain a description of the Bipartite Perfect Matching decision problem as a multilinear polynomial over the reals. We show that it has full degree and (1−o_{n}(1))· 2^{n^{2}} monomials with nonzero coefficients. In contrast, we show that in the dual representation (switching the roles of 0 and 1) the number of monomials is only exponential in Θ(n log n). Our proof relies heavily on the fact that the lattice of graphs which are “matching-covered” is Eulerian. @InProceedings{STOC21p1118, author = {Gal Beniamini and Noam Nisan}, title = {Bipartite Perfect Matching as a Real Polynomial}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1118--1131}, doi = {10.1145/3406325.3451002}, year = {2021}, }
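The multilinear polynomial in question can be computed explicitly for tiny n by Möbius inversion over edge subsets. A brute-force sketch, illustrative only: for n=2 the polynomial is x00·x11 + x01·x10 − x00·x01·x10·x11, which already exhibits the full degree n² = 4, while the (1−o(1))·2^{n²} monomial count only becomes visible at larger n:

```python
from itertools import combinations, permutations

def has_pm(edges, n):
    """Does the bipartite graph on [n]+[n] with this edge set have a
    perfect matching? (Brute force over row-to-column bijections.)"""
    return any(all((i, s[i]) in edges for i in range(n))
               for s in permutations(range(n)))

def matching_polynomial_coeffs(n):
    """Mobius inversion over subsets of the n*n potential edges yields
    the unique multilinear real polynomial that agrees with the
    perfect-matching indicator on {0,1}^(n^2)."""
    all_edges = [(i, j) for i in range(n) for j in range(n)]
    coef = {}
    for k in range(len(all_edges) + 1):
        for S in combinations(all_edges, k):
            c = sum((-1) ** (len(S) - t) * has_pm(set(T), n)
                    for t in range(len(S) + 1)
                    for T in combinations(S, t))
            if c != 0:
                coef[S] = c
    return coef

coef = matching_polynomial_coeffs(2)
assert coef == {((0, 0), (1, 1)): 1,
                ((0, 1), (1, 0)): 1,
                ((0, 0), (0, 1), (1, 0), (1, 1)): -1}
```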

Bernstein, Aaron 
STOC '21: "A Framework for Dynamic Matching ..."
A Framework for Dynamic Matching in Weighted Graphs
Aaron Bernstein, Aditi Dudeja, and Zachary Langley (Rutgers University, USA) We introduce a new framework for computing approximate maximum weight matchings. Our primary focus is on the fully dynamic setting, where there is a large gap between the guarantees of the best known algorithms for computing weighted and unweighted matchings. Indeed, almost all current weighted matching algorithms that reduce to the unweighted problem lose a factor of two in the approximation ratio. In contrast, in other sublinear models such as the distributed and streaming models, recent work has largely closed this weighted/unweighted gap. For bipartite graphs, we almost completely settle the gap with a general reduction that converts any algorithm for α-approximate unweighted matching into an algorithm for (1−ε)α-approximate weighted matching, while only increasing the update time by an O(log n) factor for constant ε. We also show that our framework leads to significant improvements for non-bipartite graphs, though not in the form of a universal reduction. In particular, we give two algorithms for weighted non-bipartite matching: 1. A randomized (Las Vegas) fully dynamic algorithm that maintains a (1/2−ε)-approximate maximum weight matching in worst-case update time O(polylog n) with high probability against an adaptive adversary. Our bounds are essentially the same as those of the unweighted algorithm of Wajc [STOC 2020]. 2. A deterministic fully dynamic algorithm that maintains a (2/3−ε)-approximate maximum weight matching in amortized update time O(m^{1/4}). Our bounds are essentially the same as those of the unweighted algorithm of Bernstein and Stein [SODA 2016]. A key feature of our framework is that it uses existing algorithms for unweighted matching as black boxes. As a result, our framework is simple and versatile. Moreover, our framework easily translates to other models, and we use it to derive new results for the weighted matching problem in streaming and communication complexity models. 
@InProceedings{STOC21p668, author = {Aaron Bernstein and Aditi Dudeja and Zachary Langley}, title = {A Framework for Dynamic Matching in Weighted Graphs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {668--681}, doi = {10.1145/3406325.3451113}, year = {2021}, }

Bhandari, Siddharth 
STOC '21: "Decoding Multivariate Multiplicity ..."
Decoding Multivariate Multiplicity Codes on Product Sets
Siddharth Bhandari, Prahladh Harsha, Mrinal Kumar, and Madhu Sudan (Tata Institute of Fundamental Research, India; IIT Bombay, India; Harvard University, USA) The multiplicity Schwartz-Zippel lemma bounds the total multiplicity of zeroes of a multivariate polynomial on a product set. This lemma motivates the multiplicity codes of Kopparty, Saraf and Yekhanin [J. ACM, 2014], who showed how to use this lemma to construct high-rate locally decodable codes. However, the algorithmic results about these codes crucially rely on the fact that the polynomials are evaluated on a vector space and not an arbitrary product set. In this work, we show how to decode multivariate multiplicity codes of large multiplicities in polynomial time over finite product sets (over fields of large characteristic and zero characteristic). Previously such decoding algorithms were not known even for a positive fraction of errors. In contrast, our work goes all the way to the distance of the code and in particular exceeds both the unique decoding bound and the Johnson radius. For errors exceeding the Johnson radius, even combinatorial list-decodability of these codes was not known. Our algorithm is an application of the classical polynomial method directly to the multivariate setting. In particular, we do not rely on a reduction from the multivariate to the univariate case as is typical of many of the existing results on decoding codes based on multivariate polynomials. However, a vanilla application of the polynomial method in the multivariate setting does not yield a polynomial upper bound on the list size. We obtain a polynomial bound on the list size by taking an alternative view of multivariate multiplicity codes. In this view, we glue all the partial derivatives of the same order together using a fresh set of variables. We then apply the polynomial method by viewing this as a problem over the field of rational functions in these fresh variables. 
@InProceedings{STOC21p1489, author = {Siddharth Bhandari and Prahladh Harsha and Mrinal Kumar and Madhu Sudan}, title = {Decoding Multivariate Multiplicity Codes on Product Sets}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1489--1501}, doi = {10.1145/3406325.3451027}, year = {2021}, }

Bhangale, Amey 
STOC '21: "Optimal Inapproximability ..."
Optimal Inapproximability of Satisfiable k-LIN over Non-Abelian Groups
Amey Bhangale and Subhash Khot (University of California at Riverside, USA; New York University, USA) A seminal result of Håstad (2001) shows that it is NP-hard to find an assignment that satisfies a 1/|G|+ε fraction of the constraints of a given k-LIN instance over an abelian group G, even if there is an assignment that satisfies a (1−ε) fraction of the constraints, for any constant ε>0. Engebretsen, Holmerin and Russell (2004) later showed that the same hardness result holds for k-LIN instances over any finite non-abelian group. Unlike the abelian case, where we can efficiently find a solution if the instance is satisfiable, in the non-abelian case it is NP-complete to decide if a given system of linear equations is satisfiable or not, as shown by Goldmann and Russell (1999). Surprisingly, for certain non-abelian groups G, given a satisfiable k-LIN instance over G, one can in fact do better than just outputting a random assignment, using a simple but clever algorithm. The approximation factor achieved by this algorithm varies with the underlying group. In this paper, we show that this algorithm is optimal by proving a tight hardness of approximation for satisfiable k-LIN instances over any non-abelian group G, assuming P ≠ NP. As a corollary, we also get 3-query probabilistically checkable proofs with perfect completeness over large alphabets with improved soundness. Our proof crucially uses the quasirandom properties of non-abelian groups defined by Gowers (2008). @InProceedings{STOC21p1615, author = {Amey Bhangale and Subhash Khot}, title = {Optimal Inapproximability of Satisfiable k-LIN over Non-Abelian Groups}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1615--1628}, doi = {10.1145/3406325.3451003}, year = {2021}, }
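The baseline every algorithm must beat is a uniformly random assignment, which satisfies a 1/|G| fraction of the constraints in expectation: in an equation x·y·z = g over distinct variables, the third variable is determined by the other two. A quick exhaustive check of this baseline over the non-abelian group S3 (illustrative only; this is not the paper's algorithm):

```python
from itertools import permutations, product

# The symmetric group S3: elements are permutations of {0,1,2},
# and the group operation is composition.
S3 = list(permutations(range(3)))

def mul(p, q):
    """Composition (p o q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(3))

# One 3-LIN equation x*y*z = g in three distinct variables: for every
# choice of x and y there is exactly one satisfying z, so exactly
# |G|^2 of the |G|^3 assignments satisfy it -- a 1/|G| fraction.
g = S3[4]  # an arbitrary right-hand side
sat = sum(mul(mul(x, y), z) == g for x, y, z in product(S3, repeat=3))
assert sat == len(S3) ** 2            # 36 of the 216 assignments
assert sat * len(S3) == len(S3) ** 3  # i.e. a 1/|G| = 1/6 fraction
```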

Bhargava, Vishwas 
STOC '21: "Reconstruction Algorithms ..."
Reconstruction Algorithms for Low-Rank Tensors and Depth-3 Multilinear Circuits
Vishwas Bhargava, Shubhangi Saraf, and Ilya Volkovich (Rutgers University, USA; Boston College, USA) We give new and efficient black-box reconstruction algorithms for some classes of depth-3 arithmetic circuits. As a consequence, we obtain the first efficient algorithm for computing the tensor rank and for finding the optimal tensor decomposition as a sum of rank-one tensors when the input is a constant-rank tensor. More specifically, we provide efficient learning algorithms that run in randomized polynomial time over general fields, and in deterministic polynomial time over certain fields, for the following classes: 1) Set-multilinear depth-3 circuits of constant top fan-in (set-multilinear ΣΠΣ(k) circuits). As a consequence of our algorithm, we obtain the first polynomial-time algorithm for tensor rank computation and optimal tensor decomposition of constant-rank tensors. This result holds for d-dimensional tensors for any d, but is interesting even for d=3. 2) Sums of powers of constantly many linear forms (Σ∧Σ(k) circuits). As a consequence, we obtain the first polynomial-time algorithm for tensor rank computation and optimal tensor decomposition of constant-rank symmetric tensors. 3) Multilinear depth-3 circuits of constant top fan-in (multilinear ΣΠΣ(k) circuits). Our algorithm works over all fields of characteristic 0 or large enough characteristic. Prior to our work, the only efficient algorithms known were over polynomially-sized finite fields (see Karnin and Shpilka ’09). Prior to our work, the only polynomial-time or even subexponential-time algorithms known (deterministic or randomized) for subclasses of ΣΠΣ(k) circuits that also work over large/infinite fields were for the setting where the top fan-in k is at most 2 (see Sinha ’16 and Sinha ’20). 
@InProceedings{STOC21p809, author = {Vishwas Bhargava and Shubhangi Saraf and Ilya Volkovich}, title = {Reconstruction Algorithms for Low-Rank Tensors and Depth-3 Multilinear Circuits}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {809--822}, doi = {10.1145/3406325.3451096}, year = {2021}, }

Bhattacharyya, Arnab 
STOC '21: "Near-Optimal Learning of Tree-Structured ..."
Near-Optimal Learning of Tree-Structured Distributions by Chow-Liu
Arnab Bhattacharyya, Sutanu Gayen, Eric Price, and N. V. Vinodchandran (National University of Singapore, Singapore; University of Texas at Austin, USA; University of Nebraska-Lincoln, USA) We provide finite sample guarantees for the classical Chow-Liu algorithm (IEEE Trans. Inform. Theory, 1968) to learn a tree-structured graphical model of a distribution. For a distribution P on Σ^{n} and a tree T on n nodes, we say T is an ε-approximate tree for P if there is a T-structured distribution Q such that D(P ∥ Q) is at most ε more than the best possible tree-structured distribution for P. We show that if P itself is tree-structured, then the Chow-Liu algorithm with the plug-in estimator for mutual information with O(|Σ|^{3} nε^{−1}) i.i.d. samples outputs an ε-approximate tree for P with constant probability. In contrast, for a general P (which may not be tree-structured), Ω(n^{2}ε^{−2}) samples are necessary to find an ε-approximate tree. Our upper bound is based on a new conditional independence tester that addresses an open problem posed by Canonne, Diakonikolas, Kane, and Stewart (STOC, 2018): we prove that for three random variables X,Y,Z each over Σ, testing if I(X; Y ∣ Z) is 0 or ≥ ε is possible with O(|Σ|^{3}/ε) samples. Finally, we show that for a specific tree T, with O(|Σ|^{2}nε^{−1}) samples from a distribution P over Σ^{n}, one can efficiently learn the closest T-structured distribution in KL divergence by applying the add-1 estimator at each node. @InProceedings{STOC21p147, author = {Arnab Bhattacharyya and Sutanu Gayen and Eric Price and N. V. Vinodchandran}, title = {Near-Optimal Learning of Tree-Structured Distributions by Chow-Liu}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {147--160}, doi = {10.1145/3406325.3451066}, year = {2021}, }
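The Chow-Liu algorithm itself is short: estimate all pairwise mutual informations with the plug-in estimator and return a maximum-weight spanning tree. A minimal sketch, with an invented Markov-chain data generator and sample size for illustration (the paper's contribution is the finite-sample analysis, not the algorithm):

```python
import math
import random
from collections import Counter
from itertools import combinations

def plugin_mi(samples, i, j):
    """Plug-in (empirical) estimate of the mutual information
    I(X_i; X_j), in nats, from a list of sample tuples."""
    n = len(samples)
    pij = Counter((s[i], s[j]) for s in samples)
    pi = Counter(s[i] for s in samples)
    pj = Counter(s[j] for s in samples)
    return sum((c / n) * math.log(c * n / (pi[a] * pj[b]))
               for (a, b), c in pij.items())

def chow_liu_tree(samples, dim):
    """Maximum-weight spanning tree under empirical mutual information
    (Prim's algorithm); returns the list of tree edges."""
    w = {e: plugin_mi(samples, *e) for e in combinations(range(dim), 2)}
    in_tree, edges = {0}, []
    while len(in_tree) < dim:
        e = max((e for e in w if (e[0] in in_tree) != (e[1] in in_tree)),
                key=lambda e: w[e])
        edges.append(e)
        in_tree.update(e)
    return edges

# Toy tree-structured (Markov chain) data: X1 copies X0 and X2 copies
# X1, each bit flipped with probability 0.1.
random.seed(0)
def draw():
    x0 = random.randint(0, 1)
    x1 = x0 ^ (random.random() < 0.1)
    x2 = x1 ^ (random.random() < 0.1)
    return (x0, x1, x2)

samples = [draw() for _ in range(2000)]
assert sorted(chow_liu_tree(samples, 3)) == [(0, 1), (1, 2)]
```

The recovered tree is the true chain: the data-processing inequality makes I(X0; X2) smaller than the adjacent-pair informations, so the maximum-weight tree picks the edges (0,1) and (1,2).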

Bhattiprolu, Vijay 
STOC '21: "A Framework for Quadratic ..."
A Framework for Quadratic Form Maximization over Convex Sets through Nonconvex Relaxations
Vijay Bhattiprolu, Euiwoong Lee, and Assaf Naor (Institute for Advanced Study at Princeton, USA; Princeton University, USA; University of Michigan, USA) We investigate the approximability of the following optimization problem. The input is an n× n matrix A=(A_{ij}) with real entries and an origin-symmetric convex body K⊂ ℝ^{n} that is given by a membership oracle. The task is to compute (or approximate) the maximum of the quadratic form ∑_{i=1}^{n}∑_{j=1}^{n} A_{ij} x_{i}x_{j}=⟨ x,Ax⟩ as x ranges over K. This is a rich and expressive family of optimization problems; for different choices of matrices A and convex bodies K it includes a diverse range of optimization problems like max-cut, Grothendieck/non-commutative Grothendieck inequalities, small set expansion, and more. While the literature has studied these special cases using case-specific reasoning, here we develop a general methodology for treatment of the approximability and inapproximability aspects of these questions. The underlying geometry of K plays a critical role; we show under commonly used complexity assumptions that poly-time constant-approximability necessitates that K has a type-2 constant that grows slowly with n. However, we show that even when the type-2 constant is bounded, this problem sometimes exhibits strong hardness of approximation. Thus, even within the realm of type-2 bodies, the approximability landscape is nuanced and subtle. However, the link that we establish between optimization and the geometry of Banach spaces allows us to devise a generic algorithmic approach to the above problem. We associate to each convex body a new (higher-dimensional) auxiliary set that is not convex, but is approximately convex when K has a bounded type-2 constant. If our auxiliary set has an approximate separation oracle, then we design an approximation algorithm for the original quadratic optimization problem, using an approximate version of the ellipsoid method. 
Even though our hardness result implies that such an oracle does not exist in general, this new question can be solved in specific cases of interest by implementing a range of classical tools from functional analysis, most notably the deep factorization theory of linear operators. Beyond encompassing the scenarios in the literature for which constant-factor approximation algorithms were found, our generic framework implies that for convex sets with a bounded type-2 constant, constant-factor approximability is preserved under the following basic operations: (a) Subspaces, (b) Quotients, (c) Minkowski Sums, (d) Complex Interpolation. This yields a rich family of new examples where constant-factor approximations are possible, which were beyond the reach of previous methods. We also show (under commonly used complexity assumptions) that for symmetric norms and unitarily invariant matrix norms the type-2 constant nearly characterizes the approximability of quadratic maximization. @InProceedings{STOC21p870, author = {Vijay Bhattiprolu and Euiwoong Lee and Assaf Naor}, title = {A Framework for Quadratic Form Maximization over Convex Sets through Nonconvex Relaxations}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {870--881}, doi = {10.1145/3406325.3451128}, year = {2021}, } Publisher's Version 
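To make the setting concrete, consider the simplest body K = [−1,1]^{n}. The sketch below (a brute-force toy, not the paper's ellipsoid-based method) evaluates ⟨x, Ax⟩ at the 2^{n} cube vertices; when A is positive semidefinite the form is convex, so the vertex maximum is also the maximum over the whole cube.

```python
from itertools import product

def quad_form(A, x):
    """Evaluate <x, Ax> for a matrix A given as a list of rows."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def max_over_cube_vertices(A):
    """Brute-force max of <x, Ax> over the 2^n vertices of [-1, 1]^n.
    For positive semidefinite A the form is convex, so this also equals
    the maximum over the entire cube; exponential time, toy sizes only."""
    return max(quad_form(A, x) for x in product((-1, 1), repeat=len(A)))
```

For the identity matrix the maximum is n, attained at every vertex; this ±1 special case is exactly the regime of the Grothendieck-type problems mentioned in the abstract.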

Blais, Eric 
STOC '21: "VC Dimension and Distribution-Free ..."
VC Dimension and Distribution-Free Sample-Based Testing
Eric Blais, Renato Ferreira Pinto Jr., and Nathaniel Harms (University of Waterloo, Canada; Google, Canada) We consider the problem of determining which classes of functions can be tested more efficiently than they can be learned, in the distribution-free sample-based model that corresponds to the standard PAC learning setting. Our main result shows that while VC dimension by itself does not always provide tight bounds on the number of samples required to test a class of functions in this model, it can be combined with a closely related variant that we call “lower VC” (or LVC) dimension to obtain strong lower bounds on this sample complexity. We use this result to obtain strong and in many cases nearly optimal bounds on the sample complexity for testing unions of intervals, halfspaces, intersections of halfspaces, polynomial threshold functions, and decision trees. Conversely, we show that two natural classes of functions, juntas and monotone functions, can be tested with a number of samples that is polynomially smaller than the number of samples required for PAC learning. Finally, we also use the connection between VC dimension and property testing to establish new lower bounds for testing radius clusterability and testing feasibility of linear constraint systems. @InProceedings{STOC21p504, author = {Eric Blais and Renato Ferreira Pinto Jr. and Nathaniel Harms}, title = {VC Dimension and Distribution-Free Sample-Based Testing}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {504--517}, doi = {10.1145/3406325.3451104}, year = {2021}, } Publisher's Version 
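The combinatorial parameter at the core of this entry can be checked directly from its definition on toy classes. A minimal brute-force sketch (my own illustration, exponential in the domain size): a class shatters a point set if it realizes all 2^{d} labelings of it, and the VC dimension is the largest shattered size.

```python
from itertools import combinations

def shatters(concepts, points):
    """True if the concept class (a list of sets over a finite domain)
    realizes all 2^|points| intersection patterns on `points`."""
    pts = set(points)
    patterns = {frozenset(c & pts) for c in concepts}
    return len(patterns) == 2 ** len(pts)

def vc_dimension(domain, concepts):
    """Largest d such that some d-subset of the domain is shattered."""
    d = 0
    for k in range(1, len(domain) + 1):
        if any(shatters(concepts, s) for s in combinations(domain, k)):
            d = k
    return d
```

For example, intervals over a finite line have VC dimension 2 (for three points a < b < c, no interval picks out {a, c} without b), while one-sided thresholds have VC dimension 1.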

Blanca, Antonio 
STOC '21: "Entropy Decay in the Swendsen–Wang ..."
Entropy Decay in the Swendsen–Wang Dynamics on ℤ^{d}
Antonio Blanca, Pietro Caputo, Daniel Parisi, Alistair Sinclair, and Eric Vigoda (Pennsylvania State University, USA; Roma Tre University, Italy; University of California at Berkeley, USA; Georgia Institute of Technology, USA) We study the mixing time of the Swendsen–Wang dynamics for the ferromagnetic Ising and Potts models on the integer lattice ℤ^{d}. This dynamics is a widely used Markov chain that has largely resisted sharp analysis because it is non-local, i.e., it changes the entire configuration in one step. We prove that, whenever strong spatial mixing (SSM) holds, the mixing time on any n-vertex cube in ℤ^{d} is O(log n), and we prove this is tight by establishing a matching lower bound. The previous best known bound was O(n). SSM is a standard condition corresponding to exponential decay of correlations with distance between spins on the lattice and is known to hold in d=2 dimensions throughout the high-temperature (single phase) region. Our result follows from a modified log-Sobolev inequality, which expresses the fact that the dynamics contracts relative entropy at a constant rate at each step. The proof of this fact utilizes a new factorization of the entropy in the joint probability space over spins and edges that underlies the Swendsen–Wang dynamics, which extends to general bipartite graphs of bounded degree. This factorization leads to several additional results, including mixing time bounds for a number of natural local and non-local Markov chains on the joint space, as well as for the standard random-cluster dynamics. @InProceedings{STOC21p1551, author = {Antonio Blanca and Pietro Caputo and Daniel Parisi and Alistair Sinclair and Eric Vigoda}, title = {Entropy Decay in the Swendsen–Wang Dynamics on ℤ<sup><i>d</i></sup>}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1551--1564}, doi = {10.1145/3406325.3451095}, year = {2021}, } Publisher's Version 
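The dynamics being analyzed is simple to state. A minimal sketch of one Swendsen–Wang update for the Ising model (the q = 2 Potts case), assuming the usual edge-percolation probability p = 1 − e^{−2β} for agreeing neighbors; graph, β, and data layout here are illustrative choices, not from the paper.

```python
import math
import random

def swendsen_wang_step(spins, edges, beta, rng=random):
    """One Swendsen-Wang update for the Ising model.
    spins: dict vertex -> +1/-1; edges: iterable of (u, v) pairs.
    Step 1: keep each edge whose endpoints agree, with prob p = 1 - exp(-2*beta).
    Step 2: find connected components of the kept edges (union-find).
    Step 3: assign every component a fresh uniform +/-1 spin."""
    p = 1.0 - math.exp(-2.0 * beta)
    kept = [(u, v) for (u, v) in edges
            if spins[u] == spins[v] and rng.random() < p]
    parent = {v: v for v in spins}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for u, v in kept:
        parent[find(u)] = find(v)
    comp_spin, new = {}, {}
    for v in spins:
        r = find(v)
        if r not in comp_spin:
            comp_spin[r] = rng.choice((-1, 1))
        new[v] = comp_spin[r]
    return new
```

This non-locality is visible in the code: a single step can flip arbitrarily large components at once, which is exactly what makes sharp mixing analysis difficult.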

Blikstad, Joakim 
STOC '21: "Breaking the Quadratic Barrier ..."
Breaking the Quadratic Barrier for Matroid Intersection
Joakim Blikstad, Jan van den Brand, Sagnik Mukhopadhyay, and Danupon Nanongkai (KTH, Sweden; University of Copenhagen, Denmark) The matroid intersection problem is a fundamental problem that has been extensively studied for half a century. In the classic version of this problem, we are given two matroids M_{1} = (V, I_{1}) and M_{2} = (V, I_{2}) on a common ground set V of n elements, and then we have to find the largest common independent set S ∈ I_{1} ∩ I_{2} by making independence oracle queries of the form “Is S ∈ I_{1}?” or “Is S ∈ I_{2}?” for S ⊆ V. The goal is to minimize the number of queries. Beating the existing Õ(n^{2}) bound, known as the quadratic barrier, is an open problem that captures the limits of techniques from two lines of work. The first is the classic algorithm of Cunningham [SICOMP 1986], whose Õ(n^{2})-query implementations were shown by CLS+ [FOCS 2019] and Nguyen [2019] (more generally, these algorithms take Õ(nr) queries, where r denotes the rank, which can be as big as n). The other is the general cutting plane method of Lee, Sidford, and Wong [FOCS 2015]. The only progress towards breaking the quadratic barrier requires either approximation algorithms or a more powerful rank oracle query [CLS+ FOCS 2019]. No exact algorithm with o(n^{2}) independence queries was known. In this work, we break the quadratic barrier with a randomized algorithm guaranteeing Õ(n^{9/5}) independence queries with high probability, and a deterministic algorithm guaranteeing Õ(n^{11/6}) independence queries. Our key insight is simple and fast algorithms to solve a graph reachability problem that arose in the standard augmenting path framework [Edmonds 1968]. Combining this with previous exact and approximation algorithms leads to our results. 
@InProceedings{STOC21p421, author = {Joakim Blikstad and Jan van den Brand and Sagnik Mukhopadhyay and Danupon Nanongkai}, title = {Breaking the Quadratic Barrier for Matroid Intersection}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {421--432}, doi = {10.1145/3406325.3451092}, year = {2021}, } Publisher's Version 
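The independence-oracle model is easy to experiment with. The sketch below (my own brute force, exponential in n and nothing like the paper's Õ(n^{9/5}) algorithm) finds the largest common independent set while counting oracle queries; the partition-matroid oracles are illustrative assumptions.

```python
from itertools import combinations

def matroid_intersection_bruteforce(V, indep1, indep2):
    """Return (largest common independent set, number of oracle queries).
    indep1/indep2 answer the query "Is S independent?" for a frozenset S.
    Searches subsets by decreasing size -- a toy baseline, not an algorithm."""
    count = [0]
    def ask(oracle, S):
        count[0] += 1
        return oracle(S)
    for k in range(len(V), 0, -1):
        for S in combinations(V, k):
            S = frozenset(S)
            if ask(indep1, S) and ask(indep2, S):
                return S, count[0]
    return frozenset(), count[0]

def partition_matroid(blocks):
    """Independence oracle of a partition matroid: at most one element per block."""
    return lambda S: all(len(S & b) <= 1 for b in blocks)
```

Counting `ask` calls on such toys makes the abstract's cost measure tangible: the quadratic barrier is about the number of these calls, not running time.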

Bonamy, Marthe 
STOC '21: "Optimal Labelling Schemes ..."
Optimal Labelling Schemes for Adjacency, Comparability, and Reachability
Marthe Bonamy, Louis Esperet, Carla Groenland, and Alex Scott (CNRS, France; LaBRI, France; University of Bordeaux, France; G-SCOP, France; Grenoble Alps University, France; University of Oxford, UK) We construct asymptotically optimal adjacency labelling schemes for every hereditary class containing 2^{Ω(n^{2})} n-vertex graphs as n→ ∞. This regime contains many classes of interest, for instance perfect graphs or comparability graphs, for which we obtain an adjacency labelling scheme with labels of n/4+o(n) bits per vertex. This implies the existence of a reachability labelling scheme for digraphs with labels of n/4+o(n) bits per vertex and a comparability labelling scheme for posets with labels of n/4+o(n) bits per element. All these results are best possible, up to the lower order term. @InProceedings{STOC21p1109, author = {Marthe Bonamy and Louis Esperet and Carla Groenland and Alex Scott}, title = {Optimal Labelling Schemes for Adjacency, Comparability, and Reachability}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1109--1117}, doi = {10.1145/3406325.3451102}, year = {2021}, } Publisher's Version 
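For scale, the trivial labelling scheme spends a full n bits per vertex (its adjacency-matrix row), and the result above shrinks this to n/4 + o(n) bits for the classes it covers. A minimal sketch of the trivial baseline, where adjacency is decided from the two labels alone:

```python
def trivial_labels(adj):
    """Trivial adjacency labelling: vertex v gets (v, row v of the adjacency
    matrix), i.e. about n bits per vertex -- the naive baseline, not the
    paper's n/4 + o(n) construction."""
    return {v: (v, tuple(adj[v])) for v in range(len(adj))}

def adjacent(label_u, label_v):
    """Decide adjacency from two labels alone, with no access to the graph."""
    u, row_u = label_u
    v, _ = label_v
    return row_u[v] == 1
```

The decoder never sees the graph, only the two labels; that locality requirement is what makes shaving the constant in front of n nontrivial.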

Bousquet, Olivier 
STOC '21: "A Theory of Universal Learning ..."
A Theory of Universal Learning
Olivier Bousquet, Steve Hanneke, Shay Moran, Ramon van Handel, and Amir Yehudayoff (Google, Switzerland; Toyota Technological Institute at Chicago, USA; Technion, Israel; Google Research, Israel; Princeton University, USA) How quickly can a given class of concepts be learned from examples? It is common to measure the performance of a supervised machine learning algorithm by plotting its “learning curve”, that is, the decay of the error rate as a function of the number of training examples. However, the classical theoretical framework for understanding learnability, the PAC model of Vapnik–Chervonenkis and Valiant, does not explain the behavior of learning curves: the distribution-free PAC model of learning can only bound the upper envelope of the learning curves over all possible data distributions. This does not match the practice of machine learning, where the data source is typically fixed in any given scenario, while the learner may choose the number of training examples on the basis of factors such as computational resources and desired accuracy. In this paper, we study an alternative learning model that better captures such practical aspects of machine learning, but still gives rise to a complete theory of the learnable in the spirit of the PAC model. More precisely, we consider the problem of universal learning, which aims to understand the performance of learning algorithms on every data distribution, but without requiring uniformity over the distribution. The main result of this paper is a remarkable trichotomy: there are only three possible rates of universal learning. More precisely, we show that the learning curves of any given concept class decay at either an exponential, linear, or arbitrarily slow rate. Moreover, each of these cases is completely characterized by appropriate combinatorial parameters, and we exhibit optimal learning algorithms that achieve the best possible rate in each case. 
For concreteness, we consider in this paper only the realizable case, though analogous results are expected to extend to more general learning scenarios. @InProceedings{STOC21p532, author = {Olivier Bousquet and Steve Hanneke and Shay Moran and Ramon van Handel and Amir Yehudayoff}, title = {A Theory of Universal Learning}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {532--541}, doi = {10.1145/3406325.3451087}, year = {2021}, } Publisher's Version 

Braverman, Mark 
STOC '21: "New Separations Results for ..."
New Separations Results for External Information
Mark Braverman and Dor Minzer (Princeton University, USA; Massachusetts Institute of Technology, USA) We obtain new separation results for the two-party external information complexity of Boolean functions. The external information complexity of a function f(x,y) is the minimum amount of information a two-party protocol computing f must reveal to an outside observer about the input. We prove an exponential separation between external and internal information complexity, which is the best possible; previously, no separation was known. We then use this result to prove a near-quadratic separation between amortized zero-error communication complexity and external information complexity for total functions, disproving a conjecture of the first author. Finally, we prove a matching upper bound showing that our separation result is tight. @InProceedings{STOC21p248, author = {Mark Braverman and Dor Minzer}, title = {New Separations Results for External Information}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {248--258}, doi = {10.1145/3406325.3451044}, year = {2021}, } Publisher's Version 

Bressan, Marco 
STOC '21: "Efficient and Near-Optimal ..."
Efficient and Near-Optimal Algorithms for Sampling Connected Subgraphs
Marco Bressan (University of Milan, Italy) We study the graphlet sampling problem: given an integer k ≥ 3 and a graph G=(V,E), sample a connected induced k-node subgraph of G (also called a k-graphlet) uniformly at random. This is a fundamental graph mining primitive, with applications in social network analysis and bioinformatics. The two state-of-the-art techniques are random walks and color coding. The random walk is elegant, but the current upper and lower bounds on its mixing time suffer a gap of Δ^{k−1}, where Δ is the maximum degree of G. Color coding is better understood, but requires a 2^{O(k)} m-time preprocessing over the entire graph. Moreover, no efficient algorithm is known for sampling graphlets uniformly — random walks and color coding yield only є-uniform samples. In this work, we provide the following results: (i) A near-optimal mixing time bound for the classic k-graphlet random walk, as a function of the mixing time of G. In particular, ignoring k^{O(k)} factors, we show that the k-graphlet random walk mixes in Θ(t(G) · ρ(G)^{k−1}) steps, where t(G) is the mixing time of G and ρ(G) is the ratio between its maximum and minimum degree, and on some graphs this is tight up to lg n factors. (ii) The first efficient algorithm for uniform graphlet sampling. The algorithm has a preprocessing phase that uses time O(n k^{2} lg k + m) and space O(n), and a sampling phase that takes k^{O(k)} lg Δ time per sample. It is based on ordering G in a simple way, so as to virtually partition the graphlets into buckets, and then sampling from those buckets using rejection sampling. The algorithm can also be used for counting, with additive guarantees. (iii) A near-optimal algorithm for є-uniform graphlet sampling, with a preprocessing phase that runs in time O(k^{6} є^{−1} n lg n) and space O(n), and a sampling phase that takes k^{O(k)}(1/є)^{10} lg(1/є) expected time per sample. 
The algorithm is based on a nontrivial sketching of the ordering of G, followed by emulating uniform sampling through coupling arguments. @InProceedings{STOC21p1132, author = {Marco Bressan}, title = {Efficient and Near-Optimal Algorithms for Sampling Connected Subgraphs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1132--1143}, doi = {10.1145/3406325.3451042}, year = {2021}, } Publisher's Version 
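As a correctness reference for the uniformity guarantee above, exactly uniform k-graphlet samples can be produced on tiny graphs by full enumeration. This is exponential in general, which is precisely what the paper's bucketed rejection sampling avoids; the sketch below is only a toy baseline.

```python
import random
from itertools import combinations

def induces_connected(nodes, adj):
    """Check that `nodes` induces a connected subgraph (DFS inside the set).
    adj: dict vertex -> list of neighbors."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w in nodes and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == nodes

def uniform_k_graphlet(adj, k, rng=random):
    """Exactly uniform k-graphlet sample by enumerating every connected
    induced k-node subgraph -- a reference implementation for tiny graphs."""
    graphlets = [c for c in combinations(sorted(adj), k)
                 if induces_connected(c, adj)]
    return rng.choice(graphlets)
```

On the path 0-1-2-3, for instance, the only 3-graphlets are {0,1,2} and {1,2,3}, and each is returned with probability exactly 1/2.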

Bringmann, Karl 
STOC '21: "Sparse Nonnegative Convolution ..."
Sparse Nonnegative Convolution Is Equivalent to Dense Nonnegative Convolution
Karl Bringmann, Nick Fischer, and Vasileios Nakos (Saarland University, Germany; MPI-INF, Germany) Computing the convolution A ⋆ B of two length-n vectors A,B is a ubiquitous computational primitive, with applications in a variety of disciplines. Within theoretical computer science, applications range from string problems to Knapsack-type problems, and from 3SUM to All-Pairs Shortest Paths. These applications often come in the form of nonnegative convolution, where the entries of A,B are nonnegative integers. The classical algorithm to compute A⋆ B uses the Fast Fourier Transform (FFT) and runs in time O(n log n). However, in many cases A and B might satisfy sparsity conditions, and hence one could hope for significant gains compared to the standard FFT algorithm. The ideal goal would be an O(k log k)-time algorithm, where k is the number of nonzero elements in the output, i.e., the size of the support of A ⋆ B. This problem is referred to as sparse nonnegative convolution, and has received a considerable amount of attention in the literature; the fastest algorithms to date run in time O(k log^{2} n). The main result of this paper is the first O(k log k)-time algorithm for sparse nonnegative convolution. Our algorithm is randomized and assumes that the length n and the largest entry of A and B are subexponential in k. Surprisingly, we can phrase our algorithm as a reduction from the sparse case to the dense case of nonnegative convolution, showing that, under some mild assumptions, sparse nonnegative convolution is equivalent to dense nonnegative convolution for constant-error randomized algorithms. Specifically, if D(n) is the time to convolve two nonnegative length-n vectors with success probability 2/3, and S(k) is the time to convolve two nonnegative vectors with output size k with success probability 2/3, then S(k) = O(D(k) + k (log log k)^{2}). 
Our approach uses a variety of new techniques in combination with some old machinery from linear sketching and structured linear algebra, as well as new insights on linear hashing, the most classical hash function. @InProceedings{STOC21p1711, author = {Karl Bringmann and Nick Fischer and Vasileios Nakos}, title = {Sparse Nonnegative Convolution Is Equivalent to Dense Nonnegative Convolution}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1711--1724}, doi = {10.1145/3406325.3451090}, year = {2021}, } Publisher's Version 
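The dense/sparse distinction is easy to see in code. Below, a schoolbook dense convolution (the FFT computes the same output in O(n log n)) next to a sparse variant whose cost depends only on the nonzeros; the size of the sparse output is the k of the abstract. A minimal sketch, not the paper's reduction:

```python
def dense_convolve(A, B):
    """Schoolbook O(n^2) convolution of two lists; the FFT computes the
    same result in O(n log n)."""
    out = [0] * (len(A) + len(B) - 1)
    for i, a in enumerate(A):
        for j, b in enumerate(B):
            out[i + j] += a * b
    return out

def sparse_convolve(A, B):
    """Convolve sparse nonnegative vectors given as {index: value} dicts.
    Cost scales with the number of nonzeros, not the vector length."""
    out = {}
    for i, a in A.items():
        for j, b in B.items():
            out[i + j] = out.get(i + j, 0) + a * b
    return out
```

For nonnegative entries no cancellation can occur, so the support of the output is exactly the sumset of the two input supports; this is the structural property the equivalence exploits.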

Brown, Gavin 
STOC '21: "When Is Memorization of Irrelevant ..."
When Is Memorization of Irrelevant Training Data Necessary for High-Accuracy Learning?
Gavin Brown, Mark Bun, Vitaly Feldman, Adam Smith, and Kunal Talwar (Boston University, USA; Apple, USA) Modern machine learning models are complex and frequently encode surprising amounts of information about individual inputs. In extreme cases, complex models appear to memorize entire input examples, including seemingly irrelevant information (social security numbers from text, for example). In this paper, we aim to understand whether this sort of memorization is necessary for accurate learning. We describe natural prediction problems in which every sufficiently accurate training algorithm must encode, in the prediction model, essentially all the information about a large subset of its training examples. This remains true even when the examples are high-dimensional and have entropy much higher than the sample size, and even when most of that information is ultimately irrelevant to the task at hand. Further, our results do not depend on the training algorithm or the class of models used for learning. Our problems are simple and fairly natural variants of the next-symbol prediction and the cluster labeling tasks. These tasks can be seen as abstractions of text- and image-related prediction problems. To establish our results, we reduce from a family of one-way communication problems for which we prove new information complexity lower bounds. @InProceedings{STOC21p123, author = {Gavin Brown and Mark Bun and Vitaly Feldman and Adam Smith and Kunal Talwar}, title = {When Is Memorization of Irrelevant Training Data Necessary for High-Accuracy Learning?}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {123--132}, doi = {10.1145/3406325.3451131}, year = {2021}, } Publisher's Version 

Bruna, Joan 
STOC '21: "Continuous LWE ..."
Continuous LWE
Joan Bruna, Oded Regev, Min Jae Song, and Yi Tang (New York University, USA; University of Michigan, USA) We introduce a continuous analogue of the Learning with Errors (LWE) problem, which we name CLWE. We give a polynomial-time quantum reduction from worst-case lattice problems to CLWE, showing that CLWE enjoys similar hardness guarantees to those of LWE. Alternatively, our result can also be seen as opening new avenues of (quantum) attacks on lattice problems. Our work resolves an open problem regarding the computational complexity of learning mixtures of Gaussians without separability assumptions (Diakonikolas 2016, Moitra 2018). As an additional motivation, (a slight variant of) CLWE was considered in the context of robust machine learning (Diakonikolas et al., FOCS 2017), where hardness in the statistical query (SQ) model was shown; our work addresses the open question regarding its computational hardness (Bubeck et al., ICML 2019). @InProceedings{STOC21p694, author = {Joan Bruna and Oded Regev and Min Jae Song and Yi Tang}, title = {Continuous LWE}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {694--707}, doi = {10.1145/3406325.3451000}, year = {2021}, } Publisher's Version 
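For contrast with the continuous variant, the classic discrete LWE samples (a, b = ⟨a, s⟩ + e mod q) can be sketched as follows. CLWE replaces the discrete secret and noise with continuous Gaussian analogues; the bounded-noise model and all parameters below are illustrative simplifications, not the paper's definitions.

```python
import random

def lwe_samples(s, q, num, noise_bound, rng=random):
    """Generate `num` discrete LWE samples (a, b = <a, s> + e mod q) for a
    secret s in Z_q^n, with noise e drawn uniformly from a small interval.
    A toy template of the discrete problem; CLWE is its continuous analogue."""
    n = len(s)
    out = []
    for _ in range(num):
        a = [rng.randrange(q) for _ in range(n)]
        e = rng.randint(-noise_bound, noise_bound)
        b = (sum(ai * si for ai, si in zip(a, s)) + e) % q
        out.append((a, b))
    return out
```

A holder of the secret can recover the (small, centered) noise from each sample, while without the secret each b looks nearly uniform mod q; that gap is the source of the hardness.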

Bun, Mark 
STOC '21: "When Is Memorization of Irrelevant ..."
When Is Memorization of Irrelevant Training Data Necessary for High-Accuracy Learning?
Gavin Brown, Mark Bun, Vitaly Feldman, Adam Smith, and Kunal Talwar (Boston University, USA; Apple, USA) Modern machine learning models are complex and frequently encode surprising amounts of information about individual inputs. In extreme cases, complex models appear to memorize entire input examples, including seemingly irrelevant information (social security numbers from text, for example). In this paper, we aim to understand whether this sort of memorization is necessary for accurate learning. We describe natural prediction problems in which every sufficiently accurate training algorithm must encode, in the prediction model, essentially all the information about a large subset of its training examples. This remains true even when the examples are high-dimensional and have entropy much higher than the sample size, and even when most of that information is ultimately irrelevant to the task at hand. Further, our results do not depend on the training algorithm or the class of models used for learning. Our problems are simple and fairly natural variants of the next-symbol prediction and the cluster labeling tasks. These tasks can be seen as abstractions of text- and image-related prediction problems. To establish our results, we reduce from a family of one-way communication problems for which we prove new information complexity lower bounds. @InProceedings{STOC21p123, author = {Gavin Brown and Mark Bun and Vitaly Feldman and Adam Smith and Kunal Talwar}, title = {When Is Memorization of Irrelevant Training Data Necessary for High-Accuracy Learning?}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {123--132}, doi = {10.1145/3406325.3451131}, year = {2021}, } Publisher's Version 

Caputo, Pietro 
STOC '21: "Entropy Decay in the Swendsen–Wang ..."
Entropy Decay in the Swendsen–Wang Dynamics on ℤ^{d}
Antonio Blanca, Pietro Caputo, Daniel Parisi, Alistair Sinclair, and Eric Vigoda (Pennsylvania State University, USA; Roma Tre University, Italy; University of California at Berkeley, USA; Georgia Institute of Technology, USA) We study the mixing time of the Swendsen–Wang dynamics for the ferromagnetic Ising and Potts models on the integer lattice ℤ^{d}. This dynamics is a widely used Markov chain that has largely resisted sharp analysis because it is non-local, i.e., it changes the entire configuration in one step. We prove that, whenever strong spatial mixing (SSM) holds, the mixing time on any n-vertex cube in ℤ^{d} is O(log n), and we prove this is tight by establishing a matching lower bound. The previous best known bound was O(n). SSM is a standard condition corresponding to exponential decay of correlations with distance between spins on the lattice and is known to hold in d=2 dimensions throughout the high-temperature (single phase) region. Our result follows from a modified log-Sobolev inequality, which expresses the fact that the dynamics contracts relative entropy at a constant rate at each step. The proof of this fact utilizes a new factorization of the entropy in the joint probability space over spins and edges that underlies the Swendsen–Wang dynamics, which extends to general bipartite graphs of bounded degree. This factorization leads to several additional results, including mixing time bounds for a number of natural local and non-local Markov chains on the joint space, as well as for the standard random-cluster dynamics. @InProceedings{STOC21p1551, author = {Antonio Blanca and Pietro Caputo and Daniel Parisi and Alistair Sinclair and Eric Vigoda}, title = {Entropy Decay in the Swendsen–Wang Dynamics on ℤ<sup><i>d</i></sup>}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1551--1564}, doi = {10.1145/3406325.3451095}, year = {2021}, } Publisher's Version 

Cecchetto, Federica 
STOC '21: "Bridging the Gap between Tree ..."
Bridging the Gap between Tree and Connectivity Augmentation: Unified and Stronger Approaches
Federica Cecchetto, Vera Traub, and Rico Zenklusen (ETH Zurich, Switzerland) We consider the Connectivity Augmentation Problem (CAP), a classical problem in the area of Survivable Network Design. It is about increasing the edge-connectivity of a graph by one unit in the cheapest possible way. More precisely, given a k-edge-connected graph G=(V,E) and a set of extra edges, the task is to find a minimum cardinality subset of extra edges whose addition to G makes the graph (k+1)-edge-connected. If k is odd, the problem is known to reduce to the Tree Augmentation Problem (TAP)—i.e., G is a spanning tree—for which significant progress has been achieved recently, leading to approximation factors below 1.5 (the currently best factor is 1.458). However, advances on TAP did not carry over to CAP so far. Indeed, only very recently, Byrka, Grandoni, and Ameli (STOC 2020) managed to obtain the first approximation factor below 2 for CAP by presenting a 1.91-approximation algorithm based on a method that is disjoint from recent advances for TAP. We first bridge the gap between TAP and CAP by presenting techniques that allow for leveraging insights and methods from TAP to approach CAP. We then introduce a new way to get approximation factors below 1.5, based on a new analysis technique. Through these ingredients, we obtain an improved approximation algorithm for CAP, and therefore also for TAP. This leads to the currently best approximation result for both problems in a unified way, significantly improving on the above-mentioned 1.91-approximation for CAP and also on the previously best approximation factor of 1.458 for TAP by Grandoni, Kalaitzis, and Zenklusen (STOC 2018). Additionally, a feature we inherit from recent TAP advances is that our approach can deal with the weighted setting when the ratio between the largest and smallest cost of the extra links is bounded, in which case we obtain approximation factors below 1.5. 
@InProceedings{STOC21p370, author = {Federica Cecchetto and Vera Traub and Rico Zenklusen}, title = {Bridging the Gap between Tree and Connectivity Augmentation: Unified and Stronger Approaches}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {370--383}, doi = {10.1145/3406325.3451086}, year = {2021}, } Publisher's Version 
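On toy instances the problem statement is easy to render in code. A brute-force solver for the TAP special case (k = 1: make a spanning tree 2-edge-connected with the fewest extra links), exact but exponential, unlike the approximation algorithms the entry is about. A sketch:

```python
from itertools import combinations

def components(vertices, edges):
    """Number of connected components, via union-find with path halving."""
    parent = {v: v for v in vertices}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in vertices})

def two_edge_connected(vertices, edges):
    """2-edge-connected == connected and bridgeless (every single edge
    removal leaves the graph connected)."""
    if components(vertices, edges) != 1:
        return False
    return all(components(vertices, edges[:i] + edges[i + 1:]) == 1
               for i in range(len(edges)))

def min_augmentation(vertices, tree_edges, links):
    """Smallest subset of `links` whose addition makes the tree
    2-edge-connected; brute force over subsets by increasing size."""
    for k in range(len(links) + 1):
        for chosen in combinations(links, k):
            if two_edge_connected(vertices, tree_edges + list(chosen)):
                return list(chosen)
    return None
```

On the path 0-1-2-3 with links {(0,3), (0,2), (1,3)}, a single link (0,3) closes a cycle covering every tree edge, so the optimum has size 1.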

Chase, Zachary 
STOC '21: "Separating Words and Trace ..."
Separating Words and Trace Reconstruction
Zachary Chase (University of Oxford, UK) We prove that for any distinct x,y ∈ {0,1}^{n}, there is a deterministic finite automaton with O(n^{1/3}) states that accepts x but not y. This improves Robson’s 1989 bound of O(n^{2/5}). Using a similar complex-analytic technique, we improve the upper bound on worst-case trace reconstruction, showing that any unknown string x ∈ {0,1}^{n} can be reconstructed with high probability from exp(O(n^{1/5})) independently generated traces. @InProceedings{STOC21p21, author = {Zachary Chase}, title = {Separating Words and Trace Reconstruction}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {21--31}, doi = {10.1145/3406325.3451118}, year = {2021}, } Publisher's Version 
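A classic and much weaker baseline for separating words uses small counter automata: track the position mod p and the parity of 1s at positions congruent to r, an automaton with O(p) states. The search below is a heuristic toy of my own (it can fail to find a separator, and it is unrelated to the paper's complex-analytic O(n^{1/3}) construction):

```python
def parity_separator(x, y, max_p=None):
    """Search for a small 'counter DFA': for each modulus p and residue r,
    the automaton tracks position mod p and the parity of 1s seen at
    positions congruent to r, accepting on odd parity. Returns the first
    (p, r) whose automaton accepts exactly one of x, y, else None.
    A heuristic baseline, not guaranteed to succeed on all pairs."""
    if max_p is None:
        max_p = max(len(x), len(y)) + 1
    def parity(w, p, r):
        return sum(int(c) for i, c in enumerate(w) if i % p == r) % 2
    for p in range(1, max_p + 1):
        for r in range(p):
            if parity(x, p, r) != parity(y, p, r):
                return p, r
    return None
```

For x = "0110" and y = "0101" the overall parities agree, but at modulus 2, residue 0, the strings see one 1 versus none, so a 4-state parity counter already separates them.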

Chattopadhyay, Arkadev 
STOC '21: "Lower Bounds for Monotone ..."
Lower Bounds for Monotone Arithmetic Circuits via Communication Complexity
Arkadev Chattopadhyay, Rajit Datta, and Partha Mukhopadhyay (Tata Institute of Fundamental Research, India; Chennai Mathematical Institute, India) Valiant (1980) showed that general arithmetic circuits with negation can be exponentially more powerful than monotone ones. We give the first improvement to this classical result: we construct a family of polynomials P_{n} in n variables, each of whose monomials has a nonnegative coefficient, such that P_{n} can be computed by a polynomial-size depth-three formula but every monotone circuit computing it has size 2^{Ω(n^{1/4}/log(n))}. The polynomial P_{n} embeds the SINK∘ XOR function devised recently by Chattopadhyay, Mande and Sherif (2020) to refute the Log-Approximate-Rank Conjecture in communication complexity. To prove our lower bound for P_{n}, we develop a general connection between corruption of combinatorial rectangles by any function f ∘ XOR and corruption of product polynomials by a certain polynomial P^{f} that is an arithmetic embedding of f. This connection should be of independent interest. Using further ideas from communication complexity, we construct another family of set-multilinear polynomials f_{n,m} such that both F_{n,m} − є· f_{n,m} and F_{n,m} + є· f_{n,m} have monotone circuit complexity 2^{Ω(n/log(n))} if є ≥ 2^{− Ω( m )}, where F_{n,m} = ∏_{i=1}^{n} (x_{i,1} +⋯+x_{i,m}) and m = O( n/log n ). The polynomials f_{n,m} have 0/1 coefficients and are in VNP. Proving such lower bounds for monotone circuits has been advocated recently by Hrubeš (2020) as a first step towards proving lower bounds against general circuits via his new approach. @InProceedings{STOC21p786, author = {Arkadev Chattopadhyay and Rajit Datta and Partha Mukhopadhyay}, title = {Lower Bounds for Monotone Arithmetic Circuits via Communication Complexity}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {786--799}, doi = {10.1145/3406325.3451069}, year = {2021}, } Publisher's Version 

Chen, Lijie 
STOC '21: "Simple and Fast Derandomization ..."
Simple and Fast Derandomization from Very Hard Functions: Eliminating Randomness at Almost No Cost
Lijie Chen and Roei Tell (Massachusetts Institute of Technology, USA) Extending the classical “hardness-to-randomness” line of work, Doron, Moshkovitz, Oh, and Zuckerman (FOCS 2020) recently proved that derandomization with near-quadratic time overhead is possible, under the assumption that there exists a function in DTIME[2^{n}] that cannot be computed by randomized SVN circuits of size 2^{(1−є)· n} for a small є. In this work we extend their inquiry and answer several open questions that arose from their work. For a time function T(n), consider the following assumption: Non-uniformly secure one-way functions exist, and for δ=δ(є) and k=k_{T}(є) there exists a problem in DTIME[2^{k· n}] that is hard for algorithms that run in time 2^{(k−δ)· n} and use 2^{(1−δ)· n} bits of advice. Under this assumption, we show that: 1. (Worst-case derandomization.) Probabilistic algorithms that run in time T(n) can be deterministically simulated in time n· T(n)^{1+є}. 2. (Average-case derandomization.) For polynomial time functions T(n)=poly(n), we can improve the derandomization time to n^{є}· T(n) if we allow the derandomization to succeed only on average, rather than in the worst case. 3. (Conditional optimality.) For worst-case derandomization, the multiplicative time overhead of n is essentially optimal, conditioned on a counting version of the nondeterministic strong exponential-time hypothesis (i.e., on #NSETH). Lastly, we present an alternative proof for the result of Doron, Moshkovitz, Oh, and Zuckerman that is simpler and more versatile. In fact, we show how to simplify the analysis not only of their construction, but of any construction that “extracts randomness from a pseudoentropic string”. 
@InProceedings{STOC21p283, author = {Lijie Chen and Roei Tell}, title = {Simple and Fast Derandomization from Very Hard Functions: Eliminating Randomness at Almost No Cost}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {283--291}, doi = {10.1145/3406325.3451059}, year = {2021}, } Publisher's Version STOC '21: "Inverse-Exponential Correlation ..." Inverse-Exponential Correlation Bounds and Extremely Rigid Matrices from a New Derandomized XOR Lemma Lijie Chen and Xin Lyu (Massachusetts Institute of Technology, USA; Tsinghua University, China) In this work we prove that there is a function f ∈ E^{ NP} such that, for every sufficiently large n and d = √n/log n, f_{n} (f restricted to n-bit inputs) cannot be (1/2 + 2^{−d})-approximated by F_{2}-polynomials of degree d. We also observe that a minor improvement (e.g., improving d to n^{1/2+ε} for any ε > 0) over our result would imply that E^{ NP} cannot be computed by depth-3 AC^{0} circuits of size 2^{n^{1/2 + ε}}, which is a notoriously hard open question in complexity theory. Using the same proof techniques, we are also able to construct extremely rigid matrices over F_{2} in P^{ NP}. More specifically, we show that for every constant ε ∈ (0,1), there is a P^{ NP} algorithm which on input 1^{n} outputs an n× n F_{2}-matrix H_{n} satisfying R_{H_{n}}(2^{log^{1 − ε} n}) ≥ (1/2 − exp(−log^{2/3 · ε} n) ) · n^{2}, for every sufficiently large n. This improves the recent P^{ NP} constructions of rigid matrices in [Alman and Chen, FOCS 2019] and [Bhangale et al., FOCS 2020], which only give Ω(n^{2}) rigidity. The key ingredient in the proof of our new results is a new derandomized XOR lemma based on approximate linear sums, which roughly says that given an n-input function f which cannot be 0.99-approximated by a certain linear sum of s many functions in F within ℓ_{1}-distance, one can construct a new function Amp^{f} with O(n) input bits, which cannot be (1/2+s^{−Ω(1)})-approximated by F-functions. 
Taking F to be a function collection containing low-degree F_{2}-polynomials or low-rank F_{2}-matrices, our results are then obtained by first using the algorithmic method to construct a function which is weakly hard against linear sums of F in the above sense, and then applying the derandomized XOR lemma to f. We obtain our new derandomized XOR lemma by giving a generalization of the famous hardcore lemma by Impagliazzo. Our generalization in some sense constructs a non-Boolean hardcore of a weakly hard function f with respect to F-functions, from the weak inapproximability of f by any linear sum of F with bounded ℓ_{p}-norm. This generalization recovers the original hardcore lemma by considering the ℓ_{∞}-norm. Surprisingly, when we switch to the ℓ_{1}-norm, we immediately rediscover Levin’s proof of Yao’s XOR Lemma. That is, these first two proofs of Yao’s XOR Lemma can be unified with our new perspective. For proving the correlation bounds, our new derandomized XOR lemma indeed works with the ℓ_{4/3}-norm. @InProceedings{STOC21p761, author = {Lijie Chen and Xin Lyu}, title = {Inverse-Exponential Correlation Bounds and Extremely Rigid Matrices from a New Derandomized XOR Lemma}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {761--771}, doi = {10.1145/3406325.3451132}, year = {2021}, } Publisher's Version Info STOC '21: "Almost Optimal Super-Constant-Pass ..." Almost Optimal Super-Constant-Pass Streaming Lower Bounds for Reachability Lijie Chen, Gillat Kol, Dmitry Paramonov, Raghuvansh R. Saxena, Zhao Song, and Huacheng Yu (Massachusetts Institute of Technology, USA; Princeton University, USA; Institute for Advanced Study at Princeton, USA) We give an almost quadratic n^{2−o(1)} lower bound on the space consumption of any o(√(log n))-pass streaming algorithm solving the (directed) s-t reachability problem. This means that any such algorithm must essentially store the entire graph. 
As corollaries, we obtain almost quadratic space lower bounds for additional fundamental problems, including maximum matching, shortest path, matrix rank, and linear programming. Our main technical contribution is the definition and construction of set hiding graphs, which may be of independent interest: we give a general way of encoding a set S ⊆ [k] as a directed graph with n = k^{1+o(1)} vertices, such that deciding whether i ∈ S boils down to deciding if t_{i} is reachable from s_{i}, for a specific pair of vertices (s_{i},t_{i}) in the graph. Furthermore, we prove that our graph “hides” S, in the sense that no low-space streaming algorithm with a small number of passes can learn (almost) anything about S. @InProceedings{STOC21p570, author = {Lijie Chen and Gillat Kol and Dmitry Paramonov and Raghuvansh R. Saxena and Zhao Song and Huacheng Yu}, title = {Almost Optimal Super-Constant-Pass Streaming Lower Bounds for Reachability}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {570--583}, doi = {10.1145/3406325.3451038}, year = {2021}, } Publisher's Version 

Chen, Sitan 
STOC '21: "Algorithmic Foundations for ..."
Algorithmic Foundations for the Diffraction Limit
Sitan Chen and Ankur Moitra (Massachusetts Institute of Technology, USA) For more than a century and a half it has been widely believed (but never rigorously shown) that the physics of diffraction imposes certain fundamental limits on the resolution of an optical system. However, our understanding of what exactly can and cannot be resolved has never risen above heuristic arguments which, even worse, appear contradictory. In this work we remedy this gap by studying the diffraction limit as a statistical inverse problem and, based on connections to provable algorithms for learning mixture models, we rigorously prove upper and lower bounds on the statistical and algorithmic complexity needed to resolve closely spaced point sources. In particular, we show that there is a phase transition where the sample complexity goes from polynomial to exponential. Surprisingly, we show that this does not occur at the Abbe limit, which has long been presumed to be the true diffraction limit. @InProceedings{STOC21p490, author = {Sitan Chen and Ankur Moitra}, title = {Algorithmic Foundations for the Diffraction Limit}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {490--503}, doi = {10.1145/3406325.3451078}, year = {2021}, } Publisher's Version 

Chen, Zongchen 
STOC '21: "Optimal Mixing of Glauber ..."
Optimal Mixing of Glauber Dynamics: Entropy Factorization via High-Dimensional Expansion
Zongchen Chen, Kuikui Liu, and Eric Vigoda (Georgia Institute of Technology, USA; University of Washington, USA) We prove an optimal mixing time bound for the single-site update Markov chain known as the Glauber dynamics or Gibbs sampling in a variety of settings. Our work presents an improved version of the spectral independence approach of Anari et al. (2020) and shows O(n log n) mixing time on any n-vertex graph of bounded degree when the maximum eigenvalue of an associated influence matrix is bounded. As an application of our results, for the hardcore model on independent sets weighted by a fugacity λ, we establish O(n log n) mixing time for the Glauber dynamics on any n-vertex graph of constant maximum degree Δ when λ<λ_{c}(Δ), where λ_{c}(Δ) is the critical point for the uniqueness/non-uniqueness phase transition on the Δ-regular tree. More generally, for any antiferromagnetic 2-spin system we prove O(n log n) mixing time of the Glauber dynamics on any bounded degree graph in the corresponding tree uniqueness region. Our results apply more broadly; for example, we also obtain O(n log n) mixing for q-colorings of triangle-free graphs of maximum degree Δ when the number of colors satisfies q > αΔ where α ≈ 1.763, and O(m log n) mixing for generating random matchings of any graph with bounded degree and m edges. Our approach is based on two steps. First, we show that the approximate tensorization of entropy (i.e., factorizing entropy into single vertices), which is a key step for establishing the modified log-Sobolev inequality in many previous works, can be deduced from entropy factorization into blocks of fixed linear size. Second, we adapt the local-to-global scheme of Alev and Lau (2020) to establish such block factorization of entropy in a more general setting of pure weighted simplicial complexes satisfying local spectral expansion; this also substantially generalizes the result of Cryan et al. (2019). 
@InProceedings{STOC21p1537, author = {Zongchen Chen and Kuikui Liu and Eric Vigoda}, title = {Optimal Mixing of Glauber Dynamics: Entropy Factorization via High-Dimensional Expansion}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1537--1550}, doi = {10.1145/3406325.3451035}, year = {2021}, } Publisher's Version 
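For concreteness, here is a minimal Python sketch of the chain the abstract studies (not the paper's analysis): the single-site Glauber update for the hardcore model, plus an exact check on a tiny graph, via the full transition matrix, that the hardcore distribution π(I) ∝ λ^{|I|} over independent sets is stationary. The encoding and function names are my own.

```python
import itertools
import random

def glauber_step(G, config, lam, rng=random):
    """One single-site Glauber update for the hardcore model.

    G: adjacency dict {vertex: set of neighbors} with vertices 0..n-1;
    config: dict {vertex: 0/1} encoding an independent set.  Pick a uniformly
    random vertex v; if some neighbor of v is occupied, v must become
    unoccupied; otherwise occupy v with probability lam / (1 + lam).
    """
    v = rng.choice(sorted(G))
    if any(config[u] for u in G[v]):
        config[v] = 0
    else:
        config[v] = 1 if rng.random() < lam / (1 + lam) else 0
    return config

def transition_matrix(G, lam):
    """Exact transition probabilities of the chain, for a small graph only."""
    n = len(G)
    # states: all independent sets, as 0/1 tuples
    states = [s for s in itertools.product((0, 1), repeat=n)
              if not any(s[u] and s[v] for u in G for v in G[u] if u < v)]
    P = {}
    for c in states:
        for v in G:
            if any(c[u] for u in G[v]):      # v is blocked: forced to 0
                outcomes = ((0, 1.0),)
            else:
                outcomes = ((1, lam / (1 + lam)), (0, 1 / (1 + lam)))
            for val, p in outcomes:
                c2 = c[:v] + (val,) + c[v + 1:]
                P[c, c2] = P.get((c, c2), 0.0) + p / n
    return states, P
```

The stationarity check below works because each single-site update satisfies detailed balance with respect to π; the O(n log n) mixing bound is, of course, the paper's contribution, not something this sketch verifies.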

Cheu, Albert 
STOC '21: "The Limits of Pan Privacy ..."
The Limits of Pan Privacy and Shuffle Privacy for Learning and Estimation
Albert Cheu and Jonathan Ullman (Northeastern University, USA) There has been a recent wave of interest in intermediate trust models for differential privacy that eliminate the need for a fully trusted central data collector, but overcome the limitations of local differential privacy. This interest has led to the introduction of the shuffle model (Cheu et al., EUROCRYPT 2019; Erlingsson et al., SODA 2019) and to revisiting the pan-private model (Dwork et al., ITCS 2010). The message of this line of work is that, for a variety of low-dimensional problems—such as counts, means, and histograms—these intermediate models offer nearly as much power as central differential privacy. However, there has been considerably less success using these models for high-dimensional learning and estimation problems. In this work we prove the first non-trivial lower bounds for high-dimensional learning and estimation in both the pan-private model and the general multi-message shuffle model. Our lower bounds apply to a variety of problems—for example, we show that private agnostic learning of parity functions over d bits requires Ω(2^{d/2}) samples in these models, and privately selecting the most common attribute from a set of d choices requires Ω(d^{1/2}) samples, both of which are exponential separations from the central model. @InProceedings{STOC21p1081, author = {Albert Cheu and Jonathan Ullman}, title = {The Limits of Pan Privacy and Shuffle Privacy for Learning and Estimation}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1081--1094}, doi = {10.1145/3406325.3450995}, year = {2021}, } Publisher's Version 

Chuzhoy, Julia 
STOC '21: "Decremental All-Pairs Shortest ..."
Decremental All-Pairs Shortest Paths in Deterministic Near-Linear Time
Julia Chuzhoy (Toyota Technological Institute at Chicago, USA) We study the decremental All-Pairs Shortest Paths (APSP) problem in undirected edge-weighted graphs. The input to the problem is an undirected n-vertex m-edge graph G with non-negative lengths on edges, that undergoes an online sequence of edge deletions. The goal is to support approximate shortest-path queries: given a pair x,y of vertices of G, return a path P connecting x to y, whose length is within factor α of the length of the shortest x-y path, in time Õ(|E(P)|), where α is the approximation factor of the algorithm. APSP is one of the most basic and extensively studied dynamic graph problems. A long line of work culminated in the algorithm of [Chechik, FOCS 2018] with near-optimal guarantees: for any constant 0<ε≤ 1 and parameter k≥ 1, the algorithm achieves approximation factor (2+ε)k−1, and total update time O(mn^{1/k+o(1)}log(nL)), where L is the ratio of longest to shortest edge lengths. Unfortunately, as in much of the prior work, the algorithm is randomized and needs to assume an oblivious adversary; that is, the input edge-deletion sequence is fixed in advance and may not depend on the algorithm’s behavior. In many real-world scenarios, and in applications of APSP to static graph problems, it is crucial that the algorithm works against an adaptive adversary, where the edge-deletion sequence may depend on the algorithm’s past behavior arbitrarily; ideally, such an algorithm should be deterministic. Unfortunately, unlike the oblivious-adversary setting, its adaptive-adversary counterpart is still poorly understood. For unweighted graphs, the algorithm of [Henzinger, Krinninger and Nanongkai, FOCS ’13, SICOMP ’16] achieves a (1+ε)-approximation with total update time Õ(mn/ε); the best current total update time guarantee of n^{2.5+O(ε)} is achieved by the recent deterministic algorithm of [Chuzhoy, Saranurak, SODA ’21], with 2^{O(1/ε)}-multiplicative and 2^{O(log^{3/4} n/ε)}-additive approximation. 
To the best of our knowledge, for arbitrary non-negative edge weights, the fastest current adaptive-update algorithm has total update time O(n^{3}log L/ε), achieving a (1+ε)-approximation. Even if we are willing to settle for any o(n)-approximation factor, no currently known algorithm has a better than Θ(n^{3}) total update time in weighted graphs and better than Θ(n^{2.5}) total update time in unweighted graphs. Several conditional lower bounds suggest that no algorithm with a sufficiently small approximation factor can achieve an o(n^{3}) total update time. Our main result is a deterministic algorithm for decremental APSP in undirected edge-weighted graphs, that, for any Ω(1/loglog m)≤ ε< 1, achieves approximation factor (log m)^{2^{O(1/ε)}}, with total update time O(m^{1+O(ε)}· (log m)^{O(1/ε^{2})}· log L). In particular, we obtain a (polylog m)-approximation in time Õ(m^{1+ε}) for any constant ε, and, for any slowly growing function f(m), we obtain a (log m)^{f(m)}-approximation in time m^{1+o(1)}. We also provide an algorithm with similar guarantees for decremental Sparse Neighborhood Covers. @InProceedings{STOC21p626, author = {Julia Chuzhoy}, title = {Decremental All-Pairs Shortest Paths in Deterministic Near-Linear Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {626--639}, doi = {10.1145/3406325.3451025}, year = {2021}, } Publisher's Version 

Cohen, Alex 
STOC '21: "Structure vs. Randomness for ..."
Structure vs. Randomness for Bilinear Maps
Alex Cohen and Guy Moshkovitz (Yale University, USA; City University of New York, USA) We prove that the slice rank of a 3-tensor (a combinatorial notion introduced by Tao in the context of the cap-set problem), the analytic rank (a Fourier-theoretic notion introduced by Gowers and Wolf), and the geometric rank (a recently introduced algebro-geometric notion) are all equivalent up to an absolute constant. As a corollary, we obtain strong trade-offs on the arithmetic complexity of a biased bilinear map, and on the separation between computing a bilinear map exactly and on average. Our result settles open questions of Haramaty and Shpilka [STOC 2010], and of Lovett [Discrete Anal., 2019] for 3-tensors. @InProceedings{STOC21p800, author = {Alex Cohen and Guy Moshkovitz}, title = {Structure vs. Randomness for Bilinear Maps}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {800--808}, doi = {10.1145/3406325.3451007}, year = {2021}, } Publisher's Version 

Cohen, Gil 
STOC '21: "Expander Random Walks: A Fourier-Analytic ..."
Expander Random Walks: A Fourier-Analytic Approach
Gil Cohen, Noam Peri, and Amnon Ta-Shma (Tel Aviv University, Israel) In this work we ask the following basic question: assume the vertices of an expander graph are labelled by 0,1. What “test” functions f : {0,1}^{t} → {0,1} cannot distinguish t independent samples from those obtained by a random walk? The expander hitting property due to Ajtai, Komlós and Szemerédi (STOC 1987) is captured by the AND test function, whereas the fundamental expander Chernoff bound due to Gillman (SICOMP 1998) and Healy (Computational Complexity 2008) is about test functions indicating whether the weight is close to the mean. In fact, it is known that all threshold functions are fooled by a random walk (Kipnis and Varadhan, Communications in Mathematical Physics 1986). Recently, it was shown that even the highly sensitive PARITY function is fooled by a random walk (Ta-Shma, STOC 2017). We focus on balanced labels. Our first main result is proving that all symmetric functions are fooled by a random walk. Put differently, we prove a central limit theorem (CLT) for expander random walks with respect to the total variation distance, significantly strengthening the classic CLT for Markov chains that is established with respect to the Kolmogorov distance (Kipnis and Varadhan, Communications in Mathematical Physics 1986). Our approach significantly deviates from prior works. We first study how well a Fourier character χ_{S} is fooled by a random walk as a function of S. Then, given a test function f, we expand f in the Fourier basis and combine the above with known results on the Fourier spectrum of f. We also proceed further and consider general test functions, not necessarily symmetric. As our approach is Fourier analytic, it is general enough to analyze such versatile test functions. For our second result, we prove that random walks on sufficiently good expander graphs fool test functions computed by AC^{0} circuits, read-once branching programs, and functions with bounded query complexity. 
@InProceedings{STOC21p1643, author = {Gil Cohen and Noam Peri and Amnon Ta-Shma}, title = {Expander Random Walks: A Fourier-Analytic Approach}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1643--1655}, doi = {10.1145/3406325.3451049}, year = {2021}, } Publisher's Version Info 
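As a toy illustration of the question the abstract asks, the following Python sketch computes, exactly via a signed transfer-matrix DP (no sampling), the advantage E[(−1)^{∑ labels}] of the PARITY test on a t-step random walk versus t independent samples. The choice of graph (the complete graph K_6 with a balanced labelling) and all names are my own, not the paper's.

```python
def parity_bias_of_walk(adj, labels, t):
    """Exact bias E[(-1)^(sum of labels seen)] over t-step random walks.

    adj: list of neighbor lists; the walk starts at a uniformly random vertex
    and takes t-1 uniform steps, reading a 0/1 label at each of the t vertices
    visited.  For balanced labels, t independent samples have bias exactly 0,
    so a small |bias| here means the walk "fools" the PARITY test.
    """
    n = len(adj)
    # f[v] = sum over length-1 walks ending at v of prob * (-1)^(labels seen)
    f = [(-1) ** labels[v] / n for v in range(n)]
    for _ in range(t - 1):
        g = [0.0] * n
        for u in range(n):
            share = f[u] / len(adj[u])      # uniform step to each neighbor
            for v in adj[u]:
                g[v] += share
        f = [(-1) ** labels[v] * g[v] for v in range(n)]
    return sum(f)
```

On K_6 with three 0-labels and three 1-labels, the bias decays quickly with t, consistent with the theme that good expanders fool PARITY; the DP itself is exact for any labelled graph.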

Cohen-Addad, Vincent 
STOC '21: "A Quasipolynomial (2 + ε)-Approximation ..."
A Quasipolynomial (2 + ε)-Approximation for Planar Sparsest Cut
Vincent Cohen-Addad, Anupam Gupta, Philip N. Klein, and Jason Li (Google, Switzerland; Carnegie Mellon University, USA; Brown University, USA) The (non-uniform) sparsest cut problem is the following graph-partitioning problem: given a “supply” graph, and demands on pairs of vertices, delete some subset of supply edges to minimize the ratio of the supply edges cut to the total demand of the pairs separated by this deletion. Despite much effort, there are only a handful of non-trivial classes of supply graphs for which constant-factor approximations are known. We consider the problem for planar graphs, and give a (2+ε)-approximation algorithm that runs in quasipolynomial time. Our approach defines a new structural decomposition of an optimal solution using a “patching” primitive. We combine this decomposition with a Sherali-Adams-style linear programming relaxation of the problem, which we then round. This should be compared with the polynomial-time approximation algorithm of Rao (1999), which uses the metric linear programming relaxation and ℓ_{1}-embeddings, and achieves an O(√(log n))-approximation in polynomial time. @InProceedings{STOC21p1056, author = {Vincent Cohen-Addad and Anupam Gupta and Philip N. Klein and Jason Li}, title = {A Quasipolynomial (2 + <i>ε</i>)-Approximation for Planar Sparsest Cut}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1056--1069}, doi = {10.1145/3406325.3451103}, year = {2021}, } Publisher's Version STOC '21: "A New Coreset Framework for ..." A New Coreset Framework for Clustering Vincent Cohen-Addad, David Saulpic, and Chris Schwiegelshohn (Google, Switzerland; Sorbonne University, France; CNRS, France; LIP6, France; Aarhus University, Denmark) Given a metric space, the (k,z)-clustering problem consists of finding k centers such that the sum of the distances, raised to the power z, from every point to its closest center is minimized. This encapsulates the famous k-median (z=1) and k-means (z=2) clustering problems. 
Designing small-space sketches of the data that approximately preserve the cost of the solutions, also known as coresets, has been an important research direction over the last 15 years. In this paper, we present a new, simple coreset framework that simultaneously improves upon the best known bounds for a large variety of settings, ranging from Euclidean spaces and doubling metrics to minor-free metrics and the general metric case. @InProceedings{STOC21p169, author = {Vincent Cohen-Addad and David Saulpic and Chris Schwiegelshohn}, title = {A New Coreset Framework for Clustering}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {169--182}, doi = {10.1145/3406325.3451022}, year = {2021}, } Publisher's Version Info 

Croquevielle, Luis Alberto 
STOC '21: "A Polynomial-Time Approximation ..."
A Polynomial-Time Approximation Algorithm for Counting Words Accepted by an NFA (Invited Paper)
Marcelo Arenas, Luis Alberto Croquevielle, Rajesh Jayaram, and Cristian Riveros (PUC, Chile; IMFD, Chile; Carnegie Mellon University, USA) Counting the number of words of a certain length accepted by a non-deterministic finite automaton (NFA) is a fundamental problem, which has many applications in different areas such as graph databases, knowledge compilation, and information extraction. Along with this, generating such words uniformly at random is also a relevant problem, particularly in scenarios where returning varied outputs is a desirable feature. The previous problems are formalized as follows. The input of #NFA is an NFA N and a length k given in unary (that is, given as a string 0^{k}), and the task is to compute the number of strings of length k accepted by N. The input of GEN-NFA is the same as #NFA, but now the task is to generate, uniformly at random, a string of length k accepted by N. It is known that #NFA is #P-complete, so an efficient algorithm to compute this function exactly is not expected to exist. However, this does not preclude the existence of an efficient approximation algorithm for it. In this talk, we will show that #NFA admits a fully polynomial-time randomized approximation scheme (FPRAS). Prior to our work, it was open whether #NFA admits an FPRAS; in fact, the best randomized approximation scheme known for #NFA ran in time n^{O(log n)}. Besides, we will mention some consequences and applications of our results. In particular, from well-known results on counting and uniform generation, we obtain that GEN-NFA admits a fully polynomial-time almost uniform generator. Moreover, as #NFA is SpanL-complete under polynomial-time parsimonious reductions, we obtain that every function in the complexity class SpanL admits an FPRAS. 
@InProceedings{STOC21p4, author = {Marcelo Arenas and Luis Alberto Croquevielle and Rajesh Jayaram and Cristian Riveros}, title = {A Polynomial-Time Approximation Algorithm for Counting Words Accepted by an NFA (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {4--4}, doi = {10.1145/3406325.3465353}, year = {2021}, } Publisher's Version STOC '21: "When Is Approximate Counting ..." When Is Approximate Counting for Conjunctive Queries Tractable? Marcelo Arenas, Luis Alberto Croquevielle, Rajesh Jayaram, and Cristian Riveros (PUC, Chile; IMFD, Chile; Carnegie Mellon University, USA) Conjunctive queries are one of the most common classes of queries used in database systems, and the best studied in the literature. A seminal result of Grohe, Schwentick, and Segoufin (STOC 2001) demonstrates that for every class G of graphs, the evaluation of all conjunctive queries whose underlying graph is in G is tractable if, and only if, G has bounded treewidth. In this work, we extend this characterization to the counting problem for conjunctive queries. Specifically, for every class C of conjunctive queries with bounded treewidth, we introduce the first fully polynomial-time randomized approximation scheme (FPRAS) for counting answers to a query in C, and the first polynomial-time algorithm for sampling answers uniformly from a query in C. As a corollary, it follows that for every class G of graphs, the counting problem for conjunctive queries whose underlying graph is in G admits an FPRAS if, and only if, G has bounded treewidth (unless BPP is different from P). In fact, our FPRAS is more general, and also applies to conjunctive queries with bounded hypertree width, as well as unions of such queries. The key ingredient in our proof is the resolution of a fundamental counting problem from automata theory. 
Specifically, we demonstrate the first FPRAS and polynomial-time sampler for the set of trees of size n accepted by a tree automaton, which improves the prior quasi-polynomial time randomized approximation scheme (QPRAS) and sampling algorithm of Gore, Jerrum, Kannan, Sweedyk, and Mahaney ’97. We demonstrate how this algorithm can be used to obtain an FPRAS for many open problems, such as counting solutions to constraint satisfaction problems (CSPs) with bounded hypertree width, counting the number of error threads in programs with nested call subroutines, and counting valid assignments to structured DNNF circuits. @InProceedings{STOC21p1015, author = {Marcelo Arenas and Luis Alberto Croquevielle and Rajesh Jayaram and Cristian Riveros}, title = {When Is Approximate Counting for Conjunctive Queries Tractable?}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1015--1027}, doi = {10.1145/3406325.3451014}, year = {2021}, } Publisher's Version 
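To make the #NFA problem concrete, here is a small Python sketch of the naive exact approach that the FPRAS above improves upon: dynamic programming over sets of reachable NFA states (on-the-fly determinization), which counts each accepted word exactly once but may pass through exponentially many state-sets. This is illustrative only and is not the paper's algorithm; the encoding and names are my own.

```python
def count_accepted(nfa_delta, start, accept, alphabet, k):
    """Count the length-k words accepted by an NFA, exactly.

    nfa_delta: dict (state, symbol) -> set of successor states.  A word
    accepted along several runs must be counted once, so the DP tracks the
    number of words leading to each *set* of reachable states (the subset
    construction), never individual NFA states.
    """
    counts = {frozenset([start]): 1}   # reachable state-set -> word count
    for _ in range(k):
        nxt = {}
        for S, c in counts.items():
            for a in alphabet:
                T = frozenset(q for s in S for q in nfa_delta.get((s, a), ()))
                if T:                   # empty set can never accept; drop it
                    nxt[T] = nxt.get(T, 0) + c
        counts = nxt
    return sum(c for S, c in counts.items() if S & accept)
```

For example, for a 3-state NFA accepting binary words that contain "01", there are 4 accepted words of length 3 (all 8 words except the 4 of the form 1…10…0), which the DP reproduces.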

Curticapean, Radu 
STOC '21: "A Full Complexity Dichotomy ..."
A Full Complexity Dichotomy for Immanant Families
Radu Curticapean (IT University of Copenhagen, Denmark) Given an integer n ≥ 1 and an irreducible character χ_{λ} of S_{n} for some partition λ of n, the immanant imm_{λ}:ℂ^{n× n}→ℂ maps matrices A∈ℂ^{n× n} to imm_{λ}(A)=∑_{π∈ S_n}χ_{λ}(π)∏_{i=1}^{n}A_{i,π(i)}. Important special cases include the determinant and permanent, which are obtained from the sign and trivial character, respectively. It is known that immanants can be evaluated in polynomial time for characters that are “close” to the sign character: Given a partition λ of n with s parts, let b(λ):=n−s count the boxes to the right of the first column in the Young diagram of λ. For a family of partitions Λ, let b(Λ) := max_{λ∈Λ}b(λ) and write Imm(Λ) for the problem of evaluating imm_{λ}(A) on input A and λ∈Λ. On the positive side, if b(Λ)<∞, then Imm(Λ) is known to be polynomial-time computable. This subsumes the case of the determinant. Conversely, if b(Λ)=∞, then previously known hardness results suggest that Imm(Λ) cannot be solved in polynomial time. However, these results only address certain restricted classes of families Λ. In this paper, we show that the assumption FPT ≠ #W[1] from parameterized complexity rules out polynomial-time algorithms for Imm(Λ) for any computationally reasonable family of partitions Λ with b(Λ)=∞. We give an analogous result in algebraic complexity under the assumption VFPT ≠ VW[1]. Furthermore, if b(λ) even grows polynomially on Λ, we show that Imm(Λ) is hard for #P and VNP. This concludes a series of partial results on the complexity of immanants obtained over the last 35 years. @InProceedings{STOC21p1770, author = {Radu Curticapean}, title = {A Full Complexity Dichotomy for Immanant Families}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1770--1783}, doi = {10.1145/3406325.3451124}, year = {2021}, } Publisher's Version 
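A direct Python transcription of the defining formula above may help fix notation (brute force over all of S_n, so tiny n only): with the sign character it computes the determinant and with the trivial character the permanent. The helper names are mine.

```python
from itertools import permutations

def immanant(A, character):
    """imm_chi(A) = sum over permutations pi of chi(pi) * prod_i A[i][pi[i]].

    `character` maps a permutation (given as a tuple) to its character value.
    This is exactly the definition in the abstract, evaluated in n! time.
    """
    n = len(A)
    total = 0
    for pi in permutations(range(n)):
        prod = 1
        for i in range(n):
            prod *= A[i][pi[i]]
        total += character(pi) * prod
    return total

def sign(pi):
    """Sign character: (-1)^(number of inversions); gives the determinant."""
    inv = sum(1 for i in range(len(pi))
              for j in range(i + 1, len(pi)) if pi[i] > pi[j])
    return -1 if inv % 2 else 1

def trivial(pi):
    """Trivial character: constant 1; gives the permanent."""
    return 1
```

The dichotomy in the paper concerns which other characters χ_λ, between these two extremes, keep this sum tractable.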

Dagan, Yuval 
STOC '21: "Adversarial Laws of Large ..."
Adversarial Laws of Large Numbers and Optimal Regret in Online Classification
Noga Alon, Omri Ben-Eliezer, Yuval Dagan, Shay Moran, Moni Naor, and Eylon Yogev (Princeton University, USA; Tel Aviv University, Israel; Harvard University, USA; Massachusetts Institute of Technology, USA; Technion, Israel; Google Research, Israel; Weizmann Institute of Science, Israel; Boston University, USA) Laws of large numbers guarantee that given a large enough sample from some population, the measure of any fixed subpopulation is well-estimated by its frequency in the sample. We study laws of large numbers in sampling processes that can affect the environment they are acting upon and interact with it. Specifically, we consider the sequential sampling model proposed by Ben-Eliezer and Yogev (2020), and characterize the classes which admit a uniform law of large numbers in this model: these are exactly the classes that are online learnable. Our characterization may be interpreted as an online analogue to the equivalence between learnability and uniform convergence in statistical (PAC) learning. The sample-complexity bounds we obtain are tight for many parameter regimes, and as an application, we determine the optimal regret bounds in online learning, stated in terms of Littlestone’s dimension, thus resolving the main open question from Ben-David, Pál, and Shalev-Shwartz (2009), which was also posed by Rakhlin, Sridharan, and Tewari (2015). @InProceedings{STOC21p447, author = {Noga Alon and Omri Ben-Eliezer and Yuval Dagan and Shay Moran and Moni Naor and Eylon Yogev}, title = {Adversarial Laws of Large Numbers and Optimal Regret in Online Classification}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {447--455}, doi = {10.1145/3406325.3451041}, year = {2021}, } Publisher's Version STOC '21: "Learning Ising Models from ..." 
Learning Ising Models from One or Multiple Samples Yuval Dagan, Constantinos Daskalakis, Nishanth Dikkala, and Anthimos Vardis Kandiros (Massachusetts Institute of Technology, USA; Google, USA) There have been two main lines of work on estimating Ising models: (1) estimating them from multiple independent samples under minimal assumptions about the model's interaction matrix; and (2) estimating them from one sample in restrictive settings. We propose a unified framework that smoothly interpolates between these two settings, enabling significantly richer estimation guarantees from one, a few, or many samples. Our main theorem provides guarantees for one-sample estimation, quantifying the estimation error in terms of the metric entropy of a family of interaction matrices. As corollaries of our main theorem, we derive bounds when the model's interaction matrix is a (sparse) linear combination of known matrices, or it belongs to a finite set, or to a high-dimensional manifold. In fact, our main result handles multiple independent samples by viewing them as one sample from a larger model, and can be used to derive estimation bounds that are qualitatively similar to those obtained in the aforedescribed multiple-sample literature. Our technical approach benefits from sparsifying a model's interaction network, conditioning on subsets of variables that make the dependencies in the resulting conditional distribution sufficiently weak. We use this sparsification technique to prove strong concentration and anti-concentration results for the Ising model, which we believe have applications beyond the scope of this paper. @InProceedings{STOC21p161, author = {Yuval Dagan and Constantinos Daskalakis and Nishanth Dikkala and Anthimos Vardis Kandiros}, title = {Learning Ising Models from One or Multiple Samples}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {161--168}, doi = {10.1145/3406325.3451074}, year = {2021}, } Publisher's Version 

Dalirrooyfard, Mina 
STOC '21: "Tight Conditional Lower Bounds ..."
Tight Conditional Lower Bounds for Approximating Diameter in Directed Graphs
Mina Dalirrooyfard and Nicole Wein (Massachusetts Institute of Technology, USA) Among the most fundamental graph parameters is the Diameter, the largest distance between any pair of vertices in a graph. Computing the Diameter of a graph with m edges requires m^{2−o(1)} time under the Strong Exponential Time Hypothesis (SETH), which can be prohibitive for very large graphs, so efficient approximation algorithms for Diameter are desired. There is a folklore algorithm that gives a 2-approximation for Diameter in Õ(m) time (where Õ notation suppresses logarithmic factors). Additionally, a line of work [SODA ’96, STOC ’13, SODA ’14] concludes with a 3/2-approximation algorithm for Diameter in weighted directed graphs that runs in Õ(m^{3/2}) time. For directed graphs, these are the only known approximation algorithms for Diameter. The 3/2-approximation algorithm is known to be tight under SETH: Roditty and Vassilevska W. [STOC ’13] proved that under SETH any (3/2−ε)-approximation algorithm for Diameter in undirected unweighted graphs requires m^{2−o(1)} time, and then Backurs, Roditty, Segal, Vassilevska W., and Wein [STOC ’18] and the follow-up work of Li proved that under SETH any (5/3−ε)-approximation algorithm for Diameter in undirected unweighted graphs requires m^{3/2−o(1)} time. Whether or not the folklore 2-approximation algorithm is tight, however, is unknown, and this has been explicitly posed as an open problem in numerous papers. Towards this question, Bonnet recently proved that under SETH, any (7/4−ε)-approximation requires m^{4/3−o(1)} time, only for directed weighted graphs. We completely resolve this question for directed graphs by proving that the folklore 2-approximation algorithm is conditionally optimal. In doing so, we obtain a series of conditional lower bounds that, together with prior work, give a complete time-accuracy trade-off that is tight with the three known algorithms for directed graphs. 
Specifically, we prove that under SETH, for any δ>0, a ((2k−1)/k−δ)-approximation algorithm for Diameter on directed unweighted graphs requires m^{k/(k−1)−o(1)} time. @InProceedings{STOC21p1697, author = {Mina Dalirrooyfard and Nicole Wein}, title = {Tight Conditional Lower Bounds for Approximating Diameter in Directed Graphs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1697--1710}, doi = {10.1145/3406325.3451130}, year = {2021}, } Publisher's Version 
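For reference, the folklore Õ(m)-time 2-approximation that the abstract mentions can be sketched as follows in Python (assuming a strongly connected unweighted digraph; helper names are mine): one forward and one backward BFS from an arbitrary vertex v give est = max(ecc_out(v), ecc_in(v)), and the triangle inequality d(s,t) ≤ d(s,v) + d(v,t) yields est ≤ D ≤ 2·est.

```python
from collections import deque

def bfs_ecc(adj, src):
    """Largest BFS distance from src; assumes all vertices are reachable."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return max(dist.values())

def diameter_2approx(adj, radj, v=0):
    """Folklore 2-approximation for directed unweighted Diameter.

    adj: out-neighbor lists; radj: in-neighbor lists (the reversed graph).
    Returns est with est <= D <= 2*est for a strongly connected graph:
    d(s,t) <= d(s,v) + d(v,t) <= ecc_in(v) + ecc_out(v) <= 2*est,
    while each eccentricity is itself at most D.
    """
    return max(bfs_ecc(adj, v), bfs_ecc(radj, v))
```

The lower bound quoted above says that, under SETH, no near-linear-time algorithm can beat the factor 2 achieved by this simple estimator on directed graphs.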

Daskalakis, Constantinos 
STOC '21: "Sample-Optimal and Efficient ..."
Sample-Optimal and Efficient Learning of Tree Ising Models
Constantinos Daskalakis and Qinxuan Pan (Massachusetts Institute of Technology, USA) We show that n-variable tree-structured Ising models can be learned computationally efficiently to within total variation distance ε from an optimal O(n ln n/ε^{2}) samples, where O(·) hides an absolute constant which, importantly, does not depend on the model being learned—neither its tree nor the magnitude of its edge strengths, on which we place no assumptions. Our guarantees hold, in fact, for the celebrated Chow-Liu algorithm [1968], using the plug-in estimator for estimating mutual information. While this (or any other) algorithm may fail to identify the structure of the underlying model correctly from a finite sample, we show that it will still learn a tree-structured model that is ε-close to the true one in total variation distance, a guarantee called “proper learning.” Our guarantees do not follow from known results for the Chow-Liu algorithm and the ensuing literature on learning graphical models, including the very recent renaissance of algorithms for this learning challenge, which only yield asymptotic consistency results, or sample-suboptimal and/or time-inefficient algorithms, unless further assumptions are placed on the model, such as bounds on the “strengths” of the model’s edges. While we establish guarantees for a widely known and simple algorithm, the analysis that this algorithm succeeds and is sample-optimal is quite complex, requiring a hierarchical classification of the edges into layers with different reconstruction guarantees, depending on their strength, combined with delicate uses of the subadditivity of the squared Hellinger distance over graphical models to control the error accumulation. 
@InProceedings{STOC21p133, author = {Constantinos Daskalakis and Qinxuan Pan}, title = {Sample-Optimal and Efficient Learning of Tree Ising Models}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {133--146}, doi = {10.1145/3406325.3451006}, year = {2021}, } Publisher's Version STOC '21: "The Complexity of Constrained ..." The Complexity of Constrained Min-Max Optimization Constantinos Daskalakis, Stratis Skoulakis, and Manolis Zampetakis (Massachusetts Institute of Technology, USA; Singapore University of Technology and Design, Singapore; University of California at Berkeley, USA) Despite its important applications in Machine Learning, min-max optimization of objective functions that are nonconvex-nonconcave remains elusive. Not only are there no known first-order methods converging to even approximate local min-max equilibria (a.k.a. approximate saddle points), but the computational complexity of identifying them is also poorly understood. In this paper, we provide a characterization of the computational complexity as well as of the limitations of first-order methods in this problem. Specifically, we show that in linearly constrained min-max optimization problems with nonconvex-nonconcave objectives an approximate local min-max equilibrium of large enough approximation is guaranteed to exist, but computing such a point is PPAD-complete. The same is true of computing an approximate fixed point of the (Projected) Gradient Descent/Ascent update dynamics, which is computationally equivalent to computing approximate local min-max equilibria. An important byproduct of our proof is to establish an unconditional hardness result in the Nemirovsky-Yudin 1983 oracle optimization model, where we are given oracle access to the values of some function f : P → [−1, 1] and its gradient ∇f, where P ⊆ [0, 1]^{d} is a known convex polytope. 
We show that any algorithm that uses such first-order oracle access to f and finds an ε-approximate local min-max equilibrium needs to make a number of oracle queries that is exponential in at least one of 1/ε, L, G, or d, where L and G are respectively the smoothness and Lipschitzness of f. This comes in sharp contrast to minimization problems, where finding approximate local minima in the same setting can be done with Projected Gradient Descent using O(L/ε) many queries. Our result is the first to show an exponential separation between these two fundamental optimization problems in the oracle model. @InProceedings{STOC21p1466, author = {Constantinos Daskalakis and Stratis Skoulakis and Manolis Zampetakis}, title = {The Complexity of Constrained Min-Max Optimization}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1466--1478}, doi = {10.1145/3406325.3451125}, year = {2021}, } Publisher's Version STOC '21: "Learning Ising Models from ..." Learning Ising Models from One or Multiple Samples Yuval Dagan, Constantinos Daskalakis, Nishanth Dikkala, and Anthimos Vardis Kandiros (Massachusetts Institute of Technology, USA; Google, USA) There have been two main lines of work on estimating Ising models: (1) estimating them from multiple independent samples under minimal assumptions about the model's interaction matrix; and (2) estimating them from one sample in restrictive settings. We propose a unified framework that smoothly interpolates between these two settings, enabling significantly richer estimation guarantees from one, a few, or many samples. Our main theorem provides guarantees for one-sample estimation, quantifying the estimation error in terms of the metric entropy of a family of interaction matrices. As corollaries of our main theorem, we derive bounds when the model's interaction matrix is a (sparse) linear combination of known matrices, or it belongs to a finite set, or to a high-dimensional manifold. 
In fact, our main result handles multiple independent samples by viewing them as one sample from a larger model, and can be used to derive estimation bounds that are qualitatively similar to those obtained in the aforedescribed multiple-sample literature. Our technical approach benefits from sparsifying a model's interaction network, conditioning on subsets of variables that make the dependencies in the resulting conditional distribution sufficiently weak. We use this sparsification technique to prove strong concentration and anti-concentration results for the Ising model, which we believe have applications beyond the scope of this paper. @InProceedings{STOC21p161, author = {Yuval Dagan and Constantinos Daskalakis and Nishanth Dikkala and Anthimos Vardis Kandiros}, title = {Learning Ising Models from One or Multiple Samples}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {161--168}, doi = {10.1145/3406325.3451074}, year = {2021}, } Publisher's Version 
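The (Projected) Gradient Descent/Ascent dynamics featured in the Constrained Min-Max Optimization abstract above can be sketched concretely. This is a toy illustration with hypothetical names, on the box [0,1]^d: the min player descends in x, the max player ascends in y, and both are projected back onto the box.

```python
def projected_gda(grad_x, grad_y, x0, y0, lr=0.05, steps=2000):
    """Projected Gradient Descent/Ascent on [0,1]^d.
    grad_x, grad_y: callables returning the gradient of f in x and y.
    An approximate fixed point of this update rule is exactly the object
    whose computation the hardness result above concerns."""
    clip = lambda v: min(1.0, max(0.0, v))  # projection onto [0, 1]
    x, y = list(x0), list(y0)
    for _ in range(steps):
        gx, gy = grad_x(x, y), grad_y(x, y)
        x_new = [clip(xi - lr * gi) for xi, gi in zip(x, gx)]  # descent
        y_new = [clip(yi + lr * gi) for yi, gi in zip(y, gy)]  # ascent
        x, y = x_new, y_new
    return x, y
```

On a convex-concave toy objective such as f(x,y) = (x−1/2)² − (y−1/2)², these iterates converge to the saddle point (1/2, 1/2); in the nonconvex-nonconcave regime the abstract concerns, the dynamics can cycle, and finding an approximate fixed point is PPAD-complete.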

Datta, Rajit 
STOC '21: "Lower Bounds for Monotone ..."
Lower Bounds for Monotone Arithmetic Circuits via Communication Complexity
Arkadev Chattopadhyay, Rajit Datta, and Partha Mukhopadhyay (Tata Institute of Fundamental Research, India; Chennai Mathematical Institute, India) Valiant (1980) showed that general arithmetic circuits with negation can be exponentially more powerful than monotone ones. We give the first improvement to this classical result: we construct a family of polynomials P_{n} in n variables, each of whose monomials has a nonnegative coefficient, such that P_{n} can be computed by a polynomial-size depth-three formula but every monotone circuit computing it has size 2^{Ω(n^{1/4}/log(n))}. The polynomial P_{n} embeds the SINK∘XOR function devised recently by Chattopadhyay, Mande and Sherif (2020) to refute the Log-Approximate-Rank Conjecture in communication complexity. To prove our lower bound for P_{n}, we develop a general connection between corruption of combinatorial rectangles by any function f∘XOR and corruption of product polynomials by a certain polynomial P^{f} that is an arithmetic embedding of f. This connection should be of independent interest. Using further ideas from communication complexity, we construct another family of set-multilinear polynomials f_{n,m} such that both F_{n,m} − ε·f_{n,m} and F_{n,m} + ε·f_{n,m} have monotone circuit complexity 2^{Ω(n/log(n))} if ε ≥ 2^{−Ω(m)}, where F_{n,m} = ∏_{i=1}^{n} (x_{i,1} +⋯+x_{i,m}) and m = O(n/log n). The polynomials f_{n,m} have 0/1 coefficients and are in VNP. Proving such lower bounds for monotone circuits has been advocated recently by Hrubeš (2020) as a first step towards proving lower bounds against general circuits via his new approach. @InProceedings{STOC21p786, author = {Arkadev Chattopadhyay and Rajit Datta and Partha Mukhopadhyay}, title = {Lower Bounds for Monotone Arithmetic Circuits via Communication Complexity}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {786--799}, doi = {10.1145/3406325.3451069}, year = {2021}, } Publisher's Version 

De, Anindya 
STOC '21: "Robust Testing of Low Dimensional ..."
Robust Testing of Low Dimensional Functions
Anindya De, Elchanan Mossel, and Joe Neeman (University of Pennsylvania, USA; Massachusetts Institute of Technology, USA; University of Texas at Austin, USA) A natural problem in high-dimensional inference is to decide if a classifier f:ℝ^{n} → {−1,1} depends on a small number of linear directions of its input data. Call a function g: ℝ^{n} → {−1,1} a linear k-junta if it is completely determined by some k-dimensional subspace of the input space. A recent work of the authors showed that linear k-juntas are testable. Thus there exists an algorithm to distinguish between: (1) f: ℝ^{n} → {−1,1} which is a linear k-junta with surface area s. (2) f is ε-far from any linear k-junta with surface area (1+ε)s. The query complexity of the algorithm is independent of the ambient dimension n. Following the surge of interest in noise-tolerant property testing, in this paper we prove a noise-tolerant (or robust) version of this result. Namely, we give an algorithm which, given any c>0, ε>0, distinguishes between: (1) f: ℝ^{n} → {−1,1} has correlation at least c with some linear k-junta with surface area s. (2) f has correlation at most c−ε with any linear k-junta with surface area at most s. The query complexity of our tester is k^{poly(s/ε)}. Using our techniques, we also obtain a fully noise-tolerant tester with the same query complexity for any class C of linear k-juntas with surface area bounded by s. As a consequence, we obtain a fully noise-tolerant tester with query complexity k^{O(poly(log k/ε))} for the class of intersections of k halfspaces (for constant k) over the Gaussian space. Our query complexity is independent of the ambient dimension n. Previously, no nontrivial noise-tolerant testers were known even for a single halfspace. 
@InProceedings{STOC21p584, author = {Anindya De and Elchanan Mossel and Joe Neeman}, title = {Robust Testing of Low Dimensional Functions}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {584--597}, doi = {10.1145/3406325.3451115}, year = {2021}, } Publisher's Version 

DeBlasio, Dan 
STOC '21: "How Much Data Is Sufficient ..."
How Much Data Is Sufficient to Learn High-Performing Algorithms? Generalization Guarantees for Data-Driven Algorithm Design
Maria-Florina Balcan, Dan DeBlasio, Travis Dick, Carl Kingsford, Tuomas Sandholm, and Ellen Vitercik (Carnegie Mellon University, USA; University of Texas at El Paso, USA; University of Pennsylvania, USA) Algorithms often have tunable parameters that impact performance metrics such as runtime and solution quality. For many algorithms used in practice, no parameter settings admit meaningful worst-case bounds, so the parameters are made available for the user to tune. Alternatively, parameters may be tuned implicitly within the proof of a worst-case guarantee. Worst-case instances, however, may be rare or nonexistent in practice. A growing body of research has demonstrated that data-driven algorithm design can lead to significant improvements in performance. This approach uses a training set of problem instances sampled from an unknown, application-specific distribution and returns a parameter setting with strong average performance on the training set. We provide a broadly applicable theory for deriving generalization guarantees that bound the difference between the algorithm’s average performance over the training set and its expected performance on the unknown distribution. Our results apply no matter how the parameters are tuned, be it via an automated or manual approach. The challenge is that for many types of algorithms, performance is a volatile function of the parameters: slightly perturbing the parameters can cause a large change in behavior. Prior research (e.g., Gupta and Roughgarden, SICOMP’17; Balcan et al., COLT’17, ICML’18, EC’18) has proved generalization bounds by employing case-by-case analyses of greedy algorithms, clustering algorithms, integer programming algorithms, and selling mechanisms. We uncover a unifying structure which we use to prove extremely general guarantees, yet we recover the bounds from prior research. 
Our guarantees, which are tight up to logarithmic factors in the worst case, apply whenever an algorithm’s performance is a piecewise-constant, linear, or—more generally—piecewise-structured function of its parameters. Our theory also implies novel bounds for voting mechanisms and dynamic programming algorithms from computational biology. @InProceedings{STOC21p919, author = {Maria-Florina Balcan and Dan DeBlasio and Travis Dick and Carl Kingsford and Tuomas Sandholm and Ellen Vitercik}, title = {How Much Data Is Sufficient to Learn High-Performing Algorithms? Generalization Guarantees for Data-Driven Algorithm Design}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {919--932}, doi = {10.1145/3406325.3451036}, year = {2021}, } Publisher's Version 
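The basic tuning procedure whose generalization the paper studies is simply empirical-average maximization over a training set. A minimal sketch (hypothetical names; the paper's theory applies to any tuning method, not just this grid search):

```python
def tune_parameter(perf, instances, param_grid):
    """Data-driven tuning: return the parameter with the best average
    performance on training instances drawn from the unknown
    application-specific distribution. The paper's question is how many
    instances are needed for this average to generalize to the
    distribution itself."""
    best_p, best_avg = None, float("-inf")
    for p in param_grid:
        avg = sum(perf(p, inst) for inst in instances) / len(instances)
        if avg > best_avg:
            best_p, best_avg = p, avg
    return best_p
```

For example, with `perf(p, inst) = -(p - inst)**2` and training instances clustered around 0.5, the tuner picks the grid point nearest 0.5; the generalization guarantees bound how far this empirical choice can be from the distribution's optimum.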

De Kroon, Jari J. H. 
STOC '21: "Vertex Deletion Parameterized ..."
Vertex Deletion Parameterized by Elimination Distance and Even Less
Bart M. P. Jansen, Jari J. H. de Kroon, and Michał Włodarczyk (Eindhoven University of Technology, Netherlands) We study the parameterized complexity of various classic vertex-deletion problems such as Odd cycle transversal, Vertex planarization, and Chordal vertex deletion under hybrid parameterizations. Existing FPT algorithms for these problems either focus on the parameterization by solution size, detecting solutions of size k in time f(k) · n^{O(1)}, or on width parameterizations, finding arbitrarily large optimal solutions in time f(w) · n^{O(1)} for some width measure w like treewidth. We unify these lines of research by presenting FPT algorithms for parameterizations that can simultaneously be arbitrarily much smaller than the solution size and the treewidth. The first class of parameterizations is based on the notion of elimination distance of the input graph to the target graph class, which intuitively measures the number of rounds needed to obtain a graph in the target class by removing one vertex from each connected component in each round. The second class of parameterizations consists of a relaxation of the notion of treewidth, allowing arbitrarily large bags that induce subgraphs belonging to the target class of the deletion problem as long as these subgraphs have small neighborhoods. Both kinds of parameterizations have been introduced recently and have already spawned several independent results. Our contribution is twofold. First, we present a framework for computing approximately optimal decompositions related to these graph measures. Namely, if the cost of an optimal decomposition is k, we show how to find a decomposition of cost k^{O(1)} in time f(k) · n^{O(1)}. This is applicable to any graph class for which we can solve the so-called separation problem. Secondly, we exploit the constructed decompositions for solving vertex-deletion problems by extending ideas from algorithms using iterative compression and the finite state property. 
For the three mentioned vertex-deletion problems, and all problems which can be formulated as hitting a finite set of connected forbidden (a) minors or (b) (induced) subgraphs, we obtain FPT algorithms with respect to both studied parameterizations. For example, we present an algorithm running in time n^{O(1)} + 2^{k^{O(1)}}·(n+m) and polynomial space for Odd cycle transversal parameterized by the elimination distance k to the class of bipartite graphs. @InProceedings{STOC21p1757, author = {Bart M. P. Jansen and Jari J. H. de Kroon and Michał Włodarczyk}, title = {Vertex Deletion Parameterized by Elimination Distance and Even Less}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1757--1769}, doi = {10.1145/3406325.3451068}, year = {2021}, } Publisher's Version 

De Rezende, Susanna F. 
STOC '21: "Automating Algebraic Proof ..."
Automating Algebraic Proof Systems Is NP-Hard
Susanna F. de Rezende, Mika Göös, Jakob Nordström, Toniann Pitassi, Robert Robere, and Dmitry Sokolov (Czech Academy of Sciences, Czechia; EPFL, Switzerland; University of Copenhagen, Denmark; Lund University, Sweden; University of Toronto, Canada; Institute for Advanced Study at Princeton, USA; McGill University, Canada; St. Petersburg State University, Russia; Russian Academy of Sciences, Russia) We show that algebraic proofs are hard to find: Given an unsatisfiable CNF formula F, it is NP-hard to find a refutation of F in the Nullstellensatz, Polynomial Calculus, or Sherali–Adams proof systems in time polynomial in the size of the shortest such refutation. Our work extends, and gives a simplified proof of, the recent breakthrough of Atserias and Müller (JACM 2020) that established an analogous result for Resolution. @InProceedings{STOC21p209, author = {Susanna F. de Rezende and Mika Göös and Jakob Nordström and Toniann Pitassi and Robert Robere and Dmitry Sokolov}, title = {Automating Algebraic Proof Systems Is NP-Hard}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {209--222}, doi = {10.1145/3406325.3451080}, year = {2021}, } Publisher's Version 

Dershowitz, Nachum 
STOC '21: "The Communication Complexity ..."
The Communication Complexity of Multiparty Set Disjointness under Product Distributions
Nachum Dershowitz, Rotem Oshman, and Tal Roth (Tel Aviv University, Israel) In the multiparty number-in-hand set disjointness problem, we have k players, with private inputs X_{1},…,X_{k} ⊆ [n]. The players’ goal is to check whether ∩_{ℓ=1}^{k} X_{ℓ} = ∅. It is known that in the shared blackboard model of communication, set disjointness requires Ω(n log k + k) bits of communication, and in the coordinator model, it requires Ω(kn) bits. However, these two lower bounds require that the players’ inputs can be highly correlated. We study the communication complexity of multiparty set disjointness under product distributions, and ask whether the problem becomes significantly easier, as it is known to become in the two-party case. Our main result is a nearly-tight bound of Θ̃(n^{1−1/k} + k) for both the shared blackboard model and the coordinator model. This shows that in the shared blackboard model, as the number of players grows, having independent inputs helps less and less; but in the coordinator model, when k is very large, having independent inputs makes the problem much easier. Both our upper and our lower bounds use new ideas, as the original techniques developed for the two-party case do not scale to more than two players. @InProceedings{STOC21p1194, author = {Nachum Dershowitz and Rotem Oshman and Tal Roth}, title = {The Communication Complexity of Multiparty Set Disjointness under Product Distributions}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1194--1207}, doi = {10.1145/3406325.3451106}, year = {2021}, } Publisher's Version 
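For intuition, the trivial coordinator-model protocol that matches the Ω(kn) worst-case bound quoted above can be sketched as follows (hypothetical names; the paper's point is that far less communication suffices under product distributions):

```python
from functools import reduce

def coordinator_disjointness(sets, n):
    """Naive coordinator protocol for k-party set disjointness:
    every player sends the coordinator its n-bit characteristic vector
    (k*n bits in total), and the coordinator intersects the vectors.
    sets: the k players' private subsets of range(n)."""
    masks = [sum(1 << i for i in s) for s in sets]  # one n-bit message each
    return reduce(lambda a, b: a & b, masks) == 0   # True iff disjoint
```

Under product distributions the paper shows roughly n^{1−1/k} + k bits suffice in the coordinator model, so for large k this brute-force k·n-bit protocol is far from optimal.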

Diakonikolas, Ilias 
STOC '21: "Optimal Testing of Discrete ..."
Optimal Testing of Discrete Distributions with High Probability
Ilias Diakonikolas, Themis Gouleakis, Daniel M. Kane, John Peebles, and Eric Price (University of Wisconsin-Madison, USA; MPI-INF, Germany; University of California at San Diego, USA; Princeton University, USA; University of Texas at Austin, USA) We study the problem of testing discrete distributions with a focus on the high probability regime. Specifically, given samples from one or more discrete distributions, a property P, and parameters 0< ε, δ <1, we want to distinguish with probability at least 1−δ whether these distributions satisfy P or are ε-far from P in total variation distance. Most prior work in distribution testing studied the constant confidence case (corresponding to δ = Ω(1)), and provided sample-optimal testers for a range of properties. While one can always boost the confidence probability of any such tester by black-box amplification, this generic boosting method typically leads to suboptimal sample bounds. Here we study the following broad question: For a given property P, can we characterize the sample complexity of testing P as a function of all relevant problem parameters, including the error probability δ? Prior to this work, uniformity testing was the only statistical task whose sample complexity had been characterized in this setting. As our main results, we provide the first algorithms for closeness and independence testing that are sample-optimal, within constant factors, as a function of all relevant parameters. We also show matching information-theoretic lower bounds on the sample complexity of these problems. Our techniques naturally extend to give optimal testers for related problems. To illustrate the generality of our methods, we give optimal algorithms for testing collections of distributions and testing closeness with unequal sized samples. @InProceedings{STOC21p542, author = {Ilias Diakonikolas and Themis Gouleakis and Daniel M. 
Kane and John Peebles and Eric Price}, title = {Optimal Testing of Discrete Distributions with High Probability}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {542--555}, doi = {10.1145/3406325.3450997}, year = {2021}, } Publisher's Version STOC '21: "Efficiently Learning Halfspaces ..." Efficiently Learning Halfspaces with Tsybakov Noise Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Christos Tzamos, and Nikos Zarifis (University of Wisconsin-Madison, USA; University of California at San Diego, USA) We study the problem of PAC learning homogeneous halfspaces with Tsybakov noise. In the Tsybakov noise model, the label of every example is independently flipped with an adversarially controlled probability that can be arbitrarily close to 1/2 for a fraction of the examples. We give the first polynomial-time algorithm for this fundamental learning problem. Our algorithm learns the true halfspace within any desired accuracy and succeeds under a broad family of well-behaved distributions including log-concave distributions. This extended abstract is a merge of two papers. In an earlier work, a subset of the authors developed an efficient reduction from learning to certifying the non-optimality of a candidate halfspace and gave a quasi-polynomial time certificate algorithm. In a subsequent work, the authors of this paper developed a polynomial-time certificate algorithm. @InProceedings{STOC21p88, author = {Ilias Diakonikolas and Daniel M. Kane and Vasilis Kontonis and Christos Tzamos and Nikos Zarifis}, title = {Efficiently Learning Halfspaces with Tsybakov Noise}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {88--101}, doi = {10.1145/3406325.3450998}, year = {2021}, } Publisher's Version 
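The generic black-box amplification mentioned in the first abstract above is just repetition plus a majority vote. A minimal sketch (hypothetical names and a crude repetition constant):

```python
import math

def amplify(tester, delta):
    """Black-box confidence amplification: run a constant-confidence
    tester (error probability at most 1/3 per run) O(log(1/delta)) times
    and take a majority vote. By a Chernoff bound the majority errs with
    probability at most delta, but the sample complexity grows by the
    multiplicative log(1/delta) factor that the paper's direct
    high-probability testers avoid."""
    reps = 2 * math.ceil(18 * math.log(1 / delta)) + 1  # odd; crude constant
    votes = sum(1 for _ in range(reps) if tester())
    return 2 * votes > reps
```

Each call to `tester()` consumes a fresh batch of samples, which is exactly why this generic route multiplies the sample bound by log(1/δ) rather than adding a lower-order term.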

Dick, Travis 
STOC '21: "How Much Data Is Sufficient ..."
How Much Data Is Sufficient to Learn High-Performing Algorithms? Generalization Guarantees for Data-Driven Algorithm Design
Maria-Florina Balcan, Dan DeBlasio, Travis Dick, Carl Kingsford, Tuomas Sandholm, and Ellen Vitercik (Carnegie Mellon University, USA; University of Texas at El Paso, USA; University of Pennsylvania, USA) Algorithms often have tunable parameters that impact performance metrics such as runtime and solution quality. For many algorithms used in practice, no parameter settings admit meaningful worst-case bounds, so the parameters are made available for the user to tune. Alternatively, parameters may be tuned implicitly within the proof of a worst-case guarantee. Worst-case instances, however, may be rare or nonexistent in practice. A growing body of research has demonstrated that data-driven algorithm design can lead to significant improvements in performance. This approach uses a training set of problem instances sampled from an unknown, application-specific distribution and returns a parameter setting with strong average performance on the training set. We provide a broadly applicable theory for deriving generalization guarantees that bound the difference between the algorithm’s average performance over the training set and its expected performance on the unknown distribution. Our results apply no matter how the parameters are tuned, be it via an automated or manual approach. The challenge is that for many types of algorithms, performance is a volatile function of the parameters: slightly perturbing the parameters can cause a large change in behavior. Prior research (e.g., Gupta and Roughgarden, SICOMP’17; Balcan et al., COLT’17, ICML’18, EC’18) has proved generalization bounds by employing case-by-case analyses of greedy algorithms, clustering algorithms, integer programming algorithms, and selling mechanisms. We uncover a unifying structure which we use to prove extremely general guarantees, yet we recover the bounds from prior research. 
Our guarantees, which are tight up to logarithmic factors in the worst case, apply whenever an algorithm’s performance is a piecewise-constant, linear, or—more generally—piecewise-structured function of its parameters. Our theory also implies novel bounds for voting mechanisms and dynamic programming algorithms from computational biology. @InProceedings{STOC21p919, author = {Maria-Florina Balcan and Dan DeBlasio and Travis Dick and Carl Kingsford and Tuomas Sandholm and Ellen Vitercik}, title = {How Much Data Is Sufficient to Learn High-Performing Algorithms? Generalization Guarantees for Data-Driven Algorithm Design}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {919--932}, doi = {10.1145/3406325.3451036}, year = {2021}, } Publisher's Version 

Dikkala, Nishanth 
STOC '21: "Learning Ising Models from ..."
Learning Ising Models from One or Multiple Samples
Yuval Dagan, Constantinos Daskalakis, Nishanth Dikkala, and Anthimos Vardis Kandiros (Massachusetts Institute of Technology, USA; Google, USA) There have been two main lines of work on estimating Ising models: (1) estimating them from multiple independent samples under minimal assumptions about the model's interaction matrix; and (2) estimating them from one sample in restrictive settings. We propose a unified framework that smoothly interpolates between these two settings, enabling significantly richer estimation guarantees from one, a few, or many samples. Our main theorem provides guarantees for one-sample estimation, quantifying the estimation error in terms of the metric entropy of a family of interaction matrices. As corollaries of our main theorem, we derive bounds when the model's interaction matrix is a (sparse) linear combination of known matrices, or it belongs to a finite set, or to a high-dimensional manifold. In fact, our main result handles multiple independent samples by viewing them as one sample from a larger model, and can be used to derive estimation bounds that are qualitatively similar to those obtained in the aforedescribed multiple-sample literature. Our technical approach benefits from sparsifying a model's interaction network, conditioning on subsets of variables that make the dependencies in the resulting conditional distribution sufficiently weak. We use this sparsification technique to prove strong concentration and anti-concentration results for the Ising model, which we believe have applications beyond the scope of this paper. @InProceedings{STOC21p161, author = {Yuval Dagan and Constantinos Daskalakis and Nishanth Dikkala and Anthimos Vardis Kandiros}, title = {Learning Ising Models from One or Multiple Samples}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {161--168}, doi = {10.1145/3406325.3451074}, year = {2021}, } Publisher's Version 

Dobzinski, Shahar 
STOC '21: "The Communication Complexity ..."
The Communication Complexity of Payment Computation
Shahar Dobzinski and Shiri Ron (Weizmann Institute of Science, Israel) Let (f,P) be an incentive compatible mechanism where f is the social choice function and P is the payment function. In many important settings, f uniquely determines P (up to a constant) and therefore a common approach is to focus on the design of f and neglect the role of the payment function. Fadel and Segal [JET, 2009] question this approach by taking the lenses of communication complexity: can it be that the communication complexity of an incentive compatible mechanism that implements f (that is, computes both the output and the payments) is much larger than the communication complexity of computing the output? I.e., can it be that cc_{IC}(f) ≫ cc(f)? Fadel and Segal show that for every f, cc_{IC}(f) ≤ exp(cc(f)). They also show that fully computing the incentive compatible mechanism is strictly harder than computing only the output: there exists a social choice function f such that cc_{IC}(f)=cc(f)+1. In a follow-up work, Babaioff, Blumrosen, Naor, and Schapira [EC’08] provide a social choice function f such that cc_{IC}(f)=Θ(n· cc(f)), where n is the number of players. The question of whether the exponential upper bound of Fadel and Segal is tight remained wide open. In this paper we solve this question by explicitly providing a function f such that cc_{IC}(f)= exp(cc(f)). In fact, we establish this via two very different proofs. In contrast, we show that if the players are risk-neutral, then compromising on a randomized truthful-in-expectation implementation (rather than a deterministic ex-post implementation) gives cc_{TIE}(f)=poly(n,cc(f)) for every function f, as long as the domain of f is single-parameter or a convex multi-parameter domain. We also provide efficient algorithms for deterministic computation of payments in several important domains. 
@InProceedings{STOC21p933, author = {Shahar Dobzinski and Shiri Ron}, title = {The Communication Complexity of Payment Computation}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {933--946}, doi = {10.1145/3406325.3451083}, year = {2021}, } Publisher's Version 
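As a toy illustration of the premise that f can uniquely determine P (up to a constant), consider the single-item Vickrey auction; this hypothetical sketch is not a construction from the paper:

```python
def second_price(bids):
    """Single-item Vickrey auction: once the social choice function f is
    fixed to 'highest bid wins', incentive compatibility pins the
    winner's payment down to the second-highest bid. Requires at least
    two bidders."""
    winner = max(range(len(bids)), key=lambda i: bids[i])
    price = max(b for i, b in enumerate(bids) if i != winner)
    return winner, price
```

The paper's question is how much extra communication computing such a uniquely determined P can require beyond computing the winner alone.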

Dong, Sally 
STOC '21: "A Nearly-Linear Time Algorithm ..."
A Nearly-Linear Time Algorithm for Linear Programs with Small Treewidth: A Multiscale Representation of Robust Central Path
Sally Dong, Yin Tat Lee, and Guanghao Ye (University of Washington, USA; Microsoft Research, USA) Arising from structural graph theory, treewidth has become a focus of study in fixed-parameter tractable algorithms. Many NP-hard problems are known to be solvable in O(n · 2^{O(τ)}) time, where τ is the treewidth of the input graph. Analogously, many problems in P should be solvable in O(n · τ^{O(1)}) time; however, due to the lack of appropriate tools, only a few such results are currently known. In our paper, we show this holds for linear programs: Given a linear program of the form min_{Ax=b, ℓ ≤ x ≤ u} c^{⊤}x whose dual graph G_{A} has treewidth τ, and a corresponding width-τ tree decomposition, we show how to solve it in time O(n · τ^{2} log(1/ε)), where n is the number of variables and ε is the relative accuracy. When a tree decomposition is not given, we use existing techniques in vertex separators to obtain algorithms with O(n · τ^{4} log(1/ε)) and O(n · τ^{2} log(1/ε) + n^{1.5}) runtimes. Besides being the first of its kind, our algorithm has runtime nearly matching the fastest runtime for solving the subproblem Ax=b (under the assumption that no fast matrix multiplication is used). We obtain these results by combining recent techniques in interior-point methods (IPMs), sketching, and a novel representation of the solution under a multiscale basis similar to the wavelet basis. This representation further yields the first IPM with o(rank(A)) time per iteration when the treewidth is small. @InProceedings{STOC21p1784, author = {Sally Dong and Yin Tat Lee and Guanghao Ye}, title = {A Nearly-Linear Time Algorithm for Linear Programs with Small Treewidth: A Multiscale Representation of Robust Central Path}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1784--1797}, doi = {10.1145/3406325.3451056}, year = {2021}, } Publisher's Version 

Dory, Michal 
STOC '21: "Distributed Weighted Min-Cut ..."
Distributed Weighted Min-Cut in Nearly-Optimal Time
Michal Dory, Yuval Efron, Sagnik Mukhopadhyay, and Danupon Nanongkai (ETH Zurich, Switzerland; University of Toronto, Canada; KTH, Sweden; University of Copenhagen, Denmark) Minimum-weight cut (min-cut) is a basic measure of a network’s connectivity strength. While the min-cut can be computed efficiently in the sequential setting [Karger STOC’96], there was no efficient way for a distributed network to compute its own min-cut without limiting the input structure or dropping the output quality: In the standard CONGEST model, existing algorithms with nearly-optimal time (e.g. [Ghaffari, Kuhn, DISC’13; Nanongkai, Su, DISC’14]) can guarantee a solution that is a (1+ε)-approximation at best, while the exact Õ(n^{0.8}D^{0.2} + n^{0.9})-time algorithm [Ghaffari, Nowicki, Thorup, SODA’20] works only on simple networks (no weights and no parallel edges). Throughout, n and D denote the network’s number of vertices and hop-diameter, respectively. For the weighted case, the best bound was Õ(n) [Daga, Henzinger, Nanongkai, Saranurak, STOC’19]. In this paper, we provide an exact Õ(√n + D)-time algorithm for computing min-cut on weighted networks. Our result improves even the previous algorithm that works only on simple networks. Its time complexity matches the known lower bound up to polylogarithmic factors. At the heart of our algorithm are a routing trick and two structural lemmas regarding the structure of a minimum cut of a graph. These two structural lemmas considerably strengthen and generalize the framework of Mukhopadhyay-Nanongkai [STOC’20] and can be of independent interest. @InProceedings{STOC21p1144, author = {Michal Dory and Yuval Efron and Sagnik Mukhopadhyay and Danupon Nanongkai}, title = {Distributed Weighted Min-Cut in Nearly-Optimal Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1144--1153}, doi = {10.1145/3406325.3451020}, year = {2021}, } Publisher's Version 
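For contrast with the distributed setting, the simplest sequential approach to global min-cut is randomized edge contraction (Karger, 1993), a simpler relative of the sequential algorithms cited above. A hedged sketch with hypothetical names:

```python
import random

def contraction_min_cut(edges, n, trials=200):
    """Randomized contraction for global min-cut on a weighted graph:
    contract weight-proportional random edges until two super-nodes
    remain, record the crossing weight, and repeat to boost the success
    probability. A sequential baseline, not the distributed algorithm
    described above. edges: list of (u, v, weight); n: vertex count."""
    best = float("inf")
    for _ in range(trials):
        parent = list(range(n))

        def find(x):  # union-find with path halving tracks super-nodes
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        for _ in range(n - 2):  # contract until two super-nodes remain
            live = [(u, v, w) for u, v, w in edges if find(u) != find(v)]
            r = random.uniform(0, sum(w for _, _, w in live))
            for u, v, w in live:
                r -= w
                if r <= 0:
                    parent[find(u)] = find(v)  # contract edge (u, v)
                    break
        best = min(best, sum(w for u, v, w in edges if find(u) != find(v)))
    return best
```

A single trial succeeds with probability Ω(1/n²), so repeating O(n² log n) times makes failure unlikely; the distributed algorithm above achieves Õ(√n + D) rounds by entirely different means.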

Dudeja, Aditi 
STOC '21: "A Framework for Dynamic Matching ..."
A Framework for Dynamic Matching in Weighted Graphs
Aaron Bernstein, Aditi Dudeja, and Zachary Langley (Rutgers University, USA) We introduce a new framework for computing approximate maximum weight matchings. Our primary focus is on the fully dynamic setting, where there is a large gap between the guarantees of the best known algorithms for computing weighted and unweighted matchings. Indeed, almost all current weighted matching algorithms that reduce to the unweighted problem lose a factor of two in the approximation ratio. In contrast, in other sublinear models such as the distributed and streaming models, recent work has largely closed this weighted/unweighted gap. For bipartite graphs, we almost completely settle the gap with a general reduction that converts any algorithm for α-approximate unweighted matching to an algorithm for (1−є)α-approximate weighted matching, while only increasing the update time by an O(logn) factor for constant є. We also show that our framework leads to significant improvements for non-bipartite graphs, though not in the form of a universal reduction. In particular, we give two algorithms for weighted non-bipartite matching: 1. A randomized (Las Vegas) fully dynamic algorithm that maintains a (1/2−є)-approximate maximum weight matching in worst-case update time O(polylog n) with high probability against an adaptive adversary. Our bounds are essentially the same as those of the unweighted algorithm of Wajc [STOC 2020]. 2. A deterministic fully dynamic algorithm that maintains a (2/3−є)-approximate maximum weight matching in amortized update time O(m^{1/4}). Our bounds are essentially the same as those of the unweighted algorithm of Bernstein and Stein [SODA 2016]. A key feature of our framework is that it uses existing algorithms for unweighted matching as black boxes. As a result, our framework is simple and versatile. Moreover, our framework easily translates to other models, and we use it to derive new results for the weighted matching problem in streaming and communication complexity models. 
@InProceedings{STOC21p668, author = {Aaron Bernstein and Aditi Dudeja and Zachary Langley}, title = {A Framework for Dynamic Matching in Weighted Graphs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {668-681}, doi = {10.1145/3406325.3451113}, year = {2021}, }
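For intuition on the factor-two loss the abstract mentions: the classical static greedy algorithm, which scans edges in non-increasing weight order, gives a 1/2-approximate maximum weight matching. A textbook baseline sketch (not the paper's dynamic framework; names are ours):

```python
def greedy_weighted_matching(edges):
    """Static greedy 1/2-approximation for maximum weight matching:
    scan edges in non-increasing weight order, adding an edge whenever
    both of its endpoints are still unmatched.
    edges: list of (u, v, weight) tuples."""
    matched = set()
    matching = []
    for u, v, w in sorted(edges, key=lambda e: -e[2]):
        if u not in matched and v not in matched:
            matching.append((u, v, w))
            matched.update((u, v))
    return matching
```

On the path a-b (3), b-c (4), c-d (3) greedy takes only b-c for weight 4, while the optimum a-b plus c-d has weight 6: exactly the factor-two gap that weighted-to-unweighted reductions typically inherit.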

Dütting, Paul 
STOC '21: "Efficient Two-Sided Markets ..."
Efficient Two-Sided Markets with Limited Information
Paul Dütting, Federico Fusco, Philip Lazos, Stefano Leonardi, and Rebecca Reiffenhäuser (Google Research, Switzerland; Sapienza University of Rome, Italy) A celebrated impossibility result by Myerson and Satterthwaite (1983) shows that any truthful mechanism for two-sided markets that maximizes social welfare must run a deficit, making it necessary to relax welfare efficiency and to use approximation mechanisms. Such mechanisms in general make extensive use of Bayesian priors. In this work, we investigate a question of increasing theoretical and practical importance: how much prior information is required to design mechanisms with near-optimal approximations? Our first contribution is a more general impossibility result stating that no meaningful approximation is possible without any prior information, expanding the famous impossibility result of Myerson and Satterthwaite. Our second contribution is that one single sample (one number per item), arguably a minimum-possible amount of prior information, from each seller distribution is sufficient for a large class of two-sided markets. We prove matching upper and lower bounds on the best approximation that can be obtained with one single sample for subadditive buyers and additive sellers, regardless of computational considerations. Our third contribution is the design of computationally efficient black-box reductions that turn any one-sided mechanism into a two-sided mechanism with a small loss in the approximation, while using only one single sample from each seller. On the way, our black-box-type mechanisms deliver several interesting positive results in their own right, often beating even the state of the art that uses full prior information. 
@InProceedings{STOC21p1452, author = {Paul Dütting and Federico Fusco and Philip Lazos and Stefano Leonardi and Rebecca Reiffenhäuser}, title = {Efficient Two-Sided Markets with Limited Information}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1452-1465}, doi = {10.1145/3406325.3451076}, year = {2021}, }

Dwork, Cynthia 
STOC '21: "Outcome Indistinguishability ..."
Outcome Indistinguishability
Cynthia Dwork, Michael P. Kim, Omer Reingold, Guy N. Rothblum, and Gal Yona (Harvard University, USA; University of California at Berkeley, USA; Stanford University, USA; Weizmann Institute of Science, Israel) Prediction algorithms assign numbers to individuals that are popularly understood as individual “probabilities”—what is the probability of 5-year survival after cancer diagnosis?—and which increasingly form the basis for life-altering decisions. Drawing on an understanding of computational indistinguishability developed in complexity theory and cryptography, we introduce Outcome Indistinguishability. Predictors that are Outcome Indistinguishable (OI) yield a generative model for outcomes that cannot be efficiently refuted on the basis of the real-life observations produced by the predictor. We investigate a hierarchy of OI definitions, whose stringency increases with the degree to which distinguishers may access the predictor in question. Our findings reveal that OI behaves qualitatively differently than previously studied notions of indistinguishability. First, we provide constructions at all levels of the hierarchy. Then, leveraging recently-developed machinery for proving average-case fine-grained hardness, we obtain lower bounds on the complexity of the more stringent forms of OI. This hardness result provides the first scientific grounds for the political argument that, when inspecting algorithmic risk prediction instruments, auditors should be granted oracle access to the algorithm, not simply historical predictions. @InProceedings{STOC21p1095, author = {Cynthia Dwork and Michael P. Kim and Omer Reingold and Guy N. Rothblum and Gal Yona}, title = {Outcome Indistinguishability}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1095-1108}, doi = {10.1145/3406325.3451064}, year = {2021}, }

Efremenko, Klim 
STOC '21: "Optimal Error Resilience of ..."
Optimal Error Resilience of Adaptive Message Exchange
Klim Efremenko, Gillat Kol, and Raghuvansh R. Saxena (Ben-Gurion University of the Negev, Israel; Princeton University, USA) We study the error resilience of the message exchange task: Two parties, each holding a private input, want to exchange their inputs. However, the channel connecting them is governed by an adversary that may corrupt a constant fraction of the transmissions. What is the maximum fraction of corruptions that still allows the parties to exchange their inputs? For the non-adaptive channel, where the parties must agree in advance on the order in which they communicate, the maximum error resilience was shown to be 1/4 (see Braverman and Rao, STOC 2011). The problem was also studied over the adaptive channel, where the order in which the parties communicate may not be predetermined (Ghaffari, Haeupler, and Sudan, STOC 2014; Efremenko, Kol, and Saxena, STOC 2020). These works show that the adaptive channel admits a much richer set of protocols but leave open the question of finding its maximum error resilience. In this work, we show that the maximum error resilience of a protocol for message exchange over the adaptive channel is 5/16, thereby settling the above question. Our result requires improving both the known upper bounds and the known lower bounds for the problem. @InProceedings{STOC21p1235, author = {Klim Efremenko and Gillat Kol and Raghuvansh R. Saxena}, title = {Optimal Error Resilience of Adaptive Message Exchange}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1235-1247}, doi = {10.1145/3406325.3451077}, year = {2021}, }

Efron, Yuval 
STOC '21: "Distributed Weighted Min-Cut ..."
Distributed Weighted Min-Cut in Nearly-Optimal Time
Michal Dory, Yuval Efron, Sagnik Mukhopadhyay, and Danupon Nanongkai (ETH Zurich, Switzerland; University of Toronto, Canada; KTH, Sweden; University of Copenhagen, Denmark) Minimum-weight cut (min-cut) is a basic measure of a network’s connectivity strength. While the min-cut can be computed efficiently in the sequential setting [Karger STOC’96], there was no efficient way for a distributed network to compute its own min-cut without limiting the input structure or dropping the output quality: In the standard CONGEST model, existing algorithms with nearly-optimal time (e.g. [Ghaffari, Kuhn, DISC’13; Nanongkai, Su, DISC’14]) can guarantee a solution that is a (1+є)-approximation at best, while the exact Õ(n^{0.8}D^{0.2} + n^{0.9})-time algorithm [Ghaffari, Nowicki, Thorup, SODA’20] works only on simple networks (no weights and no parallel edges). Throughout, n and D denote the network’s number of vertices and hop-diameter, respectively. For the weighted case, the best bound was Õ(n) [Daga, Henzinger, Nanongkai, Saranurak, STOC’19]. In this paper, we provide an exact Õ(√n + D)-time algorithm for computing min-cut on weighted networks. Our result improves even the previous algorithm that works only on simple networks. Its time complexity matches the known lower bound up to polylogarithmic factors. At the heart of our algorithm are a routing trick and two structural lemmas regarding the structure of a minimum cut of a graph. These two structural lemmas considerably strengthen and generalize the framework of Mukhopadhyay-Nanongkai [STOC’20] and can be of independent interest. @InProceedings{STOC21p1144, author = {Michal Dory and Yuval Efron and Sagnik Mukhopadhyay and Danupon Nanongkai}, title = {Distributed Weighted Min-Cut in Nearly-Optimal Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1144-1153}, doi = {10.1145/3406325.3451020}, year = {2021}, }

Esperet, Louis 
STOC '21: "Optimal Labelling Schemes ..."
Optimal Labelling Schemes for Adjacency, Comparability, and Reachability
Marthe Bonamy, Louis Esperet, Carla Groenland, and Alex Scott (CNRS, France; LaBRI, France; University of Bordeaux, France; G-SCOP, France; Grenoble Alps University, France; University of Oxford, UK) We construct asymptotically optimal adjacency labelling schemes for every hereditary class containing 2^{Ω(n^2)} n-vertex graphs as n→ ∞. This regime contains many classes of interest, for instance perfect graphs or comparability graphs, for which we obtain an adjacency labelling scheme with labels of n/4+o(n) bits per vertex. This implies the existence of a reachability labelling scheme for digraphs with labels of n/4+o(n) bits per vertex and a comparability labelling scheme for posets with labels of n/4+o(n) bits per element. All these results are best possible, up to the lower order term. @InProceedings{STOC21p1109, author = {Marthe Bonamy and Louis Esperet and Carla Groenland and Alex Scott}, title = {Optimal Labelling Schemes for Adjacency, Comparability, and Reachability}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1109-1117}, doi = {10.1145/3406325.3451102}, year = {2021}, }

Fearnley, John 
STOC '21: "The Complexity of Gradient ..."
The Complexity of Gradient Descent: CLS = PPAD ∩ PLS
John Fearnley, Paul W. Goldberg, Alexandros Hollender, and Rahul Savani (University of Liverpool, UK; University of Oxford, UK) We study search problems that can be solved by performing Gradient Descent on a bounded convex polytopal domain and show that this class is equal to the intersection of two well-known classes: PPAD and PLS. As our main underlying technical contribution, we show that computing a Karush-Kuhn-Tucker (KKT) point of a continuously differentiable function over the domain [0,1]^{2} is PPAD ∩ PLS-complete. This is the first natural problem to be shown complete for this class. Our results also imply that the class CLS (Continuous Local Search), which was defined by Daskalakis and Papadimitriou as a more “natural” counterpart to PPAD ∩ PLS and contains many interesting problems, is itself equal to PPAD ∩ PLS. @InProceedings{STOC21p46, author = {John Fearnley and Paul W. Goldberg and Alexandros Hollender and Rahul Savani}, title = {The Complexity of Gradient Descent: CLS = PPAD ∩ PLS}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {46-59}, doi = {10.1145/3406325.3451052}, year = {2021}, }
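The search problem behind this completeness result can be made concrete: run gradient descent on [0,1]^2 with projection back into the box, stopping once the iterate barely moves, which yields an approximate KKT point. A hedged sketch (the step size, tolerance, and quadratic test function are our own illustrative choices, not the paper's construction):

```python
def projected_gradient_descent(grad, x0, step=0.01, tol=1e-6, max_iter=100000):
    """Gradient descent over the box [0,1]^2 with projection.
    Stops at an approximate KKT point: when the projected gradient
    step moves the iterate by less than tol in every coordinate,
    first-order optimality holds up to the box constraints."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        # take a gradient step, then clamp each coordinate into [0, 1]
        new = [min(1.0, max(0.0, x[i] - step * g[i])) for i in range(2)]
        if max(abs(new[i] - x[i]) for i in range(2)) < tol:
            return new
        x = new
    return x
```

The fixed point of this projected update is precisely a KKT point of the constrained problem: in the interior the gradient vanishes, while on the boundary the gradient may point outward, which is exactly the first-order condition the paper proves hard to compute in the worst case.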

Fefferman, Bill 
STOC '21: "Eliminating Intermediate Measurements ..."
Eliminating Intermediate Measurements in Space-Bounded Quantum Computation
Bill Fefferman and Zachary Remscrim (University of Chicago, USA) A foundational result in the theory of quantum computation, known as the "principle of safe storage," shows that it is always possible to take a quantum circuit and produce an equivalent circuit that makes all measurements at the end of the computation. While this procedure is time efficient, meaning that it does not introduce a large overhead in the number of gates, it uses extra ancillary qubits, and so is not generally space efficient. It is quite natural to ask whether it is possible to eliminate intermediate measurements without increasing the number of ancillary qubits. We give an affirmative answer to this question by exhibiting a procedure to eliminate all intermediate measurements that is simultaneously space efficient and time efficient. In particular, this shows that the definition of a space-bounded quantum complexity class is robust to allowing or forbidding intermediate measurements. A key component of our approach, which may be of independent interest, involves showing that the well-conditioned versions of many standard linear-algebraic problems may be solved by a quantum computer in less space than seems possible by a classical computer. @InProceedings{STOC21p1343, author = {Bill Fefferman and Zachary Remscrim}, title = {Eliminating Intermediate Measurements in Space-Bounded Quantum Computation}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1343-1356}, doi = {10.1145/3406325.3451051}, year = {2021}, }

Feldman, Vitaly 
STOC '21: "When Is Memorization of Irrelevant ..."
When Is Memorization of Irrelevant Training Data Necessary for High-Accuracy Learning?
Gavin Brown, Mark Bun, Vitaly Feldman, Adam Smith, and Kunal Talwar (Boston University, USA; Apple, USA) Modern machine learning models are complex and frequently encode surprising amounts of information about individual inputs. In extreme cases, complex models appear to memorize entire input examples, including seemingly irrelevant information (social security numbers from text, for example). In this paper, we aim to understand whether this sort of memorization is necessary for accurate learning. We describe natural prediction problems in which every sufficiently accurate training algorithm must encode, in the prediction model, essentially all the information about a large subset of its training examples. This remains true even when the examples are high-dimensional and have entropy much higher than the sample size, and even when most of that information is ultimately irrelevant to the task at hand. Further, our results do not depend on the training algorithm or the class of models used for learning. Our problems are simple and fairly natural variants of the next-symbol prediction and the cluster labeling tasks. These tasks can be seen as abstractions of text and image-related prediction problems. To establish our results, we reduce from a family of one-way communication problems for which we prove new information complexity lower bounds. @InProceedings{STOC21p123, author = {Gavin Brown and Mark Bun and Vitaly Feldman and Adam Smith and Kunal Talwar}, title = {When Is Memorization of Irrelevant Training Data Necessary for High-Accuracy Learning?}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {123-132}, doi = {10.1145/3406325.3451131}, year = {2021}, }

Feng, Weiming 
STOC '21: "Sampling Constraint Satisfaction ..."
Sampling Constraint Satisfaction Solutions in the Local Lemma Regime
Weiming Feng, Kun He, and Yitong Yin (Nanjing University, China; Institute of Computing Technology at Chinese Academy of Sciences, China; University of Chinese Academy of Sciences, China) We give a Markov-chain-based algorithm for sampling almost uniform solutions of constraint satisfaction problems (CSPs). Assuming a canonical setting for the Lovász local lemma, where each constraint is violated by a small number of forbidden local configurations, our sampling algorithm is accurate in a local lemma regime, and the running time is a fixed polynomial whose dependency on n is close to linear, where n is the number of variables. Our main approach is a new technique called state compression, which generalizes the “mark/unmark” paradigm of Moitra, and can give fast local-lemma-based sampling algorithms. As concrete applications of our technique, we give the current best almost-uniform samplers for hypergraph colorings and for CNF solutions. @InProceedings{STOC21p1565, author = {Weiming Feng and Kun He and Yitong Yin}, title = {Sampling Constraint Satisfaction Solutions in the Local Lemma Regime}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1565-1578}, doi = {10.1145/3406325.3451101}, year = {2021}, }
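For contrast with the Markov chain approach: the trivial exactly-uniform sampler is rejection sampling, whose expected running time blows up when satisfying assignments are rare, which is what makes locally accurate samplers in the local lemma regime interesting. A naive baseline sketch (the literal encoding is our own convention):

```python
import random

def rejection_sample_cnf(n_vars, clauses, rng):
    """Exactly uniform sampling of CNF solutions by rejection:
    draw a uniform random assignment and retry until every clause
    is satisfied. A clause is a list of literals, where +i means
    variable x_i and -i means its negation (1-indexed).
    Expected time is exponential when solutions are sparse."""
    while True:
        a = [rng.random() < 0.5 for _ in range(n_vars)]
        if all(any(a[l - 1] if l > 0 else not a[-l - 1] for l in c)
               for c in clauses):
            return a
```

Conditioning a uniform assignment on acceptance yields the exactly uniform distribution over solutions, but the acceptance probability can be 2^{-Ω(n)}; the paper's point is to get almost-uniform samples in polynomial time under local lemma conditions instead.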

Feng, Yiding 
STOC '21: "Revelation Gap for Pricing ..."
Revelation Gap for Pricing from Samples
Yiding Feng, Jason D. Hartline, and Yingkai Li (Northwestern University, USA) This paper considers prior-independent mechanism design, in which a single mechanism is designed to achieve approximately optimal performance on every prior distribution from a given class. Most results in this literature focus on mechanisms with truth-telling equilibria, a.k.a., truthful mechanisms. Feng and Hartline [FOCS 2018] introduce the revelation gap to quantify the loss of the restriction to truthful mechanisms. We solve a main open question left in Feng and Hartline [FOCS 2018]; namely, we identify a nontrivial revelation gap for revenue maximization. Our analysis focuses on the canonical problem of selling a single item to a single agent with only access to a single sample from the agent's valuation distribution. We identify the sample-bid mechanism (a simple non-truthful mechanism) and upper-bound its prior-independent approximation ratio by 1.835 (resp. 1.296) for regular (resp. MHR) distributions. We further prove that no truthful mechanism can achieve prior-independent approximation ratio better than 1.957 (resp. 1.543) for regular (resp. MHR) distributions. Thus, a nontrivial revelation gap is shown as the sample-bid mechanism outperforms the optimal prior-independent truthful mechanism. On the hardness side, we prove that no (possibly non-truthful) mechanism can achieve prior-independent approximation ratio better than 1.073 even for uniform distributions. @InProceedings{STOC21p1438, author = {Yiding Feng and Jason D. Hartline and Yingkai Li}, title = {Revelation Gap for Pricing from Samples}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1438-1451}, doi = {10.1145/3406325.3451057}, year = {2021}, }

Ferreira Pinto Jr., Renato 
STOC '21: "VC Dimension and Distribution-Free ..."
VC Dimension and Distribution-Free Sample-Based Testing
Eric Blais, Renato Ferreira Pinto Jr., and Nathaniel Harms (University of Waterloo, Canada; Google, Canada) We consider the problem of determining which classes of functions can be tested more efficiently than they can be learned, in the distribution-free sample-based model that corresponds to the standard PAC learning setting. Our main result shows that while VC dimension by itself does not always provide tight bounds on the number of samples required to test a class of functions in this model, it can be combined with a closely related variant that we call “lower VC” (or LVC) dimension to obtain strong lower bounds on this sample complexity. We use this result to obtain strong and in many cases nearly optimal bounds on the sample complexity for testing unions of intervals, halfspaces, intersections of halfspaces, polynomial threshold functions, and decision trees. Conversely, we show that two natural classes of functions, juntas and monotone functions, can be tested with a number of samples that is polynomially smaller than the number of samples required for PAC learning. Finally, we also use the connection between VC dimension and property testing to establish new lower bounds for testing radius clusterability and testing feasibility of linear constraint systems. @InProceedings{STOC21p504, author = {Eric Blais and Renato Ferreira Pinto Jr. and Nathaniel Harms}, title = {VC Dimension and Distribution-Free Sample-Based Testing}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {504-517}, doi = {10.1145/3406325.3451104}, year = {2021}, }
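The central notion can be checked by brute force on tiny examples: the VC dimension of a class is the size of the largest point set it shatters, i.e. on which it realizes every possible labelling. An illustrative exhaustive check, exponential in the number of points and meant for intuition only (names are ours, and this is unrelated to the paper's LVC machinery):

```python
from itertools import combinations

def vc_dimension(points, concepts):
    """Brute-force VC dimension: the largest d such that some
    d-subset of points is shattered. Each concept is given as the
    set of points it labels 1; a subset S is shattered when the
    concepts realize all 2^|S| labellings of S."""
    d = 0
    for size in range(1, len(points) + 1):
        for S in combinations(points, size):
            patterns = {tuple(p in c for p in S) for c in concepts}
            if len(patterns) == 2 ** size:
                d = size
    return d
```

For instance, intervals over the domain {1, 2, 3} shatter the pair {1, 3} (every labelling of the pair is some interval's trace) but cannot realize the labelling "in, out, in" on all three points, so their VC dimension is 2.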

Filtser, Arnold 
STOC '21: "Clan Embeddings into Trees, ..."
Clan Embeddings into Trees, and Low Treewidth Graphs
Arnold Filtser and Hung Le (Columbia University, USA; University of Massachusetts, USA) In low-distortion metric embeddings, the goal is to embed a “hard” host metric space into a “simpler” target space while approximately preserving pairwise distances. A highly desirable target space is that of a tree metric. Unfortunately, such an embedding will result in a huge distortion. A celebrated bypass to this problem is stochastic embedding with logarithmic expected distortion. Another bypass is Ramsey-type embedding, where the distortion guarantee applies only to a subset of the points. However, both these solutions fail to provide an embedding into a single tree with a worst-case distortion guarantee on all pairs. In this paper, we propose a novel third bypass called clan embedding. Here each point x is mapped to a subset of points f(x), called a clan, with a special chief point χ(x)∈ f(x). The clan embedding has multiplicative distortion t if for every pair (x,y) some copy y′∈ f(y) in the clan of y is close to the chief of x: min_{y′∈ f(y)}d(y′,χ(x))≤ t· d(x,y). Our first result is a clan embedding into a tree with multiplicative distortion O(logn/є) such that each point has 1+є copies (in expectation). In addition, we provide a “spanning” version of this theorem for graphs and use it to devise the first compact routing scheme with constant size routing tables. We then focus on minor-free graphs of diameter parameterized by D, which were known to be stochastically embeddable into bounded treewidth graphs with expected additive distortion є D. We devise Ramsey-type embedding and clan embedding analogs of the stochastic embedding. We use these embeddings to construct the first (bicriteria quasipolynomial time) approximation scheme for the metric ρ-dominating set and metric ρ-independent set problems in minor-free graphs. 
@InProceedings{STOC21p342, author = {Arnold Filtser and Hung Le}, title = {Clan Embeddings into Trees, and Low Treewidth Graphs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {342-355}, doi = {10.1145/3406325.3451043}, year = {2021}, }

Fischer, Nick 
STOC '21: "Sparse Nonnegative Convolution ..."
Sparse Nonnegative Convolution Is Equivalent to Dense Nonnegative Convolution
Karl Bringmann, Nick Fischer, and Vasileios Nakos (Saarland University, Germany; MPI-INF, Germany) Computing the convolution A ⋆ B of two length-n vectors A,B is a ubiquitous computational primitive, with applications in a variety of disciplines. Within theoretical computer science, applications range from string problems to Knapsack-type problems, and from 3SUM to All-Pairs Shortest Paths. These applications often come in the form of nonnegative convolution, where the entries of A,B are nonnegative integers. The classical algorithm to compute A⋆ B uses the Fast Fourier Transform (FFT) and runs in time O(n logn). However, in many cases A and B might satisfy sparsity conditions, and hence one could hope for significant gains compared to the standard FFT algorithm. The ideal goal would be an O(k logk)-time algorithm, where k is the number of nonzero elements in the output, i.e., the size of the support of A ⋆ B. This problem is referred to as sparse nonnegative convolution, and has received a considerable amount of attention in the literature; the fastest algorithms to date run in time O(k log^{2} n). The main result of this paper is the first O(k logk)-time algorithm for sparse nonnegative convolution. Our algorithm is randomized and assumes that the length n and the largest entry of A and B are subexponential in k. Surprisingly, we can phrase our algorithm as a reduction from the sparse case to the dense case of nonnegative convolution, showing that, under some mild assumptions, sparse nonnegative convolution is equivalent to dense nonnegative convolution for constant-error randomized algorithms. Specifically, if D(n) is the time to convolve two nonnegative length-n vectors with success probability 2/3, and S(k) is the time to convolve two nonnegative vectors with output size k with success probability 2/3, then S(k) = O(D(k) + k (loglogk)^{2}). 
Our approach uses a variety of new techniques in combination with some old machinery from linear sketching and structured linear algebra, as well as new insights on linear hashing, the most classical hash function. @InProceedings{STOC21p1711, author = {Karl Bringmann and Nick Fischer and Vasileios Nakos}, title = {Sparse Nonnegative Convolution Is Equivalent to Dense Nonnegative Convolution}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1711-1724}, doi = {10.1145/3406325.3451090}, year = {2021}, }
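The sparse setting is easy to state in code: when the vectors are given by their nonzero supports, the naive baseline touches every pair of nonzero entries, costing O(|A|·|B|) time. A minimal sketch of that baseline (not the paper's O(k log k) reduction; the dict-of-supports representation is our own choice):

```python
from collections import defaultdict

def sparse_convolution(A, B):
    """Convolve two sparse nonnegative vectors represented as
    {index: value} dicts, so (A * B)[k] = sum over i+j=k of A[i]*B[j].
    Touches only nonzero pairs; the result's keys are exactly the
    support of the output."""
    C = defaultdict(int)
    for i, a in A.items():
        for j, b in B.items():
            C[i + j] += a * b
    return dict(C)
```

Note the quadratic pair count can far exceed the output support size k (many pairs i+j can collide); the paper's result gets the output in time near-linear in k alone, matching the dense FFT bound with n replaced by k.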

Fusco, Federico 
STOC '21: "Efficient Two-Sided Markets ..."
Efficient Two-Sided Markets with Limited Information
Paul Dütting, Federico Fusco, Philip Lazos, Stefano Leonardi, and Rebecca Reiffenhäuser (Google Research, Switzerland; Sapienza University of Rome, Italy) A celebrated impossibility result by Myerson and Satterthwaite (1983) shows that any truthful mechanism for two-sided markets that maximizes social welfare must run a deficit, making it necessary to relax welfare efficiency and to use approximation mechanisms. Such mechanisms in general make extensive use of Bayesian priors. In this work, we investigate a question of increasing theoretical and practical importance: how much prior information is required to design mechanisms with near-optimal approximations? Our first contribution is a more general impossibility result stating that no meaningful approximation is possible without any prior information, expanding the famous impossibility result of Myerson and Satterthwaite. Our second contribution is that one single sample (one number per item), arguably a minimum-possible amount of prior information, from each seller distribution is sufficient for a large class of two-sided markets. We prove matching upper and lower bounds on the best approximation that can be obtained with one single sample for subadditive buyers and additive sellers, regardless of computational considerations. Our third contribution is the design of computationally efficient black-box reductions that turn any one-sided mechanism into a two-sided mechanism with a small loss in the approximation, while using only one single sample from each seller. On the way, our black-box-type mechanisms deliver several interesting positive results in their own right, often beating even the state of the art that uses full prior information. 
@InProceedings{STOC21p1452, author = {Paul Dütting and Federico Fusco and Philip Lazos and Stefano Leonardi and Rebecca Reiffenhäuser}, title = {Efficient Two-Sided Markets with Limited Information}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1452-1465}, doi = {10.1145/3406325.3451076}, year = {2021}, }

Gabriel, Franck 
STOC '21: "Neural Tangent Kernel: Convergence ..."
Neural Tangent Kernel: Convergence and Generalization in Neural Networks (Invited Paper)
Arthur Jacot, Franck Gabriel, and Clément Hongler (EPFL, Switzerland) The Neural Tangent Kernel is a new way to understand gradient descent in deep neural networks, connecting them with kernel methods. In this talk, I'll introduce this formalism, give a number of results on the Neural Tangent Kernel, and explain how they give us insight into the dynamics of neural networks during training and into their generalization features. @InProceedings{STOC21p6, author = {Arthur Jacot and Franck Gabriel and Clément Hongler}, title = {Neural Tangent Kernel: Convergence and Generalization in Neural Networks (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {6-6}, doi = {10.1145/3406325.3465355}, year = {2021}, }

Garg, Jugal 
STOC '21: "Approximating Nash Social ..."
Approximating Nash Social Welfare under Rado Valuations
Jugal Garg, Edin Husić, and László A. Végh (University of Illinois at Urbana-Champaign, USA; London School of Economics and Political Science, UK) We consider the problem of approximating maximum Nash social welfare (NSW) while allocating a set of indivisible items to n agents. The NSW is a popular objective that provides a balanced tradeoff between the often conflicting requirements of fairness and efficiency, defined as the weighted geometric mean of the agents’ valuations. For the symmetric additive case of the problem, where agents have the same weight with additive valuations, the first constant-factor approximation algorithm was obtained in 2015. Subsequent work has obtained constant-factor approximation algorithms for the symmetric case under mild generalizations of additive, and O(n)-approximation algorithms for subadditive valuations and for the asymmetric case. In this paper, we make significant progress towards both symmetric and asymmetric NSW problems. We present the first constant-factor approximation algorithm for the symmetric case under Rado valuations. Rado valuations form a general class of valuation functions that arise from maximum cost independent matching problems, including as special cases assignment (OXS) valuations and weighted matroid rank functions. Furthermore, our approach also gives the first constant-factor approximation algorithm for the asymmetric case under Rado valuations, provided that the maximum ratio between the weights is bounded by a constant. @InProceedings{STOC21p1412, author = {Jugal Garg and Edin Husić and László A. Végh}, title = {Approximating Nash Social Welfare under Rado Valuations}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1412-1425}, doi = {10.1145/3406325.3451031}, year = {2021}, }
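The objective itself is easy to state in code: in the symmetric additive case, the NSW of an allocation is the geometric mean of the agents' utilities, and tiny instances can be solved exhaustively. An illustrative brute force, exponential in the number of items and unrelated to the paper's approximation algorithm (names are ours):

```python
from itertools import product
from math import prod

def max_nsw(valuations, items):
    """Exact symmetric Nash social welfare for additive valuations,
    by trying every assignment of items to agents and returning the
    best geometric mean of utilities. valuations[a][i] is agent a's
    value for item i; items is the list of item indices to allocate."""
    n = len(valuations)
    best = 0.0
    for assign in product(range(n), repeat=len(items)):
        utils = [0] * n
        for item, agent in zip(items, assign):
            utils[agent] += valuations[agent][item]
        best = max(best, prod(utils) ** (1.0 / n))
    return best
```

The geometric mean illustrates the fairness/efficiency tradeoff the abstract describes: giving both items to one agent yields NSW 0, while the balanced allocation maximizes the product of utilities.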

Gartland, Peter 
STOC '21: "Finding Large Induced Sparse ..."
Finding Large Induced Sparse Subgraphs in C_{>t}-Free Graphs in Quasipolynomial Time
Peter Gartland, Daniel Lokshtanov, Marcin Pilipczuk, Michał Pilipczuk, and Paweł Rzążewski (University of California at Santa Barbara, USA; University of Warsaw, Poland; Warsaw University of Technology, Poland) For an integer t, a graph G is called C_{>t}-free if G does not contain any induced cycle on more than t vertices. We prove the following statement: for every pair of integers d and t and a statement φ, there exists an algorithm that, given an n-vertex C_{>t}-free graph G with weights on vertices, finds in time n^{O(log^{3} n)} a maximum-weight vertex subset S such that G[S] has degeneracy at most d and satisfies φ. The running time can be improved to n^{O(log^{2} n)} assuming G is P_{t}-free, that is, G does not contain an induced path on t vertices. This expands the recent results of the authors [FOCS 2020 and SOSA 2021] on the Maximum Weight Independent Set problem on P_{t}-free graphs in two directions: by encompassing the more general setting of C_{>t}-free graphs, and by being applicable to a much wider variety of problems, such as Maximum Weight Induced Forest or Maximum Weight Induced Planar Graph. @InProceedings{STOC21p330, author = {Peter Gartland and Daniel Lokshtanov and Marcin Pilipczuk and Michał Pilipczuk and Paweł Rzążewski}, title = {Finding Large Induced Sparse Subgraphs in <i>C<sub>>t</sub></i>-Free Graphs in Quasipolynomial Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {330-341}, doi = {10.1145/3406325.3451034}, year = {2021}, }

Gawrychowski, Paweł 
STOC '21: "Fully Dynamic Approximation ..."
Fully Dynamic Approximation of LIS in Polylogarithmic Time
Paweł Gawrychowski and Wojciech Janczewski (University of Wrocław, Poland) We revisit the problem of maintaining the longest increasing subsequence (LIS) of an array under (i) inserting an element, and (ii) deleting an element of an array. In a recent breakthrough, Mitzenmacher and Seddighin [STOC 2020] designed an algorithm that maintains an O((1/є)^{O(1/є)})-approximation of LIS under both operations with worst-case update time Õ(n^{є}), for any constant є>0 (Õ hides factors polynomial in log n, where n is the length of the input). We exponentially improve on their result by designing an algorithm that maintains a (1+є)-approximation of LIS under both operations with worst-case update time Õ(є^{−5}). Instead of working with the grid packing technique introduced by Mitzenmacher and Seddighin, we take a different approach building on a new tool that might be of independent interest: LIS sparsification. A particularly interesting consequence of our result is an improved solution for the so-called Erdős–Szekeres partitioning, in which we seek a partition of a given permutation of {1,2,…,n} into O(√n) monotone subsequences. This problem has been repeatedly stated as one of the natural examples in which we see a large gap between the decision-tree complexity and algorithmic complexity. The result of Mitzenmacher and Seddighin implies an O(n^{1+є}) time solution for this problem, for any є>0. Our algorithm (in fact, its simpler decremental version) further improves this to Õ(n). @InProceedings{STOC21p654, author = {Paweł Gawrychowski and Wojciech Janczewski}, title = {Fully Dynamic Approximation of LIS in Polylogarithmic Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {654--667}, doi = {10.1145/3406325.3451137}, year = {2021}, } Publisher's Version 
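As background for the Erdős–Szekeres partitioning mentioned in the abstract, here is a minimal static sketch (not the paper's dynamic algorithm): patience-sorting LIS in O(n log n), plus the classical peel-and-cover partition into O(√n) monotone subsequences. All function names are illustrative.

```python
import bisect
import math

def lis_length(seq):
    """Longest increasing subsequence length via patience sorting, O(n log n)."""
    tails = []  # tails[k] = smallest tail of an increasing subsequence of length k+1
    for x in seq:
        k = bisect.bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)
        else:
            tails[k] = x
    return len(tails)

def extract_lis(seq):
    """Return (one longest increasing subsequence, the remaining elements)."""
    n = len(seq)
    tails, tail_idx, parent = [], [], [-1] * n
    for i, x in enumerate(seq):
        k = bisect.bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)
            tail_idx.append(i)
        else:
            tails[k] = x
            tail_idx[k] = i
        parent[i] = tail_idx[k - 1] if k > 0 else -1
    lis_idx = set()
    i = tail_idx[-1]
    while i != -1:          # walk back-pointers to recover one LIS
        lis_idx.add(i)
        i = parent[i]
    lis = [seq[i] for i in sorted(lis_idx)]
    rest = [seq[i] for i in range(n) if i not in lis_idx]
    return lis, rest

def erdos_szekeres_partition(perm):
    """Partition perm into O(sqrt(n)) monotone subsequences: peel off long
    increasing subsequences, then greedily cover the short-LIS remainder
    with decreasing subsequences (Dilworth)."""
    n = len(perm)
    threshold = max(1, math.isqrt(n))
    parts, rest = [], list(perm)
    while rest and lis_length(rest) >= threshold:
        lis, rest = extract_lis(rest)   # each peel removes >= threshold elements
        parts.append(lis)
    dec_parts = []                      # remainder has LIS < threshold
    for x in rest:
        for p in dec_parts:             # first-fit onto a decreasing pile
            if p[-1] > x:
                p.append(x)
                break
        else:
            dec_parts.append([x])
    return parts + dec_parts
```

The peel phase runs at most n/√n times and the greedy cover uses fewer than √n decreasing piles, giving O(√n) parts in total.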

Gay, Romain 
STOC '21: "Indistinguishability Obfuscation ..."
Indistinguishability Obfuscation from Circular Security
Romain Gay and Rafael Pass (IBM Research, Switzerland; Cornell Tech, USA) We show the existence of indistinguishability obfuscators (iO) for general circuits assuming subexponential security of: (a) the Learning with Errors (LWE) assumption (with subexponential modulus-to-noise ratio); (b) a circular security conjecture regarding the Gentry-Sahai-Waters (GSW) encryption scheme and a Packed version of Regev's encryption scheme. The circular security conjecture states that a notion of leakage-resilient security, that we prove is satisfied by GSW assuming LWE, is retained in the presence of an encrypted key-cycle involving GSW and Packed Regev. @InProceedings{STOC21p736, author = {Romain Gay and Rafael Pass}, title = {Indistinguishability Obfuscation from Circular Security}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {736--749}, doi = {10.1145/3406325.3451070}, year = {2021}, } Publisher's Version 

Gayen, Sutanu 
STOC '21: "Near-Optimal Learning of Tree-Structured ..."
Near-Optimal Learning of Tree-Structured Distributions by Chow-Liu
Arnab Bhattacharyya, Sutanu Gayen, Eric Price, and N. V. Vinodchandran (National University of Singapore, Singapore; University of Texas at Austin, USA; University of Nebraska-Lincoln, USA) We provide finite sample guarantees for the classical Chow-Liu algorithm (IEEE Trans. Inform. Theory, 1968) to learn a tree-structured graphical model of a distribution. For a distribution P on Σ^{n} and a tree T on n nodes, we say T is an ε-approximate tree for P if there is a T-structured distribution Q such that D(P ‖ Q) is at most ε more than that of the best possible tree-structured distribution for P. We show that if P itself is tree-structured, then the Chow-Liu algorithm with the plug-in estimator for mutual information with O(|Σ|^{3} n ε^{−1}) i.i.d. samples outputs an ε-approximate tree for P with constant probability. In contrast, for a general P (which may not be tree-structured), Ω(n^{2}ε^{−2}) samples are necessary to find an ε-approximate tree. Our upper bound is based on a new conditional independence tester that addresses an open problem posed by Canonne, Diakonikolas, Kane, and Stewart (STOC, 2018): we prove that for three random variables X,Y,Z each over Σ, testing if I(X; Y ∣ Z) is 0 or ≥ ε is possible with O(|Σ|^{3}/ε) samples. Finally, we show that for a specific tree T, with O(|Σ|^{2} n ε^{−1}) samples from a distribution P over Σ^{n}, one can efficiently learn the closest T-structured distribution in KL divergence by applying the add-1 estimator at each node. @InProceedings{STOC21p147, author = {Arnab Bhattacharyya and Sutanu Gayen and Eric Price and N. V. Vinodchandran}, title = {Near-Optimal Learning of Tree-Structured Distributions by Chow-Liu}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {147--160}, doi = {10.1145/3406325.3451066}, year = {2021}, } Publisher's Version Info 
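The classical Chow-Liu recipe the abstract analyzes (plug-in mutual-information estimates followed by a maximum spanning tree) can be sketched as follows. This is the textbook procedure, not the paper's finite-sample analysis, and all names are illustrative.

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information(samples, i, j):
    """Plug-in (empirical) mutual information between coordinates i and j."""
    n = len(samples)
    pij = Counter((s[i], s[j]) for s in samples)
    pi = Counter(s[i] for s in samples)
    pj = Counter(s[j] for s in samples)
    mi = 0.0
    for (a, b), c in pij.items():
        p_ab = c / n
        # p_ab / (p_a * p_b) = c * n / (count_a * count_b)
        mi += p_ab * math.log(p_ab * n * n / (pi[a] * pj[b]))
    return mi

def chow_liu_tree(samples, num_vars):
    """Maximum spanning tree under empirical pairwise mutual information
    (Kruskal's algorithm with a simple union-find)."""
    edges = sorted(
        ((mutual_information(samples, i, j), i, j)
         for i, j in combinations(range(num_vars), 2)),
        reverse=True)
    parent = list(range(num_vars))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    tree = []
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    return tree
```

On samples where coordinate 1 copies coordinate 0 and coordinate 2 is balanced independently, the heaviest edge is (0, 1), so it always enters the tree.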

Ghaffari, Mohsen 
STOC '21: "Hop-Constrained Oblivious ..."
Hop-Constrained Oblivious Routing
Mohsen Ghaffari, Bernhard Haeupler, and Goran Zuzic (ETH Zurich, Switzerland; Carnegie Mellon University, USA) We prove the existence of an oblivious routing scheme that is poly(log n)-competitive in terms of (congestion + dilation), thus resolving a well-known question in oblivious routing. Concretely, consider an undirected network and a set of packets each with its own source and destination. The objective is to choose a path for each packet, from its source to its destination, so as to minimize (congestion + dilation), defined as follows: The dilation is the maximum path hop-length, and the congestion is the maximum number of paths that include any single edge. The routing scheme obliviously and randomly selects a path for each packet independent of (the existence of) the other packets. Despite this obliviousness, the selected paths have (congestion + dilation) within a poly(log n) factor of the best possible value. More precisely, for any integer hop-bound h, this oblivious routing scheme selects paths of length at most h · poly(log n) and is poly(log n)-competitive in terms of congestion in comparison to the best possible congestion achievable via paths of length at most h hops. These paths can be sampled in polynomial time. This result can be viewed as an analogue of the celebrated oblivious routing results of Räcke [FOCS 2002, STOC 2008], which are O(log n)-competitive in terms of congestion, but are not competitive in terms of dilation. @InProceedings{STOC21p1208, author = {Mohsen Ghaffari and Bernhard Haeupler and Goran Zuzic}, title = {Hop-Constrained Oblivious Routing}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1208--1220}, doi = {10.1145/3406325.3451098}, year = {2021}, } Publisher's Version 

Gharan, Shayan Oveis 
STOC '21: "Log-Concave Polynomials IV: ..."
Log-Concave Polynomials IV: Approximate Exchange, Tight Mixing Times, and Near-Optimal Sampling of Forests
Nima Anari, Kuikui Liu, Shayan Oveis Gharan, Cynthia Vinzant, and Thuy-Duong Vuong (Stanford University, USA; University of Washington, USA; North Carolina State University, USA) We prove tight mixing time bounds for natural random walks on bases of matroids, determinantal distributions, and more generally distributions associated with log-concave polynomials. For a matroid of rank k on a ground set of n elements, or more generally distributions associated with log-concave polynomials of homogeneous degree k on n variables, we show that the down-up random walk, started from an arbitrary point in the support, mixes in time O(k log k). Our bound has no dependence on n or the starting point, unlike the previous analyses of Anari et al. (STOC 2019) and Cryan et al. (FOCS 2019), and is tight up to constant factors. The main new ingredient is a property we call approximate exchange, a generalization of well-studied exchange properties for matroids and valuated matroids, which may be of independent interest. In particular, given a distribution µ over size-k subsets of [n], our approximate exchange property implies that a simple local search algorithm gives a k^{O(k)}-approximation of max_{S} µ(S) when µ is generated by a log-concave polynomial, and that greedy gives the same approximation ratio when µ is strongly Rayleigh. As an application, we show how to leverage down-up random walks to approximately sample random forests or random spanning trees in a graph with n edges in time O(n log^{2} n). The best known result for sampling random forests was an FPAUS with high polynomial runtime recently found by Anari et al. (STOC 2019) and Cryan et al. (FOCS 2019). For spanning trees, we improve on the almost-linear time algorithm by Schild (STOC 2018). Our analysis works on weighted graphs too, and is the first to achieve nearly-linear running time for these problems. 
Our algorithms can be naturally extended to support approximately sampling from random forests of size between k_{1} and k_{2} in time O(n log^{2} n), for fixed parameters k_{1}, k_{2}. @InProceedings{STOC21p408, author = {Nima Anari and Kuikui Liu and Shayan Oveis Gharan and Cynthia Vinzant and Thuy-Duong Vuong}, title = {Log-Concave Polynomials IV: Approximate Exchange, Tight Mixing Times, and Near-Optimal Sampling of Forests}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {408--420}, doi = {10.1145/3406325.3451091}, year = {2021}, } Publisher's Version Info STOC '21: "A (Slightly) Improved Approximation ..." A (Slightly) Improved Approximation Algorithm for Metric TSP Anna R. Karlin, Nathan Klein, and Shayan Oveis Gharan (University of Washington, USA) For some ε > 10^{−36} we give a randomized (3/2 − ε)-approximation algorithm for metric TSP. @InProceedings{STOC21p32, author = {Anna R. Karlin and Nathan Klein and Shayan Oveis Gharan}, title = {A (Slightly) Improved Approximation Algorithm for Metric TSP}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {32--45}, doi = {10.1145/3406325.3451009}, year = {2021}, } Publisher's Version 
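The down-up random walk analyzed in the log-concave polynomials paper above has a simple generic form: drop a uniformly random element of the current size-k set, then re-add an element with probability proportional to µ of the resulting set. A sketch, with the uniform distribution over 3-subsets standing in for a log-concave µ (the function names and example are illustrative):

```python
import random

def down_up_step(state, ground, mu):
    """One down-up step for a distribution mu over k-subsets of `ground`:
    drop a uniform element (k -> k-1), then re-add one element with
    probability proportional to mu of the resulting k-subset."""
    s = set(state)
    s.remove(random.choice(sorted(s)))                       # down step
    candidates = [e for e in ground if e not in s and mu(s | {e}) > 0]
    weights = [mu(s | {e}) for e in candidates]
    r = random.random() * sum(weights)
    for e, w in zip(candidates, weights):                    # up step
        r -= w
        if r <= 0:
            s.add(e)
            break
    return frozenset(s)

# Example: mu uniform over all 3-subsets of {0..5}; by the paper's bound
# the walk mixes in O(k log k) steps from any starting subset.
random.seed(0)
ground = range(6)
state = frozenset({0, 1, 2})
for _ in range(100):
    state = down_up_step(state, ground, lambda S: 1.0)
```

For a matroid, µ would be supported on bases, and the up step would only consider elements whose addition yields a basis.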

Ghazi, Badih 
STOC '21: "Sample-Efficient Proper PAC ..."
Sample-Efficient Proper PAC Learning with Approximate Differential Privacy
Badih Ghazi, Noah Golowich, Ravi Kumar, and Pasin Manurangsi (Google Research, USA; Massachusetts Institute of Technology, USA) In this paper we prove that the sample complexity of properly learning a class of Littlestone dimension d with approximate differential privacy is Õ(d^{6}), ignoring privacy and accuracy parameters. This result answers a question of Bun et al. (FOCS 2020) by improving upon their upper bound of 2^{O(d)} on the sample complexity. Prior to our work, finiteness of the sample complexity for privately learning a class of finite Littlestone dimension was only known for improper private learners, and the fact that our learner is proper answers another question of Bun et al., which was also asked by Bousquet et al. (NeurIPS 2020). Using machinery developed by Bousquet et al., we then show that the sample complexity of sanitizing a binary hypothesis class is at most polynomial in its Littlestone dimension and dual Littlestone dimension. This implies that a class is sanitizable if and only if it has finite Littlestone dimension. An important ingredient of our proofs is a new property of binary hypothesis classes that we call irreducibility, which may be of independent interest. @InProceedings{STOC21p183, author = {Badih Ghazi and Noah Golowich and Ravi Kumar and Pasin Manurangsi}, title = {Sample-Efficient Proper PAC Learning with Approximate Differential Privacy}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {183--196}, doi = {10.1145/3406325.3451028}, year = {2021}, } Publisher's Version 

Ghoshal, Suprovat 
STOC '21: "Hardness of Learning DNFs ..."
Hardness of Learning DNFs using Halfspaces
Suprovat Ghoshal and Rishi Saket (University of Michigan, USA; IBM Research, India) The problem of learning t-term DNF formulas (for t = O(1)) has been studied extensively in the PAC model since its introduction by Valiant (STOC 1984). A t-term DNF can be efficiently learnt using a t-term DNF only if t = 1, i.e., when it is an AND, while even weakly learning a 2-term DNF using a constant-term DNF was shown to be NP-hard by Khot and Saket (FOCS 2008). On the other hand, Feldman, Guruswami, Raghavendra and Wu (FOCS 2009) showed the hardness of weakly learning a noisy AND using a halfspace (the latter being a generalization of an AND), while Khot and Saket (STOC 2008) showed that an intersection of two halfspaces is hard to weakly learn using any function of constantly many halfspaces. The question of whether a 2-term DNF is efficiently learnable using 2 or constantly many halfspaces remained open. In this work we answer this question in the negative by showing the hardness of weakly learning a 2-term DNF as well as a noisy AND using any function of a constant number of halfspaces. In particular we prove the following. For any constants ν, ζ > 0 and ℓ ∈ N, given a distribution over point-value pairs {0,1}^{n} × {0,1}, it is NP-hard to decide whether: (i) YES Case. There is a 2-term DNF that classifies all the points of the distribution, and an AND that classifies at least a 1−ζ fraction of the points correctly. (ii) NO Case. Any boolean function depending on at most ℓ halfspaces classifies at most a 1/2 + ν fraction of the points of the distribution correctly. Our result generalizes and strengthens the previous best results mentioned above on the hardness of learning a 2-term DNF, learning an intersection of two halfspaces, and learning a noisy AND. 
@InProceedings{STOC21p467, author = {Suprovat Ghoshal and Rishi Saket}, title = {Hardness of Learning DNFs using Halfspaces}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {467--480}, doi = {10.1145/3406325.3451067}, year = {2021}, } Publisher's Version 

Giakkoupis, George 
STOC '21: "Efficient Randomized DCAS ..."
Efficient Randomized DCAS
George Giakkoupis, Mehrdad Jafari Giv, and Philipp Woelfel (Inria, France; University of Rennes, France; CNRS, France; IRISA, France; University of Calgary, Canada) Double Compare-And-Swap (DCAS) is a tremendously useful synchronization primitive, which is also notoriously difficult to implement efficiently from objects that are provided by hardware. We present a randomized implementation of DCAS with O(log n) expected amortized step complexity against the oblivious adversary, where n is the number of processes in the system. This is the only algorithm to date that achieves sublinear step complexity. We achieve that by first implementing two novel algorithms as building blocks. One is a mechanism that allows processes to repeatedly agree on a random value among multiple proposed ones, and the other one is a restricted bipartite version of DCAS. @InProceedings{STOC21p1221, author = {George Giakkoupis and Mehrdad Jafari Giv and Philipp Woelfel}, title = {Efficient Randomized DCAS}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1221--1234}, doi = {10.1145/3406325.3451133}, year = {2021}, } Publisher's Version 
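For readers unfamiliar with the primitive, here is the sequential specification of DCAS as a lock-based reference model: atomically compare two memory words against expected values and, only if both match, write both new values. This is only the specification; the paper's contribution is a randomized lock-free implementation, which this sketch does not attempt. Class and method names are illustrative.

```python
import threading

class DCASMemory:
    """Lock-based reference model of Double Compare-And-Swap semantics."""

    def __init__(self, size):
        self.cells = [None] * size
        self._lock = threading.Lock()

    def dcas(self, a, exp_a, new_a, b, exp_b, new_b):
        """Atomically: if cells[a] == exp_a and cells[b] == exp_b,
        write new_a and new_b and return True; otherwise change
        nothing and return False."""
        with self._lock:
            if self.cells[a] == exp_a and self.cells[b] == exp_b:
                self.cells[a] = new_a
                self.cells[b] = new_b
                return True
            return False

mem = DCASMemory(2)
mem.cells[0], mem.cells[1] = "x", "y"
print(mem.dcas(0, "x", "x2", 1, "y", "y2"))       # → True  (both matched)
print(mem.dcas(0, "x", "x3", 1, "y2", "y3"))      # → False (cell 0 changed)
```

The all-or-nothing update across two independent words is exactly what makes DCAS hard to build from single-word hardware primitives.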

Gilyén, András 
STOC '21: "(Sub)Exponential Advantage ..."
(Sub)Exponential Advantage of Adiabatic Quantum Computation with No Sign Problem
András Gilyén, Matthew B. Hastings, and Umesh Vazirani (California Institute of Technology, USA; Microsoft Quantum, USA; Microsoft Research, USA; University of California at Berkeley, USA) We demonstrate the possibility of (sub)exponential quantum speedup via a quantum algorithm that follows an adiabatic path of a gapped Hamiltonian with no sign problem. The Hamiltonian that exhibits this speedup comes from the adjacency matrix of an undirected graph whose vertices are labeled by n-bit strings, and we can view the adiabatic evolution as an efficient O(poly(n))-time quantum algorithm for finding a specific “EXIT” vertex in the graph given the “ENTRANCE” vertex. On the other hand, we show that if the graph is given via an adjacency-list oracle, there is no classical algorithm that finds the “EXIT” with probability greater than exp(−n^{δ}) using at most exp(n^{δ}) queries for δ = 1/5 − o(1). Our construction of the graph is somewhat similar to the “welded-trees” construction of Childs et al., but uses additional ideas of Hastings for achieving a spectral gap and a short adiabatic path. @InProceedings{STOC21p1357, author = {András Gilyén and Matthew B. Hastings and Umesh Vazirani}, title = {(Sub)Exponential Advantage of Adiabatic Quantum Computation with No Sign Problem}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1357--1369}, doi = {10.1145/3406325.3451060}, year = {2021}, } Publisher's Version 

Giv, Mehrdad Jafari 
STOC '21: "Efficient Randomized DCAS ..."
Efficient Randomized DCAS
George Giakkoupis, Mehrdad Jafari Giv, and Philipp Woelfel (Inria, France; University of Rennes, France; CNRS, France; IRISA, France; University of Calgary, Canada) Double Compare-And-Swap (DCAS) is a tremendously useful synchronization primitive, which is also notoriously difficult to implement efficiently from objects that are provided by hardware. We present a randomized implementation of DCAS with O(log n) expected amortized step complexity against the oblivious adversary, where n is the number of processes in the system. This is the only algorithm to date that achieves sublinear step complexity. We achieve that by first implementing two novel algorithms as building blocks. One is a mechanism that allows processes to repeatedly agree on a random value among multiple proposed ones, and the other one is a restricted bipartite version of DCAS. @InProceedings{STOC21p1221, author = {George Giakkoupis and Mehrdad Jafari Giv and Philipp Woelfel}, title = {Efficient Randomized DCAS}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1221--1234}, doi = {10.1145/3406325.3451133}, year = {2021}, } Publisher's Version 

Goldberg, Paul W. 
STOC '21: "The Complexity of Gradient ..."
The Complexity of Gradient Descent: CLS = PPAD ∩ PLS
John Fearnley, Paul W. Goldberg, Alexandros Hollender, and Rahul Savani (University of Liverpool, UK; University of Oxford, UK) We study search problems that can be solved by performing Gradient Descent on a bounded convex polytopal domain and show that this class is equal to the intersection of two well-known classes: PPAD and PLS. As our main underlying technical contribution, we show that computing a Karush-Kuhn-Tucker (KKT) point of a continuously differentiable function over the domain [0,1]^{2} is PPAD ∩ PLS-complete. This is the first natural problem to be shown complete for this class. Our results also imply that the class CLS (Continuous Local Search), which was defined by Daskalakis and Papadimitriou as a more “natural” counterpart to PPAD ∩ PLS and contains many interesting problems, is itself equal to PPAD ∩ PLS. @InProceedings{STOC21p46, author = {John Fearnley and Paul W. Goldberg and Alexandros Hollender and Rahul Savani}, title = {The Complexity of Gradient Descent: CLS = PPAD ∩ PLS}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {46--59}, doi = {10.1145/3406325.3451052}, year = {2021}, } Publisher's Version 
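A standard way to search for a KKT point on the box, in the spirit of the Gradient Descent problem studied above, is projected gradient descent: a fixed point of the projected step is a KKT point of f on the domain. A minimal sketch; the step size, iteration count, and example function are illustrative.

```python
def project(v, lo=0.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^2 (coordinate-wise clip)."""
    return [min(hi, max(lo, x)) for x in v]

def projected_gradient_descent(grad, x0, step=0.1, iters=1000):
    """Iterate x <- Pi(x - step * grad(x)) on [0,1]^2.
    A fixed point of this map is a KKT point of f on the box."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x = project([x[i] - step * g[i] for i in range(2)])
    return x

# f(x, y) = (x - 2)^2 + (y - 2)^2: the unconstrained minimum (2, 2) lies
# outside the box, so the KKT point is the corner (1, 1), where the
# negative gradient points outward and is absorbed by the constraints.
grad = lambda v: [2 * (v[0] - 2), 2 * (v[1] - 2)]
x = projected_gradient_descent(grad, [0.0, 0.0])
print(x)  # → [1.0, 1.0]
```

For convex f this converges to the constrained minimum; for general smooth f it still stabilizes at a KKT point, which is exactly the solution concept whose complexity the paper pins down.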

Golowich, Noah 
STOC '21: "Sample-Efficient Proper PAC ..."
Sample-Efficient Proper PAC Learning with Approximate Differential Privacy
Badih Ghazi, Noah Golowich, Ravi Kumar, and Pasin Manurangsi (Google Research, USA; Massachusetts Institute of Technology, USA) In this paper we prove that the sample complexity of properly learning a class of Littlestone dimension d with approximate differential privacy is Õ(d^{6}), ignoring privacy and accuracy parameters. This result answers a question of Bun et al. (FOCS 2020) by improving upon their upper bound of 2^{O(d)} on the sample complexity. Prior to our work, finiteness of the sample complexity for privately learning a class of finite Littlestone dimension was only known for improper private learners, and the fact that our learner is proper answers another question of Bun et al., which was also asked by Bousquet et al. (NeurIPS 2020). Using machinery developed by Bousquet et al., we then show that the sample complexity of sanitizing a binary hypothesis class is at most polynomial in its Littlestone dimension and dual Littlestone dimension. This implies that a class is sanitizable if and only if it has finite Littlestone dimension. An important ingredient of our proofs is a new property of binary hypothesis classes that we call irreducibility, which may be of independent interest. @InProceedings{STOC21p183, author = {Badih Ghazi and Noah Golowich and Ravi Kumar and Pasin Manurangsi}, title = {Sample-Efficient Proper PAC Learning with Approximate Differential Privacy}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {183--196}, doi = {10.1145/3406325.3451028}, year = {2021}, } Publisher's Version 

Gonen, Alon 
STOC '21: "Boosting Simple Learners ..."
Boosting Simple Learners
Noga Alon, Alon Gonen, Elad Hazan, and Shay Moran (Princeton University, USA; Tel Aviv University, Israel; OrCam, Israel; Google AI, USA; Technion, Israel; Google Research, Israel) Boosting is a celebrated machine learning approach which is based on the idea of combining weak and moderately inaccurate hypotheses to a strong and accurate one. We study boosting under the assumption that the weak hypotheses belong to a class of bounded capacity. This assumption is inspired by the common convention that weak hypotheses are “rules-of-thumb” from an “easy-to-learn class”. (Schapire and Freund ’12, Shalev-Shwartz and Ben-David ’14.) Formally, we assume the class of weak hypotheses has a bounded VC dimension. We focus on two main questions: (i) Oracle Complexity: How many weak hypotheses are needed in order to produce an accurate hypothesis? We design a novel boosting algorithm and demonstrate that it circumvents a classical lower bound by Freund and Schapire (’95, ’12). Whereas the lower bound shows that Ω(1/γ^{2}) weak hypotheses with γ-margin are sometimes necessary, our new method requires only Õ(1/γ) weak hypotheses, provided that they belong to a class of bounded VC dimension. Unlike previous boosting algorithms which aggregate the weak hypotheses by majority votes, the new boosting algorithm uses more complex (“deeper”) aggregation rules. We complement this result by showing that complex aggregation rules are in fact necessary to circumvent the aforementioned lower bound. (ii) Expressivity: Which tasks can be learned by boosting weak hypotheses from a bounded VC class? Can complex concepts that are “far away” from the class be learned? Towards answering the first question we identify a combinatorial-geometric parameter which captures the expressivity of base classes in boosting. As a corollary we provide an affirmative answer to the second question for many well-studied classes, including halfspaces and decision stumps. 
Along the way, we establish and exploit connections with Discrepancy Theory. @InProceedings{STOC21p481, author = {Noga Alon and Alon Gonen and Elad Hazan and Shay Moran}, title = {Boosting Simple Learners}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {481--489}, doi = {10.1145/3406325.3451030}, year = {2021}, } Publisher's Version 
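The classical majority-vote baseline that the paper improves upon can be sketched with AdaBoost over decision stumps, a bounded-VC weak class. This is the textbook scheme with its weighted majority vote, not the paper's deeper aggregation rules; all names are illustrative.

```python
import math

def stump_predict(s, x):
    """Axis-aligned decision stump s = (feature, threshold, sign)."""
    feat, thresh, sign = s
    return sign if x[feat] <= thresh else -sign

def adaboost(points, labels, rounds=10):
    """AdaBoost: reweight examples, pick the best stump each round,
    and aggregate by a weighted majority vote."""
    n = len(points)
    d = [1.0 / n] * n                       # example weights
    ensemble = []
    for _ in range(rounds):
        best, best_err = None, 1.0
        for feat in range(len(points[0])):
            for thresh in sorted({x[feat] for x in points}):
                for sign in (1, -1):
                    err = sum(w for x, y, w in zip(points, labels, d)
                              if stump_predict((feat, thresh, sign), x) != y)
                    if err < best_err:
                        best, best_err = (feat, thresh, sign), err
        best_err = max(best_err, 1e-12)
        alpha = 0.5 * math.log((1 - best_err) / best_err)
        ensemble.append((alpha, best))
        # up-weight mistakes, down-weight correct examples, renormalize
        d = [w * math.exp(-alpha * labels[i] * stump_predict(best, points[i]))
             for i, w in enumerate(d)]
        z = sum(d)
        d = [w / z for w in d]
    return ensemble

def predict(ensemble, x):
    return 1 if sum(a * stump_predict(s, x) for a, s in ensemble) >= 0 else -1
```

On the 1-D alternating labels below, no single stump gets everything right, but the weighted vote of a few stumps does, which is the phenomenon boosting exploits.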

Göös, Mika 
STOC '21: "Automating Algebraic Proof ..."
Automating Algebraic Proof Systems Is NP-Hard
Susanna F. de Rezende, Mika Göös, Jakob Nordström, Toniann Pitassi, Robert Robere, and Dmitry Sokolov (Czech Academy of Sciences, Czechia; EPFL, Switzerland; University of Copenhagen, Denmark; Lund University, Sweden; University of Toronto, Canada; Institute for Advanced Study at Princeton, USA; McGill University, Canada; St. Petersburg State University, Russia; Russian Academy of Sciences, Russia) We show that algebraic proofs are hard to find: Given an unsatisfiable CNF formula F, it is NP-hard to find a refutation of F in the Nullstellensatz, Polynomial Calculus, or Sherali–Adams proof systems in time polynomial in the size of the shortest such refutation. Our work extends, and gives a simplified proof of, the recent breakthrough of Atserias and Müller (JACM 2020) that established an analogous result for Resolution. @InProceedings{STOC21p209, author = {Susanna F. de Rezende and Mika Göös and Jakob Nordström and Toniann Pitassi and Robert Robere and Dmitry Sokolov}, title = {Automating Algebraic Proof Systems Is NP-Hard}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {209--222}, doi = {10.1145/3406325.3451080}, year = {2021}, } Publisher's Version 

Gottlieb, Lee-Ad 
STOC '21: "Near-Linear Time Approximation ..."
Near-Linear Time Approximation Schemes for Steiner Tree and Forest in Low-Dimensional Spaces
Yair Bartal and Lee-Ad Gottlieb (Hebrew University of Jerusalem, Israel; Ariel University, Israel) We give an algorithm that computes a (1+є)-approximate Steiner forest in near-linear time n · 2^{(1/є)^{O(ddim^{2})} (log log n)^{2}}, where ddim is the doubling dimension of the metric space. This improves upon the best previous result due to Chan et al. (SIAM J. Comput., 2018), who gave a runtime of about n^{2^{O(ddim)}} · 2^{(ddim/є)^{O(ddim)} √log n}. For Steiner tree our methods achieve an even better runtime n (log n)^{(1/є)^{O(ddim^{2})}}. @InProceedings{STOC21p1028, author = {Yair Bartal and Lee-Ad Gottlieb}, title = {Near-Linear Time Approximation Schemes for Steiner Tree and Forest in Low-Dimensional Spaces}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1028--1041}, doi = {10.1145/3406325.3451063}, year = {2021}, } Publisher's Version 

Gouleakis, Themis 
STOC '21: "Optimal Testing of Discrete ..."
Optimal Testing of Discrete Distributions with High Probability
Ilias Diakonikolas, Themis Gouleakis, Daniel M. Kane, John Peebles, and Eric Price (University of Wisconsin-Madison, USA; MPI-INF, Germany; University of California at San Diego, USA; Princeton University, USA; University of Texas at Austin, USA) We study the problem of testing discrete distributions with a focus on the high probability regime. Specifically, given samples from one or more discrete distributions, a property P, and parameters 0 < є, δ < 1, we want to distinguish with probability at least 1−δ whether these distributions satisfy P or are є-far from P in total variation distance. Most prior work in distribution testing studied the constant confidence case (corresponding to δ = Ω(1)), and provided sample-optimal testers for a range of properties. While one can always boost the confidence probability of any such tester by black-box amplification, this generic boosting method typically leads to suboptimal sample bounds. Here we study the following broad question: For a given property P, can we characterize the sample complexity of testing P as a function of all relevant problem parameters, including the error probability δ? Prior to this work, uniformity testing was the only statistical task whose sample complexity had been characterized in this setting. As our main results, we provide the first algorithms for closeness and independence testing that are sample-optimal, within constant factors, as a function of all relevant parameters. We also show matching information-theoretic lower bounds on the sample complexity of these problems. Our techniques naturally extend to give optimal testers for related problems. To illustrate the generality of our methods, we give optimal algorithms for testing collections of distributions and testing closeness with unequal-sized samples. @InProceedings{STOC21p542, author = {Ilias Diakonikolas and Themis Gouleakis and Daniel M. Kane and John Peebles and Eric Price}, title = {Optimal Testing of Discrete Distributions with High Probability}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {542--555}, doi = {10.1145/3406325.3450997}, year = {2021}, } Publisher's Version 

Groenland, Carla 
STOC '21: "Optimal Labelling Schemes ..."
Optimal Labelling Schemes for Adjacency, Comparability, and Reachability
Marthe Bonamy, Louis Esperet, Carla Groenland, and Alex Scott (CNRS, France; LaBRI, France; University of Bordeaux, France; G-SCOP, France; Grenoble Alps University, France; University of Oxford, UK) We construct asymptotically optimal adjacency labelling schemes for every hereditary class containing 2^{Ω(n^{2})} n-vertex graphs as n→ ∞. This regime contains many classes of interest, for instance perfect graphs or comparability graphs, for which we obtain an adjacency labelling scheme with labels of n/4+o(n) bits per vertex. This implies the existence of a reachability labelling scheme for digraphs with labels of n/4+o(n) bits per vertex and a comparability labelling scheme for posets with labels of n/4+o(n) bits per element. All these results are best possible, up to the lower order term. @InProceedings{STOC21p1109, author = {Marthe Bonamy and Louis Esperet and Carla Groenland and Alex Scott}, title = {Optimal Labelling Schemes for Adjacency, Comparability, and Reachability}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1109--1117}, doi = {10.1145/3406325.3451102}, year = {2021}, } Publisher's Version 

Grosof, Isaac 
STOC '21: "Load Balancing Guardrails: ..."
Load Balancing Guardrails: Keeping Your Heavy Traffic on the Road to Low Response Times (Invited Paper)
Isaac Grosof, Ziv Scully, and Mor Harchol-Balter (Carnegie Mellon University, USA) This talk is about scheduling and load balancing in a multi-server system, with the goal of minimizing mean response time in a general stochastic setting. We will specifically concentrate on the common case of a load balancing system, where a front-end load balancer (a.k.a. dispatcher) dispatches requests to multiple back-end servers, each with its own queue. Much is known about load balancing in the case where the scheduling at the servers is First-Come-First-Served (FCFS). However, to minimize mean response time, we need to use Shortest-Remaining-Processing-Time (SRPT) scheduling at the servers. Unfortunately, there is almost nothing known about optimal dispatching when SRPT scheduling is used at the servers. To make things worse, it turns out that the traditional dispatching policies that are used in practice with FCFS servers often have poor performance in systems with SRPT servers. In this talk, we devise a simple fix that can be applied to any dispatching policy. This fix, called "guardrails", ensures that the dispatching policy yields optimal mean response time under heavy traffic, when used in a system with SRPT servers. Any dispatching policy, when augmented with guardrails, becomes heavy-traffic optimal. Our results also yield the first analytical bounds on mean response time for load balancing systems with SRPT scheduling at the servers. Load balancing and scheduling are highly studied both in the stochastic and the worst-case scheduling communities. One aim of this talk is to contrast some differences in the approaches of the two communities when tackling multi-server scheduling problems. 
@InProceedings{STOC21p10, author = {Isaac Grosof and Ziv Scully and Mor Harchol-Balter}, title = {Load Balancing Guardrails: Keeping Your Heavy Traffic on the Road to Low Response Times (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {10--10}, doi = {10.1145/3406325.3465359}, year = {2021}, } Publisher's Version 
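The FCFS-vs-SRPT contrast described in the talk can already be seen on a single server: SRPT preempts in favor of the job with the least remaining work. A small event-driven simulation sketch (not the paper's dispatching policies or guardrails; names are illustrative):

```python
import heapq

def srpt_mean_response(jobs):
    """Preemptive Shortest-Remaining-Processing-Time on one server.
    jobs: list of (arrival_time, size). Returns mean response time
    (completion time minus arrival time, averaged over jobs)."""
    jobs = sorted(jobs)
    t, i, done = 0.0, 0, []
    active = []                         # min-heap of (remaining_work, arrival)
    while i < len(jobs) or active:
        if not active:                  # idle: jump to the next arrival
            t = max(t, jobs[i][0])
        while i < len(jobs) and jobs[i][0] <= t:
            heapq.heappush(active, (jobs[i][1], jobs[i][0]))
            i += 1
        rem, arr = heapq.heappop(active)
        # run the shortest job until it finishes or the next arrival preempts
        horizon = jobs[i][0] if i < len(jobs) else float("inf")
        run = min(rem, horizon - t)
        t += run
        if run < rem:
            heapq.heappush(active, (rem - run, arr))
        else:
            done.append(t - arr)
    return sum(done) / len(done)

# A size-4 job at t=0 and a size-1 job at t=1: FCFS gives mean response
# time 4, while SRPT preempts for the short job and achieves mean 3.
print(srpt_mean_response([(0, 4), (1, 1)]))  # → 3.0
```

Letting short jobs jump the queue is exactly why mean-response-time-optimal dispatching must account for SRPT at the servers rather than assume FCFS.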

Guo, Zeyu 
STOC '21: "Efficient List-Decoding with ..."
Efficient List-Decoding with Constant Alphabet and List Sizes
Zeyu Guo and Noga Ron-Zewi (University of Haifa, Israel) We present an explicit and efficient algebraic construction of capacity-achieving list-decodable codes with both constant alphabet and constant list sizes. More specifically, for any R ∈ (0,1) and є>0, we give an algebraic construction of an infinite family of error-correcting codes of rate R, over an alphabet of size (1/є)^{O(1/є^{2})}, that can be list decoded from a (1−R−є)-fraction of errors with list size at most exp(poly(1/є)). Moreover, the codes can be encoded in time poly(1/є, n), the output list is contained in a linear subspace of dimension at most poly(1/є), and a basis for this subspace can be found in time poly(1/є, n). Thus, both encoding and list decoding can be performed in fully polynomial time poly(1/є, n), except for pruning the subspace and outputting the final list which takes time exp(poly(1/є)) · poly(n). In contrast, prior explicit and efficient constructions of capacity-achieving list-decodable codes either required a much higher complexity in terms of 1/є (and were additionally much less structured), or had super-constant alphabet or list sizes. Our codes are quite natural and structured. Specifically, we use algebraic-geometric (AG) codes with evaluation points restricted to a subfield, and with the message space restricted to a (carefully chosen) linear subspace. Our main observation is that the output list of AG codes with subfield evaluation points is contained in an affine shift of the image of a block-triangular-Toeplitz (BTT) matrix, and that the list size can potentially be reduced to a constant by restricting the message space to a BTT-evasive subspace, which is a large subspace that intersects the image of any BTT matrix in a constant number of points. We further show how to explicitly construct such BTT-evasive subspaces, based on the explicit subspace designs of Guruswami and Kopparty (Combinatorica, 2016), and composition. 
@InProceedings{STOC21p1502, author = {Zeyu Guo and Noga Ron-Zewi}, title = {Efficient List-Decoding with Constant Alphabet and List Sizes}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1502--1515}, doi = {10.1145/3406325.3451046}, year = {2021}, } 

Gupta, Anupam 
STOC '21: "Chasing Convex Bodies with ..."
Chasing Convex Bodies with Linear Competitive Ratio (Invited Paper)
C. J. Argue, Anupam Gupta, Guru Guruganesh, and Ziye Tang (Carnegie Mellon University, USA; Google Research, USA) The problem of chasing convex functions is easy to state: faced with a sequence of convex functions f_t over d-dimensional Euclidean space, the goal of the algorithm is to output a point x_t at each time, so that the sum of the function costs f_t(x_t), plus the movement costs ‖x_t − x_{t−1}‖, is minimized. This problem generalizes questions in online algorithms such as caching and the k-server problem. In 1994, Friedman and Linial posed the question of getting an algorithm with a competitive ratio that depends only on the dimension d. In this talk we give an O(d)-competitive algorithm, based on the notion of the Steiner point of a convex body. @InProceedings{STOC21p5, author = {C. J. Argue and Anupam Gupta and Guru Guruganesh and Ziye Tang}, title = {Chasing Convex Bodies with Linear Competitive Ratio (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {5--5}, doi = {10.1145/3406325.3465354}, year = {2021}, } STOC '21: "A Quasipolynomial (2 + ε)-Approximation ..." A Quasipolynomial (2 + ε)-Approximation for Planar Sparsest Cut Vincent Cohen-Addad, Anupam Gupta, Philip N. Klein, and Jason Li (Google, Switzerland; Carnegie Mellon University, USA; Brown University, USA) The (non-uniform) sparsest cut problem is the following graph-partitioning problem: given a "supply" graph, and demands on pairs of vertices, delete some subset of supply edges to minimize the ratio of the supply edges cut to the total demand of the pairs separated by this deletion. Despite much effort, there are only a handful of nontrivial classes of supply graphs for which constant-factor approximations are known. We consider the problem for planar graphs, and give a (2+ε)-approximation algorithm that runs in quasipolynomial time. Our approach defines a new structural decomposition of an optimal solution using a "patching" primitive. 
We combine this decomposition with a Sherali-Adams-style linear programming relaxation of the problem, which we then round. This should be compared with the polynomial-time approximation algorithm of Rao (1999), which uses the metric linear programming relaxation and ℓ_1-embeddings, and achieves an O(√(log n))-approximation in polynomial time. @InProceedings{STOC21p1056, author = {Vincent Cohen-Addad and Anupam Gupta and Philip N. Klein and Jason Li}, title = {A Quasipolynomial (2 + <i>ε</i>)-Approximation for Planar Sparsest Cut}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1056--1069}, doi = {10.1145/3406325.3451103}, year = {2021}, } 

Guruganesh, Guru 
STOC '21: "Chasing Convex Bodies with ..."
Chasing Convex Bodies with Linear Competitive Ratio (Invited Paper)
C. J. Argue, Anupam Gupta, Guru Guruganesh, and Ziye Tang (Carnegie Mellon University, USA; Google Research, USA) The problem of chasing convex functions is easy to state: faced with a sequence of convex functions f_t over d-dimensional Euclidean space, the goal of the algorithm is to output a point x_t at each time, so that the sum of the function costs f_t(x_t), plus the movement costs ‖x_t − x_{t−1}‖, is minimized. This problem generalizes questions in online algorithms such as caching and the k-server problem. In 1994, Friedman and Linial posed the question of getting an algorithm with a competitive ratio that depends only on the dimension d. In this talk we give an O(d)-competitive algorithm, based on the notion of the Steiner point of a convex body. @InProceedings{STOC21p5, author = {C. J. Argue and Anupam Gupta and Guru Guruganesh and Ziye Tang}, title = {Chasing Convex Bodies with Linear Competitive Ratio (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {5--5}, doi = {10.1145/3406325.3465354}, year = {2021}, } 
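The Steiner point underlying the O(d)-competitive algorithm above has a convenient probabilistic description: it is the expectation, over a uniformly random direction, of the point of the body maximizing that linear objective. A minimal Monte Carlo sketch for planar polytopes, assuming a vertex-list representation (the function name and setup are illustrative, not from the paper):

```python
import math
import random

def steiner_point_estimate(vertices, samples=20000, seed=0):
    """Monte Carlo estimate of the Steiner point of a planar polytope,
    given as a vertex list: average, over uniformly random unit
    directions, of the vertex maximizing the linear objective."""
    rng = random.Random(seed)
    sx = sy = 0.0
    for _ in range(samples):
        t = rng.uniform(0.0, 2.0 * math.pi)
        dx, dy = math.cos(t), math.sin(t)
        vx, vy = max(vertices, key=lambda v: dx * v[0] + dy * v[1])
        sx += vx
        sy += vy
    return sx / samples, sy / samples

# The Steiner point of the square [-1, 1]^2 is its centre (0, 0).
square = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
print(steiner_point_estimate(square))
```

By symmetry the estimate concentrates around the origin; the sampling error shrinks as 1/√samples.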

Gurvits, Leonid 
STOC '21: "Capacity Lower Bounds via ..."
Capacity Lower Bounds via Productization
Leonid Gurvits and Jonathan Leake (City College of New York, USA; TU Berlin, Germany) We give a sharp lower bound on the capacity of a real stable polynomial, depending only on the value of its gradient at x = 1. This result implies a sharp improvement to a similar inequality proved by Linial-Samorodnitsky-Wigderson in 2000, which was crucial to the analysis of their permanent approximation algorithm. Such inequalities have played an important role in the recent work on operator scaling and its generalizations and applications, and in fact we use our bound to construct a new scaling algorithm for real stable polynomials. Our bound is also quite similar to one used very recently by Karlin-Klein-Oveis Gharan to give an improved approximation factor for metric TSP. The new technique we develop to prove this bound is productization, which says that any real stable polynomial can be approximated at any point in the positive orthant by a product of linear forms. Beyond the results of this paper, our main hope is that this new technique will allow us to avoid "frightening technicalities", in the words of Laurent and Schrijver, that often accompany combinatorial lower bounds. In particular, we believe that this technique will be useful towards simplifying and further improving the approximation factor given in the fantastic work of Karlin-Klein-Oveis Gharan on metric TSP. @InProceedings{STOC21p847, author = {Leonid Gurvits and Jonathan Leake}, title = {Capacity Lower Bounds via Productization}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {847--858}, doi = {10.1145/3406325.3451105}, year = {2021}, } 
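For context, the capacity referred to here is the standard quantity cap_α(p) = inf_{x>0} p(x)/∏ x_i^{α_i} from this line of work. A toy numerical sketch, assuming a crude grid search (all names are illustrative), on a product of linear forms, where the AM-GM inequality pins the answer to exactly 1:

```python
import math

def capacity2(p, alpha, grid=200, lo=-3.0, hi=3.0):
    """Crude grid search in log-coordinates for the capacity
    inf_{x,y > 0} p(x, y) / (x^alpha[0] * y^alpha[1]) of a
    bivariate polynomial p (a toy stand-in for the real thing)."""
    best = float("inf")
    for i in range(grid + 1):
        x = math.exp(lo + (hi - lo) * i / grid)
        for j in range(grid + 1):
            y = math.exp(lo + (hi - lo) * j / grid)
            best = min(best, p(x, y) / (x ** alpha[0] * y ** alpha[1]))
    return best

# Real stable example: p(x, y) = ((x + y) / 2)^2 is a product of two
# linear forms; by AM-GM its capacity at alpha = (1, 1) is exactly 1,
# attained on the diagonal x = y.
p = lambda x, y: ((x + y) / 2.0) ** 2
print(capacity2(p, (1, 1)))
```

The infimum of a log-convex objective like this can also be found by convex optimization in log-coordinates; the grid search is just the shortest way to illustrate the definition.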

Haah, Jeongwan 
STOC '21: "Fiber Bundle Codes: Breaking ..."
Fiber Bundle Codes: Breaking the N^{1/2} polylog(N) Barrier for Quantum LDPC Codes
Matthew B. Hastings, Jeongwan Haah, and Ryan O'Donnell (Station Q, USA; Microsoft Quantum, USA; Carnegie Mellon University, USA) We present a quantum LDPC code family that has distance Ω(N^{3/5}/polylog(N)) and Θ(N^{3/5}) logical qubits, where N is the code length. This is the first quantum LDPC code construction that achieves distance greater than N^{1/2} polylog(N). The construction is based on generalizing the homological product of codes to a fiber bundle. @InProceedings{STOC21p1276, author = {Matthew B. Hastings and Jeongwan Haah and Ryan O'Donnell}, title = {Fiber Bundle Codes: Breaking the <i>N</i><sup>1/2</sup> polylog(<i>N</i>) Barrier for Quantum LDPC Codes}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1276--1288}, doi = {10.1145/3406325.3451005}, year = {2021}, } 

Haeupler, Bernhard 
STOC '21: "Tree Embeddings for Hop-Constrained ..."
Tree Embeddings for Hop-Constrained Network Design
Bernhard Haeupler, D. Ellis Hershkowitz, and Goran Zuzic (Carnegie Mellon University, USA; ETH Zurich, Switzerland) Network design problems aim to compute low-cost structures such as routes, trees and subgraphs. Often, it is natural and desirable to require that these structures have small hop length or hop diameter. Unfortunately, optimization problems with hop constraints are much harder and less well understood than their hop-unconstrained counterparts. A significant algorithmic barrier in this setting is the fact that hop-constrained distances in graphs are very far from being a metric. We show that, nonetheless, hop-constrained distances can be approximated by distributions over "partial tree metrics." We build this result into a powerful and versatile algorithmic tool which, similarly to classic probabilistic tree embeddings, reduces hop-constrained problems in general graphs to hop-unconstrained problems on trees. We then use this tool to give the first polylogarithmic bicriteria approximations for the hop-constrained variants of many classic network design problems. These include Steiner forest, group Steiner tree, group Steiner forest, buy-at-bulk network design, as well as online and oblivious versions of many of these problems. @InProceedings{STOC21p356, author = {Bernhard Haeupler and D. Ellis Hershkowitz and Goran Zuzic}, title = {Tree Embeddings for Hop-Constrained Network Design}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {356--369}, doi = {10.1145/3406325.3451053}, year = {2021}, } STOC '21: "Hop-Constrained Oblivious ..." Hop-Constrained Oblivious Routing Mohsen Ghaffari, Bernhard Haeupler, and Goran Zuzic (ETH Zurich, Switzerland; Carnegie Mellon University, USA) We prove the existence of an oblivious routing scheme that is poly(log n)-competitive in terms of (congestion + dilation), thus resolving a well-known question in oblivious routing. 
Concretely, consider an undirected network and a set of packets, each with its own source and destination. The objective is to choose a path for each packet, from its source to its destination, so as to minimize (congestion + dilation), defined as follows: the dilation is the maximum path hop-length, and the congestion is the maximum number of paths that include any single edge. The routing scheme obliviously and randomly selects a path for each packet independent of (the existence of) the other packets. Despite this obliviousness, the selected paths have (congestion + dilation) within a poly(log n) factor of the best possible value. More precisely, for any integer hop-bound h, this oblivious routing scheme selects paths of length at most h · poly(log n) and is poly(log n)-competitive in terms of congestion in comparison to the best possible congestion achievable via paths of length at most h hops. These paths can be sampled in polynomial time. This result can be viewed as an analogue of the celebrated oblivious routing results of Räcke [FOCS 2002, STOC 2008], which are O(log n)-competitive in terms of congestion, but are not competitive in terms of dilation. @InProceedings{STOC21p1208, author = {Mohsen Ghaffari and Bernhard Haeupler and Goran Zuzic}, title = {Hop-Constrained Oblivious Routing}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1208--1220}, doi = {10.1145/3406325.3451098}, year = {2021}, } STOC '21: "Universally-Optimal Distributed ..." Universally-Optimal Distributed Algorithms for Known Topologies Bernhard Haeupler, David Wajc, and Goran Zuzic (Carnegie Mellon University, USA; ETH Zurich, Switzerland; Stanford University, USA) Many distributed optimization algorithms achieve existentially-optimal running times, meaning that there exists some pathological worst-case topology on which no algorithm can do better. Still, most networks of interest allow for exponentially faster algorithms. 
This motivates two questions: (i) What network topology parameters determine the complexity of distributed optimization? (ii) Are there universally-optimal algorithms that are as fast as possible on every topology? We resolve these 25-year-old open problems in the known-topology setting (i.e., supported CONGEST) for a wide class of global network optimization problems including MST, (1+ε)-min cut, various approximate shortest paths problems, subgraph connectivity, etc. In particular, we provide several (equivalent) graph parameters and show they are tight universal lower bounds for the above problems, fully characterizing their inherent complexity. Our results also imply that algorithms based on the low-congestion shortcut framework match the above lower bound, making them universally optimal if shortcuts are efficiently approximable. @InProceedings{STOC21p1166, author = {Bernhard Haeupler and David Wajc and Goran Zuzic}, title = {Universally-Optimal Distributed Algorithms for Known Topologies}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1166--1179}, doi = {10.1145/3406325.3451081}, year = {2021}, } 
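The (congestion + dilation) objective from the routing abstract above is straightforward to evaluate for any fixed choice of paths; a small illustrative helper (names are assumptions, not from the paper):

```python
from collections import Counter

def congestion_dilation(paths):
    """Given each packet's routing path as a list of vertices, return
    (congestion, dilation): congestion is the maximum number of paths
    using any single (undirected) edge, dilation the maximum hop-length."""
    edge_use = Counter()
    for path in paths:
        for u, v in zip(path, path[1:]):
            edge_use[frozenset((u, v))] += 1
    congestion = max(edge_use.values(), default=0)
    dilation = max((len(p) - 1 for p in paths), default=0)
    return congestion, dilation

# Two packets share edge (b, c); the longer path has 3 hops.
paths = [["a", "b", "c"], ["d", "b", "c", "e"]]
print(congestion_dilation(paths))  # (2, 3)
```

The oblivious-routing guarantee says such path choices can be sampled per-packet, independently, while keeping this sum within polylogarithmic factors of optimal.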

Halldórsson, Magnús M. 
STOC '21: "Efficient Randomized Distributed ..."
Efficient Randomized Distributed Coloring in CONGEST
Magnús M. Halldórsson, Fabian Kuhn, Yannic Maus, and Tigran Tonoyan (Reykjavik University, Iceland; University of Freiburg, Germany; Technion, Israel) Distributed vertex coloring is one of the classic and probably most widely studied problems in the area of distributed graph algorithms. We present a new randomized distributed vertex coloring algorithm for the standard CONGEST model, where the network is modeled as an n-node graph G, and where the nodes of G operate in synchronous communication rounds in which they can exchange O(log n)-bit messages over all the edges of G. For graphs with maximum degree Δ, we show that the (Δ+1)-list coloring problem (and therefore also the standard (Δ+1)-coloring problem) can be solved in O(log^5 log n) rounds. Previously such a result was only known for the significantly more powerful LOCAL model, where in each round, neighboring nodes can exchange messages of arbitrary size. The best previous (Δ+1)-coloring algorithm in the CONGEST model had a running time of O(log Δ + log^6 log n) rounds. As a function of n alone, the best previous algorithm therefore had a round complexity of O(log n), which is a bound that can also be achieved by a naïve folklore algorithm. For large maximum degree Δ, our algorithm hence is an exponential improvement over the previous state of the art. @InProceedings{STOC21p1180, author = {Magnús M. Halldórsson and Fabian Kuhn and Yannic Maus and Tigran Tonoyan}, title = {Efficient Randomized Distributed Coloring in CONGEST}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1180--1193}, doi = {10.1145/3406325.3451089}, year = {2021}, } 
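For intuition about the problem (not the paper's algorithm), one round of the naïve folklore randomized trial mentioned above can be sketched as follows: each uncolored node proposes a random color not used by an already colored neighbor, and keeps it if no neighbor made the same proposal. All names here are illustrative:

```python
import random

def one_coloring_round(adj, colors, palette, rng):
    """One round of the naive randomized coloring trial: every
    uncolored node proposes a random color not already used by a
    colored neighbor, and keeps it only if no neighbor proposed
    the same color in this round."""
    proposals = {}
    for v in adj:
        if colors[v] is None:
            taken = {colors[u] for u in adj[v] if colors[u] is not None}
            # With a palette of Delta + 1 colors, some color is always free.
            proposals[v] = rng.choice([c for c in palette if c not in taken])
    for v, c in proposals.items():
        if all(proposals.get(u) != c for u in adj[v]):
            colors[v] = c

# Triangle graph with palette size Delta + 1 = 3; iterate until done.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
colors = {v: None for v in adj}
rng = random.Random(1)
for _ in range(1000):
    if all(c is not None for c in colors.values()):
        break
    one_coloring_round(adj, colors, [0, 1, 2], rng)
print(colors)
```

Each node succeeds with constant probability per round, so all nodes finish after O(log n) rounds with high probability, matching the naïve bound the abstract improves on.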

Hanneke, Steve 
STOC '21: "A Theory of Universal Learning ..."
A Theory of Universal Learning
Olivier Bousquet, Steve Hanneke, Shay Moran, Ramon van Handel, and Amir Yehudayoff (Google, Switzerland; Toyota Technological Institute at Chicago, USA; Technion, Israel; Google Research, Israel; Princeton University, USA) How quickly can a given class of concepts be learned from examples? It is common to measure the performance of a supervised machine learning algorithm by plotting its "learning curve", that is, the decay of the error rate as a function of the number of training examples. However, the classical theoretical framework for understanding learnability, the PAC model of Vapnik-Chervonenkis and Valiant, does not explain the behavior of learning curves: the distribution-free PAC model of learning can only bound the upper envelope of the learning curves over all possible data distributions. This does not match the practice of machine learning, where the data source is typically fixed in any given scenario, while the learner may choose the number of training examples on the basis of factors such as computational resources and desired accuracy. In this paper, we study an alternative learning model that better captures such practical aspects of machine learning, but still gives rise to a complete theory of the learnable in the spirit of the PAC model. More precisely, we consider the problem of universal learning, which aims to understand the performance of learning algorithms on every data distribution, but without requiring uniformity over the distribution. The main result of this paper is a remarkable trichotomy: there are only three possible rates of universal learning. More precisely, we show that the learning curves of any given concept class decay at either an exponential, linear, or arbitrarily slow rate. Moreover, each of these cases is completely characterized by appropriate combinatorial parameters, and we exhibit optimal learning algorithms that achieve the best possible rate in each case. 
For concreteness, we consider in this paper only the realizable case, though analogous results are expected to extend to more general learning scenarios. @InProceedings{STOC21p532, author = {Olivier Bousquet and Steve Hanneke and Shay Moran and Ramon van Handel and Amir Yehudayoff}, title = {A Theory of Universal Learning}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {532--541}, doi = {10.1145/3406325.3451087}, year = {2021}, } 

Harchol-Balter, Mor 
STOC '21: "Load Balancing Guardrails: ..."
Load Balancing Guardrails: Keeping Your Heavy Traffic on the Road to Low Response Times (Invited Paper)
Isaac Grosof, Ziv Scully, and Mor Harchol-Balter (Carnegie Mellon University, USA) This talk is about scheduling and load balancing in a multi-server system, with the goal of minimizing mean response time in a general stochastic setting. We will specifically concentrate on the common case of a load balancing system, where a front-end load balancer (a.k.a. dispatcher) dispatches requests to multiple back-end servers, each with its own queue. Much is known about load balancing in the case where the scheduling at the servers is First-Come-First-Served (FCFS). However, to minimize mean response time, we need to use Shortest-Remaining-Processing-Time (SRPT) scheduling at the servers. Unfortunately, there is almost nothing known about optimal dispatching when SRPT scheduling is used at the servers. To make things worse, it turns out that the traditional dispatching policies that are used in practice with FCFS servers often have poor performance in systems with SRPT servers. In this talk, we devise a simple fix that can be applied to any dispatching policy. This fix, called "guardrails", ensures that the dispatching policy yields optimal mean response time under heavy traffic when used in a system with SRPT servers. Any dispatching policy, when augmented with guardrails, becomes heavy-traffic optimal. Our results also yield the first analytical bounds on mean response time for load balancing systems with SRPT scheduling at the servers. Load balancing and scheduling are highly studied both in the stochastic and the worst-case scheduling communities. One aim of this talk is to contrast some differences in the approaches of the two communities when tackling multi-server scheduling problems. 
@InProceedings{STOC21p10, author = {Isaac Grosof and Ziv Scully and Mor Harchol-Balter}, title = {Load Balancing Guardrails: Keeping Your Heavy Traffic on the Road to Low Response Times (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {10--10}, doi = {10.1145/3406325.3465359}, year = {2021}, } 

Harms, Nathaniel 
STOC '21: "VC Dimension and Distribution-Free ..."
VC Dimension and Distribution-Free Sample-Based Testing
Eric Blais, Renato Ferreira Pinto Jr., and Nathaniel Harms (University of Waterloo, Canada; Google, Canada) We consider the problem of determining which classes of functions can be tested more efficiently than they can be learned, in the distribution-free sample-based model that corresponds to the standard PAC learning setting. Our main result shows that while VC dimension by itself does not always provide tight bounds on the number of samples required to test a class of functions in this model, it can be combined with a closely related variant that we call "lower VC" (or LVC) dimension to obtain strong lower bounds on this sample complexity. We use this result to obtain strong and in many cases nearly optimal bounds on the sample complexity for testing unions of intervals, halfspaces, intersections of halfspaces, polynomial threshold functions, and decision trees. Conversely, we show that two natural classes of functions, juntas and monotone functions, can be tested with a number of samples that is polynomially smaller than the number of samples required for PAC learning. Finally, we also use the connection between VC dimension and property testing to establish new lower bounds for testing radius clusterability and testing feasibility of linear constraint systems. @InProceedings{STOC21p504, author = {Eric Blais and Renato Ferreira Pinto Jr. and Nathaniel Harms}, title = {VC Dimension and Distribution-Free Sample-Based Testing}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {504--517}, doi = {10.1145/3406325.3451104}, year = {2021}, } 
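To make the combinatorial parameter concrete, here is a brute-force computation of VC dimension on a finite domain (a sketch with illustrative names; the paper's LVC variant is not implemented here):

```python
from itertools import combinations

def vc_dimension(domain, concepts):
    """Brute-force VC dimension: the largest d for which some d-point
    subset of the domain is shattered, i.e. every +/- labeling of it
    is realized by a concept (each concept = its set of positives)."""
    def shattered(pts):
        patterns = {tuple(p in c for p in pts) for c in concepts}
        return len(patterns) == 2 ** len(pts)
    best = 0
    for d in range(1, len(domain) + 1):
        if any(shattered(s) for s in combinations(domain, d)):
            best = d
        else:
            break  # shattering is monotone: no larger set can shatter
    return best

# Threshold concepts {x <= t} on {0, 1, 2, 3} have VC dimension 1:
# any single point is shattered, but no pair is (the class is nested).
domain = [0, 1, 2, 3]
thresholds = [frozenset(x for x in domain if x <= t) for t in [-1, 0, 1, 2, 3]]
print(vc_dimension(domain, thresholds))  # 1
```

The early `break` is sound because subsets of shattered sets are shattered, so failure at size d rules out all larger sizes.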

Harsha, Prahladh 
STOC '21: "Decoding Multivariate Multiplicity ..."
Decoding Multivariate Multiplicity Codes on Product Sets
Siddharth Bhandari, Prahladh Harsha, Mrinal Kumar, and Madhu Sudan (Tata Institute of Fundamental Research, India; IIT Bombay, India; Harvard University, USA) The multiplicity Schwartz-Zippel lemma bounds the total multiplicity of zeroes of a multivariate polynomial on a product set. This lemma motivates the multiplicity codes of Kopparty, Saraf and Yekhanin [J. ACM, 2014], who showed how to use this lemma to construct high-rate locally decodable codes. However, the algorithmic results about these codes crucially rely on the fact that the polynomials are evaluated on a vector space and not an arbitrary product set. In this work, we show how to decode multivariate multiplicity codes of large multiplicities in polynomial time over finite product sets (over fields of large characteristic and zero characteristic). Previously such decoding algorithms were not known even for a positive fraction of errors. In contrast, our work goes all the way to the distance of the code and in particular exceeds both the unique decoding bound and the Johnson radius. For errors exceeding the Johnson radius, even combinatorial list-decodability of these codes was not known. Our algorithm is an application of the classical polynomial method directly to the multivariate setting. In particular, we do not rely on a reduction from the multivariate to the univariate case, as is typical of many of the existing results on decoding codes based on multivariate polynomials. However, a vanilla application of the polynomial method in the multivariate setting does not yield a polynomial upper bound on the list size. We obtain a polynomial bound on the list size by taking an alternative view of multivariate multiplicity codes. In this view, we glue all the partial derivatives of the same order together using a fresh set of variables. We then apply the polynomial method by viewing this as a problem over a field of rational functions. 
@InProceedings{STOC21p1489, author = {Siddharth Bhandari and Prahladh Harsha and Mrinal Kumar and Madhu Sudan}, title = {Decoding Multivariate Multiplicity Codes on Product Sets}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1489--1501}, doi = {10.1145/3406325.3451027}, year = {2021}, } 
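A multiplicity code stores, at each evaluation point, a polynomial together with its first few formal derivatives. A minimal univariate encoding sketch over a prime field (illustrative names; the paper's multivariate product-set setting is analogous but heavier):

```python
def poly_eval(coeffs, x, p):
    """Horner evaluation of a polynomial (coefficients low-degree
    first) at x, modulo the prime p."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def derivative(coeffs, p):
    """Formal derivative mod p (well-behaved when p exceeds the degree)."""
    return [(i * c) % p for i, c in enumerate(coeffs)][1:]

def multiplicity_encode(coeffs, points, s, p):
    """Multiplicity-code encoding: at each evaluation point output the
    tuple (f, f', ..., f^(s-1)) of the first s derivatives, mod p."""
    rows, f = [], list(coeffs)
    for _ in range(s):
        rows.append([poly_eval(f, a, p) for a in points])
        f = derivative(f, p)
    return list(zip(*rows))  # one s-tuple per evaluation point

# f(x) = 1 + 2x + 3x^2 over F_101, multiplicity s = 2, points 0..3.
code = multiplicity_encode([1, 2, 3], [0, 1, 2, 3], 2, 101)
print(code)  # [(1, 2), (6, 8), (17, 14), (34, 20)]
```

Reading off s values per point is what buys the higher rate; the decoding results above are about inverting this map on arbitrary product sets of points.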

Hartline, Jason D. 
STOC '21: "Revelation Gap for Pricing ..."
Revelation Gap for Pricing from Samples
Yiding Feng, Jason D. Hartline, and Yingkai Li (Northwestern University, USA) This paper considers prior-independent mechanism design, in which a single mechanism is designed to achieve approximately optimal performance on every prior distribution from a given class. Most results in this literature focus on mechanisms with truth-telling equilibria, a.k.a. truthful mechanisms. Feng and Hartline [FOCS 2018] introduce the revelation gap to quantify the loss of the restriction to truthful mechanisms. We solve a main open question left in Feng and Hartline [FOCS 2018]; namely, we identify a non-trivial revelation gap for revenue maximization. Our analysis focuses on the canonical problem of selling a single item to a single agent with only access to a single sample from the agent's valuation distribution. We identify the sample-bid mechanism (a simple non-truthful mechanism) and upper-bound its prior-independent approximation ratio by 1.835 (resp. 1.296) for regular (resp. MHR) distributions. We further prove that no truthful mechanism can achieve a prior-independent approximation ratio better than 1.957 (resp. 1.543) for regular (resp. MHR) distributions. Thus, a non-trivial revelation gap is shown, as the sample-bid mechanism outperforms the optimal prior-independent truthful mechanism. On the hardness side, we prove that no (possibly non-truthful) mechanism can achieve a prior-independent approximation ratio better than 1.073, even for uniform distributions. @InProceedings{STOC21p1438, author = {Yiding Feng and Jason D. Hartline and Yingkai Li}, title = {Revelation Gap for Pricing from Samples}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1438--1451}, doi = {10.1145/3406325.3451057}, year = {2021}, } 
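As background on why a single sample is useful for pricing (this illustrates the classic single-sample posted-price idea of Dhangwatnotai, Roughgarden, and Yan, not the paper's sample-bid mechanism): posting the observed sample as the price guarantees at least half the optimal revenue for regular distributions. A deterministic numerical check for the Exp(1) distribution, with illustrative names:

```python
import math

def single_sample_revenue(step=1e-4, upper=40.0):
    """Expected revenue of posting the observed sample s as the price
    against a fresh buyer value, both ~ Exp(1): the midpoint-rule
    integral of s * e^{-s} (sale probability) * e^{-s} (sample density)."""
    total, s = 0.0, step / 2.0
    while s < upper:
        total += s * math.exp(-2.0 * s) * step
        s += step
    return total

sample_rev = single_sample_revenue()   # analytically 1/4
optimal_rev = 1.0 / math.e             # max_p p * e^{-p}, attained at p = 1
print(sample_rev / optimal_rev)        # about 0.68, above the 1/2 guarantee
```

Exp(1) is regular, so the computed ratio being comfortably above 1/2 is consistent with the general guarantee; the paper's ratios quantify how much better non-truthful mechanisms can do.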

Hastings, Matthew B. 
STOC '21: "(Sub)Exponential Advantage ..."
(Sub)Exponential Advantage of Adiabatic Quantum Computation with No Sign Problem
András Gilyén, Matthew B. Hastings, and Umesh Vazirani (California Institute of Technology, USA; Microsoft Quantum, USA; Microsoft Research, USA; University of California at Berkeley, USA) We demonstrate the possibility of (sub)exponential quantum speedup via a quantum algorithm that follows an adiabatic path of a gapped Hamiltonian with no sign problem. The Hamiltonian that exhibits this speedup comes from the adjacency matrix of an undirected graph whose vertices are labeled by n-bit strings, and we can view the adiabatic evolution as an efficient O(poly(n))-time quantum algorithm for finding a specific "EXIT" vertex in the graph given the "ENTRANCE" vertex. On the other hand, we show that if the graph is given via an adjacency-list oracle, there is no classical algorithm that finds the "EXIT" with probability greater than exp(−n^{δ}) using at most exp(n^{δ}) queries for δ = 1/5 − o(1). Our construction of the graph is somewhat similar to the "welded-trees" construction of Childs et al., but uses additional ideas of Hastings for achieving a spectral gap and a short adiabatic path. @InProceedings{STOC21p1357, author = {András Gilyén and Matthew B. Hastings and Umesh Vazirani}, title = {(Sub)Exponential Advantage of Adiabatic Quantum Computation with No Sign Problem}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1357--1369}, doi = {10.1145/3406325.3451060}, year = {2021}, } STOC '21: "Fiber Bundle Codes: Breaking ..." Fiber Bundle Codes: Breaking the N^{1/2} polylog(N) Barrier for Quantum LDPC Codes Matthew B. Hastings, Jeongwan Haah, and Ryan O'Donnell (Station Q, USA; Microsoft Quantum, USA; Carnegie Mellon University, USA) We present a quantum LDPC code family that has distance Ω(N^{3/5}/polylog(N)) and Θ(N^{3/5}) logical qubits, where N is the code length. This is the first quantum LDPC code construction that achieves distance greater than N^{1/2} polylog(N). 
The construction is based on generalizing the homological product of codes to a fiber bundle. @InProceedings{STOC21p1276, author = {Matthew B. Hastings and Jeongwan Haah and Ryan O'Donnell}, title = {Fiber Bundle Codes: Breaking the <i>N</i><sup>1/2</sup> polylog(<i>N</i>) Barrier for Quantum LDPC Codes}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1276--1288}, doi = {10.1145/3406325.3451005}, year = {2021}, } 

Hazan, Elad 
STOC '21: "Boosting Simple Learners ..."
Boosting Simple Learners
Noga Alon, Alon Gonen, Elad Hazan, and Shay Moran (Princeton University, USA; Tel Aviv University, Israel; OrCam, Israel; Google AI, USA; Technion, Israel; Google Research, Israel) Boosting is a celebrated machine learning approach which is based on the idea of combining weak and moderately inaccurate hypotheses into a strong and accurate one. We study boosting under the assumption that the weak hypotheses belong to a class of bounded capacity. This assumption is inspired by the common convention that weak hypotheses are "rules of thumb" from an "easy-to-learn class". (Schapire and Freund '12, Shalev-Shwartz and Ben-David '14.) Formally, we assume the class of weak hypotheses has a bounded VC dimension. We focus on two main questions: (i) Oracle Complexity: How many weak hypotheses are needed in order to produce an accurate hypothesis? We design a novel boosting algorithm and demonstrate that it circumvents a classical lower bound by Freund and Schapire ('95, '12). Whereas the lower bound shows that Ω(1/γ^2) weak hypotheses with γ-margin are sometimes necessary, our new method requires only Õ(1/γ) weak hypotheses, provided that they belong to a class of bounded VC dimension. Unlike previous boosting algorithms, which aggregate the weak hypotheses by majority votes, the new boosting algorithm uses more complex ("deeper") aggregation rules. We complement this result by showing that complex aggregation rules are in fact necessary to circumvent the aforementioned lower bound. (ii) Expressivity: Which tasks can be learned by boosting weak hypotheses from a bounded VC class? Can complex concepts that are "far away" from the class be learned? Towards answering the first question we identify a combinatorial-geometric parameter which captures the expressivity of base classes in boosting. As a corollary we provide an affirmative answer to the second question for many well-studied classes, including halfspaces and decision stumps. 
Along the way, we establish and exploit connections with Discrepancy Theory. @InProceedings{STOC21p481, author = {Noga Alon and Alon Gonen and Elad Hazan and Shay Moran}, title = {Boosting Simple Learners}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {481--489}, doi = {10.1145/3406325.3451030}, year = {2021}, } 

Hązła, Jan 
STOC '21: "On Codes Decoding a Constant ..."
On Codes Decoding a Constant Fraction of Errors on the BSC
Jan Hązła, Alex Samorodnitsky, and Ori Sberlo (EPFL, Switzerland; Hebrew University of Jerusalem, Israel; Tel Aviv University, Israel) We strengthen the results from a recent work by the second author, achieving bounds on the weight distribution of binary linear codes that are successful under block-MAP (as well as bit-MAP) decoding on the BEC. We conclude that a linear code that is successful on the BEC can also decode over a range of binary memoryless symmetric (BMS) channels. In particular, applying the result of Kudekar, Kumar, Mondelli, Pfister, Şaşoğlu and Urbanke from STOC 2016, we prove that a Reed–Muller code of positive rate R decodes errors on the BSC_p with high probability if p < 1/2 − √(2^{−R}(1−2^{−R})). @InProceedings{STOC21p1479, author = {Jan Hązła and Alex Samorodnitsky and Ori Sberlo}, title = {On Codes Decoding a Constant Fraction of Errors on the BSC}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1479--1488}, doi = {10.1145/3406325.3451015}, year = {2021}, } 
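The quoted bound is easy to evaluate numerically; a small sketch (illustrative function name) computing the guaranteed BSC crossover probability as a function of the rate:

```python
import math

def rm_bsc_threshold(rate):
    """Crossover probability below which, by the quoted bound, a
    Reed-Muller code of rate R decodes BSC_p errors with high
    probability: p < 1/2 - sqrt(2^{-R} * (1 - 2^{-R}))."""
    q = 2.0 ** (-rate)
    return 0.5 - math.sqrt(q * (1.0 - q))

# The guaranteed threshold approaches 1/2 as R -> 0 and shrinks
# toward 0 as R -> 1.
for R in (0.1, 0.5, 0.9):
    print(R, rm_bsc_threshold(R))
```

For example, at rate R = 0.1 the bound guarantees decoding up to roughly a quarter of the positions flipped.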

He, Kun 
STOC '21: "Sampling Constraint Satisfaction ..."
Sampling Constraint Satisfaction Solutions in the Local Lemma Regime
Weiming Feng, Kun He, and Yitong Yin (Nanjing University, China; Institute of Computing Technology at Chinese Academy of Sciences, China; University of Chinese Academy of Sciences, China) We give a Markov chain based algorithm for sampling almost uniform solutions of constraint satisfaction problems (CSPs). Assuming a canonical setting for the Lovász local lemma, where each constraint is violated by a small number of forbidden local configurations, our sampling algorithm is accurate in a local lemma regime, and the running time is a fixed polynomial whose dependency on n is close to linear, where n is the number of variables. Our main approach is a new technique called state compression, which generalizes the "mark/unmark" paradigm of Moitra, and can give fast local-lemma-based sampling algorithms. As concrete applications of our technique, we give the current best almost-uniform samplers for hypergraph colorings and for CNF solutions. @InProceedings{STOC21p1565, author = {Weiming Feng and Kun He and Yitong Yin}, title = {Sampling Constraint Satisfaction Solutions in the Local Lemma Regime}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1565--1578}, doi = {10.1145/3406325.3451101}, year = {2021}, } 
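For orientation, the simplest constructive algorithm in the local lemma regime is Moser-Tardos resampling, which finds (rather than uniformly samples) a CSP solution; the paper's sampler is substantially more involved. A minimal CNF sketch with illustrative names:

```python
import random

def moser_tardos(n_vars, clauses, seed=0, max_steps=10**6):
    """Moser-Tardos resampling: start from a uniformly random
    assignment and, while some clause is violated, resample just
    that clause's variables. Clauses are lists of nonzero ints
    whose sign gives the literal's polarity (variables 1..n)."""
    rng = random.Random(seed)
    assign = [rng.random() < 0.5 for _ in range(n_vars + 1)]  # index 0 unused
    def violated(cl):
        return all(assign[abs(l)] != (l > 0) for l in cl)
    for _ in range(max_steps):
        bad = next((cl for cl in clauses if violated(cl)), None)
        if bad is None:
            return assign[1:]
        for l in bad:  # resample only the violated clause's variables
            assign[abs(l)] = rng.random() < 0.5
    raise RuntimeError("step budget exceeded")

# A small satisfiable 3-CNF instance on variables 1..3.
clauses = [[1, 2, 3], [-1, 2, -3], [1, -2, 3], [-1, -2, -3]]
print(moser_tardos(3, clauses))
```

Under local lemma conditions the expected number of resampling steps is linear in the number of clauses; turning this into an almost-uniform sampler is exactly the harder problem the abstract addresses.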

Hershkowitz, D. Ellis 
STOC '21: "Tree Embeddings for Hop-Constrained ..."
Tree Embeddings for Hop-Constrained Network Design
Bernhard Haeupler, D. Ellis Hershkowitz, and Goran Zuzic (Carnegie Mellon University, USA; ETH Zurich, Switzerland) Network design problems aim to compute low-cost structures such as routes, trees and subgraphs. Often, it is natural and desirable to require that these structures have small hop length or hop diameter. Unfortunately, optimization problems with hop constraints are much harder and less well understood than their hop-unconstrained counterparts. A significant algorithmic barrier in this setting is the fact that hop-constrained distances in graphs are very far from being a metric. We show that, nonetheless, hop-constrained distances can be approximated by distributions over "partial tree metrics." We build this result into a powerful and versatile algorithmic tool which, similarly to classic probabilistic tree embeddings, reduces hop-constrained problems in general graphs to hop-unconstrained problems on trees. We then use this tool to give the first polylogarithmic bicriteria approximations for the hop-constrained variants of many classic network design problems. These include Steiner forest, group Steiner tree, group Steiner forest, buy-at-bulk network design, as well as online and oblivious versions of many of these problems. @InProceedings{STOC21p356, author = {Bernhard Haeupler and D. Ellis Hershkowitz and Goran Zuzic}, title = {Tree Embeddings for Hop-Constrained Network Design}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {356--369}, doi = {10.1145/3406325.3451053}, year = {2021}, } 

Hirahara, Shuichi 
STOC '21: "Average-Case Hardness of NP ..."
Average-Case Hardness of NP from Exponential Worst-Case Hardness Assumptions
Shuichi Hirahara (National Institute of Informatics, Japan) A long-standing and central open question in the theory of average-case complexity is to base average-case hardness of NP on worst-case hardness of NP. A frontier question along this line is to prove that PH is hard on average if UP requires (sub)exponential worst-case complexity. The difficulty of resolving this question has been discussed from various perspectives based on technical barrier results, such as the limits of black-box reductions and the nonexistence of worst-case hardness amplification procedures in PH. In this paper, we overcome these barriers and resolve the open question by presenting the following main results: 1. UP ⊈ DTIME(2^{O(n / log n)}) implies DistNP ⊈ AvgP. 2. PH ⊈ DTIME(2^{O(n / log n)}) implies DistPH ⊈ AvgP. 3. NP ⊈ DTIME(2^{O(n / log n)}) implies DistNP ⊈ Avg_{P}P. Here, Avg_{P}P denotes P-computable average-case polynomial time, which interpolates between average-case polynomial time and worst-case polynomial time. We complement this result by showing that DistPH ⊈ AvgP if and only if DistPH ⊈ Avg_{P}P. At the core of all of our results is a new notion of universal heuristic scheme, whose running time is P-computable average-case polynomial time under every polynomial-time samplable distribution. Our proofs are based on the meta-complexity of time-bounded Kolmogorov complexity: we analyze average-case complexity through the lens of worst-case meta-complexity, using a new “algorithmic” proof of language compression and weak symmetry of information for time-bounded Kolmogorov complexity. @InProceedings{STOC21p292, author = {Shuichi Hirahara}, title = {Average-Case Hardness of NP from Exponential Worst-Case Hardness Assumptions}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {292--302}, doi = {10.1145/3406325.3451065}, year = {2021}, } Publisher's Version 

Hollender, Alexandros 
STOC '21: "The Complexity of Gradient ..."
The Complexity of Gradient Descent: CLS = PPAD ∩ PLS
John Fearnley, Paul W. Goldberg, Alexandros Hollender, and Rahul Savani (University of Liverpool, UK; University of Oxford, UK) We study search problems that can be solved by performing Gradient Descent on a bounded convex polytopal domain and show that this class is equal to the intersection of two well-known classes: PPAD and PLS. As our main underlying technical contribution, we show that computing a Karush-Kuhn-Tucker (KKT) point of a continuously differentiable function over the domain [0,1]^{2} is PPAD ∩ PLS-complete. This is the first natural problem to be shown complete for this class. Our results also imply that the class CLS (Continuous Local Search), which was defined by Daskalakis and Papadimitriou as a more “natural” counterpart to PPAD ∩ PLS and contains many interesting problems, is itself equal to PPAD ∩ PLS. @InProceedings{STOC21p46, author = {John Fearnley and Paul W. Goldberg and Alexandros Hollender and Rahul Savani}, title = {The Complexity of Gradient Descent: CLS = PPAD ∩ PLS}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {46--59}, doi = {10.1145/3406325.3451052}, year = {2021}, } Publisher's Version 
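The central object, a KKT point of a function on the box, can be illustrated with a toy sketch (the quadratic objective, step size, and stopping rule are our own choices, not the paper's construction): projected gradient descent halts exactly when it reaches an approximate KKT point, where the gradient is near zero or points out of the feasible box.

```python
def projected_gd(grad, x0, eta=0.1, tol=1e-8, max_iter=100_000):
    """Projected gradient descent on [0,1]^2; stops when the iterate no longer
    moves, i.e. at an approximate KKT point (zero gradient, or a gradient
    pointing outward at the boundary)."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        nx = [min(1.0, max(0.0, x[i] - eta * g[i])) for i in range(2)]
        if all(abs(nx[i] - x[i]) <= tol for i in range(2)):
            return nx
        x = nx
    return x

# f(x, y) = (x - 0.3)^2 + (y - 1.7)^2: unconstrained minimum (0.3, 1.7) lies outside
grad_f = lambda p: [2 * (p[0] - 0.3), 2 * (p[1] - 1.7)]
x = projected_gd(grad_f, [0.9, 0.1])
print(x)   # converges to (0.3, 1.0): y is clamped at the boundary, a KKT point
```

The paper's point is the converse direction: finding such a point is as hard as anything in PPAD ∩ PLS, so this simple-looking loop cannot have a polynomial worst-case guarantee in general.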

Holmgren, Justin 
STOC '21: "Fiat–Shamir via List-Recoverable ..."
Fiat–Shamir via List-Recoverable Codes (or: Parallel Repetition of GMW Is Not Zero-Knowledge)
Justin Holmgren, Alex Lombardi, and Ron D. Rothblum (NTT Research, USA; Massachusetts Institute of Technology, USA; Technion, Israel) In a seminal work, Goldreich, Micali and Wigderson (CRYPTO ’86) demonstrated the wide applicability of zero-knowledge proofs by constructing such a proof system for the NP-complete problem of graph 3-coloring. A long-standing open question has been whether parallel repetition of their protocol preserves zero knowledge. In this work, we answer this question in the negative, assuming a standard cryptographic assumption (i.e., the hardness of learning with errors (LWE)). Leveraging a connection observed by Dwork, Naor, Reingold, and Stockmeyer (FOCS ’99), our negative result is obtained by making positive progress on a related fundamental problem in cryptography: securely instantiating the Fiat-Shamir heuristic for eliminating interaction in public-coin interactive protocols. A recent line of work has shown how to instantiate the heuristic securely, albeit only for a limited class of protocols. Our main result shows how to instantiate Fiat-Shamir for parallel repetitions of much more general interactive proofs. In particular, we construct hash functions that, assuming LWE, securely realize the Fiat-Shamir transform for the following rich classes of protocols: 1) The parallel repetition of any “commit-and-open” protocol (such as the GMW protocol mentioned above), when a specific (natural) commitment scheme is used. Commit-and-open protocols are a ubiquitous paradigm for constructing general-purpose public-coin zero-knowledge proofs. 2) The parallel repetition of any base protocol that (1) satisfies a stronger notion of soundness called round-by-round soundness, and (2) has an efficient procedure, using a suitable trapdoor, for recognizing “bad verifier randomness” that would allow the prover to cheat. Our results are obtained by establishing a new connection between the Fiat-Shamir transform and list-recoverable codes. 
In contrast to the usual focus in coding theory, we focus on a parameter regime in which the input lists are extremely large, but the rate can be small. We give a (probabilistic) construction based on Parvaresh-Vardy codes (FOCS ’05) that suffices for our applications. @InProceedings{STOC21p750, author = {Justin Holmgren and Alex Lombardi and Ron D. Rothblum}, title = {Fiat–Shamir via List-Recoverable Codes (or: Parallel Repetition of GMW Is Not Zero-Knowledge)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {750--760}, doi = {10.1145/3406325.3451116}, year = {2021}, } Publisher's Version 
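The Fiat-Shamir transform at issue replaces the verifier's random challenge with a hash of the transcript. A minimal sketch of the mechanics (SHA-256 stands in heuristically for a random oracle here; the whole point of the paper is building hash families for which this step is provably sound, which SHA-256 is not known to be, and the statement/commitment strings are placeholders):

```python
import hashlib

def fiat_shamir_challenge(statement: bytes, first_message: bytes, num_bits: int = 128) -> int:
    """Derive the verifier's public-coin challenge by hashing the transcript
    so far, removing the need for interaction."""
    h = hashlib.sha256(statement + b"||" + first_message).digest()
    return int.from_bytes(h, "big") >> (256 - num_bits)

# The prover commits, derives its own challenge, then answers it; a verifier
# later recomputes the identical challenge, so no second message is ever sent.
c1 = fiat_shamir_challenge(b"graph G is 3-colorable", b"<commitments>")
c2 = fiat_shamir_challenge(b"graph G is 3-colorable", b"<commitments>")
print(c1 == c2)   # True: the challenge is a deterministic function of the transcript
```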

Hongler, Clément 
STOC '21: "Neural Tangent Kernel: Convergence ..."
Neural Tangent Kernel: Convergence and Generalization in Neural Networks (Invited Paper)
Arthur Jacot, Franck Gabriel, and Clément Hongler (EPFL, Switzerland) The Neural Tangent Kernel is a new way to understand gradient descent in deep neural networks, connecting them with kernel methods. In this talk, I'll introduce this formalism, give a number of results on the Neural Tangent Kernel, and explain how they give us insight into the dynamics of neural networks during training and into their generalization features. @InProceedings{STOC21p6, author = {Arthur Jacot and Franck Gabriel and Clément Hongler}, title = {Neural Tangent Kernel: Convergence and Generalization in Neural Networks (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {6--6}, doi = {10.1145/3406325.3465355}, year = {2021}, } Publisher's Version 

Hrubes, Pavel 
STOC '21: "Learnability Can Be Independent ..."
Learnability Can Be Independent of Set Theory (Invited Paper)
Shai Ben-David, Pavel Hrubes, Shay Moran, Amir Shpilka, and Amir Yehudayoff (University of Waterloo, Canada; Czech Academy of Sciences, Czechia; Technion, Israel; Tel Aviv University, Israel) A fundamental result in statistical learning theory is the equivalence of PAC learnability of a class with the finiteness of its Vapnik-Chervonenkis dimension. However, this clean result applies only to binary classification problems. In search of a similar combinatorial characterization of learnability in a more general setting, we discovered a surprising independence from set theory for a basic general notion of learnability. Consider the following statistical estimation problem: given a family F of real-valued random variables over some domain X and an i.i.d. sample drawn from an unknown distribution P over X, find f in F such that its expectation w.r.t. P is close to the supremum expectation over all members of F. This Expectation Maximization (EMX) problem captures many well-studied learning problems. Surprisingly, we show that the EMX learnability of some simple classes depends on the cardinality of the continuum and is therefore independent of the ZFC axioms of set theory. Our results imply that there exists no “finitary” combinatorial parameter that characterizes EMX learnability in a way similar to the VC-dimension characterization of binary classification learnability. @InProceedings{STOC21p11, author = {Shai Ben-David and Pavel Hrubes and Shay Moran and Amir Shpilka and Amir Yehudayoff}, title = {Learnability Can Be Independent of Set Theory (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {11--11}, doi = {10.1145/3406325.3465360}, year = {2021}, } Publisher's Version 

Huang, Zhiyi 
STOC '21: "Online Stochastic Matching, ..."
Online Stochastic Matching, Poisson Arrivals, and the Natural Linear Program
Zhiyi Huang and Xinkai Shu (University of Hong Kong, China) We study the online stochastic matching problem. Consider a bipartite graph with offline vertices on one side and i.i.d. online vertices on the other side. The offline vertices and the distribution of online vertices are known to the algorithm beforehand. The realization of the online vertices, however, is revealed one at a time, upon which the algorithm immediately decides how to match it. For maximizing the cardinality of the matching, we give a 0.711-competitive online algorithm, which improves the best previous ratio of 0.706. When the offline vertices are weighted, we introduce a 0.7009-competitive online algorithm for maximizing the total weight of the matched offline vertices, which improves the best previous ratio of 0.662. Conceptually, we find that the analysis of online algorithms simplifies if the online vertices follow a Poisson process, and we establish an approximate equivalence between this Poisson arrival model and online stochastic matching. Technically, we propose a natural linear program for the Poisson arrival model and demonstrate how to exploit its structure by introducing a converse of Jensen’s inequality. Moreover, we design an algorithmic amortization to replace the analytic one in previous work, and as a result obtain the first vertex-weighted online stochastic matching algorithm that improves the results in the weaker random arrival model. @InProceedings{STOC21p682, author = {Zhiyi Huang and Xinkai Shu}, title = {Online Stochastic Matching, Poisson Arrivals, and the Natural Linear Program}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {682--693}, doi = {10.1145/3406325.3451079}, year = {2021}, } Publisher's Version 
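The intuition behind the Poisson arrival model can be checked numerically: under n i.i.d. draws over m equally likely online types, the number of arrivals of any fixed type is Binomial(n, 1/m), which is close to Poisson(n/m) in total variation. A small simulation (all parameters are our own, purely illustrative):

```python
import math, random

def poisson_pmf(lam, k):
    # computed in log-space to avoid overflow for large k
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

random.seed(0)
n, m, trials = 500, 50, 5000   # n i.i.d. arrivals over m equally likely types
lam = n / m                    # a fixed type is hit Binomial(n, 1/m) times
hist = [0] * (n + 1)
for _ in range(trials):
    k = sum(1 for _ in range(n) if random.random() < 1 / m)
    hist[k] += 1

# empirical arrival counts are statistically close to Poisson(n/m)
tv = 0.5 * sum(abs(hist[k] / trials - poisson_pmf(lam, k)) for k in range(n + 1))
print(f"TV distance, Binomial(n, 1/m) vs Poisson(n/m): {tv:.3f}")
```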

Husić, Edin 
STOC '21: "Approximating Nash Social ..."
Approximating Nash Social Welfare under Rado Valuations
Jugal Garg, Edin Husić, and László A. Végh (University of Illinois at Urbana-Champaign, USA; London School of Economics and Political Science, UK) We consider the problem of approximating the maximum Nash social welfare (NSW) while allocating a set of indivisible items to n agents. The NSW, defined as the weighted geometric mean of the agents’ valuations, is a popular objective that provides a balanced tradeoff between the often conflicting requirements of fairness and efficiency. For the symmetric additive case of the problem, where agents have the same weight and additive valuations, the first constant-factor approximation algorithm was obtained in 2015. Subsequent work has obtained constant-factor approximation algorithms for the symmetric case under mild generalizations of additive valuations, and O(n)-approximation algorithms for subadditive valuations and for the asymmetric case. In this paper, we make significant progress on both the symmetric and asymmetric NSW problems. We present the first constant-factor approximation algorithm for the symmetric case under Rado valuations. Rado valuations form a general class of valuation functions that arise from maximum-cost independent matching problems, including as special cases assignment (OXS) valuations and weighted matroid rank functions. Furthermore, our approach also gives the first constant-factor approximation algorithm for the asymmetric case under Rado valuations, provided that the maximum ratio between the weights is bounded by a constant. @InProceedings{STOC21p1412, author = {Jugal Garg and Edin Husić and László A. Végh}, title = {Approximating Nash Social Welfare under Rado Valuations}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1412--1425}, doi = {10.1145/3406325.3451031}, year = {2021}, } Publisher's Version 
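For intuition, here is what the NSW objective looks like in the simplest (symmetric, additive) setting, with a brute-force search over allocations (the two-agent instance is our own toy example; the paper's algorithms of course avoid enumeration):

```python
import math
from itertools import product

def nsw(valuations, alloc, weights=None):
    """Weighted geometric mean of agents' additive utilities under an
    allocation; alloc[g] is the agent receiving item g."""
    n = len(valuations)
    w = weights or [1.0] * n
    utils = [sum(valuations[i][g] for g in range(len(alloc)) if alloc[g] == i)
             for i in range(n)]
    if any(u == 0 for u in utils):
        return 0.0
    return math.exp(sum(wi * math.log(u) for wi, u in zip(w, utils)) / sum(w))

# 2 agents, 3 items, additive valuations (rows = agents, columns = items)
V = [[8, 1, 3],
     [2, 6, 4]]
best = max(product(range(2), repeat=3), key=lambda a: nsw(V, a))
print(best, nsw(V, best))   # (0, 1, 1): agent 0 gets item 0, agent 1 gets items 1 and 2
```

Note that any allocation leaving an agent empty-handed scores 0: the geometric mean strongly penalizes imbalance, which is the fairness side of the tradeoff the abstract mentions.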

Jacot, Arthur 
STOC '21: "Neural Tangent Kernel: Convergence ..."
Neural Tangent Kernel: Convergence and Generalization in Neural Networks (Invited Paper)
Arthur Jacot, Franck Gabriel, and Clément Hongler (EPFL, Switzerland) The Neural Tangent Kernel is a new way to understand gradient descent in deep neural networks, connecting them with kernel methods. In this talk, I'll introduce this formalism, give a number of results on the Neural Tangent Kernel, and explain how they give us insight into the dynamics of neural networks during training and into their generalization features. @InProceedings{STOC21p6, author = {Arthur Jacot and Franck Gabriel and Clément Hongler}, title = {Neural Tangent Kernel: Convergence and Generalization in Neural Networks (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {6--6}, doi = {10.1145/3406325.3465355}, year = {2021}, } Publisher's Version 

Jain, Aayush 
STOC '21: "Indistinguishability Obfuscation ..."
Indistinguishability Obfuscation from Well-Founded Assumptions
Aayush Jain, Huijia Lin, and Amit Sahai (University of California at Los Angeles, USA; University of Washington, USA) Indistinguishability obfuscation, introduced by [Barak et al., CRYPTO 2001], aims to compile programs into unintelligible ones while preserving functionality. It is a fascinating and powerful object that has been shown to enable a host of new cryptographic goals and beyond. However, constructions of indistinguishability obfuscation have remained elusive, with all other proposals relying on heuristics or newly conjectured hardness assumptions. In this work, we show how to construct indistinguishability obfuscation from subexponential hardness of four well-founded assumptions. We prove: Informal Theorem: Let τ ∈ (0,∞), δ ∈ (0,1), є ∈ (0,1) be arbitrary constants. Assume subexponential security of the following assumptions: the Learning With Errors (LWE) assumption with subexponential modulus-to-noise ratio 2^{k^є} and noises of magnitude polynomial in k, where k is the dimension of the LWE secret; the Learning Parity with Noise (LPN) assumption over general prime fields ℤ_{p} with polynomially many LPN samples and error rate 1/ℓ^{δ}, where ℓ is the dimension of the LPN secret; the existence of a Boolean Pseudo-Random Generator (PRG) in NC^{0} with stretch n^{1+τ}, where n is the length of the PRG seed; the Decision Linear (DLIN) assumption on symmetric bilinear groups of prime order. Then, (subexponentially secure) indistinguishability obfuscation for all polynomial-size circuits exists. Further, assuming only polynomial security of the aforementioned assumptions, there exists collusion-resistant public-key functional encryption for all polynomial-size circuits. @InProceedings{STOC21p60, author = {Aayush Jain and Huijia Lin and Amit Sahai}, title = {Indistinguishability Obfuscation from Well-Founded Assumptions}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {60--73}, doi = {10.1145/3406325.3451093}, year = {2021}, } Publisher's Version 

Jain, Vishesh 
STOC '21: "Perfectly Sampling k ..."
Perfectly Sampling k ≥ (8/3 + o(1))Δ-Colorings in Graphs
Vishesh Jain, Ashwin Sah, and Mehtaab Sawhney (Simons Institute for the Theory of Computing, Berkeley, USA; Massachusetts Institute of Technology, USA) We present a randomized algorithm which takes as input an undirected graph G on n vertices with maximum degree Δ and a number of colors k ≥ (8/3 + o_{Δ}(1))Δ, and returns, in expected time Õ(nΔ^{2} log k), a proper k-coloring of G distributed perfectly uniformly on the set of all proper k-colorings of G. Notably, our sampler breaks the barrier at k = 3Δ encountered in recent work of Bhandari and Chakraborty [STOC 2020]. We also discuss how our methods may be modified to relax the restriction on k to k ≥ (8/3 − є_{0})Δ for an absolute constant є_{0} > 0. As in the work of Bhandari and Chakraborty, and the pioneering work of Huber [STOC 1998], our sampler is based on Coupling from the Past [Propp & Wilson, Random Struct. Algorithms, 1995] and the bounding chain method [Huber, STOC 1998; Häggström & Nelander, Scand. J. Statist., 1999]. Our innovations include a novel bounding chain routine inspired by Jerrum’s analysis of the Glauber dynamics [Random Struct. Algorithms, 1995], as well as a preconditioning routine for bounding chains which uses the algorithmic Lovász Local Lemma [Moser & Tardos, J. ACM, 2010]. @InProceedings{STOC21p1589, author = {Vishesh Jain and Ashwin Sah and Mehtaab Sawhney}, title = {Perfectly Sampling <i>k</i> ≥ (8/3 + <i>o</i>(1))Δ-Colorings in Graphs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1589--1600}, doi = {10.1145/3406325.3451012}, year = {2021}, } Publisher's Version 
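The underlying primitive, Coupling from the Past, fits in a few lines on a toy instance. The sketch below is our own minimal grand coupling on a single edge with 3 colors, not the paper's construction; it tracks every proper coloring explicitly, which is exactly the exponential cost that bounding chains exist to avoid. It returns an exactly (not approximately) uniform proper coloring:

```python
import random
from itertools import product

def proper_colorings(adj, k):
    """All proper k-colorings of a tiny graph given by adjacency lists."""
    n = len(adj)
    return [c for c in product(range(k), repeat=n)
            if all(c[u] != c[v] for u in range(n) for v in adj[u] if u < v)]

def step(state, adj, move):
    """One Metropolis update: recolor vertex v with color c if still proper."""
    v, c = move
    if all(state[u] != c for u in adj[v]):
        return state[:v] + (c,) + state[v + 1:]
    return state

def cftp_sample(adj, k, rng):
    """Coupling from the Past with the grand coupling: run every proper
    coloring forward from time -T with shared randomness; once all copies
    coalesce by time 0, the common state is an exactly uniform sample."""
    n = len(adj)
    moves = []          # moves[i] drives time -(i+1); reused as T doubles
    T = 1
    while True:
        while len(moves) < T:
            moves.append((rng.randrange(n), rng.randrange(k)))
        states = set(proper_colorings(adj, k))
        for t in range(T - 1, -1, -1):
            states = {step(s, adj, moves[t]) for s in states}
        if len(states) == 1:
            return next(iter(states))
        T *= 2          # not coalesced: restart from further in the past

edge = [[1], [0]]       # a single edge, k = 3 colors: 6 proper colorings
sample = cftp_sample(edge, 3, random.Random(1))
print(sample)
```

Tracking the full state set is only feasible on toy graphs; a bounding chain compresses it into one efficiently maintainable object, which is where the paper's innovations lie.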

Janczewski, Wojciech 
STOC '21: "Fully Dynamic Approximation ..."
Fully Dynamic Approximation of LIS in Polylogarithmic Time
Paweł Gawrychowski and Wojciech Janczewski (University of Wrocław, Poland) We revisit the problem of maintaining the longest increasing subsequence (LIS) of an array under (i) inserting an element and (ii) deleting an element of the array. In a recent breakthrough, Mitzenmacher and Seddighin [STOC 2020] designed an algorithm that maintains an O((1/є)^{O(1/є)})-approximation of LIS under both operations with worst-case update time Õ(n^{є}), for any constant є>0 (Õ hides factors polynomial in log n, where n is the length of the input). We exponentially improve on their result by designing an algorithm that maintains a (1+є)-approximation of LIS under both operations with worst-case update time Õ(є^{−5}). Instead of working with the grid packing technique introduced by Mitzenmacher and Seddighin, we take a different approach, building on a new tool that might be of independent interest: LIS sparsification. A particularly interesting consequence of our result is an improved solution for the so-called Erdős–Szekeres partitioning, in which we seek a partition of a given permutation of {1,2,…,n} into O(√n) monotone subsequences. This problem has been repeatedly stated as one of the natural examples in which we see a large gap between the decision-tree complexity and the algorithmic complexity. The result of Mitzenmacher and Seddighin implies an O(n^{1+є})-time solution for this problem, for any є>0. Our algorithm (in fact, its simpler decremental version) further improves this to Õ(n). @InProceedings{STOC21p654, author = {Paweł Gawrychowski and Wojciech Janczewski}, title = {Fully Dynamic Approximation of LIS in Polylogarithmic Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {654--667}, doi = {10.1145/3406325.3451137}, year = {2021}, } Publisher's Version 
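For reference, the static problem being dynamized here has a classic O(n log n) exact solution via patience sorting; the paper's challenge is supporting insertions and deletions without recomputing this from scratch (the snippet is the textbook static algorithm, not the paper's data structure):

```python
from bisect import bisect_left

def lis_length(a):
    """Classic O(n log n) LIS: tails[i] holds the smallest possible tail of a
    strictly increasing subsequence of length i + 1 seen so far."""
    tails = []
    for x in a:
        i = bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)
        else:
            tails[i] = x
    return len(tails)

print(lis_length([3, 1, 4, 1, 5, 9, 2, 6]))   # 4, e.g. the subsequence 1, 4, 5, 9
```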

Jansen, Bart M. P. 
STOC '21: "Vertex Deletion Parameterized ..."
Vertex Deletion Parameterized by Elimination Distance and Even Less
Bart M. P. Jansen, Jari J. H. de Kroon, and Michał Włodarczyk (Eindhoven University of Technology, Netherlands) We study the parameterized complexity of various classic vertex-deletion problems such as Odd cycle transversal, Vertex planarization, and Chordal vertex deletion under hybrid parameterizations. Existing FPT algorithms for these problems either focus on the parameterization by solution size, detecting solutions of size k in time f(k) · n^{O(1)}, or on width parameterizations, finding arbitrarily large optimal solutions in time f(w) · n^{O(1)} for some width measure w like treewidth. We unify these lines of research by presenting FPT algorithms for parameterizations that can simultaneously be arbitrarily much smaller than the solution size and the treewidth. The first class of parameterizations is based on the notion of elimination distance of the input graph to the target graph class, which intuitively measures the number of rounds needed to obtain a graph in that class by removing one vertex from each connected component in each round. The second class of parameterizations consists of a relaxation of the notion of treewidth, allowing arbitrarily large bags that induce subgraphs belonging to the target class of the deletion problem, as long as these subgraphs have small neighborhoods. Both kinds of parameterizations have been introduced recently and have already spawned several independent results. Our contribution is twofold. First, we present a framework for computing approximately optimal decompositions related to these graph measures. Namely, if the cost of an optimal decomposition is k, we show how to find a decomposition of cost k^{O(1)} in time f(k) · n^{O(1)}. This is applicable to any graph class for which we can solve the so-called separation problem. Secondly, we exploit the constructed decompositions for solving vertex-deletion problems by extending ideas from algorithms using iterative compression and the finite-state property. 
For the three mentioned vertex-deletion problems, and for all problems which can be formulated as hitting a finite set of connected forbidden (a) minors or (b) (induced) subgraphs, we obtain FPT algorithms with respect to both studied parameterizations. For example, we present an algorithm running in time n^{O(1)} + 2^{k^{O(1)}}·(n+m) and polynomial space for Odd cycle transversal parameterized by the elimination distance k to the class of bipartite graphs. @InProceedings{STOC21p1757, author = {Bart M. P. Jansen and Jari J. H. de Kroon and Michał Włodarczyk}, title = {Vertex Deletion Parameterized by Elimination Distance and Even Less}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1757--1769}, doi = {10.1145/3406325.3451068}, year = {2021}, } Publisher's Version 

Jawale, Ruta 
STOC '21: "SNARGs for Bounded Depth Computations ..."
SNARGs for Bounded Depth Computations and PPAD Hardness from Sub-exponential LWE
Ruta Jawale, Yael Tauman Kalai, Dakshita Khurana, and Rachel Zhang (University of Illinois at Urbana-Champaign, USA; Microsoft Research, USA; Massachusetts Institute of Technology, USA) We construct a succinct non-interactive publicly verifiable delegation scheme for any log-space uniform circuit under the sub-exponential Learning With Errors (LWE) assumption. For a circuit C:{0,1}^{N}→{0,1} of size S and depth D, the prover runs in time poly(S), the communication complexity is D · polylog(S), and the verifier runs in time (D+N) · polylog(S). To obtain this result, we introduce a new cryptographic primitive: a lossy correlation-intractable hash function family. We use this primitive to soundly instantiate the Fiat-Shamir transform for a large class of interactive proofs, including the interactive sumcheck protocol and the GKR protocol, assuming the sub-exponential hardness of LWE. Additionally, by relying on the result of Choudhuri et al. (STOC 2019), we establish (sub-exponential) average-case hardness of PPAD, assuming the sub-exponential hardness of LWE. @InProceedings{STOC21p708, author = {Ruta Jawale and Yael Tauman Kalai and Dakshita Khurana and Rachel Zhang}, title = {SNARGs for Bounded Depth Computations and PPAD Hardness from Sub-exponential LWE}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {708--721}, doi = {10.1145/3406325.3451055}, year = {2021}, } Publisher's Version 

Jayaram, Rajesh 
STOC '21: "A Polynomial-Time Approximation ..."
A Polynomial-Time Approximation Algorithm for Counting Words Accepted by an NFA (Invited Paper)
Marcelo Arenas, Luis Alberto Croquevielle, Rajesh Jayaram, and Cristian Riveros (PUC, Chile; IMFD, Chile; Carnegie Mellon University, USA) Counting the number of words of a certain length accepted by a non-deterministic finite automaton (NFA) is a fundamental problem, which has many applications in different areas such as graph databases, knowledge compilation, and information extraction. Along with this, generating such words uniformly at random is also a relevant problem, particularly in scenarios where returning varied outputs is a desirable feature. The previous problems are formalized as follows. The input of #NFA is an NFA N and a length k given in unary (that is, given as a string 0^{k}), and the task is to compute the number of strings of length k accepted by N. The input of GEN-NFA is the same as for #NFA, but now the task is to generate uniformly at random a string of length k accepted by N. It is known that #NFA is #P-complete, so an efficient algorithm to compute this function exactly is not expected to exist. However, this does not preclude the existence of an efficient approximation algorithm for it. In this talk, we will show that #NFA admits a fully polynomial-time randomized approximation scheme (FPRAS). Prior to our work, it was open whether #NFA admits an FPRAS; in fact, the best randomized approximation scheme known for #NFA ran in time n^{O(log n)}. Besides, we will mention some consequences and applications of our results. In particular, from well-known results on counting and uniform generation, we obtain that GEN-NFA admits a fully polynomial-time almost uniform generator. Moreover, as #NFA is SpanL-complete under polynomial-time parsimonious reductions, we obtain that every function in the complexity class SpanL admits an FPRAS. 
@InProceedings{STOC21p4, author = {Marcelo Arenas and Luis Alberto Croquevielle and Rajesh Jayaram and Cristian Riveros}, title = {A Polynomial-Time Approximation Algorithm for Counting Words Accepted by an NFA (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {4--4}, doi = {10.1145/3406325.3465353}, year = {2021}, } Publisher's Version STOC '21: "When Is Approximate Counting ..." When Is Approximate Counting for Conjunctive Queries Tractable? Marcelo Arenas, Luis Alberto Croquevielle, Rajesh Jayaram, and Cristian Riveros (PUC, Chile; IMFD, Chile; Carnegie Mellon University, USA) Conjunctive queries are one of the most common classes of queries used in database systems, and the best studied in the literature. A seminal result of Grohe, Schwentick, and Segoufin (STOC 2001) demonstrates that for every class G of graphs, the evaluation of all conjunctive queries whose underlying graph is in G is tractable if, and only if, G has bounded treewidth. In this work, we extend this characterization to the counting problem for conjunctive queries. Specifically, for every class C of conjunctive queries with bounded treewidth, we introduce the first fully polynomial-time randomized approximation scheme (FPRAS) for counting answers to a query in C, and the first polynomial-time algorithm for sampling answers uniformly from a query in C. As a corollary, it follows that for every class G of graphs, the counting problem for conjunctive queries whose underlying graph is in G admits an FPRAS if, and only if, G has bounded treewidth (unless BPP is different from P). In fact, our FPRAS is more general, and also applies to conjunctive queries with bounded hypertree width, as well as unions of such queries. The key ingredient in our proof is the resolution of a fundamental counting problem from automata theory. 
Specifically, we demonstrate the first FPRAS and polynomial-time sampler for the set of trees of size n accepted by a tree automaton, which improves on the prior quasi-polynomial-time randomized approximation scheme (QPRAS) and sampling algorithm of Gore, Jerrum, Kannan, Sweedyk, and Mahaney ’97. We demonstrate how this algorithm can be used to obtain an FPRAS for many open problems, such as counting solutions to constraint satisfaction problems (CSPs) with bounded hypertree width, counting the number of error threads in programs with nested call subroutines, and counting valid assignments to structured DNNF circuits. @InProceedings{STOC21p1015, author = {Marcelo Arenas and Luis Alberto Croquevielle and Rajesh Jayaram and Cristian Riveros}, title = {When Is Approximate Counting for Conjunctive Queries Tractable?}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1015--1027}, doi = {10.1145/3406325.3451014}, year = {2021}, } Publisher's Version 
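To see what #NFA asks, here is an exact counter for small automata (our own sketch, not the FPRAS): group strings by the *set* of NFA states they can reach, i.e. determinize on the fly, and count per group. The number of groups can be exponential in the number of NFA states, consistent with #P-completeness; avoiding that blow-up while only approximating is the contribution above.

```python
from collections import defaultdict

def count_accepted(nfa, k):
    """Exact number of length-k strings accepted by an NFA.
    nfa = (states, alphabet, delta, start, accepting), with delta mapping
    (state, symbol) -> tuple of successor states."""
    states, alphabet, delta, start, accepting = nfa
    counts = {frozenset([start]): 1}   # reachable-state set -> #strings
    for _ in range(k):
        nxt = defaultdict(int)
        for S, c in counts.items():
            for a in alphabet:
                T = frozenset(q for s in S for q in delta.get((s, a), ()))
                nxt[T] += c
        counts = dict(nxt)
    return sum(c for S, c in counts.items() if S & accepting)

# Toy NFA over {0,1}: accepts strings whose second-to-last symbol is 1
delta = {(0, '0'): (0,), (0, '1'): (0, 1), (1, '0'): (2,), (1, '1'): (2,)}
nfa = ({0, 1, 2}, ['0', '1'], delta, 0, {2})
print(count_accepted(nfa, 3))   # 4: the length-3 strings with middle symbol 1
```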

Jeronimo, Fernando Granha 
STOC '21: "Near-Linear Time Decoding ..."
Near-Linear Time Decoding of Ta-Shma’s Codes via Splittable Regularity
Fernando Granha Jeronimo, Shashank Srivastava, and Madhur Tulsiani (University of Chicago, USA; Toyota Technological Institute at Chicago, USA) The Gilbert–Varshamov bound non-constructively establishes the existence of binary codes of distance 1/2−є/2 and rate Ω(є^{2}). In a breakthrough result, Ta-Shma [STOC 2017] constructed the first explicit family of nearly optimal binary codes with distance 1/2−є/2 and rate Ω(є^{2+α}), where α → 0 as є → 0. Moreover, the codes in Ta-Shma’s construction are є-balanced, where the distance between distinct codewords is not only bounded from below by 1/2−є/2, but also from above by 1/2+є/2. Polynomial-time decoding algorithms for (a slight modification of) Ta-Shma’s codes appeared in [FOCS 2020], and were based on the Sum-of-Squares (SoS) semidefinite programming hierarchy. The running times for these algorithms were of the form N^{O_{α}(1)} for unique decoding, and N^{O_{є,α}(1)} for the setting of “gentle list decoding”, with large exponents of N even when α is a fixed constant. We derive new algorithms for both these tasks, running in time Õ_{є}(N). Our algorithms also apply to the general setting of decoding direct-sum codes. Our algorithms follow from new structural and algorithmic results for collections of k-tuples (ordered hypergraphs) possessing a “structured expansion” property, which we call splittability. This property was previously identified and used in the analysis of SoS-based decoding and constraint satisfaction algorithms, and is also known to be satisfied by Ta-Shma’s code construction. We obtain a new weak regularity decomposition for (possibly sparse) splittable collections W ⊆ [n]^{k}, similar to the regularity decomposition for dense structures by Frieze and Kannan [FOCS 1996]. These decompositions are also computable in near-linear time Õ(|W|), and form a key component of our algorithmic results. 
@InProceedings{STOC21p1527, author = {Fernando Granha Jeronimo and Shashank Srivastava and Madhur Tulsiani}, title = {Near-Linear Time Decoding of Ta-Shma’s Codes via Splittable Regularity}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1527--1536}, doi = {10.1145/3406325.3451126}, year = {2021}, } Publisher's Version 

Jia, He 
STOC '21: "Reducing Isotropy and Volume ..."
Reducing Isotropy and Volume to KLS: An O*(n^{3}ψ^{2}) Volume Algorithm
He Jia, Aditi Laddha, Yin Tat Lee, and Santosh Vempala (Georgia Institute of Technology, USA; University of Washington, USA; Microsoft Research, USA) We show that the volume of a convex body in R^{n} in the general membership oracle model can be computed to within relative error ε using O(n^{3}ψ^{2}/ε^{2}) oracle queries, where ψ is the KLS constant. With the current bound of ψ=O(n^{o(1)}), this gives an O(n^{3+o(1)}/ε^{2}) algorithm, the first improvement on the Lovász-Vempala O(n^{4}/ε^{2}) algorithm from 2003. The main new ingredient is an O(n^{3}ψ^{2}) algorithm for isotropic transformation, following which we can apply the O(n^{3}/ε^{2}) volume algorithm of Cousins and Vempala for well-rounded convex bodies. A positive resolution of the KLS conjecture would imply an O(n^{3}/ε^{2}) volume algorithm. We also give an efficient implementation of the new algorithm for convex polytopes defined by m inequalities in R^{n}: polytope volume can be estimated in time O(mn^{c}/ε^{2}), where c<3.2 depends on the current matrix multiplication exponent and improves on the previous best bound. @InProceedings{STOC21p961, author = {He Jia and Aditi Laddha and Yin Tat Lee and Santosh Vempala}, title = {Reducing Isotropy and Volume to KLS: An <i>O</i>*(<i>n</i><sup>3</sup><i>ψ</i><sup>2</sup>) Volume Algorithm}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {961--974}, doi = {10.1145/3406325.3451018}, year = {2021}, } Publisher's Version 
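As a baseline for why polynomially many oracle queries is remarkable: the naive membership-oracle estimator below (our own illustration, unrelated to the paper's techniques) works in low dimension but needs exponentially many samples as n grows, because the body occupies a vanishing fraction of any bounding cube.

```python
import random

def naive_volume(membership_oracle, n, radius, samples=200_000, seed=0):
    """Rejection sampling from the bounding cube [-radius, radius]^n: the
    simplest membership-oracle volume estimator. Its sample complexity blows
    up exponentially in n, which the sophisticated algorithms above avoid."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        x = [rng.uniform(-radius, radius) for _ in range(n)]
        if membership_oracle(x):
            hits += 1
    return (2 * radius) ** n * hits / samples

ball = lambda x: sum(t * t for t in x) <= 1.0   # membership oracle: unit ball in R^3
est = naive_volume(ball, 3, 1.0)
print(est)   # close to 4*pi/3, about 4.19
```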

Jiang, Shunhua 
STOC '21: "A Faster Algorithm for Solving ..."
A Faster Algorithm for Solving General LPs
Shunhua Jiang, Zhao Song, Omri Weinstein, and Hengjie Zhang (Columbia University, USA; Institute for Advanced Study at Princeton, USA) The fastest known LP solver for general (dense) linear programs is due to [Cohen, Lee and Song’19] and runs in O^{*}(n^{ω} + n^{2.5−α/2} + n^{2+1/6}) time. A number of follow-up works [Lee, Song and Zhang’19, Brand’20, Song and Yu’20] obtain the same complexity through different techniques, but none of them can go below n^{2+1/6}, even if ω=2. This leaves a polynomial gap between the cost of solving linear systems (n^{ω}) and the cost of solving linear programs, and as such, improving the n^{2+1/6} term is crucial toward establishing an equivalence between these two fundamental problems. In this paper, we reduce the running time to O^{*}(n^{ω} + n^{2.5−α/2} + n^{2+1/18}), where ω and α are the fast matrix multiplication exponent and its dual. Hence, under the common belief that ω ≈ 2 and α ≈ 1, our LP solver runs in O^{*}(n^{2.055}) time instead of O^{*}(n^{2.16}). @InProceedings{STOC21p823, author = {Shunhua Jiang and Zhao Song and Omri Weinstein and Hengjie Zhang}, title = {A Faster Algorithm for Solving General LPs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {823--832}, doi = {10.1145/3406325.3451058}, year = {2021}, } Publisher's Version Info 
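The improvement is visible from exponent arithmetic alone; a small sketch (function name ours) evaluating the three terms of the running time under the belief ω ≈ 2, α ≈ 1:

```python
def lp_exponent(omega, alpha, correction):
    """Exponent of the running time O*(n^omega + n^(2.5 - alpha/2) + n^(2 + correction))."""
    return max(omega, 2.5 - alpha / 2, 2 + correction)

# Under the common belief omega ~ 2 and alpha ~ 1:
old = lp_exponent(2, 1, 1 / 6)   # previous bound: exponent 2 + 1/6
new = lp_exponent(2, 1, 1 / 18)  # this paper: exponent 2 + 1/18
print(old, new)
```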

Jung, Christopher 
STOC '21: "A New Analysis of Differential ..."
A New Analysis of Differential Privacy’s Generalization Guarantees (Invited Paper)
Christopher Jung, Katrina Ligett, Seth Neel, Aaron Roth, Saeed Sharifi-Malvajerdi, and Moshe Shenfeld (University of Pennsylvania, USA; Hebrew University of Jerusalem, Israel) We give a new proof of the "transfer theorem" underlying adaptive data analysis: that any mechanism for answering adaptively chosen statistical queries that is differentially private and sample-accurate is also accurate out-of-sample. Our new proof is elementary and gives structural insights that we expect will be useful elsewhere. We show: 1) that differential privacy ensures that the expectation of any query on the conditional distribution on datasets induced by the transcript of the interaction is close to its true value on the data distribution, and 2) sample accuracy on its own ensures that any query answer produced by the mechanism is close to its conditional expectation with high probability. This second claim follows from a thought experiment in which we imagine that the dataset is resampled from the conditional distribution after the mechanism has committed to its answers. The transfer theorem then follows by summing these two bounds. An upshot of our new proof technique is that the concrete bounds we obtain are substantially better than the best previously known bounds. @InProceedings{STOC21p9, author = {Christopher Jung and Katrina Ligett and Seth Neel and Aaron Roth and Saeed Sharifi-Malvajerdi and Moshe Shenfeld}, title = {A New Analysis of Differential Privacy’s Generalization Guarantees (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {9--9}, doi = {10.1145/3406325.3465358}, year = {2021}, } Publisher's Version 

Kalai, Yael Tauman 
STOC '21: "SNARGs for Bounded Depth Computations ..."
SNARGs for Bounded Depth Computations and PPAD Hardness from Sub-Exponential LWE
Ruta Jawale, Yael Tauman Kalai, Dakshita Khurana, and Rachel Zhang (University of Illinois at Urbana-Champaign, USA; Microsoft Research, USA; Massachusetts Institute of Technology, USA) We construct a succinct non-interactive publicly-verifiable delegation scheme for any log-space uniform circuit under the sub-exponential Learning With Errors (LWE) assumption. For a circuit C:{0,1}^{N}→{0,1} of size S and depth D, the prover runs in time poly(S), the communication complexity is D · polylog(S), and the verifier runs in time (D+N) · polylog(S). To obtain this result, we introduce a new cryptographic primitive: a lossy correlation-intractable hash function family. We use this primitive to soundly instantiate the Fiat-Shamir transform for a large class of interactive proofs, including the interactive sumcheck protocol and the GKR protocol, assuming the sub-exponential hardness of LWE. Additionally, by relying on the result of Choudhuri et al. (STOC 2019), we establish (sub-exponential) average-case hardness of PPAD, assuming the sub-exponential hardness of LWE. @InProceedings{STOC21p708, author = {Ruta Jawale and Yael Tauman Kalai and Dakshita Khurana and Rachel Zhang}, title = {SNARGs for Bounded Depth Computations and PPAD Hardness from Sub-Exponential LWE}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {708--721}, doi = {10.1145/3406325.3451055}, year = {2021}, } Publisher's Version 

Kandiros, Anthimos Vardis 
STOC '21: "Learning Ising Models from ..."
Learning Ising Models from One or Multiple Samples
Yuval Dagan, Constantinos Daskalakis, Nishanth Dikkala, and Anthimos Vardis Kandiros (Massachusetts Institute of Technology, USA; Google, USA) There have been two main lines of work on estimating Ising models: (1) estimating them from multiple independent samples under minimal assumptions about the model's interaction matrix; and (2) estimating them from one sample in restrictive settings. We propose a unified framework that smoothly interpolates between these two settings, enabling significantly richer estimation guarantees from one, a few, or many samples. Our main theorem provides guarantees for one-sample estimation, quantifying the estimation error in terms of the metric entropy of a family of interaction matrices. As corollaries of our main theorem, we derive bounds when the model's interaction matrix is a (sparse) linear combination of known matrices, or it belongs to a finite set, or to a high-dimensional manifold. In fact, our main result handles multiple independent samples by viewing them as one sample from a larger model, and can be used to derive estimation bounds that are qualitatively similar to those obtained in the aforedescribed multiple-sample literature. Our technical approach benefits from sparsifying a model's interaction network, conditioning on subsets of variables that make the dependencies in the resulting conditional distribution sufficiently weak. We use this sparsification technique to prove strong concentration and anti-concentration results for the Ising model, which we believe have applications beyond the scope of this paper. @InProceedings{STOC21p161, author = {Yuval Dagan and Constantinos Daskalakis and Nishanth Dikkala and Anthimos Vardis Kandiros}, title = {Learning Ising Models from One or Multiple Samples}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {161--168}, doi = {10.1145/3406325.3451074}, year = {2021}, } Publisher's Version 

Kane, Daniel M. 
STOC '21: "Optimal Testing of Discrete ..."
Optimal Testing of Discrete Distributions with High Probability
Ilias Diakonikolas, Themis Gouleakis, Daniel M. Kane, John Peebles, and Eric Price (University of Wisconsin-Madison, USA; MPI-INF, Germany; University of California at San Diego, USA; Princeton University, USA; University of Texas at Austin, USA) We study the problem of testing discrete distributions with a focus on the high probability regime. Specifically, given samples from one or more discrete distributions, a property P, and parameters 0 < є, δ < 1, we want to distinguish with probability at least 1−δ whether these distributions satisfy P or are є-far from P in total variation distance. Most prior work in distribution testing studied the constant confidence case (corresponding to δ = Ω(1)), and provided sample-optimal testers for a range of properties. While one can always boost the confidence probability of any such tester by black-box amplification, this generic boosting method typically leads to suboptimal sample bounds. Here we study the following broad question: For a given property P, can we characterize the sample complexity of testing P as a function of all relevant problem parameters, including the error probability δ? Prior to this work, uniformity testing was the only statistical task whose sample complexity had been characterized in this setting. As our main results, we provide the first algorithms for closeness and independence testing that are sample-optimal, within constant factors, as a function of all relevant parameters. We also show matching information-theoretic lower bounds on the sample complexity of these problems. Our techniques naturally extend to give optimal testers for related problems. To illustrate the generality of our methods, we give optimal algorithms for testing collections of distributions and testing closeness with unequal sized samples. @InProceedings{STOC21p542, author = {Ilias Diakonikolas and Themis Gouleakis and Daniel M. Kane and John Peebles and Eric Price}, title = {Optimal Testing of Discrete Distributions with High Probability}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {542--555}, doi = {10.1145/3406325.3450997}, year = {2021}, } Publisher's Version 
STOC '21: "Efficiently Learning Halfspaces ..."
Efficiently Learning Halfspaces with Tsybakov Noise
Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Christos Tzamos, and Nikos Zarifis (University of Wisconsin-Madison, USA; University of California at San Diego, USA) We study the problem of PAC learning homogeneous halfspaces with Tsybakov noise. In the Tsybakov noise model, the label of every example is independently flipped with an adversarially controlled probability that can be arbitrarily close to 1/2 for a fraction of the examples. We give the first polynomial-time algorithm for this fundamental learning problem. Our algorithm learns the true halfspace within any desired accuracy and succeeds under a broad family of well-behaved distributions including log-concave distributions. This extended abstract is a merge of two papers. In an earlier work, a subset of the authors developed an efficient reduction from learning to certifying the non-optimality of a candidate halfspace and gave a quasipolynomial time certificate algorithm. In a subsequent work, the authors of this paper developed a polynomial-time certificate algorithm. @InProceedings{STOC21p88, author = {Ilias Diakonikolas and Daniel M. Kane and Vasilis Kontonis and Christos Tzamos and Nikos Zarifis}, title = {Efficiently Learning Halfspaces with Tsybakov Noise}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {88--101}, doi = {10.1145/3406325.3450998}, year = {2021}, } Publisher's Version 
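To see why generic black-box amplification costs an extra log(1/δ) factor, one can compute exactly the success probability of a majority vote over independent runs of a constant-confidence tester (an illustrative sketch of ours, not the paper's testers):

```python
from math import comb

def majority_success(p, k):
    """Probability that the majority of k independent runs of a tester with
    per-run success probability p is correct (k odd)."""
    return sum(comb(k, i) * p ** i * (1 - p) ** (k - i)
               for i in range(k // 2 + 1, k + 1))

# A constant-confidence tester (delta = 1/3) amplified by majority vote:
# k = O(log 1/delta) repetitions reach confidence 1 - delta, multiplying the
# sample complexity by log(1/delta), which sample-optimal testers avoid.
for k in [1, 9, 25, 49]:
    print(k, majority_success(2 / 3, k))
```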

Kapralov, Michael 
STOC '21: "Towards Tight Bounds for Spectral ..."
Towards Tight Bounds for Spectral Sparsification of Hypergraphs
Michael Kapralov, Robert Krauthgamer, Jakab Tardos, and Yuichi Yoshida (EPFL, Switzerland; Weizmann Institute of Science, Israel; National Institute of Informatics, Japan) Cut and spectral sparsification of graphs have numerous applications, including, e.g., speeding up algorithms for cuts and Laplacian solvers. These powerful notions have recently been extended to hypergraphs, which are much richer and may offer new applications. However, the current bounds on the size of hypergraph sparsifiers are not as tight as the corresponding bounds for graphs. Our first result is a polynomial-time algorithm that, given a hypergraph on n vertices with maximum hyperedge size r, outputs an є-spectral sparsifier with O^{*}(nr) hyperedges, where O^{*} suppresses (є^{−1} log n)^{O(1)} factors. This size bound improves the two previous bounds: O^{*}(n^{3}) [Soma and Yoshida, SODA’19] and O^{*}(nr^{3}) [Bansal, Svensson and Trevisan, FOCS’19]. Our main technical tool is a new method for proving concentration of the nonlinear analogue of the quadratic form of the Laplacians for hypergraph expanders. We complement this with lower bounds on the bit complexity of any compression scheme that (1+є)-approximates all the cuts in a given hypergraph, and hence also on the bit complexity of every є-cut/spectral sparsifier. These lower bounds are based on Ruzsa–Szemerédi graphs, and a particular instantiation yields an Ω(nr) lower bound on the bit complexity even for fixed constant є. In the case of hypergraph cut sparsifiers, this is tight up to polylogarithmic factors in n, due to a recent result of [Chen, Khanna and Nagda, FOCS’20]. For spectral sparsifiers it narrows the gap to O^{*}(r). Finally, for directed hypergraphs, we present an algorithm that computes an є-spectral sparsifier with O^{*}(n^{2}r^{3}) hyperarcs, where r is the maximum size of a hyperarc. 
For small r, this improves over the O^{*}(n^{3}) bound known from [Soma and Yoshida, SODA’19], and comes close to the trivial lower bound of Ω(n^{2}) hyperarcs. @InProceedings{STOC21p598, author = {Michael Kapralov and Robert Krauthgamer and Jakab Tardos and Yuichi Yoshida}, title = {Towards Tight Bounds for Spectral Sparsification of Hypergraphs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {598--611}, doi = {10.1145/3406325.3451061}, year = {2021}, } Publisher's Version 
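The nonlinear analogue of the Laplacian quadratic form mentioned above can be made concrete on a toy example (ours, not from the paper): Q(x) = ∑_e w_e max_{u,v ∈ e}(x_u − x_v)², which on 0/1 indicator vectors recovers hypergraph cut weights, so a spectral sparsifier preserving Q is in particular a cut sparsifier.

```python
def hypergraph_quadratic_form(hyperedges, weights, x):
    """Nonlinear analogue of the graph Laplacian quadratic form:
    Q(x) = sum_e w_e * max_{u,v in e} (x_u - x_v)^2.
    For ordinary graphs (r = 2) this is the usual sum of w_e (x_u - x_v)^2."""
    return sum(w * max((x[u] - x[v]) ** 2 for u in e for v in e)
               for e, w in zip(hyperedges, weights))

# On the 0/1 indicator vector of a vertex set S, Q(x) equals the total
# weight of hyperedges cut by S (those with a vertex on each side).
edges = [(0, 1, 2), (2, 3), (1, 3, 4)]
w = [1.0, 2.0, 3.0]
S = {0, 1}
x = [1.0 if v in S else 0.0 for v in range(5)]
cut = sum(wi for e, wi in zip(edges, w)
          if any(v in S for v in e) and any(v not in S for v in e))
print(hypergraph_quadratic_form(edges, w, x), cut)
```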

Karlin, Anna R. 
STOC '21: "A (Slightly) Improved Approximation ..."
A (Slightly) Improved Approximation Algorithm for Metric TSP
Anna R. Karlin, Nathan Klein, and Shayan Oveis Gharan (University of Washington, USA) For some ε > 10^{−36} we give a randomized (3/2−ε)-approximation algorithm for metric TSP. @InProceedings{STOC21p32, author = {Anna R. Karlin and Nathan Klein and Shayan Oveis Gharan}, title = {A (Slightly) Improved Approximation Algorithm for Metric TSP}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {32--45}, doi = {10.1145/3406325.3451009}, year = {2021}, } Publisher's Version 

Kaufman, Tali 
STOC '21: "New Cosystolic Expanders from ..."
New Cosystolic Expanders from Tensors Imply Explicit Quantum LDPC Codes with Ω(√n log^{k} n) Distance
Tali Kaufman and Ran J. Tessler (Bar-Ilan University, Israel; Weizmann Institute of Science, Israel) In this work we introduce a new notion of expansion in higher dimensions that is stronger than the well-studied cosystolic expansion notion, termed collective cosystolic expansion. We show that tensoring two cosystolic expanders yields a new cosystolic expander, assuming one of the complexes in the product is not only a cosystolic expander, but a collective cosystolic expander. We then show that the well-known bounded degree cosystolic expanders, the Ramanujan complexes, are in fact collective cosystolic expanders. This enables us to construct new bounded degree cosystolic expanders by tensoring Ramanujan complexes. Using our newly constructed bounded degree cosystolic expanders we construct explicit quantum LDPC codes of distance √n log^{k} n for any k, improving a recent result of Evra et al. [FOCS 2020], and setting a new record for the distance of explicit quantum LDPC codes. The work of Evra et al. [FOCS 2020] took advantage of the high dimensional expansion notion known as cosystolic expansion, which occurs in Ramanujan complexes. Our improvement is achieved by considering tensor products of Ramanujan complexes, and using their newly derived property, collective cosystolic expansion. @InProceedings{STOC21p1317, author = {Tali Kaufman and Ran J. Tessler}, title = {New Cosystolic Expanders from Tensors Imply Explicit Quantum LDPC Codes with Ω(√<i>n</i> log<sup><i>k</i></sup> <i>n</i>) Distance}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1317--1329}, doi = {10.1145/3406325.3451029}, year = {2021}, } Publisher's Version 

Keller, Nathan 
STOC '21: "Local Concentration Inequalities ..."
Local Concentration Inequalities and Tomaszewski’s Conjecture
Nathan Keller and Ohad Klein (Bar-Ilan University, Israel) We prove Tomaszewski’s conjecture (1986): Let f:{−1,1}^{n} → ℝ be of the form f(x) = ∑_{i=1}^{n} a_{i} x_{i}. Then Pr[|f(x)| ≤ √Var[f]] ≥ 1/2. Our main novel tools are local concentration inequalities and an improved Berry-Esseen inequality for first-degree functions on the discrete cube. These tools are of independent interest, and may be useful in the study of linear threshold functions and of low degree Boolean functions. @InProceedings{STOC21p1656, author = {Nathan Keller and Ohad Klein}, title = {Local Concentration Inequalities and Tomaszewski’s Conjecture}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1656--1669}, doi = {10.1145/3406325.3451011}, year = {2021}, } Publisher's Version 
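For small n the inequality can be checked exactly by enumerating all 2^n sign vectors; a sketch (ours, not from the paper) that also exhibits the extremal case a = (1, 1), where the probability is exactly 1/2:

```python
from itertools import product
from math import sqrt

def tomaszewski_probability(a):
    """Exact Pr over x in {-1,1}^n that |sum a_i x_i| <= sqrt(Var f),
    where Var f = sum a_i^2. The conjecture (now theorem) says >= 1/2."""
    sigma = sqrt(sum(ai * ai for ai in a))
    good = sum(1 for x in product((-1, 1), repeat=len(a))
               if abs(sum(ai * xi for ai, xi in zip(a, x))) <= sigma)
    return good / 2 ** len(a)

# The bound 1/2 is attained, e.g., by a = (1, 1): |x_1 + x_2| <= sqrt(2)
# holds for exactly half of the four sign patterns.
for a in [(1,), (1, 1), (1, 1, 1), (3, 1, 1, 1), (1, 2, 3, 4, 5)]:
    print(a, tomaszewski_probability(a))
```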

Kelley, Zander 
STOC '21: "An Improved Derandomization ..."
An Improved Derandomization of the Switching Lemma
Zander Kelley (University of Illinois at Urbana-Champaign, USA) We prove a new derandomization of Håstad’s switching lemma, showing how to efficiently generate restrictions satisfying the switching lemma for DNF or CNF formulas of size m using only O(log m) random bits. Derandomizations of the switching lemma have been useful in many works as a key building block for constructing objects which are in some way provably pseudorandom with respect to AC^{0} circuits. Here, we use our new derandomization to give an improved analysis of the pseudorandom generator of Trevisan and Xue for AC^{0} circuits (CCC’13): we show that the generator ε-fools size-m, depth-D circuits with n-bit inputs using only O(log(m/ε)^{D} · log n) random bits. In particular, we obtain (modulo the log log factors hidden in the O-notation) a dependence on m/ε which is best possible with respect to currently known AC^{0} circuit lower bounds. @InProceedings{STOC21p272, author = {Zander Kelley}, title = {An Improved Derandomization of the Switching Lemma}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {272--282}, doi = {10.1145/3406325.3451054}, year = {2021}, } Publisher's Version 
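A random restriction in the sense of the switching lemma is easy to sketch (toy code, ours; the paper's point is generating such restrictions from only O(log m) random bits rather than fully at random):

```python
import random

def apply_restriction(dnf, p, rng):
    """Apply a random restriction: each variable stays free with probability p,
    otherwise it is fixed to a uniform bit. Returns the simplified DNF, or
    True/False if the formula collapses to a constant.
    A term is a dict {variable: required_bit}."""
    rho = {}
    for v in {v for term in dnf for v in term}:
        if rng.random() >= p:
            rho[v] = rng.randrange(2)
    new_dnf = []
    for term in dnf:
        if any(rho[v] != b for v, b in term.items() if v in rho):
            continue  # some literal falsified: term killed by the restriction
        live = {v: b for v, b in term.items() if v not in rho}
        if not live:
            return True  # some term fully satisfied: formula is constant 1
        new_dnf.append(live)
    return new_dnf if new_dnf else False

# Fraction of restrictions under which a width-3 DNF collapses to a constant;
# the switching lemma bounds the decision-tree depth of the survivors.
rng = random.Random(1)
dnf = [{0: 1, 1: 1, 2: 0}, {1: 0, 3: 1, 4: 1}, {0: 0, 2: 1, 5: 1}]
collapsed = sum(1 for _ in range(2000)
                if not isinstance(apply_restriction(dnf, 0.2, rng), list))
print(collapsed / 2000)
```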

Khot, Subhash 
STOC '21: "Optimal Inapproximability ..."
Optimal Inapproximability of Satisfiable k-LIN over Non-Abelian Groups
Amey Bhangale and Subhash Khot (University of California at Riverside, USA; New York University, USA) A seminal result of Håstad (2001) shows that it is NP-hard to find an assignment that satisfies a 1/|G|+ε fraction of the constraints of a given k-LIN instance over an abelian group G, even if there is an assignment that satisfies a (1−ε) fraction of the constraints, for any constant ε>0. Engebretsen, Holmerin and Russell (2004) later showed that the same hardness result holds for k-LIN instances over any finite non-abelian group. Unlike the abelian case, where we can efficiently find a solution if the instance is satisfiable, in the non-abelian case it is NP-complete to decide if a given system of linear equations is satisfiable or not, as shown by Goldmann and Russell (1999). Surprisingly, for certain non-abelian groups G, given a satisfiable k-LIN instance over G, one can in fact do better than just outputting a random assignment, using a simple but clever algorithm. The approximation factor achieved by this algorithm varies with the underlying group. In this paper, we show that this algorithm is optimal by proving a tight hardness of approximation for satisfiable k-LIN instances over any non-abelian G, assuming P≠NP. As a corollary, we also get 3-query probabilistically checkable proofs with perfect completeness over large alphabets with improved soundness. Our proof crucially uses the quasirandom properties of the non-abelian groups defined by Gowers (2008). @InProceedings{STOC21p1615, author = {Amey Bhangale and Subhash Khot}, title = {Optimal Inapproximability of Satisfiable k-LIN over Non-Abelian Groups}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1615--1628}, doi = {10.1145/3406325.3451003}, year = {2021}, } Publisher's Version 
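The 1/|G| baseline is simply the fact that a uniformly random assignment satisfies a single equation x·y = c with probability exactly 1/|G|; a quick check (ours) over the smallest non-abelian group, S3:

```python
from itertools import permutations, product

def compose(p, q):
    """Composition (p after q) of permutations given as tuples."""
    return tuple(p[q[i]] for i in range(len(q)))

S3 = list(permutations(range(3)))  # the symmetric group on 3 elements

# For the equation x * y = c over a group G, each choice of x leaves exactly
# one satisfying y = x^{-1} c, so a uniformly random assignment satisfies
# the equation with probability exactly 1/|G| = 1/6 here.
c = S3[3]
satisfied = sum(1 for x, y in product(S3, repeat=2) if compose(x, y) == c)
print(satisfied / len(S3) ** 2)
```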

Khurana, Dakshita 
STOC '21: "SNARGs for Bounded Depth Computations ..."
SNARGs for Bounded Depth Computations and PPAD Hardness from Sub-Exponential LWE
Ruta Jawale, Yael Tauman Kalai, Dakshita Khurana, and Rachel Zhang (University of Illinois at Urbana-Champaign, USA; Microsoft Research, USA; Massachusetts Institute of Technology, USA) We construct a succinct non-interactive publicly-verifiable delegation scheme for any log-space uniform circuit under the sub-exponential Learning With Errors (LWE) assumption. For a circuit C:{0,1}^{N}→{0,1} of size S and depth D, the prover runs in time poly(S), the communication complexity is D · polylog(S), and the verifier runs in time (D+N) · polylog(S). To obtain this result, we introduce a new cryptographic primitive: a lossy correlation-intractable hash function family. We use this primitive to soundly instantiate the Fiat-Shamir transform for a large class of interactive proofs, including the interactive sumcheck protocol and the GKR protocol, assuming the sub-exponential hardness of LWE. Additionally, by relying on the result of Choudhuri et al. (STOC 2019), we establish (sub-exponential) average-case hardness of PPAD, assuming the sub-exponential hardness of LWE. @InProceedings{STOC21p708, author = {Ruta Jawale and Yael Tauman Kalai and Dakshita Khurana and Rachel Zhang}, title = {SNARGs for Bounded Depth Computations and PPAD Hardness from Sub-Exponential LWE}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {708--721}, doi = {10.1145/3406325.3451055}, year = {2021}, } Publisher's Version 

Kim, Isaac H. 
STOC '21: "The Ghost in the Radiation: ..."
The Ghost in the Radiation: Robust Encodings of the Black Hole Interior (Invited Paper)
Isaac H. Kim, Eugene Tang, and John Preskill (University of Sydney, Australia; California Institute of Technology, USA) We reconsider the black hole firewall puzzle, emphasizing that quantum error correction, computational complexity, and pseudorandomness are crucial concepts for understanding the black hole interior. We assume that the Hawking radiation emitted by an old black hole is pseudorandom, meaning that it cannot be distinguished from a perfectly thermal state by any efficient quantum computation acting on the radiation alone. We then infer the existence of a subspace of the radiation system which we interpret as an encoding of the black hole interior. This encoded interior is entangled with the late outgoing Hawking quanta emitted by the old black hole, and is inaccessible to computationally bounded observers who are outside the black hole. Specifically, efficient operations acting on the radiation, those with quantum computational complexity polynomial in the entropy of the remaining black hole, commute with a complete set of logical operators acting on the encoded interior, up to corrections which are exponentially small in the entropy. Thus, under our pseudorandomness assumption, the black hole interior is well protected from exterior observers as long as the remaining black hole is macroscopic. On the other hand, if the radiation is not pseudorandom, an exterior observer may be able to create a firewall by applying a polynomial-time quantum computation to the radiation. @InProceedings{STOC21p8, author = {Isaac H. Kim and Eugene Tang and John Preskill}, title = {The Ghost in the Radiation: Robust Encodings of the Black Hole Interior (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {8--8}, doi = {10.1145/3406325.3465357}, year = {2021}, } Publisher's Version 

Kim, Michael P. 
STOC '21: "Outcome Indistinguishability ..."
Outcome Indistinguishability
Cynthia Dwork, Michael P. Kim, Omer Reingold, Guy N. Rothblum, and Gal Yona (Harvard University, USA; University of California at Berkeley, USA; Stanford University, USA; Weizmann Institute of Science, Israel) Prediction algorithms assign numbers to individuals that are popularly understood as individual “probabilities”—what is the probability of 5-year survival after cancer diagnosis?—and which increasingly form the basis for life-altering decisions. Drawing on an understanding of computational indistinguishability developed in complexity theory and cryptography, we introduce Outcome Indistinguishability. Predictors that are Outcome Indistinguishable (OI) yield a generative model for outcomes that cannot be efficiently refuted on the basis of the real-life observations produced by the predictor. We investigate a hierarchy of OI definitions, whose stringency increases with the degree to which distinguishers may access the predictor in question. Our findings reveal that OI behaves qualitatively differently than previously studied notions of indistinguishability. First, we provide constructions at all levels of the hierarchy. Then, leveraging recently-developed machinery for proving average-case fine-grained hardness, we obtain lower bounds on the complexity of the more stringent forms of OI. This hardness result provides the first scientific grounds for the political argument that, when inspecting algorithmic risk prediction instruments, auditors should be granted oracle access to the algorithm, not simply historical predictions. @InProceedings{STOC21p1095, author = {Cynthia Dwork and Michael P. Kim and Omer Reingold and Guy N. Rothblum and Gal Yona}, title = {Outcome Indistinguishability}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1095--1108}, doi = {10.1145/3406325.3451064}, year = {2021}, } Publisher's Version 

Kingsford, Carl 
STOC '21: "How Much Data Is Sufficient ..."
How Much Data Is Sufficient to Learn High-Performing Algorithms? Generalization Guarantees for Data-Driven Algorithm Design
Maria-Florina Balcan, Dan DeBlasio, Travis Dick, Carl Kingsford, Tuomas Sandholm, and Ellen Vitercik (Carnegie Mellon University, USA; University of Texas at El Paso, USA; University of Pennsylvania, USA) Algorithms often have tunable parameters that impact performance metrics such as runtime and solution quality. For many algorithms used in practice, no parameter settings admit meaningful worst-case bounds, so the parameters are made available for the user to tune. Alternatively, parameters may be tuned implicitly within the proof of a worst-case guarantee. Worst-case instances, however, may be rare or nonexistent in practice. A growing body of research has demonstrated that data-driven algorithm design can lead to significant improvements in performance. This approach uses a training set of problem instances sampled from an unknown, application-specific distribution and returns a parameter setting with strong average performance on the training set. We provide a broadly applicable theory for deriving generalization guarantees that bound the difference between the algorithm’s average performance over the training set and its expected performance on the unknown distribution. Our results apply no matter how the parameters are tuned, be it via an automated or manual approach. The challenge is that for many types of algorithms, performance is a volatile function of the parameters: slightly perturbing the parameters can cause a large change in behavior. Prior research (e.g., Gupta and Roughgarden, SICOMP’17; Balcan et al., COLT’17, ICML’18, EC’18) has proved generalization bounds by employing case-by-case analyses of greedy algorithms, clustering algorithms, integer programming algorithms, and selling mechanisms. We uncover a unifying structure which we use to prove extremely general guarantees, yet we recover the bounds from prior research. Our guarantees, which are tight up to logarithmic factors in the worst case, apply whenever an algorithm’s performance is a piecewise-constant, linear, or—more generally—piecewise-structured function of its parameters. Our theory also implies novel bounds for voting mechanisms and dynamic programming algorithms from computational biology. @InProceedings{STOC21p919, author = {Maria-Florina Balcan and Dan DeBlasio and Travis Dick and Carl Kingsford and Tuomas Sandholm and Ellen Vitercik}, title = {How Much Data Is Sufficient to Learn High-Performing Algorithms? Generalization Guarantees for Data-Driven Algorithm Design}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {919--932}, doi = {10.1145/3406325.3451036}, year = {2021}, } Publisher's Version 

Klein, Nathan 
STOC '21: "A (Slightly) Improved Approximation ..."
A (Slightly) Improved Approximation Algorithm for Metric TSP
Anna R. Karlin, Nathan Klein, and Shayan Oveis Gharan (University of Washington, USA) For some ε > 10^{−36} we give a randomized (3/2−ε)-approximation algorithm for metric TSP. @InProceedings{STOC21p32, author = {Anna R. Karlin and Nathan Klein and Shayan Oveis Gharan}, title = {A (Slightly) Improved Approximation Algorithm for Metric TSP}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {32--45}, doi = {10.1145/3406325.3451009}, year = {2021}, } Publisher's Version 

Klein, Ohad 
STOC '21: "Local Concentration Inequalities ..."
Local Concentration Inequalities and Tomaszewski’s Conjecture
Nathan Keller and Ohad Klein (Bar-Ilan University, Israel) We prove Tomaszewski’s conjecture (1986): Let f:{−1,1}^{n} → ℝ be of the form f(x) = ∑_{i=1}^{n} a_{i} x_{i}. Then Pr[|f(x)| ≤ √Var[f]] ≥ 1/2. Our main novel tools are local concentration inequalities and an improved Berry-Esseen inequality for first-degree functions on the discrete cube. These tools are of independent interest, and may be useful in the study of linear threshold functions and of low degree Boolean functions. @InProceedings{STOC21p1656, author = {Nathan Keller and Ohad Klein}, title = {Local Concentration Inequalities and Tomaszewski’s Conjecture}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1656--1669}, doi = {10.1145/3406325.3451011}, year = {2021}, } Publisher's Version 

Klein, Philip N. 
STOC '21: "A Quasipolynomial (2 + ε)-Approximation ..."
A Quasipolynomial (2 + ε)-Approximation for Planar Sparsest Cut
Vincent Cohen-Addad, Anupam Gupta, Philip N. Klein, and Jason Li (Google, Switzerland; Carnegie Mellon University, USA; Brown University, USA) The (non-uniform) sparsest cut problem is the following graph-partitioning problem: given a “supply” graph, and demands on pairs of vertices, delete some subset of supply edges to minimize the ratio of the supply edges cut to the total demand of the pairs separated by this deletion. Despite much effort, there are only a handful of nontrivial classes of supply graphs for which constant-factor approximations are known. We consider the problem for planar graphs, and give a (2+ε)-approximation algorithm that runs in quasipolynomial time. Our approach defines a new structural decomposition of an optimal solution using a “patching” primitive. We combine this decomposition with a Sherali-Adams-style linear programming relaxation of the problem, which we then round. This should be compared with the polynomial-time approximation algorithm of Rao (1999), which uses the metric linear programming relaxation and ℓ_{1}-embeddings, and achieves an O(√log n)-approximation in polynomial time. @InProceedings{STOC21p1056, author = {Vincent Cohen-Addad and Anupam Gupta and Philip N. Klein and Jason Li}, title = {A Quasipolynomial (2 + <i>ε</i>)-Approximation for Planar Sparsest Cut}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1056--1069}, doi = {10.1145/3406325.3451103}, year = {2021}, } Publisher's Version 
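For intuition about the objective, the non-uniform sparsest cut of a tiny instance can be computed by brute force over bipartitions (illustrative code, ours; the paper's algorithm is of course not enumeration):

```python
from itertools import combinations

def sparsest_cut(n, supply, demands):
    """Brute-force non-uniform sparsest cut: over all bipartitions (S, V-S),
    minimize (weight of supply edges cut) / (demand separated).
    Exponential in n; the paper gives a quasipolynomial-time
    (2+eps)-approximation for planar supply graphs."""
    best = float("inf")
    for k in range(1, n // 2 + 1):  # complements give the same ratio
        for S in combinations(range(n), k):
            S = set(S)
            cut = sum(w for (u, v), w in supply.items() if (u in S) != (v in S))
            dem = sum(w for (u, v), w in demands.items() if (u in S) != (v in S))
            if dem > 0:
                best = min(best, cut / dem)
    return best

# 4-cycle supply graph with unit demand between every pair of vertices:
# S = {0, 1} cuts 2 supply edges and separates 4 demand pairs, ratio 1/2.
supply = {(0, 1): 1, (1, 2): 1, (2, 3): 1, (3, 0): 1}
demands = {(u, v): 1 for u, v in combinations(range(4), 2)}
print(sparsest_cut(4, supply, demands))
```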

Kleinberg, Jon 
STOC '21: "Simplicity Creates Inequity: ..."
Simplicity Creates Inequity: Implications for Fairness, Stereotypes, and Interpretability (Invited Paper)
Jon Kleinberg and Sendhil Mullainathan (Cornell University, USA; University of Chicago, USA) Algorithms are increasingly used to aid, or in some cases supplant, human decision-making, particularly for decisions that hinge on predictions. As a result, two features in addition to prediction quality have generated interest: (i) to facilitate human interaction and understanding with these algorithms, we desire prediction functions that are in some fashion simple or interpretable; and (ii) because they influence consequential decisions, we also want them to produce equitable allocations. We develop a formal model to explore the relationship between the demands of simplicity and equity. Although the two concepts appear to be motivated by qualitatively distinct goals, we show a fundamental inconsistency between them. Specifically, we formalize a general framework for producing simple prediction functions, and in this framework we establish two basic results. First, every simple prediction function is strictly improvable: there exists a more complex prediction function that is both strictly more efficient and also strictly more equitable. Put another way, using a simple prediction function both reduces utility for disadvantaged groups and reduces overall welfare relative to other options. Second, we show that simple prediction functions necessarily create incentives to use information about individuals' membership in a disadvantaged group: incentives that weren't present before simplification, and that work against these individuals. Thus, simplicity transforms disadvantage into bias against the disadvantaged group. Our results are not only about algorithms but about any process that produces simple models, and as such they connect to the psychology of stereotypes and to an earlier economics literature on statistical discrimination. 
@InProceedings{STOC21p7, author = {Jon Kleinberg and Sendhil Mullainathan}, title = {Simplicity Creates Inequity: Implications for Fairness, Stereotypes, and Interpretability (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {7--7}, doi = {10.1145/3406325.3465356}, year = {2021}, } Publisher's Version 

Knop, Alexander 
STOC '21: "Statistical Query Complexity ..."
Statistical Query Complexity of Manifold Estimation
Eddie Aamari and Alexander Knop (LPSM, France; Sorbonne University, France; University of Paris, France; CNRS, France; University of California at San Diego, USA) This paper studies the statistical query (SQ) complexity of estimating d-dimensional submanifolds in ℝ^{n}. We propose a purely geometric algorithm called Manifold Propagation that reduces the problem to three natural geometric routines: projection, tangent space estimation, and point detection. We then provide constructions of these geometric routines in the SQ framework. Given an adversarial STAT(τ) oracle and a target Hausdorff distance precision ε = Ω(τ^{2/(d+1)}), the resulting SQ manifold reconstruction algorithm has query complexity O(n polylog(n) ε^{−d/2}), which is proved to be nearly optimal. In the process, we establish low-rank matrix completion results for SQs and lower bounds for randomized SQ estimators in general metric spaces. @InProceedings{STOC21p116, author = {Eddie Aamari and Alexander Knop}, title = {Statistical Query Complexity of Manifold Estimation}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {116--122}, doi = {10.1145/3406325.3451135}, year = {2021}, } Publisher's Version STOC '21: "Log-Rank and Lifting for AND-Functions ..." Log-Rank and Lifting for AND-Functions Alexander Knop, Shachar Lovett, Sam McGuire, and Weiqiang Yuan (University of California at San Diego, USA; Tsinghua University, China) Let f: {0, 1}^{n} → {0, 1} be a boolean function, and let f_{∧}(x, y) = f(x ∧ y) denote the AND-function of f, where x ∧ y denotes bitwise AND. We study the deterministic communication complexity of f_{∧} and show that, up to a log n factor, it is bounded by a polynomial in the logarithm of the real rank of the communication matrix of f_{∧}. This comes within a log n factor of establishing the log-rank conjecture for AND-functions with no assumptions on f. 
Our result stands in contrast with previous results on special cases of the log-rank conjecture, which needed significant restrictions on f such as monotonicity or low F_{2}-degree. Our techniques can also be used to prove (within a log n factor) a lifting theorem for AND-functions, stating that the deterministic communication complexity of f_{∧} is polynomially related to the AND-decision tree complexity of f. The results rely on a new structural result regarding boolean functions f: {0, 1}^{n} → {0, 1} with a sparse polynomial representation, which may be of independent interest. We show that if the polynomial computing f has few monomials, then the set system of the monomials has a small hitting set, of size polylogarithmic in its sparsity. We also establish extensions of this result to multilinear polynomials with a larger range. @InProceedings{STOC21p197, author = {Alexander Knop and Shachar Lovett and Sam McGuire and Weiqiang Yuan}, title = {Log-Rank and Lifting for AND-Functions}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {197--208}, doi = {10.1145/3406325.3450999}, year = {2021}, } Publisher's Version 

Knudsen, Jakob Bæk Tejs 
STOC '21: "Load Balancing with Dynamic ..."
Load Balancing with Dynamic Set of Balls and Bins
Anders Aamand, Jakob Bæk Tejs Knudsen, and Mikkel Thorup (University of Copenhagen, Denmark) In dynamic load balancing, we wish to distribute balls into bins in an environment where both balls and bins can be added and removed. We want to minimize the maximum load of any bin, but we also want to minimize the number of balls and bins that are affected when adding or removing a ball or a bin. We want a hashing-style solution where, given the ID of a ball, we can find its bin efficiently. We are given a user-specified balancing parameter c=1+ε, where ε∈(0,1). Let n and m be the current number of balls and bins. Then we want no bin with load above C=⌈cn/m⌉, referred to as the capacity of the bins. We present a scheme where we can locate a ball checking 1+O(log(1/ε)) bins in expectation. When inserting or deleting a ball, we expect to move O(1/ε) balls, and when inserting or deleting a bin, we expect to move O(C/ε) balls. Previous bounds were off by a factor 1/ε. The above bounds are best possible when C=O(1), but for larger C, we can do much better: We define f=εC when C≤ log(1/ε), f=ε√C·√(log(1/(ε√C))) when log(1/ε)≤ C<1/(2ε^{2}), and f=1 when C≥ 1/(2ε^{2}). We show that we expect to move O(1/f) balls when inserting or deleting a ball, and O(C/f) balls when inserting or deleting a bin. Moreover, when C≥ log(1/ε), we can search a ball checking only O(1) bins in expectation. For the bounds with larger C, we first have to resolve a much simpler probabilistic problem. Place n balls in m bins of capacity C, one ball at a time. Each ball picks a uniformly random non-full bin. We show that in expectation and with high probability, the fraction of non-full bins is Θ(f). Then the expected number of bins that a new ball would have to visit to find one that is not full is Θ(1/f). As it turns out, this is also the complexity of an insertion in our more complicated scheme where both balls and bins can be added and removed. 
@InProceedings{STOC21p1262, author = {Anders Aamand and Jakob Bæk Tejs Knudsen and Mikkel Thorup}, title = {Load Balancing with Dynamic Set of Balls and Bins}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1262--1275}, doi = {10.1145/3406325.3451107}, year = {2021}, } Publisher's Version 
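The simpler probabilistic process from the abstract, placing balls one at a time into uniformly random non-full bins, is easy to simulate directly. The sketch below is illustrative only; it is not the authors' hashing scheme, and the function name and rejection-sampling loop are ours.

```python
import random

def fraction_nonfull(n, m, C, seed=0):
    """Place n balls into m bins of capacity C, one at a time.

    Each ball repeatedly probes a uniformly random bin until it
    finds one that is not full (requires n <= m * C). Returns the
    final fraction of non-full bins and the average number of
    probes per ball.
    """
    rng = random.Random(seed)
    load = [0] * m
    probes = 0
    for _ in range(n):
        while True:
            probes += 1
            b = rng.randrange(m)
            if load[b] < C:
                load[b] += 1
                break
    nonfull = sum(1 for x in load if x < C) / m
    return nonfull, probes / n
```

For example, with m=1000 bins of capacity C=20 and n=18000 balls (load factor 0.9), the final fraction of non-full bins stays bounded away from zero, and the average probe count per ball is its reciprocal in expectation.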

Kociumaka, Tomasz 
STOC '21: "Improved Dynamic Algorithms ..."
Improved Dynamic Algorithms for Longest Increasing Subsequence
Tomasz Kociumaka and Saeed Seddighin (University of California at Berkeley, USA; Toyota Technological Institute at Chicago, USA) We study dynamic algorithms for the longest increasing subsequence (LIS) problem. A dynamic LIS algorithm maintains a sequence subject to operations of the following form arriving one by one: insert an element, delete an element, or substitute an element for another. After each update, the algorithm must report the length of the longest increasing subsequence of the current sequence. Our main contribution is the first exact dynamic LIS algorithm with sublinear update time. More precisely, we present a randomized algorithm that performs each operation in time Õ(n^{4/5}) and, after each update, reports the answer to the LIS problem correctly with high probability. We use several novel techniques and observations for this algorithm that may find applications in future work. In the second part of the paper, we study approximate dynamic LIS algorithms, which are allowed to underestimate the solution size within a bounded multiplicative factor. In this setting, we give a deterministic (1−o(1))-approximation algorithm with update time O(n^{o(1)}). This result improves upon the previous work of Mitzenmacher and Seddighin (STOC’20), which provides an Ω(ε^{O(1/ε)})-approximation algorithm with update time Õ(n^{ε}) for any ε > 0. @InProceedings{STOC21p640, author = {Tomasz Kociumaka and Saeed Seddighin}, title = {Improved Dynamic Algorithms for Longest Increasing Subsequence}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {640--653}, doi = {10.1145/3406325.3451026}, year = {2021}, } Publisher's Version 
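For reference, the static quantity that a dynamic LIS algorithm must maintain can be computed from scratch in O(n log n) time by the classical patience-sorting method. This is the textbook static routine, not the paper's dynamic data structure:

```python
import bisect

def lis_length(seq):
    """Length of the longest strictly increasing subsequence.

    Patience sorting: tails[k] holds the smallest possible tail
    value of an increasing subsequence of length k+1. Each element
    either extends the longest subsequence or improves a tail.
    """
    tails = []
    for x in seq:
        i = bisect.bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)   # extends the current longest LIS
        else:
            tails[i] = x      # smaller tail for length i+1
    return len(tails)
```

A dynamic algorithm must keep this value current under insertions, deletions, and substitutions without recomputing from scratch, which is what makes sublinear update time nontrivial.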

Kol, Gillat 
STOC '21: "Optimal Error Resilience of ..."
Optimal Error Resilience of Adaptive Message Exchange
Klim Efremenko, Gillat Kol, and Raghuvansh R. Saxena (Ben-Gurion University of the Negev, Israel; Princeton University, USA) We study the error resilience of the message exchange task: Two parties, each holding a private input, want to exchange their inputs. However, the channel connecting them is governed by an adversary that may corrupt a constant fraction of the transmissions. What is the maximum fraction of corruptions that still allows the parties to exchange their inputs? For the non-adaptive channel, where the parties must agree in advance on the order in which they communicate, the maximum error resilience was shown to be 1/4 (see Braverman and Rao, STOC 2011). The problem was also studied over the adaptive channel, where the order in which the parties communicate may not be predetermined (Ghaffari, Haeupler, and Sudan, STOC 2014; Efremenko, Kol, and Saxena, STOC 2020). These works show that the adaptive channel admits a much richer set of protocols but leave open the question of finding its maximum error resilience. In this work, we show that the maximum error resilience of a protocol for message exchange over the adaptive channel is 5/16, thereby settling the above question. Our result requires improving both the known upper bounds and the known lower bounds for the problem. @InProceedings{STOC21p1235, author = {Klim Efremenko and Gillat Kol and Raghuvansh R. Saxena}, title = {Optimal Error Resilience of Adaptive Message Exchange}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1235--1247}, doi = {10.1145/3406325.3451077}, year = {2021}, } Publisher's Version STOC '21: "Almost Optimal Super-Constant-Pass ..." Almost Optimal Super-Constant-Pass Streaming Lower Bounds for Reachability Lijie Chen, Gillat Kol, Dmitry Paramonov, Raghuvansh R. Saxena, Zhao Song, and Huacheng Yu (Massachusetts Institute of Technology, USA; Princeton University, USA; Institute for Advanced Study at Princeton, USA) We give an almost quadratic n^{2−o(1)} lower bound on the space consumption of any o(√(log n))-pass streaming algorithm solving the (directed) s-t reachability problem. This means that any such algorithm must essentially store the entire graph. As corollaries, we obtain almost quadratic space lower bounds for additional fundamental problems, including maximum matching, shortest path, matrix rank, and linear programming. Our main technical contribution is the definition and construction of set hiding graphs, which may be of independent interest: we give a general way of encoding a set S ⊆ [k] as a directed graph with n = k^{1+o(1)} vertices, such that deciding whether i ∈ S boils down to deciding if t_{i} is reachable from s_{i}, for a specific pair of vertices (s_{i}, t_{i}) in the graph. Furthermore, we prove that our graph “hides” S, in the sense that no low-space streaming algorithm with a small number of passes can learn (almost) anything about S. @InProceedings{STOC21p570, author = {Lijie Chen and Gillat Kol and Dmitry Paramonov and Raghuvansh R. Saxena and Zhao Song and Huacheng Yu}, title = {Almost Optimal Super-Constant-Pass Streaming Lower Bounds for Reachability}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {570--583}, doi = {10.1145/3406325.3451038}, year = {2021}, } Publisher's Version 

Kontonis, Vasilis 
STOC '21: "Efficiently Learning Halfspaces ..."
Efficiently Learning Halfspaces with Tsybakov Noise
Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Christos Tzamos, and Nikos Zarifis (University of Wisconsin-Madison, USA; University of California at San Diego, USA) We study the problem of PAC learning homogeneous halfspaces with Tsybakov noise. In the Tsybakov noise model, the label of every example is independently flipped with an adversarially controlled probability that can be arbitrarily close to 1/2 for a fraction of the examples. We give the first polynomial-time algorithm for this fundamental learning problem. Our algorithm learns the true halfspace within any desired accuracy and succeeds under a broad family of well-behaved distributions, including log-concave distributions. This extended abstract is a merge of two papers. In an earlier work, a subset of the authors developed an efficient reduction from learning to certifying the non-optimality of a candidate halfspace and gave a quasi-polynomial-time certificate algorithm. In a subsequent work, the authors of this paper developed a polynomial-time certificate algorithm. @InProceedings{STOC21p88, author = {Ilias Diakonikolas and Daniel M. Kane and Vasilis Kontonis and Christos Tzamos and Nikos Zarifis}, title = {Efficiently Learning Halfspaces with Tsybakov Noise}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {88--101}, doi = {10.1145/3406325.3450998}, year = {2021}, } Publisher's Version 

Kothari, Pravesh K. 
STOC '21: "Playing Unique Games on Certified ..."
Playing Unique Games on Certified Small-Set Expanders
Mitali Bafna, Boaz Barak, Pravesh K. Kothari, Tselil Schramm, and David Steurer (Harvard University, USA; Carnegie Mellon University, USA; Stanford University, USA; ETH Zurich, Switzerland) We give an algorithm for solving unique games (UG) instances whenever low-degree sum-of-squares proofs certify good bounds on the small-set expansion of the underlying constraint graph via a hypercontractive inequality. Our algorithm is in fact more versatile, and succeeds even when the constraint graph is not a small-set expander as long as the structure of non-expanding small sets is (informally speaking) “characterized” by a low-degree sum-of-squares proof. Our results are obtained by rounding low-entropy solutions, measured via a new global potential function, to sum-of-squares (SoS) semidefinite programs. This technique adds to the (currently short) list of general tools for analyzing SoS relaxations for worst-case optimization problems. As corollaries, we obtain the first polynomial-time algorithms for solving any UG instance where the constraint graph is either the noisy hypercube, the short code, or the Johnson graph. The prior best algorithm for such instances was the eigenvalue enumeration algorithm of Arora, Barak, and Steurer (2010), which requires quasi-polynomial time for the noisy hypercube and nearly exponential time for the short code and Johnson graphs. All of our results achieve an approximation of 1−ε vs. δ for UG instances, where ε>0 and δ>0 depend on the expansion parameters of the graph but are independent of the alphabet size. @InProceedings{STOC21p1629, author = {Mitali Bafna and Boaz Barak and Pravesh K. Kothari and Tselil Schramm and David Steurer}, title = {Playing Unique Games on Certified Small-Set Expanders}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1629--1642}, doi = {10.1145/3406325.3451099}, year = {2021}, } Publisher's Version 

Kothari, Robin 
STOC '21: "Degree vs. Approximate Degree ..."
Degree vs. Approximate Degree and Quantum Implications of Huang’s Sensitivity Theorem
Scott Aaronson, Shalev Ben-David, Robin Kothari, Shravas Rao, and Avishay Tal (University of Texas at Austin, USA; University of Waterloo, Canada; Microsoft Quantum, USA; Microsoft Research, USA; Northwestern University, USA; University of California at Berkeley, USA) Based on the recent breakthrough of Huang (2019), we show that for any total Boolean function f: deg(f) = O(adeg(f)^{2}), i.e., the degree of f is at most quadratic in the approximate degree of f, which is optimal as witnessed by the OR function; and D(f) = O(Q(f)^{4}), i.e., the deterministic query complexity of f is at most quartic in the quantum query complexity of f, which matches the known separation (up to log factors) due to Ambainis, Balodis, Belovs, Lee, Santha, and Smotrovs (2017). We apply these results to resolve the quantum analogue of the Aanderaa–Karp–Rosenberg conjecture. We show that if f is a nontrivial monotone graph property of an n-vertex graph specified by its adjacency matrix, then Q(f)=Ω(n), which is also optimal. We also show that the approximate degree of any read-once formula on n variables is Θ(√n). @InProceedings{STOC21p1330, author = {Scott Aaronson and Shalev Ben-David and Robin Kothari and Shravas Rao and Avishay Tal}, title = {Degree vs. Approximate Degree and Quantum Implications of Huang’s Sensitivity Theorem}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1330--1342}, doi = {10.1145/3406325.3451047}, year = {2021}, } Publisher's Version 
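The degree deg(f) in the first result is the degree of the unique multilinear polynomial representing f over the reals; for small n it can be computed by brute force via the Möbius transform of f. A minimal sketch (exponential in n, for illustration only; the function name is ours):

```python
from itertools import combinations

def mobius_degree(f, n):
    """Real degree of f: {0,1}^n -> {0,1}.

    The coefficient of the monomial prod_{i in S} x_i in the unique
    multilinear representation is the Mobius transform
    c_S = sum_{T subseteq S} (-1)^{|S|-|T|} f(1_T); the degree is
    the largest |S| with c_S != 0. f takes a set of indices in
    {0,...,n-1} (the coordinates set to 1).
    """
    def subsets(S):
        for r in range(len(S) + 1):
            yield from combinations(S, r)

    deg = 0
    for S in subsets(tuple(range(n))):
        c = sum((-1) ** (len(S) - len(T)) * f(set(T)) for T in subsets(S))
        if c != 0:
            deg = max(deg, len(S))
    return deg
```

For instance, OR on n bits has full degree n (its top Möbius coefficient is nonzero), while its approximate degree is only Θ(√n), which is exactly the quadratic gap the first result shows is worst possible.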

Krauthgamer, Robert 
STOC '21: "Towards Tight Bounds for Spectral ..."
Towards Tight Bounds for Spectral Sparsification of Hypergraphs
Michael Kapralov, Robert Krauthgamer, Jakab Tardos, and Yuichi Yoshida (EPFL, Switzerland; Weizmann Institute of Science, Israel; National Institute of Informatics, Japan) Cut and spectral sparsification of graphs have numerous applications, including e.g. speeding up algorithms for cuts and Laplacian solvers. These powerful notions have recently been extended to hypergraphs, which are much richer and may offer new applications. However, the current bounds on the size of hypergraph sparsifiers are not as tight as the corresponding bounds for graphs. Our first result is a polynomial-time algorithm that, given a hypergraph on n vertices with maximum hyperedge size r, outputs an ε-spectral sparsifier with O^{*}(nr) hyperedges, where O^{*} suppresses (ε^{−1} log n)^{O(1)} factors. This size bound improves the two previous bounds: O^{*}(n^{3}) [Soma and Yoshida, SODA’19] and O^{*}(nr^{3}) [Bansal, Svensson and Trevisan, FOCS’19]. Our main technical tool is a new method for proving concentration of the nonlinear analogue of the quadratic form of the Laplacians for hypergraph expanders. We complement this with lower bounds on the bit complexity of any compression scheme that (1+ε)-approximates all the cuts in a given hypergraph, and hence also on the bit complexity of every ε-cut/spectral sparsifier. These lower bounds are based on Ruzsa–Szemerédi graphs, and a particular instantiation yields an Ω(nr) lower bound on the bit complexity even for fixed constant ε. In the case of hypergraph cut sparsifiers, this is tight up to polylogarithmic factors in n, due to a recent result of [Chen, Khanna and Nagda, FOCS’20]. For spectral sparsifiers it narrows the gap to O^{*}(r). Finally, for directed hypergraphs, we present an algorithm that computes an ε-spectral sparsifier with O^{*}(n^{2}r^{3}) hyperarcs, where r is the maximum size of a hyperarc. 
For small r, this improves over the O^{*}(n^{3}) bound known from [Soma and Yoshida, SODA’19], and comes close to the trivial lower bound of Ω(n^{2}) hyperarcs. @InProceedings{STOC21p598, author = {Michael Kapralov and Robert Krauthgamer and Jakab Tardos and Yuichi Yoshida}, title = {Towards Tight Bounds for Spectral Sparsification of Hypergraphs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {598--611}, doi = {10.1145/3406325.3451061}, year = {2021}, } Publisher's Version STOC '21: "Subcubic Algorithms for Gomory–Hu ..." Subcubic Algorithms for Gomory–Hu Tree in Unweighted Graphs Amir Abboud, Robert Krauthgamer, and Ohad Trabelsi (Weizmann Institute of Science, Israel) Every undirected graph G has a (weighted) cut-equivalent tree T, commonly named after Gomory and Hu, who discovered it in 1961. Both T and G have the same node set, and for every node pair s,t, the minimum (s,t)-cut in T is also an exact minimum (s,t)-cut in G. We give the first subcubic-time algorithm that constructs such a tree for a simple graph G (unweighted with no parallel edges). Its time complexity is Õ(n^{2.5}), for n=|V(G)|; previously, only Õ(n^{3}) was known, except for restricted cases like sparse graphs. Consequently, we obtain the first algorithm for All-Pairs Max-Flow in simple graphs that breaks the cubic-time barrier. Gomory and Hu compute this tree using n−1 queries to (single-pair) Max-Flow; the new algorithm can be viewed as a fine-grained reduction to Õ(√n) Max-Flow computations on n-node graphs. @InProceedings{STOC21p1725, author = {Amir Abboud and Robert Krauthgamer and Ohad Trabelsi}, title = {Subcubic Algorithms for Gomory–Hu Tree in Unweighted Graphs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1725--1737}, doi = {10.1145/3406325.3451073}, year = {2021}, } Publisher's Version 
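The defining property of a cut-equivalent (Gomory–Hu) tree is that the minimum (s,t)-cut value in G can be read off as the lightest edge on the unique tree path from s to t. A small sketch of that read-off step only (the tree construction itself, via max-flow queries, is not shown; the function name and representation are ours):

```python
def tree_min_cut(adj, s, t):
    """Minimum (s,t)-cut value encoded by a weighted tree.

    In a Gomory-Hu tree, this equals the minimum (s,t)-cut of the
    original graph: the lightest edge weight on the unique s-t
    tree path. adj: {u: [(v, w), ...]} undirected tree adjacency.
    """
    # Iterative DFS from s, tracking the lightest edge seen so far.
    stack = [(s, None, float("inf"))]
    while stack:
        u, parent, best = stack.pop()
        if u == t:
            return best
        for v, w in adj[u]:
            if v != parent:
                stack.append((v, u, min(best, w)))
    raise ValueError("t not reachable from s")
```

This is why a single tree with n−1 weighted edges can encode the minimum cut values of all n(n−1)/2 node pairs at once.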

Krishnamurthy, Akshay 
STOC '21: "Contextual Search in the Presence ..."
Contextual Search in the Presence of Irrational Agents
Akshay Krishnamurthy, Thodoris Lykouris, Chara Podimata, and Robert Schapire (Microsoft Research, USA; Harvard University, USA) We study contextual search, a generalization of binary search in higher dimensions, which captures settings such as feature-based dynamic pricing. Standard game-theoretic formulations of this problem assume that agents act in accordance with a specific behavioral model. In practice, some agents may not subscribe to the dominant behavioral model or may act in ways that are seemingly arbitrarily irrational. Existing algorithms heavily depend on the behavioral model being (approximately) accurate for all agents and have poor performance even with a few arbitrarily irrational agents. We initiate the study of contextual search when some of the agents can behave in ways inconsistent with the underlying behavioral model. In particular, we provide two algorithms, one based on multidimensional binary search methods and one based on gradient descent. Our techniques draw inspiration from learning theory, game theory, high-dimensional geometry, and convex analysis. @InProceedings{STOC21p910, author = {Akshay Krishnamurthy and Thodoris Lykouris and Chara Podimata and Robert Schapire}, title = {Contextual Search in the Presence of Irrational Agents}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {910--918}, doi = {10.1145/3406325.3451120}, year = {2021}, } Publisher's Version 
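In one dimension, the primitive that contextual search generalizes is binary search with threshold feedback: post a price, observe only accept/reject, and halve the interval. This plain sketch is ours, assumes a perfectly rational buyer (accepts iff value ≥ price), and is not one of the paper's algorithms, which are precisely about dropping that assumption:

```python
def binary_search_price(buyer_value, lo=0.0, hi=1.0, rounds=20):
    """Localize a buyer's private value in [lo, hi) from binary
    accept/reject feedback: accept means value >= posted price.
    After `rounds` steps the interval has width (hi-lo)/2**rounds.
    """
    for _ in range(rounds):
        price = (lo + hi) / 2
        if buyer_value >= price:   # buyer accepts at this price
            lo = price
        else:                      # buyer rejects
            hi = price
    return lo, hi
```

A single irrational response breaks this scheme permanently (the value is discarded from the interval forever), which is why robustness to inconsistent agents requires genuinely different techniques.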

Kuhn, Fabian 
STOC '21: "Efficient Randomized Distributed ..."
Efficient Randomized Distributed Coloring in CONGEST
Magnús M. Halldórsson, Fabian Kuhn, Yannic Maus, and Tigran Tonoyan (Reykjavik University, Iceland; University of Freiburg, Germany; Technion, Israel) Distributed vertex coloring is one of the classic, and probably the most widely studied, problems in the area of distributed graph algorithms. We present a new randomized distributed vertex coloring algorithm for the standard CONGEST model, where the network is modeled as an n-node graph G, and where the nodes of G operate in synchronous communication rounds in which they can exchange O(log n)-bit messages over all the edges of G. For graphs with maximum degree Δ, we show that the (Δ+1)-list coloring problem (and therefore also the standard (Δ+1)-coloring problem) can be solved in O(log^{5} log n) rounds. Previously such a result was only known for the significantly more powerful LOCAL model, where in each round, neighboring nodes can exchange messages of arbitrary size. The best previous (Δ+1)-coloring algorithm in the CONGEST model had a running time of O(log Δ + log^{6} log n) rounds. As a function of n alone, the best previous algorithm therefore had a round complexity of O(log n), a bound that can also be achieved by a naïve folklore algorithm. For large maximum degree Δ, our algorithm hence is an exponential improvement over the previous state of the art. @InProceedings{STOC21p1180, author = {Magnús M. Halldórsson and Fabian Kuhn and Yannic Maus and Tigran Tonoyan}, title = {Efficient Randomized Distributed Coloring in CONGEST}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1180--1193}, doi = {10.1145/3406325.3451089}, year = {2021}, } Publisher's Version 
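The reason Δ+1 colors always suffice is the sequential greedy rule: a vertex with at most Δ neighbors always has a free color among {0, …, Δ}. A minimal sequential sketch of that baseline (ours; the distributed CONGEST algorithm achieves the same guarantee without any sequential ordering):

```python
def greedy_coloring(adj):
    """Sequential greedy (Delta+1)-coloring.

    Each vertex takes the smallest color not used by its
    already-colored neighbors. Since a vertex has at most Delta
    neighbors, some color in {0, ..., Delta} is always free.
    adj: {v: iterable of neighbors}, symmetric.
    """
    color = {}
    for v in adj:
        taken = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in taken:
            c += 1
        color[v] = c
    return color
```

The distributed difficulty lies entirely in breaking the sequential dependence: all nodes must commit to colors in few synchronous rounds while exchanging only O(log n)-bit messages.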

Kumar, Mrinal 
STOC '21: "Decoding Multivariate Multiplicity ..."
Decoding Multivariate Multiplicity Codes on Product Sets
Siddharth Bhandari, Prahladh Harsha, Mrinal Kumar, and Madhu Sudan (Tata Institute of Fundamental Research, India; IIT Bombay, India; Harvard University, USA) The multiplicity Schwartz–Zippel lemma bounds the total multiplicity of zeroes of a multivariate polynomial on a product set. This lemma motivates the multiplicity codes of Kopparty, Saraf and Yekhanin [J. ACM, 2014], who showed how to use this lemma to construct high-rate locally decodable codes. However, the algorithmic results about these codes crucially rely on the fact that the polynomials are evaluated on a vector space and not an arbitrary product set. In this work, we show how to decode multivariate multiplicity codes of large multiplicities in polynomial time over finite product sets (over fields of large characteristic and zero characteristic). Previously, such decoding algorithms were not known even for a positive fraction of errors. In contrast, our work goes all the way to the distance of the code and in particular exceeds both the unique decoding bound and the Johnson radius. For errors exceeding the Johnson radius, even combinatorial list-decodability of these codes was not known. Our algorithm is an application of the classical polynomial method directly to the multivariate setting. In particular, we do not rely on a reduction from the multivariate to the univariate case, as is typical of many of the existing results on decoding codes based on multivariate polynomials. However, a vanilla application of the polynomial method in the multivariate setting does not yield a polynomial upper bound on the list size. We obtain a polynomial bound on the list size by taking an alternative view of multivariate multiplicity codes. In this view, we glue all the partial derivatives of the same order together using a fresh set of variables. We then apply the polynomial method by viewing this as a problem over the field of rational functions in these fresh variables. 
@InProceedings{STOC21p1489, author = {Siddharth Bhandari and Prahladh Harsha and Mrinal Kumar and Madhu Sudan}, title = {Decoding Multivariate Multiplicity Codes on Product Sets}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1489--1501}, doi = {10.1145/3406325.3451027}, year = {2021}, } Publisher's Version 

Kumar, Ravi 
STOC '21: "Sample-Efficient Proper PAC ..."
Sample-Efficient Proper PAC Learning with Approximate Differential Privacy
Badih Ghazi, Noah Golowich, Ravi Kumar, and Pasin Manurangsi (Google Research, USA; Massachusetts Institute of Technology, USA) In this paper we prove that the sample complexity of properly learning a class of Littlestone dimension d with approximate differential privacy is Õ(d^{6}), ignoring privacy and accuracy parameters. This result answers a question of Bun et al. (FOCS 2020) by improving upon their upper bound of 2^{O(d)} on the sample complexity. Prior to our work, finiteness of the sample complexity for privately learning a class of finite Littlestone dimension was only known for improper private learners, and the fact that our learner is proper answers another question of Bun et al., which was also asked by Bousquet et al. (NeurIPS 2020). Using machinery developed by Bousquet et al., we then show that the sample complexity of sanitizing a binary hypothesis class is at most polynomial in its Littlestone dimension and dual Littlestone dimension. This implies that a class is sanitizable if and only if it has finite Littlestone dimension. An important ingredient of our proofs is a new property of binary hypothesis classes that we call irreducibility, which may be of independent interest. @InProceedings{STOC21p183, author = {Badih Ghazi and Noah Golowich and Ravi Kumar and Pasin Manurangsi}, title = {Sample-Efficient Proper PAC Learning with Approximate Differential Privacy}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {183--196}, doi = {10.1145/3406325.3451028}, year = {2021}, } Publisher's Version 

Kuszmaul, William 
STOC '21: "How Asymmetry Helps Buffer ..."
How Asymmetry Helps Buffer Management: Achieving Optimal Tail Size in Cup Games
William Kuszmaul (Massachusetts Institute of Technology, USA) The cup game on n cups is a multi-step game with two players, a filler and an emptier. At each step, the filler distributes 1 unit of water among the cups, and then the emptier selects a single cup to remove (up to) 1 unit of water from. There are several objective functions that the emptier might wish to minimize. One of the strongest guarantees would be to minimize tail size, which is defined to be the number of cups with fill 2 or greater. A simple lower-bound construction shows that the optimal tail size for deterministic emptying algorithms is Θ(n), however. We present a simple randomized emptying algorithm that achieves tail size Õ(log n) with high probability in n for poly(n) steps. Moreover, we show that this is tight up to doubly logarithmic factors. We also extend our results to the multi-processor cup game, achieving tail size Õ(log n + p) on p processors with high probability in n. We show that the dependence on p is near optimal for any emptying algorithm that achieves polynomially bounded backlog. A natural question is whether our results can be extended to give unending guarantees, which apply to arbitrarily long games. We give a lower-bound construction showing that no monotone memoryless emptying algorithm can achieve an unending guarantee on either tail size or the related objective function of backlog. On the other hand, we show that even a very small (i.e., 1/poly(n)) amount of resource augmentation is sufficient to overcome this barrier. @InProceedings{STOC21p1248, author = {William Kuszmaul}, title = {How Asymmetry Helps Buffer Management: Achieving Optimal Tail Size in Cup Games}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1248--1261}, doi = {10.1145/3406325.3451033}, year = {2021}, } Publisher's Version 
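The game itself is easy to simulate. The sketch below uses a deterministic greedy emptier (empty the fullest cup) against one simple randomized filler; it is ours and only illustrates the rules and the tail-size objective, not the paper's randomized emptying algorithm or its adversarial lower-bound filler:

```python
import random

def cup_game(n, steps, seed=0):
    """Simulate the cup game on n cups for `steps` rounds.

    Filler: splits its 1 unit of water over two random cups (one of
    many possible filler strategies). Emptier: greedy, removes up
    to 1 unit from the fullest cup. Returns the worst tail size
    observed, i.e. the max number of cups with fill >= 2.
    """
    rng = random.Random(seed)
    fill = [0.0] * n
    worst_tail = 0
    for _ in range(steps):
        a, b = rng.randrange(n), rng.randrange(n)
        fill[a] += 0.5
        fill[b] += 0.5
        i = max(range(n), key=lambda j: fill[j])
        fill[i] = max(0.0, fill[i] - 1.0)
        worst_tail = max(worst_tail, sum(1 for x in fill if x >= 2))
    return worst_tail
```

Against a worst-case filler, the abstract's lower bound says any deterministic emptier like this greedy one can be forced into tail size Θ(n); the paper's randomized emptier avoids this.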

Laddha, Aditi 
STOC '21: "Reducing Isotropy and Volume ..."
Reducing Isotropy and Volume to KLS: An O*(n^{3}ψ^{2}) Volume Algorithm
He Jia, Aditi Laddha, Yin Tat Lee, and Santosh Vempala (Georgia Institute of Technology, USA; University of Washington, USA; Microsoft Research, USA) We show that the volume of a convex body in R^{n} in the general membership oracle model can be computed to within relative error ε using O(n^{3}ψ^{2}/ε^{2}) oracle queries, where ψ is the KLS constant. With the current bound of ψ=O(n^{o(1)}), this gives an O(n^{3+o(1)}/ε^{2}) algorithm, the first improvement on the Lovász–Vempala O(n^{4}/ε^{2}) algorithm from 2003. The main new ingredient is an O(n^{3}ψ^{2}) algorithm for isotropic transformation, following which we can apply the O(n^{3}/ε^{2}) volume algorithm of Cousins and Vempala for well-rounded convex bodies. A positive resolution of the KLS conjecture would imply an O(n^{3}/ε^{2}) volume algorithm. We also give an efficient implementation of the new algorithm for convex polytopes defined by m inequalities in R^{n}: polytope volume can be estimated in time O(mn^{c}/ε^{2}), where c<3.2 depends on the current matrix multiplication exponent and improves on the previous best bound. @InProceedings{STOC21p961, author = {He Jia and Aditi Laddha and Yin Tat Lee and Santosh Vempala}, title = {Reducing Isotropy and Volume to KLS: An <i>O</i>*(<i>n</i><sup>3</sup><i>ψ</i><sup>2</sup>) Volume Algorithm}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {961--974}, doi = {10.1145/3406325.3451018}, year = {2021}, } Publisher's Version 

Langley, Zachary 
STOC '21: "A Framework for Dynamic Matching ..."
A Framework for Dynamic Matching in Weighted Graphs
Aaron Bernstein, Aditi Dudeja, and Zachary Langley (Rutgers University, USA) We introduce a new framework for computing approximate maximum weight matchings. Our primary focus is on the fully dynamic setting, where there is a large gap between the guarantees of the best known algorithms for computing weighted and unweighted matchings. Indeed, almost all current weighted matching algorithms that reduce to the unweighted problem lose a factor of two in the approximation ratio. In contrast, in other sublinear models such as the distributed and streaming models, recent work has largely closed this weighted/unweighted gap. For bipartite graphs, we almost completely settle the gap with a general reduction that converts any algorithm for α-approximate unweighted matching into an algorithm for (1−ε)α-approximate weighted matching, while only increasing the update time by an O(log n) factor for constant ε. We also show that our framework leads to significant improvements for non-bipartite graphs, though not in the form of a universal reduction. In particular, we give two algorithms for weighted non-bipartite matching: 1. A randomized (Las Vegas) fully dynamic algorithm that maintains a (1/2−ε)-approximate maximum weight matching in worst-case update time O(polylog n) with high probability against an adaptive adversary. Our bounds are essentially the same as those of the unweighted algorithm of Wajc [STOC 2020]. 2. A deterministic fully dynamic algorithm that maintains a (2/3−ε)-approximate maximum weight matching in amortized update time O(m^{1/4}). Our bounds are essentially the same as those of the unweighted algorithm of Bernstein and Stein [SODA 2016]. A key feature of our framework is that it uses existing algorithms for unweighted matching as black boxes. As a result, our framework is simple and versatile. Moreover, our framework easily translates to other models, and we use it to derive new results for the weighted matching problem in streaming and communication complexity models. 
@InProceedings{STOC21p668, author = {Aaron Bernstein and Aditi Dudeja and Zachary Langley}, title = {A Framework for Dynamic Matching in Weighted Graphs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {668--681}, doi = {10.1145/3406325.3451113}, year = {2021}, } Publisher's Version 
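The factor-of-two loss from reducing weighted to unweighted matching mentioned above can be seen in the classic greedy baseline: taking edges in decreasing weight order is a 1/2-approximation for maximum weight matching. A minimal static sketch (an illustration only, not the paper's dynamic framework):

```python
def greedy_matching(edges):
    """Greedy 1/2-approximation for maximum weight matching.

    edges: list of (u, v, weight) tuples of an undirected graph.
    Returns the total weight of the greedy matching.
    """
    matched = set()
    total = 0
    # Consider edges in decreasing weight order; take an edge
    # whenever both endpoints are still unmatched.
    for u, v, w in sorted(edges, key=lambda e: -e[2]):
        if u not in matched and v not in matched:
            matched.update((u, v))
            total += w
    return total

# On the path a-b-c-d with weights 2, 3, 2 greedy picks only b-c
# (weight 3), while the optimum a-b plus c-d has weight 4 --
# within the 1/2 guarantee.
```
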

Lazos, Philip 
STOC '21: "Efficient Two-Sided Markets ..."
Efficient Two-Sided Markets with Limited Information
Paul Dütting, Federico Fusco, Philip Lazos, Stefano Leonardi, and Rebecca Reiffenhäuser (Google Research, Switzerland; Sapienza University of Rome, Italy) A celebrated impossibility result by Myerson and Satterthwaite (1983) shows that any truthful mechanism for two-sided markets that maximizes social welfare must run a deficit, resulting in a necessity to relax welfare efficiency and to use approximation mechanisms. Such mechanisms in general make extensive use of Bayesian priors. In this work, we investigate a question of increasing theoretical and practical importance: how much prior information is required to design mechanisms with near-optimal approximations? Our first contribution is a more general impossibility result stating that no meaningful approximation is possible without any prior information, expanding the famous impossibility result of Myerson and Satterthwaite. Our second contribution is that one single sample (one number per item), arguably a minimum-possible amount of prior information, from each seller distribution is sufficient for a large class of two-sided markets. We prove matching upper and lower bounds on the best approximation that can be obtained with one single sample for subadditive buyers and additive sellers, regardless of computational considerations. Our third contribution is the design of computationally efficient black-box reductions that turn any one-sided mechanism into a two-sided mechanism with a small loss in the approximation, while using only one single sample from each seller. On the way, our black-box-type mechanisms deliver several interesting positive results in their own right, often beating even the state of the art that uses full prior information. 
@InProceedings{STOC21p1452, author = {Paul Dütting and Federico Fusco and Philip Lazos and Stefano Leonardi and Rebecca Reiffenhäuser}, title = {Efficient Two-Sided Markets with Limited Information}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1452--1465}, doi = {10.1145/3406325.3451076}, year = {2021}, } Publisher's Version 

Le, Hung 
STOC '21: "Clan Embeddings into Trees, ..."
Clan Embeddings into Trees, and Low Treewidth Graphs
Arnold Filtser and Hung Le (Columbia University, USA; University of Massachusetts, USA) In low distortion metric embeddings, the goal is to embed a host “hard” metric space into a “simpler” target space while approximately preserving pairwise distances. A highly desirable target space is that of a tree metric. Unfortunately, such an embedding can result in huge distortion. A celebrated bypass to this problem is stochastic embedding with logarithmic expected distortion. Another bypass is Ramsey-type embedding, where the distortion guarantee applies only to a subset of the points. However, both these solutions fail to provide an embedding into a single tree with a worst-case distortion guarantee on all pairs. In this paper, we propose a novel third bypass called clan embedding. Here each point x is mapped to a subset of points f(x), called a clan, with a special chief point χ(x)∈ f(x). The clan embedding has multiplicative distortion t if for every pair (x,y) some copy y′∈ f(y) in the clan of y is close to the chief of x: min_{y′∈ f(y)}d(y′,χ(x))≤ t· d(x,y). Our first result is a clan embedding into a tree with multiplicative distortion O(log n/є) such that each point has 1+є copies (in expectation). In addition, we provide a “spanning” version of this theorem for graphs and use it to devise the first compact routing scheme with constant-size routing tables. We then focus on minor-free graphs of diameter parameterized by D, which were known to be stochastically embeddable into bounded treewidth graphs with expected additive distortion єD. We devise Ramsey-type embedding and clan embedding analogs of the stochastic embedding. We use these embeddings to construct the first (bicriteria quasipolynomial time) approximation scheme for the metric ρ-dominating set and metric ρ-independent set problems in minor-free graphs. 
@InProceedings{STOC21p342, author = {Arnold Filtser and Hung Le}, title = {Clan Embeddings into Trees, and Low Treewidth Graphs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {342--355}, doi = {10.1145/3406325.3451043}, year = {2021}, } Publisher's Version 
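The huge distortion of deterministic tree embeddings mentioned in the abstract above can be checked on the simplest example: any spanning tree of the n-cycle stretches some pair by a factor of n−1. A toy verification (my own illustration, not from the paper) for the path obtained by deleting one cycle edge:

```python
def cycle_dist(n, i, j):
    # Shortest-path distance between i and j on the n-cycle.
    d = abs(i - j)
    return min(d, n - d)

def path_dist(i, j):
    # Distance in the spanning path left after deleting cycle edge (0, n-1).
    return abs(i - j)

def tree_distortion(n):
    # Worst-case multiplicative stretch of the path embedding of the n-cycle.
    return max(path_dist(i, j) / cycle_dist(n, i, j)
               for i in range(n) for j in range(n) if i != j)

# The endpoints of the deleted edge are at cycle distance 1 but path
# distance n - 1, so the distortion is exactly n - 1.
```
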

Leake, Jonathan 
STOC '21: "Sampling Matrices from Harish-Chandra–Itzykson–Zuber ..."
Sampling Matrices from Harish-Chandra–Itzykson–Zuber Densities with Applications to Quantum Inference and Differential Privacy
Jonathan Leake, Colin McSwiggen, and Nisheeth K. Vishnoi (TU Berlin, Germany; University of Tokyo, Japan; Yale University, USA) Given two Hermitian matrices Y and Λ, the Harish-Chandra–Itzykson–Zuber (HCIZ) distribution is given by the density e^{Tr(U Λ U*Y)} with respect to the Haar measure on the unitary group. Random unitary matrices distributed according to the HCIZ distribution are important in various settings in physics and random matrix theory, but the problem of sampling efficiently from this distribution has remained open. We present two algorithms to sample matrices from distributions that are close to the HCIZ distribution. The first produces samples that are ξ-close in total variation distance, and the number of arithmetic operations required depends on poly(log 1/ξ). The second produces samples that are ξ-close in infinity divergence, but with a poly(1/ξ) dependence. Our results have the following applications: 1) an efficient algorithm to sample from complex versions of matrix Langevin distributions studied in statistics, 2) an efficient algorithm to sample from continuous maximum entropy distributions over unitary orbits, which in turn implies an efficient algorithm to sample a pure quantum state from the entropy-maximizing ensemble representing a given density matrix, and 3) an efficient algorithm for differentially private rank-k approximation that comes with improved utility bounds for k>1. @InProceedings{STOC21p1384, author = {Jonathan Leake and Colin McSwiggen and Nisheeth K. Vishnoi}, title = {Sampling Matrices from Harish-Chandra–Itzykson–Zuber Densities with Applications to Quantum Inference and Differential Privacy}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1384--1397}, doi = {10.1145/3406325.3451094}, year = {2021}, } Publisher's Version STOC '21: "Capacity Lower Bounds via ..." 
Capacity Lower Bounds via Productization Leonid Gurvits and Jonathan Leake (City College of New York, USA; TU Berlin, Germany) We give a sharp lower bound on the capacity of a real stable polynomial, depending only on the value of its gradient at x = 1. This result implies a sharp improvement to a similar inequality proved by Linial–Samorodnitsky–Wigderson in 2000, which was crucial to the analysis of their permanent approximation algorithm. Such inequalities have played an important role in the recent work on operator scaling and its generalizations and applications, and in fact we use our bound to construct a new scaling algorithm for real stable polynomials. Our bound is also quite similar to one used very recently by Karlin–Klein–Oveis Gharan to give an improved approximation factor for metric TSP. The new technique we develop to prove this bound is productization, which says that any real stable polynomial can be approximated at any point in the positive orthant by a product of linear forms. Beyond the results of this paper, our main hope is that this new technique will allow us to avoid “frightening technicalities”, in the words of Laurent and Schrijver, that often accompany combinatorial lower bounds. In particular, we believe that this technique will be useful towards simplifying and improving further the approximation factor given in the fantastic work of Karlin–Klein–Oveis Gharan on metric TSP. @InProceedings{STOC21p847, author = {Leonid Gurvits and Jonathan Leake}, title = {Capacity Lower Bounds via Productization}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {847--858}, doi = {10.1145/3406325.3451105}, year = {2021}, } Publisher's Version 

Lee, Euiwoong 
STOC '21: "A Framework for Quadratic ..."
A Framework for Quadratic Form Maximization over Convex Sets through Nonconvex Relaxations
Vijay Bhattiprolu, Euiwoong Lee, and Assaf Naor (Institute for Advanced Study at Princeton, USA; Princeton University, USA; University of Michigan, USA) We investigate the approximability of the following optimization problem. The input is an n× n matrix A=(A_{ij}) with real entries and an origin-symmetric convex body K⊂ ℝ^{n} that is given by a membership oracle. The task is to compute (or approximate) the maximum of the quadratic form ∑_{i=1}^{n}∑_{j=1}^{n} A_{ij} x_{i}x_{j}=⟨ x,Ax⟩ as x ranges over K. This is a rich and expressive family of optimization problems; for different choices of matrices A and convex bodies K it includes a diverse range of optimization problems like max-cut, Grothendieck/non-commutative Grothendieck inequalities, small set expansion and more. While the literature studied these special cases using case-specific reasoning, here we develop a general methodology for treatment of the approximability and inapproximability aspects of these questions. The underlying geometry of K plays a critical role; we show under commonly used complexity assumptions that polynomial-time constant-approximability necessitates that K has a type-2 constant that grows slowly with n. However, we show that even when the type-2 constant is bounded, this problem sometimes exhibits strong hardness of approximation. Thus, even within the realm of type-2 bodies, the approximability landscape is nuanced and subtle. However, the link that we establish between optimization and geometry of Banach spaces allows us to devise a generic algorithmic approach to the above problem. We associate to each convex body a new (higher-dimensional) auxiliary set that is not convex, but is approximately convex when K has a bounded type-2 constant. If our auxiliary set has an approximate separation oracle, then we design an approximation algorithm for the original quadratic optimization problem, using an approximate version of the ellipsoid method. 
Even though our hardness result implies that such an oracle does not exist in general, this new question can be solved in specific cases of interest by implementing a range of classical tools from functional analysis, most notably the deep factorization theory of linear operators. Beyond encompassing the scenarios in the literature for which constant-factor approximation algorithms were found, our generic framework implies that for convex sets with bounded type-2 constant, constant-factor approximability is preserved under the following basic operations: (a) Subspaces, (b) Quotients, (c) Minkowski Sums, (d) Complex Interpolation. This yields a rich family of new examples where constant-factor approximations are possible, which were beyond the reach of previous methods. We also show (under commonly used complexity assumptions) that for symmetric norms and unitarily invariant matrix norms the type-2 constant nearly characterizes the approximability of quadratic maximization. @InProceedings{STOC21p870, author = {Vijay Bhattiprolu and Euiwoong Lee and Assaf Naor}, title = {A Framework for Quadratic Form Maximization over Convex Sets through Nonconvex Relaxations}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {870--881}, doi = {10.1145/3406325.3451128}, year = {2021}, } Publisher's Version 
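To make the max-cut specialization mentioned in the abstract above concrete: taking A to be the adjacency matrix of a graph with m edges and K = [−1,1]^n, each vertex x ∈ {−1,+1}^n of K encodes a bipartition cutting (2m − ⟨x,Ax⟩)/4 edges, so maximizing the cut is minimizing the quadratic form over K. A brute-force check on the triangle K₃ (a toy instance of my own choosing):

```python
from itertools import product

# Adjacency matrix of the triangle K3.
A = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]
n, m = 3, 3  # number of vertices and edges

def quad(x):
    # The quadratic form <x, Ax>, summed over ordered pairs.
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def max_cut():
    # cut(x) = (2m - <x, Ax>) / 4 for x in {-1, +1}^n, so maximizing
    # the cut amounts to minimizing the quadratic form over the
    # vertices of the hypercube K = [-1, 1]^n.
    return max((2 * m - quad(x)) // 4 for x in product((-1, 1), repeat=n))
```

Here max_cut() returns 2, the maximum cut of a triangle, attained e.g. at x = (1, 1, −1) where ⟨x,Ax⟩ = −2.
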

Lee, Yin Tat 
STOC '21: "Minimum Cost Flows, MDPs, ..."
Minimum Cost Flows, MDPs, and ℓ_{1}-Regression in Nearly Linear Time for Dense Instances
Jan van den Brand, Yin Tat Lee, Yang P. Liu, Thatchaphol Saranurak, Aaron Sidford, Zhao Song, and Di Wang (KTH, Sweden; University of Washington, USA; Microsoft Research, USA; Stanford University, USA; University of Michigan, USA; Princeton University, USA; Institute for Advanced Study at Princeton, USA; Google Research, USA) In this paper we provide new randomized algorithms with improved runtimes for solving linear programs with two-sided constraints. In the special case of the minimum cost flow problem on n-vertex m-edge graphs with integer polynomially bounded costs and capacities we obtain a randomized method which solves the problem in Õ(m + n^{1.5}) time. This improves upon the previous best runtime of Õ(m √n) [Lee–Sidford’14] and, in the special case of unit-capacity maximum flow, improves upon the previous best runtimes of m^{4/3 + o(1)} [Liu–Sidford’20, Kathuria’20] and Õ(m √n) [Lee–Sidford’14] for sufficiently dense graphs. In the case of ℓ_{1}-regression in a matrix with n columns and m rows we obtain a randomized method which computes an є-approximate solution in Õ(mn + n^{2.5}) time. This yields a randomized method which computes an є-optimal policy of a discounted Markov Decision Process with S states and A actions per state in time Õ(S^{2} A + S^{2.5}). These methods improve upon the previous best runtimes of methods which depend polylogarithmically on problem parameters, which were Õ(mn^{1.5}) [Lee–Sidford’15] and Õ(S^{2.5} A) [Lee–Sidford’14, Sidford–Wang–Wu–Ye’18] respectively. To obtain this result we introduce two new algorithmic tools of possible independent interest. First, we design a new general interior point method for solving linear programs with two-sided constraints which combines techniques from [Lee–Song–Zhang’19, Brand et al.’20] to obtain a robust stochastic method with iteration count nearly the square root of the smaller dimension. 
Second, to implement this method we provide dynamic data structures for efficiently maintaining approximations to variants of Lewis weights, a fundamental importance measure for matrices which generalizes leverage scores and effective resistances. @InProceedings{STOC21p859, author = {Jan van den Brand and Yin Tat Lee and Yang P. Liu and Thatchaphol Saranurak and Aaron Sidford and Zhao Song and Di Wang}, title = {Minimum Cost Flows, MDPs, and ℓ<sub>1</sub>-Regression in Nearly Linear Time for Dense Instances}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {859--869}, doi = {10.1145/3406325.3451108}, year = {2021}, } Publisher's Version STOC '21: "Reducing Isotropy and Volume ..." Reducing Isotropy and Volume to KLS: An O*(n^{3}ψ^{2}) Volume Algorithm He Jia, Aditi Laddha, Yin Tat Lee, and Santosh Vempala (Georgia Institute of Technology, USA; University of Washington, USA; Microsoft Research, USA) We show that the volume of a convex body in R^{n} in the general membership oracle model can be computed to within relative error ε using O(n^{3}ψ^{2}/ε^{2}) oracle queries, where ψ is the KLS constant. With the current bound of ψ=O(n^{o(1)}), this gives an O(n^{3+o(1)}/ε^{2}) algorithm, the first improvement on the Lovász–Vempala O(n^{4}/ε^{2}) algorithm from 2003. The main new ingredient is an O(n^{3}ψ^{2}) algorithm for isotropic transformation, following which we can apply the O(n^{3}/ε^{2}) volume algorithm of Cousins and Vempala for well-rounded convex bodies. A positive resolution of the KLS conjecture would imply an O(n^{3}/ε^{2}) volume algorithm. We also give an efficient implementation of the new algorithm for convex polytopes defined by m inequalities in R^{n}: polytope volume can be estimated in time O(mn^{c}/ε^{2}) where c<3.2 depends on the current matrix multiplication exponent and improves on the previous best bound. 
@InProceedings{STOC21p961, author = {He Jia and Aditi Laddha and Yin Tat Lee and Santosh Vempala}, title = {Reducing Isotropy and Volume to KLS: An <i>O</i>*(<i>n</i><sup>3</sup><i>ψ</i><sup>2</sup>) Volume Algorithm}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {961--974}, doi = {10.1145/3406325.3451018}, year = {2021}, } Publisher's Version STOC '21: "A Nearly-Linear Time Algorithm ..." A Nearly-Linear Time Algorithm for Linear Programs with Small Treewidth: A Multiscale Representation of Robust Central Path Sally Dong, Yin Tat Lee, and Guanghao Ye (University of Washington, USA; Microsoft Research, USA) Arising from structural graph theory, treewidth has become a focus of study in fixed-parameter tractable algorithms. Many NP-hard problems are known to be solvable in O(n · 2^{O(τ)}) time, where τ is the treewidth of the input graph. Analogously, many problems in P should be solvable in O(n · τ^{O(1)}) time; however, due to the lack of appropriate tools, only a few such results are currently known. In our paper, we show this holds for linear programs: Given a linear program of the form min_{Ax=b,ℓ ≤ x≤ u} c^{⊤} x whose dual graph G_{A} has treewidth τ, and a corresponding width-τ tree decomposition, we show how to solve it in time O(n · τ^{2} log(1/ε)), where n is the number of variables and ε is the relative accuracy. When a tree decomposition is not given, we use existing techniques in vertex separators to obtain algorithms with O(n · τ^{4} log(1/ε)) and O(n · τ^{2} log(1/ε) + n^{1.5}) runtimes. Besides being the first of its kind, our algorithm has runtime nearly matching the fastest runtime for solving the subproblem Ax=b (under the assumption that no fast matrix multiplication is used). We obtain these results by combining recent techniques in interior-point methods (IPMs), sketching, and a novel representation of the solution under a multiscale basis similar to the wavelet basis. 
This representation further yields the first IPM with o(rank(A)) time per iteration when the treewidth is small. @InProceedings{STOC21p1784, author = {Sally Dong and Yin Tat Lee and Guanghao Ye}, title = {A Nearly-Linear Time Algorithm for Linear Programs with Small Treewidth: A Multiscale Representation of Robust Central Path}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1784--1797}, doi = {10.1145/3406325.3451056}, year = {2021}, } Publisher's Version 

Leme, Renato Paes 
STOC '21: "Combinatorial Bernoulli Factories: ..."
Combinatorial Bernoulli Factories: Matchings, Flows, and Other Polytopes
Rad Niazadeh, Renato Paes Leme, and Jon Schneider (University of Chicago, USA; Google Research, USA) A Bernoulli factory is an algorithmic procedure for exact sampling of certain random variables having only Bernoulli access to their parameters. Bernoulli access to a parameter p ∈ [0,1] means the algorithm does not know p, but has sample access to independent draws of a Bernoulli random variable with mean equal to p. In this paper, we study the problem of Bernoulli factories for polytopes: given Bernoulli access to a vector x∈ P for a given polytope P⊂ [0,1]^{n}, output a randomized vertex such that the expected value of the ith coordinate is exactly equal to x_{i}. For example, for the special case of the perfect matching polytope, one is given Bernoulli access to the entries of a doubly stochastic matrix [x_{ij}] and asked to sample a matching such that the probability of each edge (i,j) being present in the matching is exactly equal to x_{ij}. We show that a polytope P admits a Bernoulli factory if and only if P is the intersection of [0,1]^{n} with an affine subspace. Our construction is based on an algebraic formulation of the problem, involving identifying a family of Bernstein polynomials (one per vertex) that satisfy a certain algebraic identity on P. The main technical tool behind our construction is a connection between these polynomials and the geometry of zonotope tilings. We apply these results to construct an explicit factory for the perfect matching polytope. The resulting factory is deeply connected to the combinatorial enumeration of arborescences and may be of independent interest. For the k-uniform matroid polytope, we recover a sampling procedure known in statistics as Sampford sampling. 
@InProceedings{STOC21p833, author = {Rad Niazadeh and Renato Paes Leme and Jon Schneider}, title = {Combinatorial Bernoulli Factories: Matchings, Flows, and Other Polytopes}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {833--846}, doi = {10.1145/3406325.3451072}, year = {2021}, } Publisher's Version 
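For intuition about the Bernoulli access model in the abstract above, here is the textbook one-dimensional warm-up (my own toy example, not the paper's polytope construction): with coin flips of unknown bias p, two flips produce an exact Bernoulli(p²) sample, i.e. a Bernoulli factory for f(p) = p².

```python
import random

def p_squared_factory(coin):
    # Exact Bernoulli factory for f(p) = p^2: two independent p-coins
    # are both heads with probability exactly p * p, without the
    # algorithm ever learning the numeric value of p.
    return coin() and coin()

random.seed(0)
p = 0.5  # hidden from the factory; only `coin` is exposed
coin = lambda: random.random() < p
trials = 20000
freq = sum(p_squared_factory(coin) for _ in range(trials)) / trials
# freq concentrates around p^2 = 0.25
```
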

Leonardi, Stefano 
STOC '21: "Efficient Two-Sided Markets ..."
Efficient Two-Sided Markets with Limited Information
Paul Dütting, Federico Fusco, Philip Lazos, Stefano Leonardi, and Rebecca Reiffenhäuser (Google Research, Switzerland; Sapienza University of Rome, Italy) A celebrated impossibility result by Myerson and Satterthwaite (1983) shows that any truthful mechanism for two-sided markets that maximizes social welfare must run a deficit, resulting in a necessity to relax welfare efficiency and to use approximation mechanisms. Such mechanisms in general make extensive use of Bayesian priors. In this work, we investigate a question of increasing theoretical and practical importance: how much prior information is required to design mechanisms with near-optimal approximations? Our first contribution is a more general impossibility result stating that no meaningful approximation is possible without any prior information, expanding the famous impossibility result of Myerson and Satterthwaite. Our second contribution is that one single sample (one number per item), arguably a minimum-possible amount of prior information, from each seller distribution is sufficient for a large class of two-sided markets. We prove matching upper and lower bounds on the best approximation that can be obtained with one single sample for subadditive buyers and additive sellers, regardless of computational considerations. Our third contribution is the design of computationally efficient black-box reductions that turn any one-sided mechanism into a two-sided mechanism with a small loss in the approximation, while using only one single sample from each seller. On the way, our black-box-type mechanisms deliver several interesting positive results in their own right, often beating even the state of the art that uses full prior information. 
@InProceedings{STOC21p1452, author = {Paul Dütting and Federico Fusco and Philip Lazos and Stefano Leonardi and Rebecca Reiffenhäuser}, title = {Efficient Two-Sided Markets with Limited Information}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1452--1465}, doi = {10.1145/3406325.3451076}, year = {2021}, } Publisher's Version STOC '21: "Flow Time Scheduling with ..." Flow Time Scheduling with Uncertain Processing Time Yossi Azar, Stefano Leonardi, and Noam Touitou (Tel Aviv University, Israel; Sapienza University of Rome, Italy) We consider the problem of online scheduling on a single machine in order to minimize weighted flow time. The existing algorithms for this problem (STOC ’01, SODA ’03, FOCS ’18) all require exact knowledge of the processing time of each job. This assumption is crucial, as even a slight perturbation of the processing time would lead to a polynomial competitive ratio. However, this assumption very rarely holds in real-life scenarios. In this paper, we present the first algorithm for weighted flow time which does not require exact knowledge of the processing times of jobs. Specifically, we introduce the Scheduling with Predicted Processing Time (SPPT) problem, where the algorithm is given a prediction for the processing time of each job, instead of its real processing time. For the case of a constant-factor distortion between the predictions and the real processing times, our algorithms match all the best known competitiveness bounds for weighted flow time – namely O(log P), O(log D) and O(log W), where P, D, W are the maximum ratios of processing times, densities, and weights, respectively. For larger errors, the competitiveness of our algorithms degrades gracefully. 
@InProceedings{STOC21p1070, author = {Yossi Azar and Stefano Leonardi and Noam Touitou}, title = {Flow Time Scheduling with Uncertain Processing Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1070--1080}, doi = {10.1145/3406325.3451023}, year = {2021}, } Publisher's Version 

Levin, Leonid A. 
STOC '21: "Climbing Algorithms (Invited ..."
Climbing Algorithms (Invited Talk)
Leonid A. Levin (Boston University, USA) NP (search) problems allow easy correctness tests for solutions. Climbing algorithms also allow easy assessment of how close the configuration at any stage of their run is to yielding the correct answer. This offers great flexibility, since the merit of any deviation from the standard procedure can be instantly assessed. An example is the Dual Matrix Algorithm (DMA) for linear programming, variations of which were considered by A. Y. Levin in 1965 and by Yamnitsky and myself in 1982. It has little sensitivity to numerical errors and to the number of inequalities. It offers substantial flexibility and, thus, potential for further developments. @InProceedings{STOC21p2, author = {Leonid A. Levin}, title = {Climbing Algorithms (Invited Talk)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {2--3}, doi = {10.1145/3406325.3457137}, year = {2021}, } Publisher's Version 

Li, Jason 
STOC '21: "Deterministic Mincut in Almost-Linear ..."
Deterministic Mincut in Almost-Linear Time
Jason Li (Carnegie Mellon University, USA) We present a deterministic (global) mincut algorithm for weighted, undirected graphs that runs in m^{1+o(1)} time, answering an open question of Karger from the 1990s. To obtain our result, we derandomize the construction of the skeleton graph in Karger’s near-linear time mincut algorithm, which is its only randomized component. In particular, we partially derandomize the well-known Benczúr–Karger graph sparsification technique by random sampling, which we accomplish by the method of pessimistic estimators. Our main technical component is designing an efficient pessimistic estimator to capture the cuts of a graph, which involves harnessing the expander decomposition framework introduced in recent work by Goranci et al. (SODA 2021). As a side effect, we obtain a structural representation of all approximate mincuts in a graph, which may have future applications. @InProceedings{STOC21p384, author = {Jason Li}, title = {Deterministic Mincut in Almost-Linear Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {384--395}, doi = {10.1145/3406325.3451114}, year = {2021}, } Publisher's Version STOC '21: "A Quasipolynomial (2 + ε)-Approximation ..." A Quasipolynomial (2 + ε)-Approximation for Planar Sparsest Cut Vincent Cohen-Addad, Anupam Gupta, Philip N. Klein, and Jason Li (Google, Switzerland; Carnegie Mellon University, USA; Brown University, USA) The (non-uniform) sparsest cut problem is the following graph-partitioning problem: given a “supply” graph, and demands on pairs of vertices, delete some subset of supply edges to minimize the ratio of the supply edges cut to the total demand of the pairs separated by this deletion. Despite much effort, there are only a handful of non-trivial classes of supply graphs for which constant-factor approximations are known. We consider the problem for planar graphs, and give a (2+ε)-approximation algorithm that runs in quasipolynomial time. 
Our approach defines a new structural decomposition of an optimal solution using a “patching” primitive. We combine this decomposition with a Sherali–Adams-style linear programming relaxation of the problem, which we then round. This should be compared with the polynomial-time approximation algorithm of Rao (1999), which uses the metric linear programming relaxation and ℓ_{1}-embeddings, and achieves an O(√(log n))-approximation in polynomial time. @InProceedings{STOC21p1056, author = {Vincent Cohen-Addad and Anupam Gupta and Philip N. Klein and Jason Li}, title = {A Quasipolynomial (2 + <i>ε</i>)-Approximation for Planar Sparsest Cut}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1056--1069}, doi = {10.1145/3406325.3451103}, year = {2021}, } Publisher's Version STOC '21: "Approximate Gomory–Hu Tree ..." Approximate Gomory–Hu Tree Is Faster Than n – 1 Max-Flows Jason Li and Debmalya Panigrahi (Carnegie Mellon University, USA; Duke University, USA) The Gomory–Hu tree or cut tree (Gomory and Hu, 1961) is a classic data structure for reporting s−t mincuts (and by duality, the values of s−t maxflows) for all pairs of vertices s and t in an undirected graph. Gomory and Hu showed that it can be computed using n−1 exact maxflow computations. Surprisingly, this remains the best algorithm for Gomory–Hu trees more than 50 years later, even for approximate mincuts. In this paper, we break this longstanding barrier and give an algorithm for computing a (1+є)-approximate Gomory–Hu tree using log(n) maxflow computations. Specifically, we obtain the runtime bounds we describe below. We obtain a randomized (Monte Carlo) algorithm for undirected, weighted graphs that runs in Õ(m + n^{3/2}) time and returns a (1+є)-approximate Gomory–Hu tree whp. Previously, the best running time known was Õ(n^{5/2}), which is obtained by running Gomory and Hu’s original algorithm on a cut sparsifier of the graph. 
Next, we obtain a randomized (Monte Carlo) algorithm for undirected, unweighted graphs that runs in m^{4/3+o(1)} time and returns a (1+є)-approximate Gomory–Hu tree whp. This improves on our first result for sparse graphs, namely m = o(n^{9/8}). Previously, the best running time known for unweighted graphs was Õ(mn) for an exact Gomory–Hu tree (Bhalgat et al., STOC 2007); no better result was known if approximations are allowed. As a consequence of our Gomory–Hu tree algorithms, we also solve the (1+є)-approximate all-pairs mincut and single-source mincut problems in the same time bounds. (These problems are simpler in that the goal is to only return the s−t mincut values, and not the mincuts.) This improves on the recent algorithm for these problems in Õ(n^{2}) time due to Abboud et al. (FOCS 2020). @InProceedings{STOC21p1738, author = {Jason Li and Debmalya Panigrahi}, title = {Approximate Gomory–Hu Tree Is Faster Than <i>n</i> – 1 Max-Flows}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1738--1748}, doi = {10.1145/3406325.3451112}, year = {2021}, } Publisher's Version STOC '21: "Vertex Connectivity in Polylogarithmic ..." Vertex Connectivity in Polylogarithmic Max-Flows Jason Li, Danupon Nanongkai, Debmalya Panigrahi, Thatchaphol Saranurak, and Sorrachai Yingchareonthawornchai (Carnegie Mellon University, USA; University of Copenhagen, Denmark; KTH, Sweden; Duke University, USA; University of Michigan, USA; Aalto University, Finland) The vertex connectivity of an m-edge n-vertex undirected graph is the smallest number of vertices whose removal disconnects the graph, or leaves only a singleton vertex. In this paper, we give a reduction from the vertex connectivity problem to a set of maxflow instances. Using this reduction, we can solve vertex connectivity in Õ(m^{α}) time for any α ≥ 1, if there is an m^{α}-time maxflow algorithm. 
Using the current best maxflow algorithm that runs in m^{4/3+o(1)} time (Kathuria, Liu and Sidford, FOCS 2020), this yields an m^{4/3+o(1)}-time vertex connectivity algorithm. This is the first improvement in the running time of the vertex connectivity problem in over 20 years, the previous best being an Õ(mn)-time algorithm due to Henzinger, Rao, and Gabow (FOCS 1996). Indeed, no algorithm with an o(mn) running time was known before our work, even if we assume an Õ(m)-time maxflow algorithm. Our new technique is robust enough to also improve the best Õ(mn)-time bound for directed vertex connectivity to mn^{1−1/12+o(1)} time. @InProceedings{STOC21p317, author = {Jason Li and Danupon Nanongkai and Debmalya Panigrahi and Thatchaphol Saranurak and Sorrachai Yingchareonthawornchai}, title = {Vertex Connectivity in Polylogarithmic Max-Flows}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {317--329}, doi = {10.1145/3406325.3451088}, year = {2021}, } Publisher's Version 
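The Gomory–Hu entry above builds on the classical fact that n−1 max-flow calls suffice to answer all-pairs min-cut queries. A compact sketch of Gusfield's variant of the construction (an equivalent-flow tree; Edmonds–Karp max-flow on an adjacency matrix, with a toy graph of my own choosing in the usage note):

```python
from collections import deque

def max_flow(cap, s, t):
    # Edmonds-Karp; returns (flow value, BFS parent array marking the
    # source side of a minimum s-t cut in the final residual network).
    n = len(cap)
    res = [row[:] for row in cap]
    flow = 0
    while True:
        par = [-1] * n
        par[s] = s
        q = deque([s])
        while q:
            u = q.popleft()
            for v in range(n):
                if par[v] == -1 and res[u][v] > 0:
                    par[v] = u
                    q.append(v)
        if par[t] == -1:
            return flow, par
        aug, v = float('inf'), t
        while v != s:
            aug = min(aug, res[par[v]][v])
            v = par[v]
        v = t
        while v != s:
            res[par[v]][v] -= aug
            res[v][par[v]] += aug
            v = par[v]
        flow += aug

def gomory_hu(cap):
    # Gusfield's equivalent-flow tree: n-1 max-flow calls; returns
    # (parent, weight) arrays describing a tree rooted at vertex 0
    # that encodes all pairwise min-cut values.
    n = len(cap)
    parent, weight = [0] * n, [0] * n
    for i in range(1, n):
        f, par = max_flow(cap, i, parent[i])
        weight[i] = f
        for j in range(i + 1, n):
            # Re-hang j below i if j shares i's tree parent and lies
            # on i's side of the cut just computed.
            if par[j] != -1 and parent[j] == parent[i]:
                parent[j] = i
    return parent, weight

def tree_mincut(parent, weight, s, t):
    # s-t min-cut value = minimum edge weight on the s-t tree path.
    best = {s: float('inf')}
    v, w = s, float('inf')
    while v != 0:
        w = min(w, weight[v])
        v = parent[v]
        best[v] = w
    v, w = t, float('inf')
    while v not in best:
        w = min(w, weight[v])
        v = parent[v]
    return min(w, best[v])
```

On the 4-vertex graph with edges 0-1:3, 0-2:1, 1-2:1, 1-3:2, 2-3:4, only three max-flows are run, yet tree_mincut then matches a direct s-t max-flow computation for every pair (s, t).
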

Li, Ray 
STOC '21: "Settling SETH vs. Approximate ..."
Settling SETH vs. Approximate Sparse Directed Unweighted Diameter (up to (NU)NSETH)
Ray Li (Stanford University, USA) We prove several tight results on the fine-grained complexity of approximating the diameter of a graph. First, we prove that, for any ε>0, assuming the Strong Exponential Time Hypothesis (SETH), there are no near-linear time (2−ε)-approximation algorithms for the Diameter of a sparse directed graph, even in unweighted graphs. This result shows that a simple near-linear time 2-approximation algorithm for Diameter is optimal under SETH, answering a question from a survey of Rubinstein and Vassilevska Williams (SIGACT ’19) for the case of directed graphs. In the same survey, Rubinstein and Vassilevska Williams also asked if it is possible to show that there are no (2−ε)-approximation algorithms for Diameter in a directed graph in O(n^{1.499}) time. We show that, assuming a hypothesis called NSETH, one cannot use a deterministic SETH-based reduction to rule out the existence of such algorithms. Extending the techniques in these two results, we characterize whether a (2−ε)-approximation algorithm running in time O(n^{1+δ}) for the Diameter of a sparse directed unweighted graph can be ruled out by a deterministic SETH-based reduction for every δ∈(0,1) and essentially every ε∈(0,1), assuming NSETH. This settles the SETH-hardness of approximating the diameter of sparse directed unweighted graphs for deterministic reductions, up to NSETH. We make the same characterization for randomized SETH-based reductions, assuming another hypothesis called NUNSETH. We prove additional hardness and non-reducibility results for undirected graphs. @InProceedings{STOC21p1684, author = {Ray Li}, title = {Settling SETH vs. Approximate Sparse Directed Unweighted Diameter (up to (NU)NSETH)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1684-1696}, doi = {10.1145/3406325.3451045}, year = {2021}, } Publisher's Version
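For intuition, the "simple near-linear time 2-approximation" mentioned above is, in its undirected form, a single BFS: for any vertex v, ecc(v) ≤ diameter ≤ 2·ecc(v) by the triangle inequality. A minimal sketch (the example graph is made up; the directed variant additionally needs a backward BFS):

```python
from collections import deque

def bfs_ecc(adj, src):
    """Eccentricity of src in an unweighted graph, via one BFS."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return max(dist.values())

def approx_diameter(adj):
    """For any v, ecc(v) <= diameter <= 2*ecc(v) by the triangle
    inequality, so one BFS gives a 2-approximation in linear time."""
    return bfs_ecc(adj, next(iter(adj)))

# Path graph 0-1-2-3-4: true diameter 4.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 4] for i in range(5)}
print(approx_diameter(path))  # 4 here, since BFS starts at an endpoint
```

The paper's lower bound says that beating factor 2 in near-linear time on sparse directed graphs would refute SETH.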

Li, Yingkai 
STOC '21: "Revelation Gap for Pricing ..."
Revelation Gap for Pricing from Samples
Yiding Feng, Jason D. Hartline, and Yingkai Li (Northwestern University, USA) This paper considers prior-independent mechanism design, in which a single mechanism is designed to achieve approximately optimal performance on every prior distribution from a given class. Most results in this literature focus on mechanisms with truth-telling equilibria, a.k.a. truthful mechanisms. Feng and Hartline [FOCS 2018] introduce the revelation gap to quantify the loss of the restriction to truthful mechanisms. We solve a main open question left in Feng and Hartline [FOCS 2018]; namely, we identify a non-trivial revelation gap for revenue maximization. Our analysis focuses on the canonical problem of selling a single item to a single agent with only access to a single sample from the agent's valuation distribution. We identify the sample-bid mechanism (a simple non-truthful mechanism) and upper-bound its prior-independent approximation ratio by 1.835 (resp. 1.296) for regular (resp. MHR) distributions. We further prove that no truthful mechanism can achieve a prior-independent approximation ratio better than 1.957 (resp. 1.543) for regular (resp. MHR) distributions. Thus, a non-trivial revelation gap is shown, as the sample-bid mechanism outperforms the optimal prior-independent truthful mechanism. On the hardness side, we prove that no (possibly non-truthful) mechanism can achieve a prior-independent approximation ratio better than 1.073 even for uniform distributions. @InProceedings{STOC21p1438, author = {Yiding Feng and Jason D. Hartline and Yingkai Li}, title = {Revelation Gap for Pricing from Samples}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1438-1451}, doi = {10.1145/3406325.3451057}, year = {2021}, } Publisher's Version
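The paper's sample-bid mechanism is not reproduced here, but the flavor of pricing from one sample can be seen in the classic single-sample baseline, which simply posts the observed sample as the price. The Monte Carlo sketch below is a toy for uniform [0,1] valuations only (the numbers here are not the ratios in the abstract): it recovers the known expected revenue of 1/6 against an optimal monopoly revenue of 1/4.

```python
import random

random.seed(0)

# Hypothetical baseline (not the paper's sample-bid mechanism): post
# the single observed sample as the price. For v ~ Uniform[0,1] the
# expected revenue of posting price s is s*(1-s); averaging over
# s ~ Uniform[0,1] gives 1/6, versus the optimal monopoly revenue of
# 1/4 at price 1/2.
TRIALS = 200_000
revenue = 0.0
for _ in range(TRIALS):
    sample = random.random()  # the one sample from the prior
    value = random.random()   # the buyer's fresh valuation
    if value >= sample:
        revenue += sample
avg = revenue / TRIALS
print(avg)         # close to 1/6
print(0.25 / avg)  # approximation ratio close to 1.5
```

The paper's point is that smarter non-truthful use of the sample beats what any truthful mechanism can guarantee prior-independently.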

Ligett, Katrina 
STOC '21: "A New Analysis of Differential ..."
A New Analysis of Differential Privacy’s Generalization Guarantees (Invited Paper)
Christopher Jung, Katrina Ligett, Seth Neel, Aaron Roth, Saeed Sharifi-Malvajerdi, and Moshe Shenfeld (University of Pennsylvania, USA; Hebrew University of Jerusalem, Israel) We give a new proof of the "transfer theorem" underlying adaptive data analysis: that any mechanism for answering adaptively chosen statistical queries that is differentially private and sample-accurate is also accurate out-of-sample. Our new proof is elementary and gives structural insights that we expect will be useful elsewhere. We show: 1) that differential privacy ensures that the expectation of any query on the conditional distribution on datasets induced by the transcript of the interaction is close to its true value on the data distribution, and 2) that sample accuracy on its own ensures that any query answer produced by the mechanism is close to its conditional expectation with high probability. This second claim follows from a thought experiment in which we imagine that the dataset is resampled from the conditional distribution after the mechanism has committed to its answers. The transfer theorem then follows by summing these two bounds. An upshot of our new proof technique is that the concrete bounds we obtain are substantially better than the best previously known bounds. @InProceedings{STOC21p9, author = {Christopher Jung and Katrina Ligett and Seth Neel and Aaron Roth and Saeed Sharifi-Malvajerdi and Moshe Shenfeld}, title = {A New Analysis of Differential Privacy’s Generalization Guarantees (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {9-9}, doi = {10.1145/3406325.3465358}, year = {2021}, } Publisher's Version

Lin, Bingkai 
STOC '21: "Constant Approximating k-Clique ..."
Constant Approximating k-Clique Is W[1]-Hard
Bingkai Lin (Nanjing University, China) For every graph G, let ω(G) be the largest size of a complete subgraph in G. This paper presents a simple algorithm which, on input a graph G, a positive integer k, and a small constant ε>0, outputs a graph G′ and an integer k′ in 2^{Θ(k^{5})}·|G|^{O(1)} time such that (1) k′ ≤ 2^{Θ(k^{5})}, (2) if ω(G) ≥ k, then ω(G′) ≥ k′, and (3) if ω(G) < k, then ω(G′) < (1−ε)k′. This implies that no f(k)·|G|^{O(1)}-time algorithm can distinguish between the cases ω(G) ≥ k and ω(G) < k/c for any constant c ≥ 1 and computable function f, unless FPT = W[1]. @InProceedings{STOC21p1749, author = {Bingkai Lin}, title = {Constant Approximating k-Clique Is W[1]-Hard}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1749-1756}, doi = {10.1145/3406325.3451016}, year = {2021}, } Publisher's Version

Lin, Huijia 
STOC '21: "Indistinguishability Obfuscation ..."
Indistinguishability Obfuscation from Well-Founded Assumptions
Aayush Jain, Huijia Lin, and Amit Sahai (University of California at Los Angeles, USA; University of Washington, USA) Indistinguishability obfuscation, introduced by [Barak et al., CRYPTO 2001], aims to compile programs into unintelligible ones while preserving functionality. It is a fascinating and powerful object that has been shown to enable a host of new cryptographic goals and beyond. However, constructions of indistinguishability obfuscation have remained elusive, with all other proposals relying on heuristics or newly conjectured hardness assumptions. In this work, we show how to construct indistinguishability obfuscation from subexponential hardness of four well-founded assumptions. We prove: Informal Theorem: Let τ ∈ (0,∞), δ ∈ (0,1), ε ∈ (0,1) be arbitrary constants. Assume subexponential security of the following assumptions: (1) the Learning With Errors (LWE) assumption with subexponential modulus-to-noise ratio 2^{k^{ε}} and noises of magnitude polynomial in k, where k is the dimension of the LWE secret; (2) the Learning Parity with Noise (LPN) assumption over general prime fields ℤ_{p} with polynomially many LPN samples and error rate 1/ℓ^{δ}, where ℓ is the dimension of the LPN secret; (3) the existence of a Boolean Pseudo-Random Generator (PRG) in NC^{0} with stretch n^{1+τ}, where n is the length of the PRG seed; (4) the Decision Linear (DLIN) assumption on symmetric bilinear groups of prime order. Then, (subexponentially secure) indistinguishability obfuscation for all polynomial-size circuits exists. Further, assuming only polynomial security of the aforementioned assumptions, there exists collusion-resistant public-key functional encryption for all polynomial-size circuits. @InProceedings{STOC21p60, author = {Aayush Jain and Huijia Lin and Amit Sahai}, title = {Indistinguishability Obfuscation from Well-Founded Assumptions}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {60-73}, doi = {10.1145/3406325.3451093}, year = {2021}, } Publisher's Version

Liu, Allen 
STOC '21: "Settling the Robust Learnability ..."
Settling the Robust Learnability of Mixtures of Gaussians
Allen Liu and Ankur Moitra (Massachusetts Institute of Technology, USA) This work represents a natural coalescence of two important lines of work – learning mixtures of Gaussians and algorithmic robust statistics. In particular, we give the first provably robust algorithm for learning mixtures of any constant number of Gaussians. We require only mild assumptions on the mixing weights (bounded fractionality) and that the total variation distance between components is bounded away from zero. At the heart of our algorithm is a new method for proving dimension-independent polynomial identifiability through applying a carefully chosen sequence of differential operations to certain generating functions that not only encode the parameters we would like to learn but also the system of polynomial equations we would like to solve. We show how the symbolic identities we derive can be directly used to analyze a natural sum-of-squares relaxation. @InProceedings{STOC21p518, author = {Allen Liu and Ankur Moitra}, title = {Settling the Robust Learnability of Mixtures of Gaussians}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {518-531}, doi = {10.1145/3406325.3451084}, year = {2021}, } Publisher's Version

Liu, Kuikui 
STOC '21: "Optimal Mixing of Glauber ..."
Optimal Mixing of Glauber Dynamics: Entropy Factorization via High-Dimensional Expansion
Zongchen Chen, Kuikui Liu, and Eric Vigoda (Georgia Institute of Technology, USA; University of Washington, USA) We prove an optimal mixing time bound for the single-site update Markov chain known as the Glauber dynamics or Gibbs sampling in a variety of settings. Our work presents an improved version of the spectral independence approach of Anari et al. (2020) and shows O(n log n) mixing time on any n-vertex graph of bounded degree when the maximum eigenvalue of an associated influence matrix is bounded. As an application of our results, for the hard-core model on independent sets weighted by a fugacity λ, we establish O(n log n) mixing time for the Glauber dynamics on any n-vertex graph of constant maximum degree Δ when λ < λ_{c}(Δ), where λ_{c}(Δ) is the critical point for the uniqueness/non-uniqueness phase transition on the Δ-regular tree. More generally, for any antiferromagnetic 2-spin system we prove O(n log n) mixing time of the Glauber dynamics on any bounded-degree graph in the corresponding tree uniqueness region. Our results apply more broadly; for example, we also obtain O(n log n) mixing for q-colorings of triangle-free graphs of maximum degree Δ when the number of colors satisfies q > αΔ, where α ≈ 1.763, and O(m log n) mixing for generating random matchings of any graph with bounded degree and m edges. Our approach is based on two steps. First, we show that the approximate tensorization of entropy (i.e., factorizing entropy into single vertices), which is a key step for establishing the modified log-Sobolev inequality in many previous works, can be deduced from entropy factorization into blocks of fixed linear size. Second, we adapt the local-to-global scheme of Alev and Lau (2020) to establish such block factorization of entropy in a more general setting of pure weighted simplicial complexes satisfying local spectral expansion; this also substantially generalizes the result of Cryan et al. (2019).
@InProceedings{STOC21p1537, author = {Zongchen Chen and Kuikui Liu and Eric Vigoda}, title = {Optimal Mixing of Glauber Dynamics: Entropy Factorization via High-Dimensional Expansion}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1537-1550}, doi = {10.1145/3406325.3451035}, year = {2021}, } Publisher's Version STOC '21: "Log-Concave Polynomials IV: ..." Log-Concave Polynomials IV: Approximate Exchange, Tight Mixing Times, and Near-Optimal Sampling of Forests Nima Anari, Kuikui Liu, Shayan Oveis Gharan, Cynthia Vinzant, and Thuy-Duong Vuong (Stanford University, USA; University of Washington, USA; North Carolina State University, USA) We prove tight mixing time bounds for natural random walks on bases of matroids, determinantal distributions, and more generally distributions associated with log-concave polynomials. For a matroid of rank k on a ground set of n elements, or more generally distributions associated with log-concave polynomials of homogeneous degree k on n variables, we show that the down-up random walk, started from an arbitrary point in the support, mixes in time O(k log k). Our bound has no dependence on n or the starting point, unlike the previous analyses of Anari et al. (STOC 2019) and Cryan et al. (FOCS 2019), and is tight up to constant factors. The main new ingredient is a property we call approximate exchange, a generalization of well-studied exchange properties for matroids and valuated matroids, which may be of independent interest. In particular, given a distribution µ over size-k subsets of [n], our approximate exchange property implies that a simple local search algorithm gives a k^{O(k)}-approximation of max_{S} µ(S) when µ is generated by a log-concave polynomial, and that greedy gives the same approximation ratio when µ is strongly Rayleigh. As an application, we show how to leverage down-up random walks to approximately sample random forests or random spanning trees in a graph with n edges in time O(n log^{2} n).
The best known result for sampling random forests was an FPAUS with high polynomial runtime recently found by Anari et al. (STOC 2019) and Cryan et al. (FOCS 2019). For spanning trees, we improve on the almost-linear time algorithm by Schild (STOC 2018). Our analysis works on weighted graphs too, and is the first to achieve a nearly-linear running time for these problems. Our algorithms can be naturally extended to support approximately sampling from random forests of size between k_{1} and k_{2} in time O(n log^{2} n), for fixed parameters k_{1}, k_{2}. @InProceedings{STOC21p408, author = {Nima Anari and Kuikui Liu and Shayan Oveis Gharan and Cynthia Vinzant and Thuy-Duong Vuong}, title = {Log-Concave Polynomials IV: Approximate Exchange, Tight Mixing Times, and Near-Optimal Sampling of Forests}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {408-420}, doi = {10.1145/3406325.3451091}, year = {2021}, } Publisher's Version Info
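To make the chain in the first abstract above concrete, here is a minimal sketch of single-site Glauber dynamics for the hard-core model; the graph, fugacity, and step count are hypothetical choices for illustration, and no mixing-time claim is made for them.

```python
import random

random.seed(1)

def glauber_hardcore(adj, lam, steps, rng=random):
    """Single-site Glauber dynamics for the hard-core model at
    fugacity lam: pick a uniformly random vertex and resample its
    occupancy from the conditional distribution given its neighbors
    (occupied with probability lam/(1+lam) iff no neighbor is
    occupied), so the chain moves on independent sets."""
    config = {v: False for v in adj}  # start from the empty set
    vertices = list(adj)
    for _ in range(steps):
        v = rng.choice(vertices)
        if any(config[u] for u in adj[v]):
            config[v] = False  # v is blocked by an occupied neighbor
        else:
            config[v] = rng.random() < lam / (1 + lam)
    return config

# A 4-cycle with a small fugacity, well inside the uniqueness region.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
final = glauber_hardcore(cycle, lam=0.5, steps=10_000)
occupied = {v for v, occ in final.items() if occ}
# Every state visited is an independent set:
assert all(u not in cycle[v] for v in occupied for u in occupied)
print(occupied)
```

The paper's result is that, in the tree uniqueness region, this chain reaches (approximate) stationarity after O(n log n) such updates.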

Liu, Yang P. 
STOC '21: "Minimum Cost Flows, MDPs, ..."
Minimum Cost Flows, MDPs, and ℓ_{1}-Regression in Nearly Linear Time for Dense Instances
Jan van den Brand, Yin Tat Lee, Yang P. Liu, Thatchaphol Saranurak, Aaron Sidford, Zhao Song, and Di Wang (KTH, Sweden; University of Washington, USA; Microsoft Research, USA; Stanford University, USA; University of Michigan, USA; Princeton University, USA; Institute for Advanced Study at Princeton, USA; Google Research, USA) In this paper we provide new randomized algorithms with improved runtimes for solving linear programs with two-sided constraints. In the special case of the minimum cost flow problem on n-vertex m-edge graphs with integer polynomially bounded costs and capacities we obtain a randomized method which solves the problem in Õ(m + n^{1.5}) time. This improves upon the previous best runtime of Õ(m√n) [Lee-Sidford ’14] and, in the special case of unit-capacity maximum flow, improves upon the previous best runtimes of m^{4/3 + o(1)} [Liu-Sidford ’20, Kathuria ’20] and Õ(m√n) [Lee-Sidford ’14] for sufficiently dense graphs. In the case of ℓ_{1}-regression in a matrix with n columns and m rows we obtain a randomized method which computes an ε-approximate solution in Õ(mn + n^{2.5}) time. This yields a randomized method which computes an ε-optimal policy of a discounted Markov Decision Process with S states and A actions per state in time Õ(S^{2}A + S^{2.5}). These methods improve upon the previous best runtimes of methods which depend polylogarithmically on problem parameters, which were Õ(mn^{1.5}) [Lee-Sidford ’15] and Õ(S^{2.5}A) [Lee-Sidford ’14, Sidford-Wang-Wu-Ye ’18], respectively. To obtain this result we introduce two new algorithmic tools of possible independent interest. First, we design a new general interior point method for solving linear programs with two-sided constraints which combines techniques from [Lee-Song-Zhang ’19, Brand et al. ’20] to obtain a robust stochastic method with iteration count nearly the square root of the smaller dimension.
Second, to implement this method we provide dynamic data structures for efficiently maintaining approximations to variants of Lewis weights, a fundamental importance measure for matrices which generalize leverage scores and effective resistances. @InProceedings{STOC21p859, author = {Jan van den Brand and Yin Tat Lee and Yang P. Liu and Thatchaphol Saranurak and Aaron Sidford and Zhao Song and Di Wang}, title = {Minimum Cost Flows, MDPs, and ℓ<sub>1</sub>-Regression in Nearly Linear Time for Dense Instances}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {859-869}, doi = {10.1145/3406325.3451108}, year = {2021}, } Publisher's Version STOC '21: "Discrepancy Minimization via ..." Discrepancy Minimization via a Self-Balancing Walk Ryan Alweiss, Yang P. Liu, and Mehtaab Sawhney (Princeton University, USA; Stanford University, USA; Massachusetts Institute of Technology, USA) We study discrepancy minimization for vectors in ℝ^{n} under various settings. The main result is the analysis of a new simple random process in high dimensions through a comparison argument. As corollaries, we obtain bounds which are tight up to logarithmic factors for online vector balancing against oblivious adversaries, resolving several questions posed by Bansal, Jiang, Singla, and Sinha (STOC 2020), as well as a linear time algorithm for logarithmic bounds for the Komlós conjecture. @InProceedings{STOC21p14, author = {Ryan Alweiss and Yang P. Liu and Mehtaab Sawhney}, title = {Discrepancy Minimization via a Self-Balancing Walk}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {14-20}, doi = {10.1145/3406325.3450994}, year = {2021}, } Publisher's Version Video
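The "simple random process" of the second abstract admits a very short implementation: signs are chosen with a bias that pushes the running signed sum back toward the origin. Below is a hedged sketch; the sign rule is the published one, while the dimension, number of vectors, and scale choice c = 30·log(nT) are illustrative assumptions rather than the paper's exact parameters.

```python
import math
import random

random.seed(2)

def self_balancing_walk(vectors, c, rng=random):
    """Assign signs online so the running signed sum stays balanced:
    vector v gets sign +1 with probability 1/2 - <w, v>/(2c), where
    w is the current signed sum; the inner product is clipped to
    [-c, c] so this is always a valid probability."""
    w = [0.0] * len(vectors[0])
    signs = []
    for v in vectors:
        ip = sum(wi * vi for wi, vi in zip(w, v))
        ip = max(-c, min(c, ip))
        sign = 1 if rng.random() < 0.5 - ip / (2 * c) else -1
        signs.append(sign)
        w = [wi + sign * vi for wi, vi in zip(w, v)]
    return signs, w

# T random vectors in R^n with Euclidean norm at most 1.
n, T = 16, 2000
vecs = [[random.uniform(-1.0, 1.0) / math.sqrt(n) for _ in range(n)]
        for _ in range(T)]
signs, w = self_balancing_walk(vecs, c=30 * math.log(n * T))
print(max(abs(x) for x in w))  # stays small, far below the ~sqrt(T) of i.i.d. signs
```

The analysis in the paper shows the discrepancy stays polylogarithmic with high probability, even against an oblivious adversary choosing the vectors.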

Liu, Yanyi 
STOC '21: "Cryptography from SublinearTime ..."
Cryptography from Sublinear-Time Average-Case Hardness of Time-Bounded Kolmogorov Complexity
Yanyi Liu and Rafael Pass (Cornell University, USA) Let MK^{t}P[s] be the set of strings x such that K^{t}(x) ≤ s(|x|), where K^{t}(x) denotes the t-bounded Kolmogorov complexity of the truth-table described by x. Our main theorem shows that for an appropriate notion of mild average-case hardness, for every ε>0, polynomial t(n) ≥ (1+ε)n, and every “nice” class F of superpolynomial functions, the following are equivalent: (i) the existence of some function T ∈ F such that T-hard one-way functions (OWFs) exist (with non-uniform security); (ii) the existence of some function T ∈ F such that MK^{t}P[T^{−1}] is mildly average-case hard with respect to sublinear-time non-uniform algorithms (with running time n^{δ} for some 0<δ<1). For instance, the existence of subexponentially-hard (resp. quasi-polynomially-hard) OWFs is equivalent to mild average-case hardness of MK^{t}P[poly log n] (resp. MK^{t}P[2^{O(√log n)}]) w.r.t. sublinear-time non-uniform algorithms. We additionally note that if we want to deduce T-hard OWFs where security holds w.r.t. uniform T-time probabilistic attackers (i.e., uniformly-secure OWFs), it suffices to assume sublinear-time hardness of MK^{t}P w.r.t. uniform probabilistic sublinear-time attackers. We complement this result by proving lower bounds that come surprisingly close to what is required to unconditionally deduce the existence of (uniformly-secure) OWFs: MK^{t}P[poly log n] is worst-case hard w.r.t. uniform probabilistic sublinear-time algorithms, and MK^{t}P[n−log n] is mildly average-case hard for all O(t(n)/n^{3})-time deterministic algorithms. @InProceedings{STOC21p722, author = {Yanyi Liu and Rafael Pass}, title = {Cryptography from Sublinear-Time Average-Case Hardness of Time-Bounded Kolmogorov Complexity}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {722-735}, doi = {10.1145/3406325.3451121}, year = {2021}, } Publisher's Version

Lokshtanov, Daniel 
STOC '21: "Finding Large Induced Sparse ..."
Finding Large Induced Sparse Subgraphs in C_{>t}-Free Graphs in Quasipolynomial Time
Peter Gartland, Daniel Lokshtanov, Marcin Pilipczuk, Michał Pilipczuk, and Paweł Rzążewski (University of California at Santa Barbara, USA; University of Warsaw, Poland; Warsaw University of Technology, Poland) For an integer t, a graph G is called C_{>t}-free if G does not contain any induced cycle on more than t vertices. We prove the following statement: for every pair of integers d and t and a statement φ, there exists an algorithm that, given an n-vertex C_{>t}-free graph G with weights on vertices, finds in time n^{O(log^{3} n)} a maximum-weight vertex subset S such that G[S] has degeneracy at most d and satisfies φ. The running time can be improved to n^{O(log^{2} n)} assuming G is P_{t}-free, that is, G does not contain an induced path on t vertices. This expands the recent results of the authors [FOCS 2020 and SOSA 2021] on the Maximum Weight Independent Set problem on P_{t}-free graphs in two directions: by encompassing the more general setting of C_{>t}-free graphs, and by being applicable to a much wider variety of problems, such as Maximum Weight Induced Forest or Maximum Weight Induced Planar Graph. @InProceedings{STOC21p330, author = {Peter Gartland and Daniel Lokshtanov and Marcin Pilipczuk and Michał Pilipczuk and Paweł Rzążewski}, title = {Finding Large Induced Sparse Subgraphs in <i>C<sub>>t</sub></i>-Free Graphs in Quasipolynomial Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {330-341}, doi = {10.1145/3406325.3451034}, year = {2021}, } Publisher's Version

Lombardi, Alex 
STOC '21: "Fiat–Shamir via List-Recoverable ..."
Fiat–Shamir via List-Recoverable Codes (or: Parallel Repetition of GMW Is Not Zero-Knowledge)
Justin Holmgren, Alex Lombardi, and Ron D. Rothblum (NTT Research, USA; Massachusetts Institute of Technology, USA; Technion, Israel) In a seminal work, Goldreich, Micali, and Wigderson (CRYPTO ’86) demonstrated the wide applicability of zero-knowledge proofs by constructing such a proof system for the NP-complete problem of graph 3-coloring. A longstanding open question has been whether parallel repetition of their protocol preserves zero knowledge. In this work, we answer this question in the negative, assuming a standard cryptographic assumption (i.e., the hardness of learning with errors (LWE)). Leveraging a connection observed by Dwork, Naor, Reingold, and Stockmeyer (FOCS ’99), our negative result is obtained by making positive progress on a related fundamental problem in cryptography: securely instantiating the Fiat-Shamir heuristic for eliminating interaction in public-coin interactive protocols. A recent line of work has shown how to instantiate the heuristic securely, albeit only for a limited class of protocols. Our main result shows how to instantiate Fiat-Shamir for parallel repetitions of much more general interactive proofs. In particular, we construct hash functions that, assuming LWE, securely realize the Fiat-Shamir transform for the following rich classes of protocols: 1) The parallel repetition of any “commit-and-open” protocol (such as the GMW protocol mentioned above), when a specific (natural) commitment scheme is used. Commit-and-open protocols are a ubiquitous paradigm for constructing general-purpose public-coin zero-knowledge proofs. 2) The parallel repetition of any base protocol that (1) satisfies a stronger notion of soundness called round-by-round soundness, and (2) has an efficient procedure, using a suitable trapdoor, for recognizing “bad verifier randomness” that would allow the prover to cheat. Our results are obtained by establishing a new connection between the Fiat-Shamir transform and list-recoverable codes.
In contrast to the usual focus in coding theory, we focus on a parameter regime in which the input lists are extremely large, but the rate can be small. We give a (probabilistic) construction based on Parvaresh-Vardy codes (FOCS ’05) that suffices for our applications. @InProceedings{STOC21p750, author = {Justin Holmgren and Alex Lombardi and Ron D. Rothblum}, title = {Fiat–Shamir via List-Recoverable Codes (or: Parallel Repetition of GMW Is Not Zero-Knowledge)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {750-760}, doi = {10.1145/3406325.3451116}, year = {2021}, } Publisher's Version
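As background for the results above, the Fiat-Shamir transform itself is easy to state in code: replace the verifier's random challenge with a hash of the transcript. Here is a toy, insecure Schnorr-style sketch; the tiny parameters p = 23, q = 11, g = 2 are illustrative assumptions, and using SHA-256 as the hash is exactly the kind of heuristic instantiation whose soundness the paper's LWE-based hash families are designed to replace.

```python
import hashlib

# Toy, insecure parameters for illustration only: the subgroup of
# order q = 11 in Z_23^* generated by g = 2.
p, q, g = 23, 11, 2

def H(*parts):
    """Fiat-Shamir: derive the challenge by hashing the transcript."""
    data = "|".join(map(str, parts)).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def prove(x, r):
    """Non-interactive Schnorr-style proof of knowledge of x, where
    the statement is y = g^x mod p and r is the prover's nonce."""
    y = pow(g, x, p)
    a = pow(g, r, p)      # commitment (first sigma-protocol message)
    e = H(g, y, a)        # hashed challenge replaces the verifier
    z = (r + e * x) % q   # response
    return y, a, z

def verify(y, a, z):
    e = H(g, y, a)        # verifier recomputes the same challenge
    return pow(g, z, p) == (a * pow(y, e, p)) % p

y, a, z = prove(x=7, r=3)
print(verify(y, a, z))            # True
print(verify(y, a, (z + 1) % q))  # False: tampered response
```

Removing the interaction this way is what the paper's hash functions make sound for parallel-repeated commit-and-open protocols.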

Lovett, Shachar 
STOC '21: "Log-Rank and Lifting for AND-Functions ..."
Log-Rank and Lifting for AND-Functions
Alexander Knop, Shachar Lovett, Sam McGuire, and Weiqiang Yuan (University of California at San Diego, USA; Tsinghua University, China) Let f: {0, 1}^{n} → {0, 1} be a boolean function, and let f_{∧}(x, y) = f(x ∧ y) denote the AND-function of f, where x ∧ y denotes bitwise AND. We study the deterministic communication complexity of f_{∧} and show that, up to a log n factor, it is bounded by a polynomial in the logarithm of the real rank of the communication matrix of f_{∧}. This comes within a log n factor of establishing the log-rank conjecture for AND-functions with no assumptions on f. Our result stands in contrast with previous results on special cases of the log-rank conjecture, which needed significant restrictions on f such as monotonicity or low F_{2}-degree. Our techniques can also be used to prove (within a log n factor) a lifting theorem for AND-functions, stating that the deterministic communication complexity of f_{∧} is polynomially related to the AND-decision tree complexity of f. The results rely on a new structural result regarding boolean functions f: {0, 1}^{n} → {0, 1} with a sparse polynomial representation, which may be of independent interest. We show that if the polynomial computing f has few monomials then the set system of the monomials has a small hitting set, of size polylogarithmic in its sparsity. We also establish extensions of this result to multilinear polynomials f: {0, 1}^{n} → ℝ with a larger range. @InProceedings{STOC21p197, author = {Alexander Knop and Shachar Lovett and Sam McGuire and Weiqiang Yuan}, title = {Log-Rank and Lifting for AND-Functions}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {197-208}, doi = {10.1145/3406325.3450999}, year = {2021}, } Publisher's Version
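A small sketch of the central object above: build the communication matrix M[x][y] = f(x ∧ y) for a toy f (OR on two bits, an illustrative choice) and compute its exact real rank, the quantity the log-rank conjecture ties to deterministic communication complexity.

```python
from fractions import Fraction
from itertools import product

def f(bits):
    """Toy inner function (illustrative choice): OR of the bits."""
    return int(any(bits))

n = 2
points = list(product((0, 1), repeat=n))

# Communication matrix of the AND-function: M[x][y] = f(x AND y).
M = [[Fraction(f(tuple(a & b for a, b in zip(x, y)))) for y in points]
     for x in points]

def rank(mat):
    """Exact real rank via Gaussian elimination over the rationals."""
    mat = [row[:] for row in mat]
    r = 0
    for col in range(len(mat[0])):
        piv = next((i for i in range(r, len(mat)) if mat[i][col]), None)
        if piv is None:
            continue
        mat[r], mat[piv] = mat[piv], mat[r]
        for i in range(len(mat)):
            if i != r and mat[i][col]:
                factor = mat[i][col] / mat[r][col]
                mat[i] = [a - factor * b for a, b in zip(mat[i], mat[r])]
        r += 1
    return r

print(rank(M))  # 3 for OR on two bits
```

Exact rational arithmetic avoids the pivoting pitfalls of floating-point rank computations on 0/1 matrices.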

Lu, Zhenjian 
STOC '21: "Pseudodeterministic Algorithms ..."
Pseudodeterministic Algorithms and the Structure of Probabilistic Time
Zhenjian Lu, Igor C. Oliveira, and Rahul Santhanam (University of Warwick, UK; University of Oxford, UK) We connect the study of pseudodeterministic algorithms to two major open problems about the structural complexity of BPTIME: proving hierarchy theorems and showing the existence of complete problems. Our main contributions can be summarised as follows. A new pseudorandom generator and its consequences. We build on techniques developed to prove hierarchy theorems for probabilistic time with advice (Fortnow and Santhanam, FOCS 2004) to construct the first unconditional pseudorandom generator of polynomial stretch computable in pseudodeterministic polynomial time (with one bit of advice) that is secure infinitely often against polynomialtime computations. As an application of this construction, we obtain new results about the complexity of generating and representing prime numbers. For instance, we show unconditionally for each ε > 0 that infinitely many primes p_{n} have a succinct representation in the following sense: there is a fixed probabilistic polynomial time algorithm that generates p_{n} with high probability from its succinct representation of size O(p_{n}^{ε}). This offers an exponential improvement over the running time of previous results, and shows that infinitely many primes have succinct and efficient representations. Structural results for probabilistic time from pseudodeterministic algorithms. Oliveira and Santhanam (STOC 2017) established unconditionally that there is a pseudodeterministic algorithm for the Circuit Acceptance Probability Problem (CAPP) that runs in subexponential time and is correct with high probability over any samplable distribution on circuits on infinitely many input lengths. We show that improving this running time or obtaining a result that holds for every large input length would imply new time hierarchy theorems for probabilistic time. 
In addition, we prove that a worst-case polynomial-time pseudodeterministic algorithm for CAPP would imply that BPP has complete problems. Equivalence between pseudodeterministic constructions and hierarchies. We establish an equivalence between a certain explicit pseudodeterministic construction problem and the existence of strong hierarchy theorems for probabilistic time. More precisely, we show that pseudodeterministically constructing in exponential time strings of large rKt complexity (Oliveira, ICALP 2019) is possible if and only if for every constructive function T(n) ≤ exp(o(exp(n))) we have BPTIME[poly(T)] ⊈ i.o.-BPTIME[T]/log T. More generally, these results suggest new approaches for designing pseudodeterministic algorithms for search problems and for unveiling the structure of probabilistic time. @InProceedings{STOC21p303, author = {Zhenjian Lu and Igor C. Oliveira and Rahul Santhanam}, title = {Pseudodeterministic Algorithms and the Structure of Probabilistic Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {303-316}, doi = {10.1145/3406325.3451085}, year = {2021}, } Publisher's Version

Lykouris, Thodoris 
STOC '21: "Contextual Search in the Presence ..."
Contextual Search in the Presence of Irrational Agents
Akshay Krishnamurthy, Thodoris Lykouris, Chara Podimata, and Robert Schapire (Microsoft Research, USA; Harvard University, USA) We study contextual search, a generalization of binary search in higher dimensions, which captures settings such as feature-based dynamic pricing. Standard game-theoretic formulations of this problem assume that agents act in accordance with a specific behavioral model. In practice, some agents may not subscribe to the dominant behavioral model or may act in ways that are seemingly arbitrarily irrational. Existing algorithms heavily depend on the behavioral model being (approximately) accurate for all agents and have poor performance even with a few arbitrarily irrational agents. We initiate the study of contextual search when some of the agents can behave in ways inconsistent with the underlying behavioral model. In particular, we provide two algorithms, one based on multidimensional binary search methods and one based on gradient descent. Our techniques draw inspiration from learning theory, game theory, high-dimensional geometry, and convex analysis. @InProceedings{STOC21p910, author = {Akshay Krishnamurthy and Thodoris Lykouris and Chara Podimata and Robert Schapire}, title = {Contextual Search in the Presence of Irrational Agents}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {910-918}, doi = {10.1145/3406325.3451120}, year = {2021}, } Publisher's Version
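The one-dimensional special case makes the fragility discussed above easy to see: plain binary-search pricing is excellent under perfectly rational feedback, but a single inconsistent response permanently discards the half of the interval containing the truth. A toy sketch (the value θ and the round count are made up; the paper's algorithms are designed to avoid exactly this failure mode):

```python
def binary_search_pricing(agent_response, rounds=30):
    """One-dimensional special case of contextual search: post the
    midpoint price and halve the feasible interval using the
    agent's buy/no-buy feedback."""
    lo, hi = 0.0, 1.0
    for _ in range(rounds):
        price = (lo + hi) / 2
        if agent_response(price):
            lo = price  # agent bought: value is at least price
        else:
            hi = price  # agent passed: value is below price
    return (lo + hi) / 2

theta = 0.6180339887  # the agent's unknown value (made up)

def rational(price):
    return theta >= price

def one_lie(price, state={"lied": False}):
    """Answers irrationally exactly once, then rationally forever."""
    if not state["lied"]:
        state["lied"] = True
        return not (theta >= price)
    return theta >= price

est = binary_search_pricing(rational)
est2 = binary_search_pricing(one_lie)
print(abs(est - theta) < 1e-6)  # True: error halves every round
print(abs(est2 - theta) > 0.1)  # True: one lie derails the search
```

Against the rational agent the error shrinks geometrically; after the single lie, the true value lies outside the maintained interval forever.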

Lyu, Xin 
STOC '21: "Inverse-Exponential Correlation ..."
Inverse-Exponential Correlation Bounds and Extremely Rigid Matrices from a New Derandomized XOR Lemma
Lijie Chen and Xin Lyu (Massachusetts Institute of Technology, USA; Tsinghua University, China) In this work we prove that there is a function f ∈ E^{NP} such that, for every sufficiently large n and d = √n/log n, f_{n} (f restricted to n-bit inputs) cannot be (1/2 + 2^{−d})-approximated by F_{2}-polynomials of degree d. We also observe that a minor improvement (e.g., improving d to n^{1/2+ε} for any ε > 0) over our result would imply that E^{NP} cannot be computed by depth-3 AC^{0} circuits of size 2^{n^{1/2+ε}}, which is a notoriously hard open question in complexity theory. Using the same proof techniques, we are also able to construct extremely rigid matrices over F_{2} in P^{NP}. More specifically, we show that for every constant ε ∈ (0,1), there is a P^{NP} algorithm which on input 1^{n} outputs an n × n F_{2}-matrix H_{n} satisfying R_{H_{n}}(2^{log^{1−ε} n}) ≥ (1/2 − exp(−log^{2/3·ε} n)) · n^{2} for every sufficiently large n. This improves the recent P^{NP} constructions of rigid matrices in [Alman and Chen, FOCS 2019] and [Bhangale et al., FOCS 2020], which only give Ω(n^{2}) rigidity. The key ingredient in the proof of our new results is a new derandomized XOR lemma based on approximate linear sums, which roughly says that given an n-input function f which cannot be 0.99-approximated by a certain linear sum of s many functions in F within ℓ_{1}-distance, one can construct a new function Amp^{f} with O(n) input bits, which cannot be (1/2 + s^{−Ω(1)})-approximated by F-functions. Taking F to be a function collection containing low-degree F_{2}-polynomials or low-rank F_{2}-matrices, our results are then obtained by first using the algorithmic method to construct a function which is weakly hard against linear sums of F in the above sense, and then applying the derandomized XOR lemma to f. We obtain our new derandomized XOR lemma by giving a generalization of the famous hard-core lemma by Impagliazzo.
Our generalization in some sense constructs a non-Boolean hardcore of a weakly hard function f with respect to F-functions, from the weak inapproximability of f by any linear sum of F with bounded ℓ_{p}-norm. This generalization recovers the original hardcore lemma by considering the ℓ_{∞}-norm. Surprisingly, when we switch to the ℓ_{1}-norm, we immediately rediscover Levin’s proof of Yao’s XOR Lemma. That is, these first two proofs of Yao’s XOR Lemma can be unified with our new perspective. For proving the correlation bounds, our new derandomized XOR lemma indeed works with the ℓ_{4/3}-norm. @InProceedings{STOC21p761, author = {Lijie Chen and Xin Lyu}, title = {Inverse-Exponential Correlation Bounds and Extremely Rigid Matrices from a New Derandomized XOR Lemma}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {761--771}, doi = {10.1145/3406325.3451132}, year = {2021}, } Publisher's Version Info 

Mangoubi, Oren 
STOC '21: "Greedy Adversarial Equilibrium: ..."
Greedy Adversarial Equilibrium: An Efficient Alternative to Nonconvex-Nonconcave Min-Max Optimization
Oren Mangoubi and Nisheeth K. Vishnoi (Worcester Polytechnic Institute, USA; Yale University, USA) Min-max optimization of an objective function f: ℝ^{d} × ℝ^{d} → ℝ is an important model for robustness in an adversarial setting, with applications to many areas including optimization, economics, and deep learning. In many applications f may be nonconvex-nonconcave, and finding a global min-max point may be computationally intractable. There is a long line of work that seeks computationally tractable algorithms for alternatives to the min-max optimization model. However, many of the alternative models have solution points which are only guaranteed to exist under strong assumptions on f, such as convexity, monotonicity, or special properties of the starting point. We propose an optimization model, the ε-greedy adversarial equilibrium, and show that it can serve as a computationally tractable alternative to the min-max optimization model. Roughly, we say that a point (x^{⋆}, y^{⋆}) is an ε-greedy adversarial equilibrium if y^{⋆} is an ε-approximate local maximum for f(x^{⋆},·), and x^{⋆} is an ε-approximate local minimum for a “greedy approximation” to the function max_{z} f(x, z) which can be efficiently estimated using second-order optimization algorithms. We prove the existence of such a point for any smooth function which is bounded and has a Lipschitz Hessian. To prove existence, we introduce an algorithm that converges from any starting point to an ε-greedy adversarial equilibrium in a number of evaluations of the function f, the max-player’s gradient ∇_{y} f(x,y), and its Hessian ∇_{y}^{2} f(x,y) that is polynomial in the dimension d, 1/ε, and the bounds on f and its Lipschitz constant. @InProceedings{STOC21p896, author = {Oren Mangoubi and Nisheeth K. Vishnoi}, title = {Greedy Adversarial Equilibrium: An Efficient Alternative to Nonconvex-Nonconcave Min-Max Optimization}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {896--909}, doi = {10.1145/3406325.3451097}, year = {2021}, } Publisher's Version 

Manurangsi, Pasin 
STOC '21: "Sample-Efficient Proper PAC ..."
Sample-Efficient Proper PAC Learning with Approximate Differential Privacy
Badih Ghazi, Noah Golowich, Ravi Kumar, and Pasin Manurangsi (Google Research, USA; Massachusetts Institute of Technology, USA) In this paper we prove that the sample complexity of properly learning a class of Littlestone dimension d with approximate differential privacy is Õ(d^{6}), ignoring privacy and accuracy parameters. This result answers a question of Bun et al. (FOCS 2020) by improving upon their upper bound of 2^{O(d)} on the sample complexity. Prior to our work, finiteness of the sample complexity for privately learning a class of finite Littlestone dimension was only known for improper private learners, and the fact that our learner is proper answers another question of Bun et al., which was also asked by Bousquet et al. (NeurIPS 2020). Using machinery developed by Bousquet et al., we then show that the sample complexity of sanitizing a binary hypothesis class is at most polynomial in its Littlestone dimension and dual Littlestone dimension. This implies that a class is sanitizable if and only if it has finite Littlestone dimension. An important ingredient of our proofs is a new property of binary hypothesis classes that we call irreducibility, which may be of independent interest. @InProceedings{STOC21p183, author = {Badih Ghazi and Noah Golowich and Ravi Kumar and Pasin Manurangsi}, title = {Sample-Efficient Proper PAC Learning with Approximate Differential Privacy}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {183--196}, doi = {10.1145/3406325.3451028}, year = {2021}, } Publisher's Version 

Maus, Yannic 
STOC '21: "Efficient Randomized Distributed ..."
Efficient Randomized Distributed Coloring in CONGEST
Magnús M. Halldórsson, Fabian Kuhn, Yannic Maus, and Tigran Tonoyan (Reykjavik University, Iceland; University of Freiburg, Germany; Technion, Israel) Distributed vertex coloring is one of the classic, and probably the most widely studied, problems in the area of distributed graph algorithms. We present a new randomized distributed vertex coloring algorithm for the standard CONGEST model, where the network is modeled as an n-node graph G, and where the nodes of G operate in synchronous communication rounds in which they can exchange O(log n)-bit messages over all the edges of G. For graphs with maximum degree Δ, we show that the (Δ+1)-list coloring problem (and therefore also the standard (Δ+1)-coloring problem) can be solved in O(log^{5} log n) rounds. Previously such a result was only known for the significantly more powerful LOCAL model, where in each round, neighboring nodes can exchange messages of arbitrary size. The best previous (Δ+1)-coloring algorithm in the CONGEST model had a running time of O(log Δ + log^{6} log n) rounds. As a function of n alone, the best previous algorithm therefore had a round complexity of O(log n), a bound that can also be achieved by a naïve folklore algorithm. For large maximum degree Δ, our algorithm hence is an exponential improvement over the previous state of the art. @InProceedings{STOC21p1180, author = {Magnús M. Halldórsson and Fabian Kuhn and Yannic Maus and Tigran Tonoyan}, title = {Efficient Randomized Distributed Coloring in CONGEST}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1180--1193}, doi = {10.1145/3406325.3451089}, year = {2021}, } Publisher's Version 

McGuire, Sam 
STOC '21: "Log-Rank and Lifting for AND-Functions ..."
Log-Rank and Lifting for AND-Functions
Alexander Knop, Shachar Lovett, Sam McGuire, and Weiqiang Yuan (University of California at San Diego, USA; Tsinghua University, China) Let f: {0, 1}^{n} → {0, 1} be a boolean function, and let f_{∧}(x, y) = f(x ∧ y) denote the AND-function of f, where x ∧ y denotes bitwise AND. We study the deterministic communication complexity of f_{∧} and show that, up to a log n factor, it is bounded by a polynomial in the logarithm of the real rank of the communication matrix of f_{∧}. This comes within a log n factor of establishing the log-rank conjecture for AND-functions with no assumptions on f. Our result stands in contrast with previous results on special cases of the log-rank conjecture, which needed significant restrictions on f such as monotonicity or low F_{2}-degree. Our techniques can also be used to prove (within a log n factor) a lifting theorem for AND-functions, stating that the deterministic communication complexity of f_{∧} is polynomially related to the AND-decision tree complexity of f. The results rely on a new structural result regarding boolean functions f: {0, 1}^{n} → {0, 1} with a sparse polynomial representation, which may be of independent interest. We show that if the polynomial computing f has few monomials, then the set system of the monomials has a small hitting set, of size polylogarithmic in its sparsity. We also establish extensions of this result to multilinear polynomials on {0, 1}^{n} with a larger range. @InProceedings{STOC21p197, author = {Alexander Knop and Shachar Lovett and Sam McGuire and Weiqiang Yuan}, title = {Log-Rank and Lifting for AND-Functions}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {197--208}, doi = {10.1145/3406325.3450999}, year = {2021}, } Publisher's Version 
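As a minimal illustration of the object studied above (not code from the paper), the following sketch builds the communication matrix M[x][y] = f_{∧}(x, y) = f(x ∧ y) for a tiny boolean f, with inputs encoded as integers; the function names are mine:

```python
def and_function_matrix(f, n):
    """Return the 2^n x 2^n communication matrix of f_wedge(x, y) = f(x & y),
    where rows and columns are indexed by n-bit inputs encoded as integers."""
    inputs = list(range(2 ** n))
    return [[f(x & y) for y in inputs] for x in inputs]

# Example: f = OR over n = 2 bits, i.e. f(z) = 1 iff z != 0.
M = and_function_matrix(lambda z: int(z != 0), 2)
# Row x = 0 is all zeros, since 0 AND y = 0 for every y.
```

The real rank of this matrix (and its logarithm) is the quantity the log-rank conjecture relates to the deterministic communication complexity of f_{∧}.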

McKenzie, Theo 
STOC '21: "Support of Closed Walks and ..."
Support of Closed Walks and Second Eigenvalue Multiplicity of Graphs
Theo McKenzie, Peter Michael Reichstein Rasmussen, and Nikhil Srivastava (University of California at Berkeley, USA; University of Copenhagen, Denmark) We show that the multiplicity of the second normalized adjacency matrix eigenvalue of any connected graph of maximum degree Δ is bounded by O(n Δ^{7/5}/log^{1/5−o(1)} n) for any Δ, and improve this to O(n log^{1/2} d/log^{1/4−o(1)} n) for simple d-regular graphs when d ≥ log^{1/4} n. In fact, the same bounds hold for the number of eigenvalues in any interval of width λ_{2}/log_{Δ}^{1−o(1)} n containing the second eigenvalue λ_{2}. The main ingredient in the proof is a polynomial (in k) lower bound on the typical support of a closed random walk of length 2k in any connected graph, which in turn relies on new lower bounds for the entries of the Perron eigenvector of submatrices of the normalized adjacency matrix. @InProceedings{STOC21p396, author = {Theo McKenzie and Peter Michael Reichstein Rasmussen and Nikhil Srivastava}, title = {Support of Closed Walks and Second Eigenvalue Multiplicity of Graphs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {396--407}, doi = {10.1145/3406325.3451129}, year = {2021}, } Publisher's Version 

McSwiggen, Colin 
STOC '21: "Sampling Matrices from Harish-Chandra–Itzykson–Zuber ..."
Sampling Matrices from Harish-Chandra–Itzykson–Zuber Densities with Applications to Quantum Inference and Differential Privacy
Jonathan Leake, Colin McSwiggen, and Nisheeth K. Vishnoi (TU Berlin, Germany; University of Tokyo, Japan; Yale University, USA) Given two Hermitian matrices Y and Λ, the Harish-Chandra–Itzykson–Zuber (HCIZ) distribution is given by the density e^{Tr(U Λ U*Y)} with respect to the Haar measure on the unitary group. Random unitary matrices distributed according to the HCIZ distribution are important in various settings in physics and random matrix theory, but the problem of sampling efficiently from this distribution has remained open. We present two algorithms to sample matrices from distributions that are close to the HCIZ distribution. The first produces samples that are ξ-close in total variation distance, and the number of arithmetic operations required depends on poly(log 1/ξ). The second produces samples that are ξ-close in infinity divergence, but with a poly(1/ξ) dependence. Our results have the following applications: 1) an efficient algorithm to sample from complex versions of matrix Langevin distributions studied in statistics, 2) an efficient algorithm to sample from continuous maximum entropy distributions over unitary orbits, which in turn implies an efficient algorithm to sample a pure quantum state from the entropy-maximizing ensemble representing a given density matrix, and 3) an efficient algorithm for differentially private rank-k approximation that comes with improved utility bounds for k>1. @InProceedings{STOC21p1384, author = {Jonathan Leake and Colin McSwiggen and Nisheeth K. Vishnoi}, title = {Sampling Matrices from Harish-Chandra–Itzykson–Zuber Densities with Applications to Quantum Inference and Differential Privacy}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1384--1397}, doi = {10.1145/3406325.3451094}, year = {2021}, } Publisher's Version 

Minzer, Dor 
STOC '21: "New Separations Results for ..."
New Separations Results for External Information
Mark Braverman and Dor Minzer (Princeton University, USA; Massachusetts Institute of Technology, USA) We obtain new separation results for the two-party external information complexity of Boolean functions. The external information complexity of a function f(x,y) is the minimum amount of information a two-party protocol computing f must reveal to an outside observer about the input. We prove an exponential separation between external and internal information complexity, which is the best possible; previously no separation was known. We then use this result to prove a near-quadratic separation between amortized zero-error communication complexity and external information complexity for total functions, disproving a conjecture of the first author. Finally, we prove a matching upper bound showing that our separation result is tight. @InProceedings{STOC21p248, author = {Mark Braverman and Dor Minzer}, title = {New Separations Results for External Information}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {248--258}, doi = {10.1145/3406325.3451044}, year = {2021}, } Publisher's Version 

Moitra, Ankur 
STOC '21: "Settling the Robust Learnability ..."
Settling the Robust Learnability of Mixtures of Gaussians
Allen Liu and Ankur Moitra (Massachusetts Institute of Technology, USA) This work represents a natural coalescence of two important lines of work – learning mixtures of Gaussians and algorithmic robust statistics. In particular we give the first provably robust algorithm for learning mixtures of any constant number of Gaussians. We require only mild assumptions on the mixing weights (bounded fractionality) and that the total variation distance between components is bounded away from zero. At the heart of our algorithm is a new method for proving dimension-independent polynomial identifiability through applying a carefully chosen sequence of differential operations to certain generating functions that not only encode the parameters we would like to learn but also the system of polynomial equations we would like to solve. We show how the symbolic identities we derive can be directly used to analyze a natural sum-of-squares relaxation. @InProceedings{STOC21p518, author = {Allen Liu and Ankur Moitra}, title = {Settling the Robust Learnability of Mixtures of Gaussians}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {518--531}, doi = {10.1145/3406325.3451084}, year = {2021}, } Publisher's Version STOC '21: "Algorithmic Foundations for ..." Algorithmic Foundations for the Diffraction Limit Sitan Chen and Ankur Moitra (Massachusetts Institute of Technology, USA) For more than a century and a half it has been widely believed (but never rigorously shown) that the physics of diffraction imposes certain fundamental limits on the resolution of an optical system. However, our understanding of what exactly can and cannot be resolved has never risen above heuristic arguments which, even worse, appear contradictory. 
In this work we remedy this gap by studying the diffraction limit as a statistical inverse problem and, based on connections to provable algorithms for learning mixture models, we rigorously prove upper and lower bounds on the statistical and algorithmic complexity needed to resolve closely spaced point sources. In particular we show that there is a phase transition where the sample complexity goes from polynomial to exponential. Surprisingly, we show that this does not occur at the Abbe limit, which has long been presumed to be the true diffraction limit. @InProceedings{STOC21p490, author = {Sitan Chen and Ankur Moitra}, title = {Algorithmic Foundations for the Diffraction Limit}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {490--503}, doi = {10.1145/3406325.3451078}, year = {2021}, } Publisher's Version 

Moran, Shay 
STOC '21: "Learnability Can Be Independent ..."
Learnability Can Be Independent of Set Theory (Invited Paper)
Shai Ben-David, Pavel Hrubeš, Shay Moran, Amir Shpilka, and Amir Yehudayoff (University of Waterloo, Canada; Czech Academy of Sciences, Czechia; Technion, Israel; Tel Aviv University, Israel) A fundamental result in statistical learning theory is the equivalence of PAC learnability of a class with the finiteness of its Vapnik-Chervonenkis dimension. However, this clean result applies only to binary classification problems. In search of a similar combinatorial characterization of learnability in a more general setting, we discovered a surprising independence from set theory for some basic general notion of learnability. Consider the following statistical estimation problem: given a family F of real-valued random variables over some domain X and an i.i.d. sample drawn from an unknown distribution P over X, find f in F such that its expectation w.r.t. P is close to the supremum expectation over all members of F. This Expectation Maximization (EMX) problem captures many well-studied learning problems. Surprisingly, we show that the EMX learnability of some simple classes depends on the cardinality of the continuum and is therefore independent of the set theory ZFC axioms. Our results imply that there exists no "finitary" combinatorial parameter that characterizes EMX learnability in a way similar to the VC-dimension characterization of binary classification learnability. @InProceedings{STOC21p11, author = {Shai Ben-David and Pavel Hrubeš and Shay Moran and Amir Shpilka and Amir Yehudayoff}, title = {Learnability Can Be Independent of Set Theory (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {11--11}, doi = {10.1145/3406325.3465360}, year = {2021}, } Publisher's Version STOC '21: "A Theory of Universal Learning ..." 
A Theory of Universal Learning Olivier Bousquet, Steve Hanneke, Shay Moran, Ramon van Handel, and Amir Yehudayoff (Google, Switzerland; Toyota Technological Institute at Chicago, USA; Technion, Israel; Google Research, Israel; Princeton University, USA) How quickly can a given class of concepts be learned from examples? It is common to measure the performance of a supervised machine learning algorithm by plotting its “learning curve”, that is, the decay of the error rate as a function of the number of training examples. However, the classical theoretical framework for understanding learnability, the PAC model of Vapnik-Chervonenkis and Valiant, does not explain the behavior of learning curves: the distribution-free PAC model of learning can only bound the upper envelope of the learning curves over all possible data distributions. This does not match the practice of machine learning, where the data source is typically fixed in any given scenario, while the learner may choose the number of training examples on the basis of factors such as computational resources and desired accuracy. In this paper, we study an alternative learning model that better captures such practical aspects of machine learning, but still gives rise to a complete theory of the learnable in the spirit of the PAC model. More precisely, we consider the problem of universal learning, which aims to understand the performance of learning algorithms on every data distribution, but without requiring uniformity over the distribution. The main result of this paper is a remarkable trichotomy: there are only three possible rates of universal learning. More precisely, we show that the learning curves of any given concept class decay at either an exponential, linear, or arbitrarily slow rate. Moreover, each of these cases is completely characterized by appropriate combinatorial parameters, and we exhibit optimal learning algorithms that achieve the best possible rate in each case. 
For concreteness, we consider in this paper only the realizable case, though analogous results are expected to extend to more general learning scenarios. @InProceedings{STOC21p532, author = {Olivier Bousquet and Steve Hanneke and Shay Moran and Ramon van Handel and Amir Yehudayoff}, title = {A Theory of Universal Learning}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {532--541}, doi = {10.1145/3406325.3451087}, year = {2021}, } Publisher's Version STOC '21: "Boosting Simple Learners ..." Boosting Simple Learners Noga Alon, Alon Gonen, Elad Hazan, and Shay Moran (Princeton University, USA; Tel Aviv University, Israel; OrCam, Israel; Google AI, USA; Technion, Israel; Google Research, Israel) Boosting is a celebrated machine learning approach based on the idea of combining weak and moderately inaccurate hypotheses into a strong and accurate one. We study boosting under the assumption that the weak hypotheses belong to a class of bounded capacity. This assumption is inspired by the common convention that weak hypotheses are “rules-of-thumb” from an “easy-to-learn class”. (Schapire and Freund ’12, Shalev-Shwartz and Ben-David ’14.) Formally, we assume the class of weak hypotheses has a bounded VC dimension. We focus on two main questions: (i) Oracle Complexity: How many weak hypotheses are needed in order to produce an accurate hypothesis? We design a novel boosting algorithm and demonstrate that it circumvents a classical lower bound by Freund and Schapire (’95, ’12). Whereas the lower bound shows that Ω(1/γ^{2}) weak hypotheses with γ-margin are sometimes necessary, our new method requires only Õ(1/γ) weak hypotheses, provided that they belong to a class of bounded VC dimension. Unlike previous boosting algorithms which aggregate the weak hypotheses by majority votes, the new boosting algorithm uses more complex (“deeper”) aggregation rules. 
We complement this result by showing that complex aggregation rules are in fact necessary to circumvent the aforementioned lower bound. (ii) Expressivity: Which tasks can be learned by boosting weak hypotheses from a bounded VC class? Can complex concepts that are “far away” from the class be learned? Towards answering the first question we identify a combinatorial-geometric parameter which captures the expressivity of base classes in boosting. As a corollary we provide an affirmative answer to the second question for many well-studied classes, including halfspaces and decision stumps. Along the way, we establish and exploit connections with Discrepancy Theory. @InProceedings{STOC21p481, author = {Noga Alon and Alon Gonen and Elad Hazan and Shay Moran}, title = {Boosting Simple Learners}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {481--489}, doi = {10.1145/3406325.3451030}, year = {2021}, } Publisher's Version STOC '21: "Adversarial Laws of Large ..." Adversarial Laws of Large Numbers and Optimal Regret in Online Classification Noga Alon, Omri Ben-Eliezer, Yuval Dagan, Shay Moran, Moni Naor, and Eylon Yogev (Princeton University, USA; Tel Aviv University, Israel; Harvard University, USA; Massachusetts Institute of Technology, USA; Technion, Israel; Google Research, Israel; Weizmann Institute of Science, Israel; Boston University, USA) Laws of large numbers guarantee that given a large enough sample from some population, the measure of any fixed subpopulation is well-estimated by its frequency in the sample. We study laws of large numbers in sampling processes that can affect the environment they are acting upon and interact with it. Specifically, we consider the sequential sampling model proposed by Ben-Eliezer and Yogev (2020), and characterize the classes which admit a uniform law of large numbers in this model: these are exactly the classes that are online learnable. 
Our characterization may be interpreted as an online analogue to the equivalence between learnability and uniform convergence in statistical (PAC) learning. The sample-complexity bounds we obtain are tight for many parameter regimes, and as an application, we determine the optimal regret bounds in online learning, stated in terms of Littlestone’s dimension, thus resolving the main open question from Ben-David, Pál, and Shalev-Shwartz (2009), which was also posed by Rakhlin, Sridharan, and Tewari (2015). @InProceedings{STOC21p447, author = {Noga Alon and Omri Ben-Eliezer and Yuval Dagan and Shay Moran and Moni Naor and Eylon Yogev}, title = {Adversarial Laws of Large Numbers and Optimal Regret in Online Classification}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {447--455}, doi = {10.1145/3406325.3451041}, year = {2021}, } Publisher's Version 

Moshkovitz, Guy 
STOC '21: "Structure vs. Randomness for ..."
Structure vs. Randomness for Bilinear Maps
Alex Cohen and Guy Moshkovitz (Yale University, USA; City University of New York, USA) We prove that the slice rank of a 3-tensor (a combinatorial notion introduced by Tao in the context of the cap-set problem), the analytic rank (a Fourier-theoretic notion introduced by Gowers and Wolf), and the geometric rank (a recently introduced algebro-geometric notion) are all equivalent up to an absolute constant. As a corollary, we obtain strong trade-offs on the arithmetic complexity of a biased bilinear map, and on the separation between computing a bilinear map exactly and on average. Our result settles open questions of Haramaty and Shpilka [STOC 2010], and of Lovett [Discrete Anal., 2019] for 3-tensors. @InProceedings{STOC21p800, author = {Alex Cohen and Guy Moshkovitz}, title = {Structure vs. Randomness for Bilinear Maps}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {800--808}, doi = {10.1145/3406325.3451007}, year = {2021}, } Publisher's Version 

Mossel, Elchanan 
STOC '21: "Robust Testing of Low Dimensional ..."
Robust Testing of Low Dimensional Functions
Anindya De, Elchanan Mossel, and Joe Neeman (University of Pennsylvania, USA; Massachusetts Institute of Technology, USA; University of Texas at Austin, USA) A natural problem in high-dimensional inference is to decide if a classifier f: ℝ^{n} → {−1,1} depends on a small number of linear directions of its input data. Call a function g: ℝ^{n} → {−1,1} a linear k-junta if it is completely determined by some k-dimensional subspace of the input space. A recent work of the authors showed that linear k-juntas are testable. Thus there exists an algorithm to distinguish between: (1) f: ℝ^{n} → {−1,1} which is a linear k-junta with surface area s. (2) f is є-far from any linear k-junta with surface area (1+є)s. The query complexity of the algorithm is independent of the ambient dimension n. Following the surge of interest in noise-tolerant property testing, in this paper we prove a noise-tolerant (or robust) version of this result. Namely, we give an algorithm which, given any c>0, є>0, distinguishes between: (1) f: ℝ^{n} → {−1,1} has correlation at least c with some linear k-junta with surface area s. (2) f has correlation at most c−є with any linear k-junta with surface area at most s. The query complexity of our tester is k^{poly(s/є)}. Using our techniques, we also obtain a fully noise-tolerant tester with the same query complexity for any class C of linear k-juntas with surface area bounded by s. As a consequence, we obtain a fully noise-tolerant tester with query complexity k^{O(poly(log k/є))} for the class of intersections of k halfspaces (for constant k) over the Gaussian space. Our query complexity is independent of the ambient dimension n. Previously, no nontrivial noise-tolerant testers were known even for a single halfspace. 
@InProceedings{STOC21p584, author = {Anindya De and Elchanan Mossel and Joe Neeman}, title = {Robust Testing of Low Dimensional Functions}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {584--597}, doi = {10.1145/3406325.3451115}, year = {2021}, } Publisher's Version 

Mukhopadhyay, Partha 
STOC '21: "Lower Bounds for Monotone ..."
Lower Bounds for Monotone Arithmetic Circuits via Communication Complexity
Arkadev Chattopadhyay, Rajit Datta, and Partha Mukhopadhyay (Tata Institute of Fundamental Research, India; Chennai Mathematical Institute, India) Valiant (1980) showed that general arithmetic circuits with negation can be exponentially more powerful than monotone ones. We give the first improvement to this classical result: we construct a family of polynomials P_{n} in n variables, each of whose monomials has a nonnegative coefficient, such that P_{n} can be computed by a polynomial-size depth-three formula but every monotone circuit computing it has size 2^{Ω(n^{1/4}/log(n))}. The polynomial P_{n} embeds the SINK ∘ XOR function devised recently by Chattopadhyay, Mande and Sherif (2020) to refute the Log-Approximate-Rank Conjecture in communication complexity. To prove our lower bound for P_{n}, we develop a general connection between corruption of combinatorial rectangles by any function f ∘ XOR and corruption of product polynomials by a certain polynomial P^{f} that is an arithmetic embedding of f. This connection should be of independent interest. Using further ideas from communication complexity, we construct another family of set-multilinear polynomials f_{n,m} such that both F_{n,m} − є· f_{n,m} and F_{n,m} + є· f_{n,m} have monotone circuit complexity 2^{Ω(n/log(n))} if є ≥ 2^{−Ω(m)}, where F_{n,m} = ∏_{i=1}^{n} (x_{i,1} +⋯+x_{i,m}), with m = O(n/log n). The polynomials f_{n,m} have 0/1 coefficients and are in VNP. Proving such lower bounds for monotone circuits has been advocated recently by Hrubeš (2020) as a first step towards proving lower bounds against general circuits via his new approach. @InProceedings{STOC21p786, author = {Arkadev Chattopadhyay and Rajit Datta and Partha Mukhopadhyay}, title = {Lower Bounds for Monotone Arithmetic Circuits via Communication Complexity}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {786--799}, doi = {10.1145/3406325.3451069}, year = {2021}, } Publisher's Version 

Mukhopadhyay, Sagnik 
STOC '21: "Distributed Weighted Min-Cut ..."
Distributed Weighted Min-Cut in Nearly-Optimal Time
Michal Dory, Yuval Efron, Sagnik Mukhopadhyay, and Danupon Nanongkai (ETH Zurich, Switzerland; University of Toronto, Canada; KTH, Sweden; University of Copenhagen, Denmark) Minimum-weight cut (min-cut) is a basic measure of a network’s connectivity strength. While the min-cut can be computed efficiently in the sequential setting [Karger STOC’96], there was no efficient way for a distributed network to compute its own min-cut without limiting the input structure or dropping the output quality: In the standard CONGEST model, existing algorithms with nearly-optimal time (e.g. [Ghaffari, Kuhn, DISC’13; Nanongkai, Su, DISC’14]) can guarantee a solution that is a (1+є)-approximation at best, while the exact Õ(n^{0.8}D^{0.2} + n^{0.9})-time algorithm [Ghaffari, Nowicki, Thorup, SODA’20] works only on simple networks (no weights and no parallel edges). Throughout, n and D denote the network’s number of vertices and hop-diameter, respectively. For the weighted case, the best bound was Õ(n) [Daga, Henzinger, Nanongkai, Saranurak, STOC’19]. In this paper, we provide an exact Õ(√n + D)-time algorithm for computing min-cut on weighted networks. Our result improves even the previous algorithm that works only on simple networks. Its time complexity matches the known lower bound up to polylogarithmic factors. At the heart of our algorithm are a routing trick and two structural lemmas regarding the structure of a minimum cut of a graph. These two structural lemmas considerably strengthen and generalize the framework of Mukhopadhyay-Nanongkai [STOC’20] and can be of independent interest. @InProceedings{STOC21p1144, author = {Michal Dory and Yuval Efron and Sagnik Mukhopadhyay and Danupon Nanongkai}, title = {Distributed Weighted Min-Cut in Nearly-Optimal Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1144--1153}, doi = {10.1145/3406325.3451020}, year = {2021}, } Publisher's Version STOC '21: "Breaking the Quadratic Barrier ..." 
Breaking the Quadratic Barrier for Matroid Intersection Joakim Blikstad, Jan van den Brand, Sagnik Mukhopadhyay, and Danupon Nanongkai (KTH, Sweden; University of Copenhagen, Denmark) The matroid intersection problem is a fundamental problem that has been extensively studied for half a century. In the classic version of this problem, we are given two matroids M_{1} = (V, I_{1}) and M_{2} = (V, I_{2}) on a common ground set V of n elements, and then we have to find the largest common independent set S ∈ I_{1} ∩ I_{2} by making independence oracle queries of the form ”Is S ∈ I_{1}?” or ”Is S ∈ I_{2}?” for S ⊆ V. The goal is to minimize the number of queries. Beating the existing Õ(n^{2}) bound, known as the quadratic barrier, is an open problem that captures the limits of techniques from two lines of work. The first one is the classic Cunningham’s algorithm [SICOMP 1986], whose Õ(n^{2})-query implementations were shown by CLS+ [FOCS 2019] and Nguyen [2019] (more generally, these algorithms take Õ(nr) queries, where r denotes the rank, which can be as big as n). The other one is the general cutting plane method of Lee, Sidford, and Wong [FOCS 2015]. The only progress towards breaking the quadratic barrier requires either approximation algorithms or a more powerful rank oracle query [CLS+ FOCS 2019]. No exact algorithm with o(n^{2}) independence queries was known. In this work, we break the quadratic barrier with a randomized algorithm guaranteeing Õ(n^{9/5}) independence queries with high probability, and a deterministic algorithm guaranteeing Õ(n^{11/6}) independence queries. Our key insight is simple and fast algorithms to solve a graph reachability problem that arose in the standard augmenting path framework [Edmonds 1968]. Combining this with previous exact and approximation algorithms leads to our results. 
@InProceedings{STOC21p421, author = {Joakim Blikstad and Jan van den Brand and Sagnik Mukhopadhyay and Danupon Nanongkai}, title = {Breaking the Quadratic Barrier for Matroid Intersection}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {421--432}, doi = {10.1145/3406325.3451092}, year = {2021}, } Publisher's Version 

Mullainathan, Sendhil 
STOC '21: "Simplicity Creates Inequity: ..."
Simplicity Creates Inequity: Implications for Fairness, Stereotypes, and Interpretability (Invited Paper)
Jon Kleinberg and Sendhil Mullainathan (Cornell University, USA; University of Chicago, USA) Algorithms are increasingly used to aid, or in some cases supplant, human decision-making, particularly for decisions that hinge on predictions. As a result, two features beyond prediction quality have generated interest: (i) to facilitate human interaction with and understanding of these algorithms, we desire prediction functions that are in some fashion simple or interpretable; and (ii) because they influence consequential decisions, we also want them to produce equitable allocations. We develop a formal model to explore the relationship between the demands of simplicity and equity. Although the two concepts appear to be motivated by qualitatively distinct goals, we show a fundamental inconsistency between them. Specifically, we formalize a general framework for producing simple prediction functions, and in this framework we establish two basic results. First, every simple prediction function is strictly improvable: there exists a more complex prediction function that is both strictly more efficient and also strictly more equitable. Put another way, using a simple prediction function both reduces utility for disadvantaged groups and reduces overall welfare relative to other options. Second, we show that simple prediction functions necessarily create incentives to use information about individuals' membership in a disadvantaged group: incentives that were not present before simplification, and that work against these individuals. Thus, simplicity transforms disadvantage into bias against the disadvantaged group. Our results are not only about algorithms but about any process that produces simple models, and as such they connect to the psychology of stereotypes and to an earlier economics literature on statistical discrimination. 
@InProceedings{STOC21p7, author = {Jon Kleinberg and Sendhil Mullainathan}, title = {Simplicity Creates Inequity: Implications for Fairness, Stereotypes, and Interpretability (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {7--7}, doi = {10.1145/3406325.3465356}, year = {2021}, } Publisher's Version 

N, Vishvajeet 
STOC '21: "Graph Streaming Lower Bounds ..."
Graph Streaming Lower Bounds for Parameter Estimation and Property Testing via a Streaming XOR Lemma
Sepehr Assadi and Vishvajeet N (Rutgers University, USA) We study space-pass tradeoffs in graph streaming algorithms for parameter estimation and property testing problems such as estimating the size of maximum matchings and maximum cuts, the weight of minimum spanning trees, or testing if a graph is connected or cycle-free versus being far from these properties. We develop a new lower bound technique that proves that for many problems of interest, including all of the above, obtaining a (1+є)-approximation requires either n^{Ω(1)} space or Ω(1/є) passes, even on highly restricted families of graphs such as bounded-degree planar graphs. For multiple of these problems, this bound matches those of existing algorithms and is thus (asymptotically) optimal. Our results considerably strengthen prior lower bounds even for arbitrary graphs: starting from the influential work of [Verbin, Yu; SODA 2011], there has been a plethora of lower bounds for single-pass algorithms for these problems; however, the only multi-pass lower bounds, proven very recently in [Assadi, Kol, Saxena, Yu; FOCS 2020], rule out sublinear-space algorithms with an exponentially smaller number of passes, o(log(1/є)), for these problems. One key ingredient of our proofs is a simple streaming XOR Lemma, a generic hardness amplification result, that we prove: informally speaking, if a p-pass s-space streaming algorithm can only solve a decision problem with advantage δ > 0 over random guessing, then it cannot solve the XOR of ℓ independent copies of the problem with advantage much better than δ^{ℓ}. This result can be of independent interest and useful for other streaming lower bounds as well. @InProceedings{STOC21p612, author = {Sepehr Assadi and Vishvajeet N}, title = {Graph Streaming Lower Bounds for Parameter Estimation and Property Testing via a Streaming XOR Lemma}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {612--625}, doi = {10.1145/3406325.3451110}, year = {2021}, } Publisher's Version 

Nakos, Vasileios 
STOC '21: "Sparse Nonnegative Convolution ..."
Sparse Nonnegative Convolution Is Equivalent to Dense Nonnegative Convolution
Karl Bringmann, Nick Fischer, and Vasileios Nakos (Saarland University, Germany; MPI-INF, Germany) Computing the convolution A ⋆ B of two length-n vectors A,B is a ubiquitous computational primitive, with applications in a variety of disciplines. Within theoretical computer science, applications range from string problems to Knapsack-type problems, and from 3SUM to All-Pairs Shortest Paths. These applications often come in the form of nonnegative convolution, where the entries of A,B are nonnegative integers. The classical algorithm to compute A⋆ B uses the Fast Fourier Transform (FFT) and runs in time O(n logn). However, in many cases A and B might satisfy sparsity conditions, and hence one could hope for significant gains compared to the standard FFT algorithm. The ideal goal would be an O(k logk)-time algorithm, where k is the number of nonzero elements in the output, i.e., the size of the support of A ⋆ B. This problem is referred to as sparse nonnegative convolution, and has received a considerable amount of attention in the literature; the fastest algorithms to date run in time O(k log^{2} n). The main result of this paper is the first O(k logk)-time algorithm for sparse nonnegative convolution. Our algorithm is randomized and assumes that the length n and the largest entry of A and B are subexponential in k. Surprisingly, we can phrase our algorithm as a reduction from the sparse case to the dense case of nonnegative convolution, showing that, under some mild assumptions, sparse nonnegative convolution is equivalent to dense nonnegative convolution for constant-error randomized algorithms. Specifically, if D(n) is the time to convolve two nonnegative length-n vectors with success probability 2/3, and S(k) is the time to convolve two nonnegative vectors with output size k with success probability 2/3, then S(k) = O(D(k) + k (loglogk)^{2}). 
Our approach uses a variety of new techniques in combination with some old machinery from linear sketching and structured linear algebra, as well as new insights on linear hashing, the most classical hash function. @InProceedings{STOC21p1711, author = {Karl Bringmann and Nick Fischer and Vasileios Nakos}, title = {Sparse Nonnegative Convolution Is Equivalent to Dense Nonnegative Convolution}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1711--1724}, doi = {10.1145/3406325.3451090}, year = {2021}, } Publisher's Version 
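To fix ideas on the sparse setting, here is a minimal sketch (not the paper's O(k logk) algorithm): a dictionary-based convolution of sparse nonnegative vectors whose cost scales with the supports of the inputs rather than with the full length n.

```python
def sparse_convolve(A, B):
    """Convolve two sparse nonnegative vectors given as {index: value} dicts.

    Runs in O(nnz(A) * nnz(B)) time -- already far better than the dense
    O(n log n) FFT when the supports are tiny, but far from the paper's
    O(k log k) bound, where k = |support(A * B)|.
    """
    C = {}
    for i, a in A.items():
        for j, b in B.items():
            C[i + j] = C.get(i + j, 0) + a * b
    return C

# Length-n vectors (n around one million) with only two nonzeros each.
A = {0: 1, 1_000_000: 2}
B = {0: 3, 1_000_000: 4}
C = sparse_convolve(A, B)
print(sorted(C.items()))  # [(0, 3), (1000000, 10), (2000000, 8)]
```

Note that nonnegativity matters for the fast algorithms: no cancellation can shrink the support of A ⋆ B, which the naive dictionary version above does not exploit.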

Nanongkai, Danupon 
STOC '21: "Distributed Weighted Min-Cut ..."
Distributed Weighted Min-Cut in Nearly-Optimal Time
Michal Dory, Yuval Efron, Sagnik Mukhopadhyay, and Danupon Nanongkai (ETH Zurich, Switzerland; University of Toronto, Canada; KTH, Sweden; University of Copenhagen, Denmark) Minimum-weight cut (min-cut) is a basic measure of a network’s connectivity strength. While the min-cut can be computed efficiently in the sequential setting [Karger STOC’96], there was no efficient way for a distributed network to compute its own min-cut without limiting the input structure or dropping the output quality: In the standard CONGEST model, existing algorithms with nearly-optimal time (e.g. [Ghaffari, Kuhn, DISC’13; Nanongkai, Su, DISC’14]) can guarantee a solution that is a (1+є)-approximation at best, while the exact Õ(n^{0.8}D^{0.2} + n^{0.9})-time algorithm [Ghaffari, Nowicki, Thorup, SODA’20] works only on simple networks (no weights and no parallel edges). Throughout, n and D denote the network’s number of vertices and hop-diameter, respectively. For the weighted case, the best bound was Õ(n) [Daga, Henzinger, Nanongkai, Saranurak, STOC’19]. In this paper, we provide an exact Õ(√n + D)-time algorithm for computing min-cut on weighted networks. Our result improves even over the previous algorithm that works only on simple networks. Its time complexity matches the known lower bound up to polylogarithmic factors. At the heart of our algorithm are a routing trick and two structural lemmas regarding the structure of a minimum cut of a graph. These two structural lemmas considerably strengthen and generalize the framework of Mukhopadhyay-Nanongkai [STOC’20] and can be of independent interest. @InProceedings{STOC21p1144, author = {Michal Dory and Yuval Efron and Sagnik Mukhopadhyay and Danupon Nanongkai}, title = {Distributed Weighted Min-Cut in Nearly-Optimal Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1144--1153}, doi = {10.1145/3406325.3451020}, year = {2021}, } Publisher's Version STOC '21: "Breaking the Quadratic Barrier ..." 
Breaking the Quadratic Barrier for Matroid Intersection Joakim Blikstad, Jan van den Brand, Sagnik Mukhopadhyay, and Danupon Nanongkai (KTH, Sweden; University of Copenhagen, Denmark) The matroid intersection problem is a fundamental problem that has been extensively studied for half a century. In the classic version of this problem, we are given two matroids M_{1} = (V, I_{1}) and M_{2} = (V, I_{2}) on a common ground set V of n elements, and then we have to find the largest common independent set S ∈ I_{1} ∩ I_{2} by making independence oracle queries of the form "Is S ∈ I_{1}?" or "Is S ∈ I_{2}?" for S ⊆ V. The goal is to minimize the number of queries. Beating the existing Õ(n^{2}) bound, known as the quadratic barrier, is an open problem that captures the limits of techniques from two lines of work. The first one is Cunningham’s classic algorithm [SICOMP 1986], whose Õ(n^{2})-query implementations were shown by CLS+ [FOCS 2019] and Nguyen [2019] (more generally, these algorithms take Õ(nr) queries, where r denotes the rank, which can be as big as n). The other one is the general cutting plane method of Lee, Sidford, and Wong [FOCS 2015]. The only progress towards breaking the quadratic barrier requires either approximation algorithms or a more powerful rank oracle query [CLS+ FOCS 2019]. No exact algorithm with o(n^{2}) independence queries was known. In this work, we break the quadratic barrier with a randomized algorithm guaranteeing Õ(n^{9/5}) independence queries with high probability, and a deterministic algorithm guaranteeing Õ(n^{11/6}) independence queries. Our key insight is simple and fast algorithms to solve a graph reachability problem that arose in the standard augmenting path framework [Edmonds 1968]. Combining this with previous exact and approximation algorithms leads to our results. 
@InProceedings{STOC21p421, author = {Joakim Blikstad and Jan van den Brand and Sagnik Mukhopadhyay and Danupon Nanongkai}, title = {Breaking the Quadratic Barrier for Matroid Intersection}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {421--432}, doi = {10.1145/3406325.3451092}, year = {2021}, } Publisher's Version STOC '21: "Vertex Connectivity in Polylogarithmic ..." Vertex Connectivity in Polylogarithmic Max-Flows Jason Li, Danupon Nanongkai, Debmalya Panigrahi, Thatchaphol Saranurak, and Sorrachai Yingchareonthawornchai (Carnegie Mellon University, USA; University of Copenhagen, Denmark; KTH, Sweden; Duke University, USA; University of Michigan, USA; Aalto University, Finland) The vertex connectivity of an m-edge n-vertex undirected graph is the smallest number of vertices whose removal disconnects the graph, or leaves only a singleton vertex. In this paper, we give a reduction from the vertex connectivity problem to a set of max-flow instances. Using this reduction, we can solve vertex connectivity in Õ(m^{α}) time for any α ≥ 1, if there is an m^{α}-time max-flow algorithm. Using the current best max-flow algorithm that runs in m^{4/3+o(1)} time (Kathuria, Liu and Sidford, FOCS 2020), this yields a m^{4/3+o(1)}-time vertex connectivity algorithm. This is the first improvement in the running time of the vertex connectivity problem in over 20 years, the previous best being an Õ(mn)-time algorithm due to Henzinger, Rao, and Gabow (FOCS 1996). Indeed, no algorithm with an o(mn) running time was known before our work, even if we assume an Õ(m)-time max-flow algorithm. 
Our new technique is robust enough to also improve the best Õ(mn)-time bound for directed vertex connectivity to mn^{1−1/12+o(1)} time. @InProceedings{STOC21p317, author = {Jason Li and Danupon Nanongkai and Debmalya Panigrahi and Thatchaphol Saranurak and Sorrachai Yingchareonthawornchai}, title = {Vertex Connectivity in Polylogarithmic Max-Flows}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {317--329}, doi = {10.1145/3406325.3451088}, year = {2021}, } Publisher's Version 

Naor, Assaf 
STOC '21: "A Framework for Quadratic ..."
A Framework for Quadratic Form Maximization over Convex Sets through Nonconvex Relaxations
Vijay Bhattiprolu, Euiwoong Lee, and Assaf Naor (Institute for Advanced Study at Princeton, USA; Princeton University, USA; University of Michigan, USA) We investigate the approximability of the following optimization problem. The input is an n× n matrix A=(A_{ij}) with real entries and an origin-symmetric convex body K⊂ ℝ^{n} that is given by a membership oracle. The task is to compute (or approximate) the maximum of the quadratic form ∑_{i=1}^{n}∑_{j=1}^{n} A_{ij} x_{i}x_{j}=⟨ x,Ax⟩ as x ranges over K. This is a rich and expressive family of optimization problems; for different choices of matrices A and convex bodies K it includes a diverse range of optimization problems like max-cut, Grothendieck/non-commutative Grothendieck inequalities, small set expansion, and more. While the literature has studied these special cases using case-specific reasoning, here we develop a general methodology for the treatment of the approximability and inapproximability aspects of these questions. The underlying geometry of K plays a critical role; we show under commonly used complexity assumptions that polynomial-time constant-approximability necessitates that K has a type-2 constant that grows slowly with n. However, we show that even when the type-2 constant is bounded, this problem sometimes exhibits strong hardness of approximation. Thus, even within the realm of type-2 bodies, the approximability landscape is nuanced and subtle. However, the link that we establish between optimization and the geometry of Banach spaces allows us to devise a generic algorithmic approach to the above problem. We associate to each convex body a new (higher-dimensional) auxiliary set that is not convex, but is approximately convex when K has a bounded type-2 constant. If our auxiliary set has an approximate separation oracle, then we design an approximation algorithm for the original quadratic optimization problem, using an approximate version of the ellipsoid method. 
Even though our hardness result implies that such an oracle does not exist in general, this new question can be solved in specific cases of interest by implementing a range of classical tools from functional analysis, most notably the deep factorization theory of linear operators. Beyond encompassing the scenarios in the literature for which constant-factor approximation algorithms were found, our generic framework implies that for convex sets with bounded type-2 constant, constant-factor approximability is preserved under the following basic operations: (a) Subspaces, (b) Quotients, (c) Minkowski Sums, (d) Complex Interpolation. This yields a rich family of new examples where constant-factor approximations are possible, which were beyond the reach of previous methods. We also show (under commonly used complexity assumptions) that for symmetric norms and unitarily invariant matrix norms the type-2 constant nearly characterizes the approximability of quadratic maximization. @InProceedings{STOC21p870, author = {Vijay Bhattiprolu and Euiwoong Lee and Assaf Naor}, title = {A Framework for Quadratic Form Maximization over Convex Sets through Nonconvex Relaxations}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {870--881}, doi = {10.1145/3406325.3451128}, year = {2021}, } Publisher's Version 

Naor, Moni 
STOC '21: "Adversarial Laws of Large ..."
Adversarial Laws of Large Numbers and Optimal Regret in Online Classification
Noga Alon, Omri Ben-Eliezer, Yuval Dagan, Shay Moran, Moni Naor, and Eylon Yogev (Princeton University, USA; Tel Aviv University, Israel; Harvard University, USA; Massachusetts Institute of Technology, USA; Technion, Israel; Google Research, Israel; Weizmann Institute of Science, Israel; Boston University, USA) Laws of large numbers guarantee that given a large enough sample from some population, the measure of any fixed subpopulation is well-estimated by its frequency in the sample. We study laws of large numbers in sampling processes that can affect the environment they are acting upon and interact with it. Specifically, we consider the sequential sampling model proposed by Ben-Eliezer and Yogev (2020), and characterize the classes which admit a uniform law of large numbers in this model: these are exactly the classes that are online learnable. Our characterization may be interpreted as an online analogue to the equivalence between learnability and uniform convergence in statistical (PAC) learning. The sample-complexity bounds we obtain are tight for many parameter regimes, and as an application, we determine the optimal regret bounds in online learning, stated in terms of Littlestone’s dimension, thus resolving the main open question from Ben-David, Pál, and Shalev-Shwartz (2009), which was also posed by Rakhlin, Sridharan, and Tewari (2015). @InProceedings{STOC21p447, author = {Noga Alon and Omri Ben-Eliezer and Yuval Dagan and Shay Moran and Moni Naor and Eylon Yogev}, title = {Adversarial Laws of Large Numbers and Optimal Regret in Online Classification}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {447--455}, doi = {10.1145/3406325.3451041}, year = {2021}, } Publisher's Version 

Nederlof, Jesper 
STOC '21: "Improving Schroeppel and Shamir’s ..."
Improving Schroeppel and Shamir’s Algorithm for Subset Sum via Orthogonal Vectors
Jesper Nederlof and Karol Węgrzycki (Utrecht University, Netherlands; Saarland University, Germany; MPI-INF, Germany) We present an O^{∗}(2^{0.5n}) time and O^{∗}(2^{0.249999n}) space randomized algorithm for solving worst-case Subset Sum instances with n integers. This is the first improvement over the long-standing O^{∗}(2^{n/2}) time and O^{∗}(2^{n/4}) space algorithm due to Schroeppel and Shamir (FOCS 1979). We breach this gap in two steps: (1) We present a space-efficient reduction to the Orthogonal Vectors Problem (OV), one of the most central problems in Fine-Grained Complexity. The reduction is established via an intricate combination of the method of Schroeppel and Shamir and the representation technique introduced by Howgrave-Graham and Joux (EUROCRYPT 2010) for designing Subset Sum algorithms for the average-case regime. (2) We provide an algorithm for OV that detects an orthogonal pair among N given vectors in {0,1}^{d} with support size d/4 in time Õ(N· 2^{d}/\binom{d}{d/4}). Our algorithm for OV is based on and refines the representative families framework developed by Fomin, Lokshtanov, Panolan, and Saurabh (J. ACM 2016). Our reduction uncovers a curious tight relation between Subset Sum and OV, because any improvement of our algorithm for OV would imply an improvement over the runtime of Schroeppel and Shamir, which is also a long-standing open problem. @InProceedings{STOC21p1670, author = {Jesper Nederlof and Karol Węgrzycki}, title = {Improving Schroeppel and Shamir’s Algorithm for Subset Sum via Orthogonal Vectors}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1670--1683}, doi = {10.1145/3406325.3451024}, year = {2021}, } Publisher's Version Video 
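For context on the O^{∗}(2^{n/2})-time baseline the paper improves upon: the classic meet-in-the-middle idea (Horowitz-Sahni) splits the instance in two halves and matches their subset sums. Schroeppel and Shamir match this running time with only O^{∗}(2^{n/4}) space by generating each half's sums lazily from four quarters; the sketch below omits that refinement and uses O^{∗}(2^{n/2}) space.

```python
from bisect import bisect_left
from itertools import chain

def subset_sum_mitm(nums, target):
    """Meet-in-the-middle Subset Sum: O*(2^{n/2}) time and space.

    Split nums into two halves, enumerate all subset sums of each half,
    then look for a pair summing to target via binary search.
    """
    half = len(nums) // 2
    left, right = nums[:half], nums[half:]

    def all_sums(part):
        sums = {0}
        for x in part:
            sums |= {s + x for s in sums}
        return sums

    L = all_sums(left)
    R = sorted(all_sums(right))
    for s in L:
        i = bisect_left(R, target - s)
        if i < len(R) and R[i] == target - s:
            return True
    return False

print(subset_sum_mitm([3, 34, 4, 12, 5, 2], 9))   # True (4 + 5)
print(subset_sum_mitm([3, 34, 4, 12, 5, 2], 30))  # False
```

The paper's reduction replaces the "match two sorted lists" step with an Orthogonal Vectors instance, which is what allows the space (and slightly the time) to drop below the 40-year-old bounds.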

Neel, Seth 
STOC '21: "A New Analysis of Differential ..."
A New Analysis of Differential Privacy’s Generalization Guarantees (Invited Paper)
Christopher Jung, Katrina Ligett, Seth Neel, Aaron Roth, Saeed Sharifi-Malvajerdi, and Moshe Shenfeld (University of Pennsylvania, USA; Hebrew University of Jerusalem, Israel) We give a new proof of the "transfer theorem" underlying adaptive data analysis: that any mechanism for answering adaptively chosen statistical queries that is differentially private and sample-accurate is also accurate out-of-sample. Our new proof is elementary and gives structural insights that we expect will be useful elsewhere. We show: 1) that differential privacy ensures that the expectation of any query on the conditional distribution on datasets induced by the transcript of the interaction is close to its true value on the data distribution, and 2) sample accuracy on its own ensures that any query answer produced by the mechanism is close to its conditional expectation with high probability. This second claim follows from a thought experiment in which we imagine that the dataset is resampled from the conditional distribution after the mechanism has committed to its answers. The transfer theorem then follows by summing these two bounds. An upshot of our new proof technique is that the concrete bounds we obtain are substantially better than the best previously known bounds. @InProceedings{STOC21p9, author = {Christopher Jung and Katrina Ligett and Seth Neel and Aaron Roth and Saeed Sharifi-Malvajerdi and Moshe Shenfeld}, title = {A New Analysis of Differential Privacy’s Generalization Guarantees (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {9--9}, doi = {10.1145/3406325.3465358}, year = {2021}, } Publisher's Version 

Neeman, Joe 
STOC '21: "Robust Testing of Low Dimensional ..."
Robust Testing of Low Dimensional Functions
Anindya De, Elchanan Mossel, and Joe Neeman (University of Pennsylvania, USA; Massachusetts Institute of Technology, USA; University of Texas at Austin, USA) A natural problem in high-dimensional inference is to decide if a classifier f:ℝ^{n} → {−1,1} depends on a small number of linear directions of its input data. Call a function g: ℝ^{n} → {−1,1} a linear k-junta if it is completely determined by some k-dimensional subspace of the input space. A recent work of the authors showed that linear k-juntas are testable. Thus there exists an algorithm to distinguish between: (1) f: ℝ^{n} → {−1,1} which is a linear k-junta with surface area s. (2) f is є-far from any linear k-junta with surface area (1+є)s. The query complexity of the algorithm is independent of the ambient dimension n. Following the surge of interest in noise-tolerant property testing, in this paper we prove a noise-tolerant (or robust) version of this result. Namely, we give an algorithm which, given any c>0, є>0, distinguishes between: (1) f: ℝ^{n} → {−1,1} has correlation at least c with some linear k-junta with surface area s. (2) f has correlation at most c−є with any linear k-junta with surface area at most s. The query complexity of our tester is k^{poly(s/є)}. Using our techniques, we also obtain a fully noise-tolerant tester with the same query complexity for any class C of linear k-juntas with surface area bounded by s. As a consequence, we obtain a fully noise-tolerant tester with query complexity k^{O(poly(logk/є))} for the class of intersections of k halfspaces (for constant k) over the Gaussian space. Our query complexity is independent of the ambient dimension n. Previously, no nontrivial noise-tolerant testers were known even for a single halfspace. 
@InProceedings{STOC21p584, author = {Anindya De and Elchanan Mossel and Joe Neeman}, title = {Robust Testing of Low Dimensional Functions}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {584--597}, doi = {10.1145/3406325.3451115}, year = {2021}, } Publisher's Version 

Nekrich, Yakov 
STOC '21: "Dynamic Planar Point Location ..."
Dynamic Planar Point Location in Optimal Time
Yakov Nekrich (Michigan Technological University, USA) In this paper we describe a fully dynamic data structure that supports point location queries in a connected planar subdivision with n edges. Our data structure uses O(n) space, answers queries in O(logn) time, and supports updates in O(logn) time. Our solution is based on a data structure for vertical ray shooting queries that supports queries and updates in O(logn) time. @InProceedings{STOC21p1003, author = {Yakov Nekrich}, title = {Dynamic Planar Point Location in Optimal Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1003--1014}, doi = {10.1145/3406325.3451100}, year = {2021}, } Publisher's Version 
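To illustrate the vertical ray shooting primitive the abstract builds on: given non-crossing segments, find the first segment hit by an upward vertical ray from a query point. A naive O(n)-per-query sketch (the paper supports this in O(logn) time with O(logn)-time updates; the segment data below is a made-up example):

```python
def ray_shoot(segments, qx, qy):
    """Naive vertical ray shooting: return the segment hit first by the
    upward ray from (qx, qy), or None. Each segment is a pair of endpoints
    ((x1, y1), (x2, y2)); vertical segments are ignored for simplicity."""
    best, best_y = None, float("inf")
    for (x1, y1), (x2, y2) in segments:
        if min(x1, x2) <= qx <= max(x1, x2) and x1 != x2:
            t = (qx - x1) / (x2 - x1)           # interpolate the segment
            y = y1 + t * (y2 - y1)              # its height above x = qx
            if qy <= y < best_y:
                best, best_y = ((x1, y1), (x2, y2)), y
    return best

segs = [((0, 2), (4, 2)), ((0, 5), (4, 9))]
print(ray_shoot(segs, 2, 0))  # ((0, 2), (4, 2)), the lower segment
```

Point location in a planar subdivision reduces to exactly this query: the segment directly above the query point identifies the face containing it.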

Niazadeh, Rad 
STOC '21: "Combinatorial Bernoulli Factories: ..."
Combinatorial Bernoulli Factories: Matchings, Flows, and Other Polytopes
Rad Niazadeh, Renato Paes Leme, and Jon Schneider (University of Chicago, USA; Google Research, USA) A Bernoulli factory is an algorithmic procedure for exact sampling of certain random variables having only Bernoulli access to their parameters. Bernoulli access to a parameter p ∈ [0,1] means the algorithm does not know p, but has sample access to independent draws of a Bernoulli random variable with mean equal to p. In this paper, we study the problem of Bernoulli factories for polytopes: given Bernoulli access to a vector x∈ P for a given polytope P⊂ [0,1]^{n}, output a randomized vertex such that the expected value of the ith coordinate is exactly equal to x_{i}. For example, for the special case of the perfect matching polytope, one is given Bernoulli access to the entries of a doubly stochastic matrix [x_{ij}] and asked to sample a matching such that the probability of each edge (i,j) being present in the matching is exactly equal to x_{ij}. We show that a polytope P admits a Bernoulli factory if and only if P is the intersection of [0,1]^{n} with an affine subspace. Our construction is based on an algebraic formulation of the problem, involving identifying a family of Bernstein polynomials (one per vertex) that satisfy a certain algebraic identity on P. The main technical tool behind our construction is a connection between these polynomials and the geometry of zonotope tilings. We apply these results to construct an explicit factory for the perfect matching polytope. The resulting factory is deeply connected to the combinatorial enumeration of arborescences and may be of independent interest. For the k-uniform matroid polytope, we recover a sampling procedure known in statistics as Sampford sampling. 
@InProceedings{STOC21p833, author = {Rad Niazadeh and Renato Paes Leme and Jon Schneider}, title = {Combinatorial Bernoulli Factories: Matchings, Flows, and Other Polytopes}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {833--846}, doi = {10.1145/3406325.3451072}, year = {2021}, } Publisher's Version 
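To make "Bernoulli access" concrete, here are two textbook single-parameter factories (not the paper's polytope construction): sampling Bernoulli(p^{2}) by ANDing two coin draws, and Bernoulli(p/2) by ANDing one draw with a fair coin. The hidden parameter p = 0.6 below is an arbitrary choice for the demo.

```python
import random

def factory_square(coin):
    """Sample Bernoulli(p^2) given only sample access to a Bernoulli(p) coin."""
    return coin() and coin()

def factory_half(coin, rng=random):
    """Sample Bernoulli(p/2): one coin draw ANDed with an independent fair coin."""
    return coin() and rng.random() < 0.5

# The algorithm never sees p; it only draws samples from the coin.
rng = random.Random(0)
coin = lambda: rng.random() < 0.6        # hidden p = 0.6
n = 200_000
est = sum(factory_square(coin) for _ in range(n)) / n
print(est)  # close to 0.36 = 0.6^2
```

The paper's question is the combinatorial analogue: given coordinate-wise Bernoulli access to a point x of a polytope, produce a random *vertex* whose coordinate-wise means equal x exactly, not approximately.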

Nisan, Noam 
STOC '21: "Bipartite Perfect Matching ..."
Bipartite Perfect Matching as a Real Polynomial
Gal Beniamini and Noam Nisan (Hebrew University of Jerusalem, Israel) We obtain a description of the Bipartite Perfect Matching decision problem as a multilinear polynomial over the reals. We show that it has full degree and (1−o_{n}(1))· 2^{n^{2}} monomials with nonzero coefficients. In contrast, we show that in the dual representation (switching the roles of 0 and 1) the number of monomials is only exponential in Θ(n logn). Our proof relies heavily on the fact that the lattice of graphs which are “matching-covered” is Eulerian. @InProceedings{STOC21p1118, author = {Gal Beniamini and Noam Nisan}, title = {Bipartite Perfect Matching as a Real Polynomial}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1118--1131}, doi = {10.1145/3406325.3451002}, year = {2021}, } Publisher's Version 
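A tiny sanity check of the object studied above: every Boolean function on the n^{2} edge indicators has a unique multilinear real polynomial agreeing with it on {0,1}^{n^{2}}, whose coefficients can be computed by Möbius inversion, c(S) = ∑_{T⊆S} (−1)^{|S∖T|} f(T). For n=2 the bipartite perfect matching function is x_{11}x_{22} + x_{12}x_{21} − x_{11}x_{22}x_{12}x_{21}: three monomials and full degree n^{2}=4, consistent with the theorem. A brute-force sketch:

```python
from itertools import combinations, permutations

def bpm(edges, n=2):
    """Does the bipartite graph on [n]+[n] with the given edge set
    have a perfect matching? Brute force over all permutations."""
    return any(all((i, p[i]) in edges for i in range(n))
               for p in permutations(range(n)))

n = 2
vars_ = [(i, j) for i in range(n) for j in range(n)]  # the n^2 edge indicators

# Moebius inversion: coefficient of the monomial indexed by S in the unique
# multilinear real polynomial agreeing with bpm on {0,1}^{n^2}.
coeff = {}
for r in range(len(vars_) + 1):
    for S in combinations(vars_, r):
        c = sum((-1) ** (len(S) - len(T)) * bpm(set(T), n)
                for k in range(len(S) + 1) for T in combinations(S, k))
        if c:
            coeff[S] = c

print(len(coeff), max(len(S) for S in coeff))  # 3 monomials, degree 4
```

The paper's point is that for general n almost all of the 2^{n^{2}} possible monomials survive, so no short real-polynomial representation exists in this basis.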

Nordström, Jakob 
STOC '21: "Automating Algebraic Proof ..."
Automating Algebraic Proof Systems Is NP-Hard
Susanna F. de Rezende, Mika Göös, Jakob Nordström, Toniann Pitassi, Robert Robere, and Dmitry Sokolov (Czech Academy of Sciences, Czechia; EPFL, Switzerland; University of Copenhagen, Denmark; Lund University, Sweden; University of Toronto, Canada; Institute for Advanced Study at Princeton, USA; McGill University, Canada; St. Petersburg State University, Russia; Russian Academy of Sciences, Russia) We show that algebraic proofs are hard to find: Given an unsatisfiable CNF formula F, it is NP-hard to find a refutation of F in the Nullstellensatz, Polynomial Calculus, or Sherali–Adams proof systems in time polynomial in the size of the shortest such refutation. Our work extends, and gives a simplified proof of, the recent breakthrough of Atserias and Müller (JACM 2020) that established an analogous result for Resolution. @InProceedings{STOC21p209, author = {Susanna F. de Rezende and Mika Göös and Jakob Nordström and Toniann Pitassi and Robert Robere and Dmitry Sokolov}, title = {Automating Algebraic Proof Systems Is NP-Hard}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {209--222}, doi = {10.1145/3406325.3451080}, year = {2021}, } Publisher's Version 

Nowicki, Krzysztof 
STOC '21: "A Deterministic Algorithm ..."
A Deterministic Algorithm for the MST Problem in Constant Rounds of Congested Clique
Krzysztof Nowicki (University of Copenhagen, Denmark; University of Wrocław, Poland) In this paper we show that the Minimum Spanning Tree problem (MST) can be solved deterministically in O(1) rounds of the Congested Clique model. In the Congested Clique model there are n players that perform computation in synchronous rounds. Each round consists of a phase of local computation and a phase of communication, in which each pair of players is allowed to exchange O(logn)-bit messages. The study of this model began with the MST problem: the paper by Lotker, Pavlov, Patt-Shamir, and Peleg [SPAA’03, SICOMP’05] that defines the Congested Clique model gives a deterministic O(loglogn)-round algorithm that improved over a trivial O(logn)-round adaptation of Borůvka’s algorithm. There was a sequence of gradual improvements to this result: an O(logloglogn)-round algorithm by Hegeman, Pandurangan, Pemmaraju, Sardeshmukh, and Scquizzato [PODC’15], an O(log^{*} n)-round algorithm by Ghaffari and Parter [PODC’16], and an O(1)-round algorithm by Jurdziński and Nowicki [SODA’18], but all those algorithms were randomized. Therefore, the question about the existence of any deterministic o(loglogn)-round algorithm for the Minimum Spanning Tree problem had remained open since the seminal paper by Lotker, Pavlov, Patt-Shamir, and Peleg [SPAA’03, SICOMP’05]. Our result resolves this question and establishes that O(1) rounds is enough to solve the MST problem in the Congested Clique model, even if we are not allowed to use any randomness. Furthermore, the amount of communication needed by the algorithm makes it applicable to a variant of the MPC model using machines with local memory of size O(n). @InProceedings{STOC21p1154, author = {Krzysztof Nowicki}, title = {A Deterministic Algorithm for the MST Problem in Constant Rounds of Congested Clique}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1154--1165}, doi = {10.1145/3406325.3451136}, year = {2021}, } Publisher's Version 
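For reference, the Borůvka baseline mentioned above: each phase contracts every component along its minimum-weight incident edge, halving the number of components, hence O(logn) phases; each phase translates to O(1) Congested Clique rounds in the trivial adaptation. A sequential sketch (distinct edge weights assumed so the MST is unique):

```python
def boruvka_mst(n, edges):
    """Sequential Boruvka MST. edges are (weight, u, v) tuples with
    distinct weights; runs O(log n) phases of component contraction."""
    parent = list(range(n))

    def find(x):  # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst, components = [], n
    while components > 1:
        # Each component picks its cheapest outgoing edge.
        cheapest = {}
        for e in edges:
            w, u, v = e
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            for r in (ru, rv):
                if r not in cheapest or cheapest[r][0] > w:
                    cheapest[r] = e
        if not cheapest:
            break  # graph is disconnected
        # Contract along all chosen edges at once (one Boruvka phase).
        for w, u, v in set(cheapest.values()):
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                mst.append((w, u, v))
                components -= 1
    return mst

edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]
print(sum(w for w, _, _ in boruvka_mst(4, edges)))  # 7
```

The result in the abstract shows that, with the O(logn)-bit all-to-all communication of the Congested Clique, these O(logn) phases can be compressed into O(1) rounds deterministically.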

O'Donnell, Ryan 
STOC '21: "Improved Quantum Data Analysis ..."
Improved Quantum Data Analysis
Costin Bădescu and Ryan O'Donnell (Carnegie Mellon University, USA) We provide more sample-efficient versions of some basic routines in quantum data analysis, along with simpler proofs. In particular, we give a quantum "Threshold Search" algorithm that requires only O((log^{2} m)/є^{2}) samples of a d-dimensional state ρ. That is, given observables 0 ≤ A_{1}, A_{2}, …, A_{m} ≤ 1 such that tr(ρ A_{i}) ≥ 1/2 for at least one i, the algorithm finds j with tr(ρ A_{j}) ≥ 1/2−є. As a consequence, we obtain a Shadow Tomography algorithm requiring only O((log^{2} m)(logd)/є^{4}) samples, which simultaneously achieves the best known dependence on each parameter m, d, є. This yields the same sample complexity for quantum Hypothesis Selection among m states; we also give an alternative Hypothesis Selection method using O((log^{3} m)/є^{2}) samples. @InProceedings{STOC21p1398, author = {Costin Bădescu and Ryan O'Donnell}, title = {Improved Quantum Data Analysis}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1398--1411}, doi = {10.1145/3406325.3451109}, year = {2021}, } Publisher's Version STOC '21: "Fiber Bundle Codes: Breaking ..." Fiber Bundle Codes: Breaking the N^{1/2} polylog(N) Barrier for Quantum LDPC Codes Matthew B. Hastings, Jeongwan Haah, and Ryan O'Donnell (Station Q, USA; Microsoft Quantum, USA; Carnegie Mellon University, USA) We present a quantum LDPC code family that has distance Ω(N^{3/5}/polylog(N)) and Θ(N^{3/5}) logical qubits, where N is the code length. This is the first quantum LDPC code construction that achieves distance greater than N^{1/2} polylog(N). The construction is based on generalizing the homological product of codes to a fiber bundle. @InProceedings{STOC21p1276, author = {Matthew B. Hastings and Jeongwan Haah and Ryan O'Donnell}, title = {Fiber Bundle Codes: Breaking the <i>N</i><sup>1/2</sup> polylog(<i>N</i>) Barrier for Quantum LDPC Codes}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1276--1288}, doi = {10.1145/3406325.3451005}, year = {2021}, } Publisher's Version 

Oliveira, Igor C. 
STOC '21: "Pseudodeterministic Algorithms ..."
Pseudodeterministic Algorithms and the Structure of Probabilistic Time
Zhenjian Lu, Igor C. Oliveira, and Rahul Santhanam (University of Warwick, UK; University of Oxford, UK) We connect the study of pseudodeterministic algorithms to two major open problems about the structural complexity of BPTIME: proving hierarchy theorems and showing the existence of complete problems. Our main contributions can be summarised as follows. A new pseudorandom generator and its consequences. We build on techniques developed to prove hierarchy theorems for probabilistic time with advice (Fortnow and Santhanam, FOCS 2004) to construct the first unconditional pseudorandom generator of polynomial stretch computable in pseudodeterministic polynomial time (with one bit of advice) that is secure infinitely often against polynomial-time computations. As an application of this construction, we obtain new results about the complexity of generating and representing prime numbers. For instance, we show unconditionally for each ε > 0 that infinitely many primes p_{n} have a succinct representation in the following sense: there is a fixed probabilistic polynomial-time algorithm that generates p_{n} with high probability from its succinct representation of size O(p_{n}^{ε}). This offers an exponential improvement over the running time of previous results, and shows that infinitely many primes have succinct and efficient representations. Structural results for probabilistic time from pseudodeterministic algorithms. Oliveira and Santhanam (STOC 2017) established unconditionally that there is a pseudodeterministic algorithm for the Circuit Acceptance Probability Problem (CAPP) that runs in subexponential time and is correct with high probability over any samplable distribution on circuits on infinitely many input lengths. We show that improving this running time or obtaining a result that holds for every large input length would imply new time hierarchy theorems for probabilistic time. 
In addition, we prove that a worst-case polynomial-time pseudodeterministic algorithm for CAPP would imply that BPP has complete problems. Equivalence between pseudodeterministic constructions and hierarchies. We establish an equivalence between a certain explicit pseudodeterministic construction problem and the existence of strong hierarchy theorems for probabilistic time. More precisely, we show that pseudodeterministically constructing in exponential time strings of large rKt complexity (Oliveira, ICALP 2019) is possible if and only if for every constructive function T(n) ≤ exp(o(exp(n))) we have BPTIME[poly(T)] ⊈ i.o.BPTIME[T]/log T. More generally, these results suggest new approaches for designing pseudodeterministic algorithms for search problems and for unveiling the structure of probabilistic time. @InProceedings{STOC21p303, author = {Zhenjian Lu and Igor C. Oliveira and Rahul Santhanam}, title = {Pseudodeterministic Algorithms and the Structure of Probabilistic Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {303--316}, doi = {10.1145/3406325.3451085}, year = {2021}, } Publisher's Version 

Oshman, Rotem 
STOC '21: "The Communication Complexity ..."
The Communication Complexity of Multiparty Set Disjointness under Product Distributions
Nachum Dershowitz, Rotem Oshman, and Tal Roth (Tel Aviv University, Israel) In the multiparty number-in-hand set disjointness problem, we have k players, with private inputs X_{1},…,X_{k} ⊆ [n]. The players’ goal is to check whether ∩_{ℓ=1}^{k} X_{ℓ} = ∅. It is known that in the shared blackboard model of communication, set disjointness requires Ω(n log k + k) bits of communication, and in the coordinator model, it requires Ω(kn) bits. However, these two lower bounds require that the players’ inputs can be highly correlated. We study the communication complexity of multiparty set disjointness under product distributions, and ask whether the problem becomes significantly easier, as it is known to become in the two-party case. Our main result is a nearly-tight bound of Θ̃(n^{1−1/k} + k) for both the shared blackboard model and the coordinator model. This shows that in the shared blackboard model, as the number of players grows, having independent inputs helps less and less; but in the coordinator model, when k is very large, having independent inputs makes the problem much easier. Both our upper and our lower bounds use new ideas, as the original techniques developed for the two-party case do not scale to more than two players. @InProceedings{STOC21p1194, author = {Nachum Dershowitz and Rotem Oshman and Tal Roth}, title = {The Communication Complexity of Multiparty Set Disjointness under Product Distributions}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1194--1207}, doi = {10.1145/3406325.3451106}, year = {2021}, } Publisher's Version 

Pan, Qinxuan 
STOC '21: "SampleOptimal and Efficient ..."
SampleOptimal and Efficient Learning of Tree Ising Models
Constantinos Daskalakis and Qinxuan Pan (Massachusetts Institute of Technology, USA) We show that n-variable tree-structured Ising models can be learned computationally efficiently to within total variation distance є from an optimal O(n ln n/є^{2}) samples, where O(·) hides an absolute constant which, importantly, does not depend on the model being learned—neither its tree nor the magnitude of its edge strengths, on which we place no assumptions. Our guarantees hold, in fact, for the celebrated Chow-Liu algorithm [1968], using the plug-in estimator for estimating mutual information. While this (or any other) algorithm may fail to identify the structure of the underlying model correctly from a finite sample, we show that it will still learn a tree-structured model that is є-close to the true one in total variation distance, a guarantee called “proper learning.” Our guarantees do not follow from known results for the Chow-Liu algorithm and the ensuing literature on learning graphical models, including the very recent renaissance of algorithms on this learning challenge, which only yield asymptotic consistency results, or sample-suboptimal and/or time-inefficient algorithms, unless further assumptions are placed on the model, such as bounds on the “strengths” of the model’s edges. While we establish guarantees for a widely known and simple algorithm, the analysis that this algorithm succeeds and is sample-optimal is quite complex, requiring a hierarchical classification of the edges into layers with different reconstruction guarantees, depending on their strength, combined with delicate uses of the subadditivity of the squared Hellinger distance over graphical models to control the error accumulation. @InProceedings{STOC21p133, author = {Constantinos Daskalakis and Qinxuan Pan}, title = {Sample-Optimal and Efficient Learning of Tree Ising Models}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {133--146}, doi = {10.1145/3406325.3451006}, year = {2021}, } Publisher's Version 
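The Chow-Liu procedure analyzed in the abstract is simple to state: estimate pairwise mutual information with the plug-in (empirical) estimator, then output a maximum-weight spanning tree under those weights. Below is a hedged minimal sketch of that classical procedure (the function names are ours, and this illustrates only the structure-learning step, not the paper's finite-sample analysis):

```python
import math
from collections import Counter

def chow_liu_tree(samples):
    """Edge set of a maximum-weight spanning tree under empirical
    mutual information (the plug-in estimator).

    samples: list of tuples, one observed value per variable.
    """
    n_vars = len(samples[0])
    m = len(samples)

    def mutual_info(i, j):
        # Plug-in estimate of I(X_i; X_j) from empirical frequencies.
        joint = Counter((s[i], s[j]) for s in samples)
        pi = Counter(s[i] for s in samples)
        pj = Counter(s[j] for s in samples)
        mi = 0.0
        for (a, b), c in joint.items():
            p_ab = c / m
            mi += p_ab * math.log(p_ab / ((pi[a] / m) * (pj[b] / m)))
        return mi

    # Kruskal's algorithm on edges sorted by decreasing empirical MI.
    edges = sorted(((mutual_info(i, j), i, j)
                    for i in range(n_vars) for j in range(i + 1, n_vars)),
                   reverse=True)
    parent = list(range(n_vars))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    return tree
```

The abstract's point is that even when this tree disagrees with the true structure, the resulting model is still є-close in total variation, given O(n ln n/є²) samples.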

Panigrahi, Debmalya 
STOC '21: "Approximate Gomory–Hu Tree ..."
Approximate Gomory–Hu Tree Is Faster Than n – 1 Max-Flows
Jason Li and Debmalya Panigrahi (Carnegie Mellon University, USA; Duke University, USA) The Gomory-Hu tree or cut tree (Gomory and Hu, 1961) is a classic data structure for reporting s−t min-cuts (and by duality, the values of s−t max-flows) for all pairs of vertices s and t in an undirected graph. Gomory and Hu showed that it can be computed using n−1 exact max-flow computations. Surprisingly, this remains the best algorithm for Gomory-Hu trees more than 50 years later, even for approximate min-cuts. In this paper, we break this long-standing barrier and give an algorithm for computing a (1+є)-approximate Gomory-Hu tree using polylog(n) max-flow computations. Specifically, we obtain the runtime bounds we describe below. We obtain a randomized (Monte Carlo) algorithm for undirected, weighted graphs that runs in Õ(m + n^{3/2}) time and returns a (1+є)-approximate Gomory-Hu tree w.h.p. Previously, the best running time known was Õ(n^{5/2}), which is obtained by running Gomory and Hu’s original algorithm on a cut sparsifier of the graph. Next, we obtain a randomized (Monte Carlo) algorithm for undirected, unweighted graphs that runs in m^{4/3+o(1)} time and returns a (1+є)-approximate Gomory-Hu tree w.h.p. This improves on our first result for sparse graphs, namely m = o(n^{9/8}). Previously, the best running time known for unweighted graphs was Õ(mn) for an exact Gomory-Hu tree (Bhalgat et al., STOC 2007); no better result was known if approximations are allowed. As a consequence of our Gomory-Hu tree algorithms, we also solve the (1+є)-approximate all-pairs min-cut and single-source min-cut problems in the same time bounds. (These problems are simpler in that the goal is to only return the s−t min-cut values, and not the min-cuts.) This improves on the recent algorithm for these problems in Õ(n^{2}) time due to Abboud et al. (FOCS 2020). 
@InProceedings{STOC21p1738, author = {Jason Li and Debmalya Panigrahi}, title = {Approximate Gomory–Hu Tree Is Faster Than <i>n</i> – 1 Max-Flows}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1738--1748}, doi = {10.1145/3406325.3451112}, year = {2021}, } Publisher's Version STOC '21: "Vertex Connectivity in Polylogarithmic ..." Vertex Connectivity in Polylogarithmic Max-Flows Jason Li, Danupon Nanongkai, Debmalya Panigrahi, Thatchaphol Saranurak, and Sorrachai Yingchareonthawornchai (Carnegie Mellon University, USA; University of Copenhagen, Denmark; KTH, Sweden; Duke University, USA; University of Michigan, USA; Aalto University, Finland) The vertex connectivity of an m-edge n-vertex undirected graph is the smallest number of vertices whose removal disconnects the graph, or leaves only a singleton vertex. In this paper, we give a reduction from the vertex connectivity problem to a set of max-flow instances. Using this reduction, we can solve vertex connectivity in Õ(m^{α}) time for any α ≥ 1, if there is an m^{α}-time max-flow algorithm. Using the current best max-flow algorithm that runs in m^{4/3+o(1)} time (Kathuria, Liu, and Sidford, FOCS 2020), this yields an m^{4/3+o(1)}-time vertex connectivity algorithm. This is the first improvement in the running time of the vertex connectivity problem in over 20 years, the previous best being an Õ(mn)-time algorithm due to Henzinger, Rao, and Gabow (FOCS 1996). Indeed, no algorithm with an o(mn) running time was known before our work, even if we assume an Õ(m)-time max-flow algorithm. 
Our new technique is robust enough to also improve the best Õ(mn)-time bound for directed vertex connectivity to mn^{1−1/12+o(1)} time. @InProceedings{STOC21p317, author = {Jason Li and Danupon Nanongkai and Debmalya Panigrahi and Thatchaphol Saranurak and Sorrachai Yingchareonthawornchai}, title = {Vertex Connectivity in Polylogarithmic Max-Flows}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {317--329}, doi = {10.1145/3406325.3451088}, year = {2021}, } Publisher's Version 

Paramonov, Dmitry 
STOC '21: "Almost Optimal SuperConstantPass ..."
Almost Optimal SuperConstantPass Streaming Lower Bounds for Reachability
Lijie Chen, Gillat Kol, Dmitry Paramonov, Raghuvansh R. Saxena, Zhao Song, and Huacheng Yu (Massachusetts Institute of Technology, USA; Princeton University, USA; Institute for Advanced Study at Princeton, USA) We give an almost quadratic n^{2−o(1)} lower bound on the space consumption of any o(√log n)-pass streaming algorithm solving the (directed) s-t reachability problem. This means that any such algorithm must essentially store the entire graph. As corollaries, we obtain almost quadratic space lower bounds for additional fundamental problems, including maximum matching, shortest path, matrix rank, and linear programming. Our main technical contribution is the definition and construction of set hiding graphs, that may be of independent interest: we give a general way of encoding a set S ⊆ [k] as a directed graph with n = k^{1+o(1)} vertices, such that deciding whether i ∈ S boils down to deciding if t_{i} is reachable from s_{i}, for a specific pair of vertices (s_{i},t_{i}) in the graph. Furthermore, we prove that our graph “hides” S, in the sense that no low-space streaming algorithm with a small number of passes can learn (almost) anything about S. @InProceedings{STOC21p570, author = {Lijie Chen and Gillat Kol and Dmitry Paramonov and Raghuvansh R. Saxena and Zhao Song and Huacheng Yu}, title = {Almost Optimal Super-Constant-Pass Streaming Lower Bounds for Reachability}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {570--583}, doi = {10.1145/3406325.3451038}, year = {2021}, } Publisher's Version 

Parisi, Daniel 
STOC '21: "Entropy Decay in the Swendsen–Wang ..."
Entropy Decay in the Swendsen–Wang Dynamics on ℤ^{d}
Antonio Blanca, Pietro Caputo, Daniel Parisi, Alistair Sinclair, and Eric Vigoda (Pennsylvania State University, USA; Roma Tre University, Italy; University of California at Berkeley, USA; Georgia Institute of Technology, USA) We study the mixing time of the Swendsen-Wang dynamics for the ferromagnetic Ising and Potts models on the integer lattice ℤ^{d}. This dynamics is a widely used Markov chain that has largely resisted sharp analysis because it is non-local, i.e., it changes the entire configuration in one step. We prove that, whenever strong spatial mixing (SSM) holds, the mixing time on any n-vertex cube in ℤ^{d} is O(log n), and we prove this is tight by establishing a matching lower bound. The previous best known bound was O(n). SSM is a standard condition corresponding to exponential decay of correlations with distance between spins on the lattice and is known to hold in d=2 dimensions throughout the high-temperature (single phase) region. Our result follows from a modified log-Sobolev inequality, which expresses the fact that the dynamics contracts relative entropy at a constant rate at each step. The proof of this fact utilizes a new factorization of the entropy in the joint probability space over spins and edges that underlies the Swendsen-Wang dynamics, which extends to general bipartite graphs of bounded degree. This factorization leads to several additional results, including mixing time bounds for a number of natural local and non-local Markov chains on the joint space, as well as for the standard random-cluster dynamics. @InProceedings{STOC21p1551, author = {Antonio Blanca and Pietro Caputo and Daniel Parisi and Alistair Sinclair and Eric Vigoda}, title = {Entropy Decay in the Swendsen–Wang Dynamics on ℤ<sup><i>d</i></sup>}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1551--1564}, doi = {10.1145/3406325.3451095}, year = {2021}, } Publisher's Version 

Pass, Rafael 
STOC '21: "Cryptography from SublinearTime ..."
Cryptography from SublinearTime AverageCase Hardness of TimeBounded Kolmogorov Complexity
Yanyi Liu and Rafael Pass (Cornell University, USA) Let MK^{t}P[s] be the set of strings x such that K^{t}(x) ≤ s(|x|), where K^{t}(x) denotes the t-bounded Kolmogorov complexity of the truth-table described by x. Our main theorem shows that for an appropriate notion of mild average-case hardness, for every ε>0, polynomial t(n) ≥ (1+ε)n, and every “nice” class F of super-polynomial functions, the following are equivalent: (i) the existence of some function T ∈ F such that T-hard one-way functions (OWFs) exist (with non-uniform security); (ii) the existence of some function T ∈ F such that MK^{t}P[T^{−1}] is mildly average-case hard with respect to sublinear-time non-uniform algorithms (with running time n^{δ} for some 0<δ<1). For instance, the existence of subexponentially-hard (resp. quasi-polynomially-hard) OWFs is equivalent to mild average-case hardness of MK^{t}P[poly log n] (resp. MK^{t}P[2^{O(√log n)}]) w.r.t. sublinear-time non-uniform algorithms. We additionally note that if we want to deduce T-hard OWFs where security holds w.r.t. uniform T-time probabilistic attackers (i.e., uniformly-secure OWFs), it suffices to assume sublinear-time hardness of MK^{t}P w.r.t. uniform probabilistic sublinear-time attackers. We complement this result by proving lower bounds that come surprisingly close to what is required to unconditionally deduce the existence of (uniformly-secure) OWFs: MK^{t}P[polylog n] is worst-case hard w.r.t. uniform probabilistic sublinear-time algorithms, and MK^{t}P[n−log n] is mildly average-case hard for all O(t(n)/n^{3})-time deterministic algorithms. @InProceedings{STOC21p722, author = {Yanyi Liu and Rafael Pass}, title = {Cryptography from Sublinear-Time Average-Case Hardness of Time-Bounded Kolmogorov Complexity}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {722--735}, doi = {10.1145/3406325.3451121}, year = {2021}, } Publisher's Version STOC '21: "Indistinguishability Obfuscation ..." 
Indistinguishability Obfuscation from Circular Security Romain Gay and Rafael Pass (IBM Research, Switzerland; Cornell Tech, USA) We show the existence of indistinguishability obfuscators (iO) for general circuits assuming subexponential security of: (a) the Learning with Errors (LWE) assumption (with subexponential modulus-to-noise ratio); (b) a circular security conjecture regarding the Gentry-Sahai-Waters (GSW) encryption scheme and a Packed version of Regev's encryption scheme. The circular security conjecture states that a notion of leakage-resilient security, that we prove is satisfied by GSW assuming LWE, is retained in the presence of an encrypted key-cycle involving GSW and Packed Regev. @InProceedings{STOC21p736, author = {Romain Gay and Rafael Pass}, title = {Indistinguishability Obfuscation from Circular Security}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {736--749}, doi = {10.1145/3406325.3451070}, year = {2021}, } Publisher's Version 

Peebles, John 
STOC '21: "Optimal Testing of Discrete ..."
Optimal Testing of Discrete Distributions with High Probability
Ilias Diakonikolas, Themis Gouleakis, Daniel M. Kane, John Peebles, and Eric Price (University of Wisconsin-Madison, USA; MPI-INF, Germany; University of California at San Diego, USA; Princeton University, USA; University of Texas at Austin, USA) We study the problem of testing discrete distributions with a focus on the high probability regime. Specifically, given samples from one or more discrete distributions, a property P, and parameters 0< є, δ <1, we want to distinguish with probability at least 1−δ whether these distributions satisfy P or are є-far from P in total variation distance. Most prior work in distribution testing studied the constant confidence case (corresponding to δ = Ω(1)), and provided sample-optimal testers for a range of properties. While one can always boost the confidence probability of any such tester by black-box amplification, this generic boosting method typically leads to suboptimal sample bounds. Here we study the following broad question: For a given property P, can we characterize the sample complexity of testing P as a function of all relevant problem parameters, including the error probability δ? Prior to this work, uniformity testing was the only statistical task whose sample complexity had been characterized in this setting. As our main results, we provide the first algorithms for closeness and independence testing that are sample-optimal, within constant factors, as a function of all relevant parameters. We also show matching information-theoretic lower bounds on the sample complexity of these problems. Our techniques naturally extend to give optimal testers for related problems. To illustrate the generality of our methods, we give optimal algorithms for testing collections of distributions and testing closeness with unequal sized samples. @InProceedings{STOC21p542, author = {Ilias Diakonikolas and Themis Gouleakis and Daniel M. Kane and John Peebles and Eric Price}, title = {Optimal Testing of Discrete Distributions with High Probability}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {542--555}, doi = {10.1145/3406325.3450997}, year = {2021}, } Publisher's Version 

Peleg, Shir 
STOC '21: "Polynomial Time Deterministic ..."
Polynomial Time Deterministic Identity Testing Algorithm for Σ^{[3]}ΠΣΠ^{[2]} Circuits via Edelstein–Kelly Type Theorem for Quadratic Polynomials
Shir Peleg and Amir Shpilka (Tel Aviv University, Israel) In this work we resolve conjectures of Beecken, Mittmann, and Saxena [BMS13] and Gupta [Gupta14], by proving an analog of a theorem of Edelstein and Kelly for quadratic polynomials. As an immediate corollary we obtain the first deterministic polynomial-time black-box algorithm for testing zeroness of Σ^{[3]}ΠΣΠ^{[2]} circuits. @InProceedings{STOC21p259, author = {Shir Peleg and Amir Shpilka}, title = {Polynomial Time Deterministic Identity Testing Algorithm for Σ<sup>[3]</sup>ΠΣΠ<sup>[2]</sup> Circuits via Edelstein–Kelly Type Theorem for Quadratic Polynomials}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {259--271}, doi = {10.1145/3406325.3451013}, year = {2021}, } Publisher's Version Info 

Peri, Noam 
STOC '21: "Expander Random Walks: A FourierAnalytic ..."
Expander Random Walks: A FourierAnalytic Approach
Gil Cohen, Noam Peri, and Amnon Ta-Shma (Tel Aviv University, Israel) In this work we ask the following basic question: assume the vertices of an expander graph are labelled by 0,1. What “test” functions f : {0,1}^{t} → {0,1} cannot distinguish t independent samples from those obtained by a random walk? The expander hitting property due to Ajtai, Komlós, and Szemerédi (STOC 1987) is captured by the AND test function, whereas the fundamental expander Chernoff bound due to Gillman (SICOMP 1998) and Healy (Computational Complexity 2008) is about test functions indicating whether the weight is close to the mean. In fact, it is known that all threshold functions are fooled by a random walk (Kipnis and Varadhan, Communications in Mathematical Physics 1986). Recently, it was shown that even the highly sensitive PARITY function is fooled by a random walk (Ta-Shma, STOC 2017). We focus on balanced labels. Our first main result is proving that all symmetric functions are fooled by a random walk. Put differently, we prove a central limit theorem (CLT) for expander random walks with respect to the total variation distance, significantly strengthening the classic CLT for Markov chains that is established with respect to the Kolmogorov distance (Kipnis and Varadhan, Communications in Mathematical Physics 1986). Our approach significantly deviates from prior works. We first study how well a Fourier character χ_{S} is fooled by a random walk as a function of S. Then, given a test function f, we expand f in the Fourier basis and combine the above with known results on the Fourier spectrum of f. We also proceed further and consider general test functions, not necessarily symmetric. As our approach is Fourier analytic, it is general enough to analyze such versatile test functions. For our second result, we prove that random walks on sufficiently good expander graphs fool test functions computed by AC^{0} circuits, read-once branching programs, and functions with bounded query complexity. 
@InProceedings{STOC21p1643, author = {Gil Cohen and Noam Peri and Amnon Ta-Shma}, title = {Expander Random Walks: A Fourier-Analytic Approach}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1643--1655}, doi = {10.1145/3406325.3451049}, year = {2021}, } Publisher's Version Info 
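To make the notion of a test function being "fooled" concrete, here is a small illustrative computation (our own sketch, not from the paper): the exact probability that the PARITY test outputs 1 on the labels seen along a t-step walk, computed by dynamic programming over pairs (current vertex, running parity). For an i.i.d. uniform sampler (a uniform transition matrix) with balanced labels, parity is exactly unbiased; plugging in a genuine expander's transition matrix lets one measure the walk's deviation from that baseline.

```python
def walk_parity_prob(trans, labels, t):
    """Pr[parity of the t visited labels is 1] for a walk started at a
    uniformly random vertex.

    trans[u][v] = Pr[step u -> v]; labels are in {0, 1}; t >= 1.
    """
    n = len(trans)
    # state[(v, p)] = Pr[walk is currently at v with running parity p]
    state = {(v, labels[v]): 1.0 / n for v in range(n)}
    for _ in range(t - 1):
        nxt = {}
        for (u, p), pr in state.items():
            for v in range(n):
                key = (v, p ^ labels[v])
                nxt[key] = nxt.get(key, 0.0) + pr * trans[u][v]
        state = nxt
    return sum(pr for (v, p), pr in state.items() if p == 1)
```

The state space has only 2n entries, so the exact distribution of any symmetric test (parity, thresholds, Hamming weight) over walk samples is computable the same way, by enlarging the tracked statistic.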

Perkins, Will 
STOC '21: "Frozen 1RSB Structure of ..."
Frozen 1RSB Structure of the Symmetric Ising Perceptron
Will Perkins and Changji Xu (University of Illinois at Chicago, USA; Harvard University, USA) We prove, under an assumption on the critical points of a real-valued function, that the symmetric Ising perceptron exhibits the ‘frozen 1RSB’ structure conjectured by Krauth and Mézard in the physics literature; that is, typical solutions of the model lie in clusters of vanishing entropy density. Moreover, we prove this in a very strong form conjectured by Huang, Wong, and Kabashima: a typical solution of the model is isolated with high probability and the Hamming distance to all other solutions is linear in the dimension. The frozen 1RSB scenario is part of a recent and intriguing explanation of the performance of learning algorithms by Baldassi, Ingrosso, Lucibello, Saglietti, and Zecchina. We prove this structural result by comparing the symmetric Ising perceptron model to a planted model and proving a comparison result between the two models. Our main technical tool towards this comparison is an inductive argument for the concentration of the logarithm of the number of solutions in the model. @InProceedings{STOC21p1579, author = {Will Perkins and Changji Xu}, title = {Frozen 1RSB Structure of the Symmetric Ising Perceptron}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1579--1588}, doi = {10.1145/3406325.3451119}, year = {2021}, } Publisher's Version 

Pettie, Seth 
STOC '21: "Information Theoretic Limits ..."
Information Theoretic Limits of Cardinality Estimation: Fisher Meets Shannon
Seth Pettie and Dingyu Wang (University of Michigan, USA) Estimating the cardinality (number of distinct elements) of a large multiset is a classic problem in streaming and sketching, dating back to Flajolet and Martin’s classic Probabilistic Counting (PCSA) algorithm from 1983. In this paper we study the intrinsic tradeoff between the space complexity of the sketch and its estimation error in the random oracle model. We define a new measure of efficiency for cardinality estimators called the Fisher-Shannon (Fish) number H/I. It captures the tension between the limiting Shannon entropy (H) of the sketch and its normalized Fisher information (I), which characterizes the variance of a statistically efficient, asymptotically unbiased estimator. Our results are as follows. (i) We prove that all base-q variants of Flajolet and Martin’s PCSA sketch have Fish-number H_{0}/I_{0} ≈ 1.98016 and that every base-q variant of (Hyper)LogLog has Fish-number worse than H_{0}/I_{0}, but that they tend to H_{0}/I_{0} in the limit as q→ ∞. Here H_{0},I_{0} are precisely defined constants. (ii) We describe a sketch called Fishmonger that is based on a smoothed, entropy-compressed variant of PCSA with a different estimator function. It is proved that with high probability, Fishmonger processes a multiset of [U] such that at all times, its space is O(log^{2} log U) + (1+o(1))(H_{0}/I_{0})b ≈ 1.98b bits and its standard error is 1/√b. (iii) We give circumstantial evidence that H_{0}/I_{0} is the optimum Fish-number of mergeable sketches for Cardinality Estimation. We define a class of linearizable sketches and prove that no member of this class can beat H_{0}/I_{0}. The popular mergeable sketches are, in fact, also linearizable. 
@InProceedings{STOC21p556, author = {Seth Pettie and Dingyu Wang}, title = {Information Theoretic Limits of Cardinality Estimation: Fisher Meets Shannon}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {556--569}, doi = {10.1145/3406325.3451032}, year = {2021}, } Publisher's Version 
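As background for the PCSA sketch analyzed above: each stream element is hashed, routed to one of m buckets, and the position of the lowest set bit of the remaining hash bits is recorded in that bucket's bitmap; the estimator combines the lowest unset position across buckets with Flajolet and Martin's correction constant. Below is a hedged toy implementation of this textbook base-2 scheme (the class name, bucket count, and the specific choice of SHA-256 as the hash are our own illustrative choices, not from the paper):

```python
import hashlib

class PCSA:
    """Toy Probabilistic Counting with Stochastic Averaging (base 2)."""

    def __init__(self, num_buckets=64):
        self.m = num_buckets
        self.bitmaps = [0] * num_buckets  # one bitmap of rho-values per bucket

    def _hash(self, item):
        # Illustrative hash choice: first 8 bytes of SHA-256 as an integer.
        h = hashlib.sha256(str(item).encode()).digest()
        return int.from_bytes(h[:8], 'big')

    def add(self, item):
        h = self._hash(item)
        bucket = h % self.m
        rest = h // self.m
        # rho = position of the lowest set bit of the remaining hash bits.
        r = (rest & -rest).bit_length() - 1 if rest else 63
        self.bitmaps[bucket] |= 1 << r

    def estimate(self):
        PHI = 0.77351  # Flajolet-Martin correction constant
        total_r = sum(self._lowest_zero(b) for b in self.bitmaps)
        return self.m / PHI * 2 ** (total_r / self.m)

    @staticmethod
    def _lowest_zero(bitmap):
        # Position of the lowest unset bit: the per-bucket statistic R.
        r = 0
        while bitmap & (1 << r):
            r += 1
        return r
```

Since the sketch state is a bitwise OR of per-element contributions, inserting duplicates never changes it, and two sketches over different streams merge by OR-ing bitmaps, which is the mergeability property the abstract's item (iii) is about.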

Pich, Ján 
STOC '21: "Strong Conondeterministic ..."
Strong Conondeterministic Lower Bounds for NP Cannot Be Proved Feasibly
Ján Pich and Rahul Santhanam (Czech Academy of Sciences, Czechia; University of Oxford, UK) We show unconditionally that Cook’s theory PV formalizing poly-time reasoning cannot prove, for any nondeterministic poly-time machine M defining a language L(M), that L(M) is inapproximable by co-nondeterministic circuits of subexponential size. In fact, our unprovability result holds also for a theory which supports a fragment of Jeřábek’s theory of approximate counting APC_{1}. We also show similar unconditional unprovability results for the conjecture of Rudich about the existence of super-bits. @InProceedings{STOC21p223, author = {Ján Pich and Rahul Santhanam}, title = {Strong Co-nondeterministic Lower Bounds for NP Cannot Be Proved Feasibly}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {223--233}, doi = {10.1145/3406325.3451117}, year = {2021}, } Publisher's Version 

Pilipczuk, Marcin 
STOC '21: "Finding Large Induced Sparse ..."
Finding Large Induced Sparse Subgraphs in C_{>t}-Free Graphs in Quasipolynomial Time
Peter Gartland, Daniel Lokshtanov, Marcin Pilipczuk, Michał Pilipczuk, and Paweł Rzążewski (University of California at Santa Barbara, USA; University of Warsaw, Poland; Warsaw University of Technology, Poland) For an integer t, a graph G is called C_{>t}-free if G does not contain any induced cycle on more than t vertices. We prove the following statement: for every pair of integers d and t and a statement φ, there exists an algorithm that, given an n-vertex C_{>t}-free graph G with weights on vertices, finds in time n^{O(log^{3} n)} a maximum-weight vertex subset S such that G[S] has degeneracy at most d and satisfies φ. The running time can be improved to n^{O(log^{2} n)} assuming G is P_{t}-free, that is, G does not contain an induced path on t vertices. This expands the recent results of the authors [FOCS 2020 and SOSA 2021] on the Maximum Weight Independent Set problem on P_{t}-free graphs in two directions: by encompassing the more general setting of C_{>t}-free graphs, and by being applicable to a much wider variety of problems, such as Maximum Weight Induced Forest or Maximum Weight Induced Planar Graph. @InProceedings{STOC21p330, author = {Peter Gartland and Daniel Lokshtanov and Marcin Pilipczuk and Michał Pilipczuk and Paweł Rzążewski}, title = {Finding Large Induced Sparse Subgraphs in <i>C<sub>>t</sub></i>-Free Graphs in Quasipolynomial Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {330--341}, doi = {10.1145/3406325.3451034}, year = {2021}, } Publisher's Version 

Pilipczuk, Michał 
STOC '21: "Finding Large Induced Sparse ..."
Finding Large Induced Sparse Subgraphs in C_{>t}-Free Graphs in Quasipolynomial Time
Peter Gartland, Daniel Lokshtanov, Marcin Pilipczuk, Michał Pilipczuk, and Paweł Rzążewski (University of California at Santa Barbara, USA; University of Warsaw, Poland; Warsaw University of Technology, Poland) For an integer t, a graph G is called C_{>t}-free if G does not contain any induced cycle on more than t vertices. We prove the following statement: for every pair of integers d and t and a statement φ, there exists an algorithm that, given an n-vertex C_{>t}-free graph G with weights on vertices, finds in time n^{O(log^{3} n)} a maximum-weight vertex subset S such that G[S] has degeneracy at most d and satisfies φ. The running time can be improved to n^{O(log^{2} n)} assuming G is P_{t}-free, that is, G does not contain an induced path on t vertices. This expands the recent results of the authors [FOCS 2020 and SOSA 2021] on the Maximum Weight Independent Set problem on P_{t}-free graphs in two directions: by encompassing the more general setting of C_{>t}-free graphs, and by being applicable to a much wider variety of problems, such as Maximum Weight Induced Forest or Maximum Weight Induced Planar Graph. @InProceedings{STOC21p330, author = {Peter Gartland and Daniel Lokshtanov and Marcin Pilipczuk and Michał Pilipczuk and Paweł Rzążewski}, title = {Finding Large Induced Sparse Subgraphs in <i>C<sub>>t</sub></i>-Free Graphs in Quasipolynomial Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {330--341}, doi = {10.1145/3406325.3451034}, year = {2021}, } Publisher's Version 

Pitassi, Toniann  STOC '21: "Automating Algebraic Proof ..." 