STOC 2021 – Author Index 

Aamand, Anders 
STOC '21: "Load Balancing with Dynamic ..."
Load Balancing with Dynamic Set of Balls and Bins
Anders Aamand, Jakob Bæk Tejs Knudsen, and Mikkel Thorup (University of Copenhagen, Denmark) In dynamic load balancing, we wish to distribute balls into bins in an environment where both balls and bins can be added and removed. We want to minimize the maximum load of any bin, but we also want to minimize the number of balls and bins that are affected when adding or removing a ball or a bin. We want a hashing-style solution where, given the ID of a ball, we can find its bin efficiently. We are given a user-specified balancing parameter c=1+ε, where ε∈(0,1). Let n and m be the current number of balls and bins. Then we want no bin with load above C=⌈cn/m⌉, referred to as the capacity of the bins. We present a scheme where we can locate a ball by checking 1+O(log(1/ε)) bins in expectation. When inserting or deleting a ball, we expect to move O(1/ε) balls, and when inserting or deleting a bin, we expect to move O(C/ε) balls. Previous bounds were off by a factor 1/ε. The above bounds are best possible when C=O(1), but for larger C we can do much better: we define f=εC when C ≤ log(1/ε), f=ε√C·√(log(1/(ε√C))) when log(1/ε) ≤ C < 1/(2ε²), and f=1 when C ≥ 1/(2ε²). We show that we expect to move O(1/f) balls when inserting or deleting a ball, and O(C/f) balls when inserting or deleting a bin. Moreover, when C ≥ log(1/ε), we can search for a ball by checking only O(1) bins in expectation. For the bounds with larger C, we first have to resolve a much simpler probabilistic problem. Place n balls in m bins of capacity C, one ball at a time. Each ball picks a uniformly random non-full bin. We show that in expectation and with high probability, the fraction of non-full bins is Θ(f). Then the expected number of bins that a new ball would have to visit to find one that is not full is Θ(1/f). As it turns out, this is also the complexity of an insertion in our more complicated scheme where both balls and bins can be added and removed. 
@InProceedings{STOC21p1262, author = {Anders Aamand and Jakob Bæk Tejs Knudsen and Mikkel Thorup}, title = {Load Balancing with Dynamic Set of Balls and Bins}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1262--1275}, doi = {10.1145/3406325.3451107}, year = {2021}, } 
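The simpler probabilistic process described at the end of the abstract is easy to simulate. The sketch below is an illustrative simulation only (the function name and parameters are mine, and this is not the authors' hashing scheme): it places n balls one at a time, each into a uniformly random non-full bin of capacity C, and reports the final fraction of non-full bins.

```python
import random

def fraction_nonfull(n, m, C, seed=0):
    """Place n balls into m bins of capacity C, one ball at a time,
    each ball choosing a uniformly random non-full bin.
    Returns the fraction of non-full bins at the end."""
    assert n <= m * C, "cannot place more balls than total capacity"
    rng = random.Random(seed)
    load = [0] * m
    nonfull = list(range(m))          # indices of bins with load < C
    for _ in range(n):
        i = rng.randrange(len(nonfull))
        b = nonfull[i]
        load[b] += 1
        if load[b] == C:              # bin became full: swap-remove it
            nonfull[i] = nonfull[-1]
            nonfull.pop()
    return len(nonfull) / m
```

Note the deterministic lower bound: since full bins hold exactly C balls, at least (mC−n)/C bins must remain non-full, which the simulation respects regardless of the random choices.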

Aamari, Eddie 
STOC '21: "Statistical Query Complexity ..."
Statistical Query Complexity of Manifold Estimation
Eddie Aamari and Alexander Knop (LPSM, France; Sorbonne University, France; University of Paris, France; CNRS, France; University of California at San Diego, USA) This paper studies the statistical query (SQ) complexity of estimating d-dimensional submanifolds in ℝ^{n}. We propose a purely geometric algorithm called Manifold Propagation, which reduces the problem to three natural geometric routines: projection, tangent space estimation, and point detection. We then provide constructions of these geometric routines in the SQ framework. Given an adversarial STAT(τ) oracle and a target Hausdorff distance precision ε = Ω(τ^{2/(d+1)}), the resulting SQ manifold reconstruction algorithm has query complexity O(n polylog(n) ε^{−d/2}), which is proved to be nearly optimal. In the process, we establish low-rank matrix completion results for SQs and lower bounds for randomized SQ estimators in general metric spaces. @InProceedings{STOC21p116, author = {Eddie Aamari and Alexander Knop}, title = {Statistical Query Complexity of Manifold Estimation}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {116--122}, doi = {10.1145/3406325.3451135}, year = {2021}, } 

Aaronson, Scott 
STOC '21: "Degree vs. Approximate Degree ..."
Degree vs. Approximate Degree and Quantum Implications of Huang’s Sensitivity Theorem
Scott Aaronson, Shalev Ben-David, Robin Kothari, Shravas Rao, and Avishay Tal (University of Texas at Austin, USA; University of Waterloo, Canada; Microsoft Quantum, USA; Microsoft Research, USA; Northwestern University, USA; University of California at Berkeley, USA) Based on the recent breakthrough of Huang (2019), we show that for any total Boolean function f: (1) deg(f) = O(adeg(f)^2), i.e., the degree of f is at most quadratic in the approximate degree of f; this is optimal, as witnessed by the OR function. (2) D(f) = O(Q(f)^4), i.e., the deterministic query complexity of f is at most quartic in the quantum query complexity of f; this matches the known separation (up to log factors) due to Ambainis, Balodis, Belovs, Lee, Santha, and Smotrovs (2017). We apply these results to resolve the quantum analogue of the Aanderaa–Karp–Rosenberg conjecture. We show that if f is a non-trivial monotone graph property of an n-vertex graph specified by its adjacency matrix, then Q(f) = Ω(n), which is also optimal. We also show that the approximate degree of any read-once formula on n variables is Θ(√n). @InProceedings{STOC21p1330, author = {Scott Aaronson and Shalev Ben-David and Robin Kothari and Shravas Rao and Avishay Tal}, title = {Degree vs. Approximate Degree and Quantum Implications of Huang’s Sensitivity Theorem}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1330--1342}, doi = {10.1145/3406325.3451047}, year = {2021}, } 
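For small n, the exact degree deg(f) appearing in the theorem can be computed directly from the multilinear (Möbius) expansion of f; a minimal sketch follows (the function name is mine; the approximate degree adeg(f), which requires solving a linear program, is not computed here).

```python
from itertools import product

def degree(f, n):
    """Degree of the unique multilinear real polynomial agreeing with the
    Boolean function f on {0,1}^n.  The coefficient of the monomial
    prod_{i in S} x_i is the Moebius sum
    sum_{T subseteq S} (-1)^{|S|-|T|} f(T)."""
    best = 0
    for S in product([0, 1], repeat=n):          # S = candidate monomial
        coeff = 0
        for T in product([0, 1], repeat=n):      # subsets T of S
            if all(t <= s for t, s in zip(T, S)):
                coeff += (-1) ** (sum(S) - sum(T)) * f(T)
        if coeff != 0:
            best = max(best, sum(S))
    return best
```

For instance, OR and parity on 3 bits both have full degree 3, while a dictator function has degree 1.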

Abboud, Amir 
STOC '21: "Subcubic Algorithms for Gomory–Hu ..."
Subcubic Algorithms for Gomory–Hu Tree in Unweighted Graphs
Amir Abboud, Robert Krauthgamer, and Ohad Trabelsi (Weizmann Institute of Science, Israel) Every undirected graph G has a (weighted) cut-equivalent tree T, commonly named after Gomory and Hu who discovered it in 1961. Both T and G have the same node set, and for every node pair s,t, the minimum (s,t)-cut in T is also an exact minimum (s,t)-cut in G. We give the first subcubic-time algorithm that constructs such a tree for a simple graph G (unweighted with no parallel edges). Its time complexity is Õ(n^{2.5}), for n=|V(G)|; previously, only Õ(n^{3}) was known, except for restricted cases like sparse graphs. Consequently, we obtain the first algorithm for All-Pairs Max-Flow in simple graphs that breaks the cubic-time barrier. Gomory and Hu compute this tree using n−1 queries to (single-pair) Max-Flow; the new algorithm can be viewed as a fine-grained reduction to Õ(√n) Max-Flow computations on n-node graphs. @InProceedings{STOC21p1725, author = {Amir Abboud and Robert Krauthgamer and Ohad Trabelsi}, title = {Subcubic Algorithms for Gomory–Hu Tree in Unweighted Graphs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1725--1737}, doi = {10.1145/3406325.3451073}, year = {2021}, } 
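For intuition, the classical construction with n−1 min-cut queries can be sketched at toy scale in Gusfield's simplified variant, substituting a brute-force min-cut for the Max-Flow subroutine (all names are mine; this is an illustration of the cut-equivalent tree, nowhere near the paper's Õ(n^{2.5}) algorithm).

```python
from itertools import combinations

def min_cut(adj, n, s, t):
    """Brute-force minimum (s,t)-cut in an unweighted graph on n nodes.
    Returns (cut value, vertex side containing s).  Exponential; toy-sized only."""
    best, side = float("inf"), None
    rest = [v for v in range(n) if v not in (s, t)]
    for k in range(len(rest) + 1):
        for extra in combinations(rest, k):
            S = {s, *extra}
            val = sum(1 for u, v in adj if (u in S) != (v in S))
            if val < best:
                best, side = val, S
    return best, side

def gomory_hu(adj, n):
    """Gusfield's variant of Gomory-Hu: n-1 min-cut calls, no contractions.
    Returns tree edges (i, parent[i], weight)."""
    parent = [0] * n
    weight = [0] * n
    for i in range(1, n):
        w, S = min_cut(adj, n, i, parent[i])
        weight[i] = w
        for j in range(i + 1, n):
            if parent[j] == parent[i] and j in S:
                parent[j] = i
    return [(i, parent[i], weight[i]) for i in range(1, n)]
```

The defining property can then be checked exhaustively: for every pair s,t, the minimum edge weight on the tree path between s and t equals the minimum (s,t)-cut in the graph.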

Aho, Alfred V. 
STOC '21: "Computational Thinking in ..."
Computational Thinking in Programming Language and Compiler Design (Keynote)
Alfred V. Aho (Columbia University, USA) Abstractions and algorithms are at the heart of computational thinking. In this talk I will discuss the evolution of the theory and practice of programming language and compiler design through the lens of computational thinking. Many of the key concepts in this area were introduced at the ACM Symposium on the Theory of Computing. @InProceedings{STOC21p1, author = {Alfred V. Aho}, title = {Computational Thinking in Programming Language and Compiler Design (Keynote)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1--1}, doi = {10.1145/3406325.3465350}, year = {2021}, } 

Alimohammadi, Yeganeh 
STOC '21: "Fractionally Log-Concave and ..."
Fractionally Log-Concave and Sector-Stable Polynomials: Counting Planar Matchings and More
Yeganeh Alimohammadi, Nima Anari, Kirankumar Shiragur, and Thuy-Duong Vuong (Stanford University, USA) We show fully polynomial-time randomized approximation schemes (FPRAS) for counting matchings of a given size, or more generally sampling/counting monomer-dimer systems in planar, not-necessarily-bipartite, graphs. While perfect matchings on planar graphs can be counted exactly in polynomial time, counting non-perfect matchings was shown by Jerrum (J Stat Phys 1987) to be #P-hard, who also raised the question of whether efficient approximate counting is possible. We answer this affirmatively by showing that the multi-site Glauber dynamics on the set of monomers in a monomer-dimer system always mixes rapidly, and that this dynamics can be implemented efficiently on downward-closed families of graphs where counting perfect matchings is tractable. As further applications of our results, we show how to sample efficiently using multi-site Glauber dynamics from partition-constrained strongly Rayleigh distributions, and nonsymmetric determinantal point processes. In order to analyze mixing properties of the multi-site Glauber dynamics, we establish two notions for generating polynomials of discrete set-valued distributions: sector-stability and fractional log-concavity. These notions generalize well-studied properties like real-stability and log-concavity, but, unlike them, degrade robustly under useful transformations applied to the distribution. We relate these notions to pairwise correlations in the underlying distribution and the notion of spectral independence introduced by Anari et al. (FOCS 2020), providing a new tool for establishing spectral independence based on geometry of polynomials. 
As a byproduct of our techniques, we show that polynomials avoiding roots in a sector of the complex plane must satisfy what we call fractional log-concavity; this generalizes a classic result established by Gårding (J Math Mech 1959), who showed that homogeneous polynomials with no roots in a half-plane must be log-concave over the positive orthant. @InProceedings{STOC21p433, author = {Yeganeh Alimohammadi and Nima Anari and Kirankumar Shiragur and Thuy-Duong Vuong}, title = {Fractionally Log-Concave and Sector-Stable Polynomials: Counting Planar Matchings and More}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {433--446}, doi = {10.1145/3406325.3451123}, year = {2021}, } 
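The objects being counted are easy to enumerate at toy scale. The recursion below computes the number of matchings of every size exactly; it takes exponential time in general, which is precisely the blow-up the paper's FPRAS avoids on large graphs (the function name is mine).

```python
def matchings_by_size(edges):
    """Count matchings of every size via the standard edge recursion:
    either the first edge is unused, or it is used and every edge
    touching its endpoints is removed.  Returns {size: count}."""
    if not edges:
        return {0: 1}
    (u, v), rest = edges[0], edges[1:]
    counts = dict(matchings_by_size(rest))                    # skip edge (u, v)
    reduced = [e for e in rest if u not in e and v not in e]  # use edge (u, v)
    for size, c in matchings_by_size(reduced).items():
        counts[size + 1] = counts.get(size + 1, 0) + c
    return counts
```

On the 4-cycle, for example, this yields one empty matching, four single edges, and two perfect matchings.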

Alman, Josh 
STOC '21: "Kronecker Products, Low-Depth ..."
Kronecker Products, Low-Depth Circuits, and Matrix Rigidity
Josh Alman (Harvard University, USA) For a matrix M and a positive integer r, the rank-r rigidity of M is the smallest number of entries of M which one must change to make its rank at most r. There are many known applications of rigidity lower bounds to a variety of areas in complexity theory, but fewer known applications of rigidity upper bounds. In this paper, we use rigidity upper bounds to prove new upper bounds in a few different models of computation. Our results include: (1) For any d>1, and over any field F, the N × N Walsh–Hadamard transform has a depth-d linear circuit of size O(d · N^{1 + 0.96/d}). This circumvents a known lower bound of Ω(d · N^{1 + 1/d}) for circuits with bounded coefficients over ℂ by Pudlák (2000), by using coefficients of magnitude polynomial in N. Our construction also generalizes to linear transformations given by a Kronecker power of any fixed 2 × 2 matrix. (2) The N × N Walsh–Hadamard transform has a linear circuit of size ≤ (1.81 + o(1)) N log_{2} N, improving on the bound of ≈ 1.88 N log_{2} N which one obtains from the standard fast Walsh–Hadamard transform. (3) A new rigidity upper bound, showing that the following classes of matrices are not rigid enough to prove circuit lower bounds using Valiant’s approach: (a) for any field F and any function f : {0,1}^{n} → F, the matrix V_{f} ∈ F^{2^n × 2^n} given by V_{f}[x,y] = f(x ∧ y) for x,y ∈ {0,1}^{n}, and (b) for any field F and any fixed-size matrices M_{1}, …, M_{n} ∈ F^{q × q}, the Kronecker product M_{1} ⊗ M_{2} ⊗ ⋯ ⊗ M_{n}. This generalizes recent results on non-rigidity, using a simpler approach which avoids needing the polynomial method. (4) New connections between recursive linear transformations like Fourier and Walsh–Hadamard transforms, and circuits for matrix multiplication. 
@InProceedings{STOC21p772, author = {Josh Alman}, title = {Kronecker Products, Low-Depth Circuits, and Matrix Rigidity}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {772--785}, doi = {10.1145/3406325.3451008}, year = {2021}, } 
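For reference, the standard fast Walsh–Hadamard transform that the abstract's ≈ 1.88 N log₂ N circuit-size bound refers to uses N log₂ N additions and subtractions. A minimal sketch of that textbook butterfly (not the paper's smaller circuit):

```python
def fwht(a):
    """Fast Walsh-Hadamard transform of a sequence whose length is a power
    of two: output[x] = sum_y (-1)^{<x,y>} a[y], computed with N log2 N
    additions/subtractions via the standard butterfly network."""
    a = list(a)
    n, h = len(a), 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                x, y = a[j], a[j + h]
                a[j], a[j + h] = x + y, x - y   # one butterfly: 2 ops
        h *= 2
    return a
```

Each of the log₂ N rounds performs N/2 butterflies of two operations each, giving the N log₂ N operation count against which the paper's (1.81 + o(1)) N log₂ N circuit improves.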

Alon, Noga 
STOC '21: "Boosting Simple Learners ..."
Boosting Simple Learners
Noga Alon, Alon Gonen, Elad Hazan, and Shay Moran (Princeton University, USA; Tel Aviv University, Israel; OrCam, Israel; Google AI, USA; Technion, Israel; Google Research, Israel) Boosting is a celebrated machine learning approach based on the idea of combining weak and moderately inaccurate hypotheses into a strong and accurate one. We study boosting under the assumption that the weak hypotheses belong to a class of bounded capacity. This assumption is inspired by the common convention that weak hypotheses are “rules of thumb” from an “easy-to-learn class” (Schapire and Freund ’12; Shalev-Shwartz and Ben-David ’14). Formally, we assume the class of weak hypotheses has bounded VC dimension. We focus on two main questions: (i) Oracle complexity: How many weak hypotheses are needed in order to produce an accurate hypothesis? We design a novel boosting algorithm and demonstrate that it circumvents a classical lower bound by Freund and Schapire (’95, ’12). Whereas the lower bound shows that Ω(1/γ^{2}) weak hypotheses with γ-margin are sometimes necessary, our new method requires only Õ(1/γ) weak hypotheses, provided that they belong to a class of bounded VC dimension. Unlike previous boosting algorithms, which aggregate the weak hypotheses by majority votes, the new boosting algorithm uses more complex (“deeper”) aggregation rules. We complement this result by showing that complex aggregation rules are in fact necessary to circumvent the aforementioned lower bound. (ii) Expressivity: Which tasks can be learned by boosting weak hypotheses from a bounded VC class? Can complex concepts that are “far away” from the class be learned? Towards answering the first question we identify a combinatorial-geometric parameter which captures the expressivity of base classes in boosting. As a corollary we provide an affirmative answer to the second question for many well-studied classes, including halfspaces and decision stumps. 
Along the way, we establish and exploit connections with Discrepancy Theory. @InProceedings{STOC21p481, author = {Noga Alon and Alon Gonen and Elad Hazan and Shay Moran}, title = {Boosting Simple Learners}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {481--489}, doi = {10.1145/3406325.3451030}, year = {2021}, } STOC '21: "Adversarial Laws of Large ..." Adversarial Laws of Large Numbers and Optimal Regret in Online Classification Noga Alon, Omri Ben-Eliezer, Yuval Dagan, Shay Moran, Moni Naor, and Eylon Yogev (Princeton University, USA; Tel Aviv University, Israel; Harvard University, USA; Massachusetts Institute of Technology, USA; Technion, Israel; Google Research, Israel; Weizmann Institute of Science, Israel; Boston University, USA) Laws of large numbers guarantee that given a large enough sample from some population, the measure of any fixed subpopulation is well-estimated by its frequency in the sample. We study laws of large numbers in sampling processes that can affect the environment they are acting upon and interact with it. Specifically, we consider the sequential sampling model proposed by Ben-Eliezer and Yogev (2020), and characterize the classes which admit a uniform law of large numbers in this model: these are exactly the classes that are online learnable. Our characterization may be interpreted as an online analogue of the equivalence between learnability and uniform convergence in statistical (PAC) learning. The sample-complexity bounds we obtain are tight for many parameter regimes, and as an application, we determine the optimal regret bounds in online learning, stated in terms of Littlestone’s dimension, thus resolving the main open question from Ben-David, Pál, and Shalev-Shwartz (2009), which was also posed by Rakhlin, Sridharan, and Tewari (2015). 
@InProceedings{STOC21p447, author = {Noga Alon and Omri Ben-Eliezer and Yuval Dagan and Shay Moran and Moni Naor and Eylon Yogev}, title = {Adversarial Laws of Large Numbers and Optimal Regret in Online Classification}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {447--455}, doi = {10.1145/3406325.3451041}, year = {2021}, } 

Alweiss, Ryan 
STOC '21: "Discrepancy Minimization via ..."
Discrepancy Minimization via a Self-Balancing Walk
Ryan Alweiss, Yang P. Liu, and Mehtaab Sawhney (Princeton University, USA; Stanford University, USA; Massachusetts Institute of Technology, USA) We study discrepancy minimization for vectors in ℝ^{n} under various settings. The main result is the analysis of a new simple random process in high dimensions through a comparison argument. As corollaries, we obtain bounds which are tight up to logarithmic factors for online vector balancing against oblivious adversaries, resolving several questions posed by Bansal, Jiang, Singla, and Sinha (STOC 2020), as well as a linear-time algorithm for logarithmic bounds for the Komlós conjecture. @InProceedings{STOC21p14, author = {Ryan Alweiss and Yang P. Liu and Mehtaab Sawhney}, title = {Discrepancy Minimization via a Self-Balancing Walk}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {14--20}, doi = {10.1145/3406325.3450994}, year = {2021}, } 
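The random process at the heart of the result is strikingly simple to state: bias each online sign choice against the current prefix sum. Below is a sketch of a self-balancing walk of this flavor (my own parameter choices and clamping; in the paper the parameter c is taken to be Θ(log(nT)) and the low-probability overflow event is analyzed carefully).

```python
import random

def self_balancing_walk(vectors, c, seed=0):
    """Online signing of vectors (each of norm <= 1): choose sign +1 with
    probability 1/2 - <w, v>/(2c), where w is the current signed prefix
    sum, so the walk is always biased back toward the origin.
    Returns (signs, final prefix sum)."""
    rng = random.Random(seed)
    w = [0.0] * len(vectors[0])
    signs = []
    for v in vectors:
        ip = sum(wi * vi for wi, vi in zip(w, v))
        ip = max(-c, min(c, ip))       # clamp; overflow is a rare failure event
        s = 1 if rng.random() < 0.5 - ip / (2 * c) else -1
        signs.append(s)
        w = [wi + s * vi for wi, vi in zip(w, v)]
    return signs, w
```

When the inputs are standard basis vectors, each coordinate performs a biased walk that is pushed back toward zero, so the prefix sum deterministically stays within c+1 in absolute value.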

Anari, Nima 
STOC '21: "Log-Concave Polynomials in ..."
Log-Concave Polynomials in Theory and Applications (Tutorial)
Nima Anari and Cynthia Vinzant (Stanford University, USA; North Carolina State University, USA; University of Washington, USA) Log-concave polynomials give rise to discrete probability distributions with several nice properties. In particular, log-concavity of the generating polynomial guarantees the existence of efficient algorithms for approximately sampling from a distribution and finding the size of its support. This class of distributions contains several important examples, including uniform measures over bases or independent sets of matroids, determinantal point processes and strongly Rayleigh measures, measures defined by mixed volumes in Minkowski sums, the random cluster model in certain regimes, and more. In this tutorial, we will introduce the theory and applications of log-concave polynomials and survey some of the recent developments in this area. @InProceedings{STOC21p12, author = {Nima Anari and Cynthia Vinzant}, title = {Log-Concave Polynomials in Theory and Applications (Tutorial)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {12--12}, doi = {10.1145/3406325.3465351}, year = {2021}, } STOC '21: "Fractionally Log-Concave and ..." Fractionally Log-Concave and Sector-Stable Polynomials: Counting Planar Matchings and More Yeganeh Alimohammadi, Nima Anari, Kirankumar Shiragur, and Thuy-Duong Vuong (Stanford University, USA) We show fully polynomial-time randomized approximation schemes (FPRAS) for counting matchings of a given size, or more generally sampling/counting monomer-dimer systems in planar, not-necessarily-bipartite, graphs. While perfect matchings on planar graphs can be counted exactly in polynomial time, counting non-perfect matchings was shown by Jerrum (J Stat Phys 1987) to be #P-hard, who also raised the question of whether efficient approximate counting is possible. 
We answer this affirmatively by showing that the multi-site Glauber dynamics on the set of monomers in a monomer-dimer system always mixes rapidly, and that this dynamics can be implemented efficiently on downward-closed families of graphs where counting perfect matchings is tractable. As further applications of our results, we show how to sample efficiently using multi-site Glauber dynamics from partition-constrained strongly Rayleigh distributions, and nonsymmetric determinantal point processes. In order to analyze mixing properties of the multi-site Glauber dynamics, we establish two notions for generating polynomials of discrete set-valued distributions: sector-stability and fractional log-concavity. These notions generalize well-studied properties like real-stability and log-concavity, but, unlike them, degrade robustly under useful transformations applied to the distribution. We relate these notions to pairwise correlations in the underlying distribution and the notion of spectral independence introduced by Anari et al. (FOCS 2020), providing a new tool for establishing spectral independence based on geometry of polynomials. As a byproduct of our techniques, we show that polynomials avoiding roots in a sector of the complex plane must satisfy what we call fractional log-concavity; this generalizes a classic result established by Gårding (J Math Mech 1959), who showed that homogeneous polynomials with no roots in a half-plane must be log-concave over the positive orthant. @InProceedings{STOC21p433, author = {Yeganeh Alimohammadi and Nima Anari and Kirankumar Shiragur and Thuy-Duong Vuong}, title = {Fractionally Log-Concave and Sector-Stable Polynomials: Counting Planar Matchings and More}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {433--446}, doi = {10.1145/3406325.3451123}, year = {2021}, } STOC '21: "Log-Concave Polynomials IV: ..." 
Log-Concave Polynomials IV: Approximate Exchange, Tight Mixing Times, and Near-Optimal Sampling of Forests Nima Anari, Kuikui Liu, Shayan Oveis Gharan, Cynthia Vinzant, and Thuy-Duong Vuong (Stanford University, USA; University of Washington, USA; North Carolina State University, USA) We prove tight mixing time bounds for natural random walks on bases of matroids, determinantal distributions, and more generally distributions associated with log-concave polynomials. For a matroid of rank k on a ground set of n elements, or more generally distributions associated with log-concave polynomials of homogeneous degree k on n variables, we show that the down-up random walk, started from an arbitrary point in the support, mixes in time O(k log k). Our bound has no dependence on n or the starting point, unlike the previous analyses of Anari et al. (STOC 2019) and Cryan et al. (FOCS 2019), and is tight up to constant factors. The main new ingredient is a property we call approximate exchange, a generalization of well-studied exchange properties for matroids and valuated matroids, which may be of independent interest. In particular, given a distribution µ over size-k subsets of [n], our approximate exchange property implies that a simple local search algorithm gives a k^{O(k)}-approximation of max_{S} µ(S) when µ is generated by a log-concave polynomial, and that greedy gives the same approximation ratio when µ is strongly Rayleigh. As an application, we show how to leverage down-up random walks to approximately sample random forests or random spanning trees in a graph with n edges in time O(n log^{2} n). The best known result for sampling random forests was an FPAUS with high polynomial runtime recently found by Anari et al. (STOC 2019) and Cryan et al. (FOCS 2019). For spanning trees, we improve on the almost-linear time algorithm by Schild (STOC 2018). Our analysis works on weighted graphs too, and is the first to achieve nearly-linear running time for these problems. 
Our algorithms can be naturally extended to support approximately sampling from random forests of size between k_{1} and k_{2} in time O(n log^{2} n), for fixed parameters k_{1}, k_{2}. @InProceedings{STOC21p408, author = {Nima Anari and Kuikui Liu and Shayan Oveis Gharan and Cynthia Vinzant and Thuy-Duong Vuong}, title = {Log-Concave Polynomials IV: Approximate Exchange, Tight Mixing Times, and Near-Optimal Sampling of Forests}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {408--420}, doi = {10.1145/3406325.3451091}, year = {2021}, } 
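The down-up random walk analyzed in the paper is easy to state for spanning trees (the bases of the graphic matroid, with k = n−1): drop a uniformly random edge of the current tree, then add back a uniformly random graph edge that reconnects the two components. A toy sketch follows (function names are mine; the connectivity checks here are brute force, unlike the paper's nearly-linear implementation).

```python
import random
from itertools import combinations

def is_spanning_tree(edges, n):
    """Check that `edges` is a spanning tree on vertices 0..n-1 (union-find)."""
    if len(edges) != n - 1:
        return False
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False        # cycle
        parent[ru] = rv
    return True

def down_up_walk(graph_edges, n, steps, seed=0):
    """Down-up walk on spanning trees: remove a uniformly random tree edge,
    then add a uniformly random graph edge reconnecting the two components
    (the removed edge is itself a candidate).  Yields the tree each step."""
    rng = random.Random(seed)
    # start from an arbitrary spanning tree (brute-force search; toy-sized)
    tree = next(frozenset(t) for t in combinations(graph_edges, n - 1)
                if is_spanning_tree(t, n))
    for _ in range(steps):
        e = rng.choice(sorted(tree))
        rest = tree - {e}
        candidates = [f for f in graph_edges
                      if is_spanning_tree(rest | {f}, n)]
        tree = rest | {rng.choice(candidates)}
        yield tree
```

Every state of the walk is again a spanning tree, and on a connected graph the walk explores the whole set of spanning trees.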

Arenas, Marcelo 
STOC '21: "A Polynomial-Time Approximation ..."
A Polynomial-Time Approximation Algorithm for Counting Words Accepted by an NFA (Invited Paper)
Marcelo Arenas, Luis Alberto Croquevielle, Rajesh Jayaram, and Cristian Riveros (PUC, Chile; IMFD, Chile; Carnegie Mellon University, USA) Counting the number of words of a certain length accepted by a nondeterministic finite automaton (NFA) is a fundamental problem, which has many applications in different areas such as graph databases, knowledge compilation, and information extraction. Along with this, generating such words uniformly at random is also a relevant problem, particularly in scenarios where returning varied outputs is a desirable feature. The previous problems are formalized as follows. The input of #NFA is an NFA N and a length k given in unary (that is, given as a string 0^k), and the task is to compute the number of strings of length k accepted by N. The input of GEN-NFA is the same as for #NFA, but now the task is to generate, uniformly at random, a string of length k accepted by N. It is known that #NFA is #P-complete, so an efficient algorithm to compute this function exactly is not expected to exist. However, this does not preclude the existence of an efficient approximation algorithm for it. In this talk, we will show that #NFA admits a fully polynomial-time randomized approximation scheme (FPRAS). Prior to our work, it was open whether #NFA admits an FPRAS; in fact, the best randomized approximation scheme known for #NFA ran in time n^{O(log n)}. Besides, we will mention some consequences and applications of our results. In particular, from well-known results on counting and uniform generation, we obtain that GEN-NFA admits a fully polynomial-time almost uniform generator. Moreover, as #NFA is SpanL-complete under polynomial-time parsimonious reductions, we obtain that every function in the complexity class SpanL admits an FPRAS. 
@InProceedings{STOC21p4, author = {Marcelo Arenas and Luis Alberto Croquevielle and Rajesh Jayaram and Cristian Riveros}, title = {A Polynomial-Time Approximation Algorithm for Counting Words Accepted by an NFA (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {4--4}, doi = {10.1145/3406325.3465353}, year = {2021}, } STOC '21: "When Is Approximate Counting ..." When Is Approximate Counting for Conjunctive Queries Tractable? Marcelo Arenas, Luis Alberto Croquevielle, Rajesh Jayaram, and Cristian Riveros (PUC, Chile; IMFD, Chile; Carnegie Mellon University, USA) Conjunctive queries are one of the most common classes of queries used in database systems, and the best studied in the literature. A seminal result of Grohe, Schwentick, and Segoufin (STOC 2001) demonstrates that for every class G of graphs, the evaluation of all conjunctive queries whose underlying graph is in G is tractable if, and only if, G has bounded treewidth. In this work, we extend this characterization to the counting problem for conjunctive queries. Specifically, for every class C of conjunctive queries with bounded treewidth, we introduce the first fully polynomial-time randomized approximation scheme (FPRAS) for counting answers to a query in C, and the first polynomial-time algorithm for sampling answers uniformly from a query in C. As a corollary, it follows that for every class G of graphs, the counting problem for conjunctive queries whose underlying graph is in G admits an FPRAS if, and only if, G has bounded treewidth (unless BPP is different from P). In fact, our FPRAS is more general, and also applies to conjunctive queries with bounded hypertree width, as well as unions of such queries. The key ingredient in our proof is the resolution of a fundamental counting problem from automata theory. 
Specifically, we demonstrate the first FPRAS and polynomial-time sampler for the set of trees of size n accepted by a tree automaton, which improves the prior quasi-polynomial time randomized approximation scheme (QPRAS) and sampling algorithm of Gore, Jerrum, Kannan, Sweedyk, and Mahaney (1997). We demonstrate how this algorithm can be used to obtain an FPRAS for many open problems, such as counting solutions to constraint satisfaction problems (CSPs) with bounded hypertree width, counting the number of error threads in programs with nested call subroutines, and counting valid assignments to structured DNNF circuits. @InProceedings{STOC21p1015, author = {Marcelo Arenas and Luis Alberto Croquevielle and Rajesh Jayaram and Cristian Riveros}, title = {When Is Approximate Counting for Conjunctive Queries Tractable?}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1015--1027}, doi = {10.1145/3406325.3451014}, year = {2021}, } 
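To make the counting problem concrete: the brute-force route to #NFA is a dynamic program over reachable state *sets* (on-the-fly subset construction), which is exact but exponential in the number of NFA states in the worst case; this blow-up is exactly what the FPRAS above circumvents. A small sketch, with names and the example automaton of my own choosing:

```python
from functools import lru_cache

def count_accepted(nfa, start, accepting, alphabet, k):
    """Count words of length k accepted by an NFA, via a DP over
    reachable sets of states.  `nfa` maps (state, symbol) to an
    iterable of successor states."""
    @lru_cache(maxsize=None)
    def count(states, remaining):
        if remaining == 0:
            return 1 if states & accepting else 0
        total = 0
        for a in alphabet:
            nxt = frozenset(q for p in states for q in nfa.get((p, a), ()))
            if nxt:                     # dead set contributes nothing
                total += count(nxt, remaining - 1)
        return total
    return count(frozenset([start]), k)
```

For example, for the two-state NFA accepting binary strings that end in '1', the number of accepted words of length k is 2^{k−1}.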

Argue, C. J. 
STOC '21: "Chasing Convex Bodies with ..."
Chasing Convex Bodies with Linear Competitive Ratio (Invited Paper)
C. J. Argue, Anupam Gupta, Guru Guruganesh, and Ziye Tang (Carnegie Mellon University, USA; Google Research, USA) The problem of chasing convex functions is easy to state: faced with a sequence of convex functions f_t over d-dimensional Euclidean space, the goal of the algorithm is to output a point x_t at each time, so that the sum of the function costs f_t(x_t), plus the movement costs ‖x_t − x_{t−1}‖, is minimized. This problem generalizes questions in online algorithms such as caching and the k-server problem. In 1994, Friedman and Linial posed the question of getting an algorithm with a competitive ratio that depends only on the dimension d. In this talk we give an O(d)-competitive algorithm, based on the notion of the Steiner point of a convex body. @InProceedings{STOC21p5, author = {C. J. Argue and Anupam Gupta and Guru Guruganesh and Ziye Tang}, title = {Chasing Convex Bodies with Linear Competitive Ratio (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {5--5}, doi = {10.1145/3406325.3465354}, year = {2021}, } 

Assadi, Sepehr 
STOC '21: "Graph Streaming Lower Bounds ..."
Graph Streaming Lower Bounds for Parameter Estimation and Property Testing via a Streaming XOR Lemma
Sepehr Assadi and Vishvajeet N (Rutgers University, USA) We study space-pass tradeoffs in graph streaming algorithms for parameter estimation and property testing problems such as estimating the size of maximum matchings and maximum cuts, the weight of minimum spanning trees, or testing if a graph is connected or cycle-free versus being far from these properties. We develop a new lower bound technique that proves that for many problems of interest, including all of the above, obtaining a (1+ε)-approximation requires either n^{Ω(1)} space or Ω(1/ε) passes, even on highly restricted families of graphs such as bounded-degree planar graphs. For several of these problems, this bound matches those of existing algorithms and is thus (asymptotically) optimal. Our results considerably strengthen prior lower bounds even for arbitrary graphs: starting from the influential work of [Verbin, Yu; SODA 2011], there has been a plethora of lower bounds for single-pass algorithms for these problems; however, the only multi-pass lower bounds, proven very recently in [Assadi, Kol, Saxena, Yu; FOCS 2020], rule out sublinear-space algorithms with exponentially smaller o(log(1/ε)) passes for these problems. One key ingredient of our proofs is a simple streaming XOR lemma, a generic hardness amplification result, that we prove: informally speaking, if a p-pass s-space streaming algorithm can only solve a decision problem with advantage δ > 0 over random guessing, then it cannot solve the XOR of ℓ independent copies of the problem with advantage much better than δ^{ℓ}. This result can be of independent interest and useful for other streaming lower bounds as well. @InProceedings{STOC21p612, author = {Sepehr Assadi and Vishvajeet N}, title = {Graph Streaming Lower Bounds for Parameter Estimation and Property Testing via a Streaming XOR Lemma}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {612--625}, doi = {10.1145/3406325.3451110}, year = {2021}, } 
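The idealized, information-theoretic version of such an XOR lemma is exact: for independent copies guessed independently, the correlation with the correct answer multiplies. A small numeric check of that baseline (the function name is mine; the paper's streaming lemma is the far harder statement that a p-pass s-space algorithm cannot beat roughly δ^ℓ):

```python
from math import comb, isclose

def xor_advantage(delta, ell):
    """Advantage over 1/2 of guessing the XOR of `ell` independent copies,
    when each copy is guessed correctly with probability 1/2 + delta and
    the XOR guess is the XOR of the per-copy guesses.  The XOR guess is
    correct iff an even number of copies are guessed wrong, so we sum the
    binomial distribution over even counts."""
    p, q = 0.5 + delta, 0.5 - delta
    correct = sum(comb(ell, j) * q**j * p**(ell - j)
                  for j in range(0, ell + 1, 2))
    return correct - 0.5
```

A short calculation shows this advantage equals (2δ)^ℓ / 2: the correlation 2δ of each copy multiplies across independent copies.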

Azar, Yossi 
STOC '21: "Flow Time Scheduling with ..."
Flow Time Scheduling with Uncertain Processing Time
Yossi Azar, Stefano Leonardi, and Noam Touitou (Tel Aviv University, Israel; Sapienza University of Rome, Italy) We consider the problem of online scheduling on a single machine in order to minimize weighted flow time. The existing algorithms for this problem (STOC ’01, SODA ’03, FOCS ’18) all require exact knowledge of the processing time of each job. This assumption is crucial, as even a slight perturbation of the processing time would lead to a polynomial competitive ratio. However, this assumption very rarely holds in real-life scenarios. In this paper, we present the first algorithm for weighted flow time which does not require exact knowledge of the processing times of jobs. Specifically, we introduce the Scheduling with Predicted Processing Time (SPPT) problem, where the algorithm is given a prediction for the processing time of each job, instead of its real processing time. For the case of a constant-factor distortion between the predictions and the real processing times, our algorithms match all the best known competitiveness bounds for weighted flow time – namely O(log P), O(log D), and O(log W), where P, D, W are the maximum ratios of processing times, densities, and weights, respectively. For larger errors, the competitiveness of our algorithms degrades gracefully. @InProceedings{STOC21p1070, author = {Yossi Azar and Stefano Leonardi and Noam Touitou}, title = {Flow Time Scheduling with Uncertain Processing Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1070--1080}, doi = {10.1145/3406325.3451023}, year = {2021}, } 

Babichenko, Yakov 
STOC '21: "Settling the Complexity of ..."
Settling the Complexity of Nash Equilibrium in Congestion Games
Yakov Babichenko and Aviad Rubinstein (Technion, Israel; Stanford University, USA) We consider (i) the problem of finding a (possibly mixed) Nash equilibrium in congestion games, and (ii) the problem of finding an (exponential precision) fixed point of the gradient descent dynamics of a smooth function f:[0,1]^{n} → ℝ. We prove that these problems are equivalent. Our result holds for various explicit descriptions of f, ranging from (almost general) arithmetic circuits to degree-5 polynomials. By a very recent result of [Fearnley et al., STOC 2021], this implies that these problems are PPAD ∩ PLS-complete. As a corollary, we also obtain the following equivalence of complexity classes: CCLS = PPAD ∩ PLS. @InProceedings{STOC21p1426, author = {Yakov Babichenko and Aviad Rubinstein}, title = {Settling the Complexity of Nash Equilibrium in Congestion Games}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1426--1437}, doi = {10.1145/3406325.3451039}, year = {2021}, }
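The hardness result above concerns finding equilibria in general congestion games; by contrast, a pure Nash equilibrium always exists via Rosenthal's potential argument, and on tiny instances best-response dynamics finds one quickly. A toy sketch of that classical background (the singleton game and its costs are invented here for illustration, not taken from the paper):

```python
def loads(strategy, m):
    """Number of players on each of the m resources."""
    ld = [0] * m
    for r in strategy:
        ld[r] += 1
    return ld

def best_response_dynamics(n, c):
    """Singleton congestion game: player i on resource r pays c[r][k-1],
    where k is the load of r. Each improving move strictly decreases
    Rosenthal's potential sum_r sum_{k <= load(r)} c[r][k-1], so the
    dynamics terminate in a pure Nash equilibrium."""
    m = len(c)
    strategy = [0] * n            # everyone starts on resource 0
    moved = True
    while moved:
        moved = False
        ld = loads(strategy, m)
        for i in range(n):
            cur = strategy[i]
            cur_cost = c[cur][ld[cur] - 1]
            for r in range(m):
                # c[r][ld[r]] is the cost player i would pay after joining r
                if r != cur and c[r][ld[r]] < cur_cost:
                    strategy[i] = r
                    ld[cur] -= 1
                    ld[r] += 1
                    moved = True
                    break
            if moved:
                break
    return strategy

# Three players, two resources with increasing per-load costs.
c = [[1, 3, 6], [2, 4, 7]]
eq = best_response_dynamics(3, c)
assert sorted(loads(eq, 2)) == [1, 2]   # equilibrium splits the players 2/1
```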

Bădescu, Costin 
STOC '21: "Improved Quantum Data Analysis ..."
Improved Quantum Data Analysis
Costin Bădescu and Ryan O'Donnell (Carnegie Mellon University, USA) We provide more sample-efficient versions of some basic routines in quantum data analysis, along with simpler proofs. In particular, we give a quantum “Threshold Search” algorithm that requires only O((log^{2} m)/ε^{2}) samples of a d-dimensional state ρ. That is, given observables 0 ≤ A_{1}, A_{2}, …, A_{m} ≤ 1 such that tr(ρ A_{i}) ≥ 1/2 for at least one i, the algorithm finds j with tr(ρ A_{j}) ≥ 1/2−ε. As a consequence, we obtain a Shadow Tomography algorithm requiring only O((log^{2} m)(log d)/ε^{4}) samples, which simultaneously achieves the best known dependence on each parameter m, d, ε. This yields the same sample complexity for quantum Hypothesis Selection among m states; we also give an alternative Hypothesis Selection method using O((log^{3} m)/ε^{2}) samples. @InProceedings{STOC21p1398, author = {Costin Bădescu and Ryan O'Donnell}, title = {Improved Quantum Data Analysis}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1398--1411}, doi = {10.1145/3406325.3451109}, year = {2021}, }

Bafna, Mitali 
STOC '21: "Playing Unique Games on Certified ..."
Playing Unique Games on Certified Small-Set Expanders
Mitali Bafna, Boaz Barak, Pravesh K. Kothari, Tselil Schramm, and David Steurer (Harvard University, USA; Carnegie Mellon University, USA; Stanford University, USA; ETH Zurich, Switzerland) We give an algorithm for solving unique games (UG) instances whenever low-degree sum-of-squares proofs certify good bounds on the small-set expansion of the underlying constraint graph via a hypercontractive inequality. Our algorithm is in fact more versatile, and succeeds even when the constraint graph is not a small-set expander as long as the structure of non-expanding small sets is (informally speaking) “characterized” by a low-degree sum-of-squares proof. Our results are obtained by rounding low-entropy solutions — measured via a new global potential function — to sum-of-squares (SoS) semidefinite programs. This technique adds to the (currently short) list of general tools for analyzing SoS relaxations for worst-case optimization problems. As corollaries, we obtain the first polynomial-time algorithms for solving any UG instance where the constraint graph is either the noisy hypercube, the short code, or the Johnson graph. The prior best algorithm for such instances was the eigenvalue enumeration algorithm of Arora, Barak, and Steurer (2010), which requires quasi-polynomial time for the noisy hypercube and nearly exponential time for the short code and Johnson graphs. All of our results achieve an approximation of 1−ε vs δ for UG instances, where ε>0 and δ>0 depend on the expansion parameters of the graph but are independent of the alphabet size. @InProceedings{STOC21p1629, author = {Mitali Bafna and Boaz Barak and Pravesh K. Kothari and Tselil Schramm and David Steurer}, title = {Playing Unique Games on Certified Small-Set Expanders}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1629--1642}, doi = {10.1145/3406325.3451099}, year = {2021}, }

Bakshi, Ainesh 
STOC '21: "Robust Linear Regression: ..."
Robust Linear Regression: Optimal Rates in Polynomial Time
Ainesh Bakshi and Adarsh Prasad (Carnegie Mellon University, USA) We obtain robust and computationally efficient estimators for learning several linear models that achieve statistically optimal convergence rates under minimal distributional assumptions. Concretely, we assume our data is drawn from a k-hypercontractive distribution and an ε-fraction is adversarially corrupted. We then describe an estimator that converges to the optimal least-squares minimizer for the true distribution at a rate proportional to ε^{2−2/k}, when the noise is independent of the covariates. We note that no such estimator was known prior to our work, even with access to unbounded computation. The rate we achieve is information-theoretically optimal, and thus we resolve the main open question in Klivans, Kothari and Meka [COLT’18]. Our key insight is to identify an analytic condition that serves as a polynomial relaxation of independence of random variables. In particular, we show that when the moments of the noise and covariates are negatively correlated, we obtain the same rate as for independent noise. Further, when the condition is not satisfied, we obtain a rate proportional to ε^{2−4/k}, and again match the information-theoretic lower bound. Our central technical contribution is to algorithmically exploit independence of random variables in the “sum-of-squares” framework by formulating it as the aforementioned polynomial inequality. @InProceedings{STOC21p102, author = {Ainesh Bakshi and Adarsh Prasad}, title = {Robust Linear Regression: Optimal Rates in Polynomial Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {102--115}, doi = {10.1145/3406325.3451001}, year = {2021}, }

Balcan, Maria-Florina 
STOC '21: "How Much Data Is Sufficient ..."
How Much Data Is Sufficient to Learn High-Performing Algorithms? Generalization Guarantees for Data-Driven Algorithm Design
Maria-Florina Balcan, Dan DeBlasio, Travis Dick, Carl Kingsford, Tuomas Sandholm, and Ellen Vitercik (Carnegie Mellon University, USA; University of Texas at El Paso, USA; University of Pennsylvania, USA) Algorithms often have tunable parameters that impact performance metrics such as runtime and solution quality. For many algorithms used in practice, no parameter settings admit meaningful worst-case bounds, so the parameters are made available for the user to tune. Alternatively, parameters may be tuned implicitly within the proof of a worst-case guarantee. Worst-case instances, however, may be rare or nonexistent in practice. A growing body of research has demonstrated that data-driven algorithm design can lead to significant improvements in performance. This approach uses a training set of problem instances sampled from an unknown, application-specific distribution and returns a parameter setting with strong average performance on the training set. We provide a broadly applicable theory for deriving generalization guarantees that bound the difference between the algorithm’s average performance over the training set and its expected performance on the unknown distribution. Our results apply no matter how the parameters are tuned, be it via an automated or manual approach. The challenge is that for many types of algorithms, performance is a volatile function of the parameters: slightly perturbing the parameters can cause a large change in behavior. Prior research (e.g., Gupta and Roughgarden, SICOMP’17; Balcan et al., COLT’17, ICML’18, EC’18) has proved generalization bounds by employing case-by-case analyses of greedy algorithms, clustering algorithms, integer programming algorithms, and selling mechanisms. We uncover a unifying structure which we use to prove extremely general guarantees, yet we recover the bounds from prior research. 
Our guarantees, which are tight up to logarithmic factors in the worst case, apply whenever an algorithm’s performance is a piecewise-constant, linear, or—more generally—piecewise-structured function of its parameters. Our theory also implies novel bounds for voting mechanisms and dynamic programming algorithms from computational biology. @InProceedings{STOC21p919, author = {Maria-Florina Balcan and Dan DeBlasio and Travis Dick and Carl Kingsford and Tuomas Sandholm and Ellen Vitercik}, title = {How Much Data Is Sufficient to Learn High-Performing Algorithms? Generalization Guarantees for Data-Driven Algorithm Design}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {919--932}, doi = {10.1145/3406325.3451036}, year = {2021}, }

Bansal, Nikhil 
STOC '21: "k-Forrelation Optimally Separates ..."
k-Forrelation Optimally Separates Quantum and Classical Query Complexity
Nikhil Bansal and Makrand Sinha (CWI, Netherlands; Eindhoven University of Technology, Netherlands) Aaronson and Ambainis (SICOMP ’18) showed that any partial function on N bits that can be computed with an advantage δ over a random guess by making q quantum queries, can also be computed classically with an advantage δ/2 by a randomized decision tree making O_{q}(N^{1−1/2q}δ^{−2}) queries. Moreover, they conjectured the k-Forrelation problem — a partial function that can be computed with q = ⌈ k/2 ⌉ quantum queries — to be a suitable candidate for exhibiting such an extremal separation. We prove their conjecture by showing a tight lower bound of Ω(N^{1−1/k}) for the randomized query complexity of k-Forrelation, where δ = 2^{−O(k)}. By standard amplification arguments, this gives an explicit partial function that exhibits an O_{ε}(1) vs Ω(N^{1−ε}) separation between bounded-error quantum and randomized query complexities, where ε>0 can be made arbitrarily small. Our proof also gives the same bound for the closely related but non-explicit k-Rorrelation function introduced by Tal (FOCS ’20). Our techniques rely on classical Gaussian tools, in particular Gaussian interpolation and Gaussian integration by parts, and in fact give a more general statement. We show that to prove lower bounds for k-Forrelation against a family of functions, it suffices to bound the ℓ_{1}-weight of the Fourier coefficients between levels k and (k−1)k. We also prove new interpolation and integration by parts identities that might be of independent interest in the context of rounding high-dimensional Gaussian vectors. @InProceedings{STOC21p1303, author = {Nikhil Bansal and Makrand Sinha}, title = {k-Forrelation Optimally Separates Quantum and Classical Query Complexity}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1303--1316}, doi = {10.1145/3406325.3451040}, year = {2021}, }

Barak, Boaz 
STOC '21: "Playing Unique Games on Certified ..."
Playing Unique Games on Certified Small-Set Expanders
Mitali Bafna, Boaz Barak, Pravesh K. Kothari, Tselil Schramm, and David Steurer (Harvard University, USA; Carnegie Mellon University, USA; Stanford University, USA; ETH Zurich, Switzerland) We give an algorithm for solving unique games (UG) instances whenever low-degree sum-of-squares proofs certify good bounds on the small-set expansion of the underlying constraint graph via a hypercontractive inequality. Our algorithm is in fact more versatile, and succeeds even when the constraint graph is not a small-set expander as long as the structure of non-expanding small sets is (informally speaking) “characterized” by a low-degree sum-of-squares proof. Our results are obtained by rounding low-entropy solutions — measured via a new global potential function — to sum-of-squares (SoS) semidefinite programs. This technique adds to the (currently short) list of general tools for analyzing SoS relaxations for worst-case optimization problems. As corollaries, we obtain the first polynomial-time algorithms for solving any UG instance where the constraint graph is either the noisy hypercube, the short code, or the Johnson graph. The prior best algorithm for such instances was the eigenvalue enumeration algorithm of Arora, Barak, and Steurer (2010), which requires quasi-polynomial time for the noisy hypercube and nearly exponential time for the short code and Johnson graphs. All of our results achieve an approximation of 1−ε vs δ for UG instances, where ε>0 and δ>0 depend on the expansion parameters of the graph but are independent of the alphabet size. @InProceedings{STOC21p1629, author = {Mitali Bafna and Boaz Barak and Pravesh K. Kothari and Tselil Schramm and David Steurer}, title = {Playing Unique Games on Certified Small-Set Expanders}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1629--1642}, doi = {10.1145/3406325.3451099}, year = {2021}, }

Bartal, Yair 
STOC '21: "Near-Linear Time Approximation ..."
Near-Linear Time Approximation Schemes for Steiner Tree and Forest in Low-Dimensional Spaces
Yair Bartal and Lee-Ad Gottlieb (Hebrew University of Jerusalem, Israel; Ariel University, Israel) We give an algorithm that computes a (1+ε)-approximate Steiner forest in near-linear time n · 2^{(1/ε)^{O(ddim^{2})} (log log n)^{2}}, where ddim is the doubling dimension of the metric space. This improves upon the best previous result due to Chan et al. (SIAM J. Comput., 2018), who gave a runtime of about n^{2^{O(ddim)}} · 2^{(ddim/ε)^{O(ddim)} √(log n)}. For Steiner tree our methods achieve an even better runtime n (log n)^{(1/ε)^{O(ddim^{2})}}. @InProceedings{STOC21p1028, author = {Yair Bartal and Lee-Ad Gottlieb}, title = {Near-Linear Time Approximation Schemes for Steiner Tree and Forest in Low-Dimensional Spaces}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1028--1041}, doi = {10.1145/3406325.3451063}, year = {2021}, }

Ben-David, Shai 
STOC '21: "Learnability Can Be Independent ..."
Learnability Can Be Independent of Set Theory (Invited Paper)
Shai Ben-David, Pavel Hrubes, Shay Moran, Amir Shpilka, and Amir Yehudayoff (University of Waterloo, Canada; Czech Academy of Sciences, Czechia; Technion, Israel; Tel Aviv University, Israel) A fundamental result in statistical learning theory is the equivalence of PAC learnability of a class with the finiteness of its Vapnik-Chervonenkis dimension. However, this clean result applies only to binary classification problems. In search of a similar combinatorial characterization of learnability in a more general setting, we discovered a surprising independence from set theory for some basic general notion of learnability. Consider the following statistical estimation problem: given a family F of real-valued random variables over some domain X and an i.i.d. sample drawn from an unknown distribution P over X, find f in F such that its expectation w.r.t. P is close to the supremum expectation over all members of F. This Expectation Maximization (EMX) problem captures many well-studied learning problems. Surprisingly, we show that the EMX learnability of some simple classes depends on the cardinality of the continuum and is therefore independent of the ZFC axioms of set theory. Our results imply that there exists no “finitary” combinatorial parameter that characterizes EMX learnability in a way similar to the VC-dimension characterization of binary classification learnability. @InProceedings{STOC21p11, author = {Shai Ben-David and Pavel Hrubes and Shay Moran and Amir Shpilka and Amir Yehudayoff}, title = {Learnability Can Be Independent of Set Theory (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {11--11}, doi = {10.1145/3406325.3465360}, year = {2021}, }

Ben-David, Shalev 
STOC '21: "Degree vs. Approximate Degree ..."
Degree vs. Approximate Degree and Quantum Implications of Huang’s Sensitivity Theorem
Scott Aaronson, Shalev Ben-David, Robin Kothari, Shravas Rao, and Avishay Tal (University of Texas at Austin, USA; University of Waterloo, Canada; Microsoft Quantum, USA; Microsoft Research, USA; Northwestern University, USA; University of California at Berkeley, USA) Based on the recent breakthrough of Huang (2019), we show that for any total Boolean function f: (1) deg(f) = O(adeg(f)^2), i.e., the degree of f is at most quadratic in the approximate degree of f; this is optimal as witnessed by the OR function. (2) D(f) = O(Q(f)^4), i.e., the deterministic query complexity of f is at most quartic in the quantum query complexity of f; this matches the known separation (up to log factors) due to Ambainis, Balodis, Belovs, Lee, Santha, and Smotrovs (2017). We apply these results to resolve the quantum analogue of the Aanderaa–Karp–Rosenberg conjecture. We show that if f is a nontrivial monotone graph property of an n-vertex graph specified by its adjacency matrix, then Q(f)=Ω(n), which is also optimal. We also show that the approximate degree of any read-once formula on n variables is Θ(√n). @InProceedings{STOC21p1330, author = {Scott Aaronson and Shalev Ben-David and Robin Kothari and Shravas Rao and Avishay Tal}, title = {Degree vs. Approximate Degree and Quantum Implications of Huang’s Sensitivity Theorem}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1330--1342}, doi = {10.1145/3406325.3451047}, year = {2021}, }

Ben-Eliezer, Omri 
STOC '21: "Adversarial Laws of Large ..."
Adversarial Laws of Large Numbers and Optimal Regret in Online Classification
Noga Alon, Omri Ben-Eliezer, Yuval Dagan, Shay Moran, Moni Naor, and Eylon Yogev (Princeton University, USA; Tel Aviv University, Israel; Harvard University, USA; Massachusetts Institute of Technology, USA; Technion, Israel; Google Research, Israel; Weizmann Institute of Science, Israel; Boston University, USA) Laws of large numbers guarantee that given a large enough sample from some population, the measure of any fixed subpopulation is well-estimated by its frequency in the sample. We study laws of large numbers in sampling processes that can affect the environment they are acting upon and interact with it. Specifically, we consider the sequential sampling model proposed by Ben-Eliezer and Yogev (2020), and characterize the classes which admit a uniform law of large numbers in this model: these are exactly the classes that are online learnable. Our characterization may be interpreted as an online analogue to the equivalence between learnability and uniform convergence in statistical (PAC) learning. The sample-complexity bounds we obtain are tight for many parameter regimes, and as an application, we determine the optimal regret bounds in online learning, stated in terms of Littlestone’s dimension, thus resolving the main open question from Ben-David, Pál, and Shalev-Shwartz (2009), which was also posed by Rakhlin, Sridharan, and Tewari (2015). @InProceedings{STOC21p447, author = {Noga Alon and Omri Ben-Eliezer and Yuval Dagan and Shay Moran and Moni Naor and Eylon Yogev}, title = {Adversarial Laws of Large Numbers and Optimal Regret in Online Classification}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {447--455}, doi = {10.1145/3406325.3451041}, year = {2021}, }

Beniamini, Gal 
STOC '21: "Bipartite Perfect Matching ..."
Bipartite Perfect Matching as a Real Polynomial
Gal Beniamini and Noam Nisan (Hebrew University of Jerusalem, Israel) We obtain a description of the Bipartite Perfect Matching decision problem as a multilinear polynomial over the reals. We show that it has full degree and (1−o_{n}(1))· 2^{n^{2}} monomials with nonzero coefficients. In contrast, we show that in the dual representation (switching the roles of 0 and 1) the number of monomials is only exponential in Θ(n log n). Our proof relies heavily on the fact that the lattice of graphs which are “matching-covered” is Eulerian. @InProceedings{STOC21p1118, author = {Gal Beniamini and Noam Nisan}, title = {Bipartite Perfect Matching as a Real Polynomial}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1118--1131}, doi = {10.1145/3406325.3451002}, year = {2021}, }
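The multilinear polynomial in question can be computed explicitly for tiny n by Möbius inversion over edge subsets. A brute-force sketch, illustrative only: for n=2 the polynomial is x00·x11 + x01·x10 − x00·x01·x10·x11, which already exhibits the full degree n² = 4, while the (1−o(1))·2^{n²} monomial count only becomes visible at larger n:

```python
from itertools import combinations, permutations

def has_pm(edges, n):
    """Does the bipartite graph on [n]+[n] with this edge set have a
    perfect matching? (Brute force over row-to-column bijections.)"""
    return any(all((i, s[i]) in edges for i in range(n))
               for s in permutations(range(n)))

def matching_polynomial_coeffs(n):
    """Mobius inversion over subsets of the n*n potential edges yields
    the unique multilinear real polynomial that agrees with the
    perfect-matching indicator on {0,1}^(n^2)."""
    all_edges = [(i, j) for i in range(n) for j in range(n)]
    coef = {}
    for k in range(len(all_edges) + 1):
        for S in combinations(all_edges, k):
            c = sum((-1) ** (len(S) - t) * has_pm(set(T), n)
                    for t in range(len(S) + 1)
                    for T in combinations(S, t))
            if c != 0:
                coef[S] = c
    return coef

coef = matching_polynomial_coeffs(2)
assert coef == {((0, 0), (1, 1)): 1,
                ((0, 1), (1, 0)): 1,
                ((0, 0), (0, 1), (1, 0), (1, 1)): -1}
```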

Bernstein, Aaron 
STOC '21: "A Framework for Dynamic Matching ..."
A Framework for Dynamic Matching in Weighted Graphs
Aaron Bernstein, Aditi Dudeja, and Zachary Langley (Rutgers University, USA) We introduce a new framework for computing approximate maximum weight matchings. Our primary focus is on the fully dynamic setting, where there is a large gap between the guarantees of the best known algorithms for computing weighted and unweighted matchings. Indeed, almost all current weighted matching algorithms that reduce to the unweighted problem lose a factor of two in the approximation ratio. In contrast, in other sublinear models such as the distributed and streaming models, recent work has largely closed this weighted/unweighted gap. For bipartite graphs, we almost completely settle the gap with a general reduction that converts any algorithm for α-approximate unweighted matching into an algorithm for (1−ε)α-approximate weighted matching, while only increasing the update time by an O(log n) factor for constant ε. We also show that our framework leads to significant improvements for non-bipartite graphs, though not in the form of a universal reduction. In particular, we give two algorithms for weighted non-bipartite matching: 1. A randomized (Las Vegas) fully dynamic algorithm that maintains a (1/2−ε)-approximate maximum weight matching in worst-case update time O(polylog n) with high probability against an adaptive adversary. Our bounds are essentially the same as those of the unweighted algorithm of Wajc [STOC 2020]. 2. A deterministic fully dynamic algorithm that maintains a (2/3−ε)-approximate maximum weight matching in amortized update time O(m^{1/4}). Our bounds are essentially the same as those of the unweighted algorithm of Bernstein and Stein [SODA 2016]. A key feature of our framework is that it uses existing algorithms for unweighted matching as black boxes. As a result, our framework is simple and versatile. Moreover, our framework easily translates to other models, and we use it to derive new results for the weighted matching problem in streaming and communication complexity models. 
@InProceedings{STOC21p668, author = {Aaron Bernstein and Aditi Dudeja and Zachary Langley}, title = {A Framework for Dynamic Matching in Weighted Graphs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {668--681}, doi = {10.1145/3406325.3451113}, year = {2021}, }

Bhandari, Siddharth 
STOC '21: "Decoding Multivariate Multiplicity ..."
Decoding Multivariate Multiplicity Codes on Product Sets
Siddharth Bhandari, Prahladh Harsha, Mrinal Kumar, and Madhu Sudan (Tata Institute of Fundamental Research, India; IIT Bombay, India; Harvard University, USA) The multiplicity Schwartz-Zippel lemma bounds the total multiplicity of zeroes of a multivariate polynomial on a product set. This lemma motivates the multiplicity codes of Kopparty, Saraf and Yekhanin [J. ACM, 2014], who showed how to use this lemma to construct high-rate locally decodable codes. However, the algorithmic results about these codes crucially rely on the fact that the polynomials are evaluated on a vector space and not an arbitrary product set. In this work, we show how to decode multivariate multiplicity codes of large multiplicities in polynomial time over finite product sets (over fields of large characteristic and zero characteristic). Previously such decoding algorithms were not known even for a positive fraction of errors. In contrast, our work goes all the way to the distance of the code and in particular exceeds both the unique decoding bound and the Johnson radius. For errors exceeding the Johnson radius, even combinatorial list-decodability of these codes was not known. Our algorithm is an application of the classical polynomial method directly to the multivariate setting. In particular, we do not rely on a reduction from the multivariate to the univariate case as is typical of many of the existing results on decoding codes based on multivariate polynomials. However, a vanilla application of the polynomial method in the multivariate setting does not yield a polynomial upper bound on the list size. We obtain a polynomial bound on the list size by taking an alternative view of multivariate multiplicity codes. In this view, we glue all the partial derivatives of the same order together using a fresh set of variables. We then apply the polynomial method by viewing this as a problem over the field of rational functions in these fresh variables. 
@InProceedings{STOC21p1489, author = {Siddharth Bhandari and Prahladh Harsha and Mrinal Kumar and Madhu Sudan}, title = {Decoding Multivariate Multiplicity Codes on Product Sets}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1489--1501}, doi = {10.1145/3406325.3451027}, year = {2021}, }

Bhangale, Amey 
STOC '21: "Optimal Inapproximability ..."
Optimal Inapproximability of Satisfiable k-LIN over Non-Abelian Groups
Amey Bhangale and Subhash Khot (University of California at Riverside, USA; New York University, USA) A seminal result of Håstad (2001) shows that it is NP-hard to find an assignment that satisfies a 1/|G|+ε fraction of the constraints of a given k-LIN instance over an abelian group G, even if there is an assignment that satisfies a (1−ε) fraction of the constraints, for any constant ε>0. Engebretsen, Holmerin and Russell (2004) later showed that the same hardness result holds for k-LIN instances over any finite non-abelian group. Unlike the abelian case, where we can efficiently find a solution if the instance is satisfiable, in the non-abelian case it is NP-complete to decide if a given system of linear equations is satisfiable or not, as shown by Goldmann and Russell (1999). Surprisingly, for certain non-abelian groups G, given a satisfiable k-LIN instance over G, one can in fact do better than just outputting a random assignment, using a simple but clever algorithm. The approximation factor achieved by this algorithm varies with the underlying group. In this paper, we show that this algorithm is optimal by proving a tight hardness of approximation for satisfiable k-LIN instances over any non-abelian group G, assuming P ≠ NP. As a corollary, we also get 3-query probabilistically checkable proofs with perfect completeness over large alphabets with improved soundness. Our proof crucially uses the quasirandom properties of non-abelian groups defined by Gowers (2008). @InProceedings{STOC21p1615, author = {Amey Bhangale and Subhash Khot}, title = {Optimal Inapproximability of Satisfiable k-LIN over Non-Abelian Groups}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1615--1628}, doi = {10.1145/3406325.3451003}, year = {2021}, }
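The baseline every algorithm must beat is a uniformly random assignment, which satisfies a 1/|G| fraction of the constraints in expectation: in an equation x·y·z = g over distinct variables, the third variable is determined by the other two. A quick exhaustive check of this baseline over the non-abelian group S3 (illustrative only; this is not the paper's algorithm):

```python
from itertools import permutations, product

# The symmetric group S3: elements are permutations of {0,1,2},
# and the group operation is composition.
S3 = list(permutations(range(3)))

def mul(p, q):
    """Composition (p o q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(3))

# One 3-LIN equation x*y*z = g in three distinct variables: for every
# choice of x and y there is exactly one satisfying z, so exactly
# |G|^2 of the |G|^3 assignments satisfy it -- a 1/|G| fraction.
g = S3[4]  # an arbitrary right-hand side
sat = sum(mul(mul(x, y), z) == g for x, y, z in product(S3, repeat=3))
assert sat == len(S3) ** 2            # 36 of the 216 assignments
assert sat * len(S3) == len(S3) ** 3  # i.e. a 1/|G| = 1/6 fraction
```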

Bhargava, Vishwas 
STOC '21: "Reconstruction Algorithms ..."
Reconstruction Algorithms for Low-Rank Tensors and Depth-3 Multilinear Circuits
Vishwas Bhargava, Shubhangi Saraf, and Ilya Volkovich (Rutgers University, USA; Boston College, USA) We give new and efficient black-box reconstruction algorithms for some classes of depth-3 arithmetic circuits. As a consequence, we obtain the first efficient algorithm for computing the tensor rank and for finding the optimal tensor decomposition as a sum of rank-one tensors when the input is a constant-rank tensor. More specifically, we provide efficient learning algorithms that run in randomized polynomial time over general fields, and in deterministic polynomial time over certain fields, for the following classes: 1) Set-multilinear depth-3 circuits of constant top fan-in (set-multilinear ΣΠΣ(k) circuits). As a consequence of our algorithm, we obtain the first polynomial-time algorithm for tensor rank computation and optimal tensor decomposition of constant-rank tensors. This result holds for d-dimensional tensors for any d, but is interesting even for d=3. 2) Sums of powers of constantly many linear forms (Σ∧Σ(k) circuits). As a consequence, we obtain the first polynomial-time algorithm for tensor rank computation and optimal tensor decomposition of constant-rank symmetric tensors. 3) Multilinear depth-3 circuits of constant top fan-in (multilinear ΣΠΣ(k) circuits). Our algorithm works over all fields of characteristic 0 or large enough characteristic. Prior to our work, the only efficient algorithms known were over polynomially-sized finite fields (see Karnin and Shpilka ’09). Prior to our work, the only polynomial-time or even subexponential-time algorithms known (deterministic or randomized) for subclasses of ΣΠΣ(k) circuits that also work over large/infinite fields were for the setting where the top fan-in k is at most 2 (see Sinha ’16 and Sinha ’20). 
@InProceedings{STOC21p809, author = {Vishwas Bhargava and Shubhangi Saraf and Ilya Volkovich}, title = {Reconstruction Algorithms for Low-Rank Tensors and Depth-3 Multilinear Circuits}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {809--822}, doi = {10.1145/3406325.3451096}, year = {2021}, }

Bhattacharyya, Arnab 
STOC '21: "Near-Optimal Learning of Tree-Structured ..."
Near-Optimal Learning of Tree-Structured Distributions by Chow-Liu
Arnab Bhattacharyya, Sutanu Gayen, Eric Price, and N. V. Vinodchandran (National University of Singapore, Singapore; University of Texas at Austin, USA; University of Nebraska-Lincoln, USA) We provide finite sample guarantees for the classical Chow-Liu algorithm (IEEE Trans. Inform. Theory, 1968) to learn a tree-structured graphical model of a distribution. For a distribution P on Σ^{n} and a tree T on n nodes, we say T is an ε-approximate tree for P if there is a T-structured distribution Q such that D(P ∥ Q) is at most ε more than the best possible tree-structured distribution for P. We show that if P itself is tree-structured, then the Chow-Liu algorithm with the plug-in estimator for mutual information with O(|Σ|^{3} nε^{−1}) i.i.d. samples outputs an ε-approximate tree for P with constant probability. In contrast, for a general P (which may not be tree-structured), Ω(n^{2}ε^{−2}) samples are necessary to find an ε-approximate tree. Our upper bound is based on a new conditional independence tester that addresses an open problem posed by Canonne, Diakonikolas, Kane, and Stewart (STOC, 2018): we prove that for three random variables X,Y,Z each over Σ, testing if I(X; Y ∣ Z) is 0 or ≥ ε is possible with O(|Σ|^{3}/ε) samples. Finally, we show that for a specific tree T, with O(|Σ|^{2}nε^{−1}) samples from a distribution P over Σ^{n}, one can efficiently learn the closest T-structured distribution in KL divergence by applying the add-1 estimator at each node. @InProceedings{STOC21p147, author = {Arnab Bhattacharyya and Sutanu Gayen and Eric Price and N. V. Vinodchandran}, title = {Near-Optimal Learning of Tree-Structured Distributions by Chow-Liu}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {147--160}, doi = {10.1145/3406325.3451066}, year = {2021}, }
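The Chow-Liu algorithm itself is short: estimate all pairwise mutual informations with the plug-in estimator and return a maximum-weight spanning tree. A minimal sketch, with an invented Markov-chain data generator and sample size for illustration (the paper's contribution is the finite-sample analysis, not the algorithm):

```python
import math
import random
from collections import Counter
from itertools import combinations

def plugin_mi(samples, i, j):
    """Plug-in (empirical) estimate of the mutual information
    I(X_i; X_j), in nats, from a list of sample tuples."""
    n = len(samples)
    pij = Counter((s[i], s[j]) for s in samples)
    pi = Counter(s[i] for s in samples)
    pj = Counter(s[j] for s in samples)
    return sum((c / n) * math.log(c * n / (pi[a] * pj[b]))
               for (a, b), c in pij.items())

def chow_liu_tree(samples, dim):
    """Maximum-weight spanning tree under empirical mutual information
    (Prim's algorithm); returns the list of tree edges."""
    w = {e: plugin_mi(samples, *e) for e in combinations(range(dim), 2)}
    in_tree, edges = {0}, []
    while len(in_tree) < dim:
        e = max((e for e in w if (e[0] in in_tree) != (e[1] in in_tree)),
                key=lambda e: w[e])
        edges.append(e)
        in_tree.update(e)
    return edges

# Toy tree-structured (Markov chain) data: X1 copies X0 and X2 copies
# X1, each bit flipped with probability 0.1.
random.seed(0)
def draw():
    x0 = random.randint(0, 1)
    x1 = x0 ^ (random.random() < 0.1)
    x2 = x1 ^ (random.random() < 0.1)
    return (x0, x1, x2)

samples = [draw() for _ in range(2000)]
assert sorted(chow_liu_tree(samples, 3)) == [(0, 1), (1, 2)]
```

The recovered tree is the true chain: the data-processing inequality makes I(X0; X2) smaller than the adjacent-pair informations, so the maximum-weight tree picks the edges (0,1) and (1,2).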

Bhattiprolu, Vijay 
STOC '21: "A Framework for Quadratic ..."
A Framework for Quadratic Form Maximization over Convex Sets through Nonconvex Relaxations
Vijay Bhattiprolu, Euiwoong Lee, and Assaf Naor (Institute for Advanced Study at Princeton, USA; Princeton University, USA; University of Michigan, USA) We investigate the approximability of the following optimization problem. The input is an n× n matrix A=(A_{ij}) with real entries and an origin-symmetric convex body K⊂ ℝ^{n} that is given by a membership oracle. The task is to compute (or approximate) the maximum of the quadratic form ∑_{i=1}^{n}∑_{j=1}^{n} A_{ij} x_{i}x_{j}=⟨ x,Ax⟩ as x ranges over K. This is a rich and expressive family of optimization problems; for different choices of matrices A and convex bodies K it includes a diverse range of optimization problems like max-cut, Grothendieck/non-commutative Grothendieck inequalities, small set expansion, and more. While the literature has studied these special cases using case-specific reasoning, here we develop a general methodology for treatment of the approximability and inapproximability aspects of these questions. The underlying geometry of K plays a critical role; we show under commonly used complexity assumptions that poly-time constant-approximability necessitates that K has a type-2 constant that grows slowly with n. However, we show that even when the type-2 constant is bounded, this problem sometimes exhibits strong hardness of approximation. Thus, even within the realm of type-2 bodies, the approximability landscape is nuanced and subtle. However, the link that we establish between optimization and the geometry of Banach spaces allows us to devise a generic algorithmic approach to the above problem. We associate to each convex body a new (higher-dimensional) auxiliary set that is not convex, but is approximately convex when K has a bounded type-2 constant. If our auxiliary set has an approximate separation oracle, then we design an approximation algorithm for the original quadratic optimization problem, using an approximate version of the ellipsoid method. 
Even though our hardness result implies that such an oracle does not exist in general, this new question can be solved in specific cases of interest by implementing a range of classical tools from functional analysis, most notably the deep factorization theory of linear operators. Beyond encompassing the scenarios in the literature for which constant-factor approximation algorithms were found, our generic framework implies that for convex sets with a bounded type-2 constant, constant-factor approximability is preserved under the following basic operations: (a) Subspaces, (b) Quotients, (c) Minkowski Sums, (d) Complex Interpolation. This yields a rich family of new examples where constant-factor approximations are possible, which were beyond the reach of previous methods. We also show (under commonly used complexity assumptions) that for symmetric norms and unitarily invariant matrix norms the type-2 constant nearly characterizes the approximability of quadratic maximization. @InProceedings{STOC21p870, author = {Vijay Bhattiprolu and Euiwoong Lee and Assaf Naor}, title = {A Framework for Quadratic Form Maximization over Convex Sets through Nonconvex Relaxations}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {870--881}, doi = {10.1145/3406325.3451128}, year = {2021}, } Publisher's Version 
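To make the setting concrete, consider the simplest body K = [−1,1]^{n}. The sketch below (a brute-force toy, not the paper's ellipsoid-based method) evaluates ⟨x, Ax⟩ at the 2^{n} cube vertices; when A is positive semidefinite the form is convex, so the vertex maximum is also the maximum over the whole cube.

```python
from itertools import product

def quad_form(A, x):
    """Evaluate <x, Ax> for a matrix A given as a list of rows."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def max_over_cube_vertices(A):
    """Brute-force max of <x, Ax> over the 2^n vertices of [-1, 1]^n.
    For positive semidefinite A the form is convex, so this also equals
    the maximum over the entire cube; exponential time, toy sizes only."""
    return max(quad_form(A, x) for x in product((-1, 1), repeat=len(A)))
```

For the identity matrix the maximum is n, attained at every vertex; this ±1 special case is exactly the regime of the Grothendieck-type problems mentioned in the abstract.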

Blais, Eric 
STOC '21: "VC Dimension and Distribution-Free ..."
VC Dimension and Distribution-Free Sample-Based Testing
Eric Blais, Renato Ferreira Pinto Jr., and Nathaniel Harms (University of Waterloo, Canada; Google, Canada) We consider the problem of determining which classes of functions can be tested more efficiently than they can be learned, in the distribution-free sample-based model that corresponds to the standard PAC learning setting. Our main result shows that while VC dimension by itself does not always provide tight bounds on the number of samples required to test a class of functions in this model, it can be combined with a closely related variant that we call “lower VC” (or LVC) dimension to obtain strong lower bounds on this sample complexity. We use this result to obtain strong and in many cases nearly optimal bounds on the sample complexity for testing unions of intervals, halfspaces, intersections of halfspaces, polynomial threshold functions, and decision trees. Conversely, we show that two natural classes of functions, juntas and monotone functions, can be tested with a number of samples that is polynomially smaller than the number of samples required for PAC learning. Finally, we also use the connection between VC dimension and property testing to establish new lower bounds for testing radius clusterability and testing feasibility of linear constraint systems. @InProceedings{STOC21p504, author = {Eric Blais and Renato Ferreira Pinto Jr. and Nathaniel Harms}, title = {VC Dimension and Distribution-Free Sample-Based Testing}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {504--517}, doi = {10.1145/3406325.3451104}, year = {2021}, } Publisher's Version 
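The combinatorial parameter at the core of this entry can be checked directly from its definition on toy classes. A minimal brute-force sketch (my own illustration, exponential in the domain size): a class shatters a point set if it realizes all 2^{d} labelings of it, and the VC dimension is the largest shattered size.

```python
from itertools import combinations

def shatters(concepts, points):
    """True if the concept class (a list of sets over a finite domain)
    realizes all 2^|points| intersection patterns on `points`."""
    pts = set(points)
    patterns = {frozenset(c & pts) for c in concepts}
    return len(patterns) == 2 ** len(pts)

def vc_dimension(domain, concepts):
    """Largest d such that some d-subset of the domain is shattered."""
    d = 0
    for k in range(1, len(domain) + 1):
        if any(shatters(concepts, s) for s in combinations(domain, k)):
            d = k
    return d
```

For example, intervals over a finite line have VC dimension 2 (for three points a < b < c, no interval picks out {a, c} without b), while one-sided thresholds have VC dimension 1.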

Blanca, Antonio 
STOC '21: "Entropy Decay in the Swendsen–Wang ..."
Entropy Decay in the Swendsen–Wang Dynamics on ℤ^{d}
Antonio Blanca, Pietro Caputo, Daniel Parisi, Alistair Sinclair, and Eric Vigoda (Pennsylvania State University, USA; Roma Tre University, Italy; University of California at Berkeley, USA; Georgia Institute of Technology, USA) We study the mixing time of the Swendsen–Wang dynamics for the ferromagnetic Ising and Potts models on the integer lattice ℤ^{d}. This dynamics is a widely used Markov chain that has largely resisted sharp analysis because it is non-local, i.e., it changes the entire configuration in one step. We prove that, whenever strong spatial mixing (SSM) holds, the mixing time on any n-vertex cube in ℤ^{d} is O(log n), and we prove this is tight by establishing a matching lower bound. The previous best known bound was O(n). SSM is a standard condition corresponding to exponential decay of correlations with distance between spins on the lattice and is known to hold in d=2 dimensions throughout the high-temperature (single phase) region. Our result follows from a modified log-Sobolev inequality, which expresses the fact that the dynamics contracts relative entropy at a constant rate at each step. The proof of this fact utilizes a new factorization of the entropy in the joint probability space over spins and edges that underlies the Swendsen–Wang dynamics, which extends to general bipartite graphs of bounded degree. This factorization leads to several additional results, including mixing time bounds for a number of natural local and non-local Markov chains on the joint space, as well as for the standard random-cluster dynamics. @InProceedings{STOC21p1551, author = {Antonio Blanca and Pietro Caputo and Daniel Parisi and Alistair Sinclair and Eric Vigoda}, title = {Entropy Decay in the Swendsen–Wang Dynamics on ℤ<sup><i>d</i></sup>}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1551--1564}, doi = {10.1145/3406325.3451095}, year = {2021}, } Publisher's Version 
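The dynamics being analyzed is simple to state. A minimal sketch of one Swendsen–Wang update for the Ising model (the q = 2 Potts case), assuming the usual edge-percolation probability p = 1 − e^{−2β} for agreeing neighbors; graph, β, and data layout here are illustrative choices, not from the paper.

```python
import math
import random

def swendsen_wang_step(spins, edges, beta, rng=random):
    """One Swendsen-Wang update for the Ising model.
    spins: dict vertex -> +1/-1; edges: iterable of (u, v) pairs.
    Step 1: keep each edge whose endpoints agree, with prob p = 1 - exp(-2*beta).
    Step 2: find connected components of the kept edges (union-find).
    Step 3: assign every component a fresh uniform +/-1 spin."""
    p = 1.0 - math.exp(-2.0 * beta)
    kept = [(u, v) for (u, v) in edges
            if spins[u] == spins[v] and rng.random() < p]
    parent = {v: v for v in spins}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for u, v in kept:
        parent[find(u)] = find(v)
    comp_spin, new = {}, {}
    for v in spins:
        r = find(v)
        if r not in comp_spin:
            comp_spin[r] = rng.choice((-1, 1))
        new[v] = comp_spin[r]
    return new
```

This non-locality is visible in the code: a single step can flip arbitrarily large components at once, which is exactly what makes sharp mixing analysis difficult.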

Blikstad, Joakim 
STOC '21: "Breaking the Quadratic Barrier ..."
Breaking the Quadratic Barrier for Matroid Intersection
Joakim Blikstad, Jan van den Brand, Sagnik Mukhopadhyay, and Danupon Nanongkai (KTH, Sweden; University of Copenhagen, Denmark) The matroid intersection problem is a fundamental problem that has been extensively studied for half a century. In the classic version of this problem, we are given two matroids M_{1} = (V, I_{1}) and M_{2} = (V, I_{2}) on a common ground set V of n elements, and then we have to find the largest common independent set S ∈ I_{1} ∩ I_{2} by making independence oracle queries of the form “Is S ∈ I_{1}?” or “Is S ∈ I_{2}?” for S ⊆ V. The goal is to minimize the number of queries. Beating the existing Õ(n^{2}) bound, known as the quadratic barrier, is an open problem that captures the limits of techniques from two lines of work. The first is the classic algorithm of Cunningham [SICOMP 1986], whose Õ(n^{2})-query implementations were shown by CLS+ [FOCS 2019] and Nguyen [2019] (more generally, these algorithms take Õ(nr) queries, where r denotes the rank, which can be as big as n). The other is the general cutting plane method of Lee, Sidford, and Wong [FOCS 2015]. The only progress towards breaking the quadratic barrier requires either approximation algorithms or a more powerful rank oracle query [CLS+ FOCS 2019]. No exact algorithm with o(n^{2}) independence queries was known. In this work, we break the quadratic barrier with a randomized algorithm guaranteeing Õ(n^{9/5}) independence queries with high probability, and a deterministic algorithm guaranteeing Õ(n^{11/6}) independence queries. Our key insight is simple and fast algorithms to solve a graph reachability problem that arose in the standard augmenting path framework [Edmonds 1968]. Combining this with previous exact and approximation algorithms leads to our results. 
@InProceedings{STOC21p421, author = {Joakim Blikstad and Jan van den Brand and Sagnik Mukhopadhyay and Danupon Nanongkai}, title = {Breaking the Quadratic Barrier for Matroid Intersection}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {421--432}, doi = {10.1145/3406325.3451092}, year = {2021}, } Publisher's Version 
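The independence-oracle model is easy to experiment with. The sketch below (my own brute force, exponential in n and nothing like the paper's Õ(n^{9/5}) algorithm) finds the largest common independent set while counting oracle queries; the partition-matroid oracles are illustrative assumptions.

```python
from itertools import combinations

def matroid_intersection_bruteforce(V, indep1, indep2):
    """Return (largest common independent set, number of oracle queries).
    indep1/indep2 answer the query "Is S independent?" for a frozenset S.
    Searches subsets by decreasing size -- a toy baseline, not an algorithm."""
    count = [0]
    def ask(oracle, S):
        count[0] += 1
        return oracle(S)
    for k in range(len(V), 0, -1):
        for S in combinations(V, k):
            S = frozenset(S)
            if ask(indep1, S) and ask(indep2, S):
                return S, count[0]
    return frozenset(), count[0]

def partition_matroid(blocks):
    """Independence oracle of a partition matroid: at most one element per block."""
    return lambda S: all(len(S & b) <= 1 for b in blocks)
```

Counting `ask` calls on such toys makes the abstract's cost measure tangible: the quadratic barrier is about the number of these calls, not running time.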

Bonamy, Marthe 
STOC '21: "Optimal Labelling Schemes ..."
Optimal Labelling Schemes for Adjacency, Comparability, and Reachability
Marthe Bonamy, Louis Esperet, Carla Groenland, and Alex Scott (CNRS, France; LaBRI, France; University of Bordeaux, France; G-SCOP, France; Grenoble Alps University, France; University of Oxford, UK) We construct asymptotically optimal adjacency labelling schemes for every hereditary class containing 2^{Ω(n^{2})} n-vertex graphs as n→ ∞. This regime contains many classes of interest, for instance perfect graphs or comparability graphs, for which we obtain an adjacency labelling scheme with labels of n/4+o(n) bits per vertex. This implies the existence of a reachability labelling scheme for digraphs with labels of n/4+o(n) bits per vertex and a comparability labelling scheme for posets with labels of n/4+o(n) bits per element. All these results are best possible, up to the lower order term. @InProceedings{STOC21p1109, author = {Marthe Bonamy and Louis Esperet and Carla Groenland and Alex Scott}, title = {Optimal Labelling Schemes for Adjacency, Comparability, and Reachability}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1109--1117}, doi = {10.1145/3406325.3451102}, year = {2021}, } Publisher's Version 
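For scale, the trivial labelling scheme spends a full n bits per vertex (its adjacency-matrix row), and the result above shrinks this to n/4 + o(n) bits for the classes it covers. A minimal sketch of the trivial baseline, where adjacency is decided from the two labels alone:

```python
def trivial_labels(adj):
    """Trivial adjacency labelling: vertex v gets (v, row v of the adjacency
    matrix), i.e. about n bits per vertex -- the naive baseline, not the
    paper's n/4 + o(n) construction."""
    return {v: (v, tuple(adj[v])) for v in range(len(adj))}

def adjacent(label_u, label_v):
    """Decide adjacency from two labels alone, with no access to the graph."""
    u, row_u = label_u
    v, _ = label_v
    return row_u[v] == 1
```

The decoder never sees the graph, only the two labels; that locality requirement is what makes shaving the constant in front of n nontrivial.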

Bousquet, Olivier 
STOC '21: "A Theory of Universal Learning ..."
A Theory of Universal Learning
Olivier Bousquet, Steve Hanneke, Shay Moran, Ramon van Handel, and Amir Yehudayoff (Google, Switzerland; Toyota Technological Institute at Chicago, USA; Technion, Israel; Google Research, Israel; Princeton University, USA) How quickly can a given class of concepts be learned from examples? It is common to measure the performance of a supervised machine learning algorithm by plotting its “learning curve”, that is, the decay of the error rate as a function of the number of training examples. However, the classical theoretical framework for understanding learnability, the PAC model of Vapnik–Chervonenkis and Valiant, does not explain the behavior of learning curves: the distribution-free PAC model of learning can only bound the upper envelope of the learning curves over all possible data distributions. This does not match the practice of machine learning, where the data source is typically fixed in any given scenario, while the learner may choose the number of training examples on the basis of factors such as computational resources and desired accuracy. In this paper, we study an alternative learning model that better captures such practical aspects of machine learning, but still gives rise to a complete theory of the learnable in the spirit of the PAC model. More precisely, we consider the problem of universal learning, which aims to understand the performance of learning algorithms on every data distribution, but without requiring uniformity over the distribution. The main result of this paper is a remarkable trichotomy: there are only three possible rates of universal learning. More precisely, we show that the learning curves of any given concept class decay at either an exponential, linear, or arbitrarily slow rate. Moreover, each of these cases is completely characterized by appropriate combinatorial parameters, and we exhibit optimal learning algorithms that achieve the best possible rate in each case. 
For concreteness, we consider in this paper only the realizable case, though analogous results are expected to extend to more general learning scenarios. @InProceedings{STOC21p532, author = {Olivier Bousquet and Steve Hanneke and Shay Moran and Ramon van Handel and Amir Yehudayoff}, title = {A Theory of Universal Learning}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {532--541}, doi = {10.1145/3406325.3451087}, year = {2021}, } Publisher's Version 

Braverman, Mark 
STOC '21: "New Separations Results for ..."
New Separations Results for External Information
Mark Braverman and Dor Minzer (Princeton University, USA; Massachusetts Institute of Technology, USA) We obtain new separation results for the two-party external information complexity of Boolean functions. The external information complexity of a function f(x,y) is the minimum amount of information a two-party protocol computing f must reveal to an outside observer about the input. We prove an exponential separation between external and internal information complexity, which is the best possible; previously, no separation was known. We then use this result to prove a near-quadratic separation between amortized zero-error communication complexity and external information complexity for total functions, disproving a conjecture of the first author. Finally, we prove a matching upper bound showing that our separation result is tight. @InProceedings{STOC21p248, author = {Mark Braverman and Dor Minzer}, title = {New Separations Results for External Information}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {248--258}, doi = {10.1145/3406325.3451044}, year = {2021}, } Publisher's Version 

Bressan, Marco 
STOC '21: "Efficient and Near-Optimal ..."
Efficient and Near-Optimal Algorithms for Sampling Connected Subgraphs
Marco Bressan (University of Milan, Italy) We study the graphlet sampling problem: given an integer k ≥ 3 and a graph G=(V,E), sample a connected induced k-node subgraph of G (also called a k-graphlet) uniformly at random. This is a fundamental graph mining primitive, with applications in social network analysis and bioinformatics. The two state-of-the-art techniques are random walks and color coding. The random walk is elegant, but the current upper and lower bounds on its mixing time suffer a gap of Δ^{k−1}, where Δ is the maximum degree of G. Color coding is better understood, but requires a 2^{O(k)} m-time preprocessing over the entire graph. Moreover, no efficient algorithm is known for sampling graphlets uniformly — random walks and color coding yield only є-uniform samples. In this work, we provide the following results: (i) A near-optimal mixing time bound for the classic k-graphlet random walk, as a function of the mixing time of G. In particular, ignoring k^{O(k)} factors, we show that the k-graphlet random walk mixes in Θ(t(G) · ρ(G)^{k−1}) steps, where t(G) is the mixing time of G and ρ(G) is the ratio between its maximum and minimum degree, and on some graphs this is tight up to lg n factors. (ii) The first efficient algorithm for uniform graphlet sampling. The algorithm has a preprocessing phase that uses time O(n k^{2} lg k + m) and space O(n), and a sampling phase that takes k^{O(k)} lg Δ time per sample. It is based on ordering G in a simple way, so as to virtually partition the graphlets into buckets, and then sampling from those buckets using rejection sampling. The algorithm can also be used for counting, with additive guarantees. (iii) A near-optimal algorithm for є-uniform graphlet sampling, with a preprocessing phase that runs in time O(k^{6} є^{−1} n lg n) and space O(n), and a sampling phase that takes k^{O(k)}(1/є)^{10} lg(1/є) expected time per sample. 
The algorithm is based on a nontrivial sketching of the ordering of G, followed by emulating uniform sampling through coupling arguments. @InProceedings{STOC21p1132, author = {Marco Bressan}, title = {Efficient and Near-Optimal Algorithms for Sampling Connected Subgraphs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1132--1143}, doi = {10.1145/3406325.3451042}, year = {2021}, } Publisher's Version 
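As a correctness reference for the uniformity guarantee above, exactly uniform k-graphlet samples can be produced on tiny graphs by full enumeration. This is exponential in general, which is precisely what the paper's bucketed rejection sampling avoids; the sketch below is only a toy baseline.

```python
import random
from itertools import combinations

def induces_connected(nodes, adj):
    """Check that `nodes` induces a connected subgraph (DFS inside the set).
    adj: dict vertex -> list of neighbors."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w in nodes and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == nodes

def uniform_k_graphlet(adj, k, rng=random):
    """Exactly uniform k-graphlet sample by enumerating every connected
    induced k-node subgraph -- a reference implementation for tiny graphs."""
    graphlets = [c for c in combinations(sorted(adj), k)
                 if induces_connected(c, adj)]
    return rng.choice(graphlets)
```

On the path 0-1-2-3, for instance, the only 3-graphlets are {0,1,2} and {1,2,3}, and each is returned with probability exactly 1/2.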

Bringmann, Karl 
STOC '21: "Sparse Nonnegative Convolution ..."
Sparse Nonnegative Convolution Is Equivalent to Dense Nonnegative Convolution
Karl Bringmann, Nick Fischer, and Vasileios Nakos (Saarland University, Germany; MPI-INF, Germany) Computing the convolution A ⋆ B of two length-n vectors A,B is a ubiquitous computational primitive, with applications in a variety of disciplines. Within theoretical computer science, applications range from string problems to Knapsack-type problems, and from 3SUM to All-Pairs Shortest Paths. These applications often come in the form of nonnegative convolution, where the entries of A,B are nonnegative integers. The classical algorithm to compute A⋆ B uses the Fast Fourier Transform (FFT) and runs in time O(n log n). However, in many cases A and B might satisfy sparsity conditions, and hence one could hope for significant gains compared to the standard FFT algorithm. The ideal goal would be an O(k log k)-time algorithm, where k is the number of nonzero elements in the output, i.e., the size of the support of A ⋆ B. This problem is referred to as sparse nonnegative convolution, and has received a considerable amount of attention in the literature; the fastest algorithms to date run in time O(k log^{2} n). The main result of this paper is the first O(k log k)-time algorithm for sparse nonnegative convolution. Our algorithm is randomized and assumes that the length n and the largest entry of A and B are subexponential in k. Surprisingly, we can phrase our algorithm as a reduction from the sparse case to the dense case of nonnegative convolution, showing that, under some mild assumptions, sparse nonnegative convolution is equivalent to dense nonnegative convolution for constant-error randomized algorithms. Specifically, if D(n) is the time to convolve two nonnegative length-n vectors with success probability 2/3, and S(k) is the time to convolve two nonnegative vectors with output size k with success probability 2/3, then S(k) = O(D(k) + k (log log k)^{2}). 
Our approach uses a variety of new techniques in combination with some old machinery from linear sketching and structured linear algebra, as well as new insights on linear hashing, the most classical hash function. @InProceedings{STOC21p1711, author = {Karl Bringmann and Nick Fischer and Vasileios Nakos}, title = {Sparse Nonnegative Convolution Is Equivalent to Dense Nonnegative Convolution}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1711--1724}, doi = {10.1145/3406325.3451090}, year = {2021}, } Publisher's Version 
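The dense/sparse distinction is easy to see in code. Below, a schoolbook dense convolution (the FFT computes the same output in O(n log n)) next to a sparse variant whose cost depends only on the nonzeros; the size of the sparse output is the k of the abstract. A minimal sketch, not the paper's reduction:

```python
def dense_convolve(A, B):
    """Schoolbook O(n^2) convolution of two lists; the FFT computes the
    same result in O(n log n)."""
    out = [0] * (len(A) + len(B) - 1)
    for i, a in enumerate(A):
        for j, b in enumerate(B):
            out[i + j] += a * b
    return out

def sparse_convolve(A, B):
    """Convolve sparse nonnegative vectors given as {index: value} dicts.
    Cost scales with the number of nonzeros, not the vector length."""
    out = {}
    for i, a in A.items():
        for j, b in B.items():
            out[i + j] = out.get(i + j, 0) + a * b
    return out
```

For nonnegative entries no cancellation can occur, so the support of the output is exactly the sumset of the two input supports; this is the structural property the equivalence exploits.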

Brown, Gavin 
STOC '21: "When Is Memorization of Irrelevant ..."
When Is Memorization of Irrelevant Training Data Necessary for High-Accuracy Learning?
Gavin Brown, Mark Bun, Vitaly Feldman, Adam Smith, and Kunal Talwar (Boston University, USA; Apple, USA) Modern machine learning models are complex and frequently encode surprising amounts of information about individual inputs. In extreme cases, complex models appear to memorize entire input examples, including seemingly irrelevant information (social security numbers from text, for example). In this paper, we aim to understand whether this sort of memorization is necessary for accurate learning. We describe natural prediction problems in which every sufficiently accurate training algorithm must encode, in the prediction model, essentially all the information about a large subset of its training examples. This remains true even when the examples are high-dimensional and have entropy much higher than the sample size, and even when most of that information is ultimately irrelevant to the task at hand. Further, our results do not depend on the training algorithm or the class of models used for learning. Our problems are simple and fairly natural variants of the next-symbol prediction and the cluster labeling tasks. These tasks can be seen as abstractions of text- and image-related prediction problems. To establish our results, we reduce from a family of one-way communication problems for which we prove new information complexity lower bounds. @InProceedings{STOC21p123, author = {Gavin Brown and Mark Bun and Vitaly Feldman and Adam Smith and Kunal Talwar}, title = {When Is Memorization of Irrelevant Training Data Necessary for High-Accuracy Learning?}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {123--132}, doi = {10.1145/3406325.3451131}, year = {2021}, } Publisher's Version 

Bruna, Joan 
STOC '21: "Continuous LWE ..."
Continuous LWE
Joan Bruna, Oded Regev, Min Jae Song, and Yi Tang (New York University, USA; University of Michigan, USA) We introduce a continuous analogue of the Learning with Errors (LWE) problem, which we name CLWE. We give a polynomial-time quantum reduction from worst-case lattice problems to CLWE, showing that CLWE enjoys similar hardness guarantees to those of LWE. Alternatively, our result can also be seen as opening new avenues of (quantum) attacks on lattice problems. Our work resolves an open problem regarding the computational complexity of learning mixtures of Gaussians without separability assumptions (Diakonikolas 2016, Moitra 2018). As an additional motivation, (a slight variant of) CLWE was considered in the context of robust machine learning (Diakonikolas et al., FOCS 2017), where hardness in the statistical query (SQ) model was shown; our work addresses the open question regarding its computational hardness (Bubeck et al., ICML 2019). @InProceedings{STOC21p694, author = {Joan Bruna and Oded Regev and Min Jae Song and Yi Tang}, title = {Continuous LWE}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {694--707}, doi = {10.1145/3406325.3451000}, year = {2021}, } Publisher's Version 
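For contrast with the continuous variant, the classic discrete LWE samples (a, b = ⟨a, s⟩ + e mod q) can be sketched as follows. CLWE replaces the discrete secret and noise with continuous Gaussian analogues; the bounded-noise model and all parameters below are illustrative simplifications, not the paper's definitions.

```python
import random

def lwe_samples(s, q, num, noise_bound, rng=random):
    """Generate `num` discrete LWE samples (a, b = <a, s> + e mod q) for a
    secret s in Z_q^n, with noise e drawn uniformly from a small interval.
    A toy template of the discrete problem; CLWE is its continuous analogue."""
    n = len(s)
    out = []
    for _ in range(num):
        a = [rng.randrange(q) for _ in range(n)]
        e = rng.randint(-noise_bound, noise_bound)
        b = (sum(ai * si for ai, si in zip(a, s)) + e) % q
        out.append((a, b))
    return out
```

A holder of the secret can recover the (small, centered) noise from each sample, while without the secret each b looks nearly uniform mod q; that gap is the source of the hardness.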

Bun, Mark 
STOC '21: "When Is Memorization of Irrelevant ..."
When Is Memorization of Irrelevant Training Data Necessary for High-Accuracy Learning?
Gavin Brown, Mark Bun, Vitaly Feldman, Adam Smith, and Kunal Talwar (Boston University, USA; Apple, USA) Modern machine learning models are complex and frequently encode surprising amounts of information about individual inputs. In extreme cases, complex models appear to memorize entire input examples, including seemingly irrelevant information (social security numbers from text, for example). In this paper, we aim to understand whether this sort of memorization is necessary for accurate learning. We describe natural prediction problems in which every sufficiently accurate training algorithm must encode, in the prediction model, essentially all the information about a large subset of its training examples. This remains true even when the examples are high-dimensional and have entropy much higher than the sample size, and even when most of that information is ultimately irrelevant to the task at hand. Further, our results do not depend on the training algorithm or the class of models used for learning. Our problems are simple and fairly natural variants of the next-symbol prediction and the cluster labeling tasks. These tasks can be seen as abstractions of text- and image-related prediction problems. To establish our results, we reduce from a family of one-way communication problems for which we prove new information complexity lower bounds. @InProceedings{STOC21p123, author = {Gavin Brown and Mark Bun and Vitaly Feldman and Adam Smith and Kunal Talwar}, title = {When Is Memorization of Irrelevant Training Data Necessary for High-Accuracy Learning?}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {123--132}, doi = {10.1145/3406325.3451131}, year = {2021}, } Publisher's Version 

Caputo, Pietro 
STOC '21: "Entropy Decay in the Swendsen–Wang ..."
Entropy Decay in the Swendsen–Wang Dynamics on ℤ^{d}
Antonio Blanca, Pietro Caputo, Daniel Parisi, Alistair Sinclair, and Eric Vigoda (Pennsylvania State University, USA; Roma Tre University, Italy; University of California at Berkeley, USA; Georgia Institute of Technology, USA) We study the mixing time of the Swendsen–Wang dynamics for the ferromagnetic Ising and Potts models on the integer lattice ℤ^{d}. This dynamics is a widely used Markov chain that has largely resisted sharp analysis because it is non-local, i.e., it changes the entire configuration in one step. We prove that, whenever strong spatial mixing (SSM) holds, the mixing time on any n-vertex cube in ℤ^{d} is O(log n), and we prove this is tight by establishing a matching lower bound. The previous best known bound was O(n). SSM is a standard condition corresponding to exponential decay of correlations with distance between spins on the lattice and is known to hold in d=2 dimensions throughout the high-temperature (single phase) region. Our result follows from a modified log-Sobolev inequality, which expresses the fact that the dynamics contracts relative entropy at a constant rate at each step. The proof of this fact utilizes a new factorization of the entropy in the joint probability space over spins and edges that underlies the Swendsen–Wang dynamics, which extends to general bipartite graphs of bounded degree. This factorization leads to several additional results, including mixing time bounds for a number of natural local and non-local Markov chains on the joint space, as well as for the standard random-cluster dynamics. @InProceedings{STOC21p1551, author = {Antonio Blanca and Pietro Caputo and Daniel Parisi and Alistair Sinclair and Eric Vigoda}, title = {Entropy Decay in the Swendsen–Wang Dynamics on ℤ<sup><i>d</i></sup>}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1551--1564}, doi = {10.1145/3406325.3451095}, year = {2021}, } Publisher's Version 

Cecchetto, Federica 
STOC '21: "Bridging the Gap between Tree ..."
Bridging the Gap between Tree and Connectivity Augmentation: Unified and Stronger Approaches
Federica Cecchetto, Vera Traub, and Rico Zenklusen (ETH Zurich, Switzerland) We consider the Connectivity Augmentation Problem (CAP), a classical problem in the area of Survivable Network Design. It is about increasing the edge-connectivity of a graph by one unit in the cheapest possible way. More precisely, given a k-edge-connected graph G=(V,E) and a set of extra edges, the task is to find a minimum cardinality subset of extra edges whose addition to G makes the graph (k+1)-edge-connected. If k is odd, the problem is known to reduce to the Tree Augmentation Problem (TAP)—i.e., G is a spanning tree—for which significant progress has been achieved recently, leading to approximation factors below 1.5 (the currently best factor is 1.458). However, advances on TAP did not carry over to CAP so far. Indeed, only very recently, Byrka, Grandoni, and Ameli (STOC 2020) managed to obtain the first approximation factor below 2 for CAP by presenting a 1.91-approximation algorithm based on a method that is disjoint from recent advances for TAP. We first bridge the gap between TAP and CAP by presenting techniques that allow for leveraging insights and methods from TAP to approach CAP. We then introduce a new way to get approximation factors below 1.5, based on a new analysis technique. Through these ingredients, we obtain an improved approximation algorithm for CAP, and therefore also for TAP. This leads to the currently best approximation result for both problems in a unified way, significantly improving on the above-mentioned 1.91-approximation for CAP and also on the previously best approximation factor of 1.458 for TAP by Grandoni, Kalaitzis, and Zenklusen (STOC 2018). Additionally, a feature we inherit from recent TAP advances is that our approach can deal with the weighted setting when the ratio between the largest and smallest cost of the extra links is bounded, in which case we obtain approximation factors below 1.5. 
@InProceedings{STOC21p370, author = {Federica Cecchetto and Vera Traub and Rico Zenklusen}, title = {Bridging the Gap between Tree and Connectivity Augmentation: Unified and Stronger Approaches}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {370--383}, doi = {10.1145/3406325.3451086}, year = {2021}, } Publisher's Version 
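On toy instances the problem statement is easy to render in code. A brute-force solver for the TAP special case (k = 1: make a spanning tree 2-edge-connected with the fewest extra links), exact but exponential, unlike the approximation algorithms the entry is about. A sketch:

```python
from itertools import combinations

def components(vertices, edges):
    """Number of connected components, via union-find with path halving."""
    parent = {v: v for v in vertices}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in vertices})

def two_edge_connected(vertices, edges):
    """2-edge-connected == connected and bridgeless (every single edge
    removal leaves the graph connected)."""
    if components(vertices, edges) != 1:
        return False
    return all(components(vertices, edges[:i] + edges[i + 1:]) == 1
               for i in range(len(edges)))

def min_augmentation(vertices, tree_edges, links):
    """Smallest subset of `links` whose addition makes the tree
    2-edge-connected; brute force over subsets by increasing size."""
    for k in range(len(links) + 1):
        for chosen in combinations(links, k):
            if two_edge_connected(vertices, tree_edges + list(chosen)):
                return list(chosen)
    return None
```

On the path 0-1-2-3 with links {(0,3), (0,2), (1,3)}, a single link (0,3) closes a cycle covering every tree edge, so the optimum has size 1.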

Chase, Zachary 
STOC '21: "Separating Words and Trace ..."
Separating Words and Trace Reconstruction
Zachary Chase (University of Oxford, UK) We prove that for any distinct x,y ∈ {0,1}^{n}, there is a deterministic finite automaton with O(n^{1/3}) states that accepts x but not y. This improves Robson’s 1989 bound of O(n^{2/5}). Using a similar complex-analytic technique, we improve the upper bound on worst-case trace reconstruction, showing that any unknown string x ∈ {0,1}^{n} can be reconstructed with high probability from exp(O(n^{1/5})) independently generated traces. @InProceedings{STOC21p21, author = {Zachary Chase}, title = {Separating Words and Trace Reconstruction}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {21--31}, doi = {10.1145/3406325.3451118}, year = {2021}, } Publisher's Version 
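A classic and much weaker baseline for separating words uses small counter automata: track the position mod p and the parity of 1s at positions congruent to r, an automaton with O(p) states. The search below is a heuristic toy of my own (it can fail to find a separator, and it is unrelated to the paper's complex-analytic O(n^{1/3}) construction):

```python
def parity_separator(x, y, max_p=None):
    """Search for a small 'counter DFA': for each modulus p and residue r,
    the automaton tracks position mod p and the parity of 1s seen at
    positions congruent to r, accepting on odd parity. Returns the first
    (p, r) whose automaton accepts exactly one of x, y, else None.
    A heuristic baseline, not guaranteed to succeed on all pairs."""
    if max_p is None:
        max_p = max(len(x), len(y)) + 1
    def parity(w, p, r):
        return sum(int(c) for i, c in enumerate(w) if i % p == r) % 2
    for p in range(1, max_p + 1):
        for r in range(p):
            if parity(x, p, r) != parity(y, p, r):
                return p, r
    return None
```

For x = "0110" and y = "0101" the overall parities agree, but at modulus 2, residue 0, the strings see one 1 versus none, so a 4-state parity counter already separates them.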

Chattopadhyay, Arkadev 
STOC '21: "Lower Bounds for Monotone ..."
Lower Bounds for Monotone Arithmetic Circuits via Communication Complexity
Arkadev Chattopadhyay, Rajit Datta, and Partha Mukhopadhyay (Tata Institute of Fundamental Research, India; Chennai Mathematical Institute, India) Valiant (1980) showed that general arithmetic circuits with negation can be exponentially more powerful than monotone ones. We give the first improvement to this classical result: we construct a family of polynomials P_{n} in n variables, each of whose monomials has a nonnegative coefficient, such that P_{n} can be computed by a polynomial-size depth-three formula but every monotone circuit computing it has size 2^{Ω(n^{1/4}/log(n))}. The polynomial P_{n} embeds the SINK∘ XOR function devised recently by Chattopadhyay, Mande and Sherif (2020) to refute the Log-Approximate-Rank Conjecture in communication complexity. To prove our lower bound for P_{n}, we develop a general connection between corruption of combinatorial rectangles by any function f ∘ XOR and corruption of product polynomials by a certain polynomial P^{f} that is an arithmetic embedding of f. This connection should be of independent interest. Using further ideas from communication complexity, we construct another family of set-multilinear polynomials f_{n,m} such that both F_{n,m} − є· f_{n,m} and F_{n,m} + є· f_{n,m} have monotone circuit complexity 2^{Ω(n/log(n))} if є ≥ 2^{− Ω( m )}, where F_{n,m} = ∏_{i=1}^{n} (x_{i,1} +⋯+x_{i,m}) and m = O( n/log n ). The polynomials f_{n,m} have 0/1 coefficients and are in VNP. Proving such lower bounds for monotone circuits has been advocated recently by Hrubeš (2020) as a first step towards proving lower bounds against general circuits via his new approach. @InProceedings{STOC21p786, author = {Arkadev Chattopadhyay and Rajit Datta and Partha Mukhopadhyay}, title = {Lower Bounds for Monotone Arithmetic Circuits via Communication Complexity}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {786--799}, doi = {10.1145/3406325.3451069}, year = {2021}, } Publisher's Version 

Chen, Lijie 
STOC '21: "Simple and Fast Derandomization ..."
Simple and Fast Derandomization from Very Hard Functions: Eliminating Randomness at Almost No Cost
Lijie Chen and Roei Tell (Massachusetts Institute of Technology, USA) Extending the classical “hardness-to-randomness” line of work, Doron, Moshkovitz, Oh, and Zuckerman (FOCS 2020) recently proved that derandomization with near-quadratic time overhead is possible, under the assumption that there exists a function in DTIME[2^{n}] that cannot be computed by randomized SVN circuits of size 2^{(1−є)· n} for a small є. In this work we extend their inquiry and answer several open questions that arose from their work. For a time function T(n), consider the following assumption: Non-uniformly secure one-way functions exist, and for δ=δ(є) and k=k_{T}(є) there exists a problem in DTIME[2^{k· n}] that is hard for algorithms that run in time 2^{(k−δ)· n} and use 2^{(1−δ)· n} bits of advice. Under this assumption, we show that: 1. (Worst-case derandomization.) Probabilistic algorithms that run in time T(n) can be deterministically simulated in time n· T(n)^{1+є}. 2. (Average-case derandomization.) For polynomial time functions T(n)=poly(n), we can improve the derandomization time to n^{є}· T(n) if we allow the derandomization to succeed only on average, rather than in the worst case. 3. (Conditional optimality.) For worst-case derandomization, the multiplicative time overhead of n is essentially optimal, conditioned on a counting version of the nondeterministic strong exponential-time hypothesis (i.e., on #NSETH). Lastly, we present an alternative proof for the result of Doron, Moshkovitz, Oh, and Zuckerman that is simpler and more versatile. In fact, we show how to simplify the analysis not only of their construction, but of any construction that “extracts randomness from a pseudoentropic string”. 
@InProceedings{STOC21p283, author = {Lijie Chen and Roei Tell}, title = {Simple and Fast Derandomization from Very Hard Functions: Eliminating Randomness at Almost No Cost}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {283--291}, doi = {10.1145/3406325.3451059}, year = {2021}, } Publisher's Version STOC '21: "Inverse-Exponential Correlation ..." Inverse-Exponential Correlation Bounds and Extremely Rigid Matrices from a New Derandomized XOR Lemma Lijie Chen and Xin Lyu (Massachusetts Institute of Technology, USA; Tsinghua University, China) In this work we prove that there is a function f ∈ E^{ NP} such that, for every sufficiently large n and d = √n/log n, f_{n} (f restricted to n-bit inputs) cannot be (1/2 + 2^{−d})-approximated by F_{2}-polynomials of degree d. We also observe that a minor improvement (e.g., improving d to n^{1/2+ε} for any ε > 0) over our result would imply that E^{ NP} cannot be computed by depth-3 AC^{0} circuits of size 2^{n^{1/2 + ε}}, which is a notoriously hard open question in complexity theory. Using the same proof techniques, we are also able to construct extremely rigid matrices over F_{2} in P^{ NP}. More specifically, we show that for every constant ε ∈ (0,1), there is a P^{ NP} algorithm which on input 1^{n} outputs an n× n F_{2}-matrix H_{n} satisfying R_{H_{n}}(2^{log^{1 − ε} n}) ≥ (1/2 − exp(−log^{2/3 · ε} n) ) · n^{2}, for every sufficiently large n. This improves the recent P^{ NP} constructions of rigid matrices in [Alman and Chen, FOCS 2019] and [Bhangale et al., FOCS 2020], which only give Ω(n^{2}) rigidity. The key ingredient in the proof of our new results is a new derandomized XOR lemma based on approximate linear sums, which roughly says that given an n-input function f which cannot be 0.99-approximated by a certain linear sum of s many functions in F within ℓ_{1}-distance, one can construct a new function Amp^{f} with O(n) input bits, which cannot be (1/2+s^{−Ω(1)})-approximated by F-functions. 
Taking F to be a function collection containing low-degree F_{2}-polynomials or low-rank F_{2}-matrices, our results are then obtained by first using the algorithmic method to construct a function which is weakly hard against linear sums of F in the above sense, and then applying the derandomized XOR lemma to f. We obtain our new derandomized XOR lemma by giving a generalization of the famous hardcore lemma by Impagliazzo. Our generalization in some sense constructs a non-Boolean hardcore of a weakly hard function f with respect to F-functions, from the weak inapproximability of f by any linear sum of F with bounded ℓ_{p}-norm. This generalization recovers the original hardcore lemma by considering the ℓ_{∞}-norm. Surprisingly, when we switch to the ℓ_{1}-norm, we immediately rediscover Levin’s proof of Yao’s XOR Lemma. That is, these first two proofs of Yao’s XOR Lemma can be unified with our new perspective. For proving the correlation bounds, our new derandomized XOR lemma indeed works with the ℓ_{4/3}-norm. @InProceedings{STOC21p761, author = {Lijie Chen and Xin Lyu}, title = {Inverse-Exponential Correlation Bounds and Extremely Rigid Matrices from a New Derandomized XOR Lemma}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {761--771}, doi = {10.1145/3406325.3451132}, year = {2021}, } Publisher's Version Info STOC '21: "Almost Optimal Super-Constant-Pass ..." Almost Optimal Super-Constant-Pass Streaming Lower Bounds for Reachability Lijie Chen, Gillat Kol, Dmitry Paramonov, Raghuvansh R. Saxena, Zhao Song, and Huacheng Yu (Massachusetts Institute of Technology, USA; Princeton University, USA; Institute for Advanced Study at Princeton, USA) We give an almost quadratic n^{2−o(1)} lower bound on the space consumption of any o(√(log n))-pass streaming algorithm solving the (directed) s-t reachability problem. This means that any such algorithm must essentially store the entire graph. 
As corollaries, we obtain almost quadratic space lower bounds for additional fundamental problems, including maximum matching, shortest path, matrix rank, and linear programming. Our main technical contribution is the definition and construction of set hiding graphs, which may be of independent interest: we give a general way of encoding a set S ⊆ [k] as a directed graph with n = k^{1+o(1)} vertices, such that deciding whether i ∈ S boils down to deciding if t_{i} is reachable from s_{i}, for a specific pair of vertices (s_{i},t_{i}) in the graph. Furthermore, we prove that our graph “hides” S, in the sense that no low-space streaming algorithm with a small number of passes can learn (almost) anything about S. @InProceedings{STOC21p570, author = {Lijie Chen and Gillat Kol and Dmitry Paramonov and Raghuvansh R. Saxena and Zhao Song and Huacheng Yu}, title = {Almost Optimal Super-Constant-Pass Streaming Lower Bounds for Reachability}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {570--583}, doi = {10.1145/3406325.3451038}, year = {2021}, } Publisher's Version 

Chen, Sitan 
STOC '21: "Algorithmic Foundations for ..."
Algorithmic Foundations for the Diffraction Limit
Sitan Chen and Ankur Moitra (Massachusetts Institute of Technology, USA) For more than a century and a half it has been widely believed (but never rigorously shown) that the physics of diffraction imposes certain fundamental limits on the resolution of an optical system. However, our understanding of what exactly can and cannot be resolved has never risen above heuristic arguments which, even worse, appear contradictory. In this work we remedy this gap by studying the diffraction limit as a statistical inverse problem and, based on connections to provable algorithms for learning mixture models, we rigorously prove upper and lower bounds on the statistical and algorithmic complexity needed to resolve closely spaced point sources. In particular, we show that there is a phase transition where the sample complexity goes from polynomial to exponential. Surprisingly, we show that this does not occur at the Abbe limit, which has long been presumed to be the true diffraction limit. @InProceedings{STOC21p490, author = {Sitan Chen and Ankur Moitra}, title = {Algorithmic Foundations for the Diffraction Limit}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {490--503}, doi = {10.1145/3406325.3451078}, year = {2021}, } Publisher's Version 

Chen, Zongchen 
STOC '21: "Optimal Mixing of Glauber ..."
Optimal Mixing of Glauber Dynamics: Entropy Factorization via High-Dimensional Expansion
Zongchen Chen, Kuikui Liu, and Eric Vigoda (Georgia Institute of Technology, USA; University of Washington, USA) We prove an optimal mixing time bound for the single-site update Markov chain known as the Glauber dynamics or Gibbs sampling in a variety of settings. Our work presents an improved version of the spectral independence approach of Anari et al. (2020) and shows O(n log n) mixing time on any n-vertex graph of bounded degree when the maximum eigenvalue of an associated influence matrix is bounded. As an application of our results, for the hardcore model on independent sets weighted by a fugacity λ, we establish O(n log n) mixing time for the Glauber dynamics on any n-vertex graph of constant maximum degree Δ when λ<λ_{c}(Δ), where λ_{c}(Δ) is the critical point for the uniqueness/non-uniqueness phase transition on the Δ-regular tree. More generally, for any antiferromagnetic 2-spin system we prove O(n log n) mixing time of the Glauber dynamics on any bounded degree graph in the corresponding tree uniqueness region. Our results apply more broadly; for example, we also obtain O(n log n) mixing for q-colorings of triangle-free graphs of maximum degree Δ when the number of colors satisfies q > αΔ where α ≈ 1.763, and O(m log n) mixing for generating random matchings of any graph with bounded degree and m edges. Our approach is based on two steps. First, we show that the approximate tensorization of entropy (i.e., factorizing entropy into single vertices), which is a key step for establishing the modified log-Sobolev inequality in many previous works, can be deduced from entropy factorization into blocks of fixed linear size. Second, we adapt the local-to-global scheme of Alev and Lau (2020) to establish such block factorization of entropy in a more general setting of pure weighted simplicial complexes satisfying local spectral expansion; this also substantially generalizes the result of Cryan et al. (2019). 
@InProceedings{STOC21p1537, author = {Zongchen Chen and Kuikui Liu and Eric Vigoda}, title = {Optimal Mixing of Glauber Dynamics: Entropy Factorization via High-Dimensional Expansion}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1537--1550}, doi = {10.1145/3406325.3451035}, year = {2021}, } Publisher's Version 
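For concreteness, here is a minimal Python sketch of the chain the abstract studies (not the paper's analysis): the single-site Glauber update for the hardcore model, plus an exact check on a tiny graph, via the full transition matrix, that the hardcore distribution π(I) ∝ λ^{|I|} over independent sets is stationary. The encoding and function names are my own.

```python
import itertools
import random

def glauber_step(G, config, lam, rng=random):
    """One single-site Glauber update for the hardcore model.

    G: adjacency dict {vertex: set of neighbors} with vertices 0..n-1;
    config: dict {vertex: 0/1} encoding an independent set.  Pick a uniformly
    random vertex v; if some neighbor of v is occupied, v must become
    unoccupied; otherwise occupy v with probability lam / (1 + lam).
    """
    v = rng.choice(sorted(G))
    if any(config[u] for u in G[v]):
        config[v] = 0
    else:
        config[v] = 1 if rng.random() < lam / (1 + lam) else 0
    return config

def transition_matrix(G, lam):
    """Exact transition probabilities of the chain, for a small graph only."""
    n = len(G)
    # states: all independent sets, as 0/1 tuples
    states = [s for s in itertools.product((0, 1), repeat=n)
              if not any(s[u] and s[v] for u in G for v in G[u] if u < v)]
    P = {}
    for c in states:
        for v in G:
            if any(c[u] for u in G[v]):      # v is blocked: forced to 0
                outcomes = ((0, 1.0),)
            else:
                outcomes = ((1, lam / (1 + lam)), (0, 1 / (1 + lam)))
            for val, p in outcomes:
                c2 = c[:v] + (val,) + c[v + 1:]
                P[c, c2] = P.get((c, c2), 0.0) + p / n
    return states, P
```

The stationarity check below works because each single-site update satisfies detailed balance with respect to π; the O(n log n) mixing bound is, of course, the paper's contribution, not something this sketch verifies.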

Cheu, Albert 
STOC '21: "The Limits of Pan Privacy ..."
The Limits of Pan Privacy and Shuffle Privacy for Learning and Estimation
Albert Cheu and Jonathan Ullman (Northeastern University, USA) There has been a recent wave of interest in intermediate trust models for differential privacy that eliminate the need for a fully trusted central data collector, but overcome the limitations of local differential privacy. This interest has led to the introduction of the shuffle model (Cheu et al., EUROCRYPT 2019; Erlingsson et al., SODA 2019) and to revisiting the pan-private model (Dwork et al., ITCS 2010). The message of this line of work is that, for a variety of low-dimensional problems—such as counts, means, and histograms—these intermediate models offer nearly as much power as central differential privacy. However, there has been considerably less success using these models for high-dimensional learning and estimation problems. In this work we prove the first non-trivial lower bounds for high-dimensional learning and estimation in both the pan-private model and the general multi-message shuffle model. Our lower bounds apply to a variety of problems—for example, we show that private agnostic learning of parity functions over d bits requires Ω(2^{d/2}) samples in these models, and privately selecting the most common attribute from a set of d choices requires Ω(d^{1/2}) samples, both of which are exponential separations from the central model. @InProceedings{STOC21p1081, author = {Albert Cheu and Jonathan Ullman}, title = {The Limits of Pan Privacy and Shuffle Privacy for Learning and Estimation}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1081--1094}, doi = {10.1145/3406325.3450995}, year = {2021}, } Publisher's Version 

Chuzhoy, Julia 
STOC '21: "Decremental All-Pairs Shortest ..."
Decremental All-Pairs Shortest Paths in Deterministic Near-Linear Time
Julia Chuzhoy (Toyota Technological Institute at Chicago, USA) We study the decremental All-Pairs Shortest Paths (APSP) problem in undirected edge-weighted graphs. The input to the problem is an undirected n-vertex m-edge graph G with non-negative lengths on edges, that undergoes an online sequence of edge deletions. The goal is to support approximate shortest-path queries: given a pair x,y of vertices of G, return a path P connecting x to y, whose length is within factor α of the length of the shortest x-y path, in time Õ(|E(P)|), where α is the approximation factor of the algorithm. APSP is one of the most basic and extensively studied dynamic graph problems. A long line of work culminated in the algorithm of [Chechik, FOCS 2018] with near-optimal guarantees: for any constant 0<ε≤ 1 and parameter k≥ 1, the algorithm achieves approximation factor (2+ε)k−1, and total update time O(mn^{1/k+o(1)}log(nL)), where L is the ratio of longest to shortest edge lengths. Unfortunately, as in much of the prior work, the algorithm is randomized and needs to assume an oblivious adversary; that is, the input edge-deletion sequence is fixed in advance and may not depend on the algorithm’s behavior. In many real-world scenarios, and in applications of APSP to static graph problems, it is crucial that the algorithm works against an adaptive adversary, where the edge-deletion sequence may depend on the algorithm’s past behavior arbitrarily; ideally, such an algorithm should be deterministic. Unfortunately, unlike the oblivious-adversary setting, its adaptive-adversary counterpart is still poorly understood. For unweighted graphs, the algorithm of [Henzinger, Krinninger and Nanongkai, FOCS ’13, SICOMP ’16] achieves a (1+ε)-approximation with total update time Õ(mn/ε); the best current total update time guarantee of n^{2.5+O(ε)} is achieved by the recent deterministic algorithm of [Chuzhoy, Saranurak, SODA ’21], with 2^{O(1/ε)}-multiplicative and 2^{O(log^{3/4} n/ε)}-additive approximation. 
To the best of our knowledge, for arbitrary non-negative edge weights, the fastest current adaptive-update algorithm has total update time O(n^{3}log L/ε), achieving a (1+ε)-approximation. Even if we are willing to settle for any o(n)-approximation factor, no currently known algorithm has a better than Θ(n^{3}) total update time in weighted graphs and better than Θ(n^{2.5}) total update time in unweighted graphs. Several conditional lower bounds suggest that no algorithm with a sufficiently small approximation factor can achieve an o(n^{3}) total update time. Our main result is a deterministic algorithm for decremental APSP in undirected edge-weighted graphs, that, for any Ω(1/loglog m)≤ ε< 1, achieves approximation factor (log m)^{2^{O(1/ε)}}, with total update time O(m^{1+O(ε)}· (log m)^{O(1/ε^{2})}· log L). In particular, we obtain a (polylog m)-approximation in time Õ(m^{1+ε}) for any constant ε, and, for any slowly growing function f(m), we obtain a (log m)^{f(m)}-approximation in time m^{1+o(1)}. We also provide an algorithm with similar guarantees for decremental Sparse Neighborhood Covers. @InProceedings{STOC21p626, author = {Julia Chuzhoy}, title = {Decremental All-Pairs Shortest Paths in Deterministic Near-Linear Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {626--639}, doi = {10.1145/3406325.3451025}, year = {2021}, } Publisher's Version 

Cohen, Alex 
STOC '21: "Structure vs. Randomness for ..."
Structure vs. Randomness for Bilinear Maps
Alex Cohen and Guy Moshkovitz (Yale University, USA; City University of New York, USA) We prove that the slice rank of a 3-tensor (a combinatorial notion introduced by Tao in the context of the cap-set problem), the analytic rank (a Fourier-theoretic notion introduced by Gowers and Wolf), and the geometric rank (a recently introduced algebro-geometric notion) are all equivalent up to an absolute constant. As a corollary, we obtain strong trade-offs on the arithmetic complexity of a biased bilinear map, and on the separation between computing a bilinear map exactly and on average. Our result settles open questions of Haramaty and Shpilka [STOC 2010], and of Lovett [Discrete Anal., 2019] for 3-tensors. @InProceedings{STOC21p800, author = {Alex Cohen and Guy Moshkovitz}, title = {Structure vs. Randomness for Bilinear Maps}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {800--808}, doi = {10.1145/3406325.3451007}, year = {2021}, } Publisher's Version 

Cohen, Gil 
STOC '21: "Expander Random Walks: A Fourier-Analytic ..."
Expander Random Walks: A Fourier-Analytic Approach
Gil Cohen, Noam Peri, and Amnon Ta-Shma (Tel Aviv University, Israel) In this work we ask the following basic question: assume the vertices of an expander graph are labelled by 0,1. What “test” functions f : {0,1}^{t} → {0,1} cannot distinguish t independent samples from those obtained by a random walk? The expander hitting property due to Ajtai, Komlós and Szemerédi (STOC 1987) is captured by the AND test function, whereas the fundamental expander Chernoff bound due to Gillman (SICOMP 1998) and Healy (Computational Complexity 2008) is about test functions indicating whether the weight is close to the mean. In fact, it is known that all threshold functions are fooled by a random walk (Kipnis and Varadhan, Communications in Mathematical Physics 1986). Recently, it was shown that even the highly sensitive PARITY function is fooled by a random walk (Ta-Shma, STOC 2017). We focus on balanced labels. Our first main result is proving that all symmetric functions are fooled by a random walk. Put differently, we prove a central limit theorem (CLT) for expander random walks with respect to the total variation distance, significantly strengthening the classic CLT for Markov chains that is established with respect to the Kolmogorov distance (Kipnis and Varadhan, Communications in Mathematical Physics 1986). Our approach significantly deviates from prior works. We first study how well a Fourier character χ_{S} is fooled by a random walk as a function of S. Then, given a test function f, we expand f in the Fourier basis and combine the above with known results on the Fourier spectrum of f. We also proceed further and consider general test functions, not necessarily symmetric. As our approach is Fourier analytic, it is general enough to analyze such versatile test functions. For our second result, we prove that random walks on sufficiently good expander graphs fool test functions computed by AC^{0} circuits, read-once branching programs, and functions with bounded query complexity. 
@InProceedings{STOC21p1643, author = {Gil Cohen and Noam Peri and Amnon Ta-Shma}, title = {Expander Random Walks: A Fourier-Analytic Approach}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1643--1655}, doi = {10.1145/3406325.3451049}, year = {2021}, } Publisher's Version Info 
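As a toy illustration of the question the abstract asks, the following Python sketch computes, exactly via a signed transfer-matrix DP (no sampling), the advantage E[(−1)^{∑ labels}] of the PARITY test on a t-step random walk versus t independent samples. The choice of graph (the complete graph K_6 with a balanced labelling) and all names are my own, not the paper's.

```python
def parity_bias_of_walk(adj, labels, t):
    """Exact bias E[(-1)^(sum of labels seen)] over t-step random walks.

    adj: list of neighbor lists; the walk starts at a uniformly random vertex
    and takes t-1 uniform steps, reading a 0/1 label at each of the t vertices
    visited.  For balanced labels, t independent samples have bias exactly 0,
    so a small |bias| here means the walk "fools" the PARITY test.
    """
    n = len(adj)
    # f[v] = sum over length-1 walks ending at v of prob * (-1)^(labels seen)
    f = [(-1) ** labels[v] / n for v in range(n)]
    for _ in range(t - 1):
        g = [0.0] * n
        for u in range(n):
            share = f[u] / len(adj[u])      # uniform step to each neighbor
            for v in adj[u]:
                g[v] += share
        f = [(-1) ** labels[v] * g[v] for v in range(n)]
    return sum(f)
```

On K_6 with three 0-labels and three 1-labels, the bias decays quickly with t, consistent with the theme that good expanders fool PARITY; the DP itself is exact for any labelled graph.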

Cohen-Addad, Vincent 
STOC '21: "A Quasipolynomial (2 + ε)-Approximation ..."
A Quasipolynomial (2 + ε)-Approximation for Planar Sparsest Cut
Vincent Cohen-Addad, Anupam Gupta, Philip N. Klein, and Jason Li (Google, Switzerland; Carnegie Mellon University, USA; Brown University, USA) The (non-uniform) sparsest cut problem is the following graph-partitioning problem: given a “supply” graph, and demands on pairs of vertices, delete some subset of supply edges to minimize the ratio of the supply edges cut to the total demand of the pairs separated by this deletion. Despite much effort, there are only a handful of non-trivial classes of supply graphs for which constant-factor approximations are known. We consider the problem for planar graphs, and give a (2+ε)-approximation algorithm that runs in quasipolynomial time. Our approach defines a new structural decomposition of an optimal solution using a “patching” primitive. We combine this decomposition with a Sherali-Adams-style linear programming relaxation of the problem, which we then round. This should be compared with the polynomial-time approximation algorithm of Rao (1999), which uses the metric linear programming relaxation and ℓ_{1}-embeddings, and achieves an O(√(log n))-approximation in polynomial time. @InProceedings{STOC21p1056, author = {Vincent Cohen-Addad and Anupam Gupta and Philip N. Klein and Jason Li}, title = {A Quasipolynomial (2 + <i>ε</i>)-Approximation for Planar Sparsest Cut}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1056--1069}, doi = {10.1145/3406325.3451103}, year = {2021}, } Publisher's Version STOC '21: "A New Coreset Framework for ..." A New Coreset Framework for Clustering Vincent Cohen-Addad, David Saulpic, and Chris Schwiegelshohn (Google, Switzerland; Sorbonne University, France; CNRS, France; LIP6, France; Aarhus University, Denmark) Given a metric space, the (k,z)-clustering problem consists of finding k centers such that the sum of the distances, raised to the power z, from every point to its closest center is minimized. This encapsulates the famous k-median (z=1) and k-means (z=2) clustering problems. 
Designing small-space sketches of the data that approximately preserve the cost of the solutions, also known as coresets, has been an important research direction over the last 15 years. In this paper, we present a new, simple coreset framework that simultaneously improves upon the best known bounds for a large variety of settings, ranging from Euclidean spaces and doubling metrics to minor-free metrics and the general metric case. @InProceedings{STOC21p169, author = {Vincent Cohen-Addad and David Saulpic and Chris Schwiegelshohn}, title = {A New Coreset Framework for Clustering}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {169--182}, doi = {10.1145/3406325.3451022}, year = {2021}, } Publisher's Version Info 

Croquevielle, Luis Alberto 
STOC '21: "A Polynomial-Time Approximation ..."
A Polynomial-Time Approximation Algorithm for Counting Words Accepted by an NFA (Invited Paper)
Marcelo Arenas, Luis Alberto Croquevielle, Rajesh Jayaram, and Cristian Riveros (PUC, Chile; IMFD, Chile; Carnegie Mellon University, USA) Counting the number of words of a certain length accepted by a non-deterministic finite automaton (NFA) is a fundamental problem, which has many applications in different areas such as graph databases, knowledge compilation, and information extraction. Along with this, generating such words uniformly at random is also a relevant problem, particularly in scenarios where returning varied outputs is a desirable feature. The previous problems are formalized as follows. The input of #NFA is an NFA N and a length k given in unary (that is, given as a string 0^{k}), and the task is to compute the number of strings of length k accepted by N. The input of GEN-NFA is the same as #NFA, but now the task is to generate, uniformly at random, a string of length k accepted by N. It is known that #NFA is #P-complete, so an efficient algorithm to compute this function exactly is not expected to exist. However, this does not preclude the existence of an efficient approximation algorithm for it. In this talk, we will show that #NFA admits a fully polynomial-time randomized approximation scheme (FPRAS). Prior to our work, it was open whether #NFA admits an FPRAS; in fact, the best randomized approximation scheme known for #NFA ran in time n^{O(log n)}. Besides, we will mention some consequences and applications of our results. In particular, from well-known results on counting and uniform generation, we obtain that GEN-NFA admits a fully polynomial-time almost uniform generator. Moreover, as #NFA is SpanL-complete under polynomial-time parsimonious reductions, we obtain that every function in the complexity class SpanL admits an FPRAS. 
@InProceedings{STOC21p4, author = {Marcelo Arenas and Luis Alberto Croquevielle and Rajesh Jayaram and Cristian Riveros}, title = {A Polynomial-Time Approximation Algorithm for Counting Words Accepted by an NFA (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {4--4}, doi = {10.1145/3406325.3465353}, year = {2021}, } Publisher's Version STOC '21: "When Is Approximate Counting ..." When Is Approximate Counting for Conjunctive Queries Tractable? Marcelo Arenas, Luis Alberto Croquevielle, Rajesh Jayaram, and Cristian Riveros (PUC, Chile; IMFD, Chile; Carnegie Mellon University, USA) Conjunctive queries are one of the most common classes of queries used in database systems, and the best studied in the literature. A seminal result of Grohe, Schwentick, and Segoufin (STOC 2001) demonstrates that for every class G of graphs, the evaluation of all conjunctive queries whose underlying graph is in G is tractable if, and only if, G has bounded treewidth. In this work, we extend this characterization to the counting problem for conjunctive queries. Specifically, for every class C of conjunctive queries with bounded treewidth, we introduce the first fully polynomial-time randomized approximation scheme (FPRAS) for counting answers to a query in C, and the first polynomial-time algorithm for sampling answers uniformly from a query in C. As a corollary, it follows that for every class G of graphs, the counting problem for conjunctive queries whose underlying graph is in G admits an FPRAS if, and only if, G has bounded treewidth (unless BPP is different from P). In fact, our FPRAS is more general, and also applies to conjunctive queries with bounded hypertree width, as well as unions of such queries. The key ingredient in our proof is the resolution of a fundamental counting problem from automata theory. 
Specifically, we demonstrate the first FPRAS and polynomial-time sampler for the set of trees of size n accepted by a tree automaton, which improves the prior quasi-polynomial time randomized approximation scheme (QPRAS) and sampling algorithm of Gore, Jerrum, Kannan, Sweedyk, and Mahaney ’97. We demonstrate how this algorithm can be used to obtain an FPRAS for many open problems, such as counting solutions to constraint satisfaction problems (CSPs) with bounded hypertree width, counting the number of error threads in programs with nested call subroutines, and counting valid assignments to structured DNNF circuits. @InProceedings{STOC21p1015, author = {Marcelo Arenas and Luis Alberto Croquevielle and Rajesh Jayaram and Cristian Riveros}, title = {When Is Approximate Counting for Conjunctive Queries Tractable?}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1015--1027}, doi = {10.1145/3406325.3451014}, year = {2021}, } Publisher's Version 
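To make the #NFA problem concrete, here is a small Python sketch of the naive exact approach that the FPRAS above improves upon: dynamic programming over sets of reachable NFA states (on-the-fly determinization), which counts each accepted word exactly once but may pass through exponentially many state-sets. This is illustrative only and is not the paper's algorithm; the encoding and names are my own.

```python
def count_accepted(nfa_delta, start, accept, alphabet, k):
    """Count the length-k words accepted by an NFA, exactly.

    nfa_delta: dict (state, symbol) -> set of successor states.  A word
    accepted along several runs must be counted once, so the DP tracks the
    number of words leading to each *set* of reachable states (the subset
    construction), never individual NFA states.
    """
    counts = {frozenset([start]): 1}   # reachable state-set -> word count
    for _ in range(k):
        nxt = {}
        for S, c in counts.items():
            for a in alphabet:
                T = frozenset(q for s in S for q in nfa_delta.get((s, a), ()))
                if T:                   # empty set can never accept; drop it
                    nxt[T] = nxt.get(T, 0) + c
        counts = nxt
    return sum(c for S, c in counts.items() if S & accept)
```

For example, for a 3-state NFA accepting binary words that contain "01", there are 4 accepted words of length 3 (all 8 words except the 4 of the form 1…10…0), which the DP reproduces.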

Curticapean, Radu 
STOC '21: "A Full Complexity Dichotomy ..."
A Full Complexity Dichotomy for Immanant Families
Radu Curticapean (IT University of Copenhagen, Denmark) Given an integer n ≥ 1 and an irreducible character χ_{λ} of S_{n} for some partition λ of n, the immanant imm_{λ}:ℂ^{n× n}→ℂ maps matrices A∈ℂ^{n× n} to imm_{λ}(A)=∑_{π∈ S_n}χ_{λ}(π)∏_{i=1}^{n}A_{i,π(i)}. Important special cases include the determinant and permanent, which are obtained from the sign and trivial character, respectively. It is known that immanants can be evaluated in polynomial time for characters that are “close” to the sign character: Given a partition λ of n with s parts, let b(λ):=n−s count the boxes to the right of the first column in the Young diagram of λ. For a family of partitions Λ, let b(Λ) := max_{λ∈Λ}b(λ) and write Imm(Λ) for the problem of evaluating imm_{λ}(A) on input A and λ∈Λ. On the positive side, if b(Λ)<∞, then Imm(Λ) is known to be polynomial-time computable. This subsumes the case of the determinant. Conversely, if b(Λ)=∞, then previously known hardness results suggest that Imm(Λ) cannot be solved in polynomial time. However, these results only address certain restricted classes of families Λ. In this paper, we show that the assumption FPT ≠ #W[1] from parameterized complexity rules out polynomial-time algorithms for Imm(Λ) for any computationally reasonable family of partitions Λ with b(Λ)=∞. We give an analogous result in algebraic complexity under the assumption VFPT ≠ VW[1]. Furthermore, if b(λ) even grows polynomially on Λ, we show that Imm(Λ) is hard for #P and VNP. This concludes a series of partial results on the complexity of immanants obtained over the last 35 years. @InProceedings{STOC21p1770, author = {Radu Curticapean}, title = {A Full Complexity Dichotomy for Immanant Families}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1770--1783}, doi = {10.1145/3406325.3451124}, year = {2021}, } Publisher's Version 
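A direct Python transcription of the defining formula above may help fix notation (brute force over all of S_n, so tiny n only): with the sign character it computes the determinant and with the trivial character the permanent. The helper names are mine.

```python
from itertools import permutations

def immanant(A, character):
    """imm_chi(A) = sum over permutations pi of chi(pi) * prod_i A[i][pi[i]].

    `character` maps a permutation (given as a tuple) to its character value.
    This is exactly the definition in the abstract, evaluated in n! time.
    """
    n = len(A)
    total = 0
    for pi in permutations(range(n)):
        prod = 1
        for i in range(n):
            prod *= A[i][pi[i]]
        total += character(pi) * prod
    return total

def sign(pi):
    """Sign character: (-1)^(number of inversions); gives the determinant."""
    inv = sum(1 for i in range(len(pi))
              for j in range(i + 1, len(pi)) if pi[i] > pi[j])
    return -1 if inv % 2 else 1

def trivial(pi):
    """Trivial character: constant 1; gives the permanent."""
    return 1
```

The dichotomy in the paper concerns which other characters χ_λ, between these two extremes, keep this sum tractable.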

Dagan, Yuval 
STOC '21: "Adversarial Laws of Large ..."
Adversarial Laws of Large Numbers and Optimal Regret in Online Classification
Noga Alon, Omri Ben-Eliezer, Yuval Dagan, Shay Moran, Moni Naor, and Eylon Yogev (Princeton University, USA; Tel Aviv University, Israel; Harvard University, USA; Massachusetts Institute of Technology, USA; Technion, Israel; Google Research, Israel; Weizmann Institute of Science, Israel; Boston University, USA) Laws of large numbers guarantee that given a large enough sample from some population, the measure of any fixed subpopulation is well-estimated by its frequency in the sample. We study laws of large numbers in sampling processes that can affect the environment they are acting upon and interact with it. Specifically, we consider the sequential sampling model proposed by Ben-Eliezer and Yogev (2020), and characterize the classes which admit a uniform law of large numbers in this model: these are exactly the classes that are online learnable. Our characterization may be interpreted as an online analogue to the equivalence between learnability and uniform convergence in statistical (PAC) learning. The sample-complexity bounds we obtain are tight for many parameter regimes, and as an application, we determine the optimal regret bounds in online learning, stated in terms of Littlestone’s dimension, thus resolving the main open question from Ben-David, Pál, and Shalev-Shwartz (2009), which was also posed by Rakhlin, Sridharan, and Tewari (2015). @InProceedings{STOC21p447, author = {Noga Alon and Omri Ben-Eliezer and Yuval Dagan and Shay Moran and Moni Naor and Eylon Yogev}, title = {Adversarial Laws of Large Numbers and Optimal Regret in Online Classification}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {447--455}, doi = {10.1145/3406325.3451041}, year = {2021}, } Publisher's Version STOC '21: "Learning Ising Models from ..." 
Learning Ising Models from One or Multiple Samples Yuval Dagan, Constantinos Daskalakis, Nishanth Dikkala, and Anthimos Vardis Kandiros (Massachusetts Institute of Technology, USA; Google, USA) There have been two main lines of work on estimating Ising models: (1) estimating them from multiple independent samples under minimal assumptions about the model's interaction matrix; and (2) estimating them from one sample in restrictive settings. We propose a unified framework that smoothly interpolates between these two settings, enabling significantly richer estimation guarantees from one, a few, or many samples. Our main theorem provides guarantees for one-sample estimation, quantifying the estimation error in terms of the metric entropy of a family of interaction matrices. As corollaries of our main theorem, we derive bounds when the model's interaction matrix is a (sparse) linear combination of known matrices, or it belongs to a finite set, or to a high-dimensional manifold. In fact, our main result handles multiple independent samples by viewing them as one sample from a larger model, and can be used to derive estimation bounds that are qualitatively similar to those obtained in the aforedescribed multiple-sample literature. Our technical approach benefits from sparsifying a model's interaction network, conditioning on subsets of variables that make the dependencies in the resulting conditional distribution sufficiently weak. We use this sparsification technique to prove strong concentration and anti-concentration results for the Ising model, which we believe have applications beyond the scope of this paper. @InProceedings{STOC21p161, author = {Yuval Dagan and Constantinos Daskalakis and Nishanth Dikkala and Anthimos Vardis Kandiros}, title = {Learning Ising Models from One or Multiple Samples}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {161--168}, doi = {10.1145/3406325.3451074}, year = {2021}, } Publisher's Version 

Dalirrooyfard, Mina 
STOC '21: "Tight Conditional Lower Bounds ..."
Tight Conditional Lower Bounds for Approximating Diameter in Directed Graphs
Mina Dalirrooyfard and Nicole Wein (Massachusetts Institute of Technology, USA) Among the most fundamental graph parameters is the Diameter, the largest distance between any pair of vertices in a graph. Computing the Diameter of a graph with m edges requires m^{2−o(1)} time under the Strong Exponential Time Hypothesis (SETH), which can be prohibitive for very large graphs, so efficient approximation algorithms for Diameter are desired. There is a folklore algorithm that gives a 2-approximation for Diameter in Õ(m) time (where Õ notation suppresses logarithmic factors). Additionally, a line of work [SODA ’96, STOC ’13, SODA ’14] concludes with a 3/2-approximation algorithm for Diameter in weighted directed graphs that runs in Õ(m^{3/2}) time. For directed graphs, these are the only known approximation algorithms for Diameter. The 3/2-approximation algorithm is known to be tight under SETH: Roditty and Vassilevska W. [STOC ’13] proved that under SETH any (3/2−ε)-approximation algorithm for Diameter in undirected unweighted graphs requires m^{2−o(1)} time, and then Backurs, Roditty, Segal, Vassilevska W., and Wein [STOC ’18] and the follow-up work of Li proved that under SETH any (5/3−ε)-approximation algorithm for Diameter in undirected unweighted graphs requires m^{3/2−o(1)} time. Whether or not the folklore 2-approximation algorithm is tight, however, is unknown, and this has been explicitly posed as an open problem in numerous papers. Towards this question, Bonnet recently proved that under SETH, any (7/4−ε)-approximation requires m^{4/3−o(1)} time, only for directed weighted graphs. We completely resolve this question for directed graphs by proving that the folklore 2-approximation algorithm is conditionally optimal. In doing so, we obtain a series of conditional lower bounds that, together with prior work, give a complete time-accuracy trade-off that is tight with the three known algorithms for directed graphs. 
Specifically, we prove that under SETH, for any δ>0, a ((2k−1)/k−δ)-approximation algorithm for Diameter on directed unweighted graphs requires m^{k/(k−1)−o(1)} time. @InProceedings{STOC21p1697, author = {Mina Dalirrooyfard and Nicole Wein}, title = {Tight Conditional Lower Bounds for Approximating Diameter in Directed Graphs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1697--1710}, doi = {10.1145/3406325.3451130}, year = {2021}, } Publisher's Version 
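For reference, the folklore Õ(m)-time 2-approximation that the abstract mentions can be sketched as follows in Python (assuming a strongly connected unweighted digraph; helper names are mine): one forward and one backward BFS from an arbitrary vertex v give est = max(ecc_out(v), ecc_in(v)), and the triangle inequality d(s,t) ≤ d(s,v) + d(v,t) yields est ≤ D ≤ 2·est.

```python
from collections import deque

def bfs_ecc(adj, src):
    """Largest BFS distance from src; assumes all vertices are reachable."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return max(dist.values())

def diameter_2approx(adj, radj, v=0):
    """Folklore 2-approximation for directed unweighted Diameter.

    adj: out-neighbor lists; radj: in-neighbor lists (the reversed graph).
    Returns est with est <= D <= 2*est for a strongly connected graph:
    d(s,t) <= d(s,v) + d(v,t) <= ecc_in(v) + ecc_out(v) <= 2*est,
    while each eccentricity is itself at most D.
    """
    return max(bfs_ecc(adj, v), bfs_ecc(radj, v))
```

The lower bound quoted above says that, under SETH, no near-linear-time algorithm can beat the factor 2 achieved by this simple estimator on directed graphs.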

Daskalakis, Constantinos 
STOC '21: "Sample-Optimal and Efficient ..."
Sample-Optimal and Efficient Learning of Tree Ising Models
Constantinos Daskalakis and Qinxuan Pan (Massachusetts Institute of Technology, USA) We show that n-variable tree-structured Ising models can be learned computationally efficiently to within total variation distance ε from an optimal O(n ln n/ε^{2}) samples, where O(·) hides an absolute constant which, importantly, does not depend on the model being learned—neither its tree nor the magnitude of its edge strengths, on which we place no assumptions. Our guarantees hold, in fact, for the celebrated Chow-Liu algorithm [1968], using the plug-in estimator for estimating mutual information. While this (or any other) algorithm may fail to identify the structure of the underlying model correctly from a finite sample, we show that it will still learn a tree-structured model that is ε-close to the true one in total variation distance, a guarantee called “proper learning.” Our guarantees do not follow from known results for the Chow-Liu algorithm and the ensuing literature on learning graphical models, including the very recent renaissance of algorithms for this learning challenge, which only yield asymptotic consistency results, or sample-suboptimal and/or time-inefficient algorithms, unless further assumptions are placed on the model, such as bounds on the “strengths” of the model’s edges. While we establish guarantees for a widely known and simple algorithm, the analysis that this algorithm succeeds and is sample-optimal is quite complex, requiring a hierarchical classification of the edges into layers with different reconstruction guarantees, depending on their strength, combined with delicate uses of the subadditivity of the squared Hellinger distance over graphical models to control the error accumulation. 
@InProceedings{STOC21p133, author = {Constantinos Daskalakis and Qinxuan Pan}, title = {Sample-Optimal and Efficient Learning of Tree Ising Models}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {133--146}, doi = {10.1145/3406325.3451006}, year = {2021}, } Publisher's Version STOC '21: "The Complexity of Constrained ..." The Complexity of Constrained Min-Max Optimization Constantinos Daskalakis, Stratis Skoulakis, and Manolis Zampetakis (Massachusetts Institute of Technology, USA; Singapore University of Technology and Design, Singapore; University of California at Berkeley, USA) Despite its important applications in Machine Learning, min-max optimization of objective functions that are nonconvex-nonconcave remains elusive. Not only are there no known first-order methods converging to even approximate local min-max equilibria (a.k.a. approximate saddle points), but the computational complexity of identifying them is also poorly understood. In this paper, we provide a characterization of the computational complexity as well as of the limitations of first-order methods in this problem. Specifically, we show that in linearly constrained min-max optimization problems with nonconvex-nonconcave objectives an approximate local min-max equilibrium of large enough approximation is guaranteed to exist, but computing such a point is PPAD-complete. The same is true of computing an approximate fixed point of the (Projected) Gradient Descent/Ascent update dynamics, which is computationally equivalent to computing approximate local min-max equilibria. An important byproduct of our proof is to establish an unconditional hardness result in the Nemirovsky-Yudin 1983 oracle optimization model, where we are given oracle access to the values of some function f : P → [−1, 1] and its gradient ∇f, where P ⊆ [0, 1]^{d} is a known convex polytope. 
We show that any algorithm that uses such first-order oracle access to f and finds an ε-approximate local min-max equilibrium needs to make a number of oracle queries that is exponential in at least one of 1/ε, L, G, or d, where L and G are respectively the smoothness and Lipschitzness of f. This comes in sharp contrast to minimization problems, where finding approximate local minima in the same setting can be done with Projected Gradient Descent using O(L/ε) many queries. Our result is the first to show an exponential separation between these two fundamental optimization problems in the oracle model. @InProceedings{STOC21p1466, author = {Constantinos Daskalakis and Stratis Skoulakis and Manolis Zampetakis}, title = {The Complexity of Constrained Min-Max Optimization}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1466--1478}, doi = {10.1145/3406325.3451125}, year = {2021}, } Publisher's Version STOC '21: "Learning Ising Models from ..." Learning Ising Models from One or Multiple Samples Yuval Dagan, Constantinos Daskalakis, Nishanth Dikkala, and Anthimos Vardis Kandiros (Massachusetts Institute of Technology, USA; Google, USA) There have been two main lines of work on estimating Ising models: (1) estimating them from multiple independent samples under minimal assumptions about the model's interaction matrix; and (2) estimating them from one sample in restrictive settings. We propose a unified framework that smoothly interpolates between these two settings, enabling significantly richer estimation guarantees from one, a few, or many samples. Our main theorem provides guarantees for one-sample estimation, quantifying the estimation error in terms of the metric entropy of a family of interaction matrices. As corollaries of our main theorem, we derive bounds when the model's interaction matrix is a (sparse) linear combination of known matrices, or it belongs to a finite set, or to a high-dimensional manifold. 
In fact, our main result handles multiple independent samples by viewing them as one sample from a larger model, and can be used to derive estimation bounds that are qualitatively similar to those obtained in the aforedescribed multiple-sample literature. Our technical approach benefits from sparsifying a model's interaction network, conditioning on subsets of variables that make the dependencies in the resulting conditional distribution sufficiently weak. We use this sparsification technique to prove strong concentration and anti-concentration results for the Ising model, which we believe have applications beyond the scope of this paper. @InProceedings{STOC21p161, author = {Yuval Dagan and Constantinos Daskalakis and Nishanth Dikkala and Anthimos Vardis Kandiros}, title = {Learning Ising Models from One or Multiple Samples}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {161--168}, doi = {10.1145/3406325.3451074}, year = {2021}, } Publisher's Version 
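The (Projected) Gradient Descent/Ascent dynamics featured in the Constrained Min-Max Optimization abstract above can be sketched concretely. This is a toy illustration with hypothetical names, on the box [0,1]^d: the min player descends in x, the max player ascends in y, and both are projected back onto the box.

```python
def projected_gda(grad_x, grad_y, x0, y0, lr=0.05, steps=2000):
    """Projected Gradient Descent/Ascent on [0,1]^d.
    grad_x, grad_y: callables returning the gradient of f in x and y.
    An approximate fixed point of this update rule is exactly the object
    whose computation the hardness result above concerns."""
    clip = lambda v: min(1.0, max(0.0, v))  # projection onto [0, 1]
    x, y = list(x0), list(y0)
    for _ in range(steps):
        gx, gy = grad_x(x, y), grad_y(x, y)
        x_new = [clip(xi - lr * gi) for xi, gi in zip(x, gx)]  # descent
        y_new = [clip(yi + lr * gi) for yi, gi in zip(y, gy)]  # ascent
        x, y = x_new, y_new
    return x, y
```

On a convex-concave toy objective such as f(x,y) = (x−1/2)² − (y−1/2)², these iterates converge to the saddle point (1/2, 1/2); in the nonconvex-nonconcave regime the abstract concerns, the dynamics can cycle, and finding an approximate fixed point is PPAD-complete.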

Datta, Rajit 
STOC '21: "Lower Bounds for Monotone ..."
Lower Bounds for Monotone Arithmetic Circuits via Communication Complexity
Arkadev Chattopadhyay, Rajit Datta, and Partha Mukhopadhyay (Tata Institute of Fundamental Research, India; Chennai Mathematical Institute, India) Valiant (1980) showed that general arithmetic circuits with negation can be exponentially more powerful than monotone ones. We give the first improvement to this classical result: we construct a family of polynomials P_{n} in n variables, each of whose monomials has a nonnegative coefficient, such that P_{n} can be computed by a polynomial-size depth-three formula but every monotone circuit computing it has size 2^{Ω(n^{1/4}/log(n))}. The polynomial P_{n} embeds the SINK∘XOR function devised recently by Chattopadhyay, Mande and Sherif (2020) to refute the Log-Approximate-Rank Conjecture in communication complexity. To prove our lower bound for P_{n}, we develop a general connection between corruption of combinatorial rectangles by any function f∘XOR and corruption of product polynomials by a certain polynomial P^{f} that is an arithmetic embedding of f. This connection should be of independent interest. Using further ideas from communication complexity, we construct another family of set-multilinear polynomials f_{n,m} such that both F_{n,m} − ε·f_{n,m} and F_{n,m} + ε·f_{n,m} have monotone circuit complexity 2^{Ω(n/log(n))} if ε ≥ 2^{−Ω(m)}, where F_{n,m} = ∏_{i=1}^{n} (x_{i,1} +⋯+x_{i,m}) and m = O(n/log n). The polynomials f_{n,m} have 0/1 coefficients and are in VNP. Proving such lower bounds for monotone circuits has been advocated recently by Hrubeš (2020) as a first step towards proving lower bounds against general circuits via his new approach. @InProceedings{STOC21p786, author = {Arkadev Chattopadhyay and Rajit Datta and Partha Mukhopadhyay}, title = {Lower Bounds for Monotone Arithmetic Circuits via Communication Complexity}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {786--799}, doi = {10.1145/3406325.3451069}, year = {2021}, } Publisher's Version 

De, Anindya 
STOC '21: "Robust Testing of Low Dimensional ..."
Robust Testing of Low Dimensional Functions
Anindya De, Elchanan Mossel, and Joe Neeman (University of Pennsylvania, USA; Massachusetts Institute of Technology, USA; University of Texas at Austin, USA) A natural problem in high-dimensional inference is to decide if a classifier f:ℝ^{n} → {−1,1} depends on a small number of linear directions of its input data. Call a function g: ℝ^{n} → {−1,1} a linear k-junta if it is completely determined by some k-dimensional subspace of the input space. A recent work of the authors showed that linear k-juntas are testable. Thus there exists an algorithm to distinguish between: (1) f: ℝ^{n} → {−1,1} which is a linear k-junta with surface area s. (2) f is ε-far from any linear k-junta with surface area (1+ε)s. The query complexity of the algorithm is independent of the ambient dimension n. Following the surge of interest in noise-tolerant property testing, in this paper we prove a noise-tolerant (or robust) version of this result. Namely, we give an algorithm which, given any c>0, ε>0, distinguishes between: (1) f: ℝ^{n} → {−1,1} has correlation at least c with some linear k-junta with surface area s. (2) f has correlation at most c−ε with any linear k-junta with surface area at most s. The query complexity of our tester is k^{poly(s/ε)}. Using our techniques, we also obtain a fully noise-tolerant tester with the same query complexity for any class C of linear k-juntas with surface area bounded by s. As a consequence, we obtain a fully noise-tolerant tester with query complexity k^{O(poly(log k/ε))} for the class of intersections of k halfspaces (for constant k) over the Gaussian space. Our query complexity is independent of the ambient dimension n. Previously, no nontrivial noise-tolerant testers were known even for a single halfspace. 
@InProceedings{STOC21p584, author = {Anindya De and Elchanan Mossel and Joe Neeman}, title = {Robust Testing of Low Dimensional Functions}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {584--597}, doi = {10.1145/3406325.3451115}, year = {2021}, } Publisher's Version 

DeBlasio, Dan 
STOC '21: "How Much Data Is Sufficient ..."
How Much Data Is Sufficient to Learn High-Performing Algorithms? Generalization Guarantees for Data-Driven Algorithm Design
Maria-Florina Balcan, Dan DeBlasio, Travis Dick, Carl Kingsford, Tuomas Sandholm, and Ellen Vitercik (Carnegie Mellon University, USA; University of Texas at El Paso, USA; University of Pennsylvania, USA) Algorithms often have tunable parameters that impact performance metrics such as runtime and solution quality. For many algorithms used in practice, no parameter settings admit meaningful worst-case bounds, so the parameters are made available for the user to tune. Alternatively, parameters may be tuned implicitly within the proof of a worst-case guarantee. Worst-case instances, however, may be rare or nonexistent in practice. A growing body of research has demonstrated that data-driven algorithm design can lead to significant improvements in performance. This approach uses a training set of problem instances sampled from an unknown, application-specific distribution and returns a parameter setting with strong average performance on the training set. We provide a broadly applicable theory for deriving generalization guarantees that bound the difference between the algorithm’s average performance over the training set and its expected performance on the unknown distribution. Our results apply no matter how the parameters are tuned, be it via an automated or manual approach. The challenge is that for many types of algorithms, performance is a volatile function of the parameters: slightly perturbing the parameters can cause a large change in behavior. Prior research (e.g., Gupta and Roughgarden, SICOMP’17; Balcan et al., COLT’17, ICML’18, EC’18) has proved generalization bounds by employing case-by-case analyses of greedy algorithms, clustering algorithms, integer programming algorithms, and selling mechanisms. We uncover a unifying structure which we use to prove extremely general guarantees, yet we recover the bounds from prior research. 
Our guarantees, which are tight up to logarithmic factors in the worst case, apply whenever an algorithm’s performance is a piecewise-constant, linear, or—more generally—piecewise-structured function of its parameters. Our theory also implies novel bounds for voting mechanisms and dynamic programming algorithms from computational biology. @InProceedings{STOC21p919, author = {Maria-Florina Balcan and Dan DeBlasio and Travis Dick and Carl Kingsford and Tuomas Sandholm and Ellen Vitercik}, title = {How Much Data Is Sufficient to Learn High-Performing Algorithms? Generalization Guarantees for Data-Driven Algorithm Design}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {919--932}, doi = {10.1145/3406325.3451036}, year = {2021}, } Publisher's Version 
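The basic tuning procedure whose generalization the paper studies is simply empirical-average maximization over a training set. A minimal sketch (hypothetical names; the paper's theory applies to any tuning method, not just this grid search):

```python
def tune_parameter(perf, instances, param_grid):
    """Data-driven tuning: return the parameter with the best average
    performance on training instances drawn from the unknown
    application-specific distribution. The paper's question is how many
    instances are needed for this average to generalize to the
    distribution itself."""
    best_p, best_avg = None, float("-inf")
    for p in param_grid:
        avg = sum(perf(p, inst) for inst in instances) / len(instances)
        if avg > best_avg:
            best_p, best_avg = p, avg
    return best_p
```

For example, with `perf(p, inst) = -(p - inst)**2` and training instances clustered around 0.5, the tuner picks the grid point nearest 0.5; the generalization guarantees bound how far this empirical choice can be from the distribution's optimum.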

De Kroon, Jari J. H. 
STOC '21: "Vertex Deletion Parameterized ..."
Vertex Deletion Parameterized by Elimination Distance and Even Less
Bart M. P. Jansen, Jari J. H. de Kroon, and Michał Włodarczyk (Eindhoven University of Technology, Netherlands) We study the parameterized complexity of various classic vertex-deletion problems such as Odd cycle transversal, Vertex planarization, and Chordal vertex deletion under hybrid parameterizations. Existing FPT algorithms for these problems either focus on the parameterization by solution size, detecting solutions of size k in time f(k) · n^{O(1)}, or on width parameterizations, finding arbitrarily large optimal solutions in time f(w) · n^{O(1)} for some width measure w like treewidth. We unify these lines of research by presenting FPT algorithms for parameterizations that can simultaneously be arbitrarily much smaller than the solution size and the treewidth. The first class of parameterizations is based on the notion of elimination distance of the input graph to the target graph class, which intuitively measures the number of rounds needed to obtain a graph in the target class by removing one vertex from each connected component in each round. The second class of parameterizations consists of a relaxation of the notion of treewidth, allowing arbitrarily large bags that induce subgraphs belonging to the target class of the deletion problem as long as these subgraphs have small neighborhoods. Both kinds of parameterizations have been introduced recently and have already spawned several independent results. Our contribution is twofold. First, we present a framework for computing approximately optimal decompositions related to these graph measures. Namely, if the cost of an optimal decomposition is k, we show how to find a decomposition of cost k^{O(1)} in time f(k) · n^{O(1)}. This is applicable to any graph class for which we can solve the so-called separation problem. Secondly, we exploit the constructed decompositions for solving vertex-deletion problems by extending ideas from algorithms using iterative compression and the finite state property. 
For the three mentioned vertex-deletion problems, and all problems which can be formulated as hitting a finite set of connected forbidden (a) minors or (b) (induced) subgraphs, we obtain FPT algorithms with respect to both studied parameterizations. For example, we present an algorithm running in time n^{O(1)} + 2^{k^{O(1)}}·(n+m) and polynomial space for Odd cycle transversal parameterized by the elimination distance k to the class of bipartite graphs. @InProceedings{STOC21p1757, author = {Bart M. P. Jansen and Jari J. H. de Kroon and Michał Włodarczyk}, title = {Vertex Deletion Parameterized by Elimination Distance and Even Less}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1757--1769}, doi = {10.1145/3406325.3451068}, year = {2021}, } Publisher's Version 

De Rezende, Susanna F. 
STOC '21: "Automating Algebraic Proof ..."
Automating Algebraic Proof Systems Is NP-Hard
Susanna F. de Rezende, Mika Göös, Jakob Nordström, Toniann Pitassi, Robert Robere, and Dmitry Sokolov (Czech Academy of Sciences, Czechia; EPFL, Switzerland; University of Copenhagen, Denmark; Lund University, Sweden; University of Toronto, Canada; Institute for Advanced Study at Princeton, USA; McGill University, Canada; St. Petersburg State University, Russia; Russian Academy of Sciences, Russia) We show that algebraic proofs are hard to find: Given an unsatisfiable CNF formula F, it is NP-hard to find a refutation of F in the Nullstellensatz, Polynomial Calculus, or Sherali–Adams proof systems in time polynomial in the size of the shortest such refutation. Our work extends, and gives a simplified proof of, the recent breakthrough of Atserias and Müller (JACM 2020) that established an analogous result for Resolution. @InProceedings{STOC21p209, author = {Susanna F. de Rezende and Mika Göös and Jakob Nordström and Toniann Pitassi and Robert Robere and Dmitry Sokolov}, title = {Automating Algebraic Proof Systems Is NP-Hard}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {209--222}, doi = {10.1145/3406325.3451080}, year = {2021}, } Publisher's Version 

Dershowitz, Nachum 
STOC '21: "The Communication Complexity ..."
The Communication Complexity of Multiparty Set Disjointness under Product Distributions
Nachum Dershowitz, Rotem Oshman, and Tal Roth (Tel Aviv University, Israel) In the multiparty number-in-hand set disjointness problem, we have k players, with private inputs X_{1},…,X_{k} ⊆ [n]. The players’ goal is to check whether ∩_{ℓ=1}^{k} X_{ℓ} = ∅. It is known that in the shared blackboard model of communication, set disjointness requires Ω(n log k + k) bits of communication, and in the coordinator model, it requires Ω(kn) bits. However, these two lower bounds require that the players’ inputs can be highly correlated. We study the communication complexity of multiparty set disjointness under product distributions, and ask whether the problem becomes significantly easier, as it is known to become in the two-party case. Our main result is a nearly-tight bound of Θ̃(n^{1−1/k} + k) for both the shared blackboard model and the coordinator model. This shows that in the shared blackboard model, as the number of players grows, having independent inputs helps less and less; but in the coordinator model, when k is very large, having independent inputs makes the problem much easier. Both our upper and our lower bounds use new ideas, as the original techniques developed for the two-party case do not scale to more than two players. @InProceedings{STOC21p1194, author = {Nachum Dershowitz and Rotem Oshman and Tal Roth}, title = {The Communication Complexity of Multiparty Set Disjointness under Product Distributions}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1194--1207}, doi = {10.1145/3406325.3451106}, year = {2021}, } Publisher's Version 
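For intuition, the trivial coordinator-model protocol that matches the Ω(kn) worst-case bound quoted above can be sketched as follows (hypothetical names; the paper's point is that far less communication suffices under product distributions):

```python
from functools import reduce

def coordinator_disjointness(sets, n):
    """Naive coordinator protocol for k-party set disjointness:
    every player sends the coordinator its n-bit characteristic vector
    (k*n bits in total), and the coordinator intersects the vectors.
    sets: the k players' private subsets of range(n)."""
    masks = [sum(1 << i for i in s) for s in sets]  # one n-bit message each
    return reduce(lambda a, b: a & b, masks) == 0   # True iff disjoint
```

Under product distributions the paper shows roughly n^{1−1/k} + k bits suffice in the coordinator model, so for large k this brute-force k·n-bit protocol is far from optimal.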

Diakonikolas, Ilias 
STOC '21: "Optimal Testing of Discrete ..."
Optimal Testing of Discrete Distributions with High Probability
Ilias Diakonikolas, Themis Gouleakis, Daniel M. Kane, John Peebles, and Eric Price (University of Wisconsin-Madison, USA; MPI-INF, Germany; University of California at San Diego, USA; Princeton University, USA; University of Texas at Austin, USA) We study the problem of testing discrete distributions with a focus on the high probability regime. Specifically, given samples from one or more discrete distributions, a property P, and parameters 0< ε, δ <1, we want to distinguish with probability at least 1−δ whether these distributions satisfy P or are ε-far from P in total variation distance. Most prior work in distribution testing studied the constant confidence case (corresponding to δ = Ω(1)), and provided sample-optimal testers for a range of properties. While one can always boost the confidence probability of any such tester by black-box amplification, this generic boosting method typically leads to suboptimal sample bounds. Here we study the following broad question: For a given property P, can we characterize the sample complexity of testing P as a function of all relevant problem parameters, including the error probability δ? Prior to this work, uniformity testing was the only statistical task whose sample complexity had been characterized in this setting. As our main results, we provide the first algorithms for closeness and independence testing that are sample-optimal, within constant factors, as a function of all relevant parameters. We also show matching information-theoretic lower bounds on the sample complexity of these problems. Our techniques naturally extend to give optimal testers for related problems. To illustrate the generality of our methods, we give optimal algorithms for testing collections of distributions and testing closeness with unequal sized samples. @InProceedings{STOC21p542, author = {Ilias Diakonikolas and Themis Gouleakis and Daniel M. 
Kane and John Peebles and Eric Price}, title = {Optimal Testing of Discrete Distributions with High Probability}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {542--555}, doi = {10.1145/3406325.3450997}, year = {2021}, } Publisher's Version STOC '21: "Efficiently Learning Halfspaces ..." Efficiently Learning Halfspaces with Tsybakov Noise Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Christos Tzamos, and Nikos Zarifis (University of Wisconsin-Madison, USA; University of California at San Diego, USA) We study the problem of PAC learning homogeneous halfspaces with Tsybakov noise. In the Tsybakov noise model, the label of every example is independently flipped with an adversarially controlled probability that can be arbitrarily close to 1/2 for a fraction of the examples. We give the first polynomial-time algorithm for this fundamental learning problem. Our algorithm learns the true halfspace within any desired accuracy and succeeds under a broad family of well-behaved distributions including log-concave distributions. This extended abstract is a merge of two papers. In an earlier work, a subset of the authors developed an efficient reduction from learning to certifying the non-optimality of a candidate halfspace and gave a quasi-polynomial time certificate algorithm. In a subsequent work, the authors of this paper developed a polynomial-time certificate algorithm. @InProceedings{STOC21p88, author = {Ilias Diakonikolas and Daniel M. Kane and Vasilis Kontonis and Christos Tzamos and Nikos Zarifis}, title = {Efficiently Learning Halfspaces with Tsybakov Noise}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {88--101}, doi = {10.1145/3406325.3450998}, year = {2021}, } Publisher's Version 
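The generic black-box amplification mentioned in the first abstract above is just repetition plus a majority vote. A minimal sketch (hypothetical names and a crude repetition constant):

```python
import math

def amplify(tester, delta):
    """Black-box confidence amplification: run a constant-confidence
    tester (error probability at most 1/3 per run) O(log(1/delta)) times
    and take a majority vote. By a Chernoff bound the majority errs with
    probability at most delta, but the sample complexity grows by the
    multiplicative log(1/delta) factor that the paper's direct
    high-probability testers avoid."""
    reps = 2 * math.ceil(18 * math.log(1 / delta)) + 1  # odd; crude constant
    votes = sum(1 for _ in range(reps) if tester())
    return 2 * votes > reps
```

Each call to `tester()` consumes a fresh batch of samples, which is exactly why this generic route multiplies the sample bound by log(1/δ) rather than adding a lower-order term.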

Dick, Travis 
STOC '21: "How Much Data Is Sufficient ..."
How Much Data Is Sufficient to Learn High-Performing Algorithms? Generalization Guarantees for Data-Driven Algorithm Design
Maria-Florina Balcan, Dan DeBlasio, Travis Dick, Carl Kingsford, Tuomas Sandholm, and Ellen Vitercik (Carnegie Mellon University, USA; University of Texas at El Paso, USA; University of Pennsylvania, USA) Algorithms often have tunable parameters that impact performance metrics such as runtime and solution quality. For many algorithms used in practice, no parameter settings admit meaningful worst-case bounds, so the parameters are made available for the user to tune. Alternatively, parameters may be tuned implicitly within the proof of a worst-case guarantee. Worst-case instances, however, may be rare or nonexistent in practice. A growing body of research has demonstrated that data-driven algorithm design can lead to significant improvements in performance. This approach uses a training set of problem instances sampled from an unknown, application-specific distribution and returns a parameter setting with strong average performance on the training set. We provide a broadly applicable theory for deriving generalization guarantees that bound the difference between the algorithm’s average performance over the training set and its expected performance on the unknown distribution. Our results apply no matter how the parameters are tuned, be it via an automated or manual approach. The challenge is that for many types of algorithms, performance is a volatile function of the parameters: slightly perturbing the parameters can cause a large change in behavior. Prior research (e.g., Gupta and Roughgarden, SICOMP’17; Balcan et al., COLT’17, ICML’18, EC’18) has proved generalization bounds by employing case-by-case analyses of greedy algorithms, clustering algorithms, integer programming algorithms, and selling mechanisms. We uncover a unifying structure which we use to prove extremely general guarantees, yet we recover the bounds from prior research. 
Our guarantees, which are tight up to logarithmic factors in the worst case, apply whenever an algorithm’s performance is a piecewise-constant, linear, or—more generally—piecewise-structured function of its parameters. Our theory also implies novel bounds for voting mechanisms and dynamic programming algorithms from computational biology. @InProceedings{STOC21p919, author = {Maria-Florina Balcan and Dan DeBlasio and Travis Dick and Carl Kingsford and Tuomas Sandholm and Ellen Vitercik}, title = {How Much Data Is Sufficient to Learn High-Performing Algorithms? Generalization Guarantees for Data-Driven Algorithm Design}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {919--932}, doi = {10.1145/3406325.3451036}, year = {2021}, } Publisher's Version 

Dikkala, Nishanth 
STOC '21: "Learning Ising Models from ..."
Learning Ising Models from One or Multiple Samples
Yuval Dagan, Constantinos Daskalakis, Nishanth Dikkala, and Anthimos Vardis Kandiros (Massachusetts Institute of Technology, USA; Google, USA) There have been two main lines of work on estimating Ising models: (1) estimating them from multiple independent samples under minimal assumptions about the model's interaction matrix; and (2) estimating them from one sample in restrictive settings. We propose a unified framework that smoothly interpolates between these two settings, enabling significantly richer estimation guarantees from one, a few, or many samples. Our main theorem provides guarantees for one-sample estimation, quantifying the estimation error in terms of the metric entropy of a family of interaction matrices. As corollaries of our main theorem, we derive bounds when the model's interaction matrix is a (sparse) linear combination of known matrices, or it belongs to a finite set, or to a high-dimensional manifold. In fact, our main result handles multiple independent samples by viewing them as one sample from a larger model, and can be used to derive estimation bounds that are qualitatively similar to those obtained in the aforedescribed multiple-sample literature. Our technical approach benefits from sparsifying a model's interaction network, conditioning on subsets of variables that make the dependencies in the resulting conditional distribution sufficiently weak. We use this sparsification technique to prove strong concentration and anti-concentration results for the Ising model, which we believe have applications beyond the scope of this paper. @InProceedings{STOC21p161, author = {Yuval Dagan and Constantinos Daskalakis and Nishanth Dikkala and Anthimos Vardis Kandiros}, title = {Learning Ising Models from One or Multiple Samples}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {161--168}, doi = {10.1145/3406325.3451074}, year = {2021}, } Publisher's Version 

Dobzinski, Shahar 
STOC '21: "The Communication Complexity ..."
The Communication Complexity of Payment Computation
Shahar Dobzinski and Shiri Ron (Weizmann Institute of Science, Israel) Let (f,P) be an incentive compatible mechanism where f is the social choice function and P is the payment function. In many important settings, f uniquely determines P (up to a constant) and therefore a common approach is to focus on the design of f and neglect the role of the payment function. Fadel and Segal [JET, 2009] question this approach by taking the lenses of communication complexity: can it be that the communication complexity of an incentive compatible mechanism that implements f (that is, computes both the output and the payments) is much larger than the communication complexity of computing the output? I.e., can it be that cc_{IC}(f) ≫ cc(f)? Fadel and Segal show that for every f, cc_{IC}(f) ≤ exp(cc(f)). They also show that fully computing the incentive compatible mechanism is strictly harder than computing only the output: there exists a social choice function f such that cc_{IC}(f)=cc(f)+1. In a follow-up work, Babaioff, Blumrosen, Naor, and Schapira [EC’08] provide a social choice function f such that cc_{IC}(f)=Θ(n· cc(f)), where n is the number of players. The question of whether the exponential upper bound of Fadel and Segal is tight remained wide open. In this paper we solve this question by explicitly providing a function f such that cc_{IC}(f)= exp(cc(f)). In fact, we establish this via two very different proofs. In contrast, we show that if the players are risk-neutral, then compromising on a randomized truthful-in-expectation implementation (rather than a deterministic ex-post implementation) gives cc_{TIE}(f)=poly(n,cc(f)) for every function f, as long as the domain of f is single-parameter or a convex multi-parameter domain. We also provide efficient algorithms for deterministic computation of payments in several important domains. 
@InProceedings{STOC21p933, author = {Shahar Dobzinski and Shiri Ron}, title = {The Communication Complexity of Payment Computation}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {933--946}, doi = {10.1145/3406325.3451083}, year = {2021}, } Publisher's Version 
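As a toy illustration of the premise that f can uniquely determine P (up to a constant), consider the single-item Vickrey auction; this hypothetical sketch is not a construction from the paper:

```python
def second_price(bids):
    """Single-item Vickrey auction: once the social choice function f is
    fixed to 'highest bid wins', incentive compatibility pins the
    winner's payment down to the second-highest bid. Requires at least
    two bidders."""
    winner = max(range(len(bids)), key=lambda i: bids[i])
    price = max(b for i, b in enumerate(bids) if i != winner)
    return winner, price
```

The paper's question is how much extra communication computing such a uniquely determined P can require beyond computing the winner alone.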

Dong, Sally 
STOC '21: "A Nearly-Linear Time Algorithm ..."
A Nearly-Linear Time Algorithm for Linear Programs with Small Treewidth: A Multiscale Representation of Robust Central Path
Sally Dong, Yin Tat Lee, and Guanghao Ye (University of Washington, USA; Microsoft Research, USA) Arising from structural graph theory, treewidth has become a focus of study in fixed-parameter tractable algorithms. Many NP-hard problems are known to be solvable in O(n · 2^{O(τ)}) time, where τ is the treewidth of the input graph. Analogously, many problems in P should be solvable in O(n · τ^{O(1)}) time; however, due to the lack of appropriate tools, only a few such results are currently known. In our paper, we show this holds for linear programs: Given a linear program of the form min_{Ax=b, ℓ ≤ x ≤ u} c^{⊤}x whose dual graph G_{A} has treewidth τ, and a corresponding width-τ tree decomposition, we show how to solve it in time O(n · τ^{2} log(1/ε)), where n is the number of variables and ε is the relative accuracy. When a tree decomposition is not given, we use existing techniques in vertex separators to obtain algorithms with O(n · τ^{4} log(1/ε)) and O(n · τ^{2} log(1/ε) + n^{1.5}) runtimes. Besides being the first of its kind, our algorithm has runtime nearly matching the fastest runtime for solving the subproblem Ax=b (under the assumption that no fast matrix multiplication is used). We obtain these results by combining recent techniques in interior-point methods (IPMs), sketching, and a novel representation of the solution under a multiscale basis similar to the wavelet basis. This representation further yields the first IPM with o(rank(A)) time per iteration when the treewidth is small. @InProceedings{STOC21p1784, author = {Sally Dong and Yin Tat Lee and Guanghao Ye}, title = {A Nearly-Linear Time Algorithm for Linear Programs with Small Treewidth: A Multiscale Representation of Robust Central Path}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1784--1797}, doi = {10.1145/3406325.3451056}, year = {2021}, } Publisher's Version 

Dory, Michal 
STOC '21: "Distributed Weighted Min-Cut ..."
Distributed Weighted Min-Cut in Nearly-Optimal Time
Michal Dory, Yuval Efron, Sagnik Mukhopadhyay, and Danupon Nanongkai (ETH Zurich, Switzerland; University of Toronto, Canada; KTH, Sweden; University of Copenhagen, Denmark) Minimum-weight cut (min-cut) is a basic measure of a network’s connectivity strength. While the min-cut can be computed efficiently in the sequential setting [Karger STOC’96], there was no efficient way for a distributed network to compute its own min-cut without limiting the input structure or dropping the output quality: In the standard CONGEST model, existing algorithms with nearly-optimal time (e.g. [Ghaffari, Kuhn, DISC’13; Nanongkai, Su, DISC’14]) can guarantee a solution that is a (1+ε)-approximation at best, while the exact Õ(n^{0.8}D^{0.2} + n^{0.9})-time algorithm [Ghaffari, Nowicki, Thorup, SODA’20] works only on simple networks (no weights and no parallel edges). Throughout, n and D denote the network’s number of vertices and hop-diameter, respectively. For the weighted case, the best bound was Õ(n) [Daga, Henzinger, Nanongkai, Saranurak, STOC’19]. In this paper, we provide an exact Õ(√n + D)-time algorithm for computing min-cut on weighted networks. Our result improves even the previous algorithm that works only on simple networks. Its time complexity matches the known lower bound up to polylogarithmic factors. At the heart of our algorithm are a routing trick and two structural lemmas regarding the structure of a minimum cut of a graph. These two structural lemmas considerably strengthen and generalize the framework of Mukhopadhyay-Nanongkai [STOC’20] and can be of independent interest. @InProceedings{STOC21p1144, author = {Michal Dory and Yuval Efron and Sagnik Mukhopadhyay and Danupon Nanongkai}, title = {Distributed Weighted Min-Cut in Nearly-Optimal Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1144--1153}, doi = {10.1145/3406325.3451020}, year = {2021}, } Publisher's Version 
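For contrast with the distributed setting, the simplest sequential approach to global min-cut is randomized edge contraction (Karger, 1993), a simpler relative of the sequential algorithms cited above. A hedged sketch with hypothetical names:

```python
import random

def contraction_min_cut(edges, n, trials=200):
    """Randomized contraction for global min-cut on a weighted graph:
    contract weight-proportional random edges until two super-nodes
    remain, record the crossing weight, and repeat to boost the success
    probability. A sequential baseline, not the distributed algorithm
    described above. edges: list of (u, v, weight); n: vertex count."""
    best = float("inf")
    for _ in range(trials):
        parent = list(range(n))

        def find(x):  # union-find with path halving tracks super-nodes
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        for _ in range(n - 2):  # contract until two super-nodes remain
            live = [(u, v, w) for u, v, w in edges if find(u) != find(v)]
            r = random.uniform(0, sum(w for _, _, w in live))
            for u, v, w in live:
                r -= w
                if r <= 0:
                    parent[find(u)] = find(v)  # contract edge (u, v)
                    break
        best = min(best, sum(w for u, v, w in edges if find(u) != find(v)))
    return best
```

A single trial succeeds with probability Ω(1/n²), so repeating O(n² log n) times makes failure unlikely; the distributed algorithm above achieves Õ(√n + D) rounds by entirely different means.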

Dudeja, Aditi 
STOC '21: "A Framework for Dynamic Matching ..."
A Framework for Dynamic Matching in Weighted Graphs
Aaron Bernstein, Aditi Dudeja, and Zachary Langley (Rutgers University, USA) We introduce a new framework for computing approximate maximum weight matchings. Our primary focus is on the fully dynamic setting, where there is a large gap between the guarantees of the best known algorithms for computing weighted and unweighted matchings. Indeed, almost all current weighted matching algorithms that reduce to the unweighted problem lose a factor of two in the approximation ratio. In contrast, in other sublinear models such as the distributed and streaming models, recent work has largely closed this weighted/unweighted gap. For bipartite graphs, we almost completely settle the gap with a general reduction that converts any algorithm for α-approximate unweighted matching to an algorithm for (1−є)α-approximate weighted matching, while only increasing the update time by an O(logn) factor for constant є. We also show that our framework leads to significant improvements for non-bipartite graphs, though not in the form of a universal reduction. In particular, we give two algorithms for weighted non-bipartite matching: 1. A randomized (Las Vegas) fully dynamic algorithm that maintains a (1/2−є)-approximate maximum weight matching in worst-case update time O(polylog n) with high probability against an adaptive adversary. Our bounds are essentially the same as those of the unweighted algorithm of Wajc [STOC 2020]. 2. A deterministic fully dynamic algorithm that maintains a (2/3−є)-approximate maximum weight matching in amortized update time O(m^{1/4}). Our bounds are essentially the same as those of the unweighted algorithm of Bernstein and Stein [SODA 2016]. A key feature of our framework is that it uses existing algorithms for unweighted matching as black boxes. As a result, our framework is simple and versatile. Moreover, our framework easily translates to other models, and we use it to derive new results for the weighted matching problem in streaming and communication complexity models. 
@InProceedings{STOC21p668, author = {Aaron Bernstein and Aditi Dudeja and Zachary Langley}, title = {A Framework for Dynamic Matching in Weighted Graphs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {668-681}, doi = {10.1145/3406325.3451113}, year = {2021}, }
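For intuition on the factor-two loss the abstract mentions: the classical static greedy algorithm, which scans edges in non-increasing weight order, gives a 1/2-approximate maximum weight matching. A textbook baseline sketch (not the paper's dynamic framework; names are ours):

```python
def greedy_weighted_matching(edges):
    """Static greedy 1/2-approximation for maximum weight matching:
    scan edges in non-increasing weight order, adding an edge whenever
    both of its endpoints are still unmatched.
    edges: list of (u, v, weight) tuples."""
    matched = set()
    matching = []
    for u, v, w in sorted(edges, key=lambda e: -e[2]):
        if u not in matched and v not in matched:
            matching.append((u, v, w))
            matched.update((u, v))
    return matching
```

On the path a-b (3), b-c (4), c-d (3) greedy takes only b-c for weight 4, while the optimum a-b plus c-d has weight 6: exactly the factor-two gap that weighted-to-unweighted reductions typically inherit.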

Dütting, Paul 
STOC '21: "Efficient Two-Sided Markets ..."
Efficient Two-Sided Markets with Limited Information
Paul Dütting, Federico Fusco, Philip Lazos, Stefano Leonardi, and Rebecca Reiffenhäuser (Google Research, Switzerland; Sapienza University of Rome, Italy) A celebrated impossibility result by Myerson and Satterthwaite (1983) shows that any truthful mechanism for two-sided markets that maximizes social welfare must run a deficit, making it necessary to relax welfare efficiency and to use approximation mechanisms. Such mechanisms in general make extensive use of Bayesian priors. In this work, we investigate a question of increasing theoretical and practical importance: how much prior information is required to design mechanisms with near-optimal approximations? Our first contribution is a more general impossibility result stating that no meaningful approximation is possible without any prior information, expanding the famous impossibility result of Myerson and Satterthwaite. Our second contribution is that one single sample (one number per item), arguably a minimum-possible amount of prior information, from each seller distribution is sufficient for a large class of two-sided markets. We prove matching upper and lower bounds on the best approximation that can be obtained with one single sample for subadditive buyers and additive sellers, regardless of computational considerations. Our third contribution is the design of computationally efficient black-box reductions that turn any one-sided mechanism into a two-sided mechanism with a small loss in the approximation, while using only one single sample from each seller. On the way, our black-box-type mechanisms deliver several interesting positive results in their own right, often beating even the state of the art that uses full prior information. 
@InProceedings{STOC21p1452, author = {Paul Dütting and Federico Fusco and Philip Lazos and Stefano Leonardi and Rebecca Reiffenhäuser}, title = {Efficient Two-Sided Markets with Limited Information}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1452-1465}, doi = {10.1145/3406325.3451076}, year = {2021}, }

Dwork, Cynthia 
STOC '21: "Outcome Indistinguishability ..."
Outcome Indistinguishability
Cynthia Dwork, Michael P. Kim, Omer Reingold, Guy N. Rothblum, and Gal Yona (Harvard University, USA; University of California at Berkeley, USA; Stanford University, USA; Weizmann Institute of Science, Israel) Prediction algorithms assign numbers to individuals that are popularly understood as individual “probabilities”—what is the probability of 5-year survival after cancer diagnosis?—and which increasingly form the basis for life-altering decisions. Drawing on an understanding of computational indistinguishability developed in complexity theory and cryptography, we introduce Outcome Indistinguishability. Predictors that are Outcome Indistinguishable (OI) yield a generative model for outcomes that cannot be efficiently refuted on the basis of the real-life observations produced by the predictor. We investigate a hierarchy of OI definitions, whose stringency increases with the degree to which distinguishers may access the predictor in question. Our findings reveal that OI behaves qualitatively differently than previously studied notions of indistinguishability. First, we provide constructions at all levels of the hierarchy. Then, leveraging recently-developed machinery for proving average-case fine-grained hardness, we obtain lower bounds on the complexity of the more stringent forms of OI. This hardness result provides the first scientific grounds for the political argument that, when inspecting algorithmic risk prediction instruments, auditors should be granted oracle access to the algorithm, not simply historical predictions. @InProceedings{STOC21p1095, author = {Cynthia Dwork and Michael P. Kim and Omer Reingold and Guy N. Rothblum and Gal Yona}, title = {Outcome Indistinguishability}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1095-1108}, doi = {10.1145/3406325.3451064}, year = {2021}, }

Efremenko, Klim 
STOC '21: "Optimal Error Resilience of ..."
Optimal Error Resilience of Adaptive Message Exchange
Klim Efremenko, Gillat Kol, and Raghuvansh R. Saxena (Ben-Gurion University of the Negev, Israel; Princeton University, USA) We study the error resilience of the message exchange task: Two parties, each holding a private input, want to exchange their inputs. However, the channel connecting them is governed by an adversary that may corrupt a constant fraction of the transmissions. What is the maximum fraction of corruptions that still allows the parties to exchange their inputs? For the non-adaptive channel, where the parties must agree in advance on the order in which they communicate, the maximum error resilience was shown to be 1/4 (see Braverman and Rao, STOC 2011). The problem was also studied over the adaptive channel, where the order in which the parties communicate may not be predetermined (Ghaffari, Haeupler, and Sudan, STOC 2014; Efremenko, Kol, and Saxena, STOC 2020). These works show that the adaptive channel admits a much richer set of protocols but leave open the question of finding its maximum error resilience. In this work, we show that the maximum error resilience of a protocol for message exchange over the adaptive channel is 5/16, thereby settling the above question. Our result requires improving both the known upper bounds and the known lower bounds for the problem. @InProceedings{STOC21p1235, author = {Klim Efremenko and Gillat Kol and Raghuvansh R. Saxena}, title = {Optimal Error Resilience of Adaptive Message Exchange}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1235-1247}, doi = {10.1145/3406325.3451077}, year = {2021}, }

Efron, Yuval 
STOC '21: "Distributed Weighted Min-Cut ..."
Distributed Weighted Min-Cut in Nearly-Optimal Time
Michal Dory, Yuval Efron, Sagnik Mukhopadhyay, and Danupon Nanongkai (ETH Zurich, Switzerland; University of Toronto, Canada; KTH, Sweden; University of Copenhagen, Denmark) Minimum-weight cut (min-cut) is a basic measure of a network’s connectivity strength. While the min-cut can be computed efficiently in the sequential setting [Karger STOC’96], there was no efficient way for a distributed network to compute its own min-cut without limiting the input structure or dropping the output quality: In the standard CONGEST model, existing algorithms with nearly-optimal time (e.g. [Ghaffari, Kuhn, DISC’13; Nanongkai, Su, DISC’14]) can guarantee a solution that is a (1+є)-approximation at best, while the exact Õ(n^{0.8}D^{0.2} + n^{0.9})-time algorithm [Ghaffari, Nowicki, Thorup, SODA’20] works only on simple networks (no weights and no parallel edges). Throughout, n and D denote the network’s number of vertices and hop-diameter, respectively. For the weighted case, the best bound was Õ(n) [Daga, Henzinger, Nanongkai, Saranurak, STOC’19]. In this paper, we provide an exact Õ(√n + D)-time algorithm for computing min-cut on weighted networks. Our result improves even the previous algorithm that works only on simple networks. Its time complexity matches the known lower bound up to polylogarithmic factors. At the heart of our algorithm are a routing trick and two structural lemmas regarding the structure of a minimum cut of a graph. These two structural lemmas considerably strengthen and generalize the framework of Mukhopadhyay-Nanongkai [STOC’20] and can be of independent interest. @InProceedings{STOC21p1144, author = {Michal Dory and Yuval Efron and Sagnik Mukhopadhyay and Danupon Nanongkai}, title = {Distributed Weighted Min-Cut in Nearly-Optimal Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1144-1153}, doi = {10.1145/3406325.3451020}, year = {2021}, }

Esperet, Louis 
STOC '21: "Optimal Labelling Schemes ..."
Optimal Labelling Schemes for Adjacency, Comparability, and Reachability
Marthe Bonamy, Louis Esperet, Carla Groenland, and Alex Scott (CNRS, France; LaBRI, France; University of Bordeaux, France; G-SCOP, France; Grenoble Alps University, France; University of Oxford, UK) We construct asymptotically optimal adjacency labelling schemes for every hereditary class containing 2^{Ω(n^2)} n-vertex graphs as n→ ∞. This regime contains many classes of interest, for instance perfect graphs or comparability graphs, for which we obtain an adjacency labelling scheme with labels of n/4+o(n) bits per vertex. This implies the existence of a reachability labelling scheme for digraphs with labels of n/4+o(n) bits per vertex and a comparability labelling scheme for posets with labels of n/4+o(n) bits per element. All these results are best possible, up to the lower order term. @InProceedings{STOC21p1109, author = {Marthe Bonamy and Louis Esperet and Carla Groenland and Alex Scott}, title = {Optimal Labelling Schemes for Adjacency, Comparability, and Reachability}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1109-1117}, doi = {10.1145/3406325.3451102}, year = {2021}, }

Fearnley, John 
STOC '21: "The Complexity of Gradient ..."
The Complexity of Gradient Descent: CLS = PPAD ∩ PLS
John Fearnley, Paul W. Goldberg, Alexandros Hollender, and Rahul Savani (University of Liverpool, UK; University of Oxford, UK) We study search problems that can be solved by performing Gradient Descent on a bounded convex polytopal domain and show that this class is equal to the intersection of two well-known classes: PPAD and PLS. As our main underlying technical contribution, we show that computing a Karush-Kuhn-Tucker (KKT) point of a continuously differentiable function over the domain [0,1]^{2} is PPAD ∩ PLS-complete. This is the first natural problem to be shown complete for this class. Our results also imply that the class CLS (Continuous Local Search), which was defined by Daskalakis and Papadimitriou as a more “natural” counterpart to PPAD ∩ PLS and contains many interesting problems, is itself equal to PPAD ∩ PLS. @InProceedings{STOC21p46, author = {John Fearnley and Paul W. Goldberg and Alexandros Hollender and Rahul Savani}, title = {The Complexity of Gradient Descent: CLS = PPAD ∩ PLS}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {46-59}, doi = {10.1145/3406325.3451052}, year = {2021}, }
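The search problem behind this completeness result can be made concrete: run gradient descent on [0,1]^2 with projection back into the box, stopping once the iterate barely moves, which yields an approximate KKT point. A hedged sketch (the step size, tolerance, and quadratic test function are our own illustrative choices, not the paper's construction):

```python
def projected_gradient_descent(grad, x0, step=0.01, tol=1e-6, max_iter=100000):
    """Gradient descent over the box [0,1]^2 with projection.
    Stops at an approximate KKT point: when the projected gradient
    step moves the iterate by less than tol in every coordinate,
    first-order optimality holds up to the box constraints."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        # take a gradient step, then clamp each coordinate into [0, 1]
        new = [min(1.0, max(0.0, x[i] - step * g[i])) for i in range(2)]
        if max(abs(new[i] - x[i]) for i in range(2)) < tol:
            return new
        x = new
    return x
```

The fixed point of this projected update is precisely a KKT point of the constrained problem: in the interior the gradient vanishes, while on the boundary the gradient may point outward, which is exactly the first-order condition the paper proves hard to compute in the worst case.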

Fefferman, Bill 
STOC '21: "Eliminating Intermediate Measurements ..."
Eliminating Intermediate Measurements in Space-Bounded Quantum Computation
Bill Fefferman and Zachary Remscrim (University of Chicago, USA) A foundational result in the theory of quantum computation, known as the "principle of safe storage," shows that it is always possible to take a quantum circuit and produce an equivalent circuit that makes all measurements at the end of the computation. While this procedure is time efficient, meaning that it does not introduce a large overhead in the number of gates, it uses extra ancillary qubits, and so is not generally space efficient. It is quite natural to ask whether it is possible to eliminate intermediate measurements without increasing the number of ancillary qubits. We give an affirmative answer to this question by exhibiting a procedure to eliminate all intermediate measurements that is simultaneously space efficient and time efficient. In particular, this shows that the definition of a space-bounded quantum complexity class is robust to allowing or forbidding intermediate measurements. A key component of our approach, which may be of independent interest, involves showing that the well-conditioned versions of many standard linear-algebraic problems may be solved by a quantum computer in less space than seems possible by a classical computer. @InProceedings{STOC21p1343, author = {Bill Fefferman and Zachary Remscrim}, title = {Eliminating Intermediate Measurements in Space-Bounded Quantum Computation}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1343-1356}, doi = {10.1145/3406325.3451051}, year = {2021}, }

Feldman, Vitaly 
STOC '21: "When Is Memorization of Irrelevant ..."
When Is Memorization of Irrelevant Training Data Necessary for High-Accuracy Learning?
Gavin Brown, Mark Bun, Vitaly Feldman, Adam Smith, and Kunal Talwar (Boston University, USA; Apple, USA) Modern machine learning models are complex and frequently encode surprising amounts of information about individual inputs. In extreme cases, complex models appear to memorize entire input examples, including seemingly irrelevant information (social security numbers from text, for example). In this paper, we aim to understand whether this sort of memorization is necessary for accurate learning. We describe natural prediction problems in which every sufficiently accurate training algorithm must encode, in the prediction model, essentially all the information about a large subset of its training examples. This remains true even when the examples are high-dimensional and have entropy much higher than the sample size, and even when most of that information is ultimately irrelevant to the task at hand. Further, our results do not depend on the training algorithm or the class of models used for learning. Our problems are simple and fairly natural variants of the next-symbol prediction and the cluster labeling tasks. These tasks can be seen as abstractions of text and image-related prediction problems. To establish our results, we reduce from a family of one-way communication problems for which we prove new information complexity lower bounds. @InProceedings{STOC21p123, author = {Gavin Brown and Mark Bun and Vitaly Feldman and Adam Smith and Kunal Talwar}, title = {When Is Memorization of Irrelevant Training Data Necessary for High-Accuracy Learning?}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {123-132}, doi = {10.1145/3406325.3451131}, year = {2021}, }

Feng, Weiming 
STOC '21: "Sampling Constraint Satisfaction ..."
Sampling Constraint Satisfaction Solutions in the Local Lemma Regime
Weiming Feng, Kun He, and Yitong Yin (Nanjing University, China; Institute of Computing Technology at Chinese Academy of Sciences, China; University of Chinese Academy of Sciences, China) We give a Markov-chain-based algorithm for sampling almost uniform solutions of constraint satisfaction problems (CSPs). Assuming a canonical setting for the Lovász local lemma, where each constraint is violated by a small number of forbidden local configurations, our sampling algorithm is accurate in a local lemma regime, and the running time is a fixed polynomial whose dependency on n is close to linear, where n is the number of variables. Our main approach is a new technique called state compression, which generalizes the “mark/unmark” paradigm of Moitra, and can give fast local-lemma-based sampling algorithms. As concrete applications of our technique, we give the current best almost-uniform samplers for hypergraph colorings and for CNF solutions. @InProceedings{STOC21p1565, author = {Weiming Feng and Kun He and Yitong Yin}, title = {Sampling Constraint Satisfaction Solutions in the Local Lemma Regime}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1565-1578}, doi = {10.1145/3406325.3451101}, year = {2021}, }
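For contrast with the Markov chain approach: the trivial exactly-uniform sampler is rejection sampling, whose expected running time blows up when satisfying assignments are rare, which is what makes locally accurate samplers in the local lemma regime interesting. A naive baseline sketch (the literal encoding is our own convention):

```python
import random

def rejection_sample_cnf(n_vars, clauses, rng):
    """Exactly uniform sampling of CNF solutions by rejection:
    draw a uniform random assignment and retry until every clause
    is satisfied. A clause is a list of literals, where +i means
    variable x_i and -i means its negation (1-indexed).
    Expected time is exponential when solutions are sparse."""
    while True:
        a = [rng.random() < 0.5 for _ in range(n_vars)]
        if all(any(a[l - 1] if l > 0 else not a[-l - 1] for l in c)
               for c in clauses):
            return a
```

Conditioning a uniform assignment on acceptance yields the exactly uniform distribution over solutions, but the acceptance probability can be 2^{-Ω(n)}; the paper's point is to get almost-uniform samples in polynomial time under local lemma conditions instead.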

Feng, Yiding 
STOC '21: "Revelation Gap for Pricing ..."
Revelation Gap for Pricing from Samples
Yiding Feng, Jason D. Hartline, and Yingkai Li (Northwestern University, USA) This paper considers prior-independent mechanism design, in which a single mechanism is designed to achieve approximately optimal performance on every prior distribution from a given class. Most results in this literature focus on mechanisms with truth-telling equilibria, a.k.a., truthful mechanisms. Feng and Hartline [FOCS 2018] introduce the revelation gap to quantify the loss of the restriction to truthful mechanisms. We solve a main open question left in Feng and Hartline [FOCS 2018]; namely, we identify a nontrivial revelation gap for revenue maximization. Our analysis focuses on the canonical problem of selling a single item to a single agent with only access to a single sample from the agent's valuation distribution. We identify the sample-bid mechanism (a simple non-truthful mechanism) and upper-bound its prior-independent approximation ratio by 1.835 (resp. 1.296) for regular (resp. MHR) distributions. We further prove that no truthful mechanism can achieve prior-independent approximation ratio better than 1.957 (resp. 1.543) for regular (resp. MHR) distributions. Thus, a nontrivial revelation gap is shown as the sample-bid mechanism outperforms the optimal prior-independent truthful mechanism. On the hardness side, we prove that no (possibly non-truthful) mechanism can achieve prior-independent approximation ratio better than 1.073 even for uniform distributions. @InProceedings{STOC21p1438, author = {Yiding Feng and Jason D. Hartline and Yingkai Li}, title = {Revelation Gap for Pricing from Samples}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1438-1451}, doi = {10.1145/3406325.3451057}, year = {2021}, }

Ferreira Pinto Jr., Renato 
STOC '21: "VC Dimension and Distribution-Free ..."
VC Dimension and Distribution-Free Sample-Based Testing
Eric Blais, Renato Ferreira Pinto Jr., and Nathaniel Harms (University of Waterloo, Canada; Google, Canada) We consider the problem of determining which classes of functions can be tested more efficiently than they can be learned, in the distribution-free sample-based model that corresponds to the standard PAC learning setting. Our main result shows that while VC dimension by itself does not always provide tight bounds on the number of samples required to test a class of functions in this model, it can be combined with a closely related variant that we call “lower VC” (or LVC) dimension to obtain strong lower bounds on this sample complexity. We use this result to obtain strong and in many cases nearly optimal bounds on the sample complexity for testing unions of intervals, halfspaces, intersections of halfspaces, polynomial threshold functions, and decision trees. Conversely, we show that two natural classes of functions, juntas and monotone functions, can be tested with a number of samples that is polynomially smaller than the number of samples required for PAC learning. Finally, we also use the connection between VC dimension and property testing to establish new lower bounds for testing radius clusterability and testing feasibility of linear constraint systems. @InProceedings{STOC21p504, author = {Eric Blais and Renato Ferreira Pinto Jr. and Nathaniel Harms}, title = {VC Dimension and Distribution-Free Sample-Based Testing}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {504-517}, doi = {10.1145/3406325.3451104}, year = {2021}, }
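The central notion can be checked by brute force on tiny examples: the VC dimension of a class is the size of the largest point set it shatters, i.e. on which it realizes every possible labelling. An illustrative exhaustive check, exponential in the number of points and meant for intuition only (names are ours, and this is unrelated to the paper's LVC machinery):

```python
from itertools import combinations

def vc_dimension(points, concepts):
    """Brute-force VC dimension: the largest d such that some
    d-subset of points is shattered. Each concept is given as the
    set of points it labels 1; a subset S is shattered when the
    concepts realize all 2^|S| labellings of S."""
    d = 0
    for size in range(1, len(points) + 1):
        for S in combinations(points, size):
            patterns = {tuple(p in c for p in S) for c in concepts}
            if len(patterns) == 2 ** size:
                d = size
    return d
```

For instance, intervals over the domain {1, 2, 3} shatter the pair {1, 3} (every labelling of the pair is some interval's trace) but cannot realize the labelling "in, out, in" on all three points, so their VC dimension is 2.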

Filtser, Arnold 
STOC '21: "Clan Embeddings into Trees, ..."
Clan Embeddings into Trees, and Low Treewidth Graphs
Arnold Filtser and Hung Le (Columbia University, USA; University of Massachusetts, USA) In low-distortion metric embeddings, the goal is to embed a “hard” host metric space into a “simpler” target space while approximately preserving pairwise distances. A highly desirable target space is that of a tree metric. Unfortunately, such an embedding will result in a huge distortion. A celebrated bypass to this problem is stochastic embedding with logarithmic expected distortion. Another bypass is Ramsey-type embedding, where the distortion guarantee applies only to a subset of the points. However, both these solutions fail to provide an embedding into a single tree with a worst-case distortion guarantee on all pairs. In this paper, we propose a novel third bypass called clan embedding. Here each point x is mapped to a subset of points f(x), called a clan, with a special chief point χ(x)∈ f(x). The clan embedding has multiplicative distortion t if for every pair (x,y) some copy y′∈ f(y) in the clan of y is close to the chief of x: min_{y′∈ f(y)}d(y′,χ(x))≤ t· d(x,y). Our first result is a clan embedding into a tree with multiplicative distortion O(logn/є) such that each point has 1+є copies (in expectation). In addition, we provide a “spanning” version of this theorem for graphs and use it to devise the first compact routing scheme with constant size routing tables. We then focus on minor-free graphs of diameter parameterized by D, which were known to be stochastically embeddable into bounded treewidth graphs with expected additive distortion є D. We devise Ramsey-type embedding and clan embedding analogs of the stochastic embedding. We use these embeddings to construct the first (bicriteria quasipolynomial time) approximation scheme for the metric ρ-dominating set and metric ρ-independent set problems in minor-free graphs. 
@InProceedings{STOC21p342, author = {Arnold Filtser and Hung Le}, title = {Clan Embeddings into Trees, and Low Treewidth Graphs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {342-355}, doi = {10.1145/3406325.3451043}, year = {2021}, }

Fischer, Nick 
STOC '21: "Sparse Nonnegative Convolution ..."
Sparse Nonnegative Convolution Is Equivalent to Dense Nonnegative Convolution
Karl Bringmann, Nick Fischer, and Vasileios Nakos (Saarland University, Germany; MPI-INF, Germany) Computing the convolution A ⋆ B of two length-n vectors A,B is a ubiquitous computational primitive, with applications in a variety of disciplines. Within theoretical computer science, applications range from string problems to Knapsack-type problems, and from 3SUM to All-Pairs Shortest Paths. These applications often come in the form of nonnegative convolution, where the entries of A,B are nonnegative integers. The classical algorithm to compute A⋆ B uses the Fast Fourier Transform (FFT) and runs in time O(n logn). However, in many cases A and B might satisfy sparsity conditions, and hence one could hope for significant gains compared to the standard FFT algorithm. The ideal goal would be an O(k logk)-time algorithm, where k is the number of nonzero elements in the output, i.e., the size of the support of A ⋆ B. This problem is referred to as sparse nonnegative convolution, and has received a considerable amount of attention in the literature; the fastest algorithms to date run in time O(k log^{2} n). The main result of this paper is the first O(k logk)-time algorithm for sparse nonnegative convolution. Our algorithm is randomized and assumes that the length n and the largest entry of A and B are subexponential in k. Surprisingly, we can phrase our algorithm as a reduction from the sparse case to the dense case of nonnegative convolution, showing that, under some mild assumptions, sparse nonnegative convolution is equivalent to dense nonnegative convolution for constant-error randomized algorithms. Specifically, if D(n) is the time to convolve two nonnegative length-n vectors with success probability 2/3, and S(k) is the time to convolve two nonnegative vectors with output size k with success probability 2/3, then S(k) = O(D(k) + k (loglogk)^{2}). 
Our approach uses a variety of new techniques in combination with some old machinery from linear sketching and structured linear algebra, as well as new insights on linear hashing, the most classical hash function. @InProceedings{STOC21p1711, author = {Karl Bringmann and Nick Fischer and Vasileios Nakos}, title = {Sparse Nonnegative Convolution Is Equivalent to Dense Nonnegative Convolution}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1711-1724}, doi = {10.1145/3406325.3451090}, year = {2021}, }
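The sparse setting is easy to state in code: when the vectors are given by their nonzero supports, the naive baseline touches every pair of nonzero entries, costing O(|A|·|B|) time. A minimal sketch of that baseline (not the paper's O(k log k) reduction; the dict-of-supports representation is our own choice):

```python
from collections import defaultdict

def sparse_convolution(A, B):
    """Convolve two sparse nonnegative vectors represented as
    {index: value} dicts, so (A * B)[k] = sum over i+j=k of A[i]*B[j].
    Touches only nonzero pairs; the result's keys are exactly the
    support of the output."""
    C = defaultdict(int)
    for i, a in A.items():
        for j, b in B.items():
            C[i + j] += a * b
    return dict(C)
```

Note the quadratic pair count can far exceed the output support size k (many pairs i+j can collide); the paper's result gets the output in time near-linear in k alone, matching the dense FFT bound with n replaced by k.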

Fusco, Federico 
STOC '21: "Efficient Two-Sided Markets ..."
Efficient Two-Sided Markets with Limited Information
Paul Dütting, Federico Fusco, Philip Lazos, Stefano Leonardi, and Rebecca Reiffenhäuser (Google Research, Switzerland; Sapienza University of Rome, Italy) A celebrated impossibility result by Myerson and Satterthwaite (1983) shows that any truthful mechanism for two-sided markets that maximizes social welfare must run a deficit, making it necessary to relax welfare efficiency and to use approximation mechanisms. Such mechanisms in general make extensive use of Bayesian priors. In this work, we investigate a question of increasing theoretical and practical importance: how much prior information is required to design mechanisms with near-optimal approximations? Our first contribution is a more general impossibility result stating that no meaningful approximation is possible without any prior information, expanding the famous impossibility result of Myerson and Satterthwaite. Our second contribution is that one single sample (one number per item), arguably a minimum-possible amount of prior information, from each seller distribution is sufficient for a large class of two-sided markets. We prove matching upper and lower bounds on the best approximation that can be obtained with one single sample for subadditive buyers and additive sellers, regardless of computational considerations. Our third contribution is the design of computationally efficient black-box reductions that turn any one-sided mechanism into a two-sided mechanism with a small loss in the approximation, while using only one single sample from each seller. On the way, our black-box-type mechanisms deliver several interesting positive results in their own right, often beating even the state of the art that uses full prior information. 
@InProceedings{STOC21p1452, author = {Paul Dütting and Federico Fusco and Philip Lazos and Stefano Leonardi and Rebecca Reiffenhäuser}, title = {Efficient Two-Sided Markets with Limited Information}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1452-1465}, doi = {10.1145/3406325.3451076}, year = {2021}, }

Gabriel, Franck 
STOC '21: "Neural Tangent Kernel: Convergence ..."
Neural Tangent Kernel: Convergence and Generalization in Neural Networks (Invited Paper)
Arthur Jacot, Franck Gabriel, and Clément Hongler (EPFL, Switzerland) The Neural Tangent Kernel is a new way to understand gradient descent in deep neural networks, connecting them with kernel methods. In this talk, I'll introduce this formalism, give a number of results on the Neural Tangent Kernel, and explain how they give us insight into the dynamics of neural networks during training and into their generalization features. @InProceedings{STOC21p6, author = {Arthur Jacot and Franck Gabriel and Clément Hongler}, title = {Neural Tangent Kernel: Convergence and Generalization in Neural Networks (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {6-6}, doi = {10.1145/3406325.3465355}, year = {2021}, }

Garg, Jugal 
STOC '21: "Approximating Nash Social ..."
Approximating Nash Social Welfare under Rado Valuations
Jugal Garg, Edin Husić, and László A. Végh (University of Illinois at Urbana-Champaign, USA; London School of Economics and Political Science, UK) We consider the problem of approximating maximum Nash social welfare (NSW) while allocating a set of indivisible items to n agents. The NSW is a popular objective that provides a balanced tradeoff between the often conflicting requirements of fairness and efficiency, defined as the weighted geometric mean of the agents’ valuations. For the symmetric additive case of the problem, where agents have the same weight with additive valuations, the first constant-factor approximation algorithm was obtained in 2015. Subsequent work has obtained constant-factor approximation algorithms for the symmetric case under mild generalizations of additive, and O(n)-approximation algorithms for subadditive valuations and for the asymmetric case. In this paper, we make significant progress towards both symmetric and asymmetric NSW problems. We present the first constant-factor approximation algorithm for the symmetric case under Rado valuations. Rado valuations form a general class of valuation functions that arise from maximum cost independent matching problems, including as special cases assignment (OXS) valuations and weighted matroid rank functions. Furthermore, our approach also gives the first constant-factor approximation algorithm for the asymmetric case under Rado valuations, provided that the maximum ratio between the weights is bounded by a constant. @InProceedings{STOC21p1412, author = {Jugal Garg and Edin Husić and László A. Végh}, title = {Approximating Nash Social Welfare under Rado Valuations}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1412-1425}, doi = {10.1145/3406325.3451031}, year = {2021}, }
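The objective itself is easy to state in code: in the symmetric additive case, the NSW of an allocation is the geometric mean of the agents' utilities, and tiny instances can be solved exhaustively. An illustrative brute force, exponential in the number of items and unrelated to the paper's approximation algorithm (names are ours):

```python
from itertools import product
from math import prod

def max_nsw(valuations, items):
    """Exact symmetric Nash social welfare for additive valuations,
    by trying every assignment of items to agents and returning the
    best geometric mean of utilities. valuations[a][i] is agent a's
    value for item i; items is the list of item indices to allocate."""
    n = len(valuations)
    best = 0.0
    for assign in product(range(n), repeat=len(items)):
        utils = [0] * n
        for item, agent in zip(items, assign):
            utils[agent] += valuations[agent][item]
        best = max(best, prod(utils) ** (1.0 / n))
    return best
```

The geometric mean illustrates the fairness/efficiency tradeoff the abstract describes: giving both items to one agent yields NSW 0, while the balanced allocation maximizes the product of utilities.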

Gartland, Peter 
STOC '21: "Finding Large Induced Sparse ..."
Finding Large Induced Sparse Subgraphs in C_{>t}-Free Graphs in Quasipolynomial Time
Peter Gartland, Daniel Lokshtanov, Marcin Pilipczuk, Michał Pilipczuk, and Paweł Rzążewski (University of California at Santa Barbara, USA; University of Warsaw, Poland; Warsaw University of Technology, Poland) For an integer t, a graph G is called C_{>t}-free if G does not contain any induced cycle on more than t vertices. We prove the following statement: for every pair of integers d and t and a statement φ, there exists an algorithm that, given an n-vertex C_{>t}-free graph G with weights on vertices, finds in time n^{O(log^{3} n)} a maximum-weight vertex subset S such that G[S] has degeneracy at most d and satisfies φ. The running time can be improved to n^{O(log^{2} n)} assuming G is P_{t}-free, that is, G does not contain an induced path on t vertices. This expands the recent results of the authors [FOCS 2020 and SOSA 2021] on the Maximum Weight Independent Set problem on P_{t}-free graphs in two directions: by encompassing the more general setting of C_{>t}-free graphs, and by being applicable to a much wider variety of problems, such as Maximum Weight Induced Forest or Maximum Weight Induced Planar Graph. @InProceedings{STOC21p330, author = {Peter Gartland and Daniel Lokshtanov and Marcin Pilipczuk and Michał Pilipczuk and Paweł Rzążewski}, title = {Finding Large Induced Sparse Subgraphs in <i>C<sub>>t</sub></i>-Free Graphs in Quasipolynomial Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {330-341}, doi = {10.1145/3406325.3451034}, year = {2021}, }

Gawrychowski, Paweł 
STOC '21: "Fully Dynamic Approximation ..."
Fully Dynamic Approximation of LIS in Polylogarithmic Time
Paweł Gawrychowski and Wojciech Janczewski (University of Wrocław, Poland) We revisit the problem of maintaining the longest increasing subsequence (LIS) of an array under (i) inserting an element, and (ii) deleting an element of an array. In a recent breakthrough, Mitzenmacher and Seddighin [STOC 2020] designed an algorithm that maintains an O((1/є)^{O(1/є)})-approximation of LIS under both operations with worst-case update time Õ(n^{є}), for any constant є>0 (Õ hides factors polynomial in log n, where n is the length of the input). We exponentially improve on their result by designing an algorithm that maintains a (1+є)-approximation of LIS under both operations with worst-case update time Õ(є^{−5}). Instead of working with the grid packing technique introduced by Mitzenmacher and Seddighin, we take a different approach building on a new tool that might be of independent interest: LIS sparsification. A particularly interesting consequence of our result is an improved solution for the so-called Erdős–Szekeres partitioning, in which we seek a partition of a given permutation of {1,2,…,n} into O(√n) monotone subsequences. This problem has been repeatedly stated as one of the natural examples in which we see a large gap between the decision-tree complexity and algorithmic complexity. The result of Mitzenmacher and Seddighin implies an O(n^{1+є}) time solution for this problem, for any є>0. Our algorithm (in fact, its simpler decremental version) further improves this to Õ(n). @InProceedings{STOC21p654, author = {Paweł Gawrychowski and Wojciech Janczewski}, title = {Fully Dynamic Approximation of LIS in Polylogarithmic Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {654--667}, doi = {10.1145/3406325.3451137}, year = {2021}, } Publisher's Version 
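As background for the Erdős–Szekeres partitioning mentioned in the abstract, here is a minimal static sketch (not the paper's dynamic algorithm): patience-sorting LIS in O(n log n), plus the classical peel-and-cover partition into O(√n) monotone subsequences. All function names are illustrative.

```python
import bisect
import math

def lis_length(seq):
    """Longest increasing subsequence length via patience sorting, O(n log n)."""
    tails = []  # tails[k] = smallest tail of an increasing subsequence of length k+1
    for x in seq:
        k = bisect.bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)
        else:
            tails[k] = x
    return len(tails)

def extract_lis(seq):
    """Return (one longest increasing subsequence, the remaining elements)."""
    n = len(seq)
    tails, tail_idx, parent = [], [], [-1] * n
    for i, x in enumerate(seq):
        k = bisect.bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)
            tail_idx.append(i)
        else:
            tails[k] = x
            tail_idx[k] = i
        parent[i] = tail_idx[k - 1] if k > 0 else -1
    lis_idx = set()
    i = tail_idx[-1]
    while i != -1:          # walk back-pointers to recover one LIS
        lis_idx.add(i)
        i = parent[i]
    lis = [seq[i] for i in sorted(lis_idx)]
    rest = [seq[i] for i in range(n) if i not in lis_idx]
    return lis, rest

def erdos_szekeres_partition(perm):
    """Partition perm into O(sqrt(n)) monotone subsequences: peel off long
    increasing subsequences, then greedily cover the short-LIS remainder
    with decreasing subsequences (Dilworth)."""
    n = len(perm)
    threshold = max(1, math.isqrt(n))
    parts, rest = [], list(perm)
    while rest and lis_length(rest) >= threshold:
        lis, rest = extract_lis(rest)   # each peel removes >= threshold elements
        parts.append(lis)
    dec_parts = []                      # remainder has LIS < threshold
    for x in rest:
        for p in dec_parts:             # first-fit onto a decreasing pile
            if p[-1] > x:
                p.append(x)
                break
        else:
            dec_parts.append([x])
    return parts + dec_parts
```

The peel phase runs at most n/√n times and the greedy cover uses fewer than √n decreasing piles, giving O(√n) parts in total.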

Gay, Romain 
STOC '21: "Indistinguishability Obfuscation ..."
Indistinguishability Obfuscation from Circular Security
Romain Gay and Rafael Pass (IBM Research, Switzerland; Cornell Tech, USA) We show the existence of indistinguishability obfuscators (iO) for general circuits assuming subexponential security of: (a) the Learning with Errors (LWE) assumption (with subexponential modulus-to-noise ratio); (b) a circular security conjecture regarding the Gentry-Sahai-Waters (GSW) encryption scheme and a Packed version of Regev's encryption scheme. The circular security conjecture states that a notion of leakage-resilient security, that we prove is satisfied by GSW assuming LWE, is retained in the presence of an encrypted key-cycle involving GSW and Packed Regev. @InProceedings{STOC21p736, author = {Romain Gay and Rafael Pass}, title = {Indistinguishability Obfuscation from Circular Security}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {736--749}, doi = {10.1145/3406325.3451070}, year = {2021}, } Publisher's Version 

Gayen, Sutanu 
STOC '21: "Near-Optimal Learning of Tree-Structured ..."
Near-Optimal Learning of Tree-Structured Distributions by Chow-Liu
Arnab Bhattacharyya, Sutanu Gayen, Eric Price, and N. V. Vinodchandran (National University of Singapore, Singapore; University of Texas at Austin, USA; University of Nebraska-Lincoln, USA) We provide finite sample guarantees for the classical Chow-Liu algorithm (IEEE Trans. Inform. Theory, 1968) to learn a tree-structured graphical model of a distribution. For a distribution P on Σ^{n} and a tree T on n nodes, we say T is an ε-approximate tree for P if there is a T-structured distribution Q such that D(P ‖ Q) is at most ε more than that of the best possible tree-structured distribution for P. We show that if P itself is tree-structured, then the Chow-Liu algorithm with the plug-in estimator for mutual information with O(|Σ|^{3} n ε^{−1}) i.i.d. samples outputs an ε-approximate tree for P with constant probability. In contrast, for a general P (which may not be tree-structured), Ω(n^{2}ε^{−2}) samples are necessary to find an ε-approximate tree. Our upper bound is based on a new conditional independence tester that addresses an open problem posed by Canonne, Diakonikolas, Kane, and Stewart (STOC, 2018): we prove that for three random variables X,Y,Z each over Σ, testing if I(X; Y ∣ Z) is 0 or ≥ ε is possible with O(|Σ|^{3}/ε) samples. Finally, we show that for a specific tree T, with O(|Σ|^{2} n ε^{−1}) samples from a distribution P over Σ^{n}, one can efficiently learn the closest T-structured distribution in KL divergence by applying the add-1 estimator at each node. @InProceedings{STOC21p147, author = {Arnab Bhattacharyya and Sutanu Gayen and Eric Price and N. V. Vinodchandran}, title = {Near-Optimal Learning of Tree-Structured Distributions by Chow-Liu}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {147--160}, doi = {10.1145/3406325.3451066}, year = {2021}, } Publisher's Version Info 
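The classical Chow-Liu recipe the abstract analyzes (plug-in mutual-information estimates followed by a maximum spanning tree) can be sketched as follows. This is the textbook procedure, not the paper's finite-sample analysis, and all names are illustrative.

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information(samples, i, j):
    """Plug-in (empirical) mutual information between coordinates i and j."""
    n = len(samples)
    pij = Counter((s[i], s[j]) for s in samples)
    pi = Counter(s[i] for s in samples)
    pj = Counter(s[j] for s in samples)
    mi = 0.0
    for (a, b), c in pij.items():
        p_ab = c / n
        # p_ab / (p_a * p_b) = c * n / (count_a * count_b)
        mi += p_ab * math.log(p_ab * n * n / (pi[a] * pj[b]))
    return mi

def chow_liu_tree(samples, num_vars):
    """Maximum spanning tree under empirical pairwise mutual information
    (Kruskal's algorithm with a simple union-find)."""
    edges = sorted(
        ((mutual_information(samples, i, j), i, j)
         for i, j in combinations(range(num_vars), 2)),
        reverse=True)
    parent = list(range(num_vars))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    tree = []
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    return tree
```

On samples where coordinate 1 copies coordinate 0 and coordinate 2 is balanced independently, the heaviest edge is (0, 1), so it always enters the tree.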

Ghaffari, Mohsen 
STOC '21: "Hop-Constrained Oblivious ..."
Hop-Constrained Oblivious Routing
Mohsen Ghaffari, Bernhard Haeupler, and Goran Zuzic (ETH Zurich, Switzerland; Carnegie Mellon University, USA) We prove the existence of an oblivious routing scheme that is poly(log n)-competitive in terms of (congestion + dilation), thus resolving a well-known question in oblivious routing. Concretely, consider an undirected network and a set of packets each with its own source and destination. The objective is to choose a path for each packet, from its source to its destination, so as to minimize (congestion + dilation), defined as follows: The dilation is the maximum path hop-length, and the congestion is the maximum number of paths that include any single edge. The routing scheme obliviously and randomly selects a path for each packet independent of (the existence of) the other packets. Despite this obliviousness, the selected paths have (congestion + dilation) within a poly(log n) factor of the best possible value. More precisely, for any integer hop-bound h, this oblivious routing scheme selects paths of length at most h · poly(log n) and is poly(log n)-competitive in terms of congestion in comparison to the best possible congestion achievable via paths of length at most h hops. These paths can be sampled in polynomial time. This result can be viewed as an analogue of the celebrated oblivious routing results of Räcke [FOCS 2002, STOC 2008], which are O(log n)-competitive in terms of congestion, but are not competitive in terms of dilation. @InProceedings{STOC21p1208, author = {Mohsen Ghaffari and Bernhard Haeupler and Goran Zuzic}, title = {Hop-Constrained Oblivious Routing}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1208--1220}, doi = {10.1145/3406325.3451098}, year = {2021}, } Publisher's Version 

Gharan, Shayan Oveis 
STOC '21: "Log-Concave Polynomials IV: ..."
Log-Concave Polynomials IV: Approximate Exchange, Tight Mixing Times, and Near-Optimal Sampling of Forests
Nima Anari, Kuikui Liu, Shayan Oveis Gharan, Cynthia Vinzant, and Thuy-Duong Vuong (Stanford University, USA; University of Washington, USA; North Carolina State University, USA) We prove tight mixing time bounds for natural random walks on bases of matroids, determinantal distributions, and more generally distributions associated with log-concave polynomials. For a matroid of rank k on a ground set of n elements, or more generally distributions associated with log-concave polynomials of homogeneous degree k on n variables, we show that the down-up random walk, started from an arbitrary point in the support, mixes in time O(k log k). Our bound has no dependence on n or the starting point, unlike the previous analyses of Anari et al. (STOC 2019) and Cryan et al. (FOCS 2019), and is tight up to constant factors. The main new ingredient is a property we call approximate exchange, a generalization of well-studied exchange properties for matroids and valuated matroids, which may be of independent interest. In particular, given a distribution µ over size-k subsets of [n], our approximate exchange property implies that a simple local search algorithm gives a k^{O(k)}-approximation of max_{S} µ(S) when µ is generated by a log-concave polynomial, and that greedy gives the same approximation ratio when µ is strongly Rayleigh. As an application, we show how to leverage down-up random walks to approximately sample random forests or random spanning trees in a graph with n edges in time O(n log^{2} n). The best known result for sampling random forests was an FPAUS with high polynomial runtime recently found by Anari et al. (STOC 2019) and Cryan et al. (FOCS 2019). For spanning trees, we improve on the almost-linear time algorithm by Schild (STOC 2018). Our analysis works on weighted graphs too, and is the first to achieve nearly-linear running time for these problems. 
Our algorithms can be naturally extended to support approximately sampling from random forests of size between k_{1} and k_{2} in time O(n log^{2} n), for fixed parameters k_{1}, k_{2}. @InProceedings{STOC21p408, author = {Nima Anari and Kuikui Liu and Shayan Oveis Gharan and Cynthia Vinzant and Thuy-Duong Vuong}, title = {Log-Concave Polynomials IV: Approximate Exchange, Tight Mixing Times, and Near-Optimal Sampling of Forests}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {408--420}, doi = {10.1145/3406325.3451091}, year = {2021}, } Publisher's Version Info STOC '21: "A (Slightly) Improved Approximation ..." A (Slightly) Improved Approximation Algorithm for Metric TSP Anna R. Karlin, Nathan Klein, and Shayan Oveis Gharan (University of Washington, USA) For some ε > 10^{−36} we give a randomized (3/2 − ε)-approximation algorithm for metric TSP. @InProceedings{STOC21p32, author = {Anna R. Karlin and Nathan Klein and Shayan Oveis Gharan}, title = {A (Slightly) Improved Approximation Algorithm for Metric TSP}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {32--45}, doi = {10.1145/3406325.3451009}, year = {2021}, } Publisher's Version 
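The down-up random walk analyzed in the log-concave polynomials paper above has a simple generic form: drop a uniformly random element of the current size-k set, then re-add an element with probability proportional to µ of the resulting set. A sketch, with the uniform distribution over 3-subsets standing in for a log-concave µ (the function names and example are illustrative):

```python
import random

def down_up_step(state, ground, mu):
    """One down-up step for a distribution mu over k-subsets of `ground`:
    drop a uniform element (k -> k-1), then re-add one element with
    probability proportional to mu of the resulting k-subset."""
    s = set(state)
    s.remove(random.choice(sorted(s)))                       # down step
    candidates = [e for e in ground if e not in s and mu(s | {e}) > 0]
    weights = [mu(s | {e}) for e in candidates]
    r = random.random() * sum(weights)
    for e, w in zip(candidates, weights):                    # up step
        r -= w
        if r <= 0:
            s.add(e)
            break
    return frozenset(s)

# Example: mu uniform over all 3-subsets of {0..5}; by the paper's bound
# the walk mixes in O(k log k) steps from any starting subset.
random.seed(0)
ground = range(6)
state = frozenset({0, 1, 2})
for _ in range(100):
    state = down_up_step(state, ground, lambda S: 1.0)
```

For a matroid, µ would be supported on bases, and the up step would only consider elements whose addition yields a basis.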

Ghazi, Badih 
STOC '21: "Sample-Efficient Proper PAC ..."
Sample-Efficient Proper PAC Learning with Approximate Differential Privacy
Badih Ghazi, Noah Golowich, Ravi Kumar, and Pasin Manurangsi (Google Research, USA; Massachusetts Institute of Technology, USA) In this paper we prove that the sample complexity of properly learning a class of Littlestone dimension d with approximate differential privacy is Õ(d^{6}), ignoring privacy and accuracy parameters. This result answers a question of Bun et al. (FOCS 2020) by improving upon their upper bound of 2^{O(d)} on the sample complexity. Prior to our work, finiteness of the sample complexity for privately learning a class of finite Littlestone dimension was only known for improper private learners, and the fact that our learner is proper answers another question of Bun et al., which was also asked by Bousquet et al. (NeurIPS 2020). Using machinery developed by Bousquet et al., we then show that the sample complexity of sanitizing a binary hypothesis class is at most polynomial in its Littlestone dimension and dual Littlestone dimension. This implies that a class is sanitizable if and only if it has finite Littlestone dimension. An important ingredient of our proofs is a new property of binary hypothesis classes that we call irreducibility, which may be of independent interest. @InProceedings{STOC21p183, author = {Badih Ghazi and Noah Golowich and Ravi Kumar and Pasin Manurangsi}, title = {Sample-Efficient Proper PAC Learning with Approximate Differential Privacy}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {183--196}, doi = {10.1145/3406325.3451028}, year = {2021}, } Publisher's Version 

Ghoshal, Suprovat 
STOC '21: "Hardness of Learning DNFs ..."
Hardness of Learning DNFs using Halfspaces
Suprovat Ghoshal and Rishi Saket (University of Michigan, USA; IBM Research, India) The problem of learning t-term DNF formulas (for t = O(1)) has been studied extensively in the PAC model since its introduction by Valiant (STOC 1984). A t-term DNF can be efficiently learnt using a t-term DNF only if t = 1, i.e., when it is an AND, while even weakly learning a 2-term DNF using a constant-term DNF was shown to be NP-hard by Khot and Saket (FOCS 2008). On the other hand, Feldman, Guruswami, Raghavendra and Wu (FOCS 2009) showed the hardness of weakly learning a noisy AND using a halfspace (the latter being a generalization of an AND), while Khot and Saket (STOC 2008) showed that an intersection of two halfspaces is hard to weakly learn using any function of constantly many halfspaces. The question of whether a 2-term DNF is efficiently learnable using 2 or constantly many halfspaces remained open. In this work we answer this question in the negative by showing the hardness of weakly learning a 2-term DNF as well as a noisy AND using any function of a constant number of halfspaces. In particular we prove the following. For any constants ν, ζ > 0 and ℓ ∈ N, given a distribution over point-value pairs {0,1}^{n} × {0,1}, it is NP-hard to decide whether: (i) YES Case. There is a 2-term DNF that classifies all the points of the distribution, and an AND that classifies at least a 1−ζ fraction of the points correctly. (ii) NO Case. Any boolean function depending on at most ℓ halfspaces classifies at most a 1/2 + ν fraction of the points of the distribution correctly. Our result generalizes and strengthens the previous best results mentioned above on the hardness of learning a 2-term DNF, learning an intersection of two halfspaces, and learning a noisy AND. 
@InProceedings{STOC21p467, author = {Suprovat Ghoshal and Rishi Saket}, title = {Hardness of Learning DNFs using Halfspaces}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {467--480}, doi = {10.1145/3406325.3451067}, year = {2021}, } Publisher's Version 

Giakkoupis, George 
STOC '21: "Efficient Randomized DCAS ..."
Efficient Randomized DCAS
George Giakkoupis, Mehrdad Jafari Giv, and Philipp Woelfel (Inria, France; University of Rennes, France; CNRS, France; IRISA, France; University of Calgary, Canada) Double Compare-And-Swap (DCAS) is a tremendously useful synchronization primitive, which is also notoriously difficult to implement efficiently from objects that are provided by hardware. We present a randomized implementation of DCAS with O(log n) expected amortized step complexity against the oblivious adversary, where n is the number of processes in the system. This is the only algorithm to date that achieves sublinear step complexity. We achieve that by first implementing two novel algorithms as building blocks. One is a mechanism that allows processes to repeatedly agree on a random value among multiple proposed ones, and the other one is a restricted bipartite version of DCAS. @InProceedings{STOC21p1221, author = {George Giakkoupis and Mehrdad Jafari Giv and Philipp Woelfel}, title = {Efficient Randomized DCAS}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1221--1234}, doi = {10.1145/3406325.3451133}, year = {2021}, } Publisher's Version 
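For readers unfamiliar with the primitive, here is the sequential specification of DCAS as a lock-based reference model: atomically compare two memory words against expected values and, only if both match, write both new values. This is only the specification; the paper's contribution is a randomized lock-free implementation, which this sketch does not attempt. Class and method names are illustrative.

```python
import threading

class DCASMemory:
    """Lock-based reference model of Double Compare-And-Swap semantics."""

    def __init__(self, size):
        self.cells = [None] * size
        self._lock = threading.Lock()

    def dcas(self, a, exp_a, new_a, b, exp_b, new_b):
        """Atomically: if cells[a] == exp_a and cells[b] == exp_b,
        write new_a and new_b and return True; otherwise change
        nothing and return False."""
        with self._lock:
            if self.cells[a] == exp_a and self.cells[b] == exp_b:
                self.cells[a] = new_a
                self.cells[b] = new_b
                return True
            return False

mem = DCASMemory(2)
mem.cells[0], mem.cells[1] = "x", "y"
print(mem.dcas(0, "x", "x2", 1, "y", "y2"))       # → True  (both matched)
print(mem.dcas(0, "x", "x3", 1, "y2", "y3"))      # → False (cell 0 changed)
```

The all-or-nothing update across two independent words is exactly what makes DCAS hard to build from single-word hardware primitives.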

Gilyén, András 
STOC '21: "(Sub)Exponential Advantage ..."
(Sub)Exponential Advantage of Adiabatic Quantum Computation with No Sign Problem
András Gilyén, Matthew B. Hastings, and Umesh Vazirani (California Institute of Technology, USA; Microsoft Quantum, USA; Microsoft Research, USA; University of California at Berkeley, USA) We demonstrate the possibility of (sub)exponential quantum speedup via a quantum algorithm that follows an adiabatic path of a gapped Hamiltonian with no sign problem. The Hamiltonian that exhibits this speedup comes from the adjacency matrix of an undirected graph whose vertices are labeled by n-bit strings, and we can view the adiabatic evolution as an efficient O(poly(n))-time quantum algorithm for finding a specific “EXIT” vertex in the graph given the “ENTRANCE” vertex. On the other hand, we show that if the graph is given via an adjacency-list oracle, there is no classical algorithm that finds the “EXIT” with probability greater than exp(−n^{δ}) using at most exp(n^{δ}) queries for δ = 1/5 − o(1). Our construction of the graph is somewhat similar to the “welded-trees” construction of Childs et al., but uses additional ideas of Hastings for achieving a spectral gap and a short adiabatic path. @InProceedings{STOC21p1357, author = {András Gilyén and Matthew B. Hastings and Umesh Vazirani}, title = {(Sub)Exponential Advantage of Adiabatic Quantum Computation with No Sign Problem}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1357--1369}, doi = {10.1145/3406325.3451060}, year = {2021}, } Publisher's Version 

Giv, Mehrdad Jafari 
STOC '21: "Efficient Randomized DCAS ..."
Efficient Randomized DCAS
George Giakkoupis, Mehrdad Jafari Giv, and Philipp Woelfel (Inria, France; University of Rennes, France; CNRS, France; IRISA, France; University of Calgary, Canada) Double Compare-And-Swap (DCAS) is a tremendously useful synchronization primitive, which is also notoriously difficult to implement efficiently from objects that are provided by hardware. We present a randomized implementation of DCAS with O(log n) expected amortized step complexity against the oblivious adversary, where n is the number of processes in the system. This is the only algorithm to date that achieves sublinear step complexity. We achieve that by first implementing two novel algorithms as building blocks. One is a mechanism that allows processes to repeatedly agree on a random value among multiple proposed ones, and the other one is a restricted bipartite version of DCAS. @InProceedings{STOC21p1221, author = {George Giakkoupis and Mehrdad Jafari Giv and Philipp Woelfel}, title = {Efficient Randomized DCAS}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1221--1234}, doi = {10.1145/3406325.3451133}, year = {2021}, } Publisher's Version 

Goldberg, Paul W. 
STOC '21: "The Complexity of Gradient ..."
The Complexity of Gradient Descent: CLS = PPAD ∩ PLS
John Fearnley, Paul W. Goldberg, Alexandros Hollender, and Rahul Savani (University of Liverpool, UK; University of Oxford, UK) We study search problems that can be solved by performing Gradient Descent on a bounded convex polytopal domain and show that this class is equal to the intersection of two well-known classes: PPAD and PLS. As our main underlying technical contribution, we show that computing a Karush-Kuhn-Tucker (KKT) point of a continuously differentiable function over the domain [0,1]^{2} is PPAD ∩ PLS-complete. This is the first natural problem to be shown complete for this class. Our results also imply that the class CLS (Continuous Local Search), which was defined by Daskalakis and Papadimitriou as a more “natural” counterpart to PPAD ∩ PLS and contains many interesting problems, is itself equal to PPAD ∩ PLS. @InProceedings{STOC21p46, author = {John Fearnley and Paul W. Goldberg and Alexandros Hollender and Rahul Savani}, title = {The Complexity of Gradient Descent: CLS = PPAD ∩ PLS}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {46--59}, doi = {10.1145/3406325.3451052}, year = {2021}, } Publisher's Version 
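A standard way to search for a KKT point on the box, in the spirit of the Gradient Descent problem studied above, is projected gradient descent: a fixed point of the projected step is a KKT point of f on the domain. A minimal sketch; the step size, iteration count, and example function are illustrative.

```python
def project(v, lo=0.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^2 (coordinate-wise clip)."""
    return [min(hi, max(lo, x)) for x in v]

def projected_gradient_descent(grad, x0, step=0.1, iters=1000):
    """Iterate x <- Pi(x - step * grad(x)) on [0,1]^2.
    A fixed point of this map is a KKT point of f on the box."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x = project([x[i] - step * g[i] for i in range(2)])
    return x

# f(x, y) = (x - 2)^2 + (y - 2)^2: the unconstrained minimum (2, 2) lies
# outside the box, so the KKT point is the corner (1, 1), where the
# negative gradient points outward and is absorbed by the constraints.
grad = lambda v: [2 * (v[0] - 2), 2 * (v[1] - 2)]
x = projected_gradient_descent(grad, [0.0, 0.0])
print(x)  # → [1.0, 1.0]
```

For convex f this converges to the constrained minimum; for general smooth f it still stabilizes at a KKT point, which is exactly the solution concept whose complexity the paper pins down.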

Golowich, Noah 
STOC '21: "Sample-Efficient Proper PAC ..."
Sample-Efficient Proper PAC Learning with Approximate Differential Privacy
Badih Ghazi, Noah Golowich, Ravi Kumar, and Pasin Manurangsi (Google Research, USA; Massachusetts Institute of Technology, USA) In this paper we prove that the sample complexity of properly learning a class of Littlestone dimension d with approximate differential privacy is Õ(d^{6}), ignoring privacy and accuracy parameters. This result answers a question of Bun et al. (FOCS 2020) by improving upon their upper bound of 2^{O(d)} on the sample complexity. Prior to our work, finiteness of the sample complexity for privately learning a class of finite Littlestone dimension was only known for improper private learners, and the fact that our learner is proper answers another question of Bun et al., which was also asked by Bousquet et al. (NeurIPS 2020). Using machinery developed by Bousquet et al., we then show that the sample complexity of sanitizing a binary hypothesis class is at most polynomial in its Littlestone dimension and dual Littlestone dimension. This implies that a class is sanitizable if and only if it has finite Littlestone dimension. An important ingredient of our proofs is a new property of binary hypothesis classes that we call irreducibility, which may be of independent interest. @InProceedings{STOC21p183, author = {Badih Ghazi and Noah Golowich and Ravi Kumar and Pasin Manurangsi}, title = {Sample-Efficient Proper PAC Learning with Approximate Differential Privacy}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {183--196}, doi = {10.1145/3406325.3451028}, year = {2021}, } Publisher's Version 

Gonen, Alon 
STOC '21: "Boosting Simple Learners ..."
Boosting Simple Learners
Noga Alon, Alon Gonen, Elad Hazan, and Shay Moran (Princeton University, USA; Tel Aviv University, Israel; OrCam, Israel; Google AI, USA; Technion, Israel; Google Research, Israel) Boosting is a celebrated machine learning approach which is based on the idea of combining weak and moderately inaccurate hypotheses to a strong and accurate one. We study boosting under the assumption that the weak hypotheses belong to a class of bounded capacity. This assumption is inspired by the common convention that weak hypotheses are “rules-of-thumb” from an “easy-to-learn class”. (Schapire and Freund ’12, Shalev-Shwartz and Ben-David ’14.) Formally, we assume the class of weak hypotheses has a bounded VC dimension. We focus on two main questions: (i) Oracle Complexity: How many weak hypotheses are needed in order to produce an accurate hypothesis? We design a novel boosting algorithm and demonstrate that it circumvents a classical lower bound by Freund and Schapire (’95, ’12). Whereas the lower bound shows that Ω(1/γ^{2}) weak hypotheses with γ-margin are sometimes necessary, our new method requires only Õ(1/γ) weak hypotheses, provided that they belong to a class of bounded VC dimension. Unlike previous boosting algorithms which aggregate the weak hypotheses by majority votes, the new boosting algorithm uses more complex (“deeper”) aggregation rules. We complement this result by showing that complex aggregation rules are in fact necessary to circumvent the aforementioned lower bound. (ii) Expressivity: Which tasks can be learned by boosting weak hypotheses from a bounded VC class? Can complex concepts that are “far away” from the class be learned? Towards answering the first question we identify a combinatorial-geometric parameter which captures the expressivity of base classes in boosting. As a corollary we provide an affirmative answer to the second question for many well-studied classes, including halfspaces and decision stumps. 
Along the way, we establish and exploit connections with Discrepancy Theory. @InProceedings{STOC21p481, author = {Noga Alon and Alon Gonen and Elad Hazan and Shay Moran}, title = {Boosting Simple Learners}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {481--489}, doi = {10.1145/3406325.3451030}, year = {2021}, } Publisher's Version 
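The classical majority-vote baseline that the paper improves upon can be sketched with AdaBoost over decision stumps, a bounded-VC weak class. This is the textbook scheme with its weighted majority vote, not the paper's deeper aggregation rules; all names are illustrative.

```python
import math

def stump_predict(s, x):
    """Axis-aligned decision stump s = (feature, threshold, sign)."""
    feat, thresh, sign = s
    return sign if x[feat] <= thresh else -sign

def adaboost(points, labels, rounds=10):
    """AdaBoost: reweight examples, pick the best stump each round,
    and aggregate by a weighted majority vote."""
    n = len(points)
    d = [1.0 / n] * n                       # example weights
    ensemble = []
    for _ in range(rounds):
        best, best_err = None, 1.0
        for feat in range(len(points[0])):
            for thresh in sorted({x[feat] for x in points}):
                for sign in (1, -1):
                    err = sum(w for x, y, w in zip(points, labels, d)
                              if stump_predict((feat, thresh, sign), x) != y)
                    if err < best_err:
                        best, best_err = (feat, thresh, sign), err
        best_err = max(best_err, 1e-12)
        alpha = 0.5 * math.log((1 - best_err) / best_err)
        ensemble.append((alpha, best))
        # up-weight mistakes, down-weight correct examples, renormalize
        d = [w * math.exp(-alpha * labels[i] * stump_predict(best, points[i]))
             for i, w in enumerate(d)]
        z = sum(d)
        d = [w / z for w in d]
    return ensemble

def predict(ensemble, x):
    return 1 if sum(a * stump_predict(s, x) for a, s in ensemble) >= 0 else -1
```

On the 1-D alternating labels below, no single stump gets everything right, but the weighted vote of a few stumps does, which is the phenomenon boosting exploits.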

Göös, Mika 
STOC '21: "Automating Algebraic Proof ..."
Automating Algebraic Proof Systems Is NP-Hard
Susanna F. de Rezende, Mika Göös, Jakob Nordström, Toniann Pitassi, Robert Robere, and Dmitry Sokolov (Czech Academy of Sciences, Czechia; EPFL, Switzerland; University of Copenhagen, Denmark; Lund University, Sweden; University of Toronto, Canada; Institute for Advanced Study at Princeton, USA; McGill University, Canada; St. Petersburg State University, Russia; Russian Academy of Sciences, Russia) We show that algebraic proofs are hard to find: Given an unsatisfiable CNF formula F, it is NP-hard to find a refutation of F in the Nullstellensatz, Polynomial Calculus, or Sherali–Adams proof systems in time polynomial in the size of the shortest such refutation. Our work extends, and gives a simplified proof of, the recent breakthrough of Atserias and Müller (JACM 2020) that established an analogous result for Resolution. @InProceedings{STOC21p209, author = {Susanna F. de Rezende and Mika Göös and Jakob Nordström and Toniann Pitassi and Robert Robere and Dmitry Sokolov}, title = {Automating Algebraic Proof Systems Is NP-Hard}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {209--222}, doi = {10.1145/3406325.3451080}, year = {2021}, } Publisher's Version 

Gottlieb, Lee-Ad 
STOC '21: "Near-Linear Time Approximation ..."
Near-Linear Time Approximation Schemes for Steiner Tree and Forest in Low-Dimensional Spaces
Yair Bartal and Lee-Ad Gottlieb (Hebrew University of Jerusalem, Israel; Ariel University, Israel) We give an algorithm that computes a (1+є)-approximate Steiner forest in near-linear time n · 2^{(1/є)^{O(ddim^{2})} (log log n)^{2}}, where ddim is the doubling dimension of the metric space. This improves upon the best previous result due to Chan et al. (SIAM J. Comput., 2018), who gave a runtime of about n^{2^{O(ddim)}} · 2^{(ddim/є)^{O(ddim)} √log n}. For Steiner tree our methods achieve an even better runtime n (log n)^{(1/є)^{O(ddim^{2})}}. @InProceedings{STOC21p1028, author = {Yair Bartal and Lee-Ad Gottlieb}, title = {Near-Linear Time Approximation Schemes for Steiner Tree and Forest in Low-Dimensional Spaces}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1028--1041}, doi = {10.1145/3406325.3451063}, year = {2021}, } Publisher's Version 

Gouleakis, Themis 
STOC '21: "Optimal Testing of Discrete ..."
Optimal Testing of Discrete Distributions with High Probability
Ilias Diakonikolas, Themis Gouleakis, Daniel M. Kane, John Peebles, and Eric Price (University of Wisconsin-Madison, USA; MPI-INF, Germany; University of California at San Diego, USA; Princeton University, USA; University of Texas at Austin, USA) We study the problem of testing discrete distributions with a focus on the high probability regime. Specifically, given samples from one or more discrete distributions, a property P, and parameters 0 < є, δ < 1, we want to distinguish with probability at least 1−δ whether these distributions satisfy P or are є-far from P in total variation distance. Most prior work in distribution testing studied the constant confidence case (corresponding to δ = Ω(1)), and provided sample-optimal testers for a range of properties. While one can always boost the confidence probability of any such tester by black-box amplification, this generic boosting method typically leads to suboptimal sample bounds. Here we study the following broad question: For a given property P, can we characterize the sample complexity of testing P as a function of all relevant problem parameters, including the error probability δ? Prior to this work, uniformity testing was the only statistical task whose sample complexity had been characterized in this setting. As our main results, we provide the first algorithms for closeness and independence testing that are sample-optimal, within constant factors, as a function of all relevant parameters. We also show matching information-theoretic lower bounds on the sample complexity of these problems. Our techniques naturally extend to give optimal testers for related problems. To illustrate the generality of our methods, we give optimal algorithms for testing collections of distributions and testing closeness with unequal-sized samples. @InProceedings{STOC21p542, author = {Ilias Diakonikolas and Themis Gouleakis and Daniel M. Kane and John Peebles and Eric Price}, title = {Optimal Testing of Discrete Distributions with High Probability}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {542--555}, doi = {10.1145/3406325.3450997}, year = {2021}, } Publisher's Version 

Groenland, Carla 
STOC '21: "Optimal Labelling Schemes ..."
Optimal Labelling Schemes for Adjacency, Comparability, and Reachability
Marthe Bonamy, Louis Esperet, Carla Groenland, and Alex Scott (CNRS, France; LaBRI, France; University of Bordeaux, France; G-SCOP, France; Grenoble Alps University, France; University of Oxford, UK) We construct asymptotically optimal adjacency labelling schemes for every hereditary class containing 2^{Ω(n^{2})} n-vertex graphs as n→ ∞. This regime contains many classes of interest, for instance perfect graphs or comparability graphs, for which we obtain an adjacency labelling scheme with labels of n/4+o(n) bits per vertex. This implies the existence of a reachability labelling scheme for digraphs with labels of n/4+o(n) bits per vertex and a comparability labelling scheme for posets with labels of n/4+o(n) bits per element. All these results are best possible, up to the lower order term. @InProceedings{STOC21p1109, author = {Marthe Bonamy and Louis Esperet and Carla Groenland and Alex Scott}, title = {Optimal Labelling Schemes for Adjacency, Comparability, and Reachability}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1109--1117}, doi = {10.1145/3406325.3451102}, year = {2021}, } Publisher's Version 

Grosof, Isaac 
STOC '21: "Load Balancing Guardrails: ..."
Load Balancing Guardrails: Keeping Your Heavy Traffic on the Road to Low Response Times (Invited Paper)
Isaac Grosof, Ziv Scully, and Mor Harchol-Balter (Carnegie Mellon University, USA) This talk is about scheduling and load balancing in a multi-server system, with the goal of minimizing mean response time in a general stochastic setting. We will specifically concentrate on the common case of a load balancing system, where a front-end load balancer (a.k.a. dispatcher) dispatches requests to multiple back-end servers, each with its own queue. Much is known about load balancing in the case where the scheduling at the servers is First-Come-First-Served (FCFS). However, to minimize mean response time, we need to use Shortest-Remaining-Processing-Time (SRPT) scheduling at the servers. Unfortunately, there is almost nothing known about optimal dispatching when SRPT scheduling is used at the servers. To make things worse, it turns out that the traditional dispatching policies that are used in practice with FCFS servers often have poor performance in systems with SRPT servers. In this talk, we devise a simple fix that can be applied to any dispatching policy. This fix, called "guardrails", ensures that the dispatching policy yields optimal mean response time under heavy traffic, when used in a system with SRPT servers. Any dispatching policy, when augmented with guardrails, becomes heavy-traffic optimal. Our results also yield the first analytical bounds on mean response time for load balancing systems with SRPT scheduling at the servers. Load balancing and scheduling are highly studied both in the stochastic and the worst-case scheduling communities. One aim of this talk is to contrast some differences in the approaches of the two communities when tackling multi-server scheduling problems. 
@InProceedings{STOC21p10, author = {Isaac Grosof and Ziv Scully and Mor Harchol-Balter}, title = {Load Balancing Guardrails: Keeping Your Heavy Traffic on the Road to Low Response Times (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {10--10}, doi = {10.1145/3406325.3465359}, year = {2021}, } Publisher's Version 
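The FCFS-vs-SRPT contrast described in the talk can already be seen on a single server: SRPT preempts in favor of the job with the least remaining work. A small event-driven simulation sketch (not the paper's dispatching policies or guardrails; names are illustrative):

```python
import heapq

def srpt_mean_response(jobs):
    """Preemptive Shortest-Remaining-Processing-Time on one server.
    jobs: list of (arrival_time, size). Returns mean response time
    (completion time minus arrival time, averaged over jobs)."""
    jobs = sorted(jobs)
    t, i, done = 0.0, 0, []
    active = []                         # min-heap of (remaining_work, arrival)
    while i < len(jobs) or active:
        if not active:                  # idle: jump to the next arrival
            t = max(t, jobs[i][0])
        while i < len(jobs) and jobs[i][0] <= t:
            heapq.heappush(active, (jobs[i][1], jobs[i][0]))
            i += 1
        rem, arr = heapq.heappop(active)
        # run the shortest job until it finishes or the next arrival preempts
        horizon = jobs[i][0] if i < len(jobs) else float("inf")
        run = min(rem, horizon - t)
        t += run
        if run < rem:
            heapq.heappush(active, (rem - run, arr))
        else:
            done.append(t - arr)
    return sum(done) / len(done)

# A size-4 job at t=0 and a size-1 job at t=1: FCFS gives mean response
# time 4, while SRPT preempts for the short job and achieves mean 3.
print(srpt_mean_response([(0, 4), (1, 1)]))  # → 3.0
```

Letting short jobs jump the queue is exactly why mean-response-time-optimal dispatching must account for SRPT at the servers rather than assume FCFS.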

Guo, Zeyu 
STOC '21: "Efficient List-Decoding with ..."
Efficient List-Decoding with Constant Alphabet and List Sizes
Zeyu Guo and Noga Ron-Zewi (University of Haifa, Israel) We present an explicit and efficient algebraic construction of capacity-achieving list-decodable codes with both constant alphabet and constant list sizes. More specifically, for any R ∈ (0,1) and є>0, we give an algebraic construction of an infinite family of error-correcting codes of rate R, over an alphabet of size (1/є)^{O(1/є^{2})}, that can be list decoded from a (1−R−є)-fraction of errors with list size at most exp(poly(1/є)). Moreover, the codes can be encoded in time poly(1/є, n), the output list is contained in a linear subspace of dimension at most poly(1/є), and a basis for this subspace can be found in time poly(1/є, n). Thus, both encoding and list decoding can be performed in fully polynomial time poly(1/є, n), except for pruning the subspace and outputting the final list which takes time exp(poly(1/є)) · poly(n). In contrast, prior explicit and efficient constructions of capacity-achieving list-decodable codes either required a much higher complexity in terms of 1/є (and were additionally much less structured), or had super-constant alphabet or list sizes. Our codes are quite natural and structured. Specifically, we use algebraic-geometric (AG) codes with evaluation points restricted to a subfield, and with the message space restricted to a (carefully chosen) linear subspace. Our main observation is that the output list of AG codes with subfield evaluation points is contained in an affine shift of the image of a block-triangular-Toeplitz (BTT) matrix, and that the list size can potentially be reduced to a constant by restricting the message space to a BTT-evasive subspace, which is a large subspace that intersects the image of any BTT matrix in a constant number of points. We further show how to explicitly construct such BTT-evasive subspaces, based on the explicit subspace designs of Guruswami and Kopparty (Combinatorica, 2016), and composition. 
@InProceedings{STOC21p1502, author = {Zeyu Guo and Noga Ron-Zewi}, title = {Efficient List-Decoding with Constant Alphabet and List Sizes}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1502--1515}, doi = {10.1145/3406325.3451046}, year = {2021}, } 

Gupta, Anupam 
STOC '21: "Chasing Convex Bodies with ..."
Chasing Convex Bodies with Linear Competitive Ratio (Invited Paper)
C. J. Argue, Anupam Gupta, Guru Guruganesh, and Ziye Tang (Carnegie Mellon University, USA; Google Research, USA) The problem of chasing convex functions is easy to state: faced with a sequence of convex functions f_t over d-dimensional Euclidean space, the goal of the algorithm is to output a point x_t at each time, so that the sum of the function costs f_t(x_t), plus the movement costs ‖x_t − x_{t−1}‖, is minimized. This problem generalizes questions in online algorithms such as caching and the k-server problem. In 1994, Friedman and Linial posed the question of getting an algorithm with a competitive ratio that depends only on the dimension d. In this talk we give an O(d)-competitive algorithm, based on the notion of the Steiner point of a convex body. @InProceedings{STOC21p5, author = {C. J. Argue and Anupam Gupta and Guru Guruganesh and Ziye Tang}, title = {Chasing Convex Bodies with Linear Competitive Ratio (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {5--5}, doi = {10.1145/3406325.3465354}, year = {2021}, } STOC '21: "A Quasipolynomial (2 + ε)-Approximation ..." A Quasipolynomial (2 + ε)-Approximation for Planar Sparsest Cut Vincent Cohen-Addad, Anupam Gupta, Philip N. Klein, and Jason Li (Google, Switzerland; Carnegie Mellon University, USA; Brown University, USA) The (non-uniform) sparsest cut problem is the following graph-partitioning problem: given a "supply" graph, and demands on pairs of vertices, delete some subset of supply edges to minimize the ratio of the supply edges cut to the total demand of the pairs separated by this deletion. Despite much effort, there are only a handful of nontrivial classes of supply graphs for which constant-factor approximations are known. We consider the problem for planar graphs, and give a (2+ε)-approximation algorithm that runs in quasipolynomial time. Our approach defines a new structural decomposition of an optimal solution using a "patching" primitive. 
We combine this decomposition with a Sherali-Adams-style linear programming relaxation of the problem, which we then round. This should be compared with the polynomial-time approximation algorithm of Rao (1999), which uses the metric linear programming relaxation and ℓ_1-embeddings, and achieves an O(√(log n))-approximation in polynomial time. @InProceedings{STOC21p1056, author = {Vincent Cohen-Addad and Anupam Gupta and Philip N. Klein and Jason Li}, title = {A Quasipolynomial (2 + <i>ε</i>)-Approximation for Planar Sparsest Cut}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1056--1069}, doi = {10.1145/3406325.3451103}, year = {2021}, } 

Guruganesh, Guru 
STOC '21: "Chasing Convex Bodies with ..."
Chasing Convex Bodies with Linear Competitive Ratio (Invited Paper)
C. J. Argue, Anupam Gupta, Guru Guruganesh, and Ziye Tang (Carnegie Mellon University, USA; Google Research, USA) The problem of chasing convex functions is easy to state: faced with a sequence of convex functions f_t over d-dimensional Euclidean space, the goal of the algorithm is to output a point x_t at each time, so that the sum of the function costs f_t(x_t), plus the movement costs ‖x_t − x_{t−1}‖, is minimized. This problem generalizes questions in online algorithms such as caching and the k-server problem. In 1994, Friedman and Linial posed the question of getting an algorithm with a competitive ratio that depends only on the dimension d. In this talk we give an O(d)-competitive algorithm, based on the notion of the Steiner point of a convex body. @InProceedings{STOC21p5, author = {C. J. Argue and Anupam Gupta and Guru Guruganesh and Ziye Tang}, title = {Chasing Convex Bodies with Linear Competitive Ratio (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {5--5}, doi = {10.1145/3406325.3465354}, year = {2021}, } 
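The Steiner point underlying the O(d)-competitive algorithm above has a convenient probabilistic description: it is the expectation, over a uniformly random direction, of the point of the body maximizing that linear objective. A minimal Monte Carlo sketch for planar polytopes, assuming a vertex-list representation (the function name and setup are illustrative, not from the paper):

```python
import math
import random

def steiner_point_estimate(vertices, samples=20000, seed=0):
    """Monte Carlo estimate of the Steiner point of a planar polytope,
    given as a vertex list: average, over uniformly random unit
    directions, of the vertex maximizing the linear objective."""
    rng = random.Random(seed)
    sx = sy = 0.0
    for _ in range(samples):
        t = rng.uniform(0.0, 2.0 * math.pi)
        dx, dy = math.cos(t), math.sin(t)
        vx, vy = max(vertices, key=lambda v: dx * v[0] + dy * v[1])
        sx += vx
        sy += vy
    return sx / samples, sy / samples

# The Steiner point of the square [-1, 1]^2 is its centre (0, 0).
square = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
print(steiner_point_estimate(square))
```

By symmetry the estimate concentrates around the origin; the sampling error shrinks as 1/√samples.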

Gurvits, Leonid 
STOC '21: "Capacity Lower Bounds via ..."
Capacity Lower Bounds via Productization
Leonid Gurvits and Jonathan Leake (City College of New York, USA; TU Berlin, Germany) We give a sharp lower bound on the capacity of a real stable polynomial, depending only on the value of its gradient at x = 1. This result implies a sharp improvement to a similar inequality proved by Linial-Samorodnitsky-Wigderson in 2000, which was crucial to the analysis of their permanent approximation algorithm. Such inequalities have played an important role in the recent work on operator scaling and its generalizations and applications, and in fact we use our bound to construct a new scaling algorithm for real stable polynomials. Our bound is also quite similar to one used very recently by Karlin-Klein-Oveis Gharan to give an improved approximation factor for metric TSP. The new technique we develop to prove this bound is productization, which says that any real stable polynomial can be approximated at any point in the positive orthant by a product of linear forms. Beyond the results of this paper, our main hope is that this new technique will allow us to avoid "frightening technicalities", in the words of Laurent and Schrijver, that often accompany combinatorial lower bounds. In particular, we believe that this technique will be useful towards simplifying and further improving the approximation factor given in the fantastic work of Karlin-Klein-Oveis Gharan on metric TSP. @InProceedings{STOC21p847, author = {Leonid Gurvits and Jonathan Leake}, title = {Capacity Lower Bounds via Productization}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {847--858}, doi = {10.1145/3406325.3451105}, year = {2021}, } 
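For context, the capacity referred to here is the standard quantity cap_α(p) = inf_{x>0} p(x)/∏ x_i^{α_i} from this line of work. A toy numerical sketch, assuming a crude grid search (all names are illustrative), on a product of linear forms, where the AM-GM inequality pins the answer to exactly 1:

```python
import math

def capacity2(p, alpha, grid=200, lo=-3.0, hi=3.0):
    """Crude grid search in log-coordinates for the capacity
    inf_{x,y > 0} p(x, y) / (x^alpha[0] * y^alpha[1]) of a
    bivariate polynomial p (a toy stand-in for the real thing)."""
    best = float("inf")
    for i in range(grid + 1):
        x = math.exp(lo + (hi - lo) * i / grid)
        for j in range(grid + 1):
            y = math.exp(lo + (hi - lo) * j / grid)
            best = min(best, p(x, y) / (x ** alpha[0] * y ** alpha[1]))
    return best

# Real stable example: p(x, y) = ((x + y) / 2)^2 is a product of two
# linear forms; by AM-GM its capacity at alpha = (1, 1) is exactly 1,
# attained on the diagonal x = y.
p = lambda x, y: ((x + y) / 2.0) ** 2
print(capacity2(p, (1, 1)))
```

The infimum of a log-convex objective like this can also be found by convex optimization in log-coordinates; the grid search is just the shortest way to illustrate the definition.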

Haah, Jeongwan 
STOC '21: "Fiber Bundle Codes: Breaking ..."
Fiber Bundle Codes: Breaking the N^{1/2} polylog(N) Barrier for Quantum LDPC Codes
Matthew B. Hastings, Jeongwan Haah, and Ryan O'Donnell (Station Q, USA; Microsoft Quantum, USA; Carnegie Mellon University, USA) We present a quantum LDPC code family that has distance Ω(N^{3/5}/polylog(N)) and Θ(N^{3/5}) logical qubits, where N is the code length. This is the first quantum LDPC code construction that achieves distance greater than N^{1/2} polylog(N). The construction is based on generalizing the homological product of codes to a fiber bundle. @InProceedings{STOC21p1276, author = {Matthew B. Hastings and Jeongwan Haah and Ryan O'Donnell}, title = {Fiber Bundle Codes: Breaking the <i>N</i><sup>1/2</sup> polylog(<i>N</i>) Barrier for Quantum LDPC Codes}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1276--1288}, doi = {10.1145/3406325.3451005}, year = {2021}, } 

Haeupler, Bernhard 
STOC '21: "Tree Embeddings for Hop-Constrained ..."
Tree Embeddings for Hop-Constrained Network Design
Bernhard Haeupler, D. Ellis Hershkowitz, and Goran Zuzic (Carnegie Mellon University, USA; ETH Zurich, Switzerland) Network design problems aim to compute low-cost structures such as routes, trees and subgraphs. Often, it is natural and desirable to require that these structures have small hop length or hop diameter. Unfortunately, optimization problems with hop constraints are much harder and less well understood than their hop-unconstrained counterparts. A significant algorithmic barrier in this setting is the fact that hop-constrained distances in graphs are very far from being a metric. We show that, nonetheless, hop-constrained distances can be approximated by distributions over "partial tree metrics." We build this result into a powerful and versatile algorithmic tool which, similarly to classic probabilistic tree embeddings, reduces hop-constrained problems in general graphs to hop-unconstrained problems on trees. We then use this tool to give the first polylogarithmic bicriteria approximations for the hop-constrained variants of many classic network design problems. These include Steiner forest, group Steiner tree, group Steiner forest, buy-at-bulk network design, as well as online and oblivious versions of many of these problems. @InProceedings{STOC21p356, author = {Bernhard Haeupler and D. Ellis Hershkowitz and Goran Zuzic}, title = {Tree Embeddings for Hop-Constrained Network Design}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {356--369}, doi = {10.1145/3406325.3451053}, year = {2021}, } STOC '21: "Hop-Constrained Oblivious ..." Hop-Constrained Oblivious Routing Mohsen Ghaffari, Bernhard Haeupler, and Goran Zuzic (ETH Zurich, Switzerland; Carnegie Mellon University, USA) We prove the existence of an oblivious routing scheme that is poly(log n)-competitive in terms of (congestion + dilation), thus resolving a well-known question in oblivious routing. 
Concretely, consider an undirected network and a set of packets, each with its own source and destination. The objective is to choose a path for each packet, from its source to its destination, so as to minimize (congestion + dilation), defined as follows: the dilation is the maximum path hop-length, and the congestion is the maximum number of paths that include any single edge. The routing scheme obliviously and randomly selects a path for each packet independent of (the existence of) the other packets. Despite this obliviousness, the selected paths have (congestion + dilation) within a poly(log n) factor of the best possible value. More precisely, for any integer hop-bound h, this oblivious routing scheme selects paths of length at most h · poly(log n) and is poly(log n)-competitive in terms of congestion in comparison to the best possible congestion achievable via paths of length at most h hops. These paths can be sampled in polynomial time. This result can be viewed as an analogue of the celebrated oblivious routing results of Räcke [FOCS 2002, STOC 2008], which are O(log n)-competitive in terms of congestion, but are not competitive in terms of dilation. @InProceedings{STOC21p1208, author = {Mohsen Ghaffari and Bernhard Haeupler and Goran Zuzic}, title = {Hop-Constrained Oblivious Routing}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1208--1220}, doi = {10.1145/3406325.3451098}, year = {2021}, } STOC '21: "Universally-Optimal Distributed ..." Universally-Optimal Distributed Algorithms for Known Topologies Bernhard Haeupler, David Wajc, and Goran Zuzic (Carnegie Mellon University, USA; ETH Zurich, Switzerland; Stanford University, USA) Many distributed optimization algorithms achieve existentially-optimal running times, meaning that there exists some pathological worst-case topology on which no algorithm can do better. Still, most networks of interest allow for exponentially faster algorithms. 
This motivates two questions: (i) What network topology parameters determine the complexity of distributed optimization? (ii) Are there universally-optimal algorithms that are as fast as possible on every topology? We resolve these 25-year-old open problems in the known-topology setting (i.e., supported CONGEST) for a wide class of global network optimization problems including MST, (1+ε)-min cut, various approximate shortest paths problems, subgraph connectivity, etc. In particular, we provide several (equivalent) graph parameters and show they are tight universal lower bounds for the above problems, fully characterizing their inherent complexity. Our results also imply that algorithms based on the low-congestion shortcut framework match the above lower bound, making them universally optimal if shortcuts are efficiently approximable. @InProceedings{STOC21p1166, author = {Bernhard Haeupler and David Wajc and Goran Zuzic}, title = {Universally-Optimal Distributed Algorithms for Known Topologies}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1166--1179}, doi = {10.1145/3406325.3451081}, year = {2021}, } 
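The (congestion + dilation) objective from the routing abstract above is straightforward to evaluate for any fixed choice of paths; a small illustrative helper (names are assumptions, not from the paper):

```python
from collections import Counter

def congestion_dilation(paths):
    """Given each packet's routing path as a list of vertices, return
    (congestion, dilation): congestion is the maximum number of paths
    using any single (undirected) edge, dilation the maximum hop-length."""
    edge_use = Counter()
    for path in paths:
        for u, v in zip(path, path[1:]):
            edge_use[frozenset((u, v))] += 1
    congestion = max(edge_use.values(), default=0)
    dilation = max((len(p) - 1 for p in paths), default=0)
    return congestion, dilation

# Two packets share edge (b, c); the longer path has 3 hops.
paths = [["a", "b", "c"], ["d", "b", "c", "e"]]
print(congestion_dilation(paths))  # (2, 3)
```

The oblivious-routing guarantee says such path choices can be sampled per-packet, independently, while keeping this sum within polylogarithmic factors of optimal.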

Halldórsson, Magnús M. 
STOC '21: "Efficient Randomized Distributed ..."
Efficient Randomized Distributed Coloring in CONGEST
Magnús M. Halldórsson, Fabian Kuhn, Yannic Maus, and Tigran Tonoyan (Reykjavik University, Iceland; University of Freiburg, Germany; Technion, Israel) Distributed vertex coloring is one of the classic and probably most widely studied problems in the area of distributed graph algorithms. We present a new randomized distributed vertex coloring algorithm for the standard CONGEST model, where the network is modeled as an n-node graph G, and where the nodes of G operate in synchronous communication rounds in which they can exchange O(log n)-bit messages over all the edges of G. For graphs with maximum degree Δ, we show that the (Δ+1)-list coloring problem (and therefore also the standard (Δ+1)-coloring problem) can be solved in O(log^5 log n) rounds. Previously such a result was only known for the significantly more powerful LOCAL model, where in each round, neighboring nodes can exchange messages of arbitrary size. The best previous (Δ+1)-coloring algorithm in the CONGEST model had a running time of O(log Δ + log^6 log n) rounds. As a function of n alone, the best previous algorithm therefore had a round complexity of O(log n), which is a bound that can also be achieved by a naïve folklore algorithm. For large maximum degree Δ, our algorithm hence is an exponential improvement over the previous state of the art. @InProceedings{STOC21p1180, author = {Magnús M. Halldórsson and Fabian Kuhn and Yannic Maus and Tigran Tonoyan}, title = {Efficient Randomized Distributed Coloring in CONGEST}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1180--1193}, doi = {10.1145/3406325.3451089}, year = {2021}, } 
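For intuition about the problem (not the paper's algorithm), one round of the naïve folklore randomized trial mentioned above can be sketched as follows: each uncolored node proposes a random color not used by an already colored neighbor, and keeps it if no neighbor made the same proposal. All names here are illustrative:

```python
import random

def one_coloring_round(adj, colors, palette, rng):
    """One round of the naive randomized coloring trial: every
    uncolored node proposes a random color not already used by a
    colored neighbor, and keeps it only if no neighbor proposed
    the same color in this round."""
    proposals = {}
    for v in adj:
        if colors[v] is None:
            taken = {colors[u] for u in adj[v] if colors[u] is not None}
            # With a palette of Delta + 1 colors, some color is always free.
            proposals[v] = rng.choice([c for c in palette if c not in taken])
    for v, c in proposals.items():
        if all(proposals.get(u) != c for u in adj[v]):
            colors[v] = c

# Triangle graph with palette size Delta + 1 = 3; iterate until done.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
colors = {v: None for v in adj}
rng = random.Random(1)
for _ in range(1000):
    if all(c is not None for c in colors.values()):
        break
    one_coloring_round(adj, colors, [0, 1, 2], rng)
print(colors)
```

Each node succeeds with constant probability per round, so all nodes finish after O(log n) rounds with high probability, matching the naïve bound the abstract improves on.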

Hanneke, Steve 
STOC '21: "A Theory of Universal Learning ..."
A Theory of Universal Learning
Olivier Bousquet, Steve Hanneke, Shay Moran, Ramon van Handel, and Amir Yehudayoff (Google, Switzerland; Toyota Technological Institute at Chicago, USA; Technion, Israel; Google Research, Israel; Princeton University, USA) How quickly can a given class of concepts be learned from examples? It is common to measure the performance of a supervised machine learning algorithm by plotting its "learning curve", that is, the decay of the error rate as a function of the number of training examples. However, the classical theoretical framework for understanding learnability, the PAC model of Vapnik-Chervonenkis and Valiant, does not explain the behavior of learning curves: the distribution-free PAC model of learning can only bound the upper envelope of the learning curves over all possible data distributions. This does not match the practice of machine learning, where the data source is typically fixed in any given scenario, while the learner may choose the number of training examples on the basis of factors such as computational resources and desired accuracy. In this paper, we study an alternative learning model that better captures such practical aspects of machine learning, but still gives rise to a complete theory of the learnable in the spirit of the PAC model. More precisely, we consider the problem of universal learning, which aims to understand the performance of learning algorithms on every data distribution, but without requiring uniformity over the distribution. The main result of this paper is a remarkable trichotomy: there are only three possible rates of universal learning. More precisely, we show that the learning curves of any given concept class decay at either an exponential, linear, or arbitrarily slow rate. Moreover, each of these cases is completely characterized by appropriate combinatorial parameters, and we exhibit optimal learning algorithms that achieve the best possible rate in each case. 
For concreteness, we consider in this paper only the realizable case, though analogous results are expected to extend to more general learning scenarios. @InProceedings{STOC21p532, author = {Olivier Bousquet and Steve Hanneke and Shay Moran and Ramon van Handel and Amir Yehudayoff}, title = {A Theory of Universal Learning}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {532--541}, doi = {10.1145/3406325.3451087}, year = {2021}, } 

Harchol-Balter, Mor 
STOC '21: "Load Balancing Guardrails: ..."
Load Balancing Guardrails: Keeping Your Heavy Traffic on the Road to Low Response Times (Invited Paper)
Isaac Grosof, Ziv Scully, and Mor Harchol-Balter (Carnegie Mellon University, USA) This talk is about scheduling and load balancing in a multi-server system, with the goal of minimizing mean response time in a general stochastic setting. We will specifically concentrate on the common case of a load balancing system, where a front-end load balancer (a.k.a. dispatcher) dispatches requests to multiple back-end servers, each with its own queue. Much is known about load balancing in the case where the scheduling at the servers is First-Come-First-Served (FCFS). However, to minimize mean response time, we need to use Shortest-Remaining-Processing-Time (SRPT) scheduling at the servers. Unfortunately, there is almost nothing known about optimal dispatching when SRPT scheduling is used at the servers. To make things worse, it turns out that the traditional dispatching policies that are used in practice with FCFS servers often have poor performance in systems with SRPT servers. In this talk, we devise a simple fix that can be applied to any dispatching policy. This fix, called "guardrails", ensures that the dispatching policy yields optimal mean response time under heavy traffic when used in a system with SRPT servers. Any dispatching policy, when augmented with guardrails, becomes heavy-traffic optimal. Our results also yield the first analytical bounds on mean response time for load balancing systems with SRPT scheduling at the servers. Load balancing and scheduling are highly studied both in the stochastic and the worst-case scheduling communities. One aim of this talk is to contrast some differences in the approaches of the two communities when tackling multi-server scheduling problems. 
@InProceedings{STOC21p10, author = {Isaac Grosof and Ziv Scully and Mor Harchol-Balter}, title = {Load Balancing Guardrails: Keeping Your Heavy Traffic on the Road to Low Response Times (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {10--10}, doi = {10.1145/3406325.3465359}, year = {2021}, } 

Harms, Nathaniel 
STOC '21: "VC Dimension and Distribution-Free ..."
VC Dimension and Distribution-Free Sample-Based Testing
Eric Blais, Renato Ferreira Pinto Jr., and Nathaniel Harms (University of Waterloo, Canada; Google, Canada) We consider the problem of determining which classes of functions can be tested more efficiently than they can be learned, in the distribution-free sample-based model that corresponds to the standard PAC learning setting. Our main result shows that while VC dimension by itself does not always provide tight bounds on the number of samples required to test a class of functions in this model, it can be combined with a closely related variant that we call "lower VC" (or LVC) dimension to obtain strong lower bounds on this sample complexity. We use this result to obtain strong and in many cases nearly optimal bounds on the sample complexity for testing unions of intervals, halfspaces, intersections of halfspaces, polynomial threshold functions, and decision trees. Conversely, we show that two natural classes of functions, juntas and monotone functions, can be tested with a number of samples that is polynomially smaller than the number of samples required for PAC learning. Finally, we also use the connection between VC dimension and property testing to establish new lower bounds for testing radius clusterability and testing feasibility of linear constraint systems. @InProceedings{STOC21p504, author = {Eric Blais and Renato Ferreira Pinto Jr. and Nathaniel Harms}, title = {VC Dimension and Distribution-Free Sample-Based Testing}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {504--517}, doi = {10.1145/3406325.3451104}, year = {2021}, } 
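To make the combinatorial parameter concrete, here is a brute-force computation of VC dimension on a finite domain (a sketch with illustrative names; the paper's LVC variant is not implemented here):

```python
from itertools import combinations

def vc_dimension(domain, concepts):
    """Brute-force VC dimension: the largest d for which some d-point
    subset of the domain is shattered, i.e. every +/- labeling of it
    is realized by a concept (each concept = its set of positives)."""
    def shattered(pts):
        patterns = {tuple(p in c for p in pts) for c in concepts}
        return len(patterns) == 2 ** len(pts)
    best = 0
    for d in range(1, len(domain) + 1):
        if any(shattered(s) for s in combinations(domain, d)):
            best = d
        else:
            break  # shattering is monotone: no larger set can shatter
    return best

# Threshold concepts {x <= t} on {0, 1, 2, 3} have VC dimension 1:
# any single point is shattered, but no pair is (the class is nested).
domain = [0, 1, 2, 3]
thresholds = [frozenset(x for x in domain if x <= t) for t in [-1, 0, 1, 2, 3]]
print(vc_dimension(domain, thresholds))  # 1
```

The early `break` is sound because subsets of shattered sets are shattered, so failure at size d rules out all larger sizes.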

Harsha, Prahladh 
STOC '21: "Decoding Multivariate Multiplicity ..."
Decoding Multivariate Multiplicity Codes on Product Sets
Siddharth Bhandari, Prahladh Harsha, Mrinal Kumar, and Madhu Sudan (Tata Institute of Fundamental Research, India; IIT Bombay, India; Harvard University, USA) The multiplicity Schwartz-Zippel lemma bounds the total multiplicity of zeroes of a multivariate polynomial on a product set. This lemma motivates the multiplicity codes of Kopparty, Saraf and Yekhanin [J. ACM, 2014], who showed how to use this lemma to construct high-rate locally decodable codes. However, the algorithmic results about these codes crucially rely on the fact that the polynomials are evaluated on a vector space and not an arbitrary product set. In this work, we show how to decode multivariate multiplicity codes of large multiplicities in polynomial time over finite product sets (over fields of large characteristic and zero characteristic). Previously such decoding algorithms were not known even for a positive fraction of errors. In contrast, our work goes all the way to the distance of the code and in particular exceeds both the unique decoding bound and the Johnson radius. For errors exceeding the Johnson radius, even combinatorial list-decodability of these codes was not known. Our algorithm is an application of the classical polynomial method directly to the multivariate setting. In particular, we do not rely on a reduction from the multivariate to the univariate case, as is typical of many of the existing results on decoding codes based on multivariate polynomials. However, a vanilla application of the polynomial method in the multivariate setting does not yield a polynomial upper bound on the list size. We obtain a polynomial bound on the list size by taking an alternative view of multivariate multiplicity codes. In this view, we glue all the partial derivatives of the same order together using a fresh set of variables. We then apply the polynomial method by viewing this as a problem over a field of rational functions. 
@InProceedings{STOC21p1489, author = {Siddharth Bhandari and Prahladh Harsha and Mrinal Kumar and Madhu Sudan}, title = {Decoding Multivariate Multiplicity Codes on Product Sets}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1489--1501}, doi = {10.1145/3406325.3451027}, year = {2021}, } 
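A multiplicity code stores, at each evaluation point, a polynomial together with its first few formal derivatives. A minimal univariate encoding sketch over a prime field (illustrative names; the paper's multivariate product-set setting is analogous but heavier):

```python
def poly_eval(coeffs, x, p):
    """Horner evaluation of a polynomial (coefficients low-degree
    first) at x, modulo the prime p."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def derivative(coeffs, p):
    """Formal derivative mod p (well-behaved when p exceeds the degree)."""
    return [(i * c) % p for i, c in enumerate(coeffs)][1:]

def multiplicity_encode(coeffs, points, s, p):
    """Multiplicity-code encoding: at each evaluation point output the
    tuple (f, f', ..., f^(s-1)) of the first s derivatives, mod p."""
    rows, f = [], list(coeffs)
    for _ in range(s):
        rows.append([poly_eval(f, a, p) for a in points])
        f = derivative(f, p)
    return list(zip(*rows))  # one s-tuple per evaluation point

# f(x) = 1 + 2x + 3x^2 over F_101, multiplicity s = 2, points 0..3.
code = multiplicity_encode([1, 2, 3], [0, 1, 2, 3], 2, 101)
print(code)  # [(1, 2), (6, 8), (17, 14), (34, 20)]
```

Reading off s values per point is what buys the higher rate; the decoding results above are about inverting this map on arbitrary product sets of points.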

Hartline, Jason D. 
STOC '21: "Revelation Gap for Pricing ..."
Revelation Gap for Pricing from Samples
Yiding Feng, Jason D. Hartline, and Yingkai Li (Northwestern University, USA) This paper considers prior-independent mechanism design, in which a single mechanism is designed to achieve approximately optimal performance on every prior distribution from a given class. Most results in this literature focus on mechanisms with truth-telling equilibria, a.k.a. truthful mechanisms. Feng and Hartline [FOCS 2018] introduce the revelation gap to quantify the loss of the restriction to truthful mechanisms. We solve a main open question left in Feng and Hartline [FOCS 2018]; namely, we identify a non-trivial revelation gap for revenue maximization. Our analysis focuses on the canonical problem of selling a single item to a single agent with only access to a single sample from the agent's valuation distribution. We identify the sample-bid mechanism (a simple non-truthful mechanism) and upper-bound its prior-independent approximation ratio by 1.835 (resp. 1.296) for regular (resp. MHR) distributions. We further prove that no truthful mechanism can achieve a prior-independent approximation ratio better than 1.957 (resp. 1.543) for regular (resp. MHR) distributions. Thus, a non-trivial revelation gap is shown, as the sample-bid mechanism outperforms the optimal prior-independent truthful mechanism. On the hardness side, we prove that no (possibly non-truthful) mechanism can achieve a prior-independent approximation ratio better than 1.073, even for uniform distributions. @InProceedings{STOC21p1438, author = {Yiding Feng and Jason D. Hartline and Yingkai Li}, title = {Revelation Gap for Pricing from Samples}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1438--1451}, doi = {10.1145/3406325.3451057}, year = {2021}, } 
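As background on why a single sample is useful for pricing (this illustrates the classic single-sample posted-price idea of Dhangwatnotai, Roughgarden, and Yan, not the paper's sample-bid mechanism): posting the observed sample as the price guarantees at least half the optimal revenue for regular distributions. A deterministic numerical check for the Exp(1) distribution, with illustrative names:

```python
import math

def single_sample_revenue(step=1e-4, upper=40.0):
    """Expected revenue of posting the observed sample s as the price
    against a fresh buyer value, both ~ Exp(1): the midpoint-rule
    integral of s * e^{-s} (sale probability) * e^{-s} (sample density)."""
    total, s = 0.0, step / 2.0
    while s < upper:
        total += s * math.exp(-2.0 * s) * step
        s += step
    return total

sample_rev = single_sample_revenue()   # analytically 1/4
optimal_rev = 1.0 / math.e             # max_p p * e^{-p}, attained at p = 1
print(sample_rev / optimal_rev)        # about 0.68, above the 1/2 guarantee
```

Exp(1) is regular, so the computed ratio being comfortably above 1/2 is consistent with the general guarantee; the paper's ratios quantify how much better non-truthful mechanisms can do.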

Hastings, Matthew B. 
STOC '21: "(Sub)Exponential Advantage ..."
(Sub)Exponential Advantage of Adiabatic Quantum Computation with No Sign Problem
András Gilyén, Matthew B. Hastings, and Umesh Vazirani (California Institute of Technology, USA; Microsoft Quantum, USA; Microsoft Research, USA; University of California at Berkeley, USA) We demonstrate the possibility of (sub)exponential quantum speedup via a quantum algorithm that follows an adiabatic path of a gapped Hamiltonian with no sign problem. The Hamiltonian that exhibits this speedup comes from the adjacency matrix of an undirected graph whose vertices are labeled by n-bit strings, and we can view the adiabatic evolution as an efficient O(poly(n))-time quantum algorithm for finding a specific "EXIT" vertex in the graph given the "ENTRANCE" vertex. On the other hand, we show that if the graph is given via an adjacency-list oracle, there is no classical algorithm that finds the "EXIT" with probability greater than exp(−n^{δ}) using at most exp(n^{δ}) queries for δ = 1/5 − o(1). Our construction of the graph is somewhat similar to the "welded-trees" construction of Childs et al., but uses additional ideas of Hastings for achieving a spectral gap and a short adiabatic path. @InProceedings{STOC21p1357, author = {András Gilyén and Matthew B. Hastings and Umesh Vazirani}, title = {(Sub)Exponential Advantage of Adiabatic Quantum Computation with No Sign Problem}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1357--1369}, doi = {10.1145/3406325.3451060}, year = {2021}, } STOC '21: "Fiber Bundle Codes: Breaking ..." Fiber Bundle Codes: Breaking the N^{1/2} polylog(N) Barrier for Quantum LDPC Codes Matthew B. Hastings, Jeongwan Haah, and Ryan O'Donnell (Station Q, USA; Microsoft Quantum, USA; Carnegie Mellon University, USA) We present a quantum LDPC code family that has distance Ω(N^{3/5}/polylog(N)) and Θ(N^{3/5}) logical qubits, where N is the code length. This is the first quantum LDPC code construction that achieves distance greater than N^{1/2} polylog(N). 
The construction is based on generalizing the homological product of codes to a fiber bundle. @InProceedings{STOC21p1276, author = {Matthew B. Hastings and Jeongwan Haah and Ryan O'Donnell}, title = {Fiber Bundle Codes: Breaking the <i>N</i><sup>1/2</sup> polylog(<i>N</i>) Barrier for Quantum LDPC Codes}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1276--1288}, doi = {10.1145/3406325.3451005}, year = {2021}, } 

Hazan, Elad 
STOC '21: "Boosting Simple Learners ..."
Boosting Simple Learners
Noga Alon, Alon Gonen, Elad Hazan, and Shay Moran (Princeton University, USA; Tel Aviv University, Israel; OrCam, Israel; Google AI, USA; Technion, Israel; Google Research, Israel) Boosting is a celebrated machine learning approach which is based on the idea of combining weak and moderately inaccurate hypotheses into a strong and accurate one. We study boosting under the assumption that the weak hypotheses belong to a class of bounded capacity. This assumption is inspired by the common convention that weak hypotheses are "rules of thumb" from an "easy-to-learn class". (Schapire and Freund '12, Shalev-Shwartz and Ben-David '14.) Formally, we assume the class of weak hypotheses has a bounded VC dimension. We focus on two main questions: (i) Oracle Complexity: How many weak hypotheses are needed in order to produce an accurate hypothesis? We design a novel boosting algorithm and demonstrate that it circumvents a classical lower bound by Freund and Schapire ('95, '12). Whereas the lower bound shows that Ω(1/γ^2) weak hypotheses with γ-margin are sometimes necessary, our new method requires only Õ(1/γ) weak hypotheses, provided that they belong to a class of bounded VC dimension. Unlike previous boosting algorithms, which aggregate the weak hypotheses by majority votes, the new boosting algorithm uses more complex ("deeper") aggregation rules. We complement this result by showing that complex aggregation rules are in fact necessary to circumvent the aforementioned lower bound. (ii) Expressivity: Which tasks can be learned by boosting weak hypotheses from a bounded VC class? Can complex concepts that are "far away" from the class be learned? Towards answering the first question we identify a combinatorial-geometric parameter which captures the expressivity of base classes in boosting. As a corollary we provide an affirmative answer to the second question for many well-studied classes, including halfspaces and decision stumps. 
Along the way, we establish and exploit connections with Discrepancy Theory. @InProceedings{STOC21p481, author = {Noga Alon and Alon Gonen and Elad Hazan and Shay Moran}, title = {Boosting Simple Learners}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {481--489}, doi = {10.1145/3406325.3451030}, year = {2021}, } 

Hązła, Jan 
STOC '21: "On Codes Decoding a Constant ..."
On Codes Decoding a Constant Fraction of Errors on the BSC
Jan Hązła, Alex Samorodnitsky, and Ori Sberlo (EPFL, Switzerland; Hebrew University of Jerusalem, Israel; Tel Aviv University, Israel) We strengthen the results from a recent work by the second author, achieving bounds on the weight distribution of binary linear codes that are successful under block-MAP (as well as bit-MAP) decoding on the BEC. We conclude that a linear code that is successful on the BEC can also decode over a range of binary memoryless symmetric (BMS) channels. In particular, applying the result of Kudekar, Kumar, Mondelli, Pfister, Şaşoğlu and Urbanke from STOC 2016, we prove that a Reed–Muller code of positive rate R decodes errors on the BSC_p with high probability if p < 1/2 − √(2^{−R}(1−2^{−R})). @InProceedings{STOC21p1479, author = {Jan Hązła and Alex Samorodnitsky and Ori Sberlo}, title = {On Codes Decoding a Constant Fraction of Errors on the BSC}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1479--1488}, doi = {10.1145/3406325.3451015}, year = {2021}, } 
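The quoted bound is easy to evaluate numerically; a small sketch (illustrative function name) computing the guaranteed BSC crossover probability as a function of the rate:

```python
import math

def rm_bsc_threshold(rate):
    """Crossover probability below which, by the quoted bound, a
    Reed-Muller code of rate R decodes BSC_p errors with high
    probability: p < 1/2 - sqrt(2^{-R} * (1 - 2^{-R}))."""
    q = 2.0 ** (-rate)
    return 0.5 - math.sqrt(q * (1.0 - q))

# The guaranteed threshold approaches 1/2 as R -> 0 and shrinks
# toward 0 as R -> 1.
for R in (0.1, 0.5, 0.9):
    print(R, rm_bsc_threshold(R))
```

For example, at rate R = 0.1 the bound guarantees decoding up to roughly a quarter of the positions flipped.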

He, Kun 
STOC '21: "Sampling Constraint Satisfaction ..."
Sampling Constraint Satisfaction Solutions in the Local Lemma Regime
Weiming Feng, Kun He, and Yitong Yin (Nanjing University, China; Institute of Computing Technology at Chinese Academy of Sciences, China; University of Chinese Academy of Sciences, China) We give a Markov chain based algorithm for sampling almost uniform solutions of constraint satisfaction problems (CSPs). Assuming a canonical setting for the Lovász local lemma, where each constraint is violated by a small number of forbidden local configurations, our sampling algorithm is accurate in a local lemma regime, and the running time is a fixed polynomial whose dependency on n is close to linear, where n is the number of variables. Our main approach is a new technique called state compression, which generalizes the "mark/unmark" paradigm of Moitra, and can give fast local-lemma-based sampling algorithms. As concrete applications of our technique, we give the current best almost-uniform samplers for hypergraph colorings and for CNF solutions. @InProceedings{STOC21p1565, author = {Weiming Feng and Kun He and Yitong Yin}, title = {Sampling Constraint Satisfaction Solutions in the Local Lemma Regime}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1565--1578}, doi = {10.1145/3406325.3451101}, year = {2021}, } 
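For orientation, the simplest constructive algorithm in the local lemma regime is Moser-Tardos resampling, which finds (rather than uniformly samples) a CSP solution; the paper's sampler is substantially more involved. A minimal CNF sketch with illustrative names:

```python
import random

def moser_tardos(n_vars, clauses, seed=0, max_steps=10**6):
    """Moser-Tardos resampling: start from a uniformly random
    assignment and, while some clause is violated, resample just
    that clause's variables. Clauses are lists of nonzero ints
    whose sign gives the literal's polarity (variables 1..n)."""
    rng = random.Random(seed)
    assign = [rng.random() < 0.5 for _ in range(n_vars + 1)]  # index 0 unused
    def violated(cl):
        return all(assign[abs(l)] != (l > 0) for l in cl)
    for _ in range(max_steps):
        bad = next((cl for cl in clauses if violated(cl)), None)
        if bad is None:
            return assign[1:]
        for l in bad:  # resample only the violated clause's variables
            assign[abs(l)] = rng.random() < 0.5
    raise RuntimeError("step budget exceeded")

# A small satisfiable 3-CNF instance on variables 1..3.
clauses = [[1, 2, 3], [-1, 2, -3], [1, -2, 3], [-1, -2, -3]]
print(moser_tardos(3, clauses))
```

Under local lemma conditions the expected number of resampling steps is linear in the number of clauses; turning this into an almost-uniform sampler is exactly the harder problem the abstract addresses.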

Hershkowitz, D. Ellis 
STOC '21: "Tree Embeddings for Hop-Constrained ..."
Tree Embeddings for Hop-Constrained Network Design
Bernhard Haeupler, D. Ellis Hershkowitz, and Goran Zuzic (Carnegie Mellon University, USA; ETH Zurich, Switzerland) Network design problems aim to compute low-cost structures such as routes, trees and subgraphs. Often, it is natural and desirable to require that these structures have small hop length or hop diameter. Unfortunately, optimization problems with hop constraints are much harder and less well understood than their hop-unconstrained counterparts. A significant algorithmic barrier in this setting is the fact that hop-constrained distances in graphs are very far from being a metric. We show that, nonetheless, hop-constrained distances can be approximated by distributions over "partial tree metrics." We build this result into a powerful and versatile algorithmic tool which, similarly to classic probabilistic tree embeddings, reduces hop-constrained problems in general graphs to hop-unconstrained problems on trees. We then use this tool to give the first polylogarithmic bicriteria approximations for the hop-constrained variants of many classic network design problems. These include Steiner forest, group Steiner tree, group Steiner forest, buy-at-bulk network design, as well as online and oblivious versions of many of these problems. @InProceedings{STOC21p356, author = {Bernhard Haeupler and D. Ellis Hershkowitz and Goran Zuzic}, title = {Tree Embeddings for Hop-Constrained Network Design}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {356--369}, doi = {10.1145/3406325.3451053}, year = {2021}, } 

Hirahara, Shuichi 
STOC '21: "Average-Case Hardness of NP ..."
Average-Case Hardness of NP from Exponential Worst-Case Hardness Assumptions
Shuichi Hirahara (National Institute of Informatics, Japan) A long-standing and central open question in the theory of average-case complexity is to base average-case hardness of NP on worst-case hardness of NP. A frontier question along this line is to prove that PH is hard on average if UP requires (sub)exponential worst-case complexity. The difficulty of resolving this question has been discussed from various perspectives based on technical barrier results, such as the limits of black-box reductions and the nonexistence of worst-case hardness amplification procedures in PH. In this paper, we overcome these barriers and resolve the open question by presenting the following main results: 1. UP ⊈ DTIME(2^{O(n / log n)}) implies DistNP ⊈ AvgP. 2. PH ⊈ DTIME(2^{O(n / log n)}) implies DistPH ⊈ AvgP. 3. NP ⊈ DTIME(2^{O(n / log n)}) implies DistNP ⊈ Avg_{P}P. Here, Avg_{P}P denotes P-computable average-case polynomial time, which interpolates between average-case polynomial time and worst-case polynomial time. We complement this result by showing that DistPH ⊈ AvgP if and only if DistPH ⊈ Avg_{P}P. At the core of all of our results is a new notion of universal heuristic scheme, whose running time is P-computable average-case polynomial time under every polynomial-time samplable distribution. Our proofs are based on the meta-complexity of time-bounded Kolmogorov complexity: we analyze average-case complexity through the lens of worst-case meta-complexity, using a new “algorithmic” proof of language compression and weak symmetry of information for time-bounded Kolmogorov complexity. @InProceedings{STOC21p292, author = {Shuichi Hirahara}, title = {Average-Case Hardness of NP from Exponential Worst-Case Hardness Assumptions}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {292--302}, doi = {10.1145/3406325.3451065}, year = {2021}, } Publisher's Version 

Hollender, Alexandros 
STOC '21: "The Complexity of Gradient ..."
The Complexity of Gradient Descent: CLS = PPAD ∩ PLS
John Fearnley, Paul W. Goldberg, Alexandros Hollender, and Rahul Savani (University of Liverpool, UK; University of Oxford, UK) We study search problems that can be solved by performing Gradient Descent on a bounded convex polytopal domain and show that this class is equal to the intersection of two well-known classes: PPAD and PLS. As our main underlying technical contribution, we show that computing a Karush-Kuhn-Tucker (KKT) point of a continuously differentiable function over the domain [0,1]^{2} is PPAD ∩ PLS-complete. This is the first natural problem to be shown complete for this class. Our results also imply that the class CLS (Continuous Local Search), which was defined by Daskalakis and Papadimitriou as a more “natural” counterpart to PPAD ∩ PLS and contains many interesting problems, is itself equal to PPAD ∩ PLS. @InProceedings{STOC21p46, author = {John Fearnley and Paul W. Goldberg and Alexandros Hollender and Rahul Savani}, title = {The Complexity of Gradient Descent: CLS = PPAD ∩ PLS}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {46--59}, doi = {10.1145/3406325.3451052}, year = {2021}, } Publisher's Version 
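The central object, a KKT point of a function on the box, can be illustrated with a toy sketch (the quadratic objective, step size, and stopping rule are our own choices, not the paper's construction): projected gradient descent halts exactly when it reaches an approximate KKT point, where the gradient is near zero or points out of the feasible box.

```python
def projected_gd(grad, x0, eta=0.1, tol=1e-8, max_iter=100_000):
    """Projected gradient descent on [0,1]^2; stops when the iterate no longer
    moves, i.e. at an approximate KKT point (zero gradient, or a gradient
    pointing outward at the boundary)."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        nx = [min(1.0, max(0.0, x[i] - eta * g[i])) for i in range(2)]
        if all(abs(nx[i] - x[i]) <= tol for i in range(2)):
            return nx
        x = nx
    return x

# f(x, y) = (x - 0.3)^2 + (y - 1.7)^2: unconstrained minimum (0.3, 1.7) lies outside
grad_f = lambda p: [2 * (p[0] - 0.3), 2 * (p[1] - 1.7)]
x = projected_gd(grad_f, [0.9, 0.1])
print(x)   # converges to (0.3, 1.0): y is clamped at the boundary, a KKT point
```

The paper's point is the converse direction: finding such a point is as hard as anything in PPAD ∩ PLS, so this simple-looking loop cannot have a polynomial worst-case guarantee in general.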

Holmgren, Justin 
STOC '21: "Fiat–Shamir via List-Recoverable ..."
Fiat–Shamir via List-Recoverable Codes (or: Parallel Repetition of GMW Is Not Zero-Knowledge)
Justin Holmgren, Alex Lombardi, and Ron D. Rothblum (NTT Research, USA; Massachusetts Institute of Technology, USA; Technion, Israel) In a seminal work, Goldreich, Micali and Wigderson (CRYPTO ’86) demonstrated the wide applicability of zero-knowledge proofs by constructing such a proof system for the NP-complete problem of graph 3-coloring. A long-standing open question has been whether parallel repetition of their protocol preserves zero knowledge. In this work, we answer this question in the negative, assuming a standard cryptographic assumption (i.e., the hardness of learning with errors (LWE)). Leveraging a connection observed by Dwork, Naor, Reingold, and Stockmeyer (FOCS ’99), our negative result is obtained by making positive progress on a related fundamental problem in cryptography: securely instantiating the Fiat-Shamir heuristic for eliminating interaction in public-coin interactive protocols. A recent line of work has shown how to instantiate the heuristic securely, albeit only for a limited class of protocols. Our main result shows how to instantiate Fiat-Shamir for parallel repetitions of much more general interactive proofs. In particular, we construct hash functions that, assuming LWE, securely realize the Fiat-Shamir transform for the following rich classes of protocols: 1) The parallel repetition of any “commit-and-open” protocol (such as the GMW protocol mentioned above), when a specific (natural) commitment scheme is used. Commit-and-open protocols are a ubiquitous paradigm for constructing general-purpose public-coin zero-knowledge proofs. 2) The parallel repetition of any base protocol that (1) satisfies a stronger notion of soundness called round-by-round soundness, and (2) has an efficient procedure, using a suitable trapdoor, for recognizing “bad verifier randomness” that would allow the prover to cheat. Our results are obtained by establishing a new connection between the Fiat-Shamir transform and list-recoverable codes. 
In contrast to the usual focus in coding theory, we focus on a parameter regime in which the input lists are extremely large, but the rate can be small. We give a (probabilistic) construction based on Parvaresh-Vardy codes (FOCS ’05) that suffices for our applications. @InProceedings{STOC21p750, author = {Justin Holmgren and Alex Lombardi and Ron D. Rothblum}, title = {Fiat–Shamir via List-Recoverable Codes (or: Parallel Repetition of GMW Is Not Zero-Knowledge)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {750--760}, doi = {10.1145/3406325.3451116}, year = {2021}, } Publisher's Version 
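The Fiat-Shamir transform at issue replaces the verifier's random challenge with a hash of the transcript. A minimal sketch of the mechanics (SHA-256 stands in heuristically for a random oracle here; the whole point of the paper is building hash families for which this step is provably sound, which SHA-256 is not known to be, and the statement/commitment strings are placeholders):

```python
import hashlib

def fiat_shamir_challenge(statement: bytes, first_message: bytes, num_bits: int = 128) -> int:
    """Derive the verifier's public-coin challenge by hashing the transcript
    so far, removing the need for interaction."""
    h = hashlib.sha256(statement + b"||" + first_message).digest()
    return int.from_bytes(h, "big") >> (256 - num_bits)

# The prover commits, derives its own challenge, then answers it; a verifier
# later recomputes the identical challenge, so no second message is ever sent.
c1 = fiat_shamir_challenge(b"graph G is 3-colorable", b"<commitments>")
c2 = fiat_shamir_challenge(b"graph G is 3-colorable", b"<commitments>")
print(c1 == c2)   # True: the challenge is a deterministic function of the transcript
```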

Hongler, Clément 
STOC '21: "Neural Tangent Kernel: Convergence ..."
Neural Tangent Kernel: Convergence and Generalization in Neural Networks (Invited Paper)
Arthur Jacot, Franck Gabriel, and Clément Hongler (EPFL, Switzerland) The Neural Tangent Kernel is a new way to understand gradient descent in deep neural networks, connecting them with kernel methods. In this talk, I'll introduce this formalism, give a number of results on the Neural Tangent Kernel, and explain how they give us insight into the dynamics of neural networks during training and into their generalization features. @InProceedings{STOC21p6, author = {Arthur Jacot and Franck Gabriel and Clément Hongler}, title = {Neural Tangent Kernel: Convergence and Generalization in Neural Networks (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {6--6}, doi = {10.1145/3406325.3465355}, year = {2021}, } Publisher's Version 

Hrubes, Pavel 
STOC '21: "Learnability Can Be Independent ..."
Learnability Can Be Independent of Set Theory (Invited Paper)
Shai Ben-David, Pavel Hrubes, Shay Moran, Amir Shpilka, and Amir Yehudayoff (University of Waterloo, Canada; Czech Academy of Sciences, Czechia; Technion, Israel; Tel Aviv University, Israel) A fundamental result in statistical learning theory is the equivalence of PAC learnability of a class with the finiteness of its Vapnik-Chervonenkis dimension. However, this clean result applies only to binary classification problems. In search of a similar combinatorial characterization of learnability in a more general setting, we discovered a surprising independence from set theory for a basic general notion of learnability. Consider the following statistical estimation problem: given a family F of real-valued random variables over some domain X and an i.i.d. sample drawn from an unknown distribution P over X, find f in F such that its expectation w.r.t. P is close to the supremum expectation over all members of F. This Expectation Maximization (EMX) problem captures many well-studied learning problems. Surprisingly, we show that the EMX learnability of some simple classes depends on the cardinality of the continuum and is therefore independent of the ZFC axioms of set theory. Our results imply that there exists no “finitary” combinatorial parameter that characterizes EMX learnability in a way similar to the VC-dimension characterization of binary classification learnability. @InProceedings{STOC21p11, author = {Shai Ben-David and Pavel Hrubes and Shay Moran and Amir Shpilka and Amir Yehudayoff}, title = {Learnability Can Be Independent of Set Theory (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {11--11}, doi = {10.1145/3406325.3465360}, year = {2021}, } Publisher's Version 

Huang, Zhiyi 
STOC '21: "Online Stochastic Matching, ..."
Online Stochastic Matching, Poisson Arrivals, and the Natural Linear Program
Zhiyi Huang and Xinkai Shu (University of Hong Kong, China) We study the online stochastic matching problem. Consider a bipartite graph with offline vertices on one side and i.i.d. online vertices on the other side. The offline vertices and the distribution of online vertices are known to the algorithm beforehand. The realization of the online vertices, however, is revealed one at a time, upon which the algorithm immediately decides how to match it. For maximizing the cardinality of the matching, we give a 0.711-competitive online algorithm, which improves the best previous ratio of 0.706. When the offline vertices are weighted, we introduce a 0.7009-competitive online algorithm for maximizing the total weight of the matched offline vertices, which improves the best previous ratio of 0.662. Conceptually, we find that the analysis of online algorithms simplifies if the online vertices follow a Poisson process, and we establish an approximate equivalence between this Poisson arrival model and online stochastic matching. Technically, we propose a natural linear program for the Poisson arrival model and demonstrate how to exploit its structure by introducing a converse of Jensen’s inequality. Moreover, we design an algorithmic amortization to replace the analytic one in previous work, and as a result obtain the first vertex-weighted online stochastic matching algorithm that improves the results in the weaker random arrival model. @InProceedings{STOC21p682, author = {Zhiyi Huang and Xinkai Shu}, title = {Online Stochastic Matching, Poisson Arrivals, and the Natural Linear Program}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {682--693}, doi = {10.1145/3406325.3451079}, year = {2021}, } Publisher's Version 
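The intuition behind the Poisson arrival model can be checked numerically: under n i.i.d. draws over m equally likely online types, the number of arrivals of any fixed type is Binomial(n, 1/m), which is close to Poisson(n/m) in total variation. A small simulation (all parameters are our own, purely illustrative):

```python
import math, random

def poisson_pmf(lam, k):
    # computed in log-space to avoid overflow for large k
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

random.seed(0)
n, m, trials = 500, 50, 5000   # n i.i.d. arrivals over m equally likely types
lam = n / m                    # a fixed type is hit Binomial(n, 1/m) times
hist = [0] * (n + 1)
for _ in range(trials):
    k = sum(1 for _ in range(n) if random.random() < 1 / m)
    hist[k] += 1

# empirical arrival counts are statistically close to Poisson(n/m)
tv = 0.5 * sum(abs(hist[k] / trials - poisson_pmf(lam, k)) for k in range(n + 1))
print(f"TV distance, Binomial(n, 1/m) vs Poisson(n/m): {tv:.3f}")
```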

Husić, Edin 
STOC '21: "Approximating Nash Social ..."
Approximating Nash Social Welfare under Rado Valuations
Jugal Garg, Edin Husić, and László A. Végh (University of Illinois at Urbana-Champaign, USA; London School of Economics and Political Science, UK) We consider the problem of approximating the maximum Nash social welfare (NSW) while allocating a set of indivisible items to n agents. The NSW, defined as the weighted geometric mean of the agents’ valuations, is a popular objective that provides a balanced tradeoff between the often conflicting requirements of fairness and efficiency. For the symmetric additive case of the problem, where agents have the same weight and additive valuations, the first constant-factor approximation algorithm was obtained in 2015. Subsequent work has obtained constant-factor approximation algorithms for the symmetric case under mild generalizations of additive valuations, and O(n)-approximation algorithms for subadditive valuations and for the asymmetric case. In this paper, we make significant progress on both the symmetric and asymmetric NSW problems. We present the first constant-factor approximation algorithm for the symmetric case under Rado valuations. Rado valuations form a general class of valuation functions that arise from maximum-cost independent matching problems, including as special cases assignment (OXS) valuations and weighted matroid rank functions. Furthermore, our approach also gives the first constant-factor approximation algorithm for the asymmetric case under Rado valuations, provided that the maximum ratio between the weights is bounded by a constant. @InProceedings{STOC21p1412, author = {Jugal Garg and Edin Husić and László A. Végh}, title = {Approximating Nash Social Welfare under Rado Valuations}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1412--1425}, doi = {10.1145/3406325.3451031}, year = {2021}, } Publisher's Version 
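For intuition, here is what the NSW objective looks like in the simplest (symmetric, additive) setting, with a brute-force search over allocations (the two-agent instance is our own toy example; the paper's algorithms of course avoid enumeration):

```python
import math
from itertools import product

def nsw(valuations, alloc, weights=None):
    """Weighted geometric mean of agents' additive utilities under an
    allocation; alloc[g] is the agent receiving item g."""
    n = len(valuations)
    w = weights or [1.0] * n
    utils = [sum(valuations[i][g] for g in range(len(alloc)) if alloc[g] == i)
             for i in range(n)]
    if any(u == 0 for u in utils):
        return 0.0
    return math.exp(sum(wi * math.log(u) for wi, u in zip(w, utils)) / sum(w))

# 2 agents, 3 items, additive valuations (rows = agents, columns = items)
V = [[8, 1, 3],
     [2, 6, 4]]
best = max(product(range(2), repeat=3), key=lambda a: nsw(V, a))
print(best, nsw(V, best))   # (0, 1, 1): agent 0 gets item 0, agent 1 gets items 1 and 2
```

Note that any allocation leaving an agent empty-handed scores 0: the geometric mean strongly penalizes imbalance, which is the fairness side of the tradeoff the abstract mentions.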

Jacot, Arthur 
STOC '21: "Neural Tangent Kernel: Convergence ..."
Neural Tangent Kernel: Convergence and Generalization in Neural Networks (Invited Paper)
Arthur Jacot, Franck Gabriel, and Clément Hongler (EPFL, Switzerland) The Neural Tangent Kernel is a new way to understand gradient descent in deep neural networks, connecting them with kernel methods. In this talk, I'll introduce this formalism, give a number of results on the Neural Tangent Kernel, and explain how they give us insight into the dynamics of neural networks during training and into their generalization features. @InProceedings{STOC21p6, author = {Arthur Jacot and Franck Gabriel and Clément Hongler}, title = {Neural Tangent Kernel: Convergence and Generalization in Neural Networks (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {6--6}, doi = {10.1145/3406325.3465355}, year = {2021}, } Publisher's Version 

Jain, Aayush 
STOC '21: "Indistinguishability Obfuscation ..."
Indistinguishability Obfuscation from Well-Founded Assumptions
Aayush Jain, Huijia Lin, and Amit Sahai (University of California at Los Angeles, USA; University of Washington, USA) Indistinguishability obfuscation, introduced by [Barak et al., CRYPTO 2001], aims to compile programs into unintelligible ones while preserving functionality. It is a fascinating and powerful object that has been shown to enable a host of new cryptographic goals and beyond. However, constructions of indistinguishability obfuscation have remained elusive, with all other proposals relying on heuristics or newly conjectured hardness assumptions. In this work, we show how to construct indistinguishability obfuscation from subexponential hardness of four well-founded assumptions. We prove: Informal Theorem: Let τ ∈ (0,∞), δ ∈ (0,1), є ∈ (0,1) be arbitrary constants. Assume subexponential security of the following assumptions: the Learning With Errors (LWE) assumption with subexponential modulus-to-noise ratio 2^{k^є} and noises of magnitude polynomial in k, where k is the dimension of the LWE secret; the Learning Parity with Noise (LPN) assumption over general prime fields ℤ_{p} with polynomially many LPN samples and error rate 1/ℓ^{δ}, where ℓ is the dimension of the LPN secret; the existence of a Boolean Pseudo-Random Generator (PRG) in NC^{0} with stretch n^{1+τ}, where n is the length of the PRG seed; the Decision Linear (DLIN) assumption on symmetric bilinear groups of prime order. Then, (subexponentially secure) indistinguishability obfuscation for all polynomial-size circuits exists. Further, assuming only polynomial security of the aforementioned assumptions, there exists collusion-resistant public-key functional encryption for all polynomial-size circuits. @InProceedings{STOC21p60, author = {Aayush Jain and Huijia Lin and Amit Sahai}, title = {Indistinguishability Obfuscation from Well-Founded Assumptions}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {60--73}, doi = {10.1145/3406325.3451093}, year = {2021}, } Publisher's Version 

Jain, Vishesh 
STOC '21: "Perfectly Sampling k ..."
Perfectly Sampling k ≥ (8/3 + o(1))Δ-Colorings in Graphs
Vishesh Jain, Ashwin Sah, and Mehtaab Sawhney (Simons Institute for the Theory of Computing, Berkeley, USA; Massachusetts Institute of Technology, USA) We present a randomized algorithm which takes as input an undirected graph G on n vertices with maximum degree Δ and a number of colors k ≥ (8/3 + o_{Δ}(1))Δ, and returns, in expected time Õ(nΔ^{2} log k), a proper k-coloring of G distributed perfectly uniformly on the set of all proper k-colorings of G. Notably, our sampler breaks the barrier at k = 3Δ encountered in recent work of Bhandari and Chakraborty [STOC 2020]. We also discuss how our methods may be modified to relax the restriction on k to k ≥ (8/3 − є_{0})Δ for an absolute constant є_{0} > 0. As in the work of Bhandari and Chakraborty, and the pioneering work of Huber [STOC 1998], our sampler is based on Coupling from the Past [Propp & Wilson, Random Struct. Algorithms, 1995] and the bounding chain method [Huber, STOC 1998; Häggström & Nelander, Scand. J. Statist., 1999]. Our innovations include a novel bounding chain routine inspired by Jerrum’s analysis of the Glauber dynamics [Random Struct. Algorithms, 1995], as well as a preconditioning routine for bounding chains which uses the algorithmic Lovász Local Lemma [Moser & Tardos, J. ACM, 2010]. @InProceedings{STOC21p1589, author = {Vishesh Jain and Ashwin Sah and Mehtaab Sawhney}, title = {Perfectly Sampling <i>k</i> ≥ (8/3 + <i>o</i>(1))Δ-Colorings in Graphs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1589--1600}, doi = {10.1145/3406325.3451012}, year = {2021}, } Publisher's Version 
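The underlying primitive, Coupling from the Past, fits in a few lines on a toy instance. The sketch below is our own minimal grand coupling on a single edge with 3 colors, not the paper's construction; it tracks every proper coloring explicitly, which is exactly the exponential cost that bounding chains exist to avoid. It returns an exactly (not approximately) uniform proper coloring:

```python
import random
from itertools import product

def proper_colorings(adj, k):
    """All proper k-colorings of a tiny graph given by adjacency lists."""
    n = len(adj)
    return [c for c in product(range(k), repeat=n)
            if all(c[u] != c[v] for u in range(n) for v in adj[u] if u < v)]

def step(state, adj, move):
    """One Metropolis update: recolor vertex v with color c if still proper."""
    v, c = move
    if all(state[u] != c for u in adj[v]):
        return state[:v] + (c,) + state[v + 1:]
    return state

def cftp_sample(adj, k, rng):
    """Coupling from the Past with the grand coupling: run every proper
    coloring forward from time -T with shared randomness; once all copies
    coalesce by time 0, the common state is an exactly uniform sample."""
    n = len(adj)
    moves = []          # moves[i] drives time -(i+1); reused as T doubles
    T = 1
    while True:
        while len(moves) < T:
            moves.append((rng.randrange(n), rng.randrange(k)))
        states = set(proper_colorings(adj, k))
        for t in range(T - 1, -1, -1):
            states = {step(s, adj, moves[t]) for s in states}
        if len(states) == 1:
            return next(iter(states))
        T *= 2          # not coalesced: restart from further in the past

edge = [[1], [0]]       # a single edge, k = 3 colors: 6 proper colorings
sample = cftp_sample(edge, 3, random.Random(1))
print(sample)
```

Tracking the full state set is only feasible on toy graphs; a bounding chain compresses it into one efficiently maintainable object, which is where the paper's innovations lie.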

Janczewski, Wojciech 
STOC '21: "Fully Dynamic Approximation ..."
Fully Dynamic Approximation of LIS in Polylogarithmic Time
Paweł Gawrychowski and Wojciech Janczewski (University of Wrocław, Poland) We revisit the problem of maintaining the longest increasing subsequence (LIS) of an array under (i) inserting an element and (ii) deleting an element of the array. In a recent breakthrough, Mitzenmacher and Seddighin [STOC 2020] designed an algorithm that maintains an O((1/є)^{O(1/є)})-approximation of LIS under both operations with worst-case update time Õ(n^{є}), for any constant є>0 (Õ hides factors polynomial in log n, where n is the length of the input). We exponentially improve on their result by designing an algorithm that maintains a (1+є)-approximation of LIS under both operations with worst-case update time Õ(є^{−5}). Instead of working with the grid packing technique introduced by Mitzenmacher and Seddighin, we take a different approach, building on a new tool that might be of independent interest: LIS sparsification. A particularly interesting consequence of our result is an improved solution for the so-called Erdős–Szekeres partitioning, in which we seek a partition of a given permutation of {1,2,…,n} into O(√n) monotone subsequences. This problem has been repeatedly stated as one of the natural examples in which we see a large gap between the decision-tree complexity and the algorithmic complexity. The result of Mitzenmacher and Seddighin implies an O(n^{1+є})-time solution for this problem, for any є>0. Our algorithm (in fact, its simpler decremental version) further improves this to Õ(n). @InProceedings{STOC21p654, author = {Paweł Gawrychowski and Wojciech Janczewski}, title = {Fully Dynamic Approximation of LIS in Polylogarithmic Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {654--667}, doi = {10.1145/3406325.3451137}, year = {2021}, } Publisher's Version 
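For reference, the static problem being dynamized here has a classic O(n log n) exact solution via patience sorting; the paper's challenge is supporting insertions and deletions without recomputing this from scratch (the snippet is the textbook static algorithm, not the paper's data structure):

```python
from bisect import bisect_left

def lis_length(a):
    """Classic O(n log n) LIS: tails[i] holds the smallest possible tail of a
    strictly increasing subsequence of length i + 1 seen so far."""
    tails = []
    for x in a:
        i = bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)
        else:
            tails[i] = x
    return len(tails)

print(lis_length([3, 1, 4, 1, 5, 9, 2, 6]))   # 4, e.g. the subsequence 1, 4, 5, 9
```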

Jansen, Bart M. P. 
STOC '21: "Vertex Deletion Parameterized ..."
Vertex Deletion Parameterized by Elimination Distance and Even Less
Bart M. P. Jansen, Jari J. H. de Kroon, and Michał Włodarczyk (Eindhoven University of Technology, Netherlands) We study the parameterized complexity of various classic vertex-deletion problems such as Odd cycle transversal, Vertex planarization, and Chordal vertex deletion under hybrid parameterizations. Existing FPT algorithms for these problems either focus on the parameterization by solution size, detecting solutions of size k in time f(k) · n^{O(1)}, or on width parameterizations, finding arbitrarily large optimal solutions in time f(w) · n^{O(1)} for some width measure w like treewidth. We unify these lines of research by presenting FPT algorithms for parameterizations that can simultaneously be arbitrarily much smaller than the solution size and the treewidth. The first class of parameterizations is based on the notion of elimination distance of the input graph to the target graph class, which intuitively measures the number of rounds needed to obtain a graph in that class by removing one vertex from each connected component in each round. The second class of parameterizations consists of a relaxation of the notion of treewidth, allowing arbitrarily large bags that induce subgraphs belonging to the target class of the deletion problem, as long as these subgraphs have small neighborhoods. Both kinds of parameterizations have been introduced recently and have already spawned several independent results. Our contribution is twofold. First, we present a framework for computing approximately optimal decompositions related to these graph measures. Namely, if the cost of an optimal decomposition is k, we show how to find a decomposition of cost k^{O(1)} in time f(k) · n^{O(1)}. This is applicable to any graph class for which we can solve the so-called separation problem. Secondly, we exploit the constructed decompositions for solving vertex-deletion problems by extending ideas from algorithms using iterative compression and the finite-state property. 
For the three mentioned vertex-deletion problems, and for all problems which can be formulated as hitting a finite set of connected forbidden (a) minors or (b) (induced) subgraphs, we obtain FPT algorithms with respect to both studied parameterizations. For example, we present an algorithm running in time n^{O(1)} + 2^{k^{O(1)}}·(n+m) and polynomial space for Odd cycle transversal parameterized by the elimination distance k to the class of bipartite graphs. @InProceedings{STOC21p1757, author = {Bart M. P. Jansen and Jari J. H. de Kroon and Michał Włodarczyk}, title = {Vertex Deletion Parameterized by Elimination Distance and Even Less}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1757--1769}, doi = {10.1145/3406325.3451068}, year = {2021}, } Publisher's Version 

Jawale, Ruta 
STOC '21: "SNARGs for Bounded Depth Computations ..."
SNARGs for Bounded Depth Computations and PPAD Hardness from Sub-exponential LWE
Ruta Jawale, Yael Tauman Kalai, Dakshita Khurana, and Rachel Zhang (University of Illinois at Urbana-Champaign, USA; Microsoft Research, USA; Massachusetts Institute of Technology, USA) We construct a succinct non-interactive publicly verifiable delegation scheme for any log-space uniform circuit under the sub-exponential Learning With Errors (LWE) assumption. For a circuit C:{0,1}^{N}→{0,1} of size S and depth D, the prover runs in time poly(S), the communication complexity is D · polylog(S), and the verifier runs in time (D+N) · polylog(S). To obtain this result, we introduce a new cryptographic primitive: a lossy correlation-intractable hash function family. We use this primitive to soundly instantiate the Fiat-Shamir transform for a large class of interactive proofs, including the interactive sumcheck protocol and the GKR protocol, assuming the sub-exponential hardness of LWE. Additionally, by relying on the result of Choudhuri et al. (STOC 2019), we establish (sub-exponential) average-case hardness of PPAD, assuming the sub-exponential hardness of LWE. @InProceedings{STOC21p708, author = {Ruta Jawale and Yael Tauman Kalai and Dakshita Khurana and Rachel Zhang}, title = {SNARGs for Bounded Depth Computations and PPAD Hardness from Sub-exponential LWE}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {708--721}, doi = {10.1145/3406325.3451055}, year = {2021}, } Publisher's Version 

Jayaram, Rajesh 
STOC '21: "A Polynomial-Time Approximation ..."
A Polynomial-Time Approximation Algorithm for Counting Words Accepted by an NFA (Invited Paper)
Marcelo Arenas, Luis Alberto Croquevielle, Rajesh Jayaram, and Cristian Riveros (PUC, Chile; IMFD, Chile; Carnegie Mellon University, USA) Counting the number of words of a certain length accepted by a non-deterministic finite automaton (NFA) is a fundamental problem, which has many applications in different areas such as graph databases, knowledge compilation, and information extraction. Along with this, generating such words uniformly at random is also a relevant problem, particularly in scenarios where returning varied outputs is a desirable feature. The previous problems are formalized as follows. The input of #NFA is an NFA N and a length k given in unary (that is, given as a string 0^{k}), and the task is to compute the number of strings of length k accepted by N. The input of GEN-NFA is the same as for #NFA, but now the task is to generate uniformly at random a string of length k accepted by N. It is known that #NFA is #P-complete, so an efficient algorithm to compute this function exactly is not expected to exist. However, this does not preclude the existence of an efficient approximation algorithm for it. In this talk, we will show that #NFA admits a fully polynomial-time randomized approximation scheme (FPRAS). Prior to our work, it was open whether #NFA admits an FPRAS; in fact, the best randomized approximation scheme known for #NFA ran in time n^{O(log n)}. Besides, we will mention some consequences and applications of our results. In particular, from well-known results on counting and uniform generation, we obtain that GEN-NFA admits a fully polynomial-time almost uniform generator. Moreover, as #NFA is SpanL-complete under polynomial-time parsimonious reductions, we obtain that every function in the complexity class SpanL admits an FPRAS. 
@InProceedings{STOC21p4, author = {Marcelo Arenas and Luis Alberto Croquevielle and Rajesh Jayaram and Cristian Riveros}, title = {A Polynomial-Time Approximation Algorithm for Counting Words Accepted by an NFA (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {4--4}, doi = {10.1145/3406325.3465353}, year = {2021}, } Publisher's Version STOC '21: "When Is Approximate Counting ..." When Is Approximate Counting for Conjunctive Queries Tractable? Marcelo Arenas, Luis Alberto Croquevielle, Rajesh Jayaram, and Cristian Riveros (PUC, Chile; IMFD, Chile; Carnegie Mellon University, USA) Conjunctive queries are one of the most common classes of queries used in database systems, and the best studied in the literature. A seminal result of Grohe, Schwentick, and Segoufin (STOC 2001) demonstrates that for every class G of graphs, the evaluation of all conjunctive queries whose underlying graph is in G is tractable if, and only if, G has bounded treewidth. In this work, we extend this characterization to the counting problem for conjunctive queries. Specifically, for every class C of conjunctive queries with bounded treewidth, we introduce the first fully polynomial-time randomized approximation scheme (FPRAS) for counting answers to a query in C, and the first polynomial-time algorithm for sampling answers uniformly from a query in C. As a corollary, it follows that for every class G of graphs, the counting problem for conjunctive queries whose underlying graph is in G admits an FPRAS if, and only if, G has bounded treewidth (unless BPP is different from P). In fact, our FPRAS is more general, and also applies to conjunctive queries with bounded hypertree width, as well as unions of such queries. The key ingredient in our proof is the resolution of a fundamental counting problem from automata theory. 
Specifically, we demonstrate the first FPRAS and polynomial-time sampler for the set of trees of size n accepted by a tree automaton, which improves on the prior quasi-polynomial-time randomized approximation scheme (QPRAS) and sampling algorithm of Gore, Jerrum, Kannan, Sweedyk, and Mahaney ’97. We demonstrate how this algorithm can be used to obtain an FPRAS for many open problems, such as counting solutions to constraint satisfaction problems (CSPs) with bounded hypertree width, counting the number of error threads in programs with nested call subroutines, and counting valid assignments to structured DNNF circuits. @InProceedings{STOC21p1015, author = {Marcelo Arenas and Luis Alberto Croquevielle and Rajesh Jayaram and Cristian Riveros}, title = {When Is Approximate Counting for Conjunctive Queries Tractable?}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1015--1027}, doi = {10.1145/3406325.3451014}, year = {2021}, } Publisher's Version 
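To see what #NFA asks, here is an exact counter for small automata (our own sketch, not the FPRAS): group strings by the *set* of NFA states they can reach, i.e. determinize on the fly, and count per group. The number of groups can be exponential in the number of NFA states, consistent with #P-completeness; avoiding that blow-up while only approximating is the contribution above.

```python
from collections import defaultdict

def count_accepted(nfa, k):
    """Exact number of length-k strings accepted by an NFA.
    nfa = (states, alphabet, delta, start, accepting), with delta mapping
    (state, symbol) -> tuple of successor states."""
    states, alphabet, delta, start, accepting = nfa
    counts = {frozenset([start]): 1}   # reachable-state set -> #strings
    for _ in range(k):
        nxt = defaultdict(int)
        for S, c in counts.items():
            for a in alphabet:
                T = frozenset(q for s in S for q in delta.get((s, a), ()))
                nxt[T] += c
        counts = dict(nxt)
    return sum(c for S, c in counts.items() if S & accepting)

# Toy NFA over {0,1}: accepts strings whose second-to-last symbol is 1
delta = {(0, '0'): (0,), (0, '1'): (0, 1), (1, '0'): (2,), (1, '1'): (2,)}
nfa = ({0, 1, 2}, ['0', '1'], delta, 0, {2})
print(count_accepted(nfa, 3))   # 4: the length-3 strings with middle symbol 1
```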

Jeronimo, Fernando Granha 
STOC '21: "Near-Linear Time Decoding ..."
Near-Linear Time Decoding of Ta-Shma’s Codes via Splittable Regularity
Fernando Granha Jeronimo, Shashank Srivastava, and Madhur Tulsiani (University of Chicago, USA; Toyota Technological Institute at Chicago, USA) The Gilbert–Varshamov bound non-constructively establishes the existence of binary codes of distance 1/2−є/2 and rate Ω(є^{2}). In a breakthrough result, Ta-Shma [STOC 2017] constructed the first explicit family of nearly optimal binary codes with distance 1/2−є/2 and rate Ω(є^{2+α}), where α → 0 as є → 0. Moreover, the codes in Ta-Shma’s construction are є-balanced, where the distance between distinct codewords is not only bounded from below by 1/2−є/2, but also from above by 1/2+є/2. Polynomial-time decoding algorithms for (a slight modification of) Ta-Shma’s codes appeared in [FOCS 2020], and were based on the Sum-of-Squares (SoS) semidefinite programming hierarchy. The running times for these algorithms were of the form N^{O_{α}(1)} for unique decoding, and N^{O_{є,α}(1)} for the setting of “gentle list decoding”, with large exponents of N even when α is a fixed constant. We derive new algorithms for both these tasks, running in time Õ_{є}(N). Our algorithms also apply to the general setting of decoding direct-sum codes. Our algorithms follow from new structural and algorithmic results for collections of k-tuples (ordered hypergraphs) possessing a “structured expansion” property, which we call splittability. This property was previously identified and used in the analysis of SoS-based decoding and constraint satisfaction algorithms, and is also known to be satisfied by Ta-Shma’s code construction. We obtain a new weak regularity decomposition for (possibly sparse) splittable collections W ⊆ [n]^{k}, similar to the regularity decomposition for dense structures by Frieze and Kannan [FOCS 1996]. These decompositions are also computable in near-linear time Õ(|W|), and form a key component of our algorithmic results. 
@InProceedings{STOC21p1527, author = {Fernando Granha Jeronimo and Shashank Srivastava and Madhur Tulsiani}, title = {Near-Linear Time Decoding of Ta-Shma’s Codes via Splittable Regularity}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1527--1536}, doi = {10.1145/3406325.3451126}, year = {2021}, } Publisher's Version 

Jia, He 
STOC '21: "Reducing Isotropy and Volume ..."
Reducing Isotropy and Volume to KLS: An O*(n^{3}ψ^{2}) Volume Algorithm
He Jia, Aditi Laddha, Yin Tat Lee, and Santosh Vempala (Georgia Institute of Technology, USA; University of Washington, USA; Microsoft Research, USA) We show that the volume of a convex body in R^{n} in the general membership oracle model can be computed to within relative error ε using O(n^{3}ψ^{2}/ε^{2}) oracle queries, where ψ is the KLS constant. With the current bound of ψ=O(n^{o(1)}), this gives an O(n^{3+o(1)}/ε^{2}) algorithm, the first improvement on the Lovász-Vempala O(n^{4}/ε^{2}) algorithm from 2003. The main new ingredient is an O(n^{3}ψ^{2}) algorithm for isotropic transformation, following which we can apply the O(n^{3}/ε^{2}) volume algorithm of Cousins and Vempala for well-rounded convex bodies. A positive resolution of the KLS conjecture would imply an O(n^{3}/ε^{2}) volume algorithm. We also give an efficient implementation of the new algorithm for convex polytopes defined by m inequalities in R^{n}: polytope volume can be estimated in time O(mn^{c}/ε^{2}), where c<3.2 depends on the current matrix multiplication exponent and improves on the previous best bound. @InProceedings{STOC21p961, author = {He Jia and Aditi Laddha and Yin Tat Lee and Santosh Vempala}, title = {Reducing Isotropy and Volume to KLS: An <i>O</i>*(<i>n</i><sup>3</sup><i>ψ</i><sup>2</sup>) Volume Algorithm}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {961--974}, doi = {10.1145/3406325.3451018}, year = {2021}, } Publisher's Version 
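As a baseline for why polynomially many oracle queries is remarkable: the naive membership-oracle estimator below (our own illustration, unrelated to the paper's techniques) works in low dimension but needs exponentially many samples as n grows, because the body occupies a vanishing fraction of any bounding cube.

```python
import random

def naive_volume(membership_oracle, n, radius, samples=200_000, seed=0):
    """Rejection sampling from the bounding cube [-radius, radius]^n: the
    simplest membership-oracle volume estimator. Its sample complexity blows
    up exponentially in n, which the sophisticated algorithms above avoid."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        x = [rng.uniform(-radius, radius) for _ in range(n)]
        if membership_oracle(x):
            hits += 1
    return (2 * radius) ** n * hits / samples

ball = lambda x: sum(t * t for t in x) <= 1.0   # membership oracle: unit ball in R^3
est = naive_volume(ball, 3, 1.0)
print(est)   # close to 4*pi/3, about 4.19
```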

Jiang, Shunhua 
STOC '21: "A Faster Algorithm for Solving ..."
A Faster Algorithm for Solving General LPs
Shunhua Jiang, Zhao Song, Omri Weinstein, and Hengjie Zhang (Columbia University, USA; Institute for Advanced Study at Princeton, USA) The fastest known LP solver for general (dense) linear programs is due to [Cohen, Lee and Song’19] and runs in O^{*}(n^{ω} + n^{2.5−α/2} + n^{2+1/6}) time. A number of follow-up works [Lee, Song and Zhang’19, Brand’20, Song and Yu’20] obtain the same complexity through different techniques, but none of them can go below n^{2+1/6}, even if ω=2. This leaves a polynomial gap between the cost of solving linear systems (n^{ω}) and the cost of solving linear programs, and as such, improving the n^{2+1/6} term is crucial toward establishing an equivalence between these two fundamental problems. In this paper, we reduce the running time to O^{*}(n^{ω} + n^{2.5−α/2} + n^{2+1/18}), where ω and α are the fast matrix multiplication exponent and its dual. Hence, under the common belief that ω ≈ 2 and α ≈ 1, our LP solver runs in O^{*}(n^{2.055}) time instead of O^{*}(n^{2.16}). @InProceedings{STOC21p823, author = {Shunhua Jiang and Zhao Song and Omri Weinstein and Hengjie Zhang}, title = {A Faster Algorithm for Solving General LPs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {823--832}, doi = {10.1145/3406325.3451058}, year = {2021}, } Publisher's Version Info 
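The improvement is visible from exponent arithmetic alone; a small sketch (function name ours) evaluating the three terms of the running time under the belief ω ≈ 2, α ≈ 1:

```python
def lp_exponent(omega, alpha, correction):
    """Exponent of the running time O*(n^omega + n^(2.5 - alpha/2) + n^(2 + correction))."""
    return max(omega, 2.5 - alpha / 2, 2 + correction)

# Under the common belief omega ~ 2 and alpha ~ 1:
old = lp_exponent(2, 1, 1 / 6)   # previous bound: exponent 2 + 1/6
new = lp_exponent(2, 1, 1 / 18)  # this paper: exponent 2 + 1/18
print(old, new)
```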

Jung, Christopher 
STOC '21: "A New Analysis of Differential ..."
A New Analysis of Differential Privacy’s Generalization Guarantees (Invited Paper)
Christopher Jung, Katrina Ligett, Seth Neel, Aaron Roth, Saeed Sharifi-Malvajerdi, and Moshe Shenfeld (University of Pennsylvania, USA; Hebrew University of Jerusalem, Israel) We give a new proof of the "transfer theorem" underlying adaptive data analysis: that any mechanism for answering adaptively chosen statistical queries that is differentially private and sample-accurate is also accurate out-of-sample. Our new proof is elementary and gives structural insights that we expect will be useful elsewhere. We show: 1) that differential privacy ensures that the expectation of any query on the conditional distribution on datasets induced by the transcript of the interaction is close to its true value on the data distribution, and 2) sample accuracy on its own ensures that any query answer produced by the mechanism is close to its conditional expectation with high probability. This second claim follows from a thought experiment in which we imagine that the dataset is resampled from the conditional distribution after the mechanism has committed to its answers. The transfer theorem then follows by summing these two bounds. An upshot of our new proof technique is that the concrete bounds we obtain are substantially better than the best previously known bounds. @InProceedings{STOC21p9, author = {Christopher Jung and Katrina Ligett and Seth Neel and Aaron Roth and Saeed Sharifi-Malvajerdi and Moshe Shenfeld}, title = {A New Analysis of Differential Privacy’s Generalization Guarantees (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {9--9}, doi = {10.1145/3406325.3465358}, year = {2021}, } Publisher's Version 

Kalai, Yael Tauman 
STOC '21: "SNARGs for Bounded Depth Computations ..."
SNARGs for Bounded Depth Computations and PPAD Hardness from Sub-Exponential LWE
Ruta Jawale, Yael Tauman Kalai, Dakshita Khurana, and Rachel Zhang (University of Illinois at Urbana-Champaign, USA; Microsoft Research, USA; Massachusetts Institute of Technology, USA) We construct a succinct non-interactive publicly-verifiable delegation scheme for any log-space uniform circuit under the sub-exponential Learning With Errors (LWE) assumption. For a circuit C:{0,1}^{N}→{0,1} of size S and depth D, the prover runs in time poly(S), the communication complexity is D · polylog(S), and the verifier runs in time (D+N) · polylog(S). To obtain this result, we introduce a new cryptographic primitive: a lossy correlation-intractable hash function family. We use this primitive to soundly instantiate the Fiat-Shamir transform for a large class of interactive proofs, including the interactive sumcheck protocol and the GKR protocol, assuming the sub-exponential hardness of LWE. Additionally, by relying on the result of Choudhuri et al. (STOC 2019), we establish (sub-exponential) average-case hardness of PPAD, assuming the sub-exponential hardness of LWE. @InProceedings{STOC21p708, author = {Ruta Jawale and Yael Tauman Kalai and Dakshita Khurana and Rachel Zhang}, title = {SNARGs for Bounded Depth Computations and PPAD Hardness from Sub-Exponential LWE}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {708--721}, doi = {10.1145/3406325.3451055}, year = {2021}, } Publisher's Version 

Kandiros, Anthimos Vardis 
STOC '21: "Learning Ising Models from ..."
Learning Ising Models from One or Multiple Samples
Yuval Dagan, Constantinos Daskalakis, Nishanth Dikkala, and Anthimos Vardis Kandiros (Massachusetts Institute of Technology, USA; Google, USA) There have been two main lines of work on estimating Ising models: (1) estimating them from multiple independent samples under minimal assumptions about the model's interaction matrix; and (2) estimating them from one sample in restrictive settings. We propose a unified framework that smoothly interpolates between these two settings, enabling significantly richer estimation guarantees from one, a few, or many samples. Our main theorem provides guarantees for one-sample estimation, quantifying the estimation error in terms of the metric entropy of a family of interaction matrices. As corollaries of our main theorem, we derive bounds when the model's interaction matrix is a (sparse) linear combination of known matrices, or it belongs to a finite set, or to a high-dimensional manifold. In fact, our main result handles multiple independent samples by viewing them as one sample from a larger model, and can be used to derive estimation bounds that are qualitatively similar to those obtained in the aforedescribed multiple-sample literature. Our technical approach benefits from sparsifying a model's interaction network, conditioning on subsets of variables that make the dependencies in the resulting conditional distribution sufficiently weak. We use this sparsification technique to prove strong concentration and anti-concentration results for the Ising model, which we believe have applications beyond the scope of this paper. @InProceedings{STOC21p161, author = {Yuval Dagan and Constantinos Daskalakis and Nishanth Dikkala and Anthimos Vardis Kandiros}, title = {Learning Ising Models from One or Multiple Samples}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {161--168}, doi = {10.1145/3406325.3451074}, year = {2021}, } Publisher's Version 

Kane, Daniel M. 
STOC '21: "Optimal Testing of Discrete ..."
Optimal Testing of Discrete Distributions with High Probability
Ilias Diakonikolas, Themis Gouleakis, Daniel M. Kane, John Peebles, and Eric Price (University of Wisconsin-Madison, USA; MPI-INF, Germany; University of California at San Diego, USA; Princeton University, USA; University of Texas at Austin, USA) We study the problem of testing discrete distributions with a focus on the high probability regime. Specifically, given samples from one or more discrete distributions, a property P, and parameters 0 < є, δ < 1, we want to distinguish with probability at least 1−δ whether these distributions satisfy P or are є-far from P in total variation distance. Most prior work in distribution testing studied the constant confidence case (corresponding to δ = Ω(1)), and provided sample-optimal testers for a range of properties. While one can always boost the confidence probability of any such tester by black-box amplification, this generic boosting method typically leads to suboptimal sample bounds. Here we study the following broad question: For a given property P, can we characterize the sample complexity of testing P as a function of all relevant problem parameters, including the error probability δ? Prior to this work, uniformity testing was the only statistical task whose sample complexity had been characterized in this setting. As our main results, we provide the first algorithms for closeness and independence testing that are sample-optimal, within constant factors, as a function of all relevant parameters. We also show matching information-theoretic lower bounds on the sample complexity of these problems. Our techniques naturally extend to give optimal testers for related problems. To illustrate the generality of our methods, we give optimal algorithms for testing collections of distributions and testing closeness with unequal sized samples. @InProceedings{STOC21p542, author = {Ilias Diakonikolas and Themis Gouleakis and Daniel M. Kane and John Peebles and Eric Price}, title = {Optimal Testing of Discrete Distributions with High Probability}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {542--555}, doi = {10.1145/3406325.3450997}, year = {2021}, } Publisher's Version 
STOC '21: "Efficiently Learning Halfspaces ..."
Efficiently Learning Halfspaces with Tsybakov Noise
Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Christos Tzamos, and Nikos Zarifis (University of Wisconsin-Madison, USA; University of California at San Diego, USA) We study the problem of PAC learning homogeneous halfspaces with Tsybakov noise. In the Tsybakov noise model, the label of every example is independently flipped with an adversarially controlled probability that can be arbitrarily close to 1/2 for a fraction of the examples. We give the first polynomial-time algorithm for this fundamental learning problem. Our algorithm learns the true halfspace within any desired accuracy and succeeds under a broad family of well-behaved distributions including log-concave distributions. This extended abstract is a merge of two papers. In an earlier work, a subset of the authors developed an efficient reduction from learning to certifying the non-optimality of a candidate halfspace and gave a quasipolynomial time certificate algorithm. In a subsequent work, the authors of this paper developed a polynomial-time certificate algorithm. @InProceedings{STOC21p88, author = {Ilias Diakonikolas and Daniel M. Kane and Vasilis Kontonis and Christos Tzamos and Nikos Zarifis}, title = {Efficiently Learning Halfspaces with Tsybakov Noise}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {88--101}, doi = {10.1145/3406325.3450998}, year = {2021}, } Publisher's Version 
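To see why generic black-box amplification costs an extra log(1/δ) factor, one can compute exactly the success probability of a majority vote over independent runs of a constant-confidence tester (an illustrative sketch of ours, not the paper's testers):

```python
from math import comb

def majority_success(p, k):
    """Probability that the majority of k independent runs of a tester with
    per-run success probability p is correct (k odd)."""
    return sum(comb(k, i) * p ** i * (1 - p) ** (k - i)
               for i in range(k // 2 + 1, k + 1))

# A constant-confidence tester (delta = 1/3) amplified by majority vote:
# k = O(log 1/delta) repetitions reach confidence 1 - delta, multiplying the
# sample complexity by log(1/delta), which sample-optimal testers avoid.
for k in [1, 9, 25, 49]:
    print(k, majority_success(2 / 3, k))
```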

Kapralov, Michael 
STOC '21: "Towards Tight Bounds for Spectral ..."
Towards Tight Bounds for Spectral Sparsification of Hypergraphs
Michael Kapralov, Robert Krauthgamer, Jakab Tardos, and Yuichi Yoshida (EPFL, Switzerland; Weizmann Institute of Science, Israel; National Institute of Informatics, Japan) Cut and spectral sparsification of graphs have numerous applications, including, e.g., speeding up algorithms for cuts and Laplacian solvers. These powerful notions have recently been extended to hypergraphs, which are much richer and may offer new applications. However, the current bounds on the size of hypergraph sparsifiers are not as tight as the corresponding bounds for graphs. Our first result is a polynomial-time algorithm that, given a hypergraph on n vertices with maximum hyperedge size r, outputs an є-spectral sparsifier with O^{*}(nr) hyperedges, where O^{*} suppresses (є^{−1} log n)^{O(1)} factors. This size bound improves the two previous bounds: O^{*}(n^{3}) [Soma and Yoshida, SODA’19] and O^{*}(nr^{3}) [Bansal, Svensson and Trevisan, FOCS’19]. Our main technical tool is a new method for proving concentration of the nonlinear analogue of the quadratic form of the Laplacians for hypergraph expanders. We complement this with lower bounds on the bit complexity of any compression scheme that (1+є)-approximates all the cuts in a given hypergraph, and hence also on the bit complexity of every є-cut/spectral sparsifier. These lower bounds are based on Ruzsa–Szemerédi graphs, and a particular instantiation yields an Ω(nr) lower bound on the bit complexity even for fixed constant є. In the case of hypergraph cut sparsifiers, this is tight up to polylogarithmic factors in n, due to a recent result of [Chen, Khanna and Nagda, FOCS’20]. For spectral sparsifiers it narrows the gap to O^{*}(r). Finally, for directed hypergraphs, we present an algorithm that computes an є-spectral sparsifier with O^{*}(n^{2}r^{3}) hyperarcs, where r is the maximum size of a hyperarc. 
For small r, this improves over the O^{*}(n^{3}) bound known from [Soma and Yoshida, SODA’19], and comes close to the trivial lower bound of Ω(n^{2}) hyperarcs. @InProceedings{STOC21p598, author = {Michael Kapralov and Robert Krauthgamer and Jakab Tardos and Yuichi Yoshida}, title = {Towards Tight Bounds for Spectral Sparsification of Hypergraphs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {598--611}, doi = {10.1145/3406325.3451061}, year = {2021}, } Publisher's Version 
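The nonlinear analogue of the Laplacian quadratic form mentioned above can be made concrete on a toy example (ours, not from the paper): Q(x) = ∑_e w_e max_{u,v ∈ e}(x_u − x_v)², which on 0/1 indicator vectors recovers hypergraph cut weights, so a spectral sparsifier preserving Q is in particular a cut sparsifier.

```python
def hypergraph_quadratic_form(hyperedges, weights, x):
    """Nonlinear analogue of the graph Laplacian quadratic form:
    Q(x) = sum_e w_e * max_{u,v in e} (x_u - x_v)^2.
    For ordinary graphs (r = 2) this is the usual sum of w_e (x_u - x_v)^2."""
    return sum(w * max((x[u] - x[v]) ** 2 for u in e for v in e)
               for e, w in zip(hyperedges, weights))

# On the 0/1 indicator vector of a vertex set S, Q(x) equals the total
# weight of hyperedges cut by S (those with a vertex on each side).
edges = [(0, 1, 2), (2, 3), (1, 3, 4)]
w = [1.0, 2.0, 3.0]
S = {0, 1}
x = [1.0 if v in S else 0.0 for v in range(5)]
cut = sum(wi for e, wi in zip(edges, w)
          if any(v in S for v in e) and any(v not in S for v in e))
print(hypergraph_quadratic_form(edges, w, x), cut)
```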

Karlin, Anna R. 
STOC '21: "A (Slightly) Improved Approximation ..."
A (Slightly) Improved Approximation Algorithm for Metric TSP
Anna R. Karlin, Nathan Klein, and Shayan Oveis Gharan (University of Washington, USA) For some ε > 10^{−36} we give a randomized (3/2−ε)-approximation algorithm for metric TSP. @InProceedings{STOC21p32, author = {Anna R. Karlin and Nathan Klein and Shayan Oveis Gharan}, title = {A (Slightly) Improved Approximation Algorithm for Metric TSP}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {32--45}, doi = {10.1145/3406325.3451009}, year = {2021}, } Publisher's Version 

Kaufman, Tali 
STOC '21: "New Cosystolic Expanders from ..."
New Cosystolic Expanders from Tensors Imply Explicit Quantum LDPC Codes with Ω(√n log^{k} n) Distance
Tali Kaufman and Ran J. Tessler (Bar-Ilan University, Israel; Weizmann Institute of Science, Israel) In this work we introduce a new notion of expansion in higher dimensions that is stronger than the well-studied cosystolic expansion notion, termed collective cosystolic expansion. We show that tensoring two cosystolic expanders yields a new cosystolic expander, assuming one of the complexes in the product is not only a cosystolic expander, but a collective cosystolic expander. We then show that the well-known bounded degree cosystolic expanders, the Ramanujan complexes, are in fact collective cosystolic expanders. This enables us to construct new bounded degree cosystolic expanders by tensoring Ramanujan complexes. Using our newly constructed bounded degree cosystolic expanders we construct explicit quantum LDPC codes of distance √n log^{k} n for any k, improving a recent result of Evra et al. [FOCS 2020], and setting a new record for the distance of explicit quantum LDPC codes. The work of Evra et al. [FOCS 2020] took advantage of the high dimensional expansion notion known as cosystolic expansion, which occurs in Ramanujan complexes. Our improvement is achieved by considering tensor products of Ramanujan complexes, and using their newly derived property, collective cosystolic expansion. @InProceedings{STOC21p1317, author = {Tali Kaufman and Ran J. Tessler}, title = {New Cosystolic Expanders from Tensors Imply Explicit Quantum LDPC Codes with Ω(√<i>n</i> log<sup><i>k</i></sup> <i>n</i>) Distance}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1317--1329}, doi = {10.1145/3406325.3451029}, year = {2021}, } Publisher's Version 

Keller, Nathan 
STOC '21: "Local Concentration Inequalities ..."
Local Concentration Inequalities and Tomaszewski’s Conjecture
Nathan Keller and Ohad Klein (Bar-Ilan University, Israel) We prove Tomaszewski’s conjecture (1986): Let f:{−1,1}^{n} → ℝ be of the form f(x) = ∑_{i=1}^{n} a_{i} x_{i}. Then Pr[|f(x)| ≤ √Var[f]] ≥ 1/2. Our main novel tools are local concentration inequalities and an improved Berry-Esseen inequality for first-degree functions on the discrete cube. These tools are of independent interest, and may be useful in the study of linear threshold functions and of low degree Boolean functions. @InProceedings{STOC21p1656, author = {Nathan Keller and Ohad Klein}, title = {Local Concentration Inequalities and Tomaszewski’s Conjecture}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1656--1669}, doi = {10.1145/3406325.3451011}, year = {2021}, } Publisher's Version 
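For small n the inequality can be checked exactly by enumerating all 2^n sign vectors; a sketch (ours, not from the paper) that also exhibits the extremal case a = (1, 1), where the probability is exactly 1/2:

```python
from itertools import product
from math import sqrt

def tomaszewski_probability(a):
    """Exact Pr over x in {-1,1}^n that |sum a_i x_i| <= sqrt(Var f),
    where Var f = sum a_i^2. The conjecture (now theorem) says >= 1/2."""
    sigma = sqrt(sum(ai * ai for ai in a))
    good = sum(1 for x in product((-1, 1), repeat=len(a))
               if abs(sum(ai * xi for ai, xi in zip(a, x))) <= sigma)
    return good / 2 ** len(a)

# The bound 1/2 is attained, e.g., by a = (1, 1): |x_1 + x_2| <= sqrt(2)
# holds for exactly half of the four sign patterns.
for a in [(1,), (1, 1), (1, 1, 1), (3, 1, 1, 1), (1, 2, 3, 4, 5)]:
    print(a, tomaszewski_probability(a))
```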

Kelley, Zander 
STOC '21: "An Improved Derandomization ..."
An Improved Derandomization of the Switching Lemma
Zander Kelley (University of Illinois at Urbana-Champaign, USA) We prove a new derandomization of Håstad’s switching lemma, showing how to efficiently generate restrictions satisfying the switching lemma for DNF or CNF formulas of size m using only O(log m) random bits. Derandomizations of the switching lemma have been useful in many works as a key building block for constructing objects which are in some way provably pseudorandom with respect to AC^{0} circuits. Here, we use our new derandomization to give an improved analysis of the pseudorandom generator of Trevisan and Xue for AC^{0} circuits (CCC’13): we show that the generator ε-fools size-m, depth-D circuits with n-bit inputs using only O(log(m/ε)^{D} · log n) random bits. In particular, we obtain (modulo the log log factors hidden in the O-notation) a dependence on m/ε which is best possible with respect to currently known AC^{0} circuit lower bounds. @InProceedings{STOC21p272, author = {Zander Kelley}, title = {An Improved Derandomization of the Switching Lemma}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {272--282}, doi = {10.1145/3406325.3451054}, year = {2021}, } Publisher's Version 
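A random restriction in the sense of the switching lemma is easy to sketch (toy code, ours; the paper's point is generating such restrictions from only O(log m) random bits rather than fully at random):

```python
import random

def apply_restriction(dnf, p, rng):
    """Apply a random restriction: each variable stays free with probability p,
    otherwise it is fixed to a uniform bit. Returns the simplified DNF, or
    True/False if the formula collapses to a constant.
    A term is a dict {variable: required_bit}."""
    rho = {}
    for v in {v for term in dnf for v in term}:
        if rng.random() >= p:
            rho[v] = rng.randrange(2)
    new_dnf = []
    for term in dnf:
        if any(rho[v] != b for v, b in term.items() if v in rho):
            continue  # some literal falsified: term killed by the restriction
        live = {v: b for v, b in term.items() if v not in rho}
        if not live:
            return True  # some term fully satisfied: formula is constant 1
        new_dnf.append(live)
    return new_dnf if new_dnf else False

# Fraction of restrictions under which a width-3 DNF collapses to a constant;
# the switching lemma bounds the decision-tree depth of the survivors.
rng = random.Random(1)
dnf = [{0: 1, 1: 1, 2: 0}, {1: 0, 3: 1, 4: 1}, {0: 0, 2: 1, 5: 1}]
collapsed = sum(1 for _ in range(2000)
                if not isinstance(apply_restriction(dnf, 0.2, rng), list))
print(collapsed / 2000)
```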

Khot, Subhash 
STOC '21: "Optimal Inapproximability ..."
Optimal Inapproximability of Satisfiable k-LIN over Non-Abelian Groups
Amey Bhangale and Subhash Khot (University of California at Riverside, USA; New York University, USA) A seminal result of Håstad (2001) shows that it is NP-hard to find an assignment that satisfies a 1/|G|+ε fraction of the constraints of a given k-LIN instance over an abelian group G, even if there is an assignment that satisfies a (1−ε) fraction of the constraints, for any constant ε>0. Engebretsen, Holmerin and Russell (2004) later showed that the same hardness result holds for k-LIN instances over any finite non-abelian group. Unlike the abelian case, where we can efficiently find a solution if the instance is satisfiable, in the non-abelian case it is NP-complete to decide if a given system of linear equations is satisfiable or not, as shown by Goldmann and Russell (1999). Surprisingly, for certain non-abelian groups G, given a satisfiable k-LIN instance over G, one can in fact do better than just outputting a random assignment, using a simple but clever algorithm. The approximation factor achieved by this algorithm varies with the underlying group. In this paper, we show that this algorithm is optimal by proving a tight hardness of approximation for satisfiable k-LIN instances over any non-abelian G, assuming P≠NP. As a corollary, we also get 3-query probabilistically checkable proofs with perfect completeness over large alphabets with improved soundness. Our proof crucially uses the quasirandom properties of the non-abelian groups defined by Gowers (2008). @InProceedings{STOC21p1615, author = {Amey Bhangale and Subhash Khot}, title = {Optimal Inapproximability of Satisfiable k-LIN over Non-Abelian Groups}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1615--1628}, doi = {10.1145/3406325.3451003}, year = {2021}, } Publisher's Version 
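The 1/|G| baseline is simply the fact that a uniformly random assignment satisfies a single equation x·y = c with probability exactly 1/|G|; a quick check (ours) over the smallest non-abelian group, S3:

```python
from itertools import permutations, product

def compose(p, q):
    """Composition (p after q) of permutations given as tuples."""
    return tuple(p[q[i]] for i in range(len(q)))

S3 = list(permutations(range(3)))  # the symmetric group on 3 elements

# For the equation x * y = c over a group G, each choice of x leaves exactly
# one satisfying y = x^{-1} c, so a uniformly random assignment satisfies
# the equation with probability exactly 1/|G| = 1/6 here.
c = S3[3]
satisfied = sum(1 for x, y in product(S3, repeat=2) if compose(x, y) == c)
print(satisfied / len(S3) ** 2)
```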

Khurana, Dakshita 
STOC '21: "SNARGs for Bounded Depth Computations ..."
SNARGs for Bounded Depth Computations and PPAD Hardness from Sub-Exponential LWE
Ruta Jawale, Yael Tauman Kalai, Dakshita Khurana, and Rachel Zhang (University of Illinois at Urbana-Champaign, USA; Microsoft Research, USA; Massachusetts Institute of Technology, USA) We construct a succinct non-interactive publicly-verifiable delegation scheme for any log-space uniform circuit under the sub-exponential Learning With Errors (LWE) assumption. For a circuit C:{0,1}^{N}→{0,1} of size S and depth D, the prover runs in time poly(S), the communication complexity is D · polylog(S), and the verifier runs in time (D+N) · polylog(S). To obtain this result, we introduce a new cryptographic primitive: a lossy correlation-intractable hash function family. We use this primitive to soundly instantiate the Fiat-Shamir transform for a large class of interactive proofs, including the interactive sumcheck protocol and the GKR protocol, assuming the sub-exponential hardness of LWE. Additionally, by relying on the result of Choudhuri et al. (STOC 2019), we establish (sub-exponential) average-case hardness of PPAD, assuming the sub-exponential hardness of LWE. @InProceedings{STOC21p708, author = {Ruta Jawale and Yael Tauman Kalai and Dakshita Khurana and Rachel Zhang}, title = {SNARGs for Bounded Depth Computations and PPAD Hardness from Sub-Exponential LWE}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {708--721}, doi = {10.1145/3406325.3451055}, year = {2021}, } Publisher's Version 

Kim, Isaac H. 
STOC '21: "The Ghost in the Radiation: ..."
The Ghost in the Radiation: Robust Encodings of the Black Hole Interior (Invited Paper)
Isaac H. Kim, Eugene Tang, and John Preskill (University of Sydney, Australia; California Institute of Technology, USA) We reconsider the black hole firewall puzzle, emphasizing that quantum error correction, computational complexity, and pseudorandomness are crucial concepts for understanding the black hole interior. We assume that the Hawking radiation emitted by an old black hole is pseudorandom, meaning that it cannot be distinguished from a perfectly thermal state by any efficient quantum computation acting on the radiation alone. We then infer the existence of a subspace of the radiation system which we interpret as an encoding of the black hole interior. This encoded interior is entangled with the late outgoing Hawking quanta emitted by the old black hole, and is inaccessible to computationally bounded observers who are outside the black hole. Specifically, efficient operations acting on the radiation, those with quantum computational complexity polynomial in the entropy of the remaining black hole, commute with a complete set of logical operators acting on the encoded interior, up to corrections which are exponentially small in the entropy. Thus, under our pseudorandomness assumption, the black hole interior is well protected from exterior observers as long as the remaining black hole is macroscopic. On the other hand, if the radiation is not pseudorandom, an exterior observer may be able to create a firewall by applying a polynomial-time quantum computation to the radiation. @InProceedings{STOC21p8, author = {Isaac H. Kim and Eugene Tang and John Preskill}, title = {The Ghost in the Radiation: Robust Encodings of the Black Hole Interior (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {8--8}, doi = {10.1145/3406325.3465357}, year = {2021}, } Publisher's Version 

Kim, Michael P. 
STOC '21: "Outcome Indistinguishability ..."
Outcome Indistinguishability
Cynthia Dwork, Michael P. Kim, Omer Reingold, Guy N. Rothblum, and Gal Yona (Harvard University, USA; University of California at Berkeley, USA; Stanford University, USA; Weizmann Institute of Science, Israel) Prediction algorithms assign numbers to individuals that are popularly understood as individual “probabilities”—what is the probability of 5-year survival after cancer diagnosis?—and which increasingly form the basis for life-altering decisions. Drawing on an understanding of computational indistinguishability developed in complexity theory and cryptography, we introduce Outcome Indistinguishability. Predictors that are Outcome Indistinguishable (OI) yield a generative model for outcomes that cannot be efficiently refuted on the basis of the real-life observations produced by the predictor. We investigate a hierarchy of OI definitions, whose stringency increases with the degree to which distinguishers may access the predictor in question. Our findings reveal that OI behaves qualitatively differently than previously studied notions of indistinguishability. First, we provide constructions at all levels of the hierarchy. Then, leveraging recently-developed machinery for proving average-case fine-grained hardness, we obtain lower bounds on the complexity of the more stringent forms of OI. This hardness result provides the first scientific grounds for the political argument that, when inspecting algorithmic risk prediction instruments, auditors should be granted oracle access to the algorithm, not simply historical predictions. @InProceedings{STOC21p1095, author = {Cynthia Dwork and Michael P. Kim and Omer Reingold and Guy N. Rothblum and Gal Yona}, title = {Outcome Indistinguishability}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1095--1108}, doi = {10.1145/3406325.3451064}, year = {2021}, } Publisher's Version 

Kingsford, Carl 
STOC '21: "How Much Data Is Sufficient ..."
How Much Data Is Sufficient to Learn High-Performing Algorithms? Generalization Guarantees for Data-Driven Algorithm Design
Maria-Florina Balcan, Dan DeBlasio, Travis Dick, Carl Kingsford, Tuomas Sandholm, and Ellen Vitercik (Carnegie Mellon University, USA; University of Texas at El Paso, USA; University of Pennsylvania, USA) Algorithms often have tunable parameters that impact performance metrics such as runtime and solution quality. For many algorithms used in practice, no parameter settings admit meaningful worst-case bounds, so the parameters are made available for the user to tune. Alternatively, parameters may be tuned implicitly within the proof of a worst-case guarantee. Worst-case instances, however, may be rare or nonexistent in practice. A growing body of research has demonstrated that data-driven algorithm design can lead to significant improvements in performance. This approach uses a training set of problem instances sampled from an unknown, application-specific distribution and returns a parameter setting with strong average performance on the training set. We provide a broadly applicable theory for deriving generalization guarantees that bound the difference between the algorithm’s average performance over the training set and its expected performance on the unknown distribution. Our results apply no matter how the parameters are tuned, be it via an automated or manual approach. The challenge is that for many types of algorithms, performance is a volatile function of the parameters: slightly perturbing the parameters can cause a large change in behavior. Prior research (e.g., Gupta and Roughgarden, SICOMP’17; Balcan et al., COLT’17, ICML’18, EC’18) has proved generalization bounds by employing case-by-case analyses of greedy algorithms, clustering algorithms, integer programming algorithms, and selling mechanisms. We uncover a unifying structure which we use to prove extremely general guarantees, yet we recover the bounds from prior research. Our guarantees, which are tight up to logarithmic factors in the worst case, apply whenever an algorithm’s performance is a piecewise-constant, linear, or—more generally—piecewise-structured function of its parameters. Our theory also implies novel bounds for voting mechanisms and dynamic programming algorithms from computational biology. @InProceedings{STOC21p919, author = {Maria-Florina Balcan and Dan DeBlasio and Travis Dick and Carl Kingsford and Tuomas Sandholm and Ellen Vitercik}, title = {How Much Data Is Sufficient to Learn High-Performing Algorithms? Generalization Guarantees for Data-Driven Algorithm Design}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {919--932}, doi = {10.1145/3406325.3451036}, year = {2021}, } Publisher's Version 

Klein, Nathan 
STOC '21: "A (Slightly) Improved Approximation ..."
A (Slightly) Improved Approximation Algorithm for Metric TSP
Anna R. Karlin, Nathan Klein, and Shayan Oveis Gharan (University of Washington, USA) For some ε > 10^{−36} we give a randomized (3/2−ε)-approximation algorithm for metric TSP. @InProceedings{STOC21p32, author = {Anna R. Karlin and Nathan Klein and Shayan Oveis Gharan}, title = {A (Slightly) Improved Approximation Algorithm for Metric TSP}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {32--45}, doi = {10.1145/3406325.3451009}, year = {2021}, } Publisher's Version 

Klein, Ohad 
STOC '21: "Local Concentration Inequalities ..."
Local Concentration Inequalities and Tomaszewski’s Conjecture
Nathan Keller and Ohad Klein (Bar-Ilan University, Israel) We prove Tomaszewski’s conjecture (1986): Let f:{−1,1}^{n} → ℝ be of the form f(x) = ∑_{i=1}^{n} a_{i} x_{i}. Then Pr[|f(x)| ≤ √Var[f]] ≥ 1/2. Our main novel tools are local concentration inequalities and an improved Berry-Esseen inequality for first-degree functions on the discrete cube. These tools are of independent interest, and may be useful in the study of linear threshold functions and of low degree Boolean functions. @InProceedings{STOC21p1656, author = {Nathan Keller and Ohad Klein}, title = {Local Concentration Inequalities and Tomaszewski’s Conjecture}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1656--1669}, doi = {10.1145/3406325.3451011}, year = {2021}, } Publisher's Version 

Klein, Philip N. 
STOC '21: "A Quasipolynomial (2 + ε)-Approximation ..."
A Quasipolynomial (2 + ε)-Approximation for Planar Sparsest Cut
Vincent Cohen-Addad, Anupam Gupta, Philip N. Klein, and Jason Li (Google, Switzerland; Carnegie Mellon University, USA; Brown University, USA) The (non-uniform) sparsest cut problem is the following graph-partitioning problem: given a “supply” graph, and demands on pairs of vertices, delete some subset of supply edges to minimize the ratio of the supply edges cut to the total demand of the pairs separated by this deletion. Despite much effort, there are only a handful of nontrivial classes of supply graphs for which constant-factor approximations are known. We consider the problem for planar graphs, and give a (2+ε)-approximation algorithm that runs in quasipolynomial time. Our approach defines a new structural decomposition of an optimal solution using a “patching” primitive. We combine this decomposition with a Sherali-Adams-style linear programming relaxation of the problem, which we then round. This should be compared with the polynomial-time approximation algorithm of Rao (1999), which uses the metric linear programming relaxation and ℓ_{1}-embeddings, and achieves an O(√log n)-approximation in polynomial time. @InProceedings{STOC21p1056, author = {Vincent Cohen-Addad and Anupam Gupta and Philip N. Klein and Jason Li}, title = {A Quasipolynomial (2 + <i>ε</i>)-Approximation for Planar Sparsest Cut}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1056--1069}, doi = {10.1145/3406325.3451103}, year = {2021}, } Publisher's Version 
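For intuition about the objective, the non-uniform sparsest cut of a tiny instance can be computed by brute force over bipartitions (illustrative code, ours; the paper's algorithm is of course not enumeration):

```python
from itertools import combinations

def sparsest_cut(n, supply, demands):
    """Brute-force non-uniform sparsest cut: over all bipartitions (S, V-S),
    minimize (weight of supply edges cut) / (demand separated).
    Exponential in n; the paper gives a quasipolynomial-time
    (2+eps)-approximation for planar supply graphs."""
    best = float("inf")
    for k in range(1, n // 2 + 1):  # complements give the same ratio
        for S in combinations(range(n), k):
            S = set(S)
            cut = sum(w for (u, v), w in supply.items() if (u in S) != (v in S))
            dem = sum(w for (u, v), w in demands.items() if (u in S) != (v in S))
            if dem > 0:
                best = min(best, cut / dem)
    return best

# 4-cycle supply graph with unit demand between every pair of vertices:
# S = {0, 1} cuts 2 supply edges and separates 4 demand pairs, ratio 1/2.
supply = {(0, 1): 1, (1, 2): 1, (2, 3): 1, (3, 0): 1}
demands = {(u, v): 1 for u, v in combinations(range(4), 2)}
print(sparsest_cut(4, supply, demands))
```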

Kleinberg, Jon 
STOC '21: "Simplicity Creates Inequity: ..."
Simplicity Creates Inequity: Implications for Fairness, Stereotypes, and Interpretability (Invited Paper)
Jon Kleinberg and Sendhil Mullainathan (Cornell University, USA; University of Chicago, USA) Algorithms are increasingly used to aid, or in some cases supplant, human decision-making, particularly for decisions that hinge on predictions. As a result, two features in addition to prediction quality have generated interest: (i) to facilitate human interaction and understanding with these algorithms, we desire prediction functions that are in some fashion simple or interpretable; and (ii) because they influence consequential decisions, we also want them to produce equitable allocations. We develop a formal model to explore the relationship between the demands of simplicity and equity. Although the two concepts appear to be motivated by qualitatively distinct goals, we show a fundamental inconsistency between them. Specifically, we formalize a general framework for producing simple prediction functions, and in this framework we establish two basic results. First, every simple prediction function is strictly improvable: there exists a more complex prediction function that is both strictly more efficient and also strictly more equitable. Put another way, using a simple prediction function both reduces utility for disadvantaged groups and reduces overall welfare relative to other options. Second, we show that simple prediction functions necessarily create incentives to use information about individuals' membership in a disadvantaged group: incentives that weren't present before simplification, and that work against these individuals. Thus, simplicity transforms disadvantage into bias against the disadvantaged group. Our results are not only about algorithms but about any process that produces simple models, and as such they connect to the psychology of stereotypes and to an earlier economics literature on statistical discrimination. 
@InProceedings{STOC21p7, author = {Jon Kleinberg and Sendhil Mullainathan}, title = {Simplicity Creates Inequity: Implications for Fairness, Stereotypes, and Interpretability (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {7--7}, doi = {10.1145/3406325.3465356}, year = {2021}, } Publisher's Version 

Knop, Alexander 
STOC '21: "Statistical Query Complexity ..."
Statistical Query Complexity of Manifold Estimation
Eddie Aamari and Alexander Knop (LPSM, France; Sorbonne University, France; University of Paris, France; CNRS, France; University of California at San Diego, USA) This paper studies the statistical query (SQ) complexity of estimating d-dimensional submanifolds in ℝ^{n}. We propose a purely geometric algorithm called Manifold Propagation that reduces the problem to three natural geometric routines: projection, tangent space estimation, and point detection. We then provide constructions of these geometric routines in the SQ framework. Given an adversarial STAT(τ) oracle and a target Hausdorff distance precision ε = Ω(τ^{2/(d+1)}), the resulting SQ manifold reconstruction algorithm has query complexity O(n polylog(n) ε^{−d/2}), which is proved to be nearly optimal. In the process, we establish low-rank matrix completion results for SQs and lower bounds for randomized SQ estimators in general metric spaces. @InProceedings{STOC21p116, author = {Eddie Aamari and Alexander Knop}, title = {Statistical Query Complexity of Manifold Estimation}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {116--122}, doi = {10.1145/3406325.3451135}, year = {2021}, } Publisher's Version STOC '21: "Log-Rank and Lifting for AND-Functions ..." Log-Rank and Lifting for AND-Functions Alexander Knop, Shachar Lovett, Sam McGuire, and Weiqiang Yuan (University of California at San Diego, USA; Tsinghua University, China) Let f: {0, 1}^{n} → {0, 1} be a boolean function, and let f_{∧}(x, y) = f(x ∧ y) denote the AND-function of f, where x ∧ y denotes bitwise AND. We study the deterministic communication complexity of f_{∧} and show that, up to a log n factor, it is bounded by a polynomial in the logarithm of the real rank of the communication matrix of f_{∧}. This comes within a log n factor of establishing the log-rank conjecture for AND-functions with no assumptions on f. 
Our result stands in contrast with previous results on special cases of the log-rank conjecture, which needed significant restrictions on f such as monotonicity or low F_{2}-degree. Our techniques can also be used to prove (within a log n factor) a lifting theorem for AND-functions, stating that the deterministic communication complexity of f_{∧} is polynomially related to the AND-decision tree complexity of f. The results rely on a new structural result regarding boolean functions f: {0, 1}^{n} → {0, 1} with a sparse polynomial representation, which may be of independent interest. We show that if the polynomial computing f has few monomials, then the set system of the monomials has a small hitting set, of size polylogarithmic in its sparsity. We also establish extensions of this result to multilinear polynomials with a larger range. @InProceedings{STOC21p197, author = {Alexander Knop and Shachar Lovett and Sam McGuire and Weiqiang Yuan}, title = {Log-Rank and Lifting for AND-Functions}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {197--208}, doi = {10.1145/3406325.3450999}, year = {2021}, } Publisher's Version 

Knudsen, Jakob Bæk Tejs 
STOC '21: "Load Balancing with Dynamic ..."
Load Balancing with Dynamic Set of Balls and Bins
Anders Aamand, Jakob Bæk Tejs Knudsen, and Mikkel Thorup (University of Copenhagen, Denmark) In dynamic load balancing, we wish to distribute balls into bins in an environment where both balls and bins can be added and removed. We want to minimize the maximum load of any bin, but we also want to minimize the number of balls and bins that are affected when adding or removing a ball or a bin. We want a hashing-style solution where, given the ID of a ball, we can find its bin efficiently. We are given a user-specified balancing parameter c=1+ε, where ε∈(0,1). Let n and m be the current number of balls and bins. Then we want no bin with load above C=⌈cn/m⌉, referred to as the capacity of the bins. We present a scheme where we can locate a ball checking 1+O(log(1/ε)) bins in expectation. When inserting or deleting a ball, we expect to move O(1/ε) balls, and when inserting or deleting a bin, we expect to move O(C/ε) balls. Previous bounds were off by a factor 1/ε. The above bounds are best possible when C=O(1), but for larger C, we can do much better: We define f=εC when C≤ log(1/ε), f=ε√C·√(log(1/(ε√C))) when log(1/ε)≤ C<1/(2ε^{2}), and f=1 when C≥ 1/(2ε^{2}). We show that we expect to move O(1/f) balls when inserting or deleting a ball, and O(C/f) balls when inserting or deleting a bin. Moreover, when C≥ log(1/ε), we can search a ball checking only O(1) bins in expectation. For the bounds with larger C, we first have to resolve a much simpler probabilistic problem. Place n balls in m bins of capacity C, one ball at a time. Each ball picks a uniformly random non-full bin. We show that in expectation and with high probability, the fraction of non-full bins is Θ(f). Then the expected number of bins that a new ball would have to visit to find one that is not full is Θ(1/f). As it turns out, this is also the complexity of an insertion in our more complicated scheme where both balls and bins can be added and removed. 
@InProceedings{STOC21p1262, author = {Anders Aamand and Jakob Bæk Tejs Knudsen and Mikkel Thorup}, title = {Load Balancing with Dynamic Set of Balls and Bins}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1262--1275}, doi = {10.1145/3406325.3451107}, year = {2021}, } Publisher's Version 
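The simpler probabilistic process from the abstract, placing balls one at a time into uniformly random non-full bins, is easy to simulate directly. The sketch below is illustrative only; it is not the authors' hashing scheme, and the function name and rejection-sampling loop are ours.

```python
import random

def fraction_nonfull(n, m, C, seed=0):
    """Place n balls into m bins of capacity C, one at a time.

    Each ball repeatedly probes a uniformly random bin until it
    finds one that is not full (requires n <= m * C). Returns the
    final fraction of non-full bins and the average number of
    probes per ball.
    """
    rng = random.Random(seed)
    load = [0] * m
    probes = 0
    for _ in range(n):
        while True:
            probes += 1
            b = rng.randrange(m)
            if load[b] < C:
                load[b] += 1
                break
    nonfull = sum(1 for x in load if x < C) / m
    return nonfull, probes / n
```

For example, with m=1000 bins of capacity C=20 and n=18000 balls (load factor 0.9), the final fraction of non-full bins stays bounded away from zero, and the average probe count per ball is its reciprocal in expectation.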

Kociumaka, Tomasz 
STOC '21: "Improved Dynamic Algorithms ..."
Improved Dynamic Algorithms for Longest Increasing Subsequence
Tomasz Kociumaka and Saeed Seddighin (University of California at Berkeley, USA; Toyota Technological Institute at Chicago, USA) We study dynamic algorithms for the longest increasing subsequence (LIS) problem. A dynamic LIS algorithm maintains a sequence subject to operations of the following form arriving one by one: insert an element, delete an element, or substitute an element for another. After each update, the algorithm must report the length of the longest increasing subsequence of the current sequence. Our main contribution is the first exact dynamic LIS algorithm with sublinear update time. More precisely, we present a randomized algorithm that performs each operation in time Õ(n^{4/5}) and, after each update, reports the answer to the LIS problem correctly with high probability. We use several novel techniques and observations for this algorithm that may find applications in future work. In the second part of the paper, we study approximate dynamic LIS algorithms, which are allowed to underestimate the solution size within a bounded multiplicative factor. In this setting, we give a deterministic (1−o(1))-approximation algorithm with update time O(n^{o(1)}). This result improves upon the previous work of Mitzenmacher and Seddighin (STOC’20), which provides an Ω(ε^{O(1/ε)})-approximation algorithm with update time Õ(n^{ε}) for any ε > 0. @InProceedings{STOC21p640, author = {Tomasz Kociumaka and Saeed Seddighin}, title = {Improved Dynamic Algorithms for Longest Increasing Subsequence}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {640--653}, doi = {10.1145/3406325.3451026}, year = {2021}, } Publisher's Version 
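For reference, the static quantity that a dynamic LIS algorithm must maintain can be computed from scratch in O(n log n) time by the classical patience-sorting method. This is the textbook static routine, not the paper's dynamic data structure:

```python
import bisect

def lis_length(seq):
    """Length of the longest strictly increasing subsequence.

    Patience sorting: tails[k] holds the smallest possible tail
    value of an increasing subsequence of length k+1. Each element
    either extends the longest subsequence or improves a tail.
    """
    tails = []
    for x in seq:
        i = bisect.bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)   # extends the current longest LIS
        else:
            tails[i] = x      # smaller tail for length i+1
    return len(tails)
```

A dynamic algorithm must keep this value current under insertions, deletions, and substitutions without recomputing from scratch, which is what makes sublinear update time nontrivial.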

Kol, Gillat 
STOC '21: "Optimal Error Resilience of ..."
Optimal Error Resilience of Adaptive Message Exchange
Klim Efremenko, Gillat Kol, and Raghuvansh R. Saxena (Ben-Gurion University of the Negev, Israel; Princeton University, USA) We study the error resilience of the message exchange task: Two parties, each holding a private input, want to exchange their inputs. However, the channel connecting them is governed by an adversary that may corrupt a constant fraction of the transmissions. What is the maximum fraction of corruptions that still allows the parties to exchange their inputs? For the non-adaptive channel, where the parties must agree in advance on the order in which they communicate, the maximum error resilience was shown to be 1/4 (see Braverman and Rao, STOC 2011). The problem was also studied over the adaptive channel, where the order in which the parties communicate may not be predetermined (Ghaffari, Haeupler, and Sudan, STOC 2014; Efremenko, Kol, and Saxena, STOC 2020). These works show that the adaptive channel admits a much richer set of protocols but leave open the question of finding its maximum error resilience. In this work, we show that the maximum error resilience of a protocol for message exchange over the adaptive channel is 5/16, thereby settling the above question. Our result requires improving both the known upper bounds and the known lower bounds for the problem. @InProceedings{STOC21p1235, author = {Klim Efremenko and Gillat Kol and Raghuvansh R. Saxena}, title = {Optimal Error Resilience of Adaptive Message Exchange}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1235--1247}, doi = {10.1145/3406325.3451077}, year = {2021}, } Publisher's Version STOC '21: "Almost Optimal Super-Constant-Pass ..." Almost Optimal Super-Constant-Pass Streaming Lower Bounds for Reachability Lijie Chen, Gillat Kol, Dmitry Paramonov, Raghuvansh R. Saxena, Zhao Song, and Huacheng Yu (Massachusetts Institute of Technology, USA; Princeton University, USA; Institute for Advanced Study at Princeton, USA) We give an almost quadratic n^{2−o(1)} lower bound on the space consumption of any o(√(log n))-pass streaming algorithm solving the (directed) s-t reachability problem. This means that any such algorithm must essentially store the entire graph. As corollaries, we obtain almost quadratic space lower bounds for additional fundamental problems, including maximum matching, shortest path, matrix rank, and linear programming. Our main technical contribution is the definition and construction of set hiding graphs, which may be of independent interest: we give a general way of encoding a set S ⊆ [k] as a directed graph with n = k^{1+o(1)} vertices, such that deciding whether i ∈ S boils down to deciding if t_{i} is reachable from s_{i}, for a specific pair of vertices (s_{i}, t_{i}) in the graph. Furthermore, we prove that our graph “hides” S, in the sense that no low-space streaming algorithm with a small number of passes can learn (almost) anything about S. @InProceedings{STOC21p570, author = {Lijie Chen and Gillat Kol and Dmitry Paramonov and Raghuvansh R. Saxena and Zhao Song and Huacheng Yu}, title = {Almost Optimal Super-Constant-Pass Streaming Lower Bounds for Reachability}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {570--583}, doi = {10.1145/3406325.3451038}, year = {2021}, } Publisher's Version 

Kontonis, Vasilis 
STOC '21: "Efficiently Learning Halfspaces ..."
Efficiently Learning Halfspaces with Tsybakov Noise
Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Christos Tzamos, and Nikos Zarifis (University of Wisconsin-Madison, USA; University of California at San Diego, USA) We study the problem of PAC learning homogeneous halfspaces with Tsybakov noise. In the Tsybakov noise model, the label of every example is independently flipped with an adversarially controlled probability that can be arbitrarily close to 1/2 for a fraction of the examples. We give the first polynomial-time algorithm for this fundamental learning problem. Our algorithm learns the true halfspace within any desired accuracy and succeeds under a broad family of well-behaved distributions, including log-concave distributions. This extended abstract is a merge of two papers. In an earlier work, a subset of the authors developed an efficient reduction from learning to certifying the non-optimality of a candidate halfspace and gave a quasi-polynomial-time certificate algorithm. In a subsequent work, the authors of this paper developed a polynomial-time certificate algorithm. @InProceedings{STOC21p88, author = {Ilias Diakonikolas and Daniel M. Kane and Vasilis Kontonis and Christos Tzamos and Nikos Zarifis}, title = {Efficiently Learning Halfspaces with Tsybakov Noise}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {88--101}, doi = {10.1145/3406325.3450998}, year = {2021}, } Publisher's Version 

Kothari, Pravesh K. 
STOC '21: "Playing Unique Games on Certified ..."
Playing Unique Games on Certified Small-Set Expanders
Mitali Bafna, Boaz Barak, Pravesh K. Kothari, Tselil Schramm, and David Steurer (Harvard University, USA; Carnegie Mellon University, USA; Stanford University, USA; ETH Zurich, Switzerland) We give an algorithm for solving unique games (UG) instances whenever low-degree sum-of-squares proofs certify good bounds on the small-set expansion of the underlying constraint graph via a hypercontractive inequality. Our algorithm is in fact more versatile, and succeeds even when the constraint graph is not a small-set expander as long as the structure of non-expanding small sets is (informally speaking) “characterized” by a low-degree sum-of-squares proof. Our results are obtained by rounding low-entropy solutions, measured via a new global potential function, to sum-of-squares (SoS) semidefinite programs. This technique adds to the (currently short) list of general tools for analyzing SoS relaxations for worst-case optimization problems. As corollaries, we obtain the first polynomial-time algorithms for solving any UG instance where the constraint graph is either the noisy hypercube, the short code, or the Johnson graph. The prior best algorithm for such instances was the eigenvalue enumeration algorithm of Arora, Barak, and Steurer (2010), which requires quasi-polynomial time for the noisy hypercube and nearly exponential time for the short code and Johnson graphs. All of our results achieve an approximation of 1−ε vs. δ for UG instances, where ε>0 and δ>0 depend on the expansion parameters of the graph but are independent of the alphabet size. @InProceedings{STOC21p1629, author = {Mitali Bafna and Boaz Barak and Pravesh K. Kothari and Tselil Schramm and David Steurer}, title = {Playing Unique Games on Certified Small-Set Expanders}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1629--1642}, doi = {10.1145/3406325.3451099}, year = {2021}, } Publisher's Version 

Kothari, Robin 
STOC '21: "Degree vs. Approximate Degree ..."
Degree vs. Approximate Degree and Quantum Implications of Huang’s Sensitivity Theorem
Scott Aaronson, Shalev Ben-David, Robin Kothari, Shravas Rao, and Avishay Tal (University of Texas at Austin, USA; University of Waterloo, Canada; Microsoft Quantum, USA; Microsoft Research, USA; Northwestern University, USA; University of California at Berkeley, USA) Based on the recent breakthrough of Huang (2019), we show that for any total Boolean function f: deg(f) = O(adeg(f)^{2}), i.e., the degree of f is at most quadratic in the approximate degree of f, which is optimal as witnessed by the OR function; and D(f) = O(Q(f)^{4}), i.e., the deterministic query complexity of f is at most quartic in the quantum query complexity of f, which matches the known separation (up to log factors) due to Ambainis, Balodis, Belovs, Lee, Santha, and Smotrovs (2017). We apply these results to resolve the quantum analogue of the Aanderaa–Karp–Rosenberg conjecture. We show that if f is a nontrivial monotone graph property of an n-vertex graph specified by its adjacency matrix, then Q(f)=Ω(n), which is also optimal. We also show that the approximate degree of any read-once formula on n variables is Θ(√n). @InProceedings{STOC21p1330, author = {Scott Aaronson and Shalev Ben-David and Robin Kothari and Shravas Rao and Avishay Tal}, title = {Degree vs. Approximate Degree and Quantum Implications of Huang’s Sensitivity Theorem}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1330--1342}, doi = {10.1145/3406325.3451047}, year = {2021}, } Publisher's Version 
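The degree deg(f) in the first result is the degree of the unique multilinear polynomial representing f over the reals; for small n it can be computed by brute force via the Möbius transform of f. A minimal sketch (exponential in n, for illustration only; the function name is ours):

```python
from itertools import combinations

def mobius_degree(f, n):
    """Real degree of f: {0,1}^n -> {0,1}.

    The coefficient of the monomial prod_{i in S} x_i in the unique
    multilinear representation is the Mobius transform
    c_S = sum_{T subseteq S} (-1)^{|S|-|T|} f(1_T); the degree is
    the largest |S| with c_S != 0. f takes a set of indices in
    {0,...,n-1} (the coordinates set to 1).
    """
    def subsets(S):
        for r in range(len(S) + 1):
            yield from combinations(S, r)

    deg = 0
    for S in subsets(tuple(range(n))):
        c = sum((-1) ** (len(S) - len(T)) * f(set(T)) for T in subsets(S))
        if c != 0:
            deg = max(deg, len(S))
    return deg
```

For instance, OR on n bits has full degree n (its top Möbius coefficient is nonzero), while its approximate degree is only Θ(√n), which is exactly the quadratic gap the first result shows is worst possible.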

Krauthgamer, Robert 
STOC '21: "Towards Tight Bounds for Spectral ..."
Towards Tight Bounds for Spectral Sparsification of Hypergraphs
Michael Kapralov, Robert Krauthgamer, Jakab Tardos, and Yuichi Yoshida (EPFL, Switzerland; Weizmann Institute of Science, Israel; National Institute of Informatics, Japan) Cut and spectral sparsification of graphs have numerous applications, including e.g. speeding up algorithms for cuts and Laplacian solvers. These powerful notions have recently been extended to hypergraphs, which are much richer and may offer new applications. However, the current bounds on the size of hypergraph sparsifiers are not as tight as the corresponding bounds for graphs. Our first result is a polynomial-time algorithm that, given a hypergraph on n vertices with maximum hyperedge size r, outputs an ε-spectral sparsifier with O^{*}(nr) hyperedges, where O^{*} suppresses (ε^{−1} log n)^{O(1)} factors. This size bound improves the two previous bounds: O^{*}(n^{3}) [Soma and Yoshida, SODA’19] and O^{*}(nr^{3}) [Bansal, Svensson and Trevisan, FOCS’19]. Our main technical tool is a new method for proving concentration of the nonlinear analogue of the quadratic form of the Laplacians for hypergraph expanders. We complement this with lower bounds on the bit complexity of any compression scheme that (1+ε)-approximates all the cuts in a given hypergraph, and hence also on the bit complexity of every ε-cut/spectral sparsifier. These lower bounds are based on Ruzsa–Szemerédi graphs, and a particular instantiation yields an Ω(nr) lower bound on the bit complexity even for fixed constant ε. In the case of hypergraph cut sparsifiers, this is tight up to polylogarithmic factors in n, due to a recent result of [Chen, Khanna and Nagda, FOCS’20]. For spectral sparsifiers it narrows the gap to O^{*}(r). Finally, for directed hypergraphs, we present an algorithm that computes an ε-spectral sparsifier with O^{*}(n^{2}r^{3}) hyperarcs, where r is the maximum size of a hyperarc. 
For small r, this improves over the O^{*}(n^{3}) bound known from [Soma and Yoshida, SODA’19], and comes close to the trivial lower bound of Ω(n^{2}) hyperarcs. @InProceedings{STOC21p598, author = {Michael Kapralov and Robert Krauthgamer and Jakab Tardos and Yuichi Yoshida}, title = {Towards Tight Bounds for Spectral Sparsification of Hypergraphs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {598--611}, doi = {10.1145/3406325.3451061}, year = {2021}, } Publisher's Version STOC '21: "Subcubic Algorithms for Gomory–Hu ..." Subcubic Algorithms for Gomory–Hu Tree in Unweighted Graphs Amir Abboud, Robert Krauthgamer, and Ohad Trabelsi (Weizmann Institute of Science, Israel) Every undirected graph G has a (weighted) cut-equivalent tree T, commonly named after Gomory and Hu, who discovered it in 1961. Both T and G have the same node set, and for every node pair s,t, the minimum (s,t)-cut in T is also an exact minimum (s,t)-cut in G. We give the first subcubic-time algorithm that constructs such a tree for a simple graph G (unweighted with no parallel edges). Its time complexity is Õ(n^{2.5}), for n=|V(G)|; previously, only Õ(n^{3}) was known, except for restricted cases like sparse graphs. Consequently, we obtain the first algorithm for All-Pairs Max-Flow in simple graphs that breaks the cubic-time barrier. Gomory and Hu compute this tree using n−1 queries to (single-pair) Max-Flow; the new algorithm can be viewed as a fine-grained reduction to Õ(√n) Max-Flow computations on n-node graphs. @InProceedings{STOC21p1725, author = {Amir Abboud and Robert Krauthgamer and Ohad Trabelsi}, title = {Subcubic Algorithms for Gomory–Hu Tree in Unweighted Graphs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1725--1737}, doi = {10.1145/3406325.3451073}, year = {2021}, } Publisher's Version 
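The defining property of a cut-equivalent (Gomory–Hu) tree is that the minimum (s,t)-cut value in G can be read off as the lightest edge on the unique tree path from s to t. A small sketch of that read-off step only (the tree construction itself, via max-flow queries, is not shown; the function name and representation are ours):

```python
def tree_min_cut(adj, s, t):
    """Minimum (s,t)-cut value encoded by a weighted tree.

    In a Gomory-Hu tree, this equals the minimum (s,t)-cut of the
    original graph: the lightest edge weight on the unique s-t
    tree path. adj: {u: [(v, w), ...]} undirected tree adjacency.
    """
    # Iterative DFS from s, tracking the lightest edge seen so far.
    stack = [(s, None, float("inf"))]
    while stack:
        u, parent, best = stack.pop()
        if u == t:
            return best
        for v, w in adj[u]:
            if v != parent:
                stack.append((v, u, min(best, w)))
    raise ValueError("t not reachable from s")
```

This is why a single tree with n−1 weighted edges can encode the minimum cut values of all n(n−1)/2 node pairs at once.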

Krishnamurthy, Akshay 
STOC '21: "Contextual Search in the Presence ..."
Contextual Search in the Presence of Irrational Agents
Akshay Krishnamurthy, Thodoris Lykouris, Chara Podimata, and Robert Schapire (Microsoft Research, USA; Harvard University, USA) We study contextual search, a generalization of binary search in higher dimensions, which captures settings such as feature-based dynamic pricing. Standard game-theoretic formulations of this problem assume that agents act in accordance with a specific behavioral model. In practice, some agents may not subscribe to the dominant behavioral model or may act in ways that are seemingly arbitrarily irrational. Existing algorithms heavily depend on the behavioral model being (approximately) accurate for all agents and have poor performance even with a few arbitrarily irrational agents. We initiate the study of contextual search when some of the agents can behave in ways inconsistent with the underlying behavioral model. In particular, we provide two algorithms, one based on multidimensional binary search methods and one based on gradient descent. Our techniques draw inspiration from learning theory, game theory, high-dimensional geometry, and convex analysis. @InProceedings{STOC21p910, author = {Akshay Krishnamurthy and Thodoris Lykouris and Chara Podimata and Robert Schapire}, title = {Contextual Search in the Presence of Irrational Agents}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {910--918}, doi = {10.1145/3406325.3451120}, year = {2021}, } Publisher's Version 
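In one dimension, the primitive that contextual search generalizes is binary search with threshold feedback: post a price, observe only accept/reject, and halve the interval. This plain sketch is ours, assumes a perfectly rational buyer (accepts iff value ≥ price), and is not one of the paper's algorithms, which are precisely about dropping that assumption:

```python
def binary_search_price(buyer_value, lo=0.0, hi=1.0, rounds=20):
    """Localize a buyer's private value in [lo, hi) from binary
    accept/reject feedback: accept means value >= posted price.
    After `rounds` steps the interval has width (hi-lo)/2**rounds.
    """
    for _ in range(rounds):
        price = (lo + hi) / 2
        if buyer_value >= price:   # buyer accepts at this price
            lo = price
        else:                      # buyer rejects
            hi = price
    return lo, hi
```

A single irrational response breaks this scheme permanently (the value is discarded from the interval forever), which is why robustness to inconsistent agents requires genuinely different techniques.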

Kuhn, Fabian 
STOC '21: "Efficient Randomized Distributed ..."
Efficient Randomized Distributed Coloring in CONGEST
Magnús M. Halldórsson, Fabian Kuhn, Yannic Maus, and Tigran Tonoyan (Reykjavik University, Iceland; University of Freiburg, Germany; Technion, Israel) Distributed vertex coloring is one of the classic, and probably the most widely studied, problems in the area of distributed graph algorithms. We present a new randomized distributed vertex coloring algorithm for the standard CONGEST model, where the network is modeled as an n-node graph G, and where the nodes of G operate in synchronous communication rounds in which they can exchange O(log n)-bit messages over all the edges of G. For graphs with maximum degree Δ, we show that the (Δ+1)-list coloring problem (and therefore also the standard (Δ+1)-coloring problem) can be solved in O(log^{5} log n) rounds. Previously such a result was only known for the significantly more powerful LOCAL model, where in each round, neighboring nodes can exchange messages of arbitrary size. The best previous (Δ+1)-coloring algorithm in the CONGEST model had a running time of O(log Δ + log^{6} log n) rounds. As a function of n alone, the best previous algorithm therefore had a round complexity of O(log n), a bound that can also be achieved by a naïve folklore algorithm. For large maximum degree Δ, our algorithm hence is an exponential improvement over the previous state of the art. @InProceedings{STOC21p1180, author = {Magnús M. Halldórsson and Fabian Kuhn and Yannic Maus and Tigran Tonoyan}, title = {Efficient Randomized Distributed Coloring in CONGEST}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1180--1193}, doi = {10.1145/3406325.3451089}, year = {2021}, } Publisher's Version 
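The reason Δ+1 colors always suffice is the sequential greedy rule: a vertex with at most Δ neighbors always has a free color among {0, …, Δ}. A minimal sequential sketch of that baseline (ours; the distributed CONGEST algorithm achieves the same guarantee without any sequential ordering):

```python
def greedy_coloring(adj):
    """Sequential greedy (Delta+1)-coloring.

    Each vertex takes the smallest color not used by its
    already-colored neighbors. Since a vertex has at most Delta
    neighbors, some color in {0, ..., Delta} is always free.
    adj: {v: iterable of neighbors}, symmetric.
    """
    color = {}
    for v in adj:
        taken = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in taken:
            c += 1
        color[v] = c
    return color
```

The distributed difficulty lies entirely in breaking the sequential dependence: all nodes must commit to colors in few synchronous rounds while exchanging only O(log n)-bit messages.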

Kumar, Mrinal 
STOC '21: "Decoding Multivariate Multiplicity ..."
Decoding Multivariate Multiplicity Codes on Product Sets
Siddharth Bhandari, Prahladh Harsha, Mrinal Kumar, and Madhu Sudan (Tata Institute of Fundamental Research, India; IIT Bombay, India; Harvard University, USA) The multiplicity Schwartz–Zippel lemma bounds the total multiplicity of zeroes of a multivariate polynomial on a product set. This lemma motivates the multiplicity codes of Kopparty, Saraf and Yekhanin [J. ACM, 2014], who showed how to use this lemma to construct high-rate locally decodable codes. However, the algorithmic results about these codes crucially rely on the fact that the polynomials are evaluated on a vector space and not an arbitrary product set. In this work, we show how to decode multivariate multiplicity codes of large multiplicities in polynomial time over finite product sets (over fields of large characteristic and zero characteristic). Previously, such decoding algorithms were not known even for a positive fraction of errors. In contrast, our work goes all the way to the distance of the code and in particular exceeds both the unique decoding bound and the Johnson radius. For errors exceeding the Johnson radius, even combinatorial list-decodability of these codes was not known. Our algorithm is an application of the classical polynomial method directly to the multivariate setting. In particular, we do not rely on a reduction from the multivariate to the univariate case, as is typical of many of the existing results on decoding codes based on multivariate polynomials. However, a vanilla application of the polynomial method in the multivariate setting does not yield a polynomial upper bound on the list size. We obtain a polynomial bound on the list size by taking an alternative view of multivariate multiplicity codes. In this view, we glue all the partial derivatives of the same order together using a fresh set of variables. We then apply the polynomial method by viewing this as a problem over the field of rational functions in these fresh variables. 
@InProceedings{STOC21p1489, author = {Siddharth Bhandari and Prahladh Harsha and Mrinal Kumar and Madhu Sudan}, title = {Decoding Multivariate Multiplicity Codes on Product Sets}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1489--1501}, doi = {10.1145/3406325.3451027}, year = {2021}, } Publisher's Version 

Kumar, Ravi 
STOC '21: "Sample-Efficient Proper PAC ..."
Sample-Efficient Proper PAC Learning with Approximate Differential Privacy
Badih Ghazi, Noah Golowich, Ravi Kumar, and Pasin Manurangsi (Google Research, USA; Massachusetts Institute of Technology, USA) In this paper we prove that the sample complexity of properly learning a class of Littlestone dimension d with approximate differential privacy is Õ(d^{6}), ignoring privacy and accuracy parameters. This result answers a question of Bun et al. (FOCS 2020) by improving upon their upper bound of 2^{O(d)} on the sample complexity. Prior to our work, finiteness of the sample complexity for privately learning a class of finite Littlestone dimension was only known for improper private learners, and the fact that our learner is proper answers another question of Bun et al., which was also asked by Bousquet et al. (NeurIPS 2020). Using machinery developed by Bousquet et al., we then show that the sample complexity of sanitizing a binary hypothesis class is at most polynomial in its Littlestone dimension and dual Littlestone dimension. This implies that a class is sanitizable if and only if it has finite Littlestone dimension. An important ingredient of our proofs is a new property of binary hypothesis classes that we call irreducibility, which may be of independent interest. @InProceedings{STOC21p183, author = {Badih Ghazi and Noah Golowich and Ravi Kumar and Pasin Manurangsi}, title = {Sample-Efficient Proper PAC Learning with Approximate Differential Privacy}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {183--196}, doi = {10.1145/3406325.3451028}, year = {2021}, } Publisher's Version 

Kuszmaul, William 
STOC '21: "How Asymmetry Helps Buffer ..."
How Asymmetry Helps Buffer Management: Achieving Optimal Tail Size in Cup Games
William Kuszmaul (Massachusetts Institute of Technology, USA) The cup game on n cups is a multi-step game with two players, a filler and an emptier. At each step, the filler distributes 1 unit of water among the cups, and then the emptier selects a single cup to remove (up to) 1 unit of water from. There are several objective functions that the emptier might wish to minimize. One of the strongest guarantees would be to minimize tail size, which is defined to be the number of cups with fill 2 or greater. A simple lower-bound construction shows that the optimal tail size for deterministic emptying algorithms is Θ(n), however. We present a simple randomized emptying algorithm that achieves tail size Õ(log n) with high probability in n for poly(n) steps. Moreover, we show that this is tight up to doubly logarithmic factors. We also extend our results to the multi-processor cup game, achieving tail size Õ(log n + p) on p processors with high probability in n. We show that the dependence on p is near optimal for any emptying algorithm that achieves polynomially bounded backlog. A natural question is whether our results can be extended to give unending guarantees, which apply to arbitrarily long games. We give a lower-bound construction showing that no monotone memoryless emptying algorithm can achieve an unending guarantee on either tail size or the related objective function of backlog. On the other hand, we show that even a very small (i.e., 1/poly(n)) amount of resource augmentation is sufficient to overcome this barrier. @InProceedings{STOC21p1248, author = {William Kuszmaul}, title = {How Asymmetry Helps Buffer Management: Achieving Optimal Tail Size in Cup Games}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1248--1261}, doi = {10.1145/3406325.3451033}, year = {2021}, } Publisher's Version 
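The game itself is easy to simulate. The sketch below uses a deterministic greedy emptier (empty the fullest cup) against one simple randomized filler; it is ours and only illustrates the rules and the tail-size objective, not the paper's randomized emptying algorithm or its adversarial lower-bound filler:

```python
import random

def cup_game(n, steps, seed=0):
    """Simulate the cup game on n cups for `steps` rounds.

    Filler: splits its 1 unit of water over two random cups (one of
    many possible filler strategies). Emptier: greedy, removes up
    to 1 unit from the fullest cup. Returns the worst tail size
    observed, i.e. the max number of cups with fill >= 2.
    """
    rng = random.Random(seed)
    fill = [0.0] * n
    worst_tail = 0
    for _ in range(steps):
        a, b = rng.randrange(n), rng.randrange(n)
        fill[a] += 0.5
        fill[b] += 0.5
        i = max(range(n), key=lambda j: fill[j])
        fill[i] = max(0.0, fill[i] - 1.0)
        worst_tail = max(worst_tail, sum(1 for x in fill if x >= 2))
    return worst_tail
```

Against a worst-case filler, the abstract's lower bound says any deterministic emptier like this greedy one can be forced into tail size Θ(n); the paper's randomized emptier avoids this.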

Laddha, Aditi 
STOC '21: "Reducing Isotropy and Volume ..."
Reducing Isotropy and Volume to KLS: An O*(n^{3}ψ^{2}) Volume Algorithm
He Jia, Aditi Laddha, Yin Tat Lee, and Santosh Vempala (Georgia Institute of Technology, USA; University of Washington, USA; Microsoft Research, USA) We show that the volume of a convex body in R^{n} in the general membership oracle model can be computed to within relative error ε using O(n^{3}ψ^{2}/ε^{2}) oracle queries, where ψ is the KLS constant. With the current bound of ψ=O(n^{o(1)}), this gives an O(n^{3+o(1)}/ε^{2}) algorithm, the first improvement on the Lovász–Vempala O(n^{4}/ε^{2}) algorithm from 2003. The main new ingredient is an O(n^{3}ψ^{2}) algorithm for isotropic transformation, following which we can apply the O(n^{3}/ε^{2}) volume algorithm of Cousins and Vempala for well-rounded convex bodies. A positive resolution of the KLS conjecture would imply an O(n^{3}/ε^{2}) volume algorithm. We also give an efficient implementation of the new algorithm for convex polytopes defined by m inequalities in R^{n}: polytope volume can be estimated in time O(mn^{c}/ε^{2}), where c<3.2 depends on the current matrix multiplication exponent and improves on the previous best bound. @InProceedings{STOC21p961, author = {He Jia and Aditi Laddha and Yin Tat Lee and Santosh Vempala}, title = {Reducing Isotropy and Volume to KLS: An <i>O</i>*(<i>n</i><sup>3</sup><i>ψ</i><sup>2</sup>) Volume Algorithm}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {961--974}, doi = {10.1145/3406325.3451018}, year = {2021}, } Publisher's Version 

Langley, Zachary 
STOC '21: "A Framework for Dynamic Matching ..."
A Framework for Dynamic Matching in Weighted Graphs
Aaron Bernstein, Aditi Dudeja, and Zachary Langley (Rutgers University, USA) We introduce a new framework for computing approximate maximum weight matchings. Our primary focus is on the fully dynamic setting, where there is a large gap between the guarantees of the best known algorithms for computing weighted and unweighted matchings. Indeed, almost all current weighted matching algorithms that reduce to the unweighted problem lose a factor of two in the approximation ratio. In contrast, in other sublinear models such as the distributed and streaming models, recent work has largely closed this weighted/unweighted gap. For bipartite graphs, we almost completely settle the gap with a general reduction that converts any algorithm for α-approximate unweighted matching into an algorithm for (1−ε)α-approximate weighted matching, while only increasing the update time by an O(log n) factor for constant ε. We also show that our framework leads to significant improvements for non-bipartite graphs, though not in the form of a universal reduction. In particular, we give two algorithms for weighted non-bipartite matching: 1. A randomized (Las Vegas) fully dynamic algorithm that maintains a (1/2−ε)-approximate maximum weight matching in worst-case update time O(polylog n) with high probability against an adaptive adversary. Our bounds are essentially the same as those of the unweighted algorithm of Wajc [STOC 2020]. 2. A deterministic fully dynamic algorithm that maintains a (2/3−ε)-approximate maximum weight matching in amortized update time O(m^{1/4}). Our bounds are essentially the same as those of the unweighted algorithm of Bernstein and Stein [SODA 2016]. A key feature of our framework is that it uses existing algorithms for unweighted matching as black boxes. As a result, our framework is simple and versatile. Moreover, our framework easily translates to other models, and we use it to derive new results for the weighted matching problem in streaming and communication complexity models. 
@InProceedings{STOC21p668, author = {Aaron Bernstein and Aditi Dudeja and Zachary Langley}, title = {A Framework for Dynamic Matching in Weighted Graphs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {668--681}, doi = {10.1145/3406325.3451113}, year = {2021}, } Publisher's Version 
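The factor-of-two loss from reducing weighted to unweighted matching mentioned above can be seen in the classic greedy baseline: taking edges in decreasing weight order is a 1/2-approximation for maximum weight matching. A minimal static sketch (an illustration only, not the paper's dynamic framework):

```python
def greedy_matching(edges):
    """Greedy 1/2-approximation for maximum weight matching.

    edges: list of (u, v, weight) tuples of an undirected graph.
    Returns the total weight of the greedy matching.
    """
    matched = set()
    total = 0
    # Consider edges in decreasing weight order; take an edge
    # whenever both endpoints are still unmatched.
    for u, v, w in sorted(edges, key=lambda e: -e[2]):
        if u not in matched and v not in matched:
            matched.update((u, v))
            total += w
    return total

# On the path a-b-c-d with weights 2, 3, 2 greedy picks only b-c
# (weight 3), while the optimum a-b plus c-d has weight 4 --
# within the 1/2 guarantee.
```
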

Lazos, Philip 
STOC '21: "Efficient Two-Sided Markets ..."
Efficient Two-Sided Markets with Limited Information
Paul Dütting, Federico Fusco, Philip Lazos, Stefano Leonardi, and Rebecca Reiffenhäuser (Google Research, Switzerland; Sapienza University of Rome, Italy) A celebrated impossibility result by Myerson and Satterthwaite (1983) shows that any truthful mechanism for two-sided markets that maximizes social welfare must run a deficit, resulting in a necessity to relax welfare efficiency and to use approximation mechanisms. Such mechanisms in general make extensive use of Bayesian priors. In this work, we investigate a question of increasing theoretical and practical importance: how much prior information is required to design mechanisms with near-optimal approximations? Our first contribution is a more general impossibility result stating that no meaningful approximation is possible without any prior information, expanding the famous impossibility result of Myerson and Satterthwaite. Our second contribution is that one single sample (one number per item), arguably a minimum-possible amount of prior information, from each seller distribution is sufficient for a large class of two-sided markets. We prove matching upper and lower bounds on the best approximation that can be obtained with one single sample for subadditive buyers and additive sellers, regardless of computational considerations. Our third contribution is the design of computationally efficient black-box reductions that turn any one-sided mechanism into a two-sided mechanism with a small loss in the approximation, while using only one single sample from each seller. On the way, our black-box-type mechanisms deliver several interesting positive results in their own right, often beating even the state of the art that uses full prior information. 
@InProceedings{STOC21p1452, author = {Paul Dütting and Federico Fusco and Philip Lazos and Stefano Leonardi and Rebecca Reiffenhäuser}, title = {Efficient Two-Sided Markets with Limited Information}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1452--1465}, doi = {10.1145/3406325.3451076}, year = {2021}, } Publisher's Version 

Le, Hung 
STOC '21: "Clan Embeddings into Trees, ..."
Clan Embeddings into Trees, and Low Treewidth Graphs
Arnold Filtser and Hung Le (Columbia University, USA; University of Massachusetts, USA) In low distortion metric embeddings, the goal is to embed a host “hard” metric space into a “simpler” target space while approximately preserving pairwise distances. A highly desirable target space is that of a tree metric. Unfortunately, such an embedding can result in huge distortion. A celebrated bypass to this problem is stochastic embedding with logarithmic expected distortion. Another bypass is Ramsey-type embedding, where the distortion guarantee applies only to a subset of the points. However, both these solutions fail to provide an embedding into a single tree with a worst-case distortion guarantee on all pairs. In this paper, we propose a novel third bypass called clan embedding. Here each point x is mapped to a subset of points f(x), called a clan, with a special chief point χ(x)∈ f(x). The clan embedding has multiplicative distortion t if for every pair (x,y) some copy y′∈ f(y) in the clan of y is close to the chief of x: min_{y′∈ f(y)}d(y′,χ(x))≤ t· d(x,y). Our first result is a clan embedding into a tree with multiplicative distortion O(log n/є) such that each point has 1+є copies (in expectation). In addition, we provide a “spanning” version of this theorem for graphs and use it to devise the first compact routing scheme with constant-size routing tables. We then focus on minor-free graphs of diameter parameterized by D, which were known to be stochastically embeddable into bounded treewidth graphs with expected additive distortion єD. We devise Ramsey-type embedding and clan embedding analogs of the stochastic embedding. We use these embeddings to construct the first (bicriteria quasipolynomial time) approximation scheme for the metric ρ-dominating set and metric ρ-independent set problems in minor-free graphs. 
@InProceedings{STOC21p342, author = {Arnold Filtser and Hung Le}, title = {Clan Embeddings into Trees, and Low Treewidth Graphs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {342--355}, doi = {10.1145/3406325.3451043}, year = {2021}, } Publisher's Version 
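The huge distortion of deterministic tree embeddings mentioned in the abstract above can be checked on the simplest example: any spanning tree of the n-cycle stretches some pair by a factor of n−1. A toy verification (my own illustration, not from the paper) for the path obtained by deleting one cycle edge:

```python
def cycle_dist(n, i, j):
    # Shortest-path distance between i and j on the n-cycle.
    d = abs(i - j)
    return min(d, n - d)

def path_dist(i, j):
    # Distance in the spanning path left after deleting cycle edge (0, n-1).
    return abs(i - j)

def tree_distortion(n):
    # Worst-case multiplicative stretch of the path embedding of the n-cycle.
    return max(path_dist(i, j) / cycle_dist(n, i, j)
               for i in range(n) for j in range(n) if i != j)

# The endpoints of the deleted edge are at cycle distance 1 but path
# distance n - 1, so the distortion is exactly n - 1.
```
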

Leake, Jonathan 
STOC '21: "Sampling Matrices from Harish-Chandra–Itzykson–Zuber ..."
Sampling Matrices from Harish-Chandra–Itzykson–Zuber Densities with Applications to Quantum Inference and Differential Privacy
Jonathan Leake, Colin McSwiggen, and Nisheeth K. Vishnoi (TU Berlin, Germany; University of Tokyo, Japan; Yale University, USA) Given two Hermitian matrices Y and Λ, the Harish-Chandra–Itzykson–Zuber (HCIZ) distribution is given by the density e^{Tr(U Λ U*Y)} with respect to the Haar measure on the unitary group. Random unitary matrices distributed according to the HCIZ distribution are important in various settings in physics and random matrix theory, but the problem of sampling efficiently from this distribution has remained open. We present two algorithms to sample matrices from distributions that are close to the HCIZ distribution. The first produces samples that are ξ-close in total variation distance, and the number of arithmetic operations required depends on poly(log 1/ξ). The second produces samples that are ξ-close in infinity divergence, but with a poly(1/ξ) dependence. Our results have the following applications: 1) an efficient algorithm to sample from complex versions of matrix Langevin distributions studied in statistics, 2) an efficient algorithm to sample from continuous maximum entropy distributions over unitary orbits, which in turn implies an efficient algorithm to sample a pure quantum state from the entropy-maximizing ensemble representing a given density matrix, and 3) an efficient algorithm for differentially private rank-k approximation that comes with improved utility bounds for k>1. @InProceedings{STOC21p1384, author = {Jonathan Leake and Colin McSwiggen and Nisheeth K. Vishnoi}, title = {Sampling Matrices from Harish-Chandra–Itzykson–Zuber Densities with Applications to Quantum Inference and Differential Privacy}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1384--1397}, doi = {10.1145/3406325.3451094}, year = {2021}, } Publisher's Version STOC '21: "Capacity Lower Bounds via ..." 
Capacity Lower Bounds via Productization Leonid Gurvits and Jonathan Leake (City College of New York, USA; TU Berlin, Germany) We give a sharp lower bound on the capacity of a real stable polynomial, depending only on the value of its gradient at x = 1. This result implies a sharp improvement to a similar inequality proved by Linial–Samorodnitsky–Wigderson in 2000, which was crucial to the analysis of their permanent approximation algorithm. Such inequalities have played an important role in the recent work on operator scaling and its generalizations and applications, and in fact we use our bound to construct a new scaling algorithm for real stable polynomials. Our bound is also quite similar to one used very recently by Karlin–Klein–Oveis Gharan to give an improved approximation factor for metric TSP. The new technique we develop to prove this bound is productization, which says that any real stable polynomial can be approximated at any point in the positive orthant by a product of linear forms. Beyond the results of this paper, our main hope is that this new technique will allow us to avoid “frightening technicalities”, in the words of Laurent and Schrijver, that often accompany combinatorial lower bounds. In particular, we believe that this technique will be useful towards simplifying and improving further the approximation factor given in the fantastic work of Karlin–Klein–Oveis Gharan on metric TSP. @InProceedings{STOC21p847, author = {Leonid Gurvits and Jonathan Leake}, title = {Capacity Lower Bounds via Productization}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {847--858}, doi = {10.1145/3406325.3451105}, year = {2021}, } Publisher's Version 

Lee, Euiwoong 
STOC '21: "A Framework for Quadratic ..."
A Framework for Quadratic Form Maximization over Convex Sets through Nonconvex Relaxations
Vijay Bhattiprolu, Euiwoong Lee, and Assaf Naor (Institute for Advanced Study at Princeton, USA; Princeton University, USA; University of Michigan, USA) We investigate the approximability of the following optimization problem. The input is an n× n matrix A=(A_{ij}) with real entries and an origin-symmetric convex body K⊂ ℝ^{n} that is given by a membership oracle. The task is to compute (or approximate) the maximum of the quadratic form ∑_{i=1}^{n}∑_{j=1}^{n} A_{ij} x_{i}x_{j}=⟨ x,Ax⟩ as x ranges over K. This is a rich and expressive family of optimization problems; for different choices of matrices A and convex bodies K it includes a diverse range of optimization problems like max-cut, Grothendieck/non-commutative Grothendieck inequalities, small set expansion and more. While the literature studied these special cases using case-specific reasoning, here we develop a general methodology for treatment of the approximability and inapproximability aspects of these questions. The underlying geometry of K plays a critical role; we show under commonly used complexity assumptions that polynomial-time constant-approximability necessitates that K has a type-2 constant that grows slowly with n. However, we show that even when the type-2 constant is bounded, this problem sometimes exhibits strong hardness of approximation. Thus, even within the realm of type-2 bodies, the approximability landscape is nuanced and subtle. However, the link that we establish between optimization and geometry of Banach spaces allows us to devise a generic algorithmic approach to the above problem. We associate to each convex body a new (higher-dimensional) auxiliary set that is not convex, but is approximately convex when K has a bounded type-2 constant. If our auxiliary set has an approximate separation oracle, then we design an approximation algorithm for the original quadratic optimization problem, using an approximate version of the ellipsoid method. 
Even though our hardness result implies that such an oracle does not exist in general, this new question can be solved in specific cases of interest by implementing a range of classical tools from functional analysis, most notably the deep factorization theory of linear operators. Beyond encompassing the scenarios in the literature for which constant-factor approximation algorithms were found, our generic framework implies that for convex sets with bounded type-2 constant, constant-factor approximability is preserved under the following basic operations: (a) Subspaces, (b) Quotients, (c) Minkowski Sums, (d) Complex Interpolation. This yields a rich family of new examples where constant-factor approximations are possible, which were beyond the reach of previous methods. We also show (under commonly used complexity assumptions) that for symmetric norms and unitarily invariant matrix norms the type-2 constant nearly characterizes the approximability of quadratic maximization. @InProceedings{STOC21p870, author = {Vijay Bhattiprolu and Euiwoong Lee and Assaf Naor}, title = {A Framework for Quadratic Form Maximization over Convex Sets through Nonconvex Relaxations}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {870--881}, doi = {10.1145/3406325.3451128}, year = {2021}, } Publisher's Version 
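To make the max-cut specialization mentioned in the abstract above concrete: taking A to be the adjacency matrix of a graph with m edges and K = [−1,1]^n, each vertex x ∈ {−1,+1}^n of K encodes a bipartition cutting (2m − ⟨x,Ax⟩)/4 edges, so maximizing the cut is minimizing the quadratic form over K. A brute-force check on the triangle K₃ (a toy instance of my own choosing):

```python
from itertools import product

# Adjacency matrix of the triangle K3.
A = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]
n, m = 3, 3  # number of vertices and edges

def quad(x):
    # The quadratic form <x, Ax>, summed over ordered pairs.
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def max_cut():
    # cut(x) = (2m - <x, Ax>) / 4 for x in {-1, +1}^n, so maximizing
    # the cut amounts to minimizing the quadratic form over the
    # vertices of the hypercube K = [-1, 1]^n.
    return max((2 * m - quad(x)) // 4 for x in product((-1, 1), repeat=n))
```

Here max_cut() returns 2, the maximum cut of a triangle, attained e.g. at x = (1, 1, −1) where ⟨x,Ax⟩ = −2.
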

Lee, Yin Tat 
STOC '21: "Minimum Cost Flows, MDPs, ..."
Minimum Cost Flows, MDPs, and ℓ_{1}-Regression in Nearly Linear Time for Dense Instances
Jan van den Brand, Yin Tat Lee, Yang P. Liu, Thatchaphol Saranurak, Aaron Sidford, Zhao Song, and Di Wang (KTH, Sweden; University of Washington, USA; Microsoft Research, USA; Stanford University, USA; University of Michigan, USA; Princeton University, USA; Institute for Advanced Study at Princeton, USA; Google Research, USA) In this paper we provide new randomized algorithms with improved runtimes for solving linear programs with two-sided constraints. In the special case of the minimum cost flow problem on n-vertex m-edge graphs with integer polynomially bounded costs and capacities we obtain a randomized method which solves the problem in Õ(m + n^{1.5}) time. This improves upon the previous best runtime of Õ(m √n) [Lee–Sidford’14] and, in the special case of unit-capacity maximum flow, improves upon the previous best runtimes of m^{4/3 + o(1)} [Liu–Sidford’20, Kathuria’20] and Õ(m √n) [Lee–Sidford’14] for sufficiently dense graphs. In the case of ℓ_{1}-regression in a matrix with n columns and m rows we obtain a randomized method which computes an є-approximate solution in Õ(mn + n^{2.5}) time. This yields a randomized method which computes an є-optimal policy of a discounted Markov Decision Process with S states and A actions per state in time Õ(S^{2} A + S^{2.5}). These methods improve upon the previous best runtimes of methods which depend polylogarithmically on problem parameters, which were Õ(mn^{1.5}) [Lee–Sidford’15] and Õ(S^{2.5} A) [Lee–Sidford’14, Sidford–Wang–Wu–Ye’18] respectively. To obtain this result we introduce two new algorithmic tools of possible independent interest. First, we design a new general interior point method for solving linear programs with two-sided constraints which combines techniques from [Lee–Song–Zhang’19, Brand et al.’20] to obtain a robust stochastic method with iteration count nearly the square root of the smaller dimension. 
Second, to implement this method we provide dynamic data structures for efficiently maintaining approximations to variants of Lewis weights, a fundamental importance measure for matrices which generalizes leverage scores and effective resistances. @InProceedings{STOC21p859, author = {Jan van den Brand and Yin Tat Lee and Yang P. Liu and Thatchaphol Saranurak and Aaron Sidford and Zhao Song and Di Wang}, title = {Minimum Cost Flows, MDPs, and ℓ<sub>1</sub>-Regression in Nearly Linear Time for Dense Instances}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {859--869}, doi = {10.1145/3406325.3451108}, year = {2021}, } Publisher's Version STOC '21: "Reducing Isotropy and Volume ..." Reducing Isotropy and Volume to KLS: An O*(n^{3}ψ^{2}) Volume Algorithm He Jia, Aditi Laddha, Yin Tat Lee, and Santosh Vempala (Georgia Institute of Technology, USA; University of Washington, USA; Microsoft Research, USA) We show that the volume of a convex body in R^{n} in the general membership oracle model can be computed to within relative error ε using O(n^{3}ψ^{2}/ε^{2}) oracle queries, where ψ is the KLS constant. With the current bound of ψ=O(n^{o(1)}), this gives an O(n^{3+o(1)}/ε^{2}) algorithm, the first improvement on the Lovász–Vempala O(n^{4}/ε^{2}) algorithm from 2003. The main new ingredient is an O(n^{3}ψ^{2}) algorithm for isotropic transformation, following which we can apply the O(n^{3}/ε^{2}) volume algorithm of Cousins and Vempala for well-rounded convex bodies. A positive resolution of the KLS conjecture would imply an O(n^{3}/ε^{2}) volume algorithm. We also give an efficient implementation of the new algorithm for convex polytopes defined by m inequalities in R^{n}: polytope volume can be estimated in time O(mn^{c}/ε^{2}) where c<3.2 depends on the current matrix multiplication exponent and improves on the previous best bound. 
@InProceedings{STOC21p961, author = {He Jia and Aditi Laddha and Yin Tat Lee and Santosh Vempala}, title = {Reducing Isotropy and Volume to KLS: An <i>O</i>*(<i>n</i><sup>3</sup><i>ψ</i><sup>2</sup>) Volume Algorithm}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {961--974}, doi = {10.1145/3406325.3451018}, year = {2021}, } Publisher's Version STOC '21: "A Nearly-Linear Time Algorithm ..." A Nearly-Linear Time Algorithm for Linear Programs with Small Treewidth: A Multiscale Representation of Robust Central Path Sally Dong, Yin Tat Lee, and Guanghao Ye (University of Washington, USA; Microsoft Research, USA) Arising from structural graph theory, treewidth has become a focus of study in fixed-parameter tractable algorithms. Many NP-hard problems are known to be solvable in O(n · 2^{O(τ)}) time, where τ is the treewidth of the input graph. Analogously, many problems in P should be solvable in O(n · τ^{O(1)}) time; however, due to the lack of appropriate tools, only a few such results are currently known. In our paper, we show this holds for linear programs: Given a linear program of the form min_{Ax=b,ℓ ≤ x≤ u} c^{⊤} x whose dual graph G_{A} has treewidth τ, and a corresponding width-τ tree decomposition, we show how to solve it in time O(n · τ^{2} log(1/ε)), where n is the number of variables and ε is the relative accuracy. When a tree decomposition is not given, we use existing techniques in vertex separators to obtain algorithms with O(n · τ^{4} log(1/ε)) and O(n · τ^{2} log(1/ε) + n^{1.5}) runtimes. Besides being the first of its kind, our algorithm has runtime nearly matching the fastest runtime for solving the subproblem Ax=b (under the assumption that no fast matrix multiplication is used). We obtain these results by combining recent techniques in interior-point methods (IPMs), sketching, and a novel representation of the solution under a multiscale basis similar to the wavelet basis. 
This representation further yields the first IPM with o(rank(A)) time per iteration when the treewidth is small. @InProceedings{STOC21p1784, author = {Sally Dong and Yin Tat Lee and Guanghao Ye}, title = {A Nearly-Linear Time Algorithm for Linear Programs with Small Treewidth: A Multiscale Representation of Robust Central Path}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1784--1797}, doi = {10.1145/3406325.3451056}, year = {2021}, } Publisher's Version 

Leme, Renato Paes 
STOC '21: "Combinatorial Bernoulli Factories: ..."
Combinatorial Bernoulli Factories: Matchings, Flows, and Other Polytopes
Rad Niazadeh, Renato Paes Leme, and Jon Schneider (University of Chicago, USA; Google Research, USA) A Bernoulli factory is an algorithmic procedure for exact sampling of certain random variables having only Bernoulli access to their parameters. Bernoulli access to a parameter p ∈ [0,1] means the algorithm does not know p, but has sample access to independent draws of a Bernoulli random variable with mean equal to p. In this paper, we study the problem of Bernoulli factories for polytopes: given Bernoulli access to a vector x∈ P for a given polytope P⊂ [0,1]^{n}, output a randomized vertex such that the expected value of the ith coordinate is exactly equal to x_{i}. For example, for the special case of the perfect matching polytope, one is given Bernoulli access to the entries of a doubly stochastic matrix [x_{ij}] and asked to sample a matching such that the probability of each edge (i,j) being present in the matching is exactly equal to x_{ij}. We show that a polytope P admits a Bernoulli factory if and only if P is the intersection of [0,1]^{n} with an affine subspace. Our construction is based on an algebraic formulation of the problem, involving identifying a family of Bernstein polynomials (one per vertex) that satisfy a certain algebraic identity on P. The main technical tool behind our construction is a connection between these polynomials and the geometry of zonotope tilings. We apply these results to construct an explicit factory for the perfect matching polytope. The resulting factory is deeply connected to the combinatorial enumeration of arborescences and may be of independent interest. For the k-uniform matroid polytope, we recover a sampling procedure known in statistics as Sampford sampling. 
@InProceedings{STOC21p833, author = {Rad Niazadeh and Renato Paes Leme and Jon Schneider}, title = {Combinatorial Bernoulli Factories: Matchings, Flows, and Other Polytopes}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {833--846}, doi = {10.1145/3406325.3451072}, year = {2021}, } Publisher's Version 
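For intuition about the Bernoulli access model in the abstract above, here is the textbook one-dimensional warm-up (my own toy example, not the paper's polytope construction): with coin flips of unknown bias p, two flips produce an exact Bernoulli(p²) sample, i.e. a Bernoulli factory for f(p) = p².

```python
import random

def p_squared_factory(coin):
    # Exact Bernoulli factory for f(p) = p^2: two independent p-coins
    # are both heads with probability exactly p * p, without the
    # algorithm ever learning the numeric value of p.
    return coin() and coin()

random.seed(0)
p = 0.5  # hidden from the factory; only `coin` is exposed
coin = lambda: random.random() < p
trials = 20000
freq = sum(p_squared_factory(coin) for _ in range(trials)) / trials
# freq concentrates around p^2 = 0.25
```
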

Leonardi, Stefano 
STOC '21: "Efficient Two-Sided Markets ..."
Efficient Two-Sided Markets with Limited Information
Paul Dütting, Federico Fusco, Philip Lazos, Stefano Leonardi, and Rebecca Reiffenhäuser (Google Research, Switzerland; Sapienza University of Rome, Italy) A celebrated impossibility result by Myerson and Satterthwaite (1983) shows that any truthful mechanism for two-sided markets that maximizes social welfare must run a deficit, resulting in a necessity to relax welfare efficiency and to use approximation mechanisms. Such mechanisms in general make extensive use of Bayesian priors. In this work, we investigate a question of increasing theoretical and practical importance: how much prior information is required to design mechanisms with near-optimal approximations? Our first contribution is a more general impossibility result stating that no meaningful approximation is possible without any prior information, expanding the famous impossibility result of Myerson and Satterthwaite. Our second contribution is that one single sample (one number per item), arguably a minimum-possible amount of prior information, from each seller distribution is sufficient for a large class of two-sided markets. We prove matching upper and lower bounds on the best approximation that can be obtained with one single sample for subadditive buyers and additive sellers, regardless of computational considerations. Our third contribution is the design of computationally efficient black-box reductions that turn any one-sided mechanism into a two-sided mechanism with a small loss in the approximation, while using only one single sample from each seller. On the way, our black-box-type mechanisms deliver several interesting positive results in their own right, often beating even the state of the art that uses full prior information. 
@InProceedings{STOC21p1452, author = {Paul Dütting and Federico Fusco and Philip Lazos and Stefano Leonardi and Rebecca Reiffenhäuser}, title = {Efficient Two-Sided Markets with Limited Information}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1452--1465}, doi = {10.1145/3406325.3451076}, year = {2021}, } Publisher's Version STOC '21: "Flow Time Scheduling with ..." Flow Time Scheduling with Uncertain Processing Time Yossi Azar, Stefano Leonardi, and Noam Touitou (Tel Aviv University, Israel; Sapienza University of Rome, Italy) We consider the problem of online scheduling on a single machine in order to minimize weighted flow time. The existing algorithms for this problem (STOC ’01, SODA ’03, FOCS ’18) all require exact knowledge of the processing time of each job. This assumption is crucial, as even a slight perturbation of the processing time would lead to a polynomial competitive ratio. However, this assumption very rarely holds in real-life scenarios. In this paper, we present the first algorithm for weighted flow time which does not require exact knowledge of the processing times of jobs. Specifically, we introduce the Scheduling with Predicted Processing Time (SPPT) problem, where the algorithm is given a prediction for the processing time of each job, instead of its real processing time. For the case of a constant-factor distortion between the predictions and the real processing times, our algorithms match all the best known competitiveness bounds for weighted flow time – namely O(log P), O(log D) and O(log W), where P, D, W are the maximum ratios of processing times, densities, and weights, respectively. For larger errors, the competitiveness of our algorithms degrades gracefully. 
@InProceedings{STOC21p1070, author = {Yossi Azar and Stefano Leonardi and Noam Touitou}, title = {Flow Time Scheduling with Uncertain Processing Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1070--1080}, doi = {10.1145/3406325.3451023}, year = {2021}, } Publisher's Version 

Levin, Leonid A. 
STOC '21: "Climbing Algorithms (Invited ..."
Climbing Algorithms (Invited Talk)
Leonid A. Levin (Boston University, USA) NP (search) problems allow easy correctness tests for solutions. Climbing algorithms also allow easy assessment of how close the configuration at any stage of their run is to yielding the correct answer. This offers great flexibility, since the merit of any deviation from the standard procedure can be instantly assessed. An example is the Dual Matrix Algorithm (DMA) for linear programming, variations of which were considered by A. Y. Levin in 1965 and by Yamnitsky and myself in 1982. It has little sensitivity to numerical errors and to the number of inequalities. It offers substantial flexibility and, thus, potential for further developments. @InProceedings{STOC21p2, author = {Leonid A. Levin}, title = {Climbing Algorithms (Invited Talk)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {2--3}, doi = {10.1145/3406325.3457137}, year = {2021}, } Publisher's Version 

Li, Jason 
STOC '21: "Deterministic Mincut in Almost-Linear ..."
Deterministic Mincut in Almost-Linear Time
Jason Li (Carnegie Mellon University, USA) We present a deterministic (global) mincut algorithm for weighted, undirected graphs that runs in m^{1+o(1)} time, answering an open question of Karger from the 1990s. To obtain our result, we derandomize the construction of the skeleton graph in Karger’s near-linear time mincut algorithm, which is its only randomized component. In particular, we partially derandomize the well-known Benczúr–Karger graph sparsification technique by random sampling, which we accomplish by the method of pessimistic estimators. Our main technical component is designing an efficient pessimistic estimator to capture the cuts of a graph, which involves harnessing the expander decomposition framework introduced in recent work by Goranci et al. (SODA 2021). As a side effect, we obtain a structural representation of all approximate mincuts in a graph, which may have future applications. @InProceedings{STOC21p384, author = {Jason Li}, title = {Deterministic Mincut in Almost-Linear Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {384--395}, doi = {10.1145/3406325.3451114}, year = {2021}, } Publisher's Version STOC '21: "A Quasipolynomial (2 + ε)-Approximation ..." A Quasipolynomial (2 + ε)-Approximation for Planar Sparsest Cut Vincent Cohen-Addad, Anupam Gupta, Philip N. Klein, and Jason Li (Google, Switzerland; Carnegie Mellon University, USA; Brown University, USA) The (non-uniform) sparsest cut problem is the following graph-partitioning problem: given a “supply” graph, and demands on pairs of vertices, delete some subset of supply edges to minimize the ratio of the supply edges cut to the total demand of the pairs separated by this deletion. Despite much effort, there are only a handful of non-trivial classes of supply graphs for which constant-factor approximations are known. We consider the problem for planar graphs, and give a (2+ε)-approximation algorithm that runs in quasipolynomial time. 
Our approach defines a new structural decomposition of an optimal solution using a “patching” primitive. We combine this decomposition with a Sherali–Adams-style linear programming relaxation of the problem, which we then round. This should be compared with the polynomial-time approximation algorithm of Rao (1999), which uses the metric linear programming relaxation and ℓ_{1}-embeddings, and achieves an O(√(log n))-approximation in polynomial time. @InProceedings{STOC21p1056, author = {Vincent Cohen-Addad and Anupam Gupta and Philip N. Klein and Jason Li}, title = {A Quasipolynomial (2 + <i>ε</i>)-Approximation for Planar Sparsest Cut}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1056--1069}, doi = {10.1145/3406325.3451103}, year = {2021}, } Publisher's Version STOC '21: "Approximate Gomory–Hu Tree ..." Approximate Gomory–Hu Tree Is Faster Than n – 1 Max-Flows Jason Li and Debmalya Panigrahi (Carnegie Mellon University, USA; Duke University, USA) The Gomory–Hu tree or cut tree (Gomory and Hu, 1961) is a classic data structure for reporting s−t mincuts (and by duality, the values of s−t maxflows) for all pairs of vertices s and t in an undirected graph. Gomory and Hu showed that it can be computed using n−1 exact maxflow computations. Surprisingly, this remains the best algorithm for Gomory–Hu trees more than 50 years later, even for approximate mincuts. In this paper, we break this longstanding barrier and give an algorithm for computing a (1+є)-approximate Gomory–Hu tree using log(n) maxflow computations. Specifically, we obtain the runtime bounds we describe below. We obtain a randomized (Monte Carlo) algorithm for undirected, weighted graphs that runs in Õ(m + n^{3/2}) time and returns a (1+є)-approximate Gomory–Hu tree whp. Previously, the best running time known was Õ(n^{5/2}), which is obtained by running Gomory and Hu’s original algorithm on a cut sparsifier of the graph. 
Next, we obtain a randomized (Monte Carlo) algorithm for undirected, unweighted graphs that runs in m^{4/3+o(1)} time and returns a (1+є)-approximate Gomory–Hu tree whp. This improves on our first result for sparse graphs, namely m = o(n^{9/8}). Previously, the best running time known for unweighted graphs was Õ(mn) for an exact Gomory–Hu tree (Bhalgat et al., STOC 2007); no better result was known if approximations are allowed. As a consequence of our Gomory–Hu tree algorithms, we also solve the (1+є)-approximate all-pairs mincut and single-source mincut problems in the same time bounds. (These problems are simpler in that the goal is to only return the s−t mincut values, and not the mincuts.) This improves on the recent algorithm for these problems in Õ(n^{2}) time due to Abboud et al. (FOCS 2020). @InProceedings{STOC21p1738, author = {Jason Li and Debmalya Panigrahi}, title = {Approximate Gomory–Hu Tree Is Faster Than <i>n</i> – 1 Max-Flows}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1738--1748}, doi = {10.1145/3406325.3451112}, year = {2021}, } Publisher's Version STOC '21: "Vertex Connectivity in Polylogarithmic ..." Vertex Connectivity in Polylogarithmic Max-Flows Jason Li, Danupon Nanongkai, Debmalya Panigrahi, Thatchaphol Saranurak, and Sorrachai Yingchareonthawornchai (Carnegie Mellon University, USA; University of Copenhagen, Denmark; KTH, Sweden; Duke University, USA; University of Michigan, USA; Aalto University, Finland) The vertex connectivity of an m-edge n-vertex undirected graph is the smallest number of vertices whose removal disconnects the graph, or leaves only a singleton vertex. In this paper, we give a reduction from the vertex connectivity problem to a set of maxflow instances. Using this reduction, we can solve vertex connectivity in Õ(m^{α}) time for any α ≥ 1, if there is an m^{α}-time maxflow algorithm. 
Using the current best maxflow algorithm that runs in m^{4/3+o(1)} time (Kathuria, Liu and Sidford, FOCS 2020), this yields an m^{4/3+o(1)}-time vertex connectivity algorithm. This is the first improvement in the running time of the vertex connectivity problem in over 20 years, the previous best being an Õ(mn)-time algorithm due to Henzinger, Rao, and Gabow (FOCS 1996). Indeed, no algorithm with an o(mn) running time was known before our work, even if we assume an Õ(m)-time maxflow algorithm. Our new technique is robust enough to also improve the best Õ(mn)-time bound for directed vertex connectivity to mn^{1−1/12+o(1)} time. @InProceedings{STOC21p317, author = {Jason Li and Danupon Nanongkai and Debmalya Panigrahi and Thatchaphol Saranurak and Sorrachai Yingchareonthawornchai}, title = {Vertex Connectivity in Polylogarithmic Max-Flows}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {317--329}, doi = {10.1145/3406325.3451088}, year = {2021}, } Publisher's Version 
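The Gomory–Hu entry above builds on the classical fact that n−1 max-flow calls suffice to answer all-pairs min-cut queries. A compact sketch of Gusfield's variant of the construction (an equivalent-flow tree; Edmonds–Karp max-flow on an adjacency matrix, with a toy graph of my own choosing in the usage note):

```python
from collections import deque

def max_flow(cap, s, t):
    # Edmonds-Karp; returns (flow value, BFS parent array marking the
    # source side of a minimum s-t cut in the final residual network).
    n = len(cap)
    res = [row[:] for row in cap]
    flow = 0
    while True:
        par = [-1] * n
        par[s] = s
        q = deque([s])
        while q:
            u = q.popleft()
            for v in range(n):
                if par[v] == -1 and res[u][v] > 0:
                    par[v] = u
                    q.append(v)
        if par[t] == -1:
            return flow, par
        aug, v = float('inf'), t
        while v != s:
            aug = min(aug, res[par[v]][v])
            v = par[v]
        v = t
        while v != s:
            res[par[v]][v] -= aug
            res[v][par[v]] += aug
            v = par[v]
        flow += aug

def gomory_hu(cap):
    # Gusfield's equivalent-flow tree: n-1 max-flow calls; returns
    # (parent, weight) arrays describing a tree rooted at vertex 0
    # that encodes all pairwise min-cut values.
    n = len(cap)
    parent, weight = [0] * n, [0] * n
    for i in range(1, n):
        f, par = max_flow(cap, i, parent[i])
        weight[i] = f
        for j in range(i + 1, n):
            # Re-hang j below i if j shares i's tree parent and lies
            # on i's side of the cut just computed.
            if par[j] != -1 and parent[j] == parent[i]:
                parent[j] = i
    return parent, weight

def tree_mincut(parent, weight, s, t):
    # s-t min-cut value = minimum edge weight on the s-t tree path.
    best = {s: float('inf')}
    v, w = s, float('inf')
    while v != 0:
        w = min(w, weight[v])
        v = parent[v]
        best[v] = w
    v, w = t, float('inf')
    while v not in best:
        w = min(w, weight[v])
        v = parent[v]
    return min(w, best[v])
```

On the 4-vertex graph with edges 0-1:3, 0-2:1, 1-2:1, 1-3:2, 2-3:4, only three max-flows are run, yet tree_mincut then matches a direct s-t max-flow computation for every pair (s, t).
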

Li, Ray 
STOC '21: "Settling SETH vs. Approximate ..."
Settling SETH vs. Approximate Sparse Directed Unweighted Diameter (up to (NU)NSETH)
Ray Li (Stanford University, USA) We prove several tight results on the fine-grained complexity of approximating the diameter of a graph. First, we prove that, for any ε>0, assuming the Strong Exponential Time Hypothesis (SETH), there are no near-linear time (2−ε)-approximation algorithms for the Diameter of a sparse directed graph, even in unweighted graphs. This result shows that a simple near-linear time 2-approximation algorithm for Diameter is optimal under SETH, answering a question from a survey of Rubinstein and Vassilevska Williams (SIGACT ’19) for the case of directed graphs. In the same survey, Rubinstein and Vassilevska Williams also asked if it is possible to show that there are no (2−ε)-approximation algorithms for Diameter in a directed graph in O(n^{1.499}) time. We show that, assuming a hypothesis called NSETH, one cannot use a deterministic SETH-based reduction to rule out the existence of such algorithms. Extending the techniques in these two results, we characterize whether a (2−ε)-approximation algorithm running in time O(n^{1+δ}) for the Diameter of a sparse directed unweighted graph can be ruled out by a deterministic SETH-based reduction for every δ∈(0,1) and essentially every ε∈(0,1), assuming NSETH. This settles the SETH-hardness of approximating the diameter of sparse directed unweighted graphs for deterministic reductions, up to NSETH. We make the same characterization for randomized SETH-based reductions, assuming another hypothesis called NUNSETH. We prove additional hardness and non-reducibility results for undirected graphs. @InProceedings{STOC21p1684, author = {Ray Li}, title = {Settling SETH vs. Approximate Sparse Directed Unweighted Diameter (up to (NU)NSETH)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1684-1696}, doi = {10.1145/3406325.3451045}, year = {2021}, } Publisher's Version
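For intuition, the "simple near-linear time 2-approximation" mentioned above is, in its undirected form, a single BFS: for any vertex v, ecc(v) ≤ diameter ≤ 2·ecc(v) by the triangle inequality. A minimal sketch (the example graph is made up; the directed variant additionally needs a backward BFS):

```python
from collections import deque

def bfs_ecc(adj, src):
    """Eccentricity of src in an unweighted graph, via one BFS."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return max(dist.values())

def approx_diameter(adj):
    """For any v, ecc(v) <= diameter <= 2*ecc(v) by the triangle
    inequality, so one BFS gives a 2-approximation in linear time."""
    return bfs_ecc(adj, next(iter(adj)))

# Path graph 0-1-2-3-4: true diameter 4.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 4] for i in range(5)}
print(approx_diameter(path))  # 4 here, since BFS starts at an endpoint
```

The paper's lower bound says that beating factor 2 in near-linear time on sparse directed graphs would refute SETH.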

Li, Yingkai 
STOC '21: "Revelation Gap for Pricing ..."
Revelation Gap for Pricing from Samples
Yiding Feng, Jason D. Hartline, and Yingkai Li (Northwestern University, USA) This paper considers prior-independent mechanism design, in which a single mechanism is designed to achieve approximately optimal performance on every prior distribution from a given class. Most results in this literature focus on mechanisms with truth-telling equilibria, a.k.a. truthful mechanisms. Feng and Hartline [FOCS 2018] introduce the revelation gap to quantify the loss of the restriction to truthful mechanisms. We solve a main open question left in Feng and Hartline [FOCS 2018]; namely, we identify a non-trivial revelation gap for revenue maximization. Our analysis focuses on the canonical problem of selling a single item to a single agent with only access to a single sample from the agent's valuation distribution. We identify the sample-bid mechanism (a simple non-truthful mechanism) and upper-bound its prior-independent approximation ratio by 1.835 (resp. 1.296) for regular (resp. MHR) distributions. We further prove that no truthful mechanism can achieve a prior-independent approximation ratio better than 1.957 (resp. 1.543) for regular (resp. MHR) distributions. Thus, a non-trivial revelation gap is shown, as the sample-bid mechanism outperforms the optimal prior-independent truthful mechanism. On the hardness side, we prove that no (possibly non-truthful) mechanism can achieve a prior-independent approximation ratio better than 1.073 even for uniform distributions. @InProceedings{STOC21p1438, author = {Yiding Feng and Jason D. Hartline and Yingkai Li}, title = {Revelation Gap for Pricing from Samples}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1438-1451}, doi = {10.1145/3406325.3451057}, year = {2021}, } Publisher's Version
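The paper's sample-bid mechanism is not reproduced here, but the flavor of pricing from one sample can be seen in the classic single-sample baseline, which simply posts the observed sample as the price. The Monte Carlo sketch below is a toy for uniform [0,1] valuations only (the numbers here are not the ratios in the abstract): it recovers the known expected revenue of 1/6 against an optimal monopoly revenue of 1/4.

```python
import random

random.seed(0)

# Hypothetical baseline (not the paper's sample-bid mechanism): post
# the single observed sample as the price. For v ~ Uniform[0,1] the
# expected revenue of posting price s is s*(1-s); averaging over
# s ~ Uniform[0,1] gives 1/6, versus the optimal monopoly revenue of
# 1/4 at price 1/2.
TRIALS = 200_000
revenue = 0.0
for _ in range(TRIALS):
    sample = random.random()  # the one sample from the prior
    value = random.random()   # the buyer's fresh valuation
    if value >= sample:
        revenue += sample
avg = revenue / TRIALS
print(avg)         # close to 1/6
print(0.25 / avg)  # approximation ratio close to 1.5
```

The paper's point is that smarter non-truthful use of the sample beats what any truthful mechanism can guarantee prior-independently.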

Ligett, Katrina 
STOC '21: "A New Analysis of Differential ..."
A New Analysis of Differential Privacy’s Generalization Guarantees (Invited Paper)
Christopher Jung, Katrina Ligett, Seth Neel, Aaron Roth, Saeed Sharifi-Malvajerdi, and Moshe Shenfeld (University of Pennsylvania, USA; Hebrew University of Jerusalem, Israel) We give a new proof of the "transfer theorem" underlying adaptive data analysis: that any mechanism for answering adaptively chosen statistical queries that is differentially private and sample-accurate is also accurate out-of-sample. Our new proof is elementary and gives structural insights that we expect will be useful elsewhere. We show: 1) that differential privacy ensures that the expectation of any query on the conditional distribution on datasets induced by the transcript of the interaction is close to its true value on the data distribution, and 2) that sample accuracy on its own ensures that any query answer produced by the mechanism is close to its conditional expectation with high probability. This second claim follows from a thought experiment in which we imagine that the dataset is resampled from the conditional distribution after the mechanism has committed to its answers. The transfer theorem then follows by summing these two bounds. An upshot of our new proof technique is that the concrete bounds we obtain are substantially better than the best previously known bounds. @InProceedings{STOC21p9, author = {Christopher Jung and Katrina Ligett and Seth Neel and Aaron Roth and Saeed Sharifi-Malvajerdi and Moshe Shenfeld}, title = {A New Analysis of Differential Privacy’s Generalization Guarantees (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {9-9}, doi = {10.1145/3406325.3465358}, year = {2021}, } Publisher's Version

Lin, Bingkai 
STOC '21: "Constant Approximating k-Clique ..."
Constant Approximating k-Clique Is W[1]-Hard
Bingkai Lin (Nanjing University, China) For every graph G, let ω(G) be the largest size of a complete subgraph in G. This paper presents a simple algorithm which, on input a graph G, a positive integer k, and a small constant ε>0, outputs a graph G′ and an integer k′ in 2^{Θ(k^{5})}·|G|^{O(1)} time such that (1) k′ ≤ 2^{Θ(k^{5})}, (2) if ω(G) ≥ k, then ω(G′) ≥ k′, and (3) if ω(G) < k, then ω(G′) < (1−ε)k′. This implies that no f(k)·|G|^{O(1)}-time algorithm can distinguish between the cases ω(G) ≥ k and ω(G) < k/c for any constant c ≥ 1 and computable function f, unless FPT = W[1]. @InProceedings{STOC21p1749, author = {Bingkai Lin}, title = {Constant Approximating k-Clique Is W[1]-Hard}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1749-1756}, doi = {10.1145/3406325.3451016}, year = {2021}, } Publisher's Version

Lin, Huijia 
STOC '21: "Indistinguishability Obfuscation ..."
Indistinguishability Obfuscation from Well-Founded Assumptions
Aayush Jain, Huijia Lin, and Amit Sahai (University of California at Los Angeles, USA; University of Washington, USA) Indistinguishability obfuscation, introduced by [Barak et al., CRYPTO 2001], aims to compile programs into unintelligible ones while preserving functionality. It is a fascinating and powerful object that has been shown to enable a host of new cryptographic goals and beyond. However, constructions of indistinguishability obfuscation have remained elusive, with all other proposals relying on heuristics or newly conjectured hardness assumptions. In this work, we show how to construct indistinguishability obfuscation from subexponential hardness of four well-founded assumptions. We prove: Informal Theorem: Let τ ∈ (0,∞), δ ∈ (0,1), ε ∈ (0,1) be arbitrary constants. Assume subexponential security of the following assumptions: (1) the Learning With Errors (LWE) assumption with subexponential modulus-to-noise ratio 2^{k^{ε}} and noises of magnitude polynomial in k, where k is the dimension of the LWE secret; (2) the Learning Parity with Noise (LPN) assumption over general prime fields ℤ_{p} with polynomially many LPN samples and error rate 1/ℓ^{δ}, where ℓ is the dimension of the LPN secret; (3) the existence of a Boolean Pseudo-Random Generator (PRG) in NC^{0} with stretch n^{1+τ}, where n is the length of the PRG seed; (4) the Decision Linear (DLIN) assumption on symmetric bilinear groups of prime order. Then, (subexponentially secure) indistinguishability obfuscation for all polynomial-size circuits exists. Further, assuming only polynomial security of the aforementioned assumptions, there exists collusion-resistant public-key functional encryption for all polynomial-size circuits. @InProceedings{STOC21p60, author = {Aayush Jain and Huijia Lin and Amit Sahai}, title = {Indistinguishability Obfuscation from Well-Founded Assumptions}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {60-73}, doi = {10.1145/3406325.3451093}, year = {2021}, } Publisher's Version

Liu, Allen 
STOC '21: "Settling the Robust Learnability ..."
Settling the Robust Learnability of Mixtures of Gaussians
Allen Liu and Ankur Moitra (Massachusetts Institute of Technology, USA) This work represents a natural coalescence of two important lines of work – learning mixtures of Gaussians and algorithmic robust statistics. In particular, we give the first provably robust algorithm for learning mixtures of any constant number of Gaussians. We require only mild assumptions on the mixing weights (bounded fractionality) and that the total variation distance between components is bounded away from zero. At the heart of our algorithm is a new method for proving dimension-independent polynomial identifiability through applying a carefully chosen sequence of differential operations to certain generating functions that not only encode the parameters we would like to learn but also the system of polynomial equations we would like to solve. We show how the symbolic identities we derive can be directly used to analyze a natural sum-of-squares relaxation. @InProceedings{STOC21p518, author = {Allen Liu and Ankur Moitra}, title = {Settling the Robust Learnability of Mixtures of Gaussians}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {518-531}, doi = {10.1145/3406325.3451084}, year = {2021}, } Publisher's Version

Liu, Kuikui 
STOC '21: "Optimal Mixing of Glauber ..."
Optimal Mixing of Glauber Dynamics: Entropy Factorization via High-Dimensional Expansion
Zongchen Chen, Kuikui Liu, and Eric Vigoda (Georgia Institute of Technology, USA; University of Washington, USA) We prove an optimal mixing time bound for the single-site update Markov chain known as the Glauber dynamics or Gibbs sampling in a variety of settings. Our work presents an improved version of the spectral independence approach of Anari et al. (2020) and shows O(n log n) mixing time on any n-vertex graph of bounded degree when the maximum eigenvalue of an associated influence matrix is bounded. As an application of our results, for the hard-core model on independent sets weighted by a fugacity λ, we establish O(n log n) mixing time for the Glauber dynamics on any n-vertex graph of constant maximum degree Δ when λ < λ_{c}(Δ), where λ_{c}(Δ) is the critical point for the uniqueness/non-uniqueness phase transition on the Δ-regular tree. More generally, for any antiferromagnetic 2-spin system we prove O(n log n) mixing time of the Glauber dynamics on any bounded-degree graph in the corresponding tree uniqueness region. Our results apply more broadly; for example, we also obtain O(n log n) mixing for q-colorings of triangle-free graphs of maximum degree Δ when the number of colors satisfies q > αΔ, where α ≈ 1.763, and O(m log n) mixing for generating random matchings of any graph with bounded degree and m edges. Our approach is based on two steps. First, we show that the approximate tensorization of entropy (i.e., factorizing entropy into single vertices), which is a key step for establishing the modified log-Sobolev inequality in many previous works, can be deduced from entropy factorization into blocks of fixed linear size. Second, we adapt the local-to-global scheme of Alev and Lau (2020) to establish such block factorization of entropy in a more general setting of pure weighted simplicial complexes satisfying local spectral expansion; this also substantially generalizes the result of Cryan et al. (2019).
@InProceedings{STOC21p1537, author = {Zongchen Chen and Kuikui Liu and Eric Vigoda}, title = {Optimal Mixing of Glauber Dynamics: Entropy Factorization via High-Dimensional Expansion}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1537-1550}, doi = {10.1145/3406325.3451035}, year = {2021}, } Publisher's Version STOC '21: "Log-Concave Polynomials IV: ..." Log-Concave Polynomials IV: Approximate Exchange, Tight Mixing Times, and Near-Optimal Sampling of Forests Nima Anari, Kuikui Liu, Shayan Oveis Gharan, Cynthia Vinzant, and Thuy-Duong Vuong (Stanford University, USA; University of Washington, USA; North Carolina State University, USA) We prove tight mixing time bounds for natural random walks on bases of matroids, determinantal distributions, and more generally distributions associated with log-concave polynomials. For a matroid of rank k on a ground set of n elements, or more generally distributions associated with log-concave polynomials of homogeneous degree k on n variables, we show that the down-up random walk, started from an arbitrary point in the support, mixes in time O(k log k). Our bound has no dependence on n or the starting point, unlike the previous analyses of Anari et al. (STOC 2019) and Cryan et al. (FOCS 2019), and is tight up to constant factors. The main new ingredient is a property we call approximate exchange, a generalization of well-studied exchange properties for matroids and valuated matroids, which may be of independent interest. In particular, given a distribution µ over size-k subsets of [n], our approximate exchange property implies that a simple local search algorithm gives a k^{O(k)}-approximation of max_{S} µ(S) when µ is generated by a log-concave polynomial, and that greedy gives the same approximation ratio when µ is strongly Rayleigh. As an application, we show how to leverage down-up random walks to approximately sample random forests or random spanning trees in a graph with n edges in time O(n log^{2} n).
The best known result for sampling random forests was an FPAUS with high polynomial runtime recently found by Anari et al. (STOC 2019) and Cryan et al. (FOCS 2019). For spanning trees, we improve on the almost-linear time algorithm by Schild (STOC 2018). Our analysis works on weighted graphs too, and is the first to achieve a nearly-linear running time for these problems. Our algorithms can be naturally extended to support approximately sampling from random forests of size between k_{1} and k_{2} in time O(n log^{2} n), for fixed parameters k_{1}, k_{2}. @InProceedings{STOC21p408, author = {Nima Anari and Kuikui Liu and Shayan Oveis Gharan and Cynthia Vinzant and Thuy-Duong Vuong}, title = {Log-Concave Polynomials IV: Approximate Exchange, Tight Mixing Times, and Near-Optimal Sampling of Forests}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {408-420}, doi = {10.1145/3406325.3451091}, year = {2021}, } Publisher's Version Info
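To make the chain in the first abstract above concrete, here is a minimal sketch of single-site Glauber dynamics for the hard-core model; the graph, fugacity, and step count are hypothetical choices for illustration, and no mixing-time claim is made for them.

```python
import random

random.seed(1)

def glauber_hardcore(adj, lam, steps, rng=random):
    """Single-site Glauber dynamics for the hard-core model at
    fugacity lam: pick a uniformly random vertex and resample its
    occupancy from the conditional distribution given its neighbors
    (occupied with probability lam/(1+lam) iff no neighbor is
    occupied), so the chain moves on independent sets."""
    config = {v: False for v in adj}  # start from the empty set
    vertices = list(adj)
    for _ in range(steps):
        v = rng.choice(vertices)
        if any(config[u] for u in adj[v]):
            config[v] = False  # v is blocked by an occupied neighbor
        else:
            config[v] = rng.random() < lam / (1 + lam)
    return config

# A 4-cycle with a small fugacity, well inside the uniqueness region.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
final = glauber_hardcore(cycle, lam=0.5, steps=10_000)
occupied = {v for v, occ in final.items() if occ}
# Every state visited is an independent set:
assert all(u not in cycle[v] for v in occupied for u in occupied)
print(occupied)
```

The paper's result is that, in the tree uniqueness region, this chain reaches (approximate) stationarity after O(n log n) such updates.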

Liu, Yang P. 
STOC '21: "Minimum Cost Flows, MDPs, ..."
Minimum Cost Flows, MDPs, and ℓ_{1}-Regression in Nearly Linear Time for Dense Instances
Jan van den Brand, Yin Tat Lee, Yang P. Liu, Thatchaphol Saranurak, Aaron Sidford, Zhao Song, and Di Wang (KTH, Sweden; University of Washington, USA; Microsoft Research, USA; Stanford University, USA; University of Michigan, USA; Princeton University, USA; Institute for Advanced Study at Princeton, USA; Google Research, USA) In this paper we provide new randomized algorithms with improved runtimes for solving linear programs with two-sided constraints. In the special case of the minimum cost flow problem on n-vertex m-edge graphs with integer polynomially bounded costs and capacities we obtain a randomized method which solves the problem in Õ(m + n^{1.5}) time. This improves upon the previous best runtime of Õ(m√n) [Lee-Sidford ’14] and, in the special case of unit-capacity maximum flow, improves upon the previous best runtimes of m^{4/3 + o(1)} [Liu-Sidford ’20, Kathuria ’20] and Õ(m√n) [Lee-Sidford ’14] for sufficiently dense graphs. In the case of ℓ_{1}-regression in a matrix with n columns and m rows we obtain a randomized method which computes an ε-approximate solution in Õ(mn + n^{2.5}) time. This yields a randomized method which computes an ε-optimal policy of a discounted Markov Decision Process with S states and A actions per state in time Õ(S^{2}A + S^{2.5}). These methods improve upon the previous best runtimes of methods which depend polylogarithmically on problem parameters, which were Õ(mn^{1.5}) [Lee-Sidford ’15] and Õ(S^{2.5}A) [Lee-Sidford ’14, Sidford-Wang-Wu-Ye ’18], respectively. To obtain this result we introduce two new algorithmic tools of possible independent interest. First, we design a new general interior point method for solving linear programs with two-sided constraints which combines techniques from [Lee-Song-Zhang ’19, Brand et al. ’20] to obtain a robust stochastic method with iteration count nearly the square root of the smaller dimension.
Second, to implement this method we provide dynamic data structures for efficiently maintaining approximations to variants of Lewis weights, a fundamental importance measure for matrices which generalize leverage scores and effective resistances. @InProceedings{STOC21p859, author = {Jan van den Brand and Yin Tat Lee and Yang P. Liu and Thatchaphol Saranurak and Aaron Sidford and Zhao Song and Di Wang}, title = {Minimum Cost Flows, MDPs, and ℓ<sub>1</sub>-Regression in Nearly Linear Time for Dense Instances}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {859-869}, doi = {10.1145/3406325.3451108}, year = {2021}, } Publisher's Version STOC '21: "Discrepancy Minimization via ..." Discrepancy Minimization via a Self-Balancing Walk Ryan Alweiss, Yang P. Liu, and Mehtaab Sawhney (Princeton University, USA; Stanford University, USA; Massachusetts Institute of Technology, USA) We study discrepancy minimization for vectors in ℝ^{n} under various settings. The main result is the analysis of a new simple random process in high dimensions through a comparison argument. As corollaries, we obtain bounds which are tight up to logarithmic factors for online vector balancing against oblivious adversaries, resolving several questions posed by Bansal, Jiang, Singla, and Sinha (STOC 2020), as well as a linear time algorithm for logarithmic bounds for the Komlós conjecture. @InProceedings{STOC21p14, author = {Ryan Alweiss and Yang P. Liu and Mehtaab Sawhney}, title = {Discrepancy Minimization via a Self-Balancing Walk}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {14-20}, doi = {10.1145/3406325.3450994}, year = {2021}, } Publisher's Version Video
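The "simple random process" of the second abstract admits a very short implementation: signs are chosen with a bias that pushes the running signed sum back toward the origin. Below is a hedged sketch; the sign rule is the published one, while the dimension, number of vectors, and scale choice c = 30·log(nT) are illustrative assumptions rather than the paper's exact parameters.

```python
import math
import random

random.seed(2)

def self_balancing_walk(vectors, c, rng=random):
    """Assign signs online so the running signed sum stays balanced:
    vector v gets sign +1 with probability 1/2 - <w, v>/(2c), where
    w is the current signed sum; the inner product is clipped to
    [-c, c] so this is always a valid probability."""
    w = [0.0] * len(vectors[0])
    signs = []
    for v in vectors:
        ip = sum(wi * vi for wi, vi in zip(w, v))
        ip = max(-c, min(c, ip))
        sign = 1 if rng.random() < 0.5 - ip / (2 * c) else -1
        signs.append(sign)
        w = [wi + sign * vi for wi, vi in zip(w, v)]
    return signs, w

# T random vectors in R^n with Euclidean norm at most 1.
n, T = 16, 2000
vecs = [[random.uniform(-1.0, 1.0) / math.sqrt(n) for _ in range(n)]
        for _ in range(T)]
signs, w = self_balancing_walk(vecs, c=30 * math.log(n * T))
print(max(abs(x) for x in w))  # stays small, far below the ~sqrt(T) of i.i.d. signs
```

The analysis in the paper shows the discrepancy stays polylogarithmic with high probability, even against an oblivious adversary choosing the vectors.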

Liu, Yanyi 
STOC '21: "Cryptography from SublinearTime ..."
Cryptography from Sublinear-Time Average-Case Hardness of Time-Bounded Kolmogorov Complexity
Yanyi Liu and Rafael Pass (Cornell University, USA) Let MK^{t}P[s] be the set of strings x such that K^{t}(x) ≤ s(|x|), where K^{t}(x) denotes the t-bounded Kolmogorov complexity of the truth-table described by x. Our main theorem shows that for an appropriate notion of mild average-case hardness, for every ε>0, polynomial t(n) ≥ (1+ε)n, and every “nice” class F of superpolynomial functions, the following are equivalent: (i) the existence of some function T ∈ F such that T-hard one-way functions (OWFs) exist (with non-uniform security); (ii) the existence of some function T ∈ F such that MK^{t}P[T^{−1}] is mildly average-case hard with respect to sublinear-time non-uniform algorithms (with running time n^{δ} for some 0<δ<1). For instance, the existence of subexponentially-hard (resp. quasi-polynomially-hard) OWFs is equivalent to mild average-case hardness of MK^{t}P[poly log n] (resp. MK^{t}P[2^{O(√log n)}]) w.r.t. sublinear-time non-uniform algorithms. We additionally note that if we want to deduce T-hard OWFs where security holds w.r.t. uniform T-time probabilistic attackers (i.e., uniformly-secure OWFs), it suffices to assume sublinear-time hardness of MK^{t}P w.r.t. uniform probabilistic sublinear-time attackers. We complement this result by proving lower bounds that come surprisingly close to what is required to unconditionally deduce the existence of (uniformly-secure) OWFs: MK^{t}P[poly log n] is worst-case hard w.r.t. uniform probabilistic sublinear-time algorithms, and MK^{t}P[n−log n] is mildly average-case hard for all O(t(n)/n^{3})-time deterministic algorithms. @InProceedings{STOC21p722, author = {Yanyi Liu and Rafael Pass}, title = {Cryptography from Sublinear-Time Average-Case Hardness of Time-Bounded Kolmogorov Complexity}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {722-735}, doi = {10.1145/3406325.3451121}, year = {2021}, } Publisher's Version

Lokshtanov, Daniel 
STOC '21: "Finding Large Induced Sparse ..."
Finding Large Induced Sparse Subgraphs in C_{>t}-Free Graphs in Quasipolynomial Time
Peter Gartland, Daniel Lokshtanov, Marcin Pilipczuk, Michał Pilipczuk, and Paweł Rzążewski (University of California at Santa Barbara, USA; University of Warsaw, Poland; Warsaw University of Technology, Poland) For an integer t, a graph G is called C_{>t}-free if G does not contain any induced cycle on more than t vertices. We prove the following statement: for every pair of integers d and t and a statement φ, there exists an algorithm that, given an n-vertex C_{>t}-free graph G with weights on vertices, finds in time n^{O(log^{3} n)} a maximum-weight vertex subset S such that G[S] has degeneracy at most d and satisfies φ. The running time can be improved to n^{O(log^{2} n)} assuming G is P_{t}-free, that is, G does not contain an induced path on t vertices. This expands the recent results of the authors [FOCS 2020 and SOSA 2021] on the Maximum Weight Independent Set problem on P_{t}-free graphs in two directions: by encompassing the more general setting of C_{>t}-free graphs, and by being applicable to a much wider variety of problems, such as Maximum Weight Induced Forest or Maximum Weight Induced Planar Graph. @InProceedings{STOC21p330, author = {Peter Gartland and Daniel Lokshtanov and Marcin Pilipczuk and Michał Pilipczuk and Paweł Rzążewski}, title = {Finding Large Induced Sparse Subgraphs in <i>C<sub>>t</sub></i>-Free Graphs in Quasipolynomial Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {330-341}, doi = {10.1145/3406325.3451034}, year = {2021}, } Publisher's Version

Lombardi, Alex 
STOC '21: "Fiat–Shamir via List-Recoverable ..."
Fiat–Shamir via List-Recoverable Codes (or: Parallel Repetition of GMW Is Not Zero-Knowledge)
Justin Holmgren, Alex Lombardi, and Ron D. Rothblum (NTT Research, USA; Massachusetts Institute of Technology, USA; Technion, Israel) In a seminal work, Goldreich, Micali, and Wigderson (CRYPTO ’86) demonstrated the wide applicability of zero-knowledge proofs by constructing such a proof system for the NP-complete problem of graph 3-coloring. A longstanding open question has been whether parallel repetition of their protocol preserves zero knowledge. In this work, we answer this question in the negative, assuming a standard cryptographic assumption (i.e., the hardness of learning with errors (LWE)). Leveraging a connection observed by Dwork, Naor, Reingold, and Stockmeyer (FOCS ’99), our negative result is obtained by making positive progress on a related fundamental problem in cryptography: securely instantiating the Fiat-Shamir heuristic for eliminating interaction in public-coin interactive protocols. A recent line of work has shown how to instantiate the heuristic securely, albeit only for a limited class of protocols. Our main result shows how to instantiate Fiat-Shamir for parallel repetitions of much more general interactive proofs. In particular, we construct hash functions that, assuming LWE, securely realize the Fiat-Shamir transform for the following rich classes of protocols: 1) The parallel repetition of any “commit-and-open” protocol (such as the GMW protocol mentioned above), when a specific (natural) commitment scheme is used. Commit-and-open protocols are a ubiquitous paradigm for constructing general-purpose public-coin zero-knowledge proofs. 2) The parallel repetition of any base protocol that (1) satisfies a stronger notion of soundness called round-by-round soundness, and (2) has an efficient procedure, using a suitable trapdoor, for recognizing “bad verifier randomness” that would allow the prover to cheat. Our results are obtained by establishing a new connection between the Fiat-Shamir transform and list-recoverable codes.
In contrast to the usual focus in coding theory, we focus on a parameter regime in which the input lists are extremely large, but the rate can be small. We give a (probabilistic) construction based on Parvaresh-Vardy codes (FOCS ’05) that suffices for our applications. @InProceedings{STOC21p750, author = {Justin Holmgren and Alex Lombardi and Ron D. Rothblum}, title = {Fiat–Shamir via List-Recoverable Codes (or: Parallel Repetition of GMW Is Not Zero-Knowledge)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {750-760}, doi = {10.1145/3406325.3451116}, year = {2021}, } Publisher's Version
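As background for the results above, the Fiat-Shamir transform itself is easy to state in code: replace the verifier's random challenge with a hash of the transcript. Here is a toy, insecure Schnorr-style sketch; the tiny parameters p = 23, q = 11, g = 2 are illustrative assumptions, and using SHA-256 as the hash is exactly the kind of heuristic instantiation whose soundness the paper's LWE-based hash families are designed to replace.

```python
import hashlib

# Toy, insecure parameters for illustration only: the subgroup of
# order q = 11 in Z_23^* generated by g = 2.
p, q, g = 23, 11, 2

def H(*parts):
    """Fiat-Shamir: derive the challenge by hashing the transcript."""
    data = "|".join(map(str, parts)).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def prove(x, r):
    """Non-interactive Schnorr-style proof of knowledge of x, where
    the statement is y = g^x mod p and r is the prover's nonce."""
    y = pow(g, x, p)
    a = pow(g, r, p)      # commitment (first sigma-protocol message)
    e = H(g, y, a)        # hashed challenge replaces the verifier
    z = (r + e * x) % q   # response
    return y, a, z

def verify(y, a, z):
    e = H(g, y, a)        # verifier recomputes the same challenge
    return pow(g, z, p) == (a * pow(y, e, p)) % p

y, a, z = prove(x=7, r=3)
print(verify(y, a, z))            # True
print(verify(y, a, (z + 1) % q))  # False: tampered response
```

Removing the interaction this way is what the paper's hash functions make sound for parallel-repeated commit-and-open protocols.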

Lovett, Shachar 
STOC '21: "Log-Rank and Lifting for AND-Functions ..."
Log-Rank and Lifting for AND-Functions
Alexander Knop, Shachar Lovett, Sam McGuire, and Weiqiang Yuan (University of California at San Diego, USA; Tsinghua University, China) Let f: {0, 1}^{n} → {0, 1} be a boolean function, and let f_{∧}(x, y) = f(x ∧ y) denote the AND-function of f, where x ∧ y denotes bitwise AND. We study the deterministic communication complexity of f_{∧} and show that, up to a log n factor, it is bounded by a polynomial in the logarithm of the real rank of the communication matrix of f_{∧}. This comes within a log n factor of establishing the log-rank conjecture for AND-functions with no assumptions on f. Our result stands in contrast with previous results on special cases of the log-rank conjecture, which needed significant restrictions on f such as monotonicity or low F_{2}-degree. Our techniques can also be used to prove (within a log n factor) a lifting theorem for AND-functions, stating that the deterministic communication complexity of f_{∧} is polynomially related to the AND-decision tree complexity of f. The results rely on a new structural result regarding boolean functions f: {0, 1}^{n} → {0, 1} with a sparse polynomial representation, which may be of independent interest. We show that if the polynomial computing f has few monomials then the set system of the monomials has a small hitting set, of size polylogarithmic in its sparsity. We also establish extensions of this result to multilinear polynomials f: {0, 1}^{n} → ℝ with a larger range. @InProceedings{STOC21p197, author = {Alexander Knop and Shachar Lovett and Sam McGuire and Weiqiang Yuan}, title = {Log-Rank and Lifting for AND-Functions}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {197-208}, doi = {10.1145/3406325.3450999}, year = {2021}, } Publisher's Version
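A small sketch of the central object above: build the communication matrix M[x][y] = f(x ∧ y) for a toy f (OR on two bits, an illustrative choice) and compute its exact real rank, the quantity the log-rank conjecture ties to deterministic communication complexity.

```python
from fractions import Fraction
from itertools import product

def f(bits):
    """Toy inner function (illustrative choice): OR of the bits."""
    return int(any(bits))

n = 2
points = list(product((0, 1), repeat=n))

# Communication matrix of the AND-function: M[x][y] = f(x AND y).
M = [[Fraction(f(tuple(a & b for a, b in zip(x, y)))) for y in points]
     for x in points]

def rank(mat):
    """Exact real rank via Gaussian elimination over the rationals."""
    mat = [row[:] for row in mat]
    r = 0
    for col in range(len(mat[0])):
        piv = next((i for i in range(r, len(mat)) if mat[i][col]), None)
        if piv is None:
            continue
        mat[r], mat[piv] = mat[piv], mat[r]
        for i in range(len(mat)):
            if i != r and mat[i][col]:
                factor = mat[i][col] / mat[r][col]
                mat[i] = [a - factor * b for a, b in zip(mat[i], mat[r])]
        r += 1
    return r

print(rank(M))  # 3 for OR on two bits
```

Exact rational arithmetic avoids the pivoting pitfalls of floating-point rank computations on 0/1 matrices.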

Lu, Zhenjian 
STOC '21: "Pseudodeterministic Algorithms ..."
Pseudodeterministic Algorithms and the Structure of Probabilistic Time
Zhenjian Lu, Igor C. Oliveira, and Rahul Santhanam (University of Warwick, UK; University of Oxford, UK) We connect the study of pseudodeterministic algorithms to two major open problems about the structural complexity of BPTIME: proving hierarchy theorems and showing the existence of complete problems. Our main contributions can be summarised as follows. A new pseudorandom generator and its consequences. We build on techniques developed to prove hierarchy theorems for probabilistic time with advice (Fortnow and Santhanam, FOCS 2004) to construct the first unconditional pseudorandom generator of polynomial stretch computable in pseudodeterministic polynomial time (with one bit of advice) that is secure infinitely often against polynomialtime computations. As an application of this construction, we obtain new results about the complexity of generating and representing prime numbers. For instance, we show unconditionally for each ε > 0 that infinitely many primes p_{n} have a succinct representation in the following sense: there is a fixed probabilistic polynomial time algorithm that generates p_{n} with high probability from its succinct representation of size O(p_{n}^{ε}). This offers an exponential improvement over the running time of previous results, and shows that infinitely many primes have succinct and efficient representations. Structural results for probabilistic time from pseudodeterministic algorithms. Oliveira and Santhanam (STOC 2017) established unconditionally that there is a pseudodeterministic algorithm for the Circuit Acceptance Probability Problem (CAPP) that runs in subexponential time and is correct with high probability over any samplable distribution on circuits on infinitely many input lengths. We show that improving this running time or obtaining a result that holds for every large input length would imply new time hierarchy theorems for probabilistic time. 
In addition, we prove that a worst-case polynomial-time pseudodeterministic algorithm for CAPP would imply that BPP has complete problems. Equivalence between pseudodeterministic constructions and hierarchies. We establish an equivalence between a certain explicit pseudodeterministic construction problem and the existence of strong hierarchy theorems for probabilistic time. More precisely, we show that pseudodeterministically constructing in exponential time strings of large rKt complexity (Oliveira, ICALP 2019) is possible if and only if for every constructive function T(n) ≤ exp(o(exp(n))) we have BPTIME[poly(T)] ⊈ i.o.-BPTIME[T]/log T. More generally, these results suggest new approaches for designing pseudodeterministic algorithms for search problems and for unveiling the structure of probabilistic time. @InProceedings{STOC21p303, author = {Zhenjian Lu and Igor C. Oliveira and Rahul Santhanam}, title = {Pseudodeterministic Algorithms and the Structure of Probabilistic Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {303-316}, doi = {10.1145/3406325.3451085}, year = {2021}, } Publisher's Version

Lykouris, Thodoris 
STOC '21: "Contextual Search in the Presence ..."
Contextual Search in the Presence of Irrational Agents
Akshay Krishnamurthy, Thodoris Lykouris, Chara Podimata, and Robert Schapire (Microsoft Research, USA; Harvard University, USA) We study contextual search, a generalization of binary search in higher dimensions, which captures settings such as feature-based dynamic pricing. Standard game-theoretic formulations of this problem assume that agents act in accordance with a specific behavioral model. In practice, some agents may not subscribe to the dominant behavioral model or may act in ways that are seemingly arbitrarily irrational. Existing algorithms heavily depend on the behavioral model being (approximately) accurate for all agents and have poor performance even with a few arbitrarily irrational agents. We initiate the study of contextual search when some of the agents can behave in ways inconsistent with the underlying behavioral model. In particular, we provide two algorithms, one based on multidimensional binary search methods and one based on gradient descent. Our techniques draw inspiration from learning theory, game theory, high-dimensional geometry, and convex analysis. @InProceedings{STOC21p910, author = {Akshay Krishnamurthy and Thodoris Lykouris and Chara Podimata and Robert Schapire}, title = {Contextual Search in the Presence of Irrational Agents}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {910-918}, doi = {10.1145/3406325.3451120}, year = {2021}, } Publisher's Version
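The one-dimensional special case makes the fragility discussed above easy to see: plain binary-search pricing is excellent under perfectly rational feedback, but a single inconsistent response permanently discards the half of the interval containing the truth. A toy sketch (the value θ and the round count are made up; the paper's algorithms are designed to avoid exactly this failure mode):

```python
def binary_search_pricing(agent_response, rounds=30):
    """One-dimensional special case of contextual search: post the
    midpoint price and halve the feasible interval using the
    agent's buy/no-buy feedback."""
    lo, hi = 0.0, 1.0
    for _ in range(rounds):
        price = (lo + hi) / 2
        if agent_response(price):
            lo = price  # agent bought: value is at least price
        else:
            hi = price  # agent passed: value is below price
    return (lo + hi) / 2

theta = 0.6180339887  # the agent's unknown value (made up)

def rational(price):
    return theta >= price

def one_lie(price, state={"lied": False}):
    """Answers irrationally exactly once, then rationally forever."""
    if not state["lied"]:
        state["lied"] = True
        return not (theta >= price)
    return theta >= price

est = binary_search_pricing(rational)
est2 = binary_search_pricing(one_lie)
print(abs(est - theta) < 1e-6)  # True: error halves every round
print(abs(est2 - theta) > 0.1)  # True: one lie derails the search
```

Against the rational agent the error shrinks geometrically; after the single lie, the true value lies outside the maintained interval forever.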

Lyu, Xin 
STOC '21: "Inverse-Exponential Correlation ..."
Inverse-Exponential Correlation Bounds and Extremely Rigid Matrices from a New Derandomized XOR Lemma
Lijie Chen and Xin Lyu (Massachusetts Institute of Technology, USA; Tsinghua University, China) In this work we prove that there is a function f ∈ E^{NP} such that, for every sufficiently large n and d = √n/log n, f_{n} (f restricted to n-bit inputs) cannot be (1/2 + 2^{−d})-approximated by F_{2}-polynomials of degree d. We also observe that a minor improvement (e.g., improving d to n^{1/2+ε} for any ε > 0) over our result would imply that E^{NP} cannot be computed by depth-3 AC^{0} circuits of size 2^{n^{1/2+ε}}, which is a notoriously hard open question in complexity theory. Using the same proof techniques, we are also able to construct extremely rigid matrices over F_{2} in P^{NP}. More specifically, we show that for every constant ε ∈ (0,1), there is a P^{NP} algorithm which on input 1^{n} outputs an n × n F_{2}-matrix H_{n} satisfying R_{H_{n}}(2^{log^{1−ε} n}) ≥ (1/2 − exp(−log^{2/3·ε} n)) · n^{2} for every sufficiently large n. This improves the recent P^{NP} constructions of rigid matrices in [Alman and Chen, FOCS 2019] and [Bhangale et al., FOCS 2020], which only give Ω(n^{2}) rigidity. The key ingredient in the proof of our new results is a new derandomized XOR lemma based on approximate linear sums, which roughly says that given an n-input function f which cannot be 0.99-approximated by a certain linear sum of s many functions in F within ℓ_{1}-distance, one can construct a new function Amp^{f} with O(n) input bits, which cannot be (1/2 + s^{−Ω(1)})-approximated by F-functions. Taking F to be a function collection containing low-degree F_{2}-polynomials or low-rank F_{2}-matrices, our results are then obtained by first using the algorithmic method to construct a function which is weakly hard against linear sums of F in the above sense, and then applying the derandomized XOR lemma to f. We obtain our new derandomized XOR lemma by giving a generalization of the famous hard-core lemma by Impagliazzo.
Our generalization in some sense constructs a non-Boolean hardcore of a weakly hard function f with respect to F-functions, from the weak inapproximability of f by any linear sum of F with bounded ℓ_{p}-norm. This generalization recovers the original hardcore lemma by considering the ℓ_{∞}-norm. Surprisingly, when we switch to the ℓ_{1}-norm, we immediately rediscover Levin’s proof of Yao’s XOR Lemma. That is, these first two proofs of Yao’s XOR Lemma can be unified with our new perspective. For proving the correlation bounds, our new derandomized XOR lemma indeed works with the ℓ_{4/3}-norm. @InProceedings{STOC21p761, author = {Lijie Chen and Xin Lyu}, title = {Inverse-Exponential Correlation Bounds and Extremely Rigid Matrices from a New Derandomized XOR Lemma}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {761--771}, doi = {10.1145/3406325.3451132}, year = {2021}, } Publisher's Version Info 

Mangoubi, Oren 
STOC '21: "Greedy Adversarial Equilibrium: ..."
Greedy Adversarial Equilibrium: An Efficient Alternative to Nonconvex-Nonconcave Min-Max Optimization
Oren Mangoubi and Nisheeth K. Vishnoi (Worcester Polytechnic Institute, USA; Yale University, USA) Min-max optimization of an objective function f: ℝ^{d} × ℝ^{d} → ℝ is an important model for robustness in an adversarial setting, with applications to many areas including optimization, economics, and deep learning. In many applications f may be nonconvex-nonconcave, and finding a global min-max point may be computationally intractable. There is a long line of work that seeks computationally tractable algorithms for alternatives to the min-max optimization model. However, many of the alternative models have solution points which are only guaranteed to exist under strong assumptions on f, such as convexity, monotonicity, or special properties of the starting point. We propose an optimization model, the ε-greedy adversarial equilibrium, and show that it can serve as a computationally tractable alternative to the min-max optimization model. Roughly, we say that a point (x^{⋆}, y^{⋆}) is an ε-greedy adversarial equilibrium if y^{⋆} is an ε-approximate local maximum for f(x^{⋆},·), and x^{⋆} is an ε-approximate local minimum for a “greedy approximation” to the function max_{z} f(x, z) which can be efficiently estimated using second-order optimization algorithms. We prove the existence of such a point for any smooth function which is bounded and has a Lipschitz Hessian. To prove existence, we introduce an algorithm that converges from any starting point to an ε-greedy adversarial equilibrium in a number of evaluations of the function f, the max-player’s gradient ∇_{y} f(x,y), and its Hessian ∇_{y}^{2} f(x,y) that is polynomial in the dimension d, 1/ε, and the bounds on f and its Lipschitz constant. @InProceedings{STOC21p896, author = {Oren Mangoubi and Nisheeth K. Vishnoi}, title = {Greedy Adversarial Equilibrium: An Efficient Alternative to Nonconvex-Nonconcave Min-Max Optimization}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {896--909}, doi = {10.1145/3406325.3451097}, year = {2021}, } Publisher's Version 

Manurangsi, Pasin 
STOC '21: "Sample-Efficient Proper PAC ..."
Sample-Efficient Proper PAC Learning with Approximate Differential Privacy
Badih Ghazi, Noah Golowich, Ravi Kumar, and Pasin Manurangsi (Google Research, USA; Massachusetts Institute of Technology, USA) In this paper we prove that the sample complexity of properly learning a class of Littlestone dimension d with approximate differential privacy is Õ(d^{6}), ignoring privacy and accuracy parameters. This result answers a question of Bun et al. (FOCS 2020) by improving upon their upper bound of 2^{O(d)} on the sample complexity. Prior to our work, finiteness of the sample complexity for privately learning a class of finite Littlestone dimension was only known for improper private learners, and the fact that our learner is proper answers another question of Bun et al., which was also asked by Bousquet et al. (NeurIPS 2020). Using machinery developed by Bousquet et al., we then show that the sample complexity of sanitizing a binary hypothesis class is at most polynomial in its Littlestone dimension and dual Littlestone dimension. This implies that a class is sanitizable if and only if it has finite Littlestone dimension. An important ingredient of our proofs is a new property of binary hypothesis classes that we call irreducibility, which may be of independent interest. @InProceedings{STOC21p183, author = {Badih Ghazi and Noah Golowich and Ravi Kumar and Pasin Manurangsi}, title = {Sample-Efficient Proper PAC Learning with Approximate Differential Privacy}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {183--196}, doi = {10.1145/3406325.3451028}, year = {2021}, } Publisher's Version 

Maus, Yannic 
STOC '21: "Efficient Randomized Distributed ..."
Efficient Randomized Distributed Coloring in CONGEST
Magnús M. Halldórsson, Fabian Kuhn, Yannic Maus, and Tigran Tonoyan (Reykjavik University, Iceland; University of Freiburg, Germany; Technion, Israel) Distributed vertex coloring is one of the classic, and probably the most widely studied, problems in the area of distributed graph algorithms. We present a new randomized distributed vertex coloring algorithm for the standard CONGEST model, where the network is modeled as an n-node graph G, and where the nodes of G operate in synchronous communication rounds in which they can exchange O(log n)-bit messages over all the edges of G. For graphs with maximum degree Δ, we show that the (Δ+1)-list coloring problem (and therefore also the standard (Δ+1)-coloring problem) can be solved in O(log^{5} log n) rounds. Previously such a result was only known for the significantly more powerful LOCAL model, where in each round, neighboring nodes can exchange messages of arbitrary size. The best previous (Δ+1)-coloring algorithm in the CONGEST model had a running time of O(log Δ + log^{6} log n) rounds. As a function of n alone, the best previous algorithm therefore had a round complexity of O(log n), a bound that can also be achieved by a naïve folklore algorithm. For large maximum degree Δ, our algorithm hence is an exponential improvement over the previous state of the art. @InProceedings{STOC21p1180, author = {Magnús M. Halldórsson and Fabian Kuhn and Yannic Maus and Tigran Tonoyan}, title = {Efficient Randomized Distributed Coloring in CONGEST}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1180--1193}, doi = {10.1145/3406325.3451089}, year = {2021}, } Publisher's Version 

McGuire, Sam 
STOC '21: "Log-Rank and Lifting for AND-Functions ..."
Log-Rank and Lifting for AND-Functions
Alexander Knop, Shachar Lovett, Sam McGuire, and Weiqiang Yuan (University of California at San Diego, USA; Tsinghua University, China) Let f: {0, 1}^{n} → {0, 1} be a boolean function, and let f_{∧}(x, y) = f(x ∧ y) denote the AND-function of f, where x ∧ y denotes bitwise AND. We study the deterministic communication complexity of f_{∧} and show that, up to a log n factor, it is bounded by a polynomial in the logarithm of the real rank of the communication matrix of f_{∧}. This comes within a log n factor of establishing the log-rank conjecture for AND-functions with no assumptions on f. Our result stands in contrast with previous results on special cases of the log-rank conjecture, which needed significant restrictions on f such as monotonicity or low F_{2}-degree. Our techniques can also be used to prove (within a log n factor) a lifting theorem for AND-functions, stating that the deterministic communication complexity of f_{∧} is polynomially related to the AND-decision tree complexity of f. The results rely on a new structural result regarding boolean functions f: {0, 1}^{n} → {0, 1} with a sparse polynomial representation, which may be of independent interest. We show that if the polynomial computing f has few monomials, then the set system of the monomials has a small hitting set, of size polylogarithmic in its sparsity. We also establish extensions of this result to multilinear polynomials on {0, 1}^{n} with a larger range. @InProceedings{STOC21p197, author = {Alexander Knop and Shachar Lovett and Sam McGuire and Weiqiang Yuan}, title = {Log-Rank and Lifting for AND-Functions}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {197--208}, doi = {10.1145/3406325.3450999}, year = {2021}, } Publisher's Version 
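As a minimal illustration of the object studied above (not code from the paper), the following sketch builds the communication matrix M[x][y] = f_{∧}(x, y) = f(x ∧ y) for a tiny boolean f, with inputs encoded as integers; the function names are mine:

```python
def and_function_matrix(f, n):
    """Return the 2^n x 2^n communication matrix of f_wedge(x, y) = f(x & y),
    where rows and columns are indexed by n-bit inputs encoded as integers."""
    inputs = list(range(2 ** n))
    return [[f(x & y) for y in inputs] for x in inputs]

# Example: f = OR over n = 2 bits, i.e. f(z) = 1 iff z != 0.
M = and_function_matrix(lambda z: int(z != 0), 2)
# Row x = 0 is all zeros, since 0 AND y = 0 for every y.
```

The real rank of this matrix (and its logarithm) is the quantity the log-rank conjecture relates to the deterministic communication complexity of f_{∧}.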

McKenzie, Theo 
STOC '21: "Support of Closed Walks and ..."
Support of Closed Walks and Second Eigenvalue Multiplicity of Graphs
Theo McKenzie, Peter Michael Reichstein Rasmussen, and Nikhil Srivastava (University of California at Berkeley, USA; University of Copenhagen, Denmark) We show that the multiplicity of the second normalized adjacency matrix eigenvalue of any connected graph of maximum degree Δ is bounded by O(n Δ^{7/5}/log^{1/5−o(1)} n) for any Δ, and improve this to O(n log^{1/2} d/log^{1/4−o(1)} n) for simple d-regular graphs when d ≥ log^{1/4} n. In fact, the same bounds hold for the number of eigenvalues in any interval of width λ_{2}/log_{Δ}^{1−o(1)} n containing the second eigenvalue λ_{2}. The main ingredient in the proof is a polynomial (in k) lower bound on the typical support of a closed random walk of length 2k in any connected graph, which in turn relies on new lower bounds for the entries of the Perron eigenvector of submatrices of the normalized adjacency matrix. @InProceedings{STOC21p396, author = {Theo McKenzie and Peter Michael Reichstein Rasmussen and Nikhil Srivastava}, title = {Support of Closed Walks and Second Eigenvalue Multiplicity of Graphs}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {396--407}, doi = {10.1145/3406325.3451129}, year = {2021}, } Publisher's Version 

McSwiggen, Colin 
STOC '21: "Sampling Matrices from Harish-Chandra–Itzykson–Zuber ..."
Sampling Matrices from Harish-Chandra–Itzykson–Zuber Densities with Applications to Quantum Inference and Differential Privacy
Jonathan Leake, Colin McSwiggen, and Nisheeth K. Vishnoi (TU Berlin, Germany; University of Tokyo, Japan; Yale University, USA) Given two Hermitian matrices Y and Λ, the Harish-Chandra–Itzykson–Zuber (HCIZ) distribution is given by the density e^{Tr(U Λ U*Y)} with respect to the Haar measure on the unitary group. Random unitary matrices distributed according to the HCIZ distribution are important in various settings in physics and random matrix theory, but the problem of sampling efficiently from this distribution has remained open. We present two algorithms to sample matrices from distributions that are close to the HCIZ distribution. The first produces samples that are ξ-close in total variation distance, and the number of arithmetic operations required depends on poly(log 1/ξ). The second produces samples that are ξ-close in infinity divergence, but with a poly(1/ξ) dependence. Our results have the following applications: 1) an efficient algorithm to sample from complex versions of matrix Langevin distributions studied in statistics, 2) an efficient algorithm to sample from continuous maximum entropy distributions over unitary orbits, which in turn implies an efficient algorithm to sample a pure quantum state from the entropy-maximizing ensemble representing a given density matrix, and 3) an efficient algorithm for differentially private rank-k approximation that comes with improved utility bounds for k>1. @InProceedings{STOC21p1384, author = {Jonathan Leake and Colin McSwiggen and Nisheeth K. Vishnoi}, title = {Sampling Matrices from Harish-Chandra–Itzykson–Zuber Densities with Applications to Quantum Inference and Differential Privacy}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1384--1397}, doi = {10.1145/3406325.3451094}, year = {2021}, } Publisher's Version 

Minzer, Dor 
STOC '21: "New Separations Results for ..."
New Separations Results for External Information
Mark Braverman and Dor Minzer (Princeton University, USA; Massachusetts Institute of Technology, USA) We obtain new separation results for the two-party external information complexity of Boolean functions. The external information complexity of a function f(x,y) is the minimum amount of information a two-party protocol computing f must reveal to an outside observer about the input. We prove an exponential separation between external and internal information complexity, which is the best possible; previously no separation was known. We then use this result to prove a near-quadratic separation between amortized zero-error communication complexity and external information complexity for total functions, disproving a conjecture of the first author. Finally, we prove a matching upper bound showing that our separation result is tight. @InProceedings{STOC21p248, author = {Mark Braverman and Dor Minzer}, title = {New Separations Results for External Information}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {248--258}, doi = {10.1145/3406325.3451044}, year = {2021}, } Publisher's Version 

Moitra, Ankur 
STOC '21: "Settling the Robust Learnability ..."
Settling the Robust Learnability of Mixtures of Gaussians
Allen Liu and Ankur Moitra (Massachusetts Institute of Technology, USA) This work represents a natural coalescence of two important lines of work – learning mixtures of Gaussians and algorithmic robust statistics. In particular we give the first provably robust algorithm for learning mixtures of any constant number of Gaussians. We require only mild assumptions on the mixing weights (bounded fractionality) and that the total variation distance between components is bounded away from zero. At the heart of our algorithm is a new method for proving dimension-independent polynomial identifiability through applying a carefully chosen sequence of differential operations to certain generating functions that not only encode the parameters we would like to learn but also the system of polynomial equations we would like to solve. We show how the symbolic identities we derive can be directly used to analyze a natural sum-of-squares relaxation. @InProceedings{STOC21p518, author = {Allen Liu and Ankur Moitra}, title = {Settling the Robust Learnability of Mixtures of Gaussians}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {518--531}, doi = {10.1145/3406325.3451084}, year = {2021}, } Publisher's Version STOC '21: "Algorithmic Foundations for ..." Algorithmic Foundations for the Diffraction Limit Sitan Chen and Ankur Moitra (Massachusetts Institute of Technology, USA) For more than a century and a half it has been widely believed (but never rigorously shown) that the physics of diffraction imposes certain fundamental limits on the resolution of an optical system. However, our understanding of what exactly can and cannot be resolved has never risen above heuristic arguments which, even worse, appear contradictory. 
In this work we remedy this gap by studying the diffraction limit as a statistical inverse problem and, based on connections to provable algorithms for learning mixture models, we rigorously prove upper and lower bounds on the statistical and algorithmic complexity needed to resolve closely spaced point sources. In particular we show that there is a phase transition where the sample complexity goes from polynomial to exponential. Surprisingly, we show that this does not occur at the Abbe limit, which has long been presumed to be the true diffraction limit. @InProceedings{STOC21p490, author = {Sitan Chen and Ankur Moitra}, title = {Algorithmic Foundations for the Diffraction Limit}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {490--503}, doi = {10.1145/3406325.3451078}, year = {2021}, } Publisher's Version 

Moran, Shay 
STOC '21: "Learnability Can Be Independent ..."
Learnability Can Be Independent of Set Theory (Invited Paper)
Shai Ben-David, Pavel Hrubeš, Shay Moran, Amir Shpilka, and Amir Yehudayoff (University of Waterloo, Canada; Czech Academy of Sciences, Czechia; Technion, Israel; Tel Aviv University, Israel) A fundamental result in statistical learning theory is the equivalence of PAC learnability of a class with the finiteness of its Vapnik-Chervonenkis dimension. However, this clean result applies only to binary classification problems. In search of a similar combinatorial characterization of learnability in a more general setting, we discovered a surprising independence from set theory for some basic general notion of learnability. Consider the following statistical estimation problem: given a family F of real-valued random variables over some domain X and an i.i.d. sample drawn from an unknown distribution P over X, find f in F such that its expectation w.r.t. P is close to the supremum expectation over all members of F. This Expectation Maximization (EMX) problem captures many well-studied learning problems. Surprisingly, we show that the EMX learnability of some simple classes depends on the cardinality of the continuum and is therefore independent of the set theory ZFC axioms. Our results imply that there exists no "finitary" combinatorial parameter that characterizes EMX learnability in a way similar to the VC-dimension characterization of binary classification learnability. @InProceedings{STOC21p11, author = {Shai Ben-David and Pavel Hrubeš and Shay Moran and Amir Shpilka and Amir Yehudayoff}, title = {Learnability Can Be Independent of Set Theory (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {11--11}, doi = {10.1145/3406325.3465360}, year = {2021}, } Publisher's Version STOC '21: "A Theory of Universal Learning ..." 
A Theory of Universal Learning Olivier Bousquet, Steve Hanneke, Shay Moran, Ramon van Handel, and Amir Yehudayoff (Google, Switzerland; Toyota Technological Institute at Chicago, USA; Technion, Israel; Google Research, Israel; Princeton University, USA) How quickly can a given class of concepts be learned from examples? It is common to measure the performance of a supervised machine learning algorithm by plotting its “learning curve”, that is, the decay of the error rate as a function of the number of training examples. However, the classical theoretical framework for understanding learnability, the PAC model of Vapnik-Chervonenkis and Valiant, does not explain the behavior of learning curves: the distribution-free PAC model of learning can only bound the upper envelope of the learning curves over all possible data distributions. This does not match the practice of machine learning, where the data source is typically fixed in any given scenario, while the learner may choose the number of training examples on the basis of factors such as computational resources and desired accuracy. In this paper, we study an alternative learning model that better captures such practical aspects of machine learning, but still gives rise to a complete theory of the learnable in the spirit of the PAC model. More precisely, we consider the problem of universal learning, which aims to understand the performance of learning algorithms on every data distribution, but without requiring uniformity over the distribution. The main result of this paper is a remarkable trichotomy: there are only three possible rates of universal learning. More precisely, we show that the learning curves of any given concept class decay at either an exponential, linear, or arbitrarily slow rate. Moreover, each of these cases is completely characterized by appropriate combinatorial parameters, and we exhibit optimal learning algorithms that achieve the best possible rate in each case. 
For concreteness, we consider in this paper only the realizable case, though analogous results are expected to extend to more general learning scenarios. @InProceedings{STOC21p532, author = {Olivier Bousquet and Steve Hanneke and Shay Moran and Ramon van Handel and Amir Yehudayoff}, title = {A Theory of Universal Learning}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {532--541}, doi = {10.1145/3406325.3451087}, year = {2021}, } Publisher's Version STOC '21: "Boosting Simple Learners ..." Boosting Simple Learners Noga Alon, Alon Gonen, Elad Hazan, and Shay Moran (Princeton University, USA; Tel Aviv University, Israel; OrCam, Israel; Google AI, USA; Technion, Israel; Google Research, Israel) Boosting is a celebrated machine learning approach based on the idea of combining weak and moderately inaccurate hypotheses into a strong and accurate one. We study boosting under the assumption that the weak hypotheses belong to a class of bounded capacity. This assumption is inspired by the common convention that weak hypotheses are “rules-of-thumb” from an “easy-to-learn class”. (Schapire and Freund ’12, Shalev-Shwartz and Ben-David ’14.) Formally, we assume the class of weak hypotheses has a bounded VC dimension. We focus on two main questions: (i) Oracle Complexity: How many weak hypotheses are needed in order to produce an accurate hypothesis? We design a novel boosting algorithm and demonstrate that it circumvents a classical lower bound by Freund and Schapire (’95, ’12). Whereas the lower bound shows that Ω(1/γ^{2}) weak hypotheses with γ-margin are sometimes necessary, our new method requires only Õ(1/γ) weak hypotheses, provided that they belong to a class of bounded VC dimension. Unlike previous boosting algorithms which aggregate the weak hypotheses by majority votes, the new boosting algorithm uses more complex (“deeper”) aggregation rules. 
We complement this result by showing that complex aggregation rules are in fact necessary to circumvent the aforementioned lower bound. (ii) Expressivity: Which tasks can be learned by boosting weak hypotheses from a bounded VC class? Can complex concepts that are “far away” from the class be learned? Towards answering the first question we identify a combinatorial-geometric parameter which captures the expressivity of base classes in boosting. As a corollary we provide an affirmative answer to the second question for many well-studied classes, including halfspaces and decision stumps. Along the way, we establish and exploit connections with Discrepancy Theory. @InProceedings{STOC21p481, author = {Noga Alon and Alon Gonen and Elad Hazan and Shay Moran}, title = {Boosting Simple Learners}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {481--489}, doi = {10.1145/3406325.3451030}, year = {2021}, } Publisher's Version STOC '21: "Adversarial Laws of Large ..." Adversarial Laws of Large Numbers and Optimal Regret in Online Classification Noga Alon, Omri Ben-Eliezer, Yuval Dagan, Shay Moran, Moni Naor, and Eylon Yogev (Princeton University, USA; Tel Aviv University, Israel; Harvard University, USA; Massachusetts Institute of Technology, USA; Technion, Israel; Google Research, Israel; Weizmann Institute of Science, Israel; Boston University, USA) Laws of large numbers guarantee that given a large enough sample from some population, the measure of any fixed subpopulation is well-estimated by its frequency in the sample. We study laws of large numbers in sampling processes that can affect the environment they are acting upon and interact with it. Specifically, we consider the sequential sampling model proposed by Ben-Eliezer and Yogev (2020), and characterize the classes which admit a uniform law of large numbers in this model: these are exactly the classes that are online learnable. 
Our characterization may be interpreted as an online analogue to the equivalence between learnability and uniform convergence in statistical (PAC) learning. The sample-complexity bounds we obtain are tight for many parameter regimes, and as an application, we determine the optimal regret bounds in online learning, stated in terms of Littlestone’s dimension, thus resolving the main open question from Ben-David, Pál, and Shalev-Shwartz (2009), which was also posed by Rakhlin, Sridharan, and Tewari (2015). @InProceedings{STOC21p447, author = {Noga Alon and Omri Ben-Eliezer and Yuval Dagan and Shay Moran and Moni Naor and Eylon Yogev}, title = {Adversarial Laws of Large Numbers and Optimal Regret in Online Classification}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {447--455}, doi = {10.1145/3406325.3451041}, year = {2021}, } Publisher's Version 

Moshkovitz, Guy 
STOC '21: "Structure vs. Randomness for ..."
Structure vs. Randomness for Bilinear Maps
Alex Cohen and Guy Moshkovitz (Yale University, USA; City University of New York, USA) We prove that the slice rank of a 3-tensor (a combinatorial notion introduced by Tao in the context of the cap-set problem), the analytic rank (a Fourier-theoretic notion introduced by Gowers and Wolf), and the geometric rank (a recently introduced algebro-geometric notion) are all equivalent up to an absolute constant. As a corollary, we obtain strong trade-offs on the arithmetic complexity of a biased bilinear map, and on the separation between computing a bilinear map exactly and on average. Our result settles open questions of Haramaty and Shpilka [STOC 2010], and of Lovett [Discrete Anal., 2019] for 3-tensors. @InProceedings{STOC21p800, author = {Alex Cohen and Guy Moshkovitz}, title = {Structure vs. Randomness for Bilinear Maps}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {800--808}, doi = {10.1145/3406325.3451007}, year = {2021}, } Publisher's Version 

Mossel, Elchanan 
STOC '21: "Robust Testing of Low Dimensional ..."
Robust Testing of Low Dimensional Functions
Anindya De, Elchanan Mossel, and Joe Neeman (University of Pennsylvania, USA; Massachusetts Institute of Technology, USA; University of Texas at Austin, USA) A natural problem in high-dimensional inference is to decide if a classifier f: ℝ^{n} → {−1,1} depends on a small number of linear directions of its input data. Call a function g: ℝ^{n} → {−1,1} a linear k-junta if it is completely determined by some k-dimensional subspace of the input space. A recent work of the authors showed that linear k-juntas are testable. Thus there exists an algorithm to distinguish between: (1) f: ℝ^{n} → {−1,1} which is a linear k-junta with surface area s. (2) f is є-far from any linear k-junta with surface area (1+є)s. The query complexity of the algorithm is independent of the ambient dimension n. Following the surge of interest in noise-tolerant property testing, in this paper we prove a noise-tolerant (or robust) version of this result. Namely, we give an algorithm which, given any c>0, є>0, distinguishes between: (1) f: ℝ^{n} → {−1,1} has correlation at least c with some linear k-junta with surface area s. (2) f has correlation at most c−є with any linear k-junta with surface area at most s. The query complexity of our tester is k^{poly(s/є)}. Using our techniques, we also obtain a fully noise-tolerant tester with the same query complexity for any class C of linear k-juntas with surface area bounded by s. As a consequence, we obtain a fully noise-tolerant tester with query complexity k^{O(poly(log k/є))} for the class of intersections of k halfspaces (for constant k) over the Gaussian space. Our query complexity is independent of the ambient dimension n. Previously, no nontrivial noise-tolerant testers were known even for a single halfspace. 
@InProceedings{STOC21p584, author = {Anindya De and Elchanan Mossel and Joe Neeman}, title = {Robust Testing of Low Dimensional Functions}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {584--597}, doi = {10.1145/3406325.3451115}, year = {2021}, } Publisher's Version 

Mukhopadhyay, Partha 
STOC '21: "Lower Bounds for Monotone ..."
Lower Bounds for Monotone Arithmetic Circuits via Communication Complexity
Arkadev Chattopadhyay, Rajit Datta, and Partha Mukhopadhyay (Tata Institute of Fundamental Research, India; Chennai Mathematical Institute, India) Valiant (1980) showed that general arithmetic circuits with negation can be exponentially more powerful than monotone ones. We give the first improvement to this classical result: we construct a family of polynomials P_{n} in n variables, each of whose monomials has a nonnegative coefficient, such that P_{n} can be computed by a polynomial-size depth-three formula but every monotone circuit computing it has size 2^{Ω(n^{1/4}/log(n))}. The polynomial P_{n} embeds the SINK ∘ XOR function devised recently by Chattopadhyay, Mande and Sherif (2020) to refute the Log-Approximate-Rank Conjecture in communication complexity. To prove our lower bound for P_{n}, we develop a general connection between corruption of combinatorial rectangles by any function f ∘ XOR and corruption of product polynomials by a certain polynomial P^{f} that is an arithmetic embedding of f. This connection should be of independent interest. Using further ideas from communication complexity, we construct another family of set-multilinear polynomials f_{n,m} such that both F_{n,m} − є· f_{n,m} and F_{n,m} + є· f_{n,m} have monotone circuit complexity 2^{Ω(n/log(n))} if є ≥ 2^{−Ω(m)}, where F_{n,m} = ∏_{i=1}^{n} (x_{i,1} +⋯+x_{i,m}), with m = O(n/log n). The polynomials f_{n,m} have 0/1 coefficients and are in VNP. Proving such lower bounds for monotone circuits has been advocated recently by Hrubeš (2020) as a first step towards proving lower bounds against general circuits via his new approach. @InProceedings{STOC21p786, author = {Arkadev Chattopadhyay and Rajit Datta and Partha Mukhopadhyay}, title = {Lower Bounds for Monotone Arithmetic Circuits via Communication Complexity}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {786--799}, doi = {10.1145/3406325.3451069}, year = {2021}, } Publisher's Version 

Mukhopadhyay, Sagnik 
STOC '21: "Distributed Weighted Min-Cut ..."
Distributed Weighted Min-Cut in Nearly-Optimal Time
Michal Dory, Yuval Efron, Sagnik Mukhopadhyay, and Danupon Nanongkai (ETH Zurich, Switzerland; University of Toronto, Canada; KTH, Sweden; University of Copenhagen, Denmark) Minimum-weight cut (min-cut) is a basic measure of a network’s connectivity strength. While the min-cut can be computed efficiently in the sequential setting [Karger STOC’96], there was no efficient way for a distributed network to compute its own min-cut without limiting the input structure or dropping the output quality: In the standard CONGEST model, existing algorithms with nearly-optimal time (e.g. [Ghaffari, Kuhn, DISC’13; Nanongkai, Su, DISC’14]) can guarantee a solution that is a (1+є)-approximation at best, while the exact Õ(n^{0.8}D^{0.2} + n^{0.9})-time algorithm [Ghaffari, Nowicki, Thorup, SODA’20] works only on simple networks (no weights and no parallel edges). Throughout, n and D denote the network’s number of vertices and hop-diameter, respectively. For the weighted case, the best bound was Õ(n) [Daga, Henzinger, Nanongkai, Saranurak, STOC’19]. In this paper, we provide an exact Õ(√n + D)-time algorithm for computing min-cut on weighted networks. Our result improves even the previous algorithm that works only on simple networks. Its time complexity matches the known lower bound up to polylogarithmic factors. At the heart of our algorithm are a routing trick and two structural lemmas regarding the structure of a minimum cut of a graph. These two structural lemmas considerably strengthen and generalize the framework of Mukhopadhyay-Nanongkai [STOC’20] and can be of independent interest. @InProceedings{STOC21p1144, author = {Michal Dory and Yuval Efron and Sagnik Mukhopadhyay and Danupon Nanongkai}, title = {Distributed Weighted Min-Cut in Nearly-Optimal Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1144--1153}, doi = {10.1145/3406325.3451020}, year = {2021}, } Publisher's Version STOC '21: "Breaking the Quadratic Barrier ..." 
Breaking the Quadratic Barrier for Matroid Intersection Joakim Blikstad, Jan van den Brand, Sagnik Mukhopadhyay, and Danupon Nanongkai (KTH, Sweden; University of Copenhagen, Denmark) The matroid intersection problem is a fundamental problem that has been extensively studied for half a century. In the classic version of this problem, we are given two matroids M_{1} = (V, I_{1}) and M_{2} = (V, I_{2}) on a common ground set V of n elements, and then we have to find the largest common independent set S ∈ I_{1} ∩ I_{2} by making independence oracle queries of the form ”Is S ∈ I_{1}?” or ”Is S ∈ I_{2}?” for S ⊆ V. The goal is to minimize the number of queries. Beating the existing Õ(n^{2}) bound, known as the quadratic barrier, is an open problem that captures the limits of techniques from two lines of work. The first one is the classic Cunningham’s algorithm [SICOMP 1986], whose Õ(n^{2})-query implementations were shown by CLS+ [FOCS 2019] and Nguyen [2019] (more generally, these algorithms take Õ(nr) queries, where r denotes the rank, which can be as big as n). The other one is the general cutting plane method of Lee, Sidford, and Wong [FOCS 2015]. The only progress towards breaking the quadratic barrier requires either approximation algorithms or a more powerful rank oracle query [CLS+ FOCS 2019]. No exact algorithm with o(n^{2}) independence queries was known. In this work, we break the quadratic barrier with a randomized algorithm guaranteeing Õ(n^{9/5}) independence queries with high probability, and a deterministic algorithm guaranteeing Õ(n^{11/6}) independence queries. Our key insight is simple and fast algorithms to solve a graph reachability problem that arose in the standard augmenting path framework [Edmonds 1968]. Combining this with previous exact and approximation algorithms leads to our results. 
@InProceedings{STOC21p421, author = {Joakim Blikstad and Jan van den Brand and Sagnik Mukhopadhyay and Danupon Nanongkai}, title = {Breaking the Quadratic Barrier for Matroid Intersection}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {421--432}, doi = {10.1145/3406325.3451092}, year = {2021}, } Publisher's Version 

Mullainathan, Sendhil 
STOC '21: "Simplicity Creates Inequity: ..."
Simplicity Creates Inequity: Implications for Fairness, Stereotypes, and Interpretability (Invited Paper)
Jon Kleinberg and Sendhil Mullainathan (Cornell University, USA; University of Chicago, USA) Algorithms are increasingly used to aid, or in some cases supplant, human decision-making, particularly for decisions that hinge on predictions. As a result, two features beyond prediction quality have generated interest: (i) to facilitate human interaction with and understanding of these algorithms, we desire prediction functions that are in some fashion simple or interpretable; and (ii) because they influence consequential decisions, we also want them to produce equitable allocations. We develop a formal model to explore the relationship between the demands of simplicity and equity. Although the two concepts appear to be motivated by qualitatively distinct goals, we show a fundamental inconsistency between them. Specifically, we formalize a general framework for producing simple prediction functions, and in this framework we establish two basic results. First, every simple prediction function is strictly improvable: there exists a more complex prediction function that is both strictly more efficient and also strictly more equitable. Put another way, using a simple prediction function both reduces utility for disadvantaged groups and reduces overall welfare relative to other options. Second, we show that simple prediction functions necessarily create incentives to use information about individuals' membership in a disadvantaged group: incentives that were not present before simplification, and that work against these individuals. Thus, simplicity transforms disadvantage into bias against the disadvantaged group. Our results are not only about algorithms but about any process that produces simple models, and as such they connect to the psychology of stereotypes and to an earlier economics literature on statistical discrimination. 
@InProceedings{STOC21p7, author = {Jon Kleinberg and Sendhil Mullainathan}, title = {Simplicity Creates Inequity: Implications for Fairness, Stereotypes, and Interpretability (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {7--7}, doi = {10.1145/3406325.3465356}, year = {2021}, } Publisher's Version 

N, Vishvajeet 
STOC '21: "Graph Streaming Lower Bounds ..."
Graph Streaming Lower Bounds for Parameter Estimation and Property Testing via a Streaming XOR Lemma
Sepehr Assadi and Vishvajeet N (Rutgers University, USA) We study space-pass tradeoffs in graph streaming algorithms for parameter estimation and property testing problems such as estimating the size of maximum matchings and maximum cuts, the weight of minimum spanning trees, or testing if a graph is connected or cycle-free versus being far from these properties. We develop a new lower bound technique that proves that for many problems of interest, including all of the above, obtaining a (1+є)-approximation requires either n^{Ω(1)} space or Ω(1/є) passes, even on highly restricted families of graphs such as bounded-degree planar graphs. For multiple of these problems, this bound matches those of existing algorithms and is thus (asymptotically) optimal. Our results considerably strengthen prior lower bounds even for arbitrary graphs: starting from the influential work of [Verbin, Yu; SODA 2011], there has been a plethora of lower bounds for single-pass algorithms for these problems; however, the only multi-pass lower bounds, proven very recently in [Assadi, Kol, Saxena, Yu; FOCS 2020], rule out sublinear-space algorithms with an exponentially smaller number of passes, o(log(1/є)), for these problems. One key ingredient of our proofs is a simple streaming XOR Lemma, a generic hardness amplification result, that we prove: informally speaking, if a p-pass s-space streaming algorithm can only solve a decision problem with advantage δ > 0 over random guessing, then it cannot solve the XOR of ℓ independent copies of the problem with advantage much better than δ^{ℓ}. This result can be of independent interest and useful for other streaming lower bounds as well. @InProceedings{STOC21p612, author = {Sepehr Assadi and Vishvajeet N}, title = {Graph Streaming Lower Bounds for Parameter Estimation and Property Testing via a Streaming XOR Lemma}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {612--625}, doi = {10.1145/3406325.3451110}, year = {2021}, } Publisher's Version 

Nakos, Vasileios 
STOC '21: "Sparse Nonnegative Convolution ..."
Sparse Nonnegative Convolution Is Equivalent to Dense Nonnegative Convolution
Karl Bringmann, Nick Fischer, and Vasileios Nakos (Saarland University, Germany; MPI-INF, Germany) Computing the convolution A ⋆ B of two length-n vectors A,B is a ubiquitous computational primitive, with applications in a variety of disciplines. Within theoretical computer science, applications range from string problems to Knapsack-type problems, and from 3SUM to All-Pairs Shortest Paths. These applications often come in the form of nonnegative convolution, where the entries of A,B are nonnegative integers. The classical algorithm to compute A⋆ B uses the Fast Fourier Transform (FFT) and runs in time O(n logn). However, in many cases A and B might satisfy sparsity conditions, and hence one could hope for significant gains compared to the standard FFT algorithm. The ideal goal would be an O(k logk)-time algorithm, where k is the number of nonzero elements in the output, i.e., the size of the support of A ⋆ B. This problem is referred to as sparse nonnegative convolution, and has received a considerable amount of attention in the literature; the fastest algorithms to date run in time O(k log^{2} n). The main result of this paper is the first O(k logk)-time algorithm for sparse nonnegative convolution. Our algorithm is randomized and assumes that the length n and the largest entry of A and B are subexponential in k. Surprisingly, we can phrase our algorithm as a reduction from the sparse case to the dense case of nonnegative convolution, showing that, under some mild assumptions, sparse nonnegative convolution is equivalent to dense nonnegative convolution for constant-error randomized algorithms. Specifically, if D(n) is the time to convolve two nonnegative length-n vectors with success probability 2/3, and S(k) is the time to convolve two nonnegative vectors with output size k with success probability 2/3, then S(k) = O(D(k) + k (loglogk)^{2}). 
Our approach uses a variety of new techniques in combination with some old machinery from linear sketching and structured linear algebra, as well as new insights on linear hashing, the most classical hash function. @InProceedings{STOC21p1711, author = {Karl Bringmann and Nick Fischer and Vasileios Nakos}, title = {Sparse Nonnegative Convolution Is Equivalent to Dense Nonnegative Convolution}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1711--1724}, doi = {10.1145/3406325.3451090}, year = {2021}, } Publisher's Version 
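To fix ideas on the sparse setting, here is a minimal sketch (not the paper's O(k logk) algorithm): a dictionary-based convolution of sparse nonnegative vectors whose cost scales with the supports of the inputs rather than with the full length n.

```python
def sparse_convolve(A, B):
    """Convolve two sparse nonnegative vectors given as {index: value} dicts.

    Runs in O(nnz(A) * nnz(B)) time -- already far better than the dense
    O(n log n) FFT when the supports are tiny, but far from the paper's
    O(k log k) bound, where k = |support(A * B)|.
    """
    C = {}
    for i, a in A.items():
        for j, b in B.items():
            C[i + j] = C.get(i + j, 0) + a * b
    return C

# Length-n vectors (n around one million) with only two nonzeros each.
A = {0: 1, 1_000_000: 2}
B = {0: 3, 1_000_000: 4}
C = sparse_convolve(A, B)
print(sorted(C.items()))  # [(0, 3), (1000000, 10), (2000000, 8)]
```

Note that nonnegativity matters for the fast algorithms: no cancellation can shrink the support of A ⋆ B, which the naive dictionary version above does not exploit.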

Nanongkai, Danupon 
STOC '21: "Distributed Weighted Min-Cut ..."
Distributed Weighted Min-Cut in Nearly-Optimal Time
Michal Dory, Yuval Efron, Sagnik Mukhopadhyay, and Danupon Nanongkai (ETH Zurich, Switzerland; University of Toronto, Canada; KTH, Sweden; University of Copenhagen, Denmark) Minimum-weight cut (min-cut) is a basic measure of a network’s connectivity strength. While the min-cut can be computed efficiently in the sequential setting [Karger STOC’96], there was no efficient way for a distributed network to compute its own min-cut without limiting the input structure or dropping the output quality: In the standard CONGEST model, existing algorithms with nearly-optimal time (e.g. [Ghaffari, Kuhn, DISC’13; Nanongkai, Su, DISC’14]) can guarantee a solution that is a (1+є)-approximation at best, while the exact Õ(n^{0.8}D^{0.2} + n^{0.9})-time algorithm [Ghaffari, Nowicki, Thorup, SODA’20] works only on simple networks (no weights and no parallel edges). Throughout, n and D denote the network’s number of vertices and hop-diameter, respectively. For the weighted case, the best bound was Õ(n) [Daga, Henzinger, Nanongkai, Saranurak, STOC’19]. In this paper, we provide an exact Õ(√n + D)-time algorithm for computing min-cut on weighted networks. Our result improves even over the previous algorithm that works only on simple networks. Its time complexity matches the known lower bound up to polylogarithmic factors. At the heart of our algorithm are a routing trick and two structural lemmas regarding the structure of a minimum cut of a graph. These two structural lemmas considerably strengthen and generalize the framework of Mukhopadhyay-Nanongkai [STOC’20] and can be of independent interest. @InProceedings{STOC21p1144, author = {Michal Dory and Yuval Efron and Sagnik Mukhopadhyay and Danupon Nanongkai}, title = {Distributed Weighted Min-Cut in Nearly-Optimal Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1144--1153}, doi = {10.1145/3406325.3451020}, year = {2021}, } Publisher's Version STOC '21: "Breaking the Quadratic Barrier ..." 
Breaking the Quadratic Barrier for Matroid Intersection Joakim Blikstad, Jan van den Brand, Sagnik Mukhopadhyay, and Danupon Nanongkai (KTH, Sweden; University of Copenhagen, Denmark) The matroid intersection problem is a fundamental problem that has been extensively studied for half a century. In the classic version of this problem, we are given two matroids M_{1} = (V, I_{1}) and M_{2} = (V, I_{2}) on a common ground set V of n elements, and then we have to find the largest common independent set S ∈ I_{1} ∩ I_{2} by making independence oracle queries of the form "Is S ∈ I_{1}?" or "Is S ∈ I_{2}?" for S ⊆ V. The goal is to minimize the number of queries. Beating the existing Õ(n^{2}) bound, known as the quadratic barrier, is an open problem that captures the limits of techniques from two lines of work. The first one is Cunningham’s classic algorithm [SICOMP 1986], whose Õ(n^{2})-query implementations were shown by CLS+ [FOCS 2019] and Nguyen [2019] (more generally, these algorithms take Õ(nr) queries, where r denotes the rank, which can be as big as n). The other one is the general cutting plane method of Lee, Sidford, and Wong [FOCS 2015]. The only progress towards breaking the quadratic barrier requires either approximation algorithms or a more powerful rank oracle query [CLS+ FOCS 2019]. No exact algorithm with o(n^{2}) independence queries was known. In this work, we break the quadratic barrier with a randomized algorithm guaranteeing Õ(n^{9/5}) independence queries with high probability, and a deterministic algorithm guaranteeing Õ(n^{11/6}) independence queries. Our key insight is simple and fast algorithms to solve a graph reachability problem that arose in the standard augmenting path framework [Edmonds 1968]. Combining this with previous exact and approximation algorithms leads to our results. 
@InProceedings{STOC21p421, author = {Joakim Blikstad and Jan van den Brand and Sagnik Mukhopadhyay and Danupon Nanongkai}, title = {Breaking the Quadratic Barrier for Matroid Intersection}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {421--432}, doi = {10.1145/3406325.3451092}, year = {2021}, } Publisher's Version STOC '21: "Vertex Connectivity in Polylogarithmic ..." Vertex Connectivity in Polylogarithmic Max-Flows Jason Li, Danupon Nanongkai, Debmalya Panigrahi, Thatchaphol Saranurak, and Sorrachai Yingchareonthawornchai (Carnegie Mellon University, USA; University of Copenhagen, Denmark; KTH, Sweden; Duke University, USA; University of Michigan, USA; Aalto University, Finland) The vertex connectivity of an m-edge n-vertex undirected graph is the smallest number of vertices whose removal disconnects the graph, or leaves only a singleton vertex. In this paper, we give a reduction from the vertex connectivity problem to a set of max-flow instances. Using this reduction, we can solve vertex connectivity in Õ(m^{α}) time for any α ≥ 1, if there is an m^{α}-time max-flow algorithm. Using the current best max-flow algorithm that runs in m^{4/3+o(1)} time (Kathuria, Liu and Sidford, FOCS 2020), this yields a m^{4/3+o(1)}-time vertex connectivity algorithm. This is the first improvement in the running time of the vertex connectivity problem in over 20 years, the previous best being an Õ(mn)-time algorithm due to Henzinger, Rao, and Gabow (FOCS 1996). Indeed, no algorithm with an o(mn) running time was known before our work, even if we assume an Õ(m)-time max-flow algorithm. 
Our new technique is robust enough to also improve the best Õ(mn)-time bound for directed vertex connectivity to mn^{1−1/12+o(1)} time. @InProceedings{STOC21p317, author = {Jason Li and Danupon Nanongkai and Debmalya Panigrahi and Thatchaphol Saranurak and Sorrachai Yingchareonthawornchai}, title = {Vertex Connectivity in Polylogarithmic Max-Flows}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {317--329}, doi = {10.1145/3406325.3451088}, year = {2021}, } Publisher's Version 

Naor, Assaf 
STOC '21: "A Framework for Quadratic ..."
A Framework for Quadratic Form Maximization over Convex Sets through Nonconvex Relaxations
Vijay Bhattiprolu, Euiwoong Lee, and Assaf Naor (Institute for Advanced Study at Princeton, USA; Princeton University, USA; University of Michigan, USA) We investigate the approximability of the following optimization problem. The input is an n× n matrix A=(A_{ij}) with real entries and an origin-symmetric convex body K⊂ ℝ^{n} that is given by a membership oracle. The task is to compute (or approximate) the maximum of the quadratic form ∑_{i=1}^{n}∑_{j=1}^{n} A_{ij} x_{i}x_{j}=⟨ x,Ax⟩ as x ranges over K. This is a rich and expressive family of optimization problems; for different choices of matrices A and convex bodies K it includes a diverse range of optimization problems like max-cut, Grothendieck/non-commutative Grothendieck inequalities, small set expansion, and more. While the literature has studied these special cases using case-specific reasoning, here we develop a general methodology for the treatment of the approximability and inapproximability aspects of these questions. The underlying geometry of K plays a critical role; we show under commonly used complexity assumptions that polynomial-time constant-approximability necessitates that K has a type-2 constant that grows slowly with n. However, we show that even when the type-2 constant is bounded, this problem sometimes exhibits strong hardness of approximation. Thus, even within the realm of type-2 bodies, the approximability landscape is nuanced and subtle. However, the link that we establish between optimization and the geometry of Banach spaces allows us to devise a generic algorithmic approach to the above problem. We associate to each convex body a new (higher-dimensional) auxiliary set that is not convex, but is approximately convex when K has a bounded type-2 constant. If our auxiliary set has an approximate separation oracle, then we design an approximation algorithm for the original quadratic optimization problem, using an approximate version of the ellipsoid method. 
Even though our hardness result implies that such an oracle does not exist in general, this new question can be solved in specific cases of interest by implementing a range of classical tools from functional analysis, most notably the deep factorization theory of linear operators. Beyond encompassing the scenarios in the literature for which constant-factor approximation algorithms were found, our generic framework implies that for convex sets with bounded type-2 constant, constant-factor approximability is preserved under the following basic operations: (a) Subspaces, (b) Quotients, (c) Minkowski Sums, (d) Complex Interpolation. This yields a rich family of new examples where constant-factor approximations are possible, which were beyond the reach of previous methods. We also show (under commonly used complexity assumptions) that for symmetric norms and unitarily invariant matrix norms the type-2 constant nearly characterizes the approximability of quadratic maximization. @InProceedings{STOC21p870, author = {Vijay Bhattiprolu and Euiwoong Lee and Assaf Naor}, title = {A Framework for Quadratic Form Maximization over Convex Sets through Nonconvex Relaxations}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {870--881}, doi = {10.1145/3406325.3451128}, year = {2021}, } Publisher's Version 

Naor, Moni 
STOC '21: "Adversarial Laws of Large ..."
Adversarial Laws of Large Numbers and Optimal Regret in Online Classification
Noga Alon, Omri Ben-Eliezer, Yuval Dagan, Shay Moran, Moni Naor, and Eylon Yogev (Princeton University, USA; Tel Aviv University, Israel; Harvard University, USA; Massachusetts Institute of Technology, USA; Technion, Israel; Google Research, Israel; Weizmann Institute of Science, Israel; Boston University, USA) Laws of large numbers guarantee that given a large enough sample from some population, the measure of any fixed subpopulation is well-estimated by its frequency in the sample. We study laws of large numbers in sampling processes that can affect the environment they are acting upon and interact with it. Specifically, we consider the sequential sampling model proposed by Ben-Eliezer and Yogev (2020), and characterize the classes which admit a uniform law of large numbers in this model: these are exactly the classes that are online learnable. Our characterization may be interpreted as an online analogue to the equivalence between learnability and uniform convergence in statistical (PAC) learning. The sample-complexity bounds we obtain are tight for many parameter regimes, and as an application, we determine the optimal regret bounds in online learning, stated in terms of Littlestone’s dimension, thus resolving the main open question from Ben-David, Pál, and Shalev-Shwartz (2009), which was also posed by Rakhlin, Sridharan, and Tewari (2015). @InProceedings{STOC21p447, author = {Noga Alon and Omri Ben-Eliezer and Yuval Dagan and Shay Moran and Moni Naor and Eylon Yogev}, title = {Adversarial Laws of Large Numbers and Optimal Regret in Online Classification}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {447--455}, doi = {10.1145/3406325.3451041}, year = {2021}, } Publisher's Version 

Nederlof, Jesper 
STOC '21: "Improving Schroeppel and Shamir’s ..."
Improving Schroeppel and Shamir’s Algorithm for Subset Sum via Orthogonal Vectors
Jesper Nederlof and Karol Węgrzycki (Utrecht University, Netherlands; Saarland University, Germany; MPI-INF, Germany) We present an O^{∗}(2^{0.5n}) time and O^{∗}(2^{0.249999n}) space randomized algorithm for solving worst-case Subset Sum instances with n integers. This is the first improvement over the long-standing O^{∗}(2^{n/2}) time and O^{∗}(2^{n/4}) space algorithm due to Schroeppel and Shamir (FOCS 1979). We breach this gap in two steps: (1) We present a space-efficient reduction to the Orthogonal Vectors Problem (OV), one of the most central problems in Fine-Grained Complexity. The reduction is established via an intricate combination of the method of Schroeppel and Shamir and the representation technique introduced by Howgrave-Graham and Joux (EUROCRYPT 2010) for designing Subset Sum algorithms for the average-case regime. (2) We provide an algorithm for OV that detects an orthogonal pair among N given vectors in {0,1}^{d} with support size d/4 in time Õ(N· 2^{d}/\binom{d}{d/4}). Our algorithm for OV is based on and refines the representative families framework developed by Fomin, Lokshtanov, Panolan, and Saurabh (J. ACM 2016). Our reduction uncovers a curious tight relation between Subset Sum and OV, because any improvement of our algorithm for OV would imply an improvement over the runtime of Schroeppel and Shamir, which is also a long-standing open problem. @InProceedings{STOC21p1670, author = {Jesper Nederlof and Karol Węgrzycki}, title = {Improving Schroeppel and Shamir’s Algorithm for Subset Sum via Orthogonal Vectors}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1670--1683}, doi = {10.1145/3406325.3451024}, year = {2021}, } Publisher's Version Video 
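For context on the O^{∗}(2^{n/2})-time baseline the paper improves upon: the classic meet-in-the-middle idea (Horowitz-Sahni) splits the instance in two halves and matches their subset sums. Schroeppel and Shamir match this running time with only O^{∗}(2^{n/4}) space by generating each half's sums lazily from four quarters; the sketch below omits that refinement and uses O^{∗}(2^{n/2}) space.

```python
from bisect import bisect_left
from itertools import chain

def subset_sum_mitm(nums, target):
    """Meet-in-the-middle Subset Sum: O*(2^{n/2}) time and space.

    Split nums into two halves, enumerate all subset sums of each half,
    then look for a pair summing to target via binary search.
    """
    half = len(nums) // 2
    left, right = nums[:half], nums[half:]

    def all_sums(part):
        sums = {0}
        for x in part:
            sums |= {s + x for s in sums}
        return sums

    L = all_sums(left)
    R = sorted(all_sums(right))
    for s in L:
        i = bisect_left(R, target - s)
        if i < len(R) and R[i] == target - s:
            return True
    return False

print(subset_sum_mitm([3, 34, 4, 12, 5, 2], 9))   # True (4 + 5)
print(subset_sum_mitm([3, 34, 4, 12, 5, 2], 30))  # False
```

The paper's reduction replaces the "match two sorted lists" step with an Orthogonal Vectors instance, which is what allows the space (and slightly the time) to drop below the 40-year-old bounds.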

Neel, Seth 
STOC '21: "A New Analysis of Differential ..."
A New Analysis of Differential Privacy’s Generalization Guarantees (Invited Paper)
Christopher Jung, Katrina Ligett, Seth Neel, Aaron Roth, Saeed Sharifi-Malvajerdi, and Moshe Shenfeld (University of Pennsylvania, USA; Hebrew University of Jerusalem, Israel) We give a new proof of the "transfer theorem" underlying adaptive data analysis: that any mechanism for answering adaptively chosen statistical queries that is differentially private and sample-accurate is also accurate out-of-sample. Our new proof is elementary and gives structural insights that we expect will be useful elsewhere. We show: 1) that differential privacy ensures that the expectation of any query on the conditional distribution on datasets induced by the transcript of the interaction is close to its true value on the data distribution, and 2) sample accuracy on its own ensures that any query answer produced by the mechanism is close to its conditional expectation with high probability. This second claim follows from a thought experiment in which we imagine that the dataset is resampled from the conditional distribution after the mechanism has committed to its answers. The transfer theorem then follows by summing these two bounds. An upshot of our new proof technique is that the concrete bounds we obtain are substantially better than the best previously known bounds. @InProceedings{STOC21p9, author = {Christopher Jung and Katrina Ligett and Seth Neel and Aaron Roth and Saeed Sharifi-Malvajerdi and Moshe Shenfeld}, title = {A New Analysis of Differential Privacy’s Generalization Guarantees (Invited Paper)}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {9--9}, doi = {10.1145/3406325.3465358}, year = {2021}, } Publisher's Version 

Neeman, Joe 
STOC '21: "Robust Testing of Low Dimensional ..."
Robust Testing of Low Dimensional Functions
Anindya De, Elchanan Mossel, and Joe Neeman (University of Pennsylvania, USA; Massachusetts Institute of Technology, USA; University of Texas at Austin, USA) A natural problem in high-dimensional inference is to decide if a classifier f:ℝ^{n} → {−1,1} depends on a small number of linear directions of its input data. Call a function g: ℝ^{n} → {−1,1} a linear k-junta if it is completely determined by some k-dimensional subspace of the input space. A recent work of the authors showed that linear k-juntas are testable. Thus there exists an algorithm to distinguish between: (1) f: ℝ^{n} → {−1,1} which is a linear k-junta with surface area s. (2) f is є-far from any linear k-junta with surface area (1+є)s. The query complexity of the algorithm is independent of the ambient dimension n. Following the surge of interest in noise-tolerant property testing, in this paper we prove a noise-tolerant (or robust) version of this result. Namely, we give an algorithm which, given any c>0, є>0, distinguishes between: (1) f: ℝ^{n} → {−1,1} has correlation at least c with some linear k-junta with surface area s. (2) f has correlation at most c−є with any linear k-junta with surface area at most s. The query complexity of our tester is k^{poly(s/є)}. Using our techniques, we also obtain a fully noise-tolerant tester with the same query complexity for any class C of linear k-juntas with surface area bounded by s. As a consequence, we obtain a fully noise-tolerant tester with query complexity k^{O(poly(logk/є))} for the class of intersections of k halfspaces (for constant k) over the Gaussian space. Our query complexity is independent of the ambient dimension n. Previously, no nontrivial noise-tolerant testers were known even for a single halfspace. 
@InProceedings{STOC21p584, author = {Anindya De and Elchanan Mossel and Joe Neeman}, title = {Robust Testing of Low Dimensional Functions}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {584--597}, doi = {10.1145/3406325.3451115}, year = {2021}, } Publisher's Version 

Nekrich, Yakov 
STOC '21: "Dynamic Planar Point Location ..."
Dynamic Planar Point Location in Optimal Time
Yakov Nekrich (Michigan Technological University, USA) In this paper we describe a fully dynamic data structure that supports point location queries in a connected planar subdivision with n edges. Our data structure uses O(n) space, answers queries in O(logn) time, and supports updates in O(logn) time. Our solution is based on a data structure for vertical ray shooting queries that supports queries and updates in O(logn) time. @InProceedings{STOC21p1003, author = {Yakov Nekrich}, title = {Dynamic Planar Point Location in Optimal Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1003--1014}, doi = {10.1145/3406325.3451100}, year = {2021}, } Publisher's Version 
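To illustrate the vertical ray shooting primitive the abstract builds on: given non-crossing segments, find the first segment hit by an upward vertical ray from a query point. A naive O(n)-per-query sketch (the paper supports this in O(logn) time with O(logn)-time updates; the segment data below is a made-up example):

```python
def ray_shoot(segments, qx, qy):
    """Naive vertical ray shooting: return the segment hit first by the
    upward ray from (qx, qy), or None. Each segment is a pair of endpoints
    ((x1, y1), (x2, y2)); vertical segments are ignored for simplicity."""
    best, best_y = None, float("inf")
    for (x1, y1), (x2, y2) in segments:
        if min(x1, x2) <= qx <= max(x1, x2) and x1 != x2:
            t = (qx - x1) / (x2 - x1)           # interpolate the segment
            y = y1 + t * (y2 - y1)              # its height above x = qx
            if qy <= y < best_y:
                best, best_y = ((x1, y1), (x2, y2)), y
    return best

segs = [((0, 2), (4, 2)), ((0, 5), (4, 9))]
print(ray_shoot(segs, 2, 0))  # ((0, 2), (4, 2)), the lower segment
```

Point location in a planar subdivision reduces to exactly this query: the segment directly above the query point identifies the face containing it.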

Niazadeh, Rad 
STOC '21: "Combinatorial Bernoulli Factories: ..."
Combinatorial Bernoulli Factories: Matchings, Flows, and Other Polytopes
Rad Niazadeh, Renato Paes Leme, and Jon Schneider (University of Chicago, USA; Google Research, USA) A Bernoulli factory is an algorithmic procedure for exact sampling of certain random variables having only Bernoulli access to their parameters. Bernoulli access to a parameter p ∈ [0,1] means the algorithm does not know p, but has sample access to independent draws of a Bernoulli random variable with mean equal to p. In this paper, we study the problem of Bernoulli factories for polytopes: given Bernoulli access to a vector x∈ P for a given polytope P⊂ [0,1]^{n}, output a randomized vertex such that the expected value of the ith coordinate is exactly equal to x_{i}. For example, for the special case of the perfect matching polytope, one is given Bernoulli access to the entries of a doubly stochastic matrix [x_{ij}] and asked to sample a matching such that the probability of each edge (i,j) being present in the matching is exactly equal to x_{ij}. We show that a polytope P admits a Bernoulli factory if and only if P is the intersection of [0,1]^{n} with an affine subspace. Our construction is based on an algebraic formulation of the problem, involving identifying a family of Bernstein polynomials (one per vertex) that satisfy a certain algebraic identity on P. The main technical tool behind our construction is a connection between these polynomials and the geometry of zonotope tilings. We apply these results to construct an explicit factory for the perfect matching polytope. The resulting factory is deeply connected to the combinatorial enumeration of arborescences and may be of independent interest. For the k-uniform matroid polytope, we recover a sampling procedure known in statistics as Sampford sampling. 
@InProceedings{STOC21p833, author = {Rad Niazadeh and Renato Paes Leme and Jon Schneider}, title = {Combinatorial Bernoulli Factories: Matchings, Flows, and Other Polytopes}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {833--846}, doi = {10.1145/3406325.3451072}, year = {2021}, } Publisher's Version 
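To make "Bernoulli access" concrete, here are two textbook single-parameter factories (not the paper's polytope construction): sampling Bernoulli(p^{2}) by ANDing two coin draws, and Bernoulli(p/2) by ANDing one draw with a fair coin. The hidden parameter p = 0.6 below is an arbitrary choice for the demo.

```python
import random

def factory_square(coin):
    """Sample Bernoulli(p^2) given only sample access to a Bernoulli(p) coin."""
    return coin() and coin()

def factory_half(coin, rng=random):
    """Sample Bernoulli(p/2): one coin draw ANDed with an independent fair coin."""
    return coin() and rng.random() < 0.5

# The algorithm never sees p; it only draws samples from the coin.
rng = random.Random(0)
coin = lambda: rng.random() < 0.6        # hidden p = 0.6
n = 200_000
est = sum(factory_square(coin) for _ in range(n)) / n
print(est)  # close to 0.36 = 0.6^2
```

The paper's question is the combinatorial analogue: given coordinate-wise Bernoulli access to a point x of a polytope, produce a random *vertex* whose coordinate-wise means equal x exactly, not approximately.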

Nisan, Noam 
STOC '21: "Bipartite Perfect Matching ..."
Bipartite Perfect Matching as a Real Polynomial
Gal Beniamini and Noam Nisan (Hebrew University of Jerusalem, Israel) We obtain a description of the Bipartite Perfect Matching decision problem as a multilinear polynomial over the reals. We show that it has full degree and (1−o_{n}(1))· 2^{n^{2}} monomials with nonzero coefficients. In contrast, we show that in the dual representation (switching the roles of 0 and 1) the number of monomials is only exponential in Θ(n logn). Our proof relies heavily on the fact that the lattice of graphs which are “matching-covered” is Eulerian. @InProceedings{STOC21p1118, author = {Gal Beniamini and Noam Nisan}, title = {Bipartite Perfect Matching as a Real Polynomial}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1118--1131}, doi = {10.1145/3406325.3451002}, year = {2021}, } Publisher's Version 
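A tiny sanity check of the object studied above: every Boolean function on the n^{2} edge indicators has a unique multilinear real polynomial agreeing with it on {0,1}^{n^{2}}, whose coefficients can be computed by Möbius inversion, c(S) = ∑_{T⊆S} (−1)^{|S∖T|} f(T). For n=2 the bipartite perfect matching function is x_{11}x_{22} + x_{12}x_{21} − x_{11}x_{22}x_{12}x_{21}: three monomials and full degree n^{2}=4, consistent with the theorem. A brute-force sketch:

```python
from itertools import combinations, permutations

def bpm(edges, n=2):
    """Does the bipartite graph on [n]+[n] with the given edge set
    have a perfect matching? Brute force over all permutations."""
    return any(all((i, p[i]) in edges for i in range(n))
               for p in permutations(range(n)))

n = 2
vars_ = [(i, j) for i in range(n) for j in range(n)]  # the n^2 edge indicators

# Moebius inversion: coefficient of the monomial indexed by S in the unique
# multilinear real polynomial agreeing with bpm on {0,1}^{n^2}.
coeff = {}
for r in range(len(vars_) + 1):
    for S in combinations(vars_, r):
        c = sum((-1) ** (len(S) - len(T)) * bpm(set(T), n)
                for k in range(len(S) + 1) for T in combinations(S, k))
        if c:
            coeff[S] = c

print(len(coeff), max(len(S) for S in coeff))  # 3 monomials, degree 4
```

The paper's point is that for general n almost all of the 2^{n^{2}} possible monomials survive, so no short real-polynomial representation exists in this basis.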

Nordström, Jakob 
STOC '21: "Automating Algebraic Proof ..."
Automating Algebraic Proof Systems Is NP-Hard
Susanna F. de Rezende, Mika Göös, Jakob Nordström, Toniann Pitassi, Robert Robere, and Dmitry Sokolov (Czech Academy of Sciences, Czechia; EPFL, Switzerland; University of Copenhagen, Denmark; Lund University, Sweden; University of Toronto, Canada; Institute for Advanced Study at Princeton, USA; McGill University, Canada; St. Petersburg State University, Russia; Russian Academy of Sciences, Russia) We show that algebraic proofs are hard to find: Given an unsatisfiable CNF formula F, it is NP-hard to find a refutation of F in the Nullstellensatz, Polynomial Calculus, or Sherali–Adams proof systems in time polynomial in the size of the shortest such refutation. Our work extends, and gives a simplified proof of, the recent breakthrough of Atserias and Müller (JACM 2020) that established an analogous result for Resolution. @InProceedings{STOC21p209, author = {Susanna F. de Rezende and Mika Göös and Jakob Nordström and Toniann Pitassi and Robert Robere and Dmitry Sokolov}, title = {Automating Algebraic Proof Systems Is NP-Hard}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {209--222}, doi = {10.1145/3406325.3451080}, year = {2021}, } Publisher's Version 

Nowicki, Krzysztof 
STOC '21: "A Deterministic Algorithm ..."
A Deterministic Algorithm for the MST Problem in Constant Rounds of Congested Clique
Krzysztof Nowicki (University of Copenhagen, Denmark; University of Wrocław, Poland) In this paper we show that the Minimum Spanning Tree problem (MST) can be solved deterministically in O(1) rounds of the Congested Clique model. In the Congested Clique model there are n players that perform computation in synchronous rounds. Each round consists of a phase of local computation and a phase of communication, in which each pair of players is allowed to exchange O(logn)-bit messages. The study of this model began with the MST problem: the paper by Lotker, Pavlov, Patt-Shamir, and Peleg [SPAA’03, SICOMP’05] that defines the Congested Clique model gives a deterministic O(loglogn)-round algorithm that improved over a trivial O(logn)-round adaptation of Borůvka’s algorithm. There was a sequence of gradual improvements to this result: an O(logloglogn)-round algorithm by Hegeman, Pandurangan, Pemmaraju, Sardeshmukh, and Scquizzato [PODC’15], an O(log^{*} n)-round algorithm by Ghaffari and Parter [PODC’16], and an O(1)-round algorithm by Jurdziński and Nowicki [SODA’18], but all those algorithms were randomized. Therefore, the question about the existence of any deterministic o(loglogn)-round algorithm for the Minimum Spanning Tree problem had remained open since the seminal paper by Lotker, Pavlov, Patt-Shamir, and Peleg [SPAA’03, SICOMP’05]. Our result resolves this question and establishes that O(1) rounds is enough to solve the MST problem in the Congested Clique model, even if we are not allowed to use any randomness. Furthermore, the amount of communication needed by the algorithm makes it applicable to a variant of the MPC model using machines with local memory of size O(n). @InProceedings{STOC21p1154, author = {Krzysztof Nowicki}, title = {A Deterministic Algorithm for the MST Problem in Constant Rounds of Congested Clique}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1154--1165}, doi = {10.1145/3406325.3451136}, year = {2021}, } Publisher's Version 
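For reference, the Borůvka baseline mentioned above: each phase contracts every component along its minimum-weight incident edge, halving the number of components, hence O(logn) phases; each phase translates to O(1) Congested Clique rounds in the trivial adaptation. A sequential sketch (distinct edge weights assumed so the MST is unique):

```python
def boruvka_mst(n, edges):
    """Sequential Boruvka MST. edges are (weight, u, v) tuples with
    distinct weights; runs O(log n) phases of component contraction."""
    parent = list(range(n))

    def find(x):  # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst, components = [], n
    while components > 1:
        # Each component picks its cheapest outgoing edge.
        cheapest = {}
        for e in edges:
            w, u, v = e
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            for r in (ru, rv):
                if r not in cheapest or cheapest[r][0] > w:
                    cheapest[r] = e
        if not cheapest:
            break  # graph is disconnected
        # Contract along all chosen edges at once (one Boruvka phase).
        for w, u, v in set(cheapest.values()):
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                mst.append((w, u, v))
                components -= 1
    return mst

edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]
print(sum(w for w, _, _ in boruvka_mst(4, edges)))  # 7
```

The result in the abstract shows that, with the O(logn)-bit all-to-all communication of the Congested Clique, these O(logn) phases can be compressed into O(1) rounds deterministically.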

O'Donnell, Ryan 
STOC '21: "Improved Quantum Data Analysis ..."
Improved Quantum Data Analysis
Costin Bădescu and Ryan O'Donnell (Carnegie Mellon University, USA) We provide more sample-efficient versions of some basic routines in quantum data analysis, along with simpler proofs. In particular, we give a quantum "Threshold Search" algorithm that requires only O((log^{2} m)/є^{2}) samples of a d-dimensional state ρ. That is, given observables 0 ≤ A_{1}, A_{2}, …, A_{m} ≤ 1 such that tr(ρ A_{i}) ≥ 1/2 for at least one i, the algorithm finds j with tr(ρ A_{j}) ≥ 1/2−є. As a consequence, we obtain a Shadow Tomography algorithm requiring only O((log^{2} m)(logd)/є^{4}) samples, which simultaneously achieves the best known dependence on each parameter m, d, є. This yields the same sample complexity for quantum Hypothesis Selection among m states; we also give an alternative Hypothesis Selection method using O((log^{3} m)/є^{2}) samples. @InProceedings{STOC21p1398, author = {Costin Bădescu and Ryan O'Donnell}, title = {Improved Quantum Data Analysis}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1398--1411}, doi = {10.1145/3406325.3451109}, year = {2021}, } Publisher's Version STOC '21: "Fiber Bundle Codes: Breaking ..." Fiber Bundle Codes: Breaking the N^{1/2} polylog(N) Barrier for Quantum LDPC Codes Matthew B. Hastings, Jeongwan Haah, and Ryan O'Donnell (Station Q, USA; Microsoft Quantum, USA; Carnegie Mellon University, USA) We present a quantum LDPC code family that has distance Ω(N^{3/5}/polylog(N)) and Θ(N^{3/5}) logical qubits, where N is the code length. This is the first quantum LDPC code construction that achieves distance greater than N^{1/2} polylog(N). The construction is based on generalizing the homological product of codes to a fiber bundle. @InProceedings{STOC21p1276, author = {Matthew B. Hastings and Jeongwan Haah and Ryan O'Donnell}, title = {Fiber Bundle Codes: Breaking the <i>N</i><sup>1/2</sup> polylog(<i>N</i>) Barrier for Quantum LDPC Codes}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1276--1288}, doi = {10.1145/3406325.3451005}, year = {2021}, } Publisher's Version 

Oliveira, Igor C. 
STOC '21: "Pseudodeterministic Algorithms ..."
Pseudodeterministic Algorithms and the Structure of Probabilistic Time
Zhenjian Lu, Igor C. Oliveira, and Rahul Santhanam (University of Warwick, UK; University of Oxford, UK) We connect the study of pseudodeterministic algorithms to two major open problems about the structural complexity of BPTIME: proving hierarchy theorems and showing the existence of complete problems. Our main contributions can be summarised as follows. A new pseudorandom generator and its consequences. We build on techniques developed to prove hierarchy theorems for probabilistic time with advice (Fortnow and Santhanam, FOCS 2004) to construct the first unconditional pseudorandom generator of polynomial stretch computable in pseudodeterministic polynomial time (with one bit of advice) that is secure infinitely often against polynomial-time computations. As an application of this construction, we obtain new results about the complexity of generating and representing prime numbers. For instance, we show unconditionally for each ε > 0 that infinitely many primes p_{n} have a succinct representation in the following sense: there is a fixed probabilistic polynomial-time algorithm that generates p_{n} with high probability from its succinct representation of size O(p_{n}^{ε}). This offers an exponential improvement over the running time of previous results, and shows that infinitely many primes have succinct and efficient representations. Structural results for probabilistic time from pseudodeterministic algorithms. Oliveira and Santhanam (STOC 2017) established unconditionally that there is a pseudodeterministic algorithm for the Circuit Acceptance Probability Problem (CAPP) that runs in subexponential time and is correct with high probability over any samplable distribution on circuits on infinitely many input lengths. We show that improving this running time or obtaining a result that holds for every large input length would imply new time hierarchy theorems for probabilistic time. 
In addition, we prove that a worst-case polynomial-time pseudodeterministic algorithm for CAPP would imply that BPP has complete problems. Equivalence between pseudodeterministic constructions and hierarchies. We establish an equivalence between a certain explicit pseudodeterministic construction problem and the existence of strong hierarchy theorems for probabilistic time. More precisely, we show that pseudodeterministically constructing in exponential time strings of large rKt complexity (Oliveira, ICALP 2019) is possible if and only if for every constructive function T(n) ≤ exp(o(exp(n))) we have BPTIME[poly(T)] ⊈ i.o.BPTIME[T]/log T. More generally, these results suggest new approaches for designing pseudodeterministic algorithms for search problems and for unveiling the structure of probabilistic time. @InProceedings{STOC21p303, author = {Zhenjian Lu and Igor C. Oliveira and Rahul Santhanam}, title = {Pseudodeterministic Algorithms and the Structure of Probabilistic Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {303--316}, doi = {10.1145/3406325.3451085}, year = {2021}, } Publisher's Version 

Oshman, Rotem 
STOC '21: "The Communication Complexity ..."
The Communication Complexity of Multiparty Set Disjointness under Product Distributions
Nachum Dershowitz, Rotem Oshman, and Tal Roth (Tel Aviv University, Israel) In the multiparty number-in-hand set disjointness problem, we have k players, with private inputs X_{1},…,X_{k} ⊆ [n]. The players’ goal is to check whether ∩_{ℓ=1}^{k} X_{ℓ} = ∅. It is known that in the shared blackboard model of communication, set disjointness requires Ω(n log k + k) bits of communication, and in the coordinator model, it requires Ω(kn) bits. However, these two lower bounds require that the players’ inputs can be highly correlated. We study the communication complexity of multiparty set disjointness under product distributions, and ask whether the problem becomes significantly easier, as it is known to become in the two-party case. Our main result is a nearly-tight bound of Θ̃(n^{1−1/k} + k) for both the shared blackboard model and the coordinator model. This shows that in the shared blackboard model, as the number of players grows, having independent inputs helps less and less; but in the coordinator model, when k is very large, having independent inputs makes the problem much easier. Both our upper and our lower bounds use new ideas, as the original techniques developed for the two-party case do not scale to more than two players. @InProceedings{STOC21p1194, author = {Nachum Dershowitz and Rotem Oshman and Tal Roth}, title = {The Communication Complexity of Multiparty Set Disjointness under Product Distributions}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1194--1207}, doi = {10.1145/3406325.3451106}, year = {2021}, } Publisher's Version 

Pan, Qinxuan 
STOC '21: "SampleOptimal and Efficient ..."
SampleOptimal and Efficient Learning of Tree Ising Models
Constantinos Daskalakis and Qinxuan Pan (Massachusetts Institute of Technology, USA) We show that n-variable tree-structured Ising models can be learned computationally efficiently to within total variation distance є from an optimal O(n ln n/є^{2}) samples, where O(·) hides an absolute constant which, importantly, does not depend on the model being learned—neither its tree nor the magnitude of its edge strengths, on which we place no assumptions. Our guarantees hold, in fact, for the celebrated Chow-Liu algorithm [1968], using the plug-in estimator for estimating mutual information. While this (or any other) algorithm may fail to identify the structure of the underlying model correctly from a finite sample, we show that it will still learn a tree-structured model that is є-close to the true one in total variation distance, a guarantee called “proper learning.” Our guarantees do not follow from known results for the Chow-Liu algorithm and the ensuing literature on learning graphical models, including the very recent renaissance of algorithms on this learning challenge, which only yield asymptotic consistency results, or sample-suboptimal and/or time-inefficient algorithms, unless further assumptions are placed on the model, such as bounds on the “strengths” of the model’s edges. While we establish guarantees for a widely known and simple algorithm, the analysis that this algorithm succeeds and is sample-optimal is quite complex, requiring a hierarchical classification of the edges into layers with different reconstruction guarantees, depending on their strength, combined with delicate uses of the subadditivity of the squared Hellinger distance over graphical models to control the error accumulation. @InProceedings{STOC21p133, author = {Constantinos Daskalakis and Qinxuan Pan}, title = {Sample-Optimal and Efficient Learning of Tree Ising Models}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {133--146}, doi = {10.1145/3406325.3451006}, year = {2021}, } Publisher's Version 
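The Chow-Liu procedure analyzed in the abstract is simple to state: estimate pairwise mutual information with the plug-in (empirical) estimator, then output a maximum-weight spanning tree under those weights. Below is a hedged minimal sketch of that classical procedure (the function names are ours, and this illustrates only the structure-learning step, not the paper's finite-sample analysis):

```python
import math
from collections import Counter

def chow_liu_tree(samples):
    """Edge set of a maximum-weight spanning tree under empirical
    mutual information (the plug-in estimator).

    samples: list of tuples, one observed value per variable.
    """
    n_vars = len(samples[0])
    m = len(samples)

    def mutual_info(i, j):
        # Plug-in estimate of I(X_i; X_j) from empirical frequencies.
        joint = Counter((s[i], s[j]) for s in samples)
        pi = Counter(s[i] for s in samples)
        pj = Counter(s[j] for s in samples)
        mi = 0.0
        for (a, b), c in joint.items():
            p_ab = c / m
            mi += p_ab * math.log(p_ab / ((pi[a] / m) * (pj[b] / m)))
        return mi

    # Kruskal's algorithm on edges sorted by decreasing empirical MI.
    edges = sorted(((mutual_info(i, j), i, j)
                    for i in range(n_vars) for j in range(i + 1, n_vars)),
                   reverse=True)
    parent = list(range(n_vars))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    return tree
```

The abstract's point is that even when this tree disagrees with the true structure, the resulting model is still є-close in total variation, given O(n ln n/є²) samples.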

Panigrahi, Debmalya 
STOC '21: "Approximate Gomory–Hu Tree ..."
Approximate Gomory–Hu Tree Is Faster Than n – 1 Max-Flows
Jason Li and Debmalya Panigrahi (Carnegie Mellon University, USA; Duke University, USA) The Gomory-Hu tree or cut tree (Gomory and Hu, 1961) is a classic data structure for reporting s−t min-cuts (and by duality, the values of s−t max-flows) for all pairs of vertices s and t in an undirected graph. Gomory and Hu showed that it can be computed using n−1 exact max-flow computations. Surprisingly, this remains the best algorithm for Gomory-Hu trees more than 50 years later, even for approximate min-cuts. In this paper, we break this long-standing barrier and give an algorithm for computing a (1+є)-approximate Gomory-Hu tree using polylog(n) max-flow computations. Specifically, we obtain the runtime bounds we describe below. We obtain a randomized (Monte Carlo) algorithm for undirected, weighted graphs that runs in Õ(m + n^{3/2}) time and returns a (1+є)-approximate Gomory-Hu tree w.h.p. Previously, the best running time known was Õ(n^{5/2}), which is obtained by running Gomory and Hu’s original algorithm on a cut sparsifier of the graph. Next, we obtain a randomized (Monte Carlo) algorithm for undirected, unweighted graphs that runs in m^{4/3+o(1)} time and returns a (1+є)-approximate Gomory-Hu tree w.h.p. This improves on our first result for sparse graphs, namely m = o(n^{9/8}). Previously, the best running time known for unweighted graphs was Õ(mn) for an exact Gomory-Hu tree (Bhalgat et al., STOC 2007); no better result was known if approximations are allowed. As a consequence of our Gomory-Hu tree algorithms, we also solve the (1+є)-approximate all-pairs min-cut and single-source min-cut problems in the same time bounds. (These problems are simpler in that the goal is to only return the s−t min-cut values, and not the min-cuts.) This improves on the recent algorithm for these problems in Õ(n^{2}) time due to Abboud et al. (FOCS 2020). 
@InProceedings{STOC21p1738, author = {Jason Li and Debmalya Panigrahi}, title = {Approximate Gomory–Hu Tree Is Faster Than <i>n</i> – 1 Max-Flows}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1738--1748}, doi = {10.1145/3406325.3451112}, year = {2021}, } Publisher's Version STOC '21: "Vertex Connectivity in Polylogarithmic ..." Vertex Connectivity in Polylogarithmic Max-Flows Jason Li, Danupon Nanongkai, Debmalya Panigrahi, Thatchaphol Saranurak, and Sorrachai Yingchareonthawornchai (Carnegie Mellon University, USA; University of Copenhagen, Denmark; KTH, Sweden; Duke University, USA; University of Michigan, USA; Aalto University, Finland) The vertex connectivity of an m-edge n-vertex undirected graph is the smallest number of vertices whose removal disconnects the graph, or leaves only a singleton vertex. In this paper, we give a reduction from the vertex connectivity problem to a set of max-flow instances. Using this reduction, we can solve vertex connectivity in Õ(m^{α}) time for any α ≥ 1, if there is an m^{α}-time max-flow algorithm. Using the current best max-flow algorithm that runs in m^{4/3+o(1)} time (Kathuria, Liu, and Sidford, FOCS 2020), this yields an m^{4/3+o(1)}-time vertex connectivity algorithm. This is the first improvement in the running time of the vertex connectivity problem in over 20 years, the previous best being an Õ(mn)-time algorithm due to Henzinger, Rao, and Gabow (FOCS 1996). Indeed, no algorithm with an o(mn) running time was known before our work, even if we assume an Õ(m)-time max-flow algorithm. 
Our new technique is robust enough to also improve the best Õ(mn)-time bound for directed vertex connectivity to mn^{1−1/12+o(1)} time. @InProceedings{STOC21p317, author = {Jason Li and Danupon Nanongkai and Debmalya Panigrahi and Thatchaphol Saranurak and Sorrachai Yingchareonthawornchai}, title = {Vertex Connectivity in Polylogarithmic Max-Flows}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {317--329}, doi = {10.1145/3406325.3451088}, year = {2021}, } Publisher's Version 

Paramonov, Dmitry 
STOC '21: "Almost Optimal SuperConstantPass ..."
Almost Optimal SuperConstantPass Streaming Lower Bounds for Reachability
Lijie Chen, Gillat Kol, Dmitry Paramonov, Raghuvansh R. Saxena, Zhao Song, and Huacheng Yu (Massachusetts Institute of Technology, USA; Princeton University, USA; Institute for Advanced Study at Princeton, USA) We give an almost quadratic n^{2−o(1)} lower bound on the space consumption of any o(√log n)-pass streaming algorithm solving the (directed) s-t reachability problem. This means that any such algorithm must essentially store the entire graph. As corollaries, we obtain almost quadratic space lower bounds for additional fundamental problems, including maximum matching, shortest path, matrix rank, and linear programming. Our main technical contribution is the definition and construction of set hiding graphs, that may be of independent interest: we give a general way of encoding a set S ⊆ [k] as a directed graph with n = k^{1+o(1)} vertices, such that deciding whether i ∈ S boils down to deciding if t_{i} is reachable from s_{i}, for a specific pair of vertices (s_{i},t_{i}) in the graph. Furthermore, we prove that our graph “hides” S, in the sense that no low-space streaming algorithm with a small number of passes can learn (almost) anything about S. @InProceedings{STOC21p570, author = {Lijie Chen and Gillat Kol and Dmitry Paramonov and Raghuvansh R. Saxena and Zhao Song and Huacheng Yu}, title = {Almost Optimal Super-Constant-Pass Streaming Lower Bounds for Reachability}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {570--583}, doi = {10.1145/3406325.3451038}, year = {2021}, } Publisher's Version 

Parisi, Daniel 
STOC '21: "Entropy Decay in the Swendsen–Wang ..."
Entropy Decay in the Swendsen–Wang Dynamics on ℤ^{d}
Antonio Blanca, Pietro Caputo, Daniel Parisi, Alistair Sinclair, and Eric Vigoda (Pennsylvania State University, USA; Roma Tre University, Italy; University of California at Berkeley, USA; Georgia Institute of Technology, USA) We study the mixing time of the Swendsen-Wang dynamics for the ferromagnetic Ising and Potts models on the integer lattice ℤ^{d}. This dynamics is a widely used Markov chain that has largely resisted sharp analysis because it is non-local, i.e., it changes the entire configuration in one step. We prove that, whenever strong spatial mixing (SSM) holds, the mixing time on any n-vertex cube in ℤ^{d} is O(log n), and we prove this is tight by establishing a matching lower bound. The previous best known bound was O(n). SSM is a standard condition corresponding to exponential decay of correlations with distance between spins on the lattice and is known to hold in d=2 dimensions throughout the high-temperature (single phase) region. Our result follows from a modified log-Sobolev inequality, which expresses the fact that the dynamics contracts relative entropy at a constant rate at each step. The proof of this fact utilizes a new factorization of the entropy in the joint probability space over spins and edges that underlies the Swendsen-Wang dynamics, which extends to general bipartite graphs of bounded degree. This factorization leads to several additional results, including mixing time bounds for a number of natural local and non-local Markov chains on the joint space, as well as for the standard random-cluster dynamics. @InProceedings{STOC21p1551, author = {Antonio Blanca and Pietro Caputo and Daniel Parisi and Alistair Sinclair and Eric Vigoda}, title = {Entropy Decay in the Swendsen–Wang Dynamics on ℤ<sup><i>d</i></sup>}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1551--1564}, doi = {10.1145/3406325.3451095}, year = {2021}, } Publisher's Version 

Pass, Rafael 
STOC '21: "Cryptography from SublinearTime ..."
Cryptography from SublinearTime AverageCase Hardness of TimeBounded Kolmogorov Complexity
Yanyi Liu and Rafael Pass (Cornell University, USA) Let MK^{t}P[s] be the set of strings x such that K^{t}(x) ≤ s(|x|), where K^{t}(x) denotes the t-bounded Kolmogorov complexity of the truth-table described by x. Our main theorem shows that for an appropriate notion of mild average-case hardness, for every ε>0, polynomial t(n) ≥ (1+ε)n, and every “nice” class F of super-polynomial functions, the following are equivalent: (i) the existence of some function T ∈ F such that T-hard one-way functions (OWFs) exist (with non-uniform security); (ii) the existence of some function T ∈ F such that MK^{t}P[T^{−1}] is mildly average-case hard with respect to sublinear-time non-uniform algorithms (with running time n^{δ} for some 0<δ<1). For instance, the existence of subexponentially-hard (resp. quasi-polynomially-hard) OWFs is equivalent to mild average-case hardness of MK^{t}P[poly log n] (resp. MK^{t}P[2^{O(√log n)}]) w.r.t. sublinear-time non-uniform algorithms. We additionally note that if we want to deduce T-hard OWFs where security holds w.r.t. uniform T-time probabilistic attackers (i.e., uniformly-secure OWFs), it suffices to assume sublinear-time hardness of MK^{t}P w.r.t. uniform probabilistic sublinear-time attackers. We complement this result by proving lower bounds that come surprisingly close to what is required to unconditionally deduce the existence of (uniformly-secure) OWFs: MK^{t}P[polylog n] is worst-case hard w.r.t. uniform probabilistic sublinear-time algorithms, and MK^{t}P[n−log n] is mildly average-case hard for all O(t(n)/n^{3})-time deterministic algorithms. @InProceedings{STOC21p722, author = {Yanyi Liu and Rafael Pass}, title = {Cryptography from Sublinear-Time Average-Case Hardness of Time-Bounded Kolmogorov Complexity}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {722--735}, doi = {10.1145/3406325.3451121}, year = {2021}, } Publisher's Version STOC '21: "Indistinguishability Obfuscation ..." 
Indistinguishability Obfuscation from Circular Security Romain Gay and Rafael Pass (IBM Research, Switzerland; Cornell Tech, USA) We show the existence of indistinguishability obfuscators (iO) for general circuits assuming subexponential security of: (a) the Learning with Errors (LWE) assumption (with subexponential modulus-to-noise ratio); (b) a circular security conjecture regarding the Gentry-Sahai-Waters (GSW) encryption scheme and a Packed version of Regev's encryption scheme. The circular security conjecture states that a notion of leakage-resilient security, that we prove is satisfied by GSW assuming LWE, is retained in the presence of an encrypted key-cycle involving GSW and Packed Regev. @InProceedings{STOC21p736, author = {Romain Gay and Rafael Pass}, title = {Indistinguishability Obfuscation from Circular Security}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {736--749}, doi = {10.1145/3406325.3451070}, year = {2021}, } Publisher's Version 

Peebles, John 
STOC '21: "Optimal Testing of Discrete ..."
Optimal Testing of Discrete Distributions with High Probability
Ilias Diakonikolas, Themis Gouleakis, Daniel M. Kane, John Peebles, and Eric Price (University of Wisconsin-Madison, USA; MPI-INF, Germany; University of California at San Diego, USA; Princeton University, USA; University of Texas at Austin, USA) We study the problem of testing discrete distributions with a focus on the high probability regime. Specifically, given samples from one or more discrete distributions, a property P, and parameters 0< є, δ <1, we want to distinguish with probability at least 1−δ whether these distributions satisfy P or are є-far from P in total variation distance. Most prior work in distribution testing studied the constant confidence case (corresponding to δ = Ω(1)), and provided sample-optimal testers for a range of properties. While one can always boost the confidence probability of any such tester by black-box amplification, this generic boosting method typically leads to suboptimal sample bounds. Here we study the following broad question: For a given property P, can we characterize the sample complexity of testing P as a function of all relevant problem parameters, including the error probability δ? Prior to this work, uniformity testing was the only statistical task whose sample complexity had been characterized in this setting. As our main results, we provide the first algorithms for closeness and independence testing that are sample-optimal, within constant factors, as a function of all relevant parameters. We also show matching information-theoretic lower bounds on the sample complexity of these problems. Our techniques naturally extend to give optimal testers for related problems. To illustrate the generality of our methods, we give optimal algorithms for testing collections of distributions and testing closeness with unequal sized samples. @InProceedings{STOC21p542, author = {Ilias Diakonikolas and Themis Gouleakis and Daniel M. Kane and John Peebles and Eric Price}, title = {Optimal Testing of Discrete Distributions with High Probability}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {542--555}, doi = {10.1145/3406325.3450997}, year = {2021}, } Publisher's Version 

Peleg, Shir 
STOC '21: "Polynomial Time Deterministic ..."
Polynomial Time Deterministic Identity Testing Algorithm for Σ^{[3]}ΠΣΠ^{[2]} Circuits via Edelstein–Kelly Type Theorem for Quadratic Polynomials
Shir Peleg and Amir Shpilka (Tel Aviv University, Israel) In this work we resolve conjectures of Beecken, Mittmann, and Saxena [BMS13] and Gupta [Gupta14], by proving an analog of a theorem of Edelstein and Kelly for quadratic polynomials. As an immediate corollary we obtain the first deterministic polynomial-time black-box algorithm for testing zeroness of Σ^{[3]}ΠΣΠ^{[2]} circuits. @InProceedings{STOC21p259, author = {Shir Peleg and Amir Shpilka}, title = {Polynomial Time Deterministic Identity Testing Algorithm for Σ<sup>[3]</sup>ΠΣΠ<sup>[2]</sup> Circuits via Edelstein–Kelly Type Theorem for Quadratic Polynomials}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {259--271}, doi = {10.1145/3406325.3451013}, year = {2021}, } Publisher's Version Info 

Peri, Noam 
STOC '21: "Expander Random Walks: A FourierAnalytic ..."
Expander Random Walks: A FourierAnalytic Approach
Gil Cohen, Noam Peri, and Amnon Ta-Shma (Tel Aviv University, Israel) In this work we ask the following basic question: assume the vertices of an expander graph are labelled by 0,1. What “test” functions f : {0,1}^{t} → {0,1} cannot distinguish t independent samples from those obtained by a random walk? The expander hitting property due to Ajtai, Komlós, and Szemerédi (STOC 1987) is captured by the AND test function, whereas the fundamental expander Chernoff bound due to Gillman (SICOMP 1998) and Healy (Computational Complexity 2008) is about test functions indicating whether the weight is close to the mean. In fact, it is known that all threshold functions are fooled by a random walk (Kipnis and Varadhan, Communications in Mathematical Physics 1986). Recently, it was shown that even the highly sensitive PARITY function is fooled by a random walk (Ta-Shma, STOC 2017). We focus on balanced labels. Our first main result is proving that all symmetric functions are fooled by a random walk. Put differently, we prove a central limit theorem (CLT) for expander random walks with respect to the total variation distance, significantly strengthening the classic CLT for Markov chains that is established with respect to the Kolmogorov distance (Kipnis and Varadhan, Communications in Mathematical Physics 1986). Our approach significantly deviates from prior works. We first study how well a Fourier character χ_{S} is fooled by a random walk as a function of S. Then, given a test function f, we expand f in the Fourier basis and combine the above with known results on the Fourier spectrum of f. We also proceed further and consider general test functions, not necessarily symmetric. As our approach is Fourier analytic, it is general enough to analyze such versatile test functions. For our second result, we prove that random walks on sufficiently good expander graphs fool test functions computed by AC^{0} circuits, read-once branching programs, and functions with bounded query complexity. 
@InProceedings{STOC21p1643, author = {Gil Cohen and Noam Peri and Amnon Ta-Shma}, title = {Expander Random Walks: A Fourier-Analytic Approach}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1643--1655}, doi = {10.1145/3406325.3451049}, year = {2021}, } Publisher's Version Info 
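To make the notion of a test function being "fooled" concrete, here is a small illustrative computation (our own sketch, not from the paper): the exact probability that the PARITY test outputs 1 on the labels seen along a t-step walk, computed by dynamic programming over pairs (current vertex, running parity). For an i.i.d. uniform sampler (a uniform transition matrix) with balanced labels, parity is exactly unbiased; plugging in a genuine expander's transition matrix lets one measure the walk's deviation from that baseline.

```python
def walk_parity_prob(trans, labels, t):
    """Pr[parity of the t visited labels is 1] for a walk started at a
    uniformly random vertex.

    trans[u][v] = Pr[step u -> v]; labels are in {0, 1}; t >= 1.
    """
    n = len(trans)
    # state[(v, p)] = Pr[walk is currently at v with running parity p]
    state = {(v, labels[v]): 1.0 / n for v in range(n)}
    for _ in range(t - 1):
        nxt = {}
        for (u, p), pr in state.items():
            for v in range(n):
                key = (v, p ^ labels[v])
                nxt[key] = nxt.get(key, 0.0) + pr * trans[u][v]
        state = nxt
    return sum(pr for (v, p), pr in state.items() if p == 1)
```

The state space has only 2n entries, so the exact distribution of any symmetric test (parity, thresholds, Hamming weight) over walk samples is computable the same way, by enlarging the tracked statistic.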

Perkins, Will 
STOC '21: "Frozen 1RSB Structure of ..."
Frozen 1RSB Structure of the Symmetric Ising Perceptron
Will Perkins and Changji Xu (University of Illinois at Chicago, USA; Harvard University, USA) We prove, under an assumption on the critical points of a real-valued function, that the symmetric Ising perceptron exhibits the ‘frozen 1RSB’ structure conjectured by Krauth and Mézard in the physics literature; that is, typical solutions of the model lie in clusters of vanishing entropy density. Moreover, we prove this in a very strong form conjectured by Huang, Wong, and Kabashima: a typical solution of the model is isolated with high probability and the Hamming distance to all other solutions is linear in the dimension. The frozen 1RSB scenario is part of a recent and intriguing explanation of the performance of learning algorithms by Baldassi, Ingrosso, Lucibello, Saglietti, and Zecchina. We prove this structural result by comparing the symmetric Ising perceptron model to a planted model and proving a comparison result between the two models. Our main technical tool towards this comparison is an inductive argument for the concentration of the logarithm of the number of solutions in the model. @InProceedings{STOC21p1579, author = {Will Perkins and Changji Xu}, title = {Frozen 1RSB Structure of the Symmetric Ising Perceptron}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {1579--1588}, doi = {10.1145/3406325.3451119}, year = {2021}, } Publisher's Version 

Pettie, Seth 
STOC '21: "Information Theoretic Limits ..."
Information Theoretic Limits of Cardinality Estimation: Fisher Meets Shannon
Seth Pettie and Dingyu Wang (University of Michigan, USA) Estimating the cardinality (number of distinct elements) of a large multiset is a classic problem in streaming and sketching, dating back to Flajolet and Martin’s classic Probabilistic Counting (PCSA) algorithm from 1983. In this paper we study the intrinsic tradeoff between the space complexity of the sketch and its estimation error in the random oracle model. We define a new measure of efficiency for cardinality estimators called the Fisher-Shannon (Fish) number H/I. It captures the tension between the limiting Shannon entropy (H) of the sketch and its normalized Fisher information (I), which characterizes the variance of a statistically efficient, asymptotically unbiased estimator. Our results are as follows. (i) We prove that all base-q variants of Flajolet and Martin’s PCSA sketch have Fish-number H_{0}/I_{0} ≈ 1.98016 and that every base-q variant of (Hyper)LogLog has Fish-number worse than H_{0}/I_{0}, but that they tend to H_{0}/I_{0} in the limit as q→ ∞. Here H_{0},I_{0} are precisely defined constants. (ii) We describe a sketch called Fishmonger that is based on a smoothed, entropy-compressed variant of PCSA with a different estimator function. It is proved that with high probability, Fishmonger processes a multiset of [U] such that at all times, its space is O(log^{2} log U) + (1+o(1))(H_{0}/I_{0})b ≈ 1.98b bits and its standard error is 1/√b. (iii) We give circumstantial evidence that H_{0}/I_{0} is the optimum Fish-number of mergeable sketches for Cardinality Estimation. We define a class of linearizable sketches and prove that no member of this class can beat H_{0}/I_{0}. The popular mergeable sketches are, in fact, also linearizable. 
@InProceedings{STOC21p556, author = {Seth Pettie and Dingyu Wang}, title = {Information Theoretic Limits of Cardinality Estimation: Fisher Meets Shannon}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {556--569}, doi = {10.1145/3406325.3451032}, year = {2021}, } Publisher's Version 
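As background for the PCSA sketch analyzed above: each stream element is hashed, routed to one of m buckets, and the position of the lowest set bit of the remaining hash bits is recorded in that bucket's bitmap; the estimator combines the lowest unset position across buckets with Flajolet and Martin's correction constant. Below is a hedged toy implementation of this textbook base-2 scheme (the class name, bucket count, and the specific choice of SHA-256 as the hash are our own illustrative choices, not from the paper):

```python
import hashlib

class PCSA:
    """Toy Probabilistic Counting with Stochastic Averaging (base 2)."""

    def __init__(self, num_buckets=64):
        self.m = num_buckets
        self.bitmaps = [0] * num_buckets  # one bitmap of rho-values per bucket

    def _hash(self, item):
        # Illustrative hash choice: first 8 bytes of SHA-256 as an integer.
        h = hashlib.sha256(str(item).encode()).digest()
        return int.from_bytes(h[:8], 'big')

    def add(self, item):
        h = self._hash(item)
        bucket = h % self.m
        rest = h // self.m
        # rho = position of the lowest set bit of the remaining hash bits.
        r = (rest & -rest).bit_length() - 1 if rest else 63
        self.bitmaps[bucket] |= 1 << r

    def estimate(self):
        PHI = 0.77351  # Flajolet-Martin correction constant
        total_r = sum(self._lowest_zero(b) for b in self.bitmaps)
        return self.m / PHI * 2 ** (total_r / self.m)

    @staticmethod
    def _lowest_zero(bitmap):
        # Position of the lowest unset bit: the per-bucket statistic R.
        r = 0
        while bitmap & (1 << r):
            r += 1
        return r
```

Since the sketch state is a bitwise OR of per-element contributions, inserting duplicates never changes it, and two sketches over different streams merge by OR-ing bitmaps, which is the mergeability property the abstract's item (iii) is about.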

Pich, Ján 
STOC '21: "Strong Conondeterministic ..."
Strong Conondeterministic Lower Bounds for NP Cannot Be Proved Feasibly
Ján Pich and Rahul Santhanam (Czech Academy of Sciences, Czechia; University of Oxford, UK) We show unconditionally that Cook’s theory PV formalizing poly-time reasoning cannot prove, for any nondeterministic poly-time machine M defining a language L(M), that L(M) is inapproximable by co-nondeterministic circuits of subexponential size. In fact, our unprovability result holds also for a theory which supports a fragment of Jeřábek’s theory of approximate counting APC_{1}. We also show similar unconditional unprovability results for the conjecture of Rudich about the existence of super-bits. @InProceedings{STOC21p223, author = {Ján Pich and Rahul Santhanam}, title = {Strong Co-nondeterministic Lower Bounds for NP Cannot Be Proved Feasibly}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {223--233}, doi = {10.1145/3406325.3451117}, year = {2021}, } Publisher's Version 

Pilipczuk, Marcin 
STOC '21: "Finding Large Induced Sparse ..."
Finding Large Induced Sparse Subgraphs in C_{>t}-Free Graphs in Quasipolynomial Time
Peter Gartland, Daniel Lokshtanov, Marcin Pilipczuk, Michał Pilipczuk, and Paweł Rzążewski (University of California at Santa Barbara, USA; University of Warsaw, Poland; Warsaw University of Technology, Poland) For an integer t, a graph G is called C_{>t}-free if G does not contain any induced cycle on more than t vertices. We prove the following statement: for every pair of integers d and t and a statement φ, there exists an algorithm that, given an n-vertex C_{>t}-free graph G with weights on vertices, finds in time n^{O(log^{3} n)} a maximum-weight vertex subset S such that G[S] has degeneracy at most d and satisfies φ. The running time can be improved to n^{O(log^{2} n)} assuming G is P_{t}-free, that is, G does not contain an induced path on t vertices. This expands the recent results of the authors [FOCS 2020 and SOSA 2021] on the Maximum Weight Independent Set problem on P_{t}-free graphs in two directions: by encompassing the more general setting of C_{>t}-free graphs, and by being applicable to a much wider variety of problems, such as Maximum Weight Induced Forest or Maximum Weight Induced Planar Graph. @InProceedings{STOC21p330, author = {Peter Gartland and Daniel Lokshtanov and Marcin Pilipczuk and Michał Pilipczuk and Paweł Rzążewski}, title = {Finding Large Induced Sparse Subgraphs in <i>C<sub>>t</sub></i>-Free Graphs in Quasipolynomial Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {330--341}, doi = {10.1145/3406325.3451034}, year = {2021}, } Publisher's Version 

Pilipczuk, Michał 
STOC '21: "Finding Large Induced Sparse ..."
Finding Large Induced Sparse Subgraphs in C_{>t}-Free Graphs in Quasipolynomial Time
Peter Gartland, Daniel Lokshtanov, Marcin Pilipczuk, Michał Pilipczuk, and Paweł Rzążewski (University of California at Santa Barbara, USA; University of Warsaw, Poland; Warsaw University of Technology, Poland) For an integer t, a graph G is called C_{>t}-free if G does not contain any induced cycle on more than t vertices. We prove the following statement: for every pair of integers d and t and a statement φ, there exists an algorithm that, given an n-vertex C_{>t}-free graph G with weights on vertices, finds in time n^{O(log^{3} n)} a maximum-weight vertex subset S such that G[S] has degeneracy at most d and satisfies φ. The running time can be improved to n^{O(log^{2} n)} assuming G is P_{t}-free, that is, G does not contain an induced path on t vertices. This expands the recent results of the authors [FOCS 2020 and SOSA 2021] on the Maximum Weight Independent Set problem on P_{t}-free graphs in two directions: by encompassing the more general setting of C_{>t}-free graphs, and by being applicable to a much wider variety of problems, such as Maximum Weight Induced Forest or Maximum Weight Induced Planar Graph. @InProceedings{STOC21p330, author = {Peter Gartland and Daniel Lokshtanov and Marcin Pilipczuk and Michał Pilipczuk and Paweł Rzążewski}, title = {Finding Large Induced Sparse Subgraphs in <i>C<sub>>t</sub></i>-Free Graphs in Quasipolynomial Time}, booktitle = {Proc.\ STOC}, publisher = {ACM}, pages = {330--341}, doi = {10.1145/3406325.3451034}, year = {2021}, } Publisher's Version 

Pitassi, Toniann  STOC '21: "Automating Algebraic Proof ..." 