Winter 2021

Thursdays 3:00 pm - 4:00 pm PST

Zoom link in announcements; contact Sabya Basu to be added to the list and/or for additional information.

March 18

Speaker: Yatong Chen

Title: On Making Stochastic Classifiers Deterministic [paper]

Abstract: Stochastic classifiers arise in a number of machine learning problems, and have become especially prominent of late, as they often result from constrained optimization problems, e.g. for fairness, churn, or custom losses. Despite their utility, the inherent randomness of stochastic classifiers may make them problematic to use in practice for a variety of reasons. In this paper, we attempt to answer the theoretical question of how well a stochastic classifier can be approximated by a deterministic one, and compare several different approaches, proving lower and upper bounds. We also experimentally investigate the pros and cons of these methods, not only in regard to how successfully each deterministic classifier approximates the original stochastic classifier, but also in terms of how well each addresses the other issues that can make stochastic classifiers undesirable.


March 11

Speaker: Manuj Mukherjee

Title: Approximating Probability Distributions by ReLU Networks

Abstract: How many neurons are needed to approximate a target probability distribution using a neural network with a given input distribution and approximation error? This paper examines this question for the case when the input distribution is uniform, and the target distribution belongs to the class of histogram distributions. We obtain a new upper bound on the number of required neurons, which is strictly better than previously existing upper bounds. The key ingredient in this improvement is an efficient construction of the neural nets representing piecewise linear functions. We also obtain a lower bound on the minimum number of neurons needed to approximate the histogram distributions.
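The core construction can be illustrated concretely: a histogram distribution is the pushforward of the uniform distribution through its piecewise-linear inverse CDF, and any continuous piecewise-linear function is exactly a one-hidden-layer ReLU network. A minimal numpy sketch (the 4-bin target distribution `p` is an illustrative assumption, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical target: a histogram distribution on [0, 1) with 4 equal-width bins.
p = np.array([0.1, 0.4, 0.3, 0.2])         # bin probabilities, sum to 1
k = len(p)
P = np.concatenate(([0.0], np.cumsum(p)))  # cumulative probabilities P_0..P_k

# The inverse CDF is piecewise linear with slope 1/(k*p_j) on [P_{j-1}, P_j),
# so it is exactly representable as a one-hidden-layer ReLU network:
#   g(u) = s_1*u + sum_{j>=2} (s_j - s_{j-1}) * relu(u - P_{j-1})
s = 1.0 / (k * p)                          # slope of each linear piece

def relu(x):
    return np.maximum(x, 0.0)

def g(u):
    out = s[0] * u
    for j in range(1, k):
        out += (s[j] - s[j - 1]) * relu(u - P[j])
    return out

# Push uniform samples through the network; the empirical bin masses
# approach p as the sample size grows.
u = rng.random(200_000)
x = g(u)
emp = np.histogram(x, bins=k, range=(0.0, 1.0))[0] / len(u)
print(np.round(emp, 3))
```

The construction uses one ReLU unit per slope change; the paper's contribution is making such piecewise-linear representations efficient enough to beat the previous neuron-count bounds.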

Bio: Manuj Mukherjee is a postdoc at Bar Ilan University, Ramat Gan, Israel. He completed his PhD from the Indian Institute of Science, Bangalore. Manuj has also spent a couple of years in Telecom Paris (erstwhile Telecom ParisTech), Paris, France, as a postdoc before joining Bar Ilan University.


March 4 (moved up to 2 pm)

Speaker: Andrew Stolman

Title: Challenges in geometric graph embeddings

Abstract: Many machine learning methods rely on exploiting the geometry of real-valued data; however, network-structured data is fundamentally discrete and combinatorial. To bridge this gap, there is a growing body of work on learning real-valued embeddings of graph vertices, which can then be fed to downstream ML tasks; this line of work includes DeepWalk, Node2Vec, and Graph Convolutional Networks. It appears to take for granted that an embedding into a low-dimensional geometry that preserves desirable properties exists. In this talk, I will cover some recent theoretical and practical results that cast doubt on this assumption.


February 25

Speaker: Zhaowei Zhu

Title: Learning with instance-dependent label noise: A sample sieve approach

Abstract: In this talk, I will continue to introduce peer loss, and the confidence regularizer inspired by peer loss, to help mitigate the effects of noisy labels. Based on the confidence regularizer, we further design a dynamic sample sieve to safely filter out the corrupted instances. This work was accepted at ICLR 2021. Both theoretical and empirical results will be presented.


February 18

Speaker: C. Seshadhri

Title: Random walks and forbidden minors III: poly(d/ε)-time partition oracles for minor-free graph classes [paper]

Abstract: Consider the family of bounded degree graphs in any minor-closed family (such as planar graphs). Let d be the degree bound and n be the number of vertices of such a graph. Graphs in these classes have hyperfinite decompositions, where, for a sufficiently small ε > 0, one removes εdn edges to get connected components of size independent of n. An important tool for sublinear algorithms and property testing for such classes is the partition oracle, introduced by the seminal work of Hassidim-Kelner-Nguyen-Onak (FOCS 2009). A partition oracle is a local procedure that gives consistent access to a hyperfinite decomposition, without any preprocessing. Given a query vertex v, the partition oracle outputs the component containing v in time independent of n. All the answers are consistent with a single hyperfinite decomposition.
An important open problem in property testing is constructing partition oracles that run in poly(d/ε) time per query. We resolve this open problem, obtaining poly(d/ε)-query algorithms for additive εn-approximations for problems such as maximum matching, minimum vertex cover, maximum independent set, and minimum dominating set for these graph families.
Joint work with Akash Kumar and Andrew Stolman.


February 11

Speaker: Sabyasachi Basu

Title: Property testing in hyperfinite graphs [paper]

Abstract: We briefly go over the general idea behind property testing, and talk about property testing in the bounded degree model and some special graphs. More specifically, we talk about some major results due to Newman and Sohler pertaining to a special class of graphs called hyperfinite graphs. These can broadly be seen as an extension of the class of planar graphs, and the results use the characterization of such graphs using structures called k-discs. The results also provide a useful characterization of testable properties.


February 4

Speaker: Reilly Raab

Title: Bayesian Truth Serum: Overview, Extensions and Limitations

Abstract: The Bayesian Truth Serum, proposed by Drazen Prelec in 2004, is an interesting and seminal mechanism for eliciting honest information from independent agents in Nash equilibrium without requiring the mechanism to have access to ground truth, though it makes several limiting assumptions. Several attempts have been made to generalize the mechanism to more robust settings, with mixed success. I will present the mechanism, discuss how and why it works, and detail my own work generalizing it to new settings. 
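The mechanism's score has a concrete form: each agent reports an answer plus a prediction of the population's answer distribution, and is paid an "information score" (how surprisingly common their answer is relative to the geometric mean of predictions) plus a "prediction score" (how well their prediction matches the empirical frequencies). A minimal numpy sketch of the scoring rule, with smoothing added by us to avoid log(0); the four-agent example is hypothetical:

```python
import numpy as np

def bts_scores(answers, predictions, alpha=1.0):
    """Bayesian Truth Serum scores (a sketch of Prelec's mechanism).

    answers:     length-n array of answer indices in {0, ..., m-1}
    predictions: (n, m) array; row i is agent i's predicted population
                 distribution over answers
    """
    n, m = predictions.shape
    eps = 1e-12                                            # avoid log(0)
    xbar = np.bincount(answers, minlength=m) / n           # empirical frequencies
    ybar = np.exp(np.log(predictions + eps).mean(axis=0))  # geometric mean of predictions

    # information score: reward surprisingly common answers
    info = np.log((xbar[answers] + eps) / (ybar[answers] + eps))
    # prediction score: a KL penalty for mispredicting the population
    pred = alpha * (xbar * np.log((predictions + eps) / (xbar + eps))).sum(axis=1)
    return info + pred

answers = np.array([0, 0, 0, 1])
predictions = np.array([[0.75, 0.25], [0.1, 0.9], [0.5, 0.5], [0.6, 0.4]])
scores = bts_scores(answers, predictions)
print(np.round(scores, 3))
```

Agents 0 and 1 endorse the same answer, so they receive the same information score; agent 0 scores higher overall because their prediction matches the empirical frequencies exactly, making the KL prediction penalty zero.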


January 21

Speaker: Raghuvansh R. Saxena (graduate student, Princeton University)

Title: Interactive Error Resilience and the Surprising Power of Feedback

Abstract: Interactive error correcting codes are codes that encode a two-party communication protocol into an error-resilient protocol that succeeds even if a constant fraction of the communicated symbols are adversarially corrupted, at the cost of increasing the communication by a constant factor. The fraction of corruptions that such codes can protect against is called the error resilience. Several recent results have shown that drastic gains in the error resilience can be achieved by using interactive codes that implement "feedback". I shall be reviewing (at least) two of these works in this talk.
Based on joint works with Klim Efremenko and Gillat Kol.


January 14

Speaker: Yang Liu

Title: Learning in the Presence of Label Noise

Abstract: In this talk, I will first explain the potential harms of training with corrupted labels, including drops in training accuracy and violations of fairness guarantees. I will then discuss popular solutions in the literature and their caveats. If time permits, I will present a set of new results from my group on a family of robust loss functions, which I call peer loss, that help mitigate the effects of noisy labels. Both theoretical and empirical results will be presented.


January 7

Speaker: Noujan Pashanasangi

Title: Near-Linear Time Homomorphism Counting in Bounded Degeneracy Graphs: The Barrier of Long Induced Cycles [paper]

Abstract: Counting homomorphisms of a constant sized pattern graph H in an input graph G is a fundamental computational problem. There is a rich history of studying the complexity of this problem, under various constraints on the input G and the pattern H. Given the significance of this problem and the large sizes of modern inputs, we investigate when near-linear time algorithms are possible. We focus on the case when the input graph has bounded degeneracy, a commonly studied and practically relevant class for homomorphism counting. It is known from previous work that for certain classes of H, H-homomorphisms can be counted exactly in near-linear time in bounded degeneracy graphs. Can we precisely characterize the patterns H for which near-linear time algorithms are possible? We completely resolve this problem, discovering a clean dichotomy using fine-grained complexity. Let m denote the number of edges in G. We prove the following: if the largest induced cycle in H has length at most 5, then there is an O(m log m) algorithm for counting H-homomorphisms in bounded degeneracy graphs. If the largest induced cycle in H has length at least 6, then (assuming standard fine-grained complexity conjectures) there is a constant y > 0 such that there is no o(m^{1+y})-time algorithm for counting H-homomorphisms.
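The classical special case behind this line of work is counting triangle homomorphisms in O(m·d) time via a degeneracy ordering: orient each edge from lower to higher rank in the ordering, so every out-neighborhood has size O(d), and count adjacent pairs of out-neighbors. A sketch of that standard technique (not the paper's general algorithm):

```python
import heapq
from collections import defaultdict
from itertools import combinations

def triangle_homomorphisms(edges):
    """Count triangle homomorphisms (= 6 * #triangles) using a degeneracy
    ordering; runtime is O(m * d) for a graph of degeneracy d."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    # peel minimum-degree vertices to obtain a degeneracy ordering
    deg = {v: len(adj[v]) for v in adj}
    heap = [(d, v) for v, d in deg.items()]
    heapq.heapify(heap)
    order, removed = [], set()
    while heap:
        d, v = heapq.heappop(heap)
        if v in removed or d != deg[v]:
            continue                      # stale heap entry
        removed.add(v)
        order.append(v)
        for u in adj[v]:
            if u not in removed:
                deg[u] -= 1
                heapq.heappush(heap, (deg[u], u))

    rank = {v: i for i, v in enumerate(order)}
    # orient edges from lower to higher rank; out-neighborhoods have size O(d)
    out = {v: [u for u in adj[v] if rank[u] > rank[v]] for v in adj}

    triangles = sum(1 for v in adj
                    for u, w in combinations(out[v], 2)
                    if w in adj[u])
    return 6 * triangles   # each triangle yields 6 homomorphisms

# K4 has 4 triangles, hence 24 triangle homomorphisms
k4 = [(a, b) for a, b in combinations(range(4), 2)]
print(triangle_homomorphisms(k4))   # → 24
```

The triangle has no induced cycle longer than 3, so it sits squarely on the O(m log m) side of the paper's dichotomy; the hard side begins with patterns whose largest induced cycle has length 6.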

Spring 2020

Monday 2:00 pm - 3:00 pm

Zoom (link in email announcements)

April 27

Speaker: Sabyasachi Basu

Title: Semi-supervised learning with graphs and harmonic functions [paper]

Abstract: This is based on a paper from 2003 by Zhu, Ghahramani and Lafferty, that commonly goes by the name ZGL. Their work focuses on semi-supervised learning on graphs, where labelled data is scarce. The authors pose the learning problem as a Gaussian random field on the graph, and the natural choice for the objective is then minimised by a harmonic function. Such a harmonic function can be interpreted in terms of random walks, which implies connections between the learning problem and topics in electrical networks and spectral graph theory.
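The harmonic solution has a closed form: clamp the labeled nodes and solve the Laplacian linear system on the unlabeled block, f_u = -L_uu^{-1} L_ul f_l. A minimal numpy sketch on a hypothetical 4-node path graph:

```python
import numpy as np

def zgl_harmonic(W, labeled_idx, labels):
    """ZGL semi-supervised learning: clamp labeled nodes and solve for the
    harmonic function on the rest, f_u = -L_uu^{-1} L_ul f_l.

    W: symmetric (n, n) weight matrix; labels: values for labeled_idx.
    """
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W                    # graph Laplacian
    u = np.setdiff1d(np.arange(n), labeled_idx)       # unlabeled nodes
    f = np.zeros(n)
    f[labeled_idx] = labels
    f[u] = np.linalg.solve(L[np.ix_(u, u)],
                           -L[np.ix_(u, labeled_idx)] @ labels)
    return f

# path graph 0-1-2-3 with endpoints labeled 0 and 1
W = np.zeros((4, 4))
for i in range(3):
    W[i, i + 1] = W[i + 1, i] = 1.0
f = zgl_harmonic(W, np.array([0, 3]), np.array([0.0, 1.0]))
print(f)   # → [0.0, 1/3, 2/3, 1.0]
```

On a path the harmonic function is exactly linear interpolation between the labeled endpoints, which matches the random-walk interpretation: f(v) is the probability that a walk from v hits a 1-labeled node before a 0-labeled one.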


April 20

Speaker: Andrew Stolman

Title: Inductive Representation Learning on Large Graphs [paper]

Abstract: Low-dimensional embeddings of nodes in large graphs have proved extremely useful in a variety of prediction tasks, from content recommendation to identifying protein functions. However, most existing approaches require that all nodes in the graph are present during training of the embeddings; these previous approaches are inherently transductive and do not naturally generalize to unseen nodes. Here we present GraphSAGE, a general inductive framework that leverages node feature information (e.g., text attributes) to efficiently generate node embeddings for previously unseen data. Instead of training individual embeddings for each node, we learn a function that generates embeddings by sampling and aggregating features from a node’s local neighborhood. Our algorithm outperforms strong baselines on three inductive node-classification benchmarks: we classify the category of unseen nodes in evolving information graphs based on citation and Reddit post data, and we show that our algorithm generalizes to completely unseen graphs using a multi-graph dataset of protein-protein interactions.
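The inductive step is a learned aggregation: each node's new representation combines its own features with an aggregate (here, the mean) of its neighbors' features through a shared weight matrix, so the same parameters apply to nodes never seen in training. A minimal numpy sketch of one mean-aggregator layer with random weights (the toy graph and dimensions are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def sage_layer(H, adj, W):
    """One GraphSAGE-style layer with the mean aggregator (a sketch):
    h_v' = ReLU(W @ [h_v ; mean of neighbor features]), then L2-normalized.

    H: (n, d) node features; adj: list of neighbor index lists; W: (d_out, 2d).
    """
    agg = np.stack([H[nbrs].mean(axis=0) if nbrs else np.zeros(H.shape[1])
                    for nbrs in adj])
    Z = np.concatenate([H, agg], axis=1) @ W.T
    Z = np.maximum(Z, 0.0)                       # ReLU
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    return Z / np.maximum(norms, 1e-12)          # L2-normalize, as in the paper

# toy path graph 0-1-2 with 4-dimensional features
adj = [[1], [0, 2], [1]]
H = rng.normal(size=(3, 4))
W = rng.normal(size=(8, 8))                      # (d_out=8, 2d=8)
out = sage_layer(H, adj, W)
print(out.shape)   # → (3, 8)
```

Because only W is learned, embedding a previously unseen node just requires its features and a sample of its neighborhood, which is what makes the method inductive rather than transductive.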


April 6

Speaker: C. Seshadhri

Title: The impossibility of low-rank representations for triangle-rich complex networks, part 2 [video][paper]

Abstract: A common and increasingly prevalent technique for machine learning in graphs is "low-dimensional graph representation". Given a graph (like a social network), one converts the nodes to vectors in a low-dimensional space. These vectors become "features" for the nodes, which are then used for various ML tasks. Implicitly, these conversions are based on models of social networks, where proximity in low-dimensional space corresponds to edges of a graph. We argue that this model, contrary to popular belief, is fundamentally flawed. We formally prove that low-dimensional embeddings fail to capture the triangle structure of real social networks, thereby failing to model the social structure of the graph. We do a series of empirical studies that validate our theory.


March 30

Speaker: C. Seshadhri

Title: The impossibility of low-rank representations for triangle-rich complex networks, part 1 [video][paper]

Abstract: A common and increasingly prevalent technique for machine learning in graphs is "low-dimensional graph representation". Given a graph (like a social network), one converts the nodes to vectors in a low-dimensional space. These vectors become "features" for the nodes, which are then used for various ML tasks. Implicitly, these conversions are based on models of social networks, where proximity in low-dimensional space corresponds to edges of a graph. We argue that this model, contrary to popular belief, is fundamentally flawed. We formally prove that low-dimensional embeddings fail to capture the triangle structure of real social networks, thereby failing to model the social structure of the graph. We do a series of empirical studies that validate our theory.

Winter 2020: Learning Augmented Algorithms

Monday 2:00 pm - 3:00 pm

E2 - 399

February 24

Speaker: Yatong Chen

Title: Decoupled smoothing on graphs

Abstract: Graph smoothing methods are an extremely popular family of approaches for semi-supervised learning. The choice of graph used to represent relationships in these learning problems is often a more important decision than the particular algorithm or loss function used, yet this choice is less well-studied in the literature. In this work, we demonstrate that for social networks, the basic friendship graph itself may often not be the appropriate graph for predicting node attributes using graph smoothing. More specifically, standard graph smoothing is designed to harness the social phenomenon of homophily whereby individuals are similar to “the company they keep.” We present a decoupled approach to graph smoothing that decouples notions of “identity” and “preference,” resulting in an alternative social phenomenon of monophily whereby individuals are similar to “the company they're kept in,” as observed in recent empirical work. Our model results in a rigorous extension of the Gaussian Markov Random Field (GMRF) models that underlie graph smoothing, interpretable as smoothing on an appropriate auxiliary graph of weighted or unweighted two-hop relationships.
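The auxiliary graph mentioned at the end is easy to construct: weight each pair of nodes by their number of common neighbors (two-hop relationships), then run standard smoothing on that graph instead of the friendship graph. A minimal numpy sketch on a hypothetical star graph:

```python
import numpy as np

def two_hop_graph(A):
    """Auxiliary graph for decoupled smoothing (a sketch): W2[u, v] counts
    common neighbors of u and v, i.e. weighted two-hop relationships.
    Standard smoothing on W2 targets monophily rather than homophily."""
    W2 = A @ A
    np.fill_diagonal(W2, 0)     # drop self-loops from length-2 closed walks
    return W2

# star graph: center 0 joined to leaves 1..3; the leaves share no edges,
# but all become mutually connected in the two-hop graph
A = np.zeros((4, 4))
A[0, 1:] = A[1:, 0] = 1.0
W2 = two_hop_graph(A)
print(W2)
```

The star example shows the point of monophily: leaves that are not friends with each other, but are "kept in the same company," become neighbors in the auxiliary graph and hence smooth each other's labels.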


February 10

Speaker: Konstantinos Zampetakis

Title: Coloring triangle-free graphs via semi-random methods

Abstract: The chromatic number of a graph is the least number of colors needed to color the vertices of a graph such that adjacent vertices attain different colors. For general graphs the chromatic number can be as large as the maximum degree plus one (e.g. cliques); however, we should be able to improve on this bound when considering graphs satisfying certain sparsity conditions. Specifically, we will consider graphs that contain no triangles (triangle-free graphs) and prove that their chromatic number is bounded by O(\Delta/\log \Delta), where \Delta is the maximum degree, using a seminal technique from probabilistic combinatorics known as the "semi-random method".
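The maximum-degree-plus-one baseline is achieved by greedy coloring: scan the vertices in any order and give each the smallest color not used by an already-colored neighbor. A minimal sketch, run on a triangle-free 5-cycle (the talk's semi-random method is what improves this baseline for triangle-free graphs):

```python
def greedy_coloring(adj):
    """Greedy coloring: each vertex gets the smallest color unused among its
    already-colored neighbors, so at most max-degree + 1 colors are needed.

    adj: dict mapping each vertex to its list of neighbors.
    """
    color = {}
    for v in adj:
        used = {color[u] for u in adj[v] if u in color}
        # deg(v) neighbors can block at most deg(v) colors, so one of the
        # first deg(v) + 1 colors is always free
        color[v] = next(c for c in range(len(adj[v]) + 1) if c not in used)
    return color

# 5-cycle: triangle-free, max degree 2, so greedy uses at most 3 colors
c5 = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
colors = greedy_coloring(c5)
print(colors)
```

Greedy cannot exploit triangle-freeness; the semi-random method does, by coloring in randomized rounds and arguing that the palette of available colors at each vertex shrinks slowly enough to leave \Omega(\log \Delta) rounds of progress.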


February 3

Speaker: C. Seshadhri

Title: Self-improving algorithms

Abstract: How does one beat worst-case analysis for, say, a simple problem like sorting? Typically, in the real-world, inputs come from some (ill-defined) "source", and our aim is to process inputs from that source. If one could exploit structure in the inputs, one may be able to design faster algorithms. This talk is about a theoretical framework for such results, called self-improving algorithms. We assume an unknown but fixed distribution of inputs. As it processes more inputs, the self-improving algorithm learns about the distribution and eventually converges to the optimal algorithm for that distribution. We will specifically look at self-improving sorting algorithms.
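A toy in this spirit: learn bucket boundaries from sample inputs drawn from the fixed (unknown) distribution, then sort new inputs by bucketing followed by small per-bucket sorts. This is a sketch of the idea only, not the actual self-improving sorter (which learns per-position search structures and matches entropy lower bounds):

```python
import numpy as np

class SelfImprovingSorter:
    """Toy self-improving sorter sketch: a training phase learns quantile
    bucket boundaries from the input distribution; the operation phase sorts
    by bucketing, which is fast when the learned buckets are balanced."""

    def __init__(self, training_inputs, n_buckets=32):
        pooled = np.concatenate(training_inputs)
        qs = np.linspace(0, 1, n_buckets + 1)[1:-1]
        self.bounds = np.quantile(pooled, qs)     # learned from the source

    def sort(self, x):
        x = np.asarray(x)
        which = np.searchsorted(self.bounds, x)   # assign each item a bucket
        buckets = [np.sort(x[which == b]) for b in range(len(self.bounds) + 1)]
        return np.concatenate(buckets)            # buckets partition the line

rng = np.random.default_rng(0)
source = lambda: rng.normal(size=100)             # fixed input distribution
sorter = SelfImprovingSorter([source() for _ in range(50)])
x = source()
y = sorter.sort(x)
print(np.array_equal(y, np.sort(x)))   # → True
```

The output is always correctly sorted regardless of the learned boundaries; what the training phase buys is speed, since inputs from the learned distribution spread evenly across buckets.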


January 27

Speaker: Shweta Jain

Title: The power of pivoting for exact clique counting


January 13

Speaker: Andrew Stolman

Title: Learning-Based Frequency Estimation Algorithms


January 6

Speaker: Akash Kumar

Title: Online Convex Optimization and Regularity Lemmas (based on Lecture notes by Luca Trevisan)

Related Courses
  1. Learning-Augmented Algorithms (6.890) by Piotr Indyk and Costis Daskalakis, Spring 2019 (MIT)
  2. Summer Workshop on Learning-Based Algorithms at TTIC
Suggested List of Papers
  1. Scheduling with Predictions and the Price of Misprediction. Michael Mitzenmacher. ITCS 2020.
  2. Learning Space Partitions for Nearest Neighbor Search. Yihe Dong, Piotr Indyk, Ilya Razenshteyn, Tal Wagner. ICLR 2020.
  3. Online Algorithms for Rent-Or-Buy with Expert Advice. Sreenivas Gollapudi, Debmalya Panigrahi. ICML 2019.
  4. Learning-Based Frequency Estimation Algorithms. Chen-Yu Hsu, Piotr Indyk, Dina Katabi, Ali Vakilian. ICLR 2019. 
  5. Learning-Based Low-Rank Approximations. Piotr Indyk, Ali Vakilian, Yang Yuan. NeurIPS 2019 Workshop.
  6. Improving Online Algorithms via ML Predictions. Ravi Kumar, Manish Purohit, Zoya Svitkina. NeurIPS 2018.
  7. A Model for Learned Bloom Filters and Optimizing by Sandwiching. Michael Mitzenmacher. NeurIPS 2018.
  8. Competitive caching with machine learned advice. Thodoris Lykouris, Sergei Vassilvitskii. ICML 2018.
  9. Revenue Optimization with Approximate Bid Predictions. Andres Muñoz Medina and Sergei Vassilvitskii. NeurIPS 2017.
  10. (Learned) Frequency Estimation Algorithms under Zipfian Distribution. Anders Aamand, Piotr Indyk, Ali Vakilian.
Blogs and Talks
  1. Application-Specific Algorithm Selection by Tim Roughgarden, Simons Institute Open Lecture, 2016. [videos] [slides]
  2. Data Driven Algorithm Design by Nina Balcan, Plenary talk at 24th Annual LIDS Student Conference, MIT 2019. [videos] [slides]
  3. Learning in Algorithms by Sergei Vassilvitskii, Survey talk at 4th Highlights of Algorithms (HALG 2019). [slides]
  4. Learning-Based Frequency Estimation Algorithms by Piotr Indyk at Simons. [videos]
  5. Learning-Augmented Algorithms: How ML Can Lead to Provably Better Algorithms by Michael Mitzenmacher at ALGO 2019. [slides]