Abstracts of IMaCCS 2016
Sanjeev Arora, Princeton University 
Linear Algebraic Structure of Word Meanings
In Natural Language Processing (NLP), semantic word embeddings use vectors to capture the meanings of words. They are often constructed using nonlinear/nonconvex techniques such as deep nets and energy‐based models. Recently, Mikolov et al. (2013) showed that such embeddings exhibit linear structure that can be used to solve "word analogy tasks" such as man: woman :: king: ??. Subsequently, Levy and Goldberg (2014) and Pennington et al. (2014) tried to explain why such a linear structure should arise in embeddings derived from nonlinear methods. 
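The linear structure behind such analogy tasks can be illustrated directly: a : b :: c : ? is answered by a nearest-neighbor search around vec(b) − vec(a) + vec(c). Below is a minimal sketch with tiny hand-made 3-d vectors; the vectors and vocabulary are purely illustrative, not real embedding output.

```python
import numpy as np

# Toy 3-d "word vectors" (hypothetical values chosen for illustration only).
vecs = {
    "man":   np.array([1.0, 0.0, 0.2]),
    "woman": np.array([0.0, 1.0, 0.2]),
    "king":  np.array([1.0, 0.1, 0.9]),
    "queen": np.array([0.1, 1.1, 0.9]),
}

def analogy(a, b, c, vocab):
    """Solve a : b :: c : ? by the linear rule vec(b) - vec(a) + vec(c)."""
    target = vocab[b] - vocab[a] + vocab[c]
    # Return the nearest remaining word by cosine similarity.
    best, best_sim = None, -np.inf
    for w, v in vocab.items():
        if w in (a, b, c):
            continue
        sim = v @ target / (np.linalg.norm(v) * np.linalg.norm(target))
        if sim > best_sim:
            best, best_sim = w, sim
    return best

print(analogy("man", "woman", "king", vecs))  # queen
```

In real embeddings the same arithmetic is run over a vocabulary of hundreds of thousands of words, which is what makes the emergence of this linearity from nonconvex training surprising.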
Hari Balakrishnan, Massachusetts Institute of Technology 
TCP Ex Machina: Computer-Synthesized Congestion Control
In computer networks, a fundamental problem is congestion control: how should endpoints transmit data to ensure good performance for their transfers and allocate network resources properly? This problem has received much attention over the past thirty years. The design of such protocols, and indeed similar algorithms in all fields of computer science, has largely been a creative, human effort. In this talk, I ask: Is it possible for a computer program to "design" (or "discover") the right rules for congestion control? Should computers, rather than humans, be tasked with developing protocols? And just how well can we make computers perform this task? I will describe Remy, a software program and research agenda to answer these questions. Remy seeks to optimize explicitly-stated objectives for network applications and create rules that endpoints can follow. The most surprising result we have found is that, in many cases, Remy not only matches what the best network researchers have invented, but handily surpasses them in performance. I will also describe a new protocol designed by humans using insights from Remy.
This project is joint work with my students Keith Winstein (now on the faculty at Stanford), Anirudh Sivaraman, Venkat Arun, and Pratiksha Thaker. 
Avrim Blum, Carnegie Mellon University 
Reconstructing Preferences from Opaque Transactions
In this talk I will discuss the problem of learning the preferences of agents in a complex system by observing the outcomes of a series of joint interactions. This is joint work with Yishay Mansour and Jamie Morgenstern. 
Robert Calderbank, Duke University 
GraphConnect: A Framework for Regularizing Neural Networks
Deep neural networks have proved very successful in domains where large training sets are available, but when the number of training samples is small, their performance suffers from overfitting. Prior methods of reducing overfitting such as weight decay, DropOut and DropConnect are data-independent. We shall describe GraphConnect, a complementary method for regularizing deep networks that is data-dependent, and is motivated by the observation that data of interest typically lie close to a manifold. Our proposed method encourages the relationships between the learned decisions to resemble a graph representing the original manifold structure. Essentially, GraphConnect is designed to learn attributes that are present in data samples, in contrast to weight decay, DropOut and DropConnect, which are simply designed to make it more difficult to fit to random error or noise. Analysis of empirical Rademacher complexity suggests that GraphConnect is a stronger regularizer than weight decay. Experimental results on several benchmark datasets validate the theoretical analysis, and show that when the number of training samples is small, GraphConnect is able to significantly improve performance over weight decay, and when used in isolation, is competitive with DropOut. 
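The manifold idea can be made concrete with the standard graph-Laplacian penalty from this family of regularizers: if sample i and sample j are neighbors on a data graph with affinity W_ij, penalize the distance between their learned representations z_i and z_j. This is a generic sketch of that penalty, not necessarily GraphConnect's exact objective.

```python
import numpy as np

def graph_laplacian(W):
    """Unnormalized Laplacian L = D - W of a symmetric affinity matrix W."""
    return np.diag(W.sum(axis=1)) - W

def manifold_penalty(Z, W):
    """Pairwise penalty sum_{i,j} W_ij * ||z_i - z_j||^2, computed via the
    standard identity 2 * tr(Z^T L Z).  Rows of Z are learned representations."""
    L = graph_laplacian(W)
    return 2.0 * np.trace(Z.T @ L @ Z)

Z = np.array([[0.0, 0.0], [1.0, 1.0]])   # representations of two samples
W = np.array([[0.0, 1.0], [1.0, 0.0]])   # the two samples are graph neighbors
print(manifold_penalty(Z, W))  # 4.0
```

During training such a term would be added to the task loss, pulling representations of graph-neighboring samples together, whereas weight decay penalizes all weights uniformly regardless of the data.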
Peter Glynn, Stanford University 
Multilevel Monte Carlo Algorithms for Computing Gradients and Relative Value Functions
In this talk, we start with an overview of existing Monte Carlo methods related to computation of sensitivities/gradients for Markov chains in which the transition dynamics depend on a parameterized family of policies. We then discuss a new class of algorithms that use multilevel Monte Carlo (MLMC) ideas to facilitate the computation of such sensitivities and that exhibit faster rates of convergence for steady-state quantities. We also show how MLMC can be used to efficiently compute the relative value function for a given policy at a given state x, and discuss other related applications of MLMC ideas. This work is joint with Zeyu Zheng. 
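The core MLMC device is the telescoping sum E[P_L] = E[P_0] + Σ_{l=1}^{L} E[P_l − P_{l−1}], where each correction term has low variance because both levels are evaluated on the same random input. The toy sketch below stands in for an expensive level-l simulation by quantizing the same random draw at resolution 2^−l; it illustrates the estimator's structure only, not the Markov-chain applications of the talk.

```python
import random

def P(z, level):
    """Level-l approximation of the quantity of interest: here z**2 with z
    quantized to a grid of spacing 2**-level (a stand-in for an expensive
    fine-resolution simulation)."""
    h = 2.0 ** -level
    return (round(z / h) * h) ** 2

def mlmc_estimate(L, n_per_level, rng):
    """Telescoping estimator E[P_L] = E[P_0] + sum_{l=1}^{L} E[P_l - P_{l-1}].
    The coupling that makes the corrections low-variance comes from evaluating
    both levels on the same underlying draw z."""
    total = 0.0
    for level in range(L + 1):
        acc = 0.0
        for _ in range(n_per_level):
            z = rng.gauss(0.0, 1.0)
            fine = P(z, level)
            coarse = P(z, level - 1) if level > 0 else 0.0
            acc += fine - coarse
        total += acc / n_per_level
    return total

print(mlmc_estimate(L=6, n_per_level=20000, rng=random.Random(0)))  # close to E[Z^2] = 1
```

A real implementation would allocate fewer samples to the finer (more expensive) levels, which is where MLMC's complexity savings come from.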
Mor Harchol-Balter, Carnegie Mellon University 
Queueing with Redundant Requests: A More Realistic Model
Recent computer systems research has proposed using redundant requests to reduce latency. The idea is to replicate a request so that it joins the queue at multiple servers. The request is considered complete as soon as any one copy of the request completes. Redundancy is beneficial because it allows us to overcome server-side variability: the fact that the server we choose might be temporarily slow, due to factors like background load, network interrupts, garbage collection, and so on. When server-side variability dominates runtime, replicating requests can greatly reduce their response times.
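The benefit is easy to see in a simulation of service times alone (queueing effects ignored, so this is a deliberately simplified picture): take the minimum of k i.i.d. copies, each of which is occasionally hit by a slowdown. The slowdown model below is an illustrative assumption, not the talk's model.

```python
import random

def service_time(rng):
    """One copy's service time: Exp(1) at baseline, but slowed 10x with
    probability 0.1 -- a crude stand-in for server-side variability
    (background load, GC pauses, and so on)."""
    t = rng.expovariate(1.0)
    return 10.0 * t if rng.random() < 0.1 else t

def mean_response(k, trials, seed=0):
    """Mean completion time when each request is replicated to k servers and
    finishes as soon as the first copy does."""
    rng = random.Random(seed)
    total = sum(min(service_time(rng) for _ in range(k)) for _ in range(trials))
    return total / trials

print(mean_response(1, 100_000))  # about 1.9 = 0.9*1 + 0.1*10
print(mean_response(3, 100_000))  # far smaller: the min filters out slow copies
```

The full queueing analysis is subtler, since redundant copies also add load to the servers they occupy, which is exactly the tension a more realistic model must capture.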

Babak Hassibi, California Institute of Technology 
Crowdsourcing for Clustering: Algorithms and Information-Theoretic Lower Bounds
We consider the problem of clustering unlabelled data using a crowd of non-experts. Take, for example, a database of dog (or bird) images that we would like to cluster into breeds (or species) with the collective help of workers who are not experts in dog breeds (or bird species). Questions of interest include what queries to ask, how many, and what algorithms to use to reliably cluster the data. We focus on simple comparison queries (on pairs, triples and, perhaps, more items) and on the use of convex optimization clustering algorithms (that attempt to decompose a partially observed adjacency matrix into low rank and sparse components). In this setting, we obtain necessary and sufficient conditions for success of the clustering algorithm as a function of the number of queries, the number of data points, the number of clusters, the number of outliers, the size of the minimum cluster and the quality of the workers. We contrast this with information-theoretic lower bounds on the required number of queries obtained by viewing the crowdsourced clustering problem as one of joint source-channel coding. A key feature of the approach is to quantify the cost of a query with its entropy. We also present various empirical results to put the theoretical results into perspective. 
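The entropy-as-cost idea can be sketched with a binary "are these two items in the same cluster?" query: its cost is the Shannon entropy of the answer distribution. The equal-cluster prior below (probability of "same" is 1/k for k equal clusters) is an illustrative assumption, not necessarily the prior used in the talk.

```python
import math

def entropy_bits(probs):
    """Shannon entropy, in bits, of a discrete answer distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def pair_query_cost(k):
    """Entropy cost of a binary same-cluster query under a prior of k equal
    clusters, so the 'same' answer has probability p = 1/k."""
    p = 1.0 / k
    return entropy_bits([p, 1 - p])

print(pair_query_cost(2))             # 1.0
print(round(pair_query_cost(10), 3))  # 0.469
```

Under this accounting, queries whose answers are nearly predictable (many small clusters, so "different" dominates) are cheap, which is what allows an entropy-based lower bound to be tighter than a raw query count.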
Alfred Hero, University of Michigan at Ann Arbor 
Foundational Principles for Large Scale Graph Mining
Many problems in data science depend on mining large graphs for patterns such as communities and high centrality nodes. In most cases the graphs are noisy and negatively affect accuracy as measured by the error rates of putative discoveries and the downstream effect of such errors on the decision-maker. Foundational principles are essential for specifying conditions under which such errors can be controlled and minimized. 
Rong Jin, Michigan State University and Alibaba 
Large-scale Robust Online Matching and Its Application in E-commerce
This talk focuses on the large-scale matching problem, a key problem that has found numerous applications in e-commerce yet has not been studied extensively. I will particularly focus on two challenges of the large-scale matching problem. The first challenge is the robustness of the solution, as most quantities used in optimization are based on estimates that are often inaccurate and unreliable. I will introduce two different techniques for robust matching, one based on robust optimization theory and the other based on the theory of two-sided matching. The second challenge is how to make matching decisions in an online fashion. I will introduce an approach based on the theory of primal-dual online algorithms, where the key is to introduce appropriate discounted rewards to account for future unseen events. For all these algorithms, I will introduce their basic theory, performance guarantees, and empirical studies on Alibaba's e-commerce platform.
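A standard example of discounting rewards against future unseen events comes from the online primal-dual literature on budgeted allocation (the MSVV-style rule sketched below); it illustrates the flavor of the approach, not necessarily the speaker's algorithm.

```python
import math

def allocate_online(queries, budgets):
    """Online budgeted allocation: each arriving query goes to the bidder
    maximizing bid * (1 - exp(spent_fraction - 1)), a discount that trades
    immediate reward against saving budget for future unseen queries."""
    spent = {a: 0.0 for a in budgets}
    revenue = 0.0
    for query_bids in queries:          # one dict of bidder -> bid per query
        best, best_score = None, 0.0
        for a, b in query_bids.items():
            bid = min(b, budgets[a] - spent[a])   # can't bid beyond budget
            score = bid * (1.0 - math.exp(spent[a] / budgets[a] - 1.0))
            if bid > 0 and score > best_score:
                best, best_score = a, score
        if best is not None:
            pay = min(query_bids[best], budgets[best] - spent[best])
            spent[best] += pay
            revenue += pay
    return revenue, spent

revenue, spent = allocate_online(
    [{"a": 1.0}, {"a": 1.0, "b": 1.0}],
    {"a": 1.0, "b": 1.0},
)
print(revenue)  # 2.0: the second query falls to b after a's budget is spent
```

The exponential discount is what gives this family of algorithms its competitive-ratio guarantee against an adversarial arrival sequence; a pure greedy rule (always take the highest bid) can be made to waste budget early.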

Steven Low, California Institute of Technology 
Online optimization of power networks
We are at the cusp of a historical transformation of our power systems into a more sustainable, dynamic, intelligent, and distributed form with hundreds of millions of intelligent endpoints. The optimization and control of such a network requires solving power flow equations, which is well-known to be hard. The grid, however, solves them in real time at scale, and we propose to exploit it explicitly to carry out part of our optimization algorithm. This approach not only improves scalability, but also naturally adapts to evolving network conditions. In this talk, we present two examples.
The first example presents an online algorithm to solve an optimal power flow problem at a slow timescale on a radial network where the controllable devices continuously interact with the network that implicitly computes a power flow solution given a control action. Collectively these devices and the network implement a gradient projection algorithm in real time. We prove that the proposed algorithm converges to a set of local optima and provide sufficient conditions under which it converges to a global optimum. We derive an upper bound on the suboptimality gap of any local optimum. This bound suggests that any local optimum is almost as good as any strictly feasible point.
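The gradient projection scheme the devices collectively implement has a simple generic form: step against the gradient of the cost, then project back onto the feasible set. The toy below uses box constraints and a quadratic cost purely for illustration; the actual feasible set in the talk is defined by the power flow equations, which the physical network "computes" implicitly.

```python
import numpy as np

def project_box(x, lo, hi):
    """Projection onto box constraints (e.g. device capacity limits)."""
    return np.clip(x, lo, hi)

def projected_gradient(grad, x0, lo, hi, step=0.1, iters=500):
    """Gradient projection: move along -grad, then project back to feasibility."""
    x = x0
    for _ in range(iters):
        x = project_box(x - step * grad(x), lo, hi)
    return x

# Toy cost 0.5*||x - target||^2 with an infeasible target, minimized over a box.
target = np.array([2.0, -1.5])
x_star = projected_gradient(lambda x: x - target,
                            x0=np.zeros(2), lo=-1.0, hi=1.0)
print(x_star)  # converges to the clipped target [1, -1]
```

In the networked setting each device runs only its own coordinate of this update using locally measured quantities, which is what makes the algorithm implementable in real time.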
In the second example, the online algorithm integrates primary frequency regulation, secondary frequency regulation, and congestion management at a fast timescale. The algorithm is distributed in that it requires communication only between neighboring buses. Collectively, the controllable devices and the swing dynamics of the network implement a primal-dual algorithm to rebalance power, restore nominal frequency and inter-area flows, and enforce line limits at a minimum control cost. We prove sufficient conditions under which the algorithm converges to a global optimum. 
R. Srikant, University of Illinois at Urbana-Champaign 
Clustering Using Pairwise Comparisons
We consider a pairwise comparisons model with n users and m items. Each user is shown a few pairs of items, and when a pair of items is shown to a user, he or she expresses a preference for one of the items based on a probabilistic model. The goal is to group users into clusters so that users within each cluster have similar preferences. We will present an algorithm which clusters all users correctly with high probability using a number of pairwise comparisons which is within a polylog factor of an information-theoretic lower bound. Joint work with Barbara Dembin and Siddhartha Satpathi. 
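A small simulation shows why such clustering is possible: users sharing a ranking agree on most pairs even under noise, while users with opposite rankings mostly disagree. The noise model (answers follow the user's ranking with a fixed probability) and the agreement-threshold rule below are illustrative assumptions, not the talk's algorithm.

```python
import random

def simulate_user(pref, pairs, rng, p_correct=0.9):
    """A user's noisy answers: +1 if they prefer the first item of each pair.
    pref is the user's ranking; answers follow it with probability p_correct."""
    answers = []
    for i, j in pairs:
        truth = 1 if pref.index(i) < pref.index(j) else -1
        answers.append(truth if rng.random() < p_correct else -truth)
    return answers

def agree(a, b):
    """Fraction of shown pairs on which two users gave the same answer."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

rng = random.Random(1)
pairs = [(i, j) for i in range(6) for j in range(i + 1, 6)]
type_a = [0, 1, 2, 3, 4, 5]   # one cluster's shared ranking
type_b = [5, 4, 3, 2, 1, 0]   # the other cluster's ranking (reversed)
users = [simulate_user(type_a, pairs, rng) for _ in range(3)] + \
        [simulate_user(type_b, pairs, rng) for _ in range(3)]

# Naive clustering: group each user with user 0 iff they mostly agree.
cluster = [0 if agree(users[0], u) > 0.5 else 1 for u in users]
print(cluster)  # same-ranking users land in the same cluster (w.h.p.)
```

The interesting regime in the talk is when each user sees only a few pairs, so the agreement matrix is sparsely observed and this naive rule no longer works; getting within a polylog factor of the lower bound there requires a more careful algorithm.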
Demosthenis Teneketzis, University of Michigan at Ann Arbor 
Dynamic Games with Asymmetric Information: Common Information Based Perfect Bayesian Equilibria and Sequential Decomposition

Jean Walrand, University of California at Berkeley 
Learning in Networks
