Kevin Regan
PhD Computer Science
ML Engineer @ Google
Machine Learning
arXiv 20
Second Order Optimization Made Practical
R. Anil, V. Gupta, T. Koren, K. Regan and Y. Singer
arXiv Preprint
Optimization in machine learning, both theoretical and applied, is presently dominated by first-order gradient methods such as stochastic gradient descent. Second-order optimization methods, which involve second-order derivatives and/or second-order statistics of the data, are far less prevalent despite strong theoretical properties, due to their prohibitive computation, memory and communication costs. In an attempt to bridge this gap between theoretical and practical optimization, we present a proof-of-concept distributed system implementation of a second-order preconditioned method (specifically, a variant of full-matrix Adagrad) that, along with a few critical algorithmic and numerical improvements, provides significant practical gains in convergence on state-of-the-art deep models and yields actual wall-time improvements over conventional first-order methods. Our design effectively utilizes the prevalent heterogeneous hardware architecture for training deep models, which consists of a multicore CPU coupled with multiple accelerator units. We demonstrate superior performance on very large learning problems in machine translation, where our distributed implementation runs considerably faster than existing gradient-based methods.
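For intuition about the update the paper makes practical: below is a minimal single-machine NumPy sketch of a full-matrix AdaGrad step, with the preconditioner computed naively by eigendecomposition. The distributed preconditioner computation, numerical safeguards, and CPU/accelerator pipelining that are the paper's actual contribution are not shown; function and parameter names (and the step size and damping values) are illustrative only.

import numpy as np

def full_matrix_adagrad_step(w, g, G, lr=0.1, eps=1e-4):
    """One full-matrix AdaGrad update (illustrative sketch).

    w : parameter vector, shape (d,)
    g : gradient at w, shape (d,)
    G : running sum of gradient outer products, shape (d, d)
    """
    G += np.outer(g, g)  # accumulate second-order statistics of the gradients
    # Preconditioner G^{-1/2} via eigendecomposition: the expensive step whose
    # cost motivates the paper's distributed design.
    vals, vecs = np.linalg.eigh(G + eps * np.eye(len(w)))
    inv_sqrt_G = vecs @ np.diag(vals ** -0.5) @ vecs.T
    w -= lr * inv_sqrt_G @ g  # preconditioned gradient step
    return w, G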
NeurIPS 18
Large Margin Deep Networks for Classification
H. Mobahi, D. Krishnan, G. Elsayed, K. Regan and S. Bengio
Thirty-second Conference on Neural Information Processing Systems
We present a formulation of deep learning that aims at producing a large margin classifier. The notion of margin, minimum distance to a decision boundary, has served as the foundation of several theoretically profound and empirically successful results for both classification and regression tasks. However, most large margin algorithms are applicable only to shallow models with a preset feature representation; and conventional margin methods for neural networks only enforce margin at the output layer. Such methods are therefore not well suited for deep networks. In this work, we propose a novel loss function to impose a margin on any chosen set of layers of a deep network (including input and hidden layers). Our formulation allows choosing any lp norm (p >= 1) on the metric measuring the margin. We demonstrate that the decision boundary obtained by our loss has nice properties compared to standard classification loss functions. Specifically, we show improved empirical results on the MNIST, CIFAR-10 and ImageNet datasets on multiple tasks: generalization from small training sets, corrupted labels, and robustness against adversarial perturbations. The resulting loss is general and complementary to existing data augmentation (such as random/adversarial input transform) and regularization techniques (weight decay, dropout, and batch norm).
@inproceedings{elsayed2018large, title={Large Margin Deep Networks for Classification}, author={Elsayed, Gamaleldin and Krishnan, Dilip and Mobahi, Hossein and Regan, Kevin and Bengio, Samy}, booktitle={Advances in Neural Information Processing Systems}, pages={842--852}, year={2018} }
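As a concrete illustration of the margin measure above: for a linear classifier the lp distance from an input to the boundary between two classes has a closed form, the score gap divided by the dual norm of its gradient, and the paper applies this same first-order formula (via backpropagation) at arbitrary layers of a deep network. The NumPy sketch below, with illustrative names and a hinge-style penalty, shows the linear special case only.

import numpy as np

def margin_hinge_loss(W, x, y, gamma=1.0, p=2):
    """Hinge penalty on the lp margin of a linear classifier with scores W @ x."""
    q = p / (p - 1) if p > 1 else np.inf  # dual norm exponent: 1/p + 1/q = 1
    scores = W @ x
    loss = 0.0
    for j in range(len(scores)):
        if j == y:
            continue
        grad = W[y] - W[j]  # gradient of the class-y vs class-j score gap w.r.t. x
        dist = (scores[y] - scores[j]) / np.linalg.norm(grad, ord=q)
        loss += max(0.0, gamma - dist)  # penalize margins smaller than gamma
    return loss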
Robust Decision Making
IJCAI 11
Robust Online Optimization of Reward-uncertain MDPs
Kevin Regan and Craig Boutilier
Twenty-Second International Joint Conference on Artificial Intelligence
Imprecise-reward Markov decision processes (IRMDPs) are MDPs in which the reward function is only partially specified (e.g., by some elicitation process). Recent work using minimax regret to solve IRMDPs has shown, despite theoretical intractability, how the set of policies that are nondominated w.r.t. reward uncertainty can be exploited to accelerate regret computation. However, the number of nondominated policies is generally so large as to undermine this leverage. In this paper, we show how the quality of the approximation can be improved online by pruning/adding nondominated policies during reward elicitation, while maintaining computational tractability. Drawing insights from the POMDP literature, we also develop a new anytime algorithm for constructing the set of nondominated policies with provable (anytime) error bounds.
@inproceedings{ijcai2011:rb:a, author = {Kevin Regan and Craig Boutilier}, title = {Robust Online Optimization of Reward-uncertain MDPs}, booktitle = {Twenty-Second International Joint Conference on Artificial Intelligence (IJCAI 2011)}, year = {2011} }
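For intuition about the regret computations involved: a policy's value is linear in the reward vector (via its occupancy frequencies), so given value vectors for a set of nondominated policies and the vertices of the reward polytope, minimax regret reduces to the enumeration below. This toy NumPy sketch assumes both sets are small and explicitly given; avoiding exactly this enumeration is the point of the paper.

import numpy as np

def minimax_regret(F, R_vertices):
    """F: (n_policies, d) occupancy vectors, so V_pi(r) = F[pi] @ r.
    R_vertices: (n_vertices, d) vertices of the imprecise-reward polytope.
    Returns the minimax-regret policy index and its max regret."""
    values = F @ R_vertices.T                 # values[pi, v] = V_pi at vertex v
    best = values.max(axis=0)                 # best achievable value per vertex
    max_regret = (best - values).max(axis=1)  # adversary picks worst vertex/rival
    pi_star = int(max_regret.argmin())        # minimize the maximum regret
    return pi_star, float(max_regret[pi_star])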
IJCAI 11
Eliciting Additive Reward Functions for Markov Decision Processes
Kevin Regan and Craig Boutilier
Twenty-Second International Joint Conference on Artificial Intelligence
Specifying the reward function of a Markov decision process (MDP) can be demanding, requiring human assessment of the precise quality of, and tradeoffs among, various states and actions. However, reward functions often possess considerable structure which can be leveraged to streamline their specification. We develop new, decision-theoretically sound heuristics for eliciting rewards for factored MDPs whose reward functions exhibit additive independence. Since we can often find good policies without complete reward specification, we also develop new (exact and approximate) algorithms for robust optimization of imprecise-reward MDPs with such additive reward. Our methods are evaluated in two domains: autonomic computing and assistive technology.
@inproceedings{ijcai2011:rb:b, author = {Kevin Regan and Craig Boutilier}, title = {Eliciting Additive Reward Functions for Markov Decision Processes}, booktitle = {Twenty-Second International Joint Conference on Artificial Intelligence (IJCAI 2011)}, year = {2011} }
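A hedged sketch of the additive structure the paper exploits: the reward of a factored state decomposes into a sum of local subrewards, so elicitation can target one small factor at a time instead of every state. The factor names below are hypothetical, loosely echoing the autonomic-computing domain.

import numpy as np

def additive_reward(state, subrewards):
    """state: tuple of local feature values; subrewards[k][v] is the reward
    contribution of value v of factor k (additive independence)."""
    return sum(subrewards[k][v] for k, v in enumerate(state))

# Example with two hypothetical factors:
load_reward = np.array([1.0, 0.4, 0.0])  # low / medium / high server load
temp_reward = np.array([0.8, 0.2])       # cool / hot
print(additive_reward((0, 1), [load_reward, temp_reward]))  # 1.0 + 0.2 = 1.2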
AAAI 10
Robust Policy Computation in Reward-uncertain MDPs using Nondominated Policies
Kevin Regan and Craig Boutilier
Twenty-Fourth AAAI Conference on Artificial Intelligence
The precise specification of reward functions for Markov decision processes (MDPs) is often extremely difficult, motivating research into both reward elicitation and the robust solution of MDPs with imprecisely specified reward (IRMDPs). We develop new techniques for the robust optimization of IRMDPs, using the minimax regret decision criterion, that exploit the set of nondominated policies, i.e., policies that are optimal for some instantiation of the imprecise reward function. Drawing parallels to POMDP value functions, we devise a Witness-style algorithm for identifying nondominated policies. We also examine several new algorithms for computing minimax regret using the nondominated set, and examine both practically and theoretically the impact of approximating this set. Our results suggest that a small subset of the nondominated set can greatly speed up computation, yet yield very tight approximations to minimax regret.
@inproceedings{aaai2010rb, author = {Kevin Regan and Craig Boutilier}, title = {Robust Policy Computation in Reward-uncertain MDPs using Nondominated Policies}, booktitle = {Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2010)}, year = {2010} }
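One building block behind a Witness-style construction is testing whether a candidate policy is nondominated, i.e., optimal for some feasible reward. The SciPy sketch below maximizes a slack variable over a box-bounded reward set; the paper works with general linear constraints on the reward and with occupancy-frequency value vectors, so this is a simplification with illustrative names.

import numpy as np
from scipy.optimize import linprog

def is_nondominated(f, others, r_lo, r_hi):
    """f: (d,) value vector of the candidate policy; others: (n, d) rival value
    vectors; r_lo, r_hi: per-entry reward bounds. Maximizes delta subject to
    (f - g) @ r >= delta for every rival g and r in the box."""
    d = len(f)
    c = np.zeros(d + 1)
    c[-1] = -1.0  # variables are [r_1..r_d, delta]; maximize delta
    A_ub = np.hstack([others - f, np.ones((len(others), 1))])  # (g-f)@r + delta <= 0
    b_ub = np.zeros(len(others))
    bounds = [(lo, hi) for lo, hi in zip(r_lo, r_hi)] + [(None, 1.0)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.success and res.x[-1] >= 0.0  # some reward makes f (weakly) best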
UAI 09
Regret-based Reward Elicitation for Markov Decision Processes
Kevin Regan and Craig Boutilier
The 25th Conference on Uncertainty in Artificial Intelligence
The specification of a Markov decision process (MDP) can be difficult. Reward function specification is especially problematic; in practice, it is often cognitively complex and time-consuming for users to precisely specify rewards. This work casts the problem of specifying rewards as one of preference elicitation and aims to minimize the degree of precision with which a reward function must be specified while still allowing optimal or near-optimal policies to be produced. We first discuss how robust policies can be computed for MDPs given only partial reward information using the minimax regret criterion. We then demonstrate how regret can be reduced by efficiently eliciting reward information using bound queries, using regret-reduction as a means for choosing suitable queries. Empirical results demonstrate that regret-based reward elicitation offers an effective way to produce near-optimal policies without resorting to the precise specification of the entire reward function.
@inproceedings{uai09rb, author = {Kevin Regan and Craig Boutilier}, title = {Regret-based Reward Elicitation for Markov Decision Processes}, booktitle = {The 25th Conference on Uncertainty in Artificial Intelligence (UAI 2009)}, year = {2009} }
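A hedged sketch of bound-query elicitation in the spirit of the paper: each uncertain reward entry carries an interval [lo, hi], a query asks "is r_i at least b?", and the answer tightens one endpoint. The heuristic below simply halves the widest interval; the paper's stronger strategies choose queries by their expected effect on minimax regret.

import numpy as np

def next_bound_query(lo, hi):
    i = int(np.argmax(hi - lo))      # most uncertain reward entry
    return i, (lo[i] + hi[i]) / 2.0  # ask: is r_i at least the midpoint?

def apply_answer(lo, hi, i, b, answer_yes):
    if answer_yes:
        lo[i] = b  # user confirmed r_i >= b: raise the lower bound
    else:
        hi[i] = b  # user said r_i < b: lower the upper bound
    return lo, hi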
NIPS 08
Regret-based Reward Elicitation for Markov Decision Processes
Kevin Regan and Craig Boutilier
Workshop on Model Uncertainty and Risk in Reinforcement Learning held at the Conference on Neural Information Processing Systems
Traditional methods for finding optimal policies in stochastic, multi-step decision environments require a precise model of both the environment dynamics and the rewards associated with taking actions and the effects of those actions. While dynamics are often easily learnable through observation—and in many domains are stable across different users—reward functions are more problematic. In practice, it is often cognitively complex and time-consuming for users to precisely specify rewards. This work casts the problem of specifying rewards as one of preference elicitation. We will first discuss how robust policies can be computed for Markov Decision Processes given partial reward information using the minimax regret criterion. We will then show how regret can be reduced by efficiently eliciting reward information using bound queries. Regret-based elicitation of reward offers an efficient way to produce desirable policies without resorting to the precise specification of the entire reward function, and as such, opens up a new avenue for the design of stochastic controllers.
@inproceedings{nips08workshop, author = {Kevin Regan and Craig Boutilier}, title = {Regret-based Reward Elicitation for Markov Decision Processes}, booktitle = {NIPS-08 Workshop on Model Uncertainty and Risk in Reinforcement Learning}, year = {2008} }
ICML 06
An Analytic Solution to Discrete Bayesian Reinforcement Learning
Pascal Poupart and Nikos Vlassis and Jesse Hoey and Kevin Regan
The Twenty-Third International Conference on Machine Learning
Reinforcement learning (RL) was originally proposed as a framework to allow agents to learn in an online fashion as they interact with their environment. Existing RL algorithms come short of achieving this goal because the amount of exploration required is often too costly and/or too time consuming for online learning. As a result, RL is mostly used for offline learning in simulated environments. We propose a new algorithm, called BEETLE, for effective online learning that is computationally efficient while minimizing the amount of exploration. We take a Bayesian model-based approach, framing RL as a partially observable Markov decision process. Our two main contributions are the analytical derivation that the optimal value function is the upper envelope of a set of multivariate polynomials, and an efficient point-based value iteration algorithm that exploits this simple parameterization.
@inproceedings{icml06, author = {Pascal Poupart and Nikos Vlassis and Jesse Hoey and Kevin Regan}, title = {An Analytic Solution to Discrete Bayesian Reinforcement Learning}, booktitle = {The Twenty-Third International Conference on Machine Learning (ICML 2006)}, year = {2006} }
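For context on the Bayesian model-based setup behind BEETLE: uncertainty over the transition dynamics is a Dirichlet per (state, action) pair, updated in closed form from observed transitions, as the minimal sketch below shows. BEETLE's actual contribution, not shown here, is planning in this belief space by exploiting the fact that the optimal value function is the upper envelope of a set of multivariate polynomials.

import numpy as np

class DirichletModel:
    def __init__(self, n_states, n_actions, prior=1.0):
        # counts[s, a, s'] are the Dirichlet parameters for P(s' | s, a)
        self.counts = np.full((n_states, n_actions, n_states), prior)

    def observe(self, s, a, s_next):
        self.counts[s, a, s_next] += 1.0  # conjugate posterior update

    def expected_model(self):
        # Posterior mean transition probabilities
        return self.counts / self.counts.sum(axis=2, keepdims=True)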
Learning User Concepts
AAAI 10
Simultaneous Elicitation of Preference Features and Utility
Paolo Viappiani, Craig Boutilier and Kevin Regan
Twenty-Fourth AAAI Conference on Artificial Intelligence
ICML 09
Online Feature Elicitation in Interactive Optimization
Craig Boutilier, Kevin Regan and Paolo Viappiani
International Conference on Machine Learning
Most models of utility elicitation in decision support and interactive optimization assume a predefined set of "catalog" features over which user preferences are expressed. However, users may differ in the features over which they are most comfortable expressing their preferences. In this work we consider the problem of feature elicitation: a user's utility function is expressed using features whose definitions (in terms of "catalog" features) are unknown. We cast this as a problem of concept learning, but whose goal is to identify only enough about the concept to enable a good decision to be recommended. We describe computational procedures for identifying optimal alternatives w.r.t. minimax regret in the presence of concept uncertainty; and describe several heuristic query strategies that focus on reduction of relevant concept uncertainty.
@inproceedings{icml09brv, author = {Craig Boutilier and Kevin Regan and Paolo Viappiani}, title = {Online Feature Elicitation in Interactive Optimization}, booktitle = {The 26th International Conference on Machine Learning (ICML 2009)}, year = {2009} }
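A toy sketch of "learn only enough of the concept to decide": maintain a version space of candidate feature definitions, recommend by minimax regret over the definitions still alive, and stop once regret falls below a tolerance rather than identifying the concept exactly. The matrix U[x, c] below (the utility of alternative x if definition c is the user's true one) is assumed known; the paper also handles utility uncertainty, which this sketch omits.

import numpy as np

def recommend(U, alive, tol=1e-3):
    """U: (n_alts, n_defs) utilities; alive: boolean mask of concept definitions
    consistent with the answers so far. Returns (alternative, regret, done)."""
    Ua = U[:, alive]
    regret = (Ua.max(axis=0) - Ua).max(axis=1)    # max regret per alternative
    x = int(regret.argmin())
    return x, float(regret[x]), regret[x] <= tol  # done: good enough to recommend

def answer_membership_query(alive, concept_of_def, queried_item, user_says_yes):
    """Prune definitions that disagree with the user's yes/no answer about whether
    queried_item satisfies their subjective feature. concept_of_def is a
    (n_defs, n_items) boolean table of each candidate definition's extension."""
    agrees = concept_of_def[:, queried_item] == user_says_yes
    return alive & agrees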
RecSys 09
Preference Elicitation with Subjective Features
Craig Boutilier, Kevin Regan and Paolo Viappiani
The 3rd ACM Conference on Recommender Systems
Utility or preference elicitation is a critical component in many recommender and decision support systems. However, most frameworks for elicitation assume a predefined set of features (e.g., as derived from catalog descriptions) over which user preferences are expressed. Just as user preferences vary considerably, so too can the features over which they are most comfortable expressing these preferences. In this work, we consider preference elicitation in the presence of subjective or user-defined features. We treat the problem of learning a user’s feature definition as one of concept learning, but whose goal is to learn only enough about the concept definition to enable a good decision to be made. This is complicated by the fact that user preferences are unknown. We describe computational procedures for identifying optimal alternatives w.r.t. minimax regret in the presence of both utility and concept uncertainty; and develop several heuristic query strategies that focus on reduction of relevant concept and utility uncertainty. Computational experiments verify the efficacy of these strategies.
@inproceedings{recsys09brv, author = {Craig Boutilier and Kevin Regan and Paolo Viappiani}, title = {Preference Elicitation with Subjective Features}, booktitle = {The 3rd ACM Conference on Recommender Systems (RecSys 2009)}, year = {2009} }
Reputation & Trust
AAAI 06
Bayesian Reputation Modeling in E-Marketplaces Sensitive to Subjectivity, Deception and Change
Kevin Regan and Pascal Poupart and Robin Cohen
The Twenty-First National Conference on Artificial Intelligence
We present a model for buying agents in e-marketplaces to interpret evaluations of sellers provided by other buying agents, known as advisors. The interpretation of seller evaluations is complicated by the inherent subjectivity of each advisor, the possibility that advisors may deliberately provide misleading evaluations to deceive competitors and the dynamic nature of seller and advisor behaviors that may naturally change seller evaluations over time. Using a Bayesian approach, we demonstrate how to cope with subjectivity, deception and change in a principled way. More specifically, by modeling seller properties and advisor evaluation functions as dynamic random variables, buyers can progressively learn a probabilistic model that naturally and "correctly" calibrates the interpretation of seller evaluations without having to resort to heuristics to explicitly detect and filter/discount unreliable seller evaluations. Our model, called BLADE, is shown empirically to achieve lower mean error in the estimation of seller properties when compared to other models for reasoning about advisor ratings of sellers in electronic marketplaces.
@inproceedings{aaai06, author = {Kevin Regan and Pascal Poupart and Robin Cohen}, title = {Bayesian Reputation Modeling in E-Marketplaces Sensitive to Subjectivity, Deception and Change}, booktitle = {The Twenty-First National Conference on Artificial Intelligence (AAAI 2006)}, year = {2006} }
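A much-simplified sketch of the Bayesian machinery in BLADE: a buyer keeps a Beta posterior over a binary seller property and, to track sellers whose behavior may change, discounts old evidence before each update. BLADE additionally learns each advisor's evaluation function so biased or deceptive ratings are reinterpreted rather than filtered by heuristics; that part, and the decay constant below, are beyond this sketch.

class SellerBelief:
    def __init__(self, alpha=1.0, beta=1.0, decay=0.99):
        self.alpha, self.beta, self.decay = alpha, beta, decay

    def observe(self, positive):
        # Fade old counts so the posterior can follow a changing seller
        self.alpha *= self.decay
        self.beta *= self.decay
        if positive:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    def mean(self):
        # Posterior mean estimate of the seller property
        return self.alpha / (self.alpha + self.beta)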
PST 05
The Advisor-POMDP: A Principled Approach to Trust through Reputation in Electronic Markets
Kevin Regan and Robin Cohen and Pascal Poupart
Conference on Privacy, Security and Trust
This paper examines approaches to representing uncertainty in reputation systems for electronic markets, with the aim of constructing a decision-theoretic framework for collecting information about selling agents and making purchase decisions in the context of a social reputation system. A selection of approaches to representing reputation using Dempster-Shafer theory and Bayesian probability are surveyed, and a model for collecting and using reputation is developed using a Partially Observable Markov Decision Process.
@inproceedings{pst05, author = {Kevin Regan and Robin Cohen and Pascal Poupart}, title = {The Advisor-POMDP: A Principled Approach to Trust through Reputation in Electronic Markets}, booktitle = {Conference on Privacy, Security and Trust (PST 2005)}, year = {2005} }
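A minimal sketch of the belief update at the core of any POMDP agent, including the Advisor-POMDP: after taking action a and observing o, the belief over hidden states (here, seller and advisor characteristics) is revised by Bayes' rule. The transition and observation arrays T and O are assumed given.

import numpy as np

def belief_update(b, a, o, T, O):
    """b: (n_states,) prior belief; T[a]: (n_states, n_states) with
    T[a][s, s'] = P(s' | s, a); O[a]: (n_states, n_obs) with
    O[a][s', o] = P(o | s', a)."""
    b_pred = T[a].T @ b          # predict: sum_s T(s'|s,a) b(s)
    b_new = O[a][:, o] * b_pred  # correct: weight by observation likelihood
    return b_new / b_new.sum()   # normalize to a probability distribution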
JBT 05
Designing Adaptive Buying Agents in Electronic Markets Using Advice from Other Buyers to Model Seller Reputation
Kevin Regan and Robin Cohen
The Journal of Business and Technology
In this paper, we present a model for designing buying agents in electronic marketplaces that adapt by adjusting decisions about which sellers to select for business, based on reputation ratings provided by other buying agents in the marketplace (known as indirect reputation information). The focus of our research is a method for effectively representing and employing this indirect reputation information. In particular, we address the case of buying agents providing deceptive information to other buyers, by having each buyer model not only the reputation of all sellers in the marketplace but also the reputation of each buyer. We also systematically account for differing standards between buyers, in assessing the reputation of sellers. Our approach is to avoid disreputable buyers when seeking advice and then to pass through three phases: adjusting for subjective bias, estimating seller reputability from reputable buyers only and eliminating any wild predictions. Overall the model presented here builds on a strong foundation of how best to model seller reputation but allows for a suitably cautious integration of a social aspect to the reputation modeling, towards improved purchasing decisions for buyers.
@article{jbt05, author = {Kevin Regan and Robin Cohen}, title = {Designing Adaptive Buying Agents in Electronic Markets Using Advice from Other Buyers to Model Seller Reputation}, journal = {The Journal of Business and Technology}, volume = {1}, year = {2005} }
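A hedged sketch of the three-phase pipeline the abstract describes, with the reputability threshold, the mean-offset bias model, and the outlier rule all chosen here purely for illustration: (1) keep only reputable advisors, (2) correct each advisor's subjective bias using sellers both parties have rated, (3) drop wild predictions before averaging.

import numpy as np

def estimate_seller(target, advisor_ratings, advisor_rep, own_ratings,
                    rep_threshold=0.5, outlier_sd=2.0):
    """target: seller to assess; advisor_ratings: dict advisor -> {seller: rating};
    advisor_rep: dict advisor -> reputability in [0, 1]; own_ratings: this buyer's
    {seller: rating}. Returns an estimate, or None if no usable advice remains."""
    preds = []
    for adv, ratings in advisor_ratings.items():
        if advisor_rep.get(adv, 0.0) < rep_threshold or target not in ratings:
            continue  # phase 1: consult reputable advisors only
        shared = [s for s in ratings if s in own_ratings]
        bias = (np.mean([ratings[s] - own_ratings[s] for s in shared])
                if shared else 0.0)  # phase 2: estimate subjective offset
        preds.append(ratings[target] - bias)
    preds = np.array(preds)
    if len(preds) > 2:  # phase 3: eliminate wild predictions
        keep = np.abs(preds - preds.mean()) <= outlier_sd * preds.std()
        preds = preds[keep]
    return preds.mean() if len(preds) else None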
DASUM 05
Sharing models of sellers amongst buying agents in electronic marketplaces
Kevin Regan and Robin Cohen and Thomas Tran
Decentralized Agent Based and Social Approaches to User Modelling workshop
In this paper, we demonstrate the value of decentralized models of sellers in electronic marketplaces, as the basis for purchasing decisions from buyers. We discuss how buying agents can model the reputation of sellers in order to make effective purchases and how these agents can also take advantage of reputation ratings provided by other buying agents in the marketplace, once it is established that each buyer will be independently modelling the sellers. We outline the methods required to make use of reputation ratings of sellers provided by other buyers, including adjustments for possibly different subjective scales and for possible deception in reporting the reputation ratings. In particular, we propose that buying agents model the reputation not only of sellers but of advisors, as part of the overall processing. We describe as well how buyers can most effectively model the set of possible buying advisors in order to apply selective processing of the information provided by these agents, in their purchasing decisions.
@inproceedings{dasum05, author = {Kevin Regan and Robin Cohen and Thomas Tran}, title = {Sharing Models of Sellers amongst Buying Agents in Electronic Marketplaces}, booktitle = {Decentralized Agent Based and Social Approaches to User Modelling Workshop}, year = {2005} }
BASEWEB 05
Indirect Reputation Assessment for Adaptive Buying Agents in Electronic Markets
Kevin Regan and Robin Cohen
Business Agents and the Semantic Web workshop
In this paper, we present a model for designing buying agents in electronic marketplaces that adapt by adjusting decisions about which sellers to select for business, based on reputation ratings provided by other buying agents in the marketplace (known as indirect reputation information). The focus of our research is a method for effectively representing and employing this indirect reputation information. In particular, we address the case of buying agents providing deceptive information to other buyers, by having each buyer model not only the reputation of all sellers in the marketplace but also the reputation of each buyer. We also systematically account for differing standards between buyers, in assessing the reputation of sellers. Overall, the model presented here builds on a strong foundation of how best to model seller reputation but allows for a suitably cautious integration of a social aspect to the reputation modelling, towards improved purchasing decisions for buyers.
@inproceedings{baseweb05, author = {Kevin Regan and Robin Cohen}, title = {Indirect Reputation Assessment for Adaptive Buying Agents in Electronic Markets}, booktitle = {Business Agents and the Semantic Web Workshop}, year = {2005} }