UConn Daily Digest

Academic and Scholarly Events

10/12 Statistics Colloquium, Sakshi Arya

STATISTICS COLLOQUIUM

Virtual talk

Sakshi Arya

Eberly Fellow and Postdoctoral Researcher

Department of Statistics

Penn State University

Epsilon-Greedy strategy for Nonparametric Bandits

Abstract

Contextual bandit algorithms are popular for sequential decision making in several practical applications, ranging from online advertisement recommendations to mobile health. The goal of such problems is to maximize cumulative reward over time for a set of choices/arms, while taking covariate (or contextual) information into account. Epsilon-Greedy is a popular heuristic for the Multi-Armed Bandits problem, however it is not one of the most studied algorithms theoretically in the presence of contextual information. We study the Epsilon-Greedy strategy in nonparametric bandits (when no parametric form is assumed for the reward functions). Firstly, we modify the strategy to incorporate delayed rewards, show strong consistency and establish finite-time regret bounds for the proposed strategy. Secondly, in order to address the curse of dimensionality for estimation using classical nonparametric estimation approaches, we propose a kernelized epsilon-greedy algorithm. More specifically, we assume that the similarities between the covariates and expected rewards can be modeled as arbitrary linear functions of the contexts' images in a certain reproducing kernel Hilbert space (RKHS). We establish convergence rates for the estimation error and the regret, which are closely tied to the intrinsic dimensionality of the RKHS. We show that the rates closely match the optimal rates for linear contextual bandits when restricted to a finite dimensional RKHS. Lastly, we illustrate our results through simulation studies and real-data analysis.

Bio: Sakshi Arya is an Eberly Fellow and Postdoctoral Researcher at Penn State University. Sakshi received her PhD in Statistics from the University of Minnesota in 2020.Her primary research area is sequential decision making. In particular, she has devised algorithms for contextual bandit problems using nonparametric estimation tools and established theoretical guarantees on the performance of these algorithms. More broadly, she is interested in building the necessary statistical methodology for solving problems using nonparametric tools of estimation. Other areas of interest include spatio-temporal methods with applications in climate modeling and infectious disease modeling.

Wednesday, October 12, 2022

4:00 pm ET, 1-hour duration

Join from the meeting link

https://uconn-cmr.webex.com/uconn-cmr/j.php?MTID=ma6e8f07b2b685cbc284266adec89860b

Join by meeting number

Meeting number (access code): 2621 754 0841

Meeting password: 7tbVnikp28u

Tap to join from a mobile device (attendees only)
+1-415-655-0002,,26217540841## US Toll
Join by phone
+1-415-655-0002 US Toll
Global call-in numbers
Join from a video system or application
Dial 26217540841@uconn-cmr.webex.com
You can also dial 173.243.2.68 and enter your meeting number.

For more information, contact: Tracy Burke at tracy.burke@uconn.edu

Other stories from the Student Daily Digest for Monday, October 10, 2022 >>

University of Connecticut

Student Daily Digest

Academic and Scholarly Events

10/12 Statistics Colloquium, Sakshi Arya