Academic and Scholarly Events

  • 10/12 Statistics Colloquium, Sakshi Arya

    STATISTICS COLLOQUIUM

    Virtual talk

     

    Sakshi Arya

    Eberly Fellow and Postdoctoral Researcher

    Department of Statistics

    Penn State University

     

    Epsilon-Greedy strategy for Nonparametric Bandits

     

    Abstract

    Contextual bandit algorithms are popular for sequential decision making in several practical applications, ranging from online advertisement recommendations to mobile health. The goal in such problems is to maximize the cumulative reward over time across a set of choices (arms) while taking covariate (or contextual) information into account. Epsilon-Greedy is a popular heuristic for the multi-armed bandit problem; however, it is not among the most theoretically studied algorithms in the presence of contextual information. We study the Epsilon-Greedy strategy in nonparametric bandits, that is, when no parametric form is assumed for the reward functions. First, we modify the strategy to incorporate delayed rewards, show strong consistency, and establish finite-time regret bounds for the proposed strategy. Second, to address the curse of dimensionality that affects classical nonparametric estimation approaches, we propose a kernelized epsilon-greedy algorithm. More specifically, we assume that the similarities between the covariates and expected rewards can be modeled as arbitrary linear functions of the contexts' images in a certain reproducing kernel Hilbert space (RKHS). We establish convergence rates for the estimation error and the regret, which are closely tied to the intrinsic dimensionality of the RKHS. We show that the rates closely match the optimal rates for linear contextual bandits when restricted to a finite-dimensional RKHS. Lastly, we illustrate our results through simulation studies and real-data analysis.
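    For readers new to the topic, the basic idea of epsilon-greedy with nonparametric reward estimates can be sketched as follows. This is only an illustrative sketch and not the speaker's algorithm: it uses a Nadaraya-Watson kernel smoother in place of the RKHS-based estimator, ignores delayed rewards, and assumes a one-dimensional context in [0, 1]. The function names, the exploration schedule, the bandwidth, and the noise level are all hypothetical choices made for illustration.

```python
import math
import random


def kernel_estimate(history, x, bandwidth):
    """Nadaraya-Watson estimate of the expected reward at context x.

    history is a list of (context, reward) pairs observed for one arm.
    Returns 0.0 when the arm has not been pulled yet (an assumed default).
    """
    num = 0.0
    den = 0.0
    for xi, ri in history:
        # Gaussian kernel weight: nearby contexts count more.
        w = math.exp(-((x - xi) ** 2) / (2.0 * bandwidth ** 2))
        num += w * ri
        den += w
    return num / den if den > 0 else 0.0


def epsilon_greedy_run(reward_fns, horizon, bandwidth=0.1, noise_sd=0.1, seed=0):
    """Simulate epsilon-greedy on a toy contextual bandit.

    reward_fns: one mean-reward function per arm (hypothetical test functions).
    Returns the cumulative regret and the per-arm observation histories.
    """
    rng = random.Random(seed)
    n_arms = len(reward_fns)
    histories = [[] for _ in range(n_arms)]
    regret = 0.0
    for t in range(1, horizon + 1):
        x = rng.random()                 # context drawn uniformly from [0, 1]
        eps = min(1.0, n_arms / t)       # assumed decaying exploration rate
        if rng.random() < eps:
            arm = rng.randrange(n_arms)  # explore: pick an arm at random
        else:
            # exploit: pick the arm with the largest estimated reward at x
            estimates = [kernel_estimate(h, x, bandwidth) for h in histories]
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = reward_fns[arm](x) + rng.gauss(0.0, noise_sd)
        histories[arm].append((x, reward))
        # regret is measured against the best mean reward at this context
        regret += max(f(x) for f in reward_fns) - reward_fns[arm](x)
    return regret, histories


if __name__ == "__main__":
    arms = [lambda x: x, lambda x: 1.0 - x]
    total_regret, _ = epsilon_greedy_run(arms, horizon=500)
    print("cumulative regret over 500 rounds:", total_regret)
```

    In this toy setup the best arm switches at context 0.5, so a strategy that ignores the context cannot do well. The decaying exploration rate is one common choice; the talk's results concern how such choices affect consistency and regret.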

     

    Bio: Sakshi Arya is an Eberly Fellow and Postdoctoral Researcher at Penn State University. Sakshi received her PhD in Statistics from the University of Minnesota in 2020. Her primary research area is sequential decision making. In particular, she has devised algorithms for contextual bandit problems using nonparametric estimation tools and established theoretical guarantees on the performance of these algorithms. More broadly, she is interested in building the statistical methodology needed to solve problems with nonparametric estimation tools. Other areas of interest include spatio-temporal methods with applications in climate modeling and infectious disease modeling.

     


    Wednesday, October 12, 2022

    4:00 pm ET, 1-hour duration

    Join from the meeting link

    https://uconn-cmr.webex.com/uconn-cmr/j.php?MTID=ma6e8f07b2b685cbc284266adec89860b

    Join by meeting number

    Meeting number (access code): 2621 754 0841

    Meeting password: 7tbVnikp28u

    Tap to join from a mobile device (attendees only)

    +1-415-655-0002,,26217540841## US Toll

    Join by phone

    +1-415-655-0002 US Toll

    Global call-in numbers

    Join from a video system or application

    Dial 26217540841@uconn-cmr.webex.com

    You can also dial 173.243.2.68 and enter your meeting number.

    For more information, contact: Tracy Burke at tracy.burke@uconn.edu