I am fascinated by reinforcement learning in high-stakes scenarios: how can an agent learn from experience to make good decisions when experience is costly or risky, such as in educational software, healthcare decision making, robotics, or people-facing applications?
Foundations of efficient reinforcement learning. A key challenge is to understand the limits of how an agent should balance exploration versus exploitation. We have proved (to our knowledge) the first probably approximately correct (PAC) results for transfer reinforcement learning (UAI 2013), concurrent multi-task reinforcement learning (AAAI 2015), and partially observable reinforcement learning (AISTATS 2016). Our recent work introduces uniform PAC, a new theoretical framework for evaluating the performance of an RL algorithm, which inherits and unites some of the key features of the PAC and regret approaches (NIPS 2017).
What-if reasoning for sequential decision making. There is an enormous opportunity to leverage the increasing amounts of data to improve decisions made in healthcare, education, maintenance, and many other applications. Doing so requires what-if (counterfactual) reasoning about the potential outcomes should different decisions be made. We have introduced new statistical estimators that directly minimize the error of the predicted performance of new strategies (ICML 2016), provided ways of comparing across alternate model predictions (LAS 2017), analyzed the challenges of using importance sampling-based approaches for selecting policies (UAI 2017, best paper), and created new methods that can exponentially reduce the variance of the resulting estimators (NIPS 2017).
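As a rough illustration of this kind of what-if reasoning, here is a minimal sketch of ordinary (trajectory-wise) importance sampling, the textbook estimator for predicting how a new policy would perform from data logged under a different one. It is not the estimator from any of the papers cited above. The function name, the trajectory format (lists of state, action, reward tuples), the undiscounted return, and the policy-as-probability-function interface are all assumptions made for illustration.

```python
def importance_sampling_estimate(trajectories, target_policy, behavior_policy):
    """Ordinary importance sampling estimate of the target policy's expected return.

    Assumed (hypothetical) interface:
      trajectories    -- list of trajectories, each a list of (state, action, reward)
      target_policy   -- function (state, action) -> probability under the new policy
      behavior_policy -- function (state, action) -> probability under the logging
                         policy; must be > 0 for every logged (state, action) pair
    """
    total = 0.0
    for trajectory in trajectories:
        weight = 1.0        # product of per-step probability ratios
        return_sum = 0.0    # undiscounted return of this trajectory
        for state, action, reward in trajectory:
            weight *= target_policy(state, action) / behavior_policy(state, action)
            return_sum += reward
        total += weight * return_sum
    return total / len(trajectories)


# Toy usage: a one-step problem with two actions, logged under a uniform policy.
logged = [
    [("s", 0, 0.0)],  # action 0 earned reward 0
    [("s", 1, 1.0)],  # action 1 earned reward 1
]
uniform = lambda state, action: 0.5
always_one = lambda state, action: 1.0 if action == 1 else 0.0

estimate = importance_sampling_estimate(logged, always_one, uniform)
```

Because the weight is a product of ratios over every step, its variance can grow quickly with trajectory length, which is one reason lower-variance estimators are of interest.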
Human-in-the-loop systems. Artificial intelligence has the potential to vastly amplify human intelligence and efficiency. We are working on systems that train crowdworkers using (machine) curated material generated by other crowdworkers (CHI 2016), and that identify when to expand the system specification to include new content (AAAI 2017) or sensors. We are also interested in ensuring that machine learning systems are well behaved (arXiv 2017) with respect to their human users' intentions, also known as safe and fair machine learning.
Other areas of active interest include hierarchical reinforcement learning.
Interested students: Unfortunately I am unable to respond to most emails about openings for internships, graduate, and postdoctoral positions in my group. Admission decisions are made at the department level, so I will not be able to respond about your likelihood of acceptance or possibility of working with me. If you are already enrolled at Stanford or have been admitted, please feel free to reach out if you're interested in discussing research opportunities in my group. I accept (already admitted Stanford) students in my research group every year.