Hi, this is the home page of Nan Jiang (姜楠). I am a machine learning researcher. My core research area is reinforcement learning (RL). I care about the sample efficiency of RL, and use ideas from statistical learning theory to analyze and develop RL algorithms.
I am also interested in related areas such as online learning and dynamical system modeling.
Prospective students: please read this note.
Experience
2018 – Now | Assistant Professor, CS @ UIUC
2017 – 2018 | Postdoc Researcher, MSR NYC
2011 – 2017 | PhD, CSE @ UMich
CV (Sept 2022)
nanjiang_cs; also on Mastodon
nanjiang at illinois dot edu
3322 Siebel Center |
Towards Hyperparameter-free Policy Selection for Offline Reinforcement Learning [arXiv, code]
(NeurIPS-21) Siyuan Zhang, Nan Jiang.
BVFT (Batch Value-Function Tournament) shows promising empirical performance for offline policy selection.
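A minimal sketch of the tournament-style selection rule, where each candidate Q-function comes from, e.g., a different hyperparameter setting; projected_bellman_error is a hypothetical stand-in for BVFT's pairwise loss (the Bellman error of a candidate projected onto the partition induced by each pair), not the paper's actual API:

    # Schematic BVFT-style selection: keep the candidate whose worst-case
    # pairwise loss against all other candidates is smallest.
    # `projected_bellman_error` is a hypothetical helper computed from the
    # batch data; it is not part of any released implementation.
    def select_q(candidates, batch_data):
        def worst_case_loss(q):
            return max(projected_bellman_error(q, q_other, batch_data)
                       for q_other in candidates if q_other is not q)
        return min(candidates, key=worst_case_loss)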
On Query-efficient Planning in MDPs under Linear Realizability of the Optimal State-value Function [arXiv]
(COLT-21) Gellert Weisz, Philip Amortila, Barnabás Janzer, Yasin Abbasi-Yadkori, Nan Jiang, Csaba Szepesvári.
A cute tensorization trick for query-efficient planning with a generative model and a linearly realizable V*.
Batch Value-function Approximation with Only Realizability [arXiv, talk]
(ICML-21) Tengyang Xie, Nan Jiang.
Learning Q* from a realizable but otherwise arbitrary function class, which was previously believed to be impossible.
Minimax Value Interval for Off-Policy Evaluation and Policy Optimization [arXiv]
(NeurIPS-20) Nan Jiang, Jiawei Huang.
Unifying "weight-learning" and "value-learning" methods for MIS-based OPE under an interval that quantifies the biases due to function approximation.
Minimax Weight and Q-Function Learning for Off-Policy Evaluation [arXiv]
(ICML-20) Masatoshi Uehara, Jiawei Huang, Nan Jiang.
Learning importance weights and value functions from each other, with connections to many old and new algorithms in RL.
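A schematic rendering of the weight-learning (MWL) side, in my notation (d_\mu is the data distribution, d_0 the initial-state distribution, \mathcal{F} a discriminator class): the marginalized importance weight is learned by adversarially driving a Bellman-flow residual to zero,

L(w, f) = \mathbb{E}_{(s,a,r,s') \sim d_\mu}\!\left[ w(s,a)\,\big(\gamma f(s',\pi) - f(s,a)\big) \right] + (1-\gamma)\,\mathbb{E}_{s_0 \sim d_0}\!\left[ f(s_0,\pi) \right], \qquad \hat{w} = \arg\min_{w} \max_{f \in \mathcal{F}} L(w,f)^2,

after which the policy value is estimated as \hat{J}(\pi) = \mathbb{E}_{d_\mu}[\hat{w}(s,a)\,r]. MQL swaps the roles, learning a Q-function with weights as discriminators.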
Information-Theoretic Considerations in Batch Reinforcement Learning [pdf, poster, MSR talk, Simons talk]
(ICML-19) Jinglin Chen, Nan Jiang.
Revisiting some fundamental aspects of value-based RL.
Contextual Decision Processes with Low Bellman Rank are PAC-Learnable [ICML version, arXiv, errata, poster, talk video]
(ICML-17) Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford, Robert E. Schapire.
A new and general theory of exploration in RL with function approximation.
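Roughly, in my paraphrase: for candidate value functions f, f' in the class, let

\mathcal{E}(f', f) = \mathbb{E}\!\left[ f(x_h, a_h) - r_h - f(x_{h+1}, \pi_f) \;\middle|\; a_{1:h-1} \sim \pi_{f'},\ a_h \sim \pi_f \right]

be the average Bellman error of f under the distribution reached by rolling in with f'. The Bellman rank is the rank of the matrix with entries \mathcal{E}(f', f), and the paper's OLIVE algorithm explores with sample complexity polynomial in this rank rather than in the size of the state space.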
Doubly Robust Off-policy Value Evaluation for Reinforcement Learning [pdf, poster]
(ICML-16) Nan Jiang, Lihong Li.
Simple and effective improvement of importance sampling via control variates.
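The estimator fits in one line: working backwards along a trajectory, with importance ratio \rho_t = \pi(a_t \mid s_t)/\mu(a_t \mid s_t) (behavior policy \mu, in my notation) and model-based estimates \hat{Q}, \hat{V} serving as control variates,

V_{DR}^{(t)} := \hat{V}(s_t) + \rho_t \big( r_t + \gamma\, V_{DR}^{(t+1)} - \hat{Q}(s_t, a_t) \big),

which remains unbiased whenever plain importance sampling is, but has lower variance when \hat{Q} is accurate.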
The Dependence of Effective Planning Horizon on Model Accuracy [pdf, errata, poster, talk video]
(AAMAS-15, Best Paper Award) Nan Jiang, Alex Kulesza, Satinder Singh, Richard Lewis.
Using a discount factor smaller than the one defining the task can be viewed as regularization.
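A minimal sketch of the idea, assuming tabular value iteration on an estimated model (all names are mine): the planning discount acts as a regularization knob, and with an inaccurate model, setting it below the discount that defines the true objective can yield a better policy.

    import numpy as np

    def plan_with_smaller_discount(P_hat, R_hat, gamma_plan, n_iter=1000):
        """Greedy policy from value iteration on an estimated model.

        P_hat: estimated transitions, shape [S, A, S]
        R_hat: estimated rewards, shape [S, A]
        gamma_plan: planning discount; choosing it below the task's true
            discount trades off bias against sensitivity to model error.
        """
        V = np.zeros(P_hat.shape[0])
        for _ in range(n_iter):
            Q = R_hat + gamma_plan * (P_hat @ V)  # one backup through the model
            V = Q.max(axis=1)
        return Q.argmax(axis=1)  # one greedy action per state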
Low-Rank Spectral Learning with Weighted Loss Functions [pdf]
(AISTATS-15) Alex Kulesza, Nan Jiang, Satinder Singh.
Approximation guarantees for low-rank learning of PSRs.