Hi, this is the home page of Nan Jiang (姜楠). I am a machine learning researcher. My core research area is reinforcement learning (RL). I care about the sample efficiency of RL, and use ideas from statistical learning theory to analyze and develop RL algorithms. I am also interested in related areas such as online learning and dynamical system modeling.
Prospective students: please read this note.
|2018 – Now|Assistant Professor, CS @ UIUC|
|2017 – 2018|Postdoc Researcher, MSR NYC|
|2011 – 2017|PhD, CSE @ UMich|
|CV (Sept. 2021)|
|nanjiang at illinois dot edu|
|3322 Siebel Center|
Towards Hyperparameter-free Policy Selection for Offline Reinforcement Learning [arXiv, code]
(NeurIPS-21) Siyuan Zhang, Nan Jiang.
BVFT shows promising empirical performance for offline policy selection.
On Query-efficient Planning in MDPs under Linear Realizability of the Optimal State-value Function [arXiv]
(COLT-21) Gellert Weisz, Philip Amortila, Barnabás Janzer, Yasin Abbasi-Yadkori, Nan Jiang, Csaba Szepesvári.
Cute tensorization trick for generative model + linear V*.
Batch Value-function Approximation with Only Realizability [arXiv, talk]
(ICML-21) Tengyang Xie, Nan Jiang.
Learning Q* from a realizable and otherwise arbitrary function class, which was believed to be impossible.
Minimax Value Interval for Off-Policy Evaluation and Policy Optimization [arXiv]
(NeurIPS-20) Nan Jiang, Jiawei Huang.
Unifying "weight-learning" and "value-learning" methods for MIS-based OPE under an interval that quantifies the bias due to function approximation.
Minimax Weight and Q-Function Learning for Off-Policy Evaluation [arXiv]
(ICML-20) Masatoshi Uehara, Jiawei Huang, Nan Jiang.
Learning importance weights and value functions from each other, with connections to many old and new algorithms in RL.
Contextual Decision Processes with Low Bellman Rank are PAC-Learnable [ICML version, arXiv, errata, poster, talk video]
(ICML-17) Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford, Robert E. Schapire.
A new and general theory of exploration in RL with function approximation.
The Dependence of Effective Planning Horizon on Model Accuracy [pdf, errata, poster, talk video]
(AAMAS-15, best paper award) Nan Jiang, Alex Kulesza, Satinder Singh, Richard Lewis.
Using a discount factor smaller than the one that defines the task can be viewed as regularization.
Low-Rank Spectral Learning with Weighted Loss Functions [pdf]
(AISTATS-15) Alex Kulesza, Nan Jiang, Satinder Singh.
Approximation guarantees for low-rank learning of PSRs.