Hi, this is the home page of Nan Jiang (姜楠). I am a machine learning researcher. My core research area is reinforcement learning (RL). I care about the sample efficiency of RL, and use ideas from statistical learning theory to analyze and develop RL algorithms. I am also interested in related areas such as online learning and dynamical system modeling.

Prospective students: please read this note.


2018 – Now   Assistant Professor, CS @ UIUC
2017 – 2018   Postdoc Researcher, MSR NYC
2011 – 2017   PhD, CSE @ UMich

CV (Sept 2022)  
nanjiang_cs; also on Mastodon  
nanjiang at illinois dot edu
3322 Siebel Center  

Selected Publications

Towards Hyperparameter-free Policy Selection for Offline Reinforcement Learning [arXiv, code]
(NeurIPS-21) Siyuan Zhang, Nan Jiang.
BVFT shows promising empirical performance for offline policy selection.

On Query-efficient Planning in MDPs under Linear Realizability of the Optimal State-value Function [arXiv]
(COLT-21) Gellert Weisz, Philip Amortila, Barnabás Janzer, Yasin Abbasi-Yadkori, Nan Jiang, Csaba Szepesvári.
Cute tensorization trick for generative model + linear V*.

Batch Value-function Approximation with Only Realizability [arXiv, talk]
(ICML-21) Tengyang Xie, Nan Jiang.
Learning Q* from a realizable but otherwise arbitrary function class, which was previously believed to be impossible.

Minimax Value Interval for Off-Policy Evaluation and Policy Optimization [arXiv]
(NeurIPS-20) Nan Jiang, Jiawei Huang.
Unifying "weight-learning" and "value-learning" methods for MIS-based OPE via an interval that quantifies the bias due to function approximation.

Minimax Weight and Q-Function Learning for Off-Policy Evaluation [arXiv]
(ICML-20) Masatoshi Uehara, Jiawei Huang, Nan Jiang.
Learning importance weights and value functions from each other, with connections to many old and new algorithms in RL.

Information-Theoretic Considerations in Batch Reinforcement Learning [pdf, poster, MSR talk, Simons talk]
(ICML-19) Jinglin Chen, Nan Jiang.
Revisiting some fundamental aspects of value-based RL.

Contextual Decision Processes with Low Bellman Rank are PAC-Learnable [ICML version, arXiv, errata, poster, talk video]
(ICML-17) Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford, Robert E. Schapire.
A new and general theory of exploration in RL with function approximation.

Doubly Robust Off-policy Value Evaluation for Reinforcement Learning [pdf, poster]
(ICML-16) Nan Jiang, Lihong Li.
Simple and effective improvement of importance sampling via control variates.

The Dependence of Effective Planning Horizon on Model Accuracy [pdf, errata, poster, talk video]
(AAMAS-15, Best Paper Award) Nan Jiang, Alex Kulesza, Satinder Singh, Richard Lewis.
Using a discount factor smaller than the one specified by the problem can be viewed as regularization.

Low-Rank Spectral Learning with Weighted Loss Functions [pdf]
(AISTATS-15) Alex Kulesza, Nan Jiang, Satinder Singh.
Approximation guarantees for low-rank learning of PSRs.