On Query-efficient Planning in MDPs under Linear Realizability of the Optimal State-value Function. [arXiv]
(preprint) Gellert Weisz, Philip Amortila, Barnabás Janzer, Yasin Abbasi-Yadkori, Nan Jiang, Csaba Szepesvári.
Model-free Representation Learning and Exploration in Low-rank MDPs. [arXiv]
(preprint) Aditya Modi, Jinglin Chen, Akshay Krishnamurthy, Nan Jiang, Alekh Agarwal.
Finite Sample Analysis of Minimax Offline Reinforcement Learning: Completeness, Fast Rates and First-Order Efficiency. [arXiv]
(preprint) Masatoshi Uehara, Masaaki Imaizumi, Nan Jiang, Nathan Kallus, Wen Sun, Tengyang Xie.
A Variant of the Wang-Foster-Kakade Lower Bound for the Discounted Setting. [arXiv]
(Technical Note) Philip Amortila, Nan Jiang, Tengyang Xie.
Minimax Model Learning. [coming soon]
(AISTATS-21) Cameron Voloshin, Nan Jiang, Yisong Yue.
Improved Worst-Case Regret Bounds for Randomized Least-Squares Value Iteration. [arXiv]
(AAAI-21) Priyank Agrawal, Jinglin Chen, Nan Jiang.
Minimax Value Interval for Off-Policy Evaluation and Policy Optimization. [arXiv]
(NeurIPS-20) Nan Jiang, Jiawei Huang.
Q* Approximation Schemes for Batch Reinforcement Learning: A Theoretical Comparison. [arXiv]
(UAI-20) Tengyang Xie, Nan Jiang.
Minimax Weight and Q-Function Learning for Off-Policy Evaluation. [arXiv]
(ICML-20) Masatoshi Uehara, Jiawei Huang, Nan Jiang.
From Importance Sampling to Doubly Robust Policy Gradient. [arXiv]
(ICML-20) Jiawei Huang, Nan Jiang.
On Value Functions and the Agent-Environment Boundary. [arXiv]
(preprint) Nan Jiang.
Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning. [arXiv]
(preprint) Cameron Voloshin, Hoang M. Le, Nan Jiang, Yisong Yue.
Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles. [arXiv]
(AISTATS-20) Aditya Modi, Nan Jiang, Ambuj Tewari, Satinder Singh.
Provably Efficient Q-Learning with Low Switching Cost. [arXiv]
(NeurIPS-19) Yu Bai, Tengyang Xie, Nan Jiang, Yu-Xiang Wang.
Deterministic Bellman Residual Minimization. [pdf]
(OptRL Workshop at NeurIPS-19) Ehsan Saleh, Nan Jiang.
Provably Efficient RL with Rich Observations via Latent State Decoding. [arXiv]
(ICML-19) Simon Du, Akshay Krishnamurthy, Nan Jiang, Alekh Agarwal, Miroslav Dudík, John Langford.
Model-based RL in CDPs: PAC bounds and Exponential Improvements over Model-free Approaches. [arXiv]
(COLT-19) Wen Sun, Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford.
On Oracle-Efficient PAC RL with Rich Observations. [arXiv]
(NeurIPS-18, w/ spotlight talk) Christoph Dann, Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford, Robert E. Schapire.
Open Problem: The Dependence of Sample Complexity Lower Bounds on Planning Horizon. [pdf]
(COLT-18) Nan Jiang, Alekh Agarwal.
Hierarchical Imitation and Reinforcement Learning. [arXiv]
(ICML-18) Hoang M. Le, Nan Jiang, Alekh Agarwal, Miroslav Dudík, Yisong Yue, Hal Daumé III.
Markov Decision Processes with Continuous Side Information. [arXiv]
(ALT-18) Aditya Modi, Nan Jiang, Satinder Singh, Ambuj Tewari.
Contextual Decision Processes with Low Bellman Rank are PAC-Learnable. [ICML version, arXiv, errata, poster, talk video]
(ICML-17) Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford, Robert E. Schapire.
On Structural Properties of MDPs that Bound Loss due to Shallow Planning. [pdf]
(IJCAI-16) Nan Jiang, Satinder Singh, Ambuj Tewari.
Low-Rank Spectral Learning with Weighted Loss Functions. [pdf]
(AISTATS-15) Alex Kulesza, Nan Jiang, Satinder Singh.
Spectral Learning of Predictive State Representations with Insufficient Statistics. [pdf]
(AAAI-15) Alex Kulesza, Nan Jiang, Satinder Singh.
A Theory of Model Selection in Reinforcement Learning. [pdf]
(PhD thesis, 2017) Nan Jiang.