A Paradox on Sampling Trajectories with Approximate Belief States
As mentioned on X, I got interested in a POMDP question more than a year ago and discovered many interesting results that ran against my initial intuition. After procrastinating forever, I eventually found time to write things down,[1] and decided to share one of the counterintuitive results here as a teaser for the paper. The answer can be found in the paper, and I will post the link here after it appears on arXiv. (Update 11/26: paper link)
-
[1] The style of the paper does not quite fit the common standards of publications (I had an uphill battle before and do not want to try again). The main purpose of writing it is to offload the ideas from my mind so that they do not keep coming up and occupying my thought space forever… and for that I have to strike a trade-off between the rigor of the paper and the time spent on it. As a result, some proofs in the paper are not fully fleshed out, but I have expanded the key analyses to the extent that I am reasonably confident I am not missing anything major.