I am looking for collaborators to apply RL in real-world applications. I am excited about RL for X, with X being power grids, network control, personalized recommendations, robotics, education, etc. Note that I am looking for RL for real X, not RL for a simulation of X; see this thread for more details. Long story short: if you have a platform for deploying new decision-making strategies and doing A/B testing with real users/customers/patients/students/etc., I am all ears!

An intermediate ground between “RL for X” and “RL for a simulator of X” is a setting I call “supervised-learning generalization, RL optimization”. Projects in this setting have real-world impact, but I am only interested if the project inspires methodological innovations. Examples include:

  • RL for stock trading, when the trading amount is small enough that its influence on the market can be ignored.
  • RL for data center scheduling, where “data” is incoming job requests and is the uncertain part. Given fixed job requests, we have accurate simulators of how the cluster processes the jobs.
  • Inventory management, where the store rarely runs out of stock, so there is no “unmet demand”.

What all of the above examples have in common is that it is relatively easy to do counterfactual reasoning with the data available (“had I done something different that day, what would have happened?”), effectively creating a simulator. There are still statistical challenges in generalizing to new scenarios (new stocks, new job request patterns, new demand, etc.), but that is with respect to a fixed distribution and does not involve distribution shift due to decision making, a central challenge in statistical RL.
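To make the “logged data as a simulator” idea concrete, here is a minimal sketch for the inventory example. It assumes that logged demand is fully observed (the store never ran out of stock) and does not depend on the ordering decisions, so any candidate policy can be replayed against the same demand sequence. The function names, the cost parameters, and the zero-lead-time ordering model are all hypothetical choices made for illustration.

```python
from typing import Callable, Sequence

# An ordering policy maps the current stock level to an order quantity.
Policy = Callable[[int], int]


def replay_inventory_policy(
    demands: Sequence[int],
    policy: Policy,
    initial_stock: int = 0,
    holding_cost: float = 1.0,   # hypothetical cost per unit left over each day
    stockout_cost: float = 5.0,  # hypothetical cost per unit of unmet demand
) -> float:
    """Replay a policy against logged demand and return its total cost.

    Assumes orders arrive instantly and that demand is exogenous, i.e.
    the logged demand would have been the same under any policy.
    """
    stock = initial_stock
    total_cost = 0.0
    for demand in demands:
        stock += policy(stock)      # place the order (zero lead time)
        sold = min(stock, demand)   # serve as much demand as possible
        unmet = demand - sold
        stock -= sold
        total_cost += holding_cost * stock + stockout_cost * unmet
    return total_cost


def base_stock_policy(level: int) -> Policy:
    """Order up to a fixed target stock level each day."""
    def policy(stock: int) -> int:
        return max(level - stock, 0)
    return policy


# Hypothetical logged demand, one entry per day.
logged_demands = [3, 5, 2, 4, 6]

# Counterfactual question: what would each policy have cost on these days?
cost_low = replay_inventory_policy(logged_demands, base_stock_policy(4))
cost_high = replay_inventory_policy(logged_demands, base_stock_policy(6))
```

Because the replay never needs data that the logging policy failed to collect, comparing `cost_low` and `cost_high` is an ordinary estimation problem over a fixed demand distribution. The sketch stops being valid as soon as the decisions influence the data, for example when stock-outs hide the true demand.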

As mentioned above, I have moderate interest in such scenarios. On the other hand, if you have a problem where such counterfactual reasoning is hard and not obvious (what we call “off-policy evaluation”), and you are convinced that it’s an RL problem where long-term consequences need to be accounted for seriously, I will be very excited to discuss potential collaborations!