Welcome to the Reinforcement Learning Lab (RLLab) at KAIST. We focus on designing reinforcement learning algorithms that are both statistically and computationally efficient, with applications in operations research and mobile health.
<aside> <img src="/icons/megaphone_red.svg" alt="megaphone icon" width="40px" />
I am actively looking for motivated PhD and Master's students with a strong background in mathematics to join our lab. If you are interested, please feel free to contact me.
</aside>
Event-based reinforcement learning focuses on environments where actions are triggered by events rather than taken at fixed time intervals. This setting is particularly relevant in applications such as healthcare, where patient events occur irregularly, and resource allocation, where decision points are event-triggered. Our research develops algorithms that efficiently learn optimal policies in these challenging event-driven settings.
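One convenient way to formalize event-driven decision points (a standard semi-Markov formulation, sketched here purely for illustration) is to attach a random holding time to each transition:

$$
t_{k+1} = t_k + \tau_k, \qquad \tau_k \sim F(\cdot \mid s_k, a_k), \qquad s_{k+1} \sim P(\cdot \mid s_k, a_k),
$$

so that decisions occur at random event times $t_1 < t_2 < \cdots$ rather than on a fixed clock, and both the delay distribution $F$ and the transition kernel $P$ must be learned from data.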
In safety-critical applications, policies must be restricted to those that satisfy safety constraints, and safe reinforcement learning algorithms aim to identify the optimal policy within this restricted set. For example, in autonomous driving, an RL agent may learn to navigate efficiently, but it must also guarantee that safety distances are maintained and traffic rules are never violated. Our research develops statistically and computationally efficient algorithms for this constrained setting.
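This restricted policy set is commonly made precise through a constrained Markov decision process; the notation below is a generic sketch rather than the formulation of any specific paper:

$$
\max_{\pi}\; \mathbb{E}^{\pi}\!\left[\sum_{t=1}^{H} r(s_t, a_t)\right] \quad \text{subject to} \quad \mathbb{E}^{\pi}\!\left[\sum_{t=1}^{H} c(s_t, a_t)\right] \le \tau,
$$

where $r$ is the reward, $c$ is a safety cost (e.g., a penalty for violating a safety distance), and $\tau$ is the safety budget; the safe policies are exactly those satisfying the constraint.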
Many real-world systems are non-stationary. For example, in manufacturing, a machine’s performance may gradually degrade or abruptly shift due to malfunction. In such settings, effective decision-making requires both detecting these changes and rapidly adapting policies to remain robust. Our work develops algorithms that address this challenge, enabling reliable performance in dynamic environments.
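Performance in non-stationary environments is often measured by dynamic regret against the time-varying optimum (again, a generic convention used here for illustration):

$$
\mathrm{Reg}(T) = \sum_{t=1}^{T}\left( V_t^{*} - V_t^{\pi_t} \right),
$$

where $V_t^{*}$ is the optimal value of the environment as it stands at time $t$ and $\pi_t$ is the policy the learner deploys; keeping this sum small requires both detecting changes and adapting quickly.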
In many applications, direct interaction with the environment is costly or risky. In such cases, it is essential to leverage previously collected offline data—whether gathered from safe but suboptimal human decisions or from high-fidelity simulators. Our work develops algorithms that are both statistically and computationally efficient, enabling reliable learning in these offline settings.
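A common design principle in this setting is pessimism: discount value estimates in proportion to their uncertainty under the logged data. Schematically (the symbols below are illustrative, not a specific algorithm):

$$
\hat{\pi} = \arg\max_{\pi}\left( \hat{V}^{\pi} - \beta\, \hat{U}^{\pi} \right),
$$

where $\hat{V}^{\pi}$ estimates the value of $\pi$ from the offline dataset, $\hat{U}^{\pi}$ quantifies how poorly the dataset covers the trajectories $\pi$ would generate, and $\beta > 0$ controls the degree of conservatism.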
Many operations research applications require optimizing policies for long-term performance rather than short-term gains. Reinforcement learning in this setting is especially demanding because errors can compound over time, and exploration must be carefully balanced against immediate rewards. Our research develops algorithms that ensure reliable long-term performance while maintaining statistical and computational efficiency.
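Long-run performance of this kind is often formalized through the average-reward criterion and the associated notion of regret (notation illustrative):

$$
\mathrm{Reg}(T) = T\rho^{*} - \mathbb{E}\!\left[\sum_{t=1}^{T} r(s_t, a_t)\right],
$$

where $\rho^{*}$ is the best achievable long-run average reward; an algorithm with sublinear $\mathrm{Reg}(T)$ balances exploration against immediate reward well enough that early mistakes do not compound indefinitely.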