Enhancing Interpretability in Deep Reinforcement Learning through Semantic Clustering

Authors: Liang Zhang, Justin Lieffers, Adarsh Pyarelal
Conference: NeurIPS 2025 Main Track
Paper: arXiv:2409.17411

This repository contains the official implementation of our research on enhancing interpretability in deep reinforcement learning through semantic clustering techniques. Our work extends the OpenAI train-procgen framework to incorporate semantic clustering methods for improved understanding and visualization of learned policies in procedural environments.

📋 Abstract

This work presents a novel approach to enhancing interpretability in deep reinforcement learning by leveraging semantic clustering techniques. We demonstrate how semantic clustering can provide insights into learned policies, enabling better understanding of agent behavior and decision-making processes in complex procedural environments.

🚀 Quick Start

Installation

Prerequisite: Python 3.8.

Clone the repository:

```
git clone https://github.com/ualiangzhang/semantic_rl.git
cd semantic_rl
```

Install dependencies:
```
pip install -r requirements.txt
```
Install Procgen environments: Follow the installation steps in the Procgen repository.

Basic Usage

Train a semantic clustering model:

```
python -m train_procgen.train_sppo --env_name <ENV_NAME> --num_levels 0 --distribution_mode easy --timesteps_per_proc 25000000 --rand_seed <RAND_SEED>
```

Train a baseline model:

```
python -m train_procgen.train_ppo --env_name <ENV_NAME> --num_levels 0 --distribution_mode easy --timesteps_per_proc 25000000 --rand_seed <RAND_SEED>
```
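
To run both trainers across several environments and seeds, the commands above can be generated programmatically. The following is a minimal sketch, not part of the repository: the helper names `build_train_command` and `plan_runs` are hypothetical, and the lowercase environment identifiers are assumed to match the Procgen names used in the examples in this README. It only prints the commands (a dry run) and does not launch training.

```python
import shlex

# Environments listed under "Supported Environments" in this README,
# written with the lowercase names used in the command examples.
ENV_NAMES = ["coinrun", "fruitbot", "jumper", "ninja"]


def build_train_command(module, env_name, rand_seed, timesteps=25_000_000):
    """Return one training command from this README as an argument list.

    `module` is "train_sppo" (semantic clustering) or "train_ppo" (baseline).
    Only flags shown in this README are used.
    """
    return [
        "python", "-m", f"train_procgen.{module}",
        "--env_name", env_name,
        "--num_levels", "0",
        "--distribution_mode", "easy",
        "--timesteps_per_proc", str(timesteps),
        "--rand_seed", str(rand_seed),
    ]


def plan_runs(seeds):
    """List the SPPO and PPO commands for every environment and seed."""
    return [
        build_train_command(module, env, seed)
        for env in ENV_NAMES
        for seed in seeds
        for module in ("train_sppo", "train_ppo")
    ]


if __name__ == "__main__":
    # Dry run: print each command instead of starting a long training job.
    for cmd in plan_runs(seeds=[2021]):
        print(shlex.join(cmd))
```

Each printed line can be pasted into a shell or passed to a job scheduler. Seed 2021 is used here only because it is the seed of the provided checkpoints.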

📊 Visualization and Analysis

Performance Analysis

Generate generalization figures for a single game:

```
cd train_procgen
python single_graph.py --env_name <ENV_NAME>
# Example:
python single_graph.py --env_name coinrun
```

Semantic Clustering Visualization

Generate embedding space visualizations:

```
python -m train_procgen.enjoy_sppo --env_name <ENV_NAME> --mode 1
```

Generate skill demonstration videos:

```
python -m train_procgen.enjoy_sppo --env_name <ENV_NAME> --mode 0
```

Interactive cluster exploration:

```
python -m train_procgen.hover_clusters --env_name <ENV_NAME>
# Example:
python -m train_procgen.hover_clusters --env_name fruitbot
```
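
The numeric `--mode` values are easy to mix up, so a small wrapper with named modes may help. This is an illustrative sketch only: `ENJOY_MODES` and `enjoy_command` are hypothetical names, and the mapping (0 for skill videos, 1 for embedding space plots) is taken from the commands in this README rather than from the script's source.

```python
import shlex

# Mode numbers as documented in this README for train_procgen.enjoy_sppo.
ENJOY_MODES = {"videos": 0, "embeddings": 1}


def enjoy_command(env_name, output, num_embeddings=None):
    """Build an enjoy_sppo command for the requested kind of output.

    `output` is "videos" or "embeddings". `num_embeddings` is optional and
    is only added to the command when given.
    """
    if output not in ENJOY_MODES:
        raise ValueError(f"output must be one of {sorted(ENJOY_MODES)}")
    cmd = [
        "python", "-m", "train_procgen.enjoy_sppo",
        "--env_name", env_name,
        "--mode", str(ENJOY_MODES[output]),
    ]
    if num_embeddings is not None:
        cmd += ["--num_embeddings", str(num_embeddings)]
    return cmd


if __name__ == "__main__":
    # Print the command for Ninja skill videos with 8 clusters.
    print(shlex.join(enjoy_command("ninja", "videos", num_embeddings=8)))
```

The resulting argument list can be passed to `subprocess.run` from the repository root once a trained checkpoint is available.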

🎮 Supported Environments

Our implementation supports four Procgen environments:

CoinRun
FruitBot
Jumper
Ninja

🎬 Semantic Clustering Demonstration

Ninja Environment - 8 Semantic Clusters

The following videos demonstrate the 8 distinct semantic clusters learned by our model in the Ninja environment. Each cluster represents a different behavioral pattern and skill set:

📹 Semantic Cluster Demonstrations

[Videos: one clip each for Cluster 0 through Cluster 7]

🧭 Behavior Descriptions (Ninja)

| Cluster | Behavior |
|---------|----------|
| 0 | The agent starts by walking through the first platform and then performs a high jump to reach a higher ledge. |
| 1 | The agent makes small jumps in the middle of the scene. |
| 2 | Two interpretations are present: (1) the agent starts from the leftmost end of the scene and walks to the starting position of Cluster 0; (2) when there are no higher ledges to jump to, the agent begins from the scene, walks over the first platform, and prepares to jump to the subsequent ledge. |
| 3 | The agent walks on the ledge and prepares to jump to a higher ledge. |
| 4 | After performing a high jump, the agent loses sight of the ledge below. |
| 5 | The agent walks on the ledge and prepares to jump onto a ledge at the same height or lower. |
| 6 | The agent executes a high jump while keeping the ledge below in sight. |
| 7 | The agent moves towards the right edge of the scene and touches the mushroom. |

📊 Alternative: Generate Your Own Videos

You can also generate these videos yourself using our code:

```
# Generate Ninja skill cluster videos
python -m train_procgen.enjoy_sppo --env_name ninja --mode 0 --num_embeddings 8
```

Note: These videos showcase the distinct behavioral patterns learned by our semantic clustering approach. Each cluster corresponds to a different movement pattern and decision context in the Ninja environment, as listed in the behavior descriptions above.

📁 Output Structure

```
baseline/                # Required RL training package
train_procgen/
├── checkpoints/         # Trained model checkpoints
└── figures/             # Generated visualizations and videos
videos/                  # Video clips corresponding to the clusters in the paper
```

📈 Reproducing Results

To reproduce the results from our paper:

1. (Optional) Use existing checkpoints: We have provided pre-trained checkpoints for Ninja, FruitBot, and Jumper (random seed 2021) in this repository under train_procgen/checkpoints/. You can skip training and directly run the visualization scripts. Otherwise, train models using the commands above.
2. Generate visualizations using the provided scripts.
3. Analyze results using the interactive tools.

Note: Video generation may take 30 to 60 minutes, depending on machine performance, because the script is designed to cover all clusters comprehensively.
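
Before choosing between training and direct visualization, it can be useful to check which environments already have a checkpoint on disk. The sketch below is a rough aid under a stated assumption: this README does not document how checkpoint files are named, so the hypothetical helper `envs_with_checkpoints` simply looks for the lowercase environment name anywhere in a file or folder name under the checkpoint directory. Adjust the matching rule to the actual layout.

```python
from pathlib import Path

# Environments listed under "Supported Environments" in this README.
ENV_NAMES = ["coinrun", "fruitbot", "jumper", "ninja"]


def envs_with_checkpoints(checkpoint_dir, env_names=ENV_NAMES):
    """Return the environments whose name appears in some entry name.

    ASSUMPTION: checkpoint file or folder names contain the lowercase
    environment name. The real naming scheme is not documented here.
    """
    root = Path(checkpoint_dir)
    if not root.is_dir():
        return []
    # Collect the names of all files and folders below the directory.
    entries = [p.name.lower() for p in root.rglob("*")]
    return [env for env in env_names if any(env in name for name in entries)]


if __name__ == "__main__":
    found = envs_with_checkpoints("train_procgen/checkpoints")
    for env in ENV_NAMES:
        step = "visualize directly" if env in found else "train first"
        print(f"{env}: {step}")
```

Run from the repository root, this prints one suggested next step per environment.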

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

This work builds upon the OpenAI train-procgen framework. We thank the original authors for their excellent work on procedural generation for reinforcement learning benchmarking.
