Enhancing Interpretability in Deep Reinforcement Learning through Semantic Clustering
Authors: Liang Zhang, Justin Lieffers, Adarsh Pyarelal
Conference: NeurIPS 2025 Main Track
Paper: arXiv:2409.17411
This repository contains the official implementation of our research on enhancing interpretability in deep reinforcement learning through semantic clustering techniques. Our work extends the OpenAI train-procgen framework to incorporate semantic clustering methods for improved understanding and visualization of learned policies in procedural environments.
Abstract
This work presents a novel approach to enhancing interpretability in deep reinforcement learning by leveraging semantic clustering techniques. We demonstrate how semantic clustering can provide insights into learned policies, enabling better understanding of agent behavior and decision-making processes in complex procedural environments.
Quick Start
Installation
Prerequisite: Python 3.8.
Clone the repository:
git clone https://github.com/ualiangzhang/semantic_rl.git
cd semantic_rl
Install dependencies (Python 3.8):
pip install -r requirements.txt
Install Procgen environments: Follow the installation steps in the Procgen repository.
Basic Usage
Train a semantic clustering model:
python -m train_procgen.train_sppo --env_name <ENV_NAME> --num_levels 0 --distribution_mode easy --timesteps_per_proc 25000000 --rand_seed <RAND_SEED>
Train a baseline model:
python -m train_procgen.train_ppo --env_name <ENV_NAME> --num_levels 0 --distribution_mode easy --timesteps_per_proc 25000000 --rand_seed <RAND_SEED>
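When several environments or seeds need to be trained, the two commands above can be generated programmatically. The following is a minimal sketch that only assembles and prints the command lines. It uses the flags exactly as documented above. The helper name `build_train_command` and the environment and seed lists are hypothetical and not part of the repository.

```python
import shlex
from itertools import product

# Flags and values copied from the training commands documented above.
COMMON_FLAGS = [
    "--num_levels", "0",
    "--distribution_mode", "easy",
    "--timesteps_per_proc", "25000000",
]


def build_train_command(env_name, rand_seed, semantic=True):
    """Return the argv list for one training run (hypothetical helper).

    semantic=True selects train_sppo (semantic clustering model);
    semantic=False selects train_ppo (baseline).
    """
    module = "train_procgen.train_sppo" if semantic else "train_procgen.train_ppo"
    return [
        "python", "-m", module,
        "--env_name", env_name,
        *COMMON_FLAGS,
        "--rand_seed", str(rand_seed),
    ]


if __name__ == "__main__":
    envs = ["coinrun", "ninja"]  # hypothetical selection of environments
    seeds = [2021, 2022]         # hypothetical seeds
    for env, seed in product(envs, seeds):
        # Print only; pass the list to subprocess.run(...) to launch a run.
        print(shlex.join(build_train_command(env, seed)))
```

Printing the commands first makes it easy to review them before starting long training runs.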
Visualization and Analysis
Performance Analysis
Generate generalization figures for a single game:
cd train_procgen
python single_graph.py --env_name <ENV_NAME>
# Example:
python single_graph.py --env_name coinrun
Semantic Clustering Visualization
Generate embedding space visualizations:
python -m train_procgen.enjoy_sppo --env_name <ENV_NAME> --mode 1
Generate skill demonstration videos:
python -m train_procgen.enjoy_sppo --env_name <ENV_NAME> --mode 0
Interactive cluster exploration:
python -m train_procgen.hover_clusters --env_name <ENV_NAME>
# Example:
python -m train_procgen.hover_clusters --env_name fruitbot
Supported Environments
Our implementation supports four Procgen environments:
- CoinRun
- FruitBot
- Jumper
- Ninja
Semantic Clustering Demonstration
Ninja Environment - 8 Semantic Clusters
The following videos demonstrate the 8 distinct semantic clusters learned by our model in the Ninja environment. Each cluster represents a different behavioral pattern and skill set:
Semantic Cluster Demonstrations
Video clips for Cluster 0 through Cluster 7, one clip per cluster (see the videos/ directory).
Behavior Descriptions (Ninja)
| Cluster | Behavior |
|---|---|
| 0 | The agent starts by walking through the first platform and then performs a high jump to reach a higher ledge. |
| 1 | The agent makes small jumps in the middle of the scene. |
| 2 | Two interpretations are present: (1) the agent starts from the leftmost end of the scene and walks to the starting position of Cluster 0; (2) when there are no higher ledges to jump to, the agent begins from the scene, walks over the first platform, and prepares to jump to the subsequent ledge. |
| 3 | The agent walks on the ledge and prepares to jump to a higher ledge. |
| 4 | After performing a high jump, the agent loses sight of the ledge below. |
| 5 | The agent walks on the ledge and prepares to jump onto a ledge at the same height or lower. |
| 6 | The agent executes a high jump while keeping the ledge below in sight. |
| 7 | The agent moves towards the right edge of the scene and touches the mushroom. |
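The table above assigns agent behavior to one of eight clusters. As a generic illustration of how states can be grouped this way, the sketch below assigns each embedding vector to its nearest cluster center. This is not the repository's implementation. It assumes that clusters correspond to a set of center vectors, which is suggested only by the `--num_embeddings` flag. The function names and the toy 2-D data are hypothetical.

```python
import math


def nearest_cluster(embedding, centers):
    """Return the index of the center closest to `embedding` (Euclidean distance)."""
    distances = [math.dist(embedding, center) for center in centers]
    return distances.index(min(distances))


def group_by_cluster(embeddings, centers):
    """Map each cluster index to the indices of the embeddings assigned to it."""
    groups = {k: [] for k in range(len(centers))}
    for i, embedding in enumerate(embeddings):
        groups[nearest_cluster(embedding, centers)].append(i)
    return groups


if __name__ == "__main__":
    # Toy data: three hypothetical cluster centers and four state embeddings.
    centers = [(0.0, 0.0), (5.0, 0.0), (0.0, 5.0)]
    embeddings = [(0.2, 0.1), (4.8, 0.3), (0.1, 4.9), (5.1, -0.2)]
    print(group_by_cluster(embeddings, centers))
```

Grouping states by cluster index in this way is what allows one video clip or one behavior description to be collected per cluster.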
Alternative: Generate Your Own Videos
You can also generate these videos yourself using our code:
# Generate Ninja skill cluster videos
python -m train_procgen.enjoy_sppo --env_name ninja --mode 0 --num_embeddings 8
Note: These videos showcase the distinct behavioral patterns learned by our semantic clustering approach. Each cluster demonstrates a different movement pattern and decision-making behavior in the Ninja environment.
Output Structure
baseline/            # Required RL training package
train_procgen/
├── checkpoints/     # Trained model checkpoints
└── figures/         # Generated visualizations and videos
videos/              # Video clips corresponding to the clusters in the paper
Reproducing Results
To reproduce the results from our paper:
- (Optional) Use existing checkpoints: We provide pre-trained checkpoints for Ninja, FruitBot, and Jumper (random seed 2021) in this repository under train_procgen/checkpoints/. You can skip training and run the visualization scripts directly. Otherwise, train models using the commands above.
- Generate visualizations using the provided scripts
- Analyze results using the interactive tools
Note: Video generation may take 30-60 minutes depending on machine performance, as it ensures comprehensive exploration of all clusters.
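The reproduction steps can be sketched as a small planning script: check whether checkpoints are present, then list the commands to run. The sketch rests on a loud assumption: any file found under `train_procgen/checkpoints/` is treated as a usable checkpoint, because the actual checkpoint layout is not described here. The helper names are hypothetical, and the script only returns the command strings documented above without running them.

```python
from pathlib import Path


def has_checkpoints(checkpoint_dir):
    """True if the directory exists and contains at least one file at any depth.

    Assumption: any file counts as a checkpoint; the real layout may differ.
    """
    root = Path(checkpoint_dir)
    return root.is_dir() and any(p.is_file() for p in root.rglob("*"))


def reproduction_steps(env_name, checkpoint_dir="train_procgen/checkpoints"):
    """Return the documented commands, in order, for one environment."""
    steps = []
    if not has_checkpoints(checkpoint_dir):
        # No checkpoints found: training comes first (seed left as a placeholder).
        steps.append(
            f"python -m train_procgen.train_sppo --env_name {env_name} "
            "--num_levels 0 --distribution_mode easy "
            "--timesteps_per_proc 25000000 --rand_seed <RAND_SEED>"
        )
    steps += [
        f"python -m train_procgen.enjoy_sppo --env_name {env_name} --mode 1",  # embedding space
        f"python -m train_procgen.enjoy_sppo --env_name {env_name} --mode 0",  # skill videos
        f"python -m train_procgen.hover_clusters --env_name {env_name}",       # interactive tool
    ]
    return steps


if __name__ == "__main__":
    for step in reproduction_steps("ninja"):
        print(step)
```

Run from the repository root so that the default relative checkpoint path resolves correctly.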
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
This work builds upon the OpenAI train-procgen framework. We thank the original authors for their excellent work on procedural generation for reinforcement learning benchmarking.