ddpt-public

Christian authored Aug 18, 2022

d3874c23

.github/ISSUE_TEMPLATE
convlab2
data_loaders
deploy
docs
plot_continual_learning
plot_results
tests
tutorials
.DS_Store
.codebeatignore
.gitignore
.travis.yml
LICENSE
MANIFEST.in
NOTICE
PULL_REQUEST_TEMPLATE.md
README.md
setup.cfg
setup.py
start_up.sh

Dynamic Dialogue Policy for Continual Reinforcement Learning

This is the code base for the paper Dynamic Dialogue Policy for Continual Reinforcement Learning (https://arxiv.org/abs/2204.05928).

The code is adapted and extended from ConvLab-2 (https://github.com/thu-coai/ConvLab-2).

Installation

Requires Python 3.6.

Run the start_up.sh script to create a virtual environment and install requirements:

cd convlab-2
bash start_up.sh
source venv/bin/activate

Run Continual Reinforcement Learning experiments

We provide three different models for running experiments: DDPT, MLP (Bin), and semantic (Sem). They are located in the folder convlab2/policy/ under vtrace_DPT, vtrace_MLP and vtrace_semantic, respectively.

Each model folder contains scripts for launching training runs in model_folder/run_scripts. With them, you can train models on three different domain orders or with the transformer-based user simulator (TUS), and you can adapt them to run other trainings. Each folder also contains a config.json file, where you can specify continual learning parameters such as the online-offline-ratio. You can also start a continual learning training directly, for instance by running

python convlab2/policy/vtrace_DPT/train_continually.py --use_masking

to start a DDPT training directly on the mixed order.
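If you want to inspect or adjust a model's config.json from Python before launching a run, a minimal sketch could look like the following. It assumes config.json is a flat JSON object; the helper names load_config and update_config are hypothetical and not part of this repository. Since the exact parameter names are defined by each config.json, the sketch refuses to write keys that are not already present.

```python
import json
from pathlib import Path


def load_config(model_folder):
    """Read config.json from a model folder and return it as a dict."""
    config_path = Path(model_folder) / "config.json"
    with config_path.open() as f:
        return json.load(f)


def update_config(model_folder, **overrides):
    """Overwrite existing keys in config.json; reject keys that are not present.

    Hypothetical helper: it assumes the parameters live at the top level
    of the JSON object.
    """
    config_path = Path(model_folder) / "config.json"
    config = load_config(model_folder)
    unknown = sorted(set(overrides) - set(config))
    if unknown:
        raise KeyError(f"keys not found in {config_path}: {unknown}")
    config.update(overrides)
    with config_path.open("w") as f:
        json.dump(config, f, indent=4)
    return config
```

Calling load_config("convlab2/policy/vtrace_DPT") from the repository root and printing its keys is a quick way to see which parameters a model exposes before changing any of them.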

Once training starts, it creates an experiment folder experiment_TIMESTAMP in the model folder with all necessary information. This folder is moved to model_folder/finished_experiments once training is done.
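To check which runs are still in progress and which have finished, the folder layout described above can be scanned with a few lines of Python. This is only a sketch: the function name list_experiments is hypothetical, and it assumes that a folder keeps its experiment_TIMESTAMP name after being moved to finished_experiments.

```python
from pathlib import Path


def list_experiments(model_folder):
    """Return (running, finished) lists of experiment folders, sorted by name."""
    model_folder = Path(model_folder)
    # Folders still sitting directly in the model folder are treated as running.
    running = sorted(p for p in model_folder.glob("experiment_*") if p.is_dir())
    finished_root = model_folder / "finished_experiments"
    finished = []
    if finished_root.is_dir():
        # Assumption: moved folders keep the experiment_ prefix.
        finished = sorted(p for p in finished_root.glob("experiment_*") if p.is_dir())
    return running, finished
```

For example, list_experiments("convlab2/policy/vtrace_DPT") returns two lists of paths; a folder that stays in the first list long after a run has ended may point to a crashed training.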

Evaluation

Evaluation is done by executing the script

python plot_continual_learning/plot_cl.py model1 model2 model3 --dir=experiment_folder

For an example of the expected folder structure, see plot_continual_learning/easy2hard_order_experiments, which already contains experiment folders. To compare Bin, DDPT and Sem in that example, execute

python plot_continual_learning/plot_cl.py Bin Sem DDPT --dir=plot_continual_learning/easy2hard_order_experiments

This creates a folder cl_plots inside easy2hard_order_experiments containing all plotted results. In addition, the model folders Bin, DDPT and Sem will contain Excel sheets with information on forward transfer and forgetting.
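When evaluating several experiment folders, it can help to verify the folder structure before calling plot_cl.py. The sketch below builds the same command line as shown above and fails early if a model folder is missing. The function name build_plot_command is hypothetical, and the check assumes that each model name passed to plot_cl.py corresponds to a subfolder of the --dir folder, as in easy2hard_order_experiments.

```python
import sys
from pathlib import Path


def build_plot_command(experiment_dir, model_names):
    """Build the plot_cl.py command after checking that each model folder exists."""
    experiment_dir = Path(experiment_dir)
    # Assumption: one subfolder per model name inside the experiment folder.
    missing = [name for name in model_names if not (experiment_dir / name).is_dir()]
    if missing:
        raise FileNotFoundError(f"missing model folders in {experiment_dir}: {missing}")
    return [
        sys.executable,
        "plot_continual_learning/plot_cl.py",
        *model_names,
        f"--dir={experiment_dir}",
    ]
```

The returned list can be passed to subprocess.run(cmd, check=True) from the repository root, with the virtual environment activated so that sys.executable points to the right interpreter.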