## *** For readers of "Out-of-Task Training for Dialog State Tracking Models" ***
The first version of the MTL code is available now. `DO.example.mtl` will train a model with MTL using an auxiliary task. As of now, pre-tokenized data is loaded for the auxiliary tasks. The next update will also include tokenization of the original data.
The paper is available here:
https://www.aclweb.org/anthology/2020.coling-main.596
https://arxiv.org/abs/2011.09379
## Introduction
TripPy is a new approach to dialogue state tracking (DST) which makes use of various copy mechanisms to fill slots with values. Our model has no need to maintain a list of candidate values. Instead, all values are extracted from the dialog context on-the-fly.
Our approach combines the advantages of span-based slot filling methods with memory methods.
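To make the copy mechanisms concrete, here is a minimal, hypothetical sketch of how a predicted per-slot gate could route between value sources. The gate labels, the function name, and the character-level spans are illustrative simplifications (the model predicts token-level spans), not the repository's actual API:
```python
def resolve_slot_value(gate_class, dialog_context, span=None,
                       informed_value=None, dialog_state=None, referred_slot=None):
    """Illustrative sketch: fill one slot according to its predicted gate class."""
    if gate_class == "span":
        # Copy 1: extract the value directly from the dialog context
        # (character indices here for simplicity; the model uses token spans).
        start, end = span
        return dialog_context[start:end]
    if gate_class == "inform":
        # Copy 2: reuse a value that the system itself offered earlier.
        return informed_value
    if gate_class == "refer":
        # Copy 3: resolve a coreference to a slot already in the dialog state.
        return dialog_state[referred_slot]
    if gate_class == "dontcare":
        return "dontcare"
    return None  # "none": the slot stays unfilled in this turn

print(resolve_slot_value("span", "i need a cheap hotel", span=(9, 14)))  # cheap
```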
## How to run
Several example scripts are provided for how to use TripPy. `DO.example.simple` will train and evaluate a simpler model, whereas `DO.example.advanced` uses the parameters that will result in performance similar to the reported ones. `DO.example.recommended` uses RoBERTa as encoder and the currently recommended set of hyperparameters. For more challenging datasets with longer dialogues, better performance may be achieved by using the maximum sequence length of 512.
`DO.example.mtl` will train a model with multi-task learning (MTL) using an auxiliary task (see our paper "Out-of-Task Training for Dialog State Tracking Models" for details).
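For intuition, the following is a minimal, hypothetical sketch of how such a multi-task objective can combine the main DST loss with an auxiliary-task loss on a shared encoder. The weighting scheme and the name `aux_weight` are assumptions, not the script's actual implementation:
```python
import torch

# Hypothetical sketch of a multi-task training objective: the encoder is shared
# between the main DST task and an auxiliary task, and their losses are combined.
def mtl_loss(dst_loss: torch.Tensor, aux_loss: torch.Tensor,
             aux_weight: float = 1.0) -> torch.Tensor:
    # The actual weighting and scheduling used by DO.example.mtl may differ.
    return dst_loss + aux_weight * aux_loss
```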
## Datasets
Supported datasets are:
- sim-M (https://github.com/google-research-datasets/simulated-dialogue.git)
- sim-R (https://github.com/google-research-datasets/simulated-dialogue.git)
- WOZ 2.0 (see data/)
- MultiWOZ 2.0 (https://github.com/budzianowski/multiwoz.git)
- MultiWOZ 2.1 (see data/, https://github.com/budzianowski/multiwoz.git)
- MultiWOZ 2.2 (https://github.com/budzianowski/multiwoz.git)
- MultiWOZ 2.3 (https://github.com/lexmen318/MultiWOZ-coref.git)
- MultiWOZ 2.4 (https://github.com/smartyfh/MultiWOZ2.4.git)
With a sequence length of 180, you should expect the following average JGA:
- 56% for MultiWOZ 2.1
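JGA is joint goal accuracy: a turn counts as correct only if the entire predicted dialog state matches the gold state. A minimal sketch, assuming states are represented as dicts mapping slot names to values:
```python
def joint_goal_accuracy(predictions, golds):
    """Fraction of turns whose predicted dialog state matches the gold state exactly.

    Both arguments are lists of dicts mapping slot names to values,
    one dict per dialogue turn.
    """
    correct = sum(pred == gold for pred, gold in zip(predictions, golds))
    return correct / len(golds)

# Example: the first turn matches on all slots, the second does not.
preds = [{"hotel-price": "cheap"}, {"hotel-price": "cheap", "hotel-area": "north"}]
golds = [{"hotel-price": "cheap"}, {"hotel-price": "cheap", "hotel-area": "south"}]
print(joint_goal_accuracy(preds, golds))  # 0.5
```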
If you use TripPy in your own work, please cite our work as follows:
```
@inproceedings{heck2020trippy,
    title = "{T}rip{P}y: A Triple Copy Strategy for Value Independent Neural Dialog State Tracking",
    author = "Heck, Michael and van Niekerk, Carel and Lubis, Nurul and Geishauser, Christian and
      Lin, Hsien-chin and Moresi, Marco and Ga{\v{s}}i{\'c}, Milica",
    booktitle = "Proceedings of the 21st Annual Meeting of the Special Interest Group on Discourse and Dialogue",
    month = jul,
    year = "2020",
    address = "1st virtual meeting",
    publisher = "Association for Computational Linguistics",
    pages = "35--44",
}
```
This repository also contains the code of our paper ["Out-of-Task Training for Dialog State Tracking Models"](https://www.aclweb.org/anthology/2020.coling-main.596).
If you use TripPy for MTL, please cite our work as follows:
```
@inproceedings{heck2020task,
title = "Out-of-Task Training for Dialog State Tracking Models",
author = "Heck, Michael and Geishauser, Christian and Lin, Hsien-chin and Lubis, Nurul and
Moresi, Marco and van Niekerk, Carel and Ga{\v{s}}i{\'c}, Milica",
booktitle = "Proceedings of the 28th International Conference on Computational Linguistics",
month = dec,
year = "2020",
address = "Barcelona, Spain (Online)",
publisher = "International Committee on Computational Linguistics",
pages = "6767--6774",
}
```