## *** For readers of "Out-of-Task Training for Dialog State Tracking Models" ***
The first version of the MTL code is available now. `DO.example.mtl` will train a model with MTL using an auxiliary task. As of now, pre-tokenized data is loaded for the auxiliary tasks. The next update will also include tokenization of the original data.
TripPy is a new approach to dialogue state tracking (DST) which makes use of various copy mechanisms to fill slots with values. Our model has no need to maintain a list of candidate values. Instead, all values are extracted from the dialog context on-the-fly.
...
## How to run
Example scripts are provided to show how to use TripPy. `DO.example.simple` will train and evaluate a simple model, whereas `DO.example.advanced` uses the parameters that should result in performance similar to the reported numbers. `DO.example.recommended` uses RoBERTa as the encoder and the currently recommended set of hyperparameters. For more challenging datasets with longer dialogues, better performance may be achieved by using the maximum sequence length of 512.
`DO.example.mtl` will train a model with multi-task learning (MTL) using an auxiliary task (see our paper "Out-of-Task Training for Dialog State Tracking Models" for details).
With a sequence length of 180, you should expect the following average JGA:
- 56% for MultiWOZ 2.1
...
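As a reference for interpreting these numbers, joint goal accuracy (JGA) counts a dialogue turn as correct only if the predicted state matches the gold state on every slot. The sketch below is illustrative only; the slot names and the `joint_goal_accuracy` helper are hypothetical and not part of this repository's evaluation code.

```python
def joint_goal_accuracy(predictions, references):
    """Fraction of turns whose predicted dialogue state matches the gold
    state exactly across all slots. Each state is a dict: slot -> value."""
    assert len(predictions) == len(references)
    if not predictions:
        return 0.0
    correct = sum(1 for pred, gold in zip(predictions, references) if pred == gold)
    return correct / len(predictions)

# Illustrative MultiWOZ-style states (slot names are made up for this example).
gold = [
    {"hotel-area": "centre", "hotel-stars": "4"},
    {"hotel-area": "centre", "hotel-stars": "4", "hotel-parking": "yes"},
]
pred = [
    {"hotel-area": "centre", "hotel-stars": "4"},
    {"hotel-area": "centre", "hotel-stars": "3", "hotel-parking": "yes"},
]
print(joint_goal_accuracy(pred, gold))  # -> 0.5 (second turn has a wrong value)
```

A single wrong slot value invalidates the whole turn, which is why JGA is a strict metric and numbers in the mid-50s on MultiWOZ 2.1 are strong.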
If you use TripPy in your own work, please cite our work as follows:
...
}
```
This repository also contains the code for our paper ["Out-of-Task Training for Dialog State Tracking Models"](https://www.aclweb.org/anthology/2020.coling-main.596).
If you use TripPy for MTL, please cite our work as follows:
```
@inproceedings{heck2020task,
title = "Out-of-Task Training for Dialog State Tracking Models",
author = "Heck, Michael and Geishauser, Christian and Lin, Hsien-chin and Lubis, Nurul and
Moresi, Marco and van Niekerk, Carel and Ga{\v{s}}i{\'c}, Milica",
booktitle = "Proceedings of the 28th International Conference on Computational Linguistics",
month = dec,
year = "2020",
address = "Barcelona, Spain (Online)",
publisher = "International Committee on Computational Linguistics",