Commit 1f40d3d7 authored by zqwerty
train bertnlu and t5nlu on all utterances; train t5nlg on system and all utterances

parent f66c0935
@@ -43,10 +43,12 @@ Trained models and their performance are available in [Hugging Face Hub](https:/
| ------------------------------------------------------------ | ------------- | ---------------------------- |
| [t5-small-goal2dialogue-multiwoz21](https://huggingface.co/ConvLab/t5-small-goal2dialogue-multiwoz21) | Goal2Dialogue | MultiWOZ 2.1 |
| [t5-small-nlu-multiwoz21](https://huggingface.co/ConvLab/t5-small-nlu-multiwoz21) | NLU | MultiWOZ 2.1 |
| [t5-small-nlu-all-multiwoz21](https://huggingface.co/ConvLab/t5-small-nlu-all-multiwoz21) | NLU | MultiWOZ 2.1 all utterances |
| [t5-small-nlu-sgd](https://huggingface.co/ConvLab/t5-small-nlu-sgd) | NLU | SGD |
| [t5-small-nlu-tm1_tm2_tm3](https://huggingface.co/ConvLab/t5-small-nlu-tm1_tm2_tm3) | NLU | TM1+TM2+TM3 |
| [t5-small-nlu-multiwoz21_sgd_tm1_tm2_tm3](https://huggingface.co/ConvLab/t5-small-nlu-multiwoz21_sgd_tm1_tm2_tm3) | NLU | MultiWOZ 2.1+SGD+TM1+TM2+TM3 |
| [t5-small-nlu-multiwoz21-context3](https://huggingface.co/ConvLab/t5-small-nlu-multiwoz21-context3) | NLU (context=3) | MultiWOZ 2.1 |
| [t5-small-nlu-all-multiwoz21-context3](https://huggingface.co/ConvLab/t5-small-nlu-all-multiwoz21-context3) | NLU (context=3) | MultiWOZ 2.1 all utterances |
| [t5-small-nlu-tm1-context3](https://huggingface.co/ConvLab/t5-small-nlu-tm1-context3) | NLU (context=3) | TM1 |
| [t5-small-nlu-tm2-context3](https://huggingface.co/ConvLab/t5-small-nlu-tm2-context3) | NLU (context=3) | TM2 |
| [t5-small-nlu-tm3-context3](https://huggingface.co/ConvLab/t5-small-nlu-tm3-context3) | NLU (context=3) | TM3 |
@@ -55,6 +57,8 @@ Trained models and their performance are available in [Hugging Face Hub](https:/
| [t5-small-dst-tm1_tm2_tm3](https://huggingface.co/ConvLab/t5-small-dst-tm1_tm2_tm3) | DST | TM1+TM2+TM3 |
| [t5-small-dst-multiwoz21_sgd_tm1_tm2_tm3](https://huggingface.co/ConvLab/t5-small-dst-multiwoz21_sgd_tm1_tm2_tm3) | DST | MultiWOZ 2.1+SGD+TM1+TM2+TM3 |
| [t5-small-nlg-multiwoz21](https://huggingface.co/ConvLab/t5-small-nlg-multiwoz21) | NLG | MultiWOZ 2.1 |
| [t5-small-nlg-user-multiwoz21](https://huggingface.co/ConvLab/t5-small-nlg-user-multiwoz21) | NLG | MultiWOZ 2.1 user utterances |
| [t5-small-nlg-all-multiwoz21](https://huggingface.co/ConvLab/t5-small-nlg-all-multiwoz21) | NLG | MultiWOZ 2.1 all utterances |
| [t5-small-nlg-sgd](https://huggingface.co/ConvLab/t5-small-nlg-sgd) | NLG | SGD |
| [t5-small-nlg-tm1_tm2_tm3](https://huggingface.co/ConvLab/t5-small-nlg-tm1_tm2_tm3) | NLG | TM1+TM2+TM3 |
| [t5-small-nlg-multiwoz21_sgd_tm1_tm2_tm3](https://huggingface.co/ConvLab/t5-small-nlg-multiwoz21_sgd_tm1_tm2_tm3) | NLG | MultiWOZ 2.1+SGD+TM1+TM2+TM3 |
......
@@ -40,6 +40,7 @@ To illustrate that it is easy to use the model for any dataset that in our unifi
<tr>
<th></th>
<th colspan=2>MultiWOZ 2.1</th>
<th colspan=2>MultiWOZ 2.1 all utterances</th>
<th colspan=2>Taskmaster-1</th>
<th colspan=2>Taskmaster-2</th>
<th colspan=2>Taskmaster-3</th>
@@ -52,12 +53,14 @@ To illustrate that it is easy to use the model for any dataset that in our unifi
<th>Acc</th><th>F1</th>
<th>Acc</th><th>F1</th>
<th>Acc</th><th>F1</th>
<th>Acc</th><th>F1</th>
</tr>
</thead>
<tbody>
<tr>
<td>BERTNLU</td>
<td>74.5</td><td>85.9</td>
<td>59.5</td><td>80.0</td>
<td>72.8</td><td>50.6</td>
<td>79.2</td><td>70.6</td>
<td>86.1</td><td>81.9</td>
@@ -65,6 +68,7 @@ To illustrate that it is easy to use the model for any dataset that in our unifi
<tr>
<td>BERTNLU (context=3)</td>
<td>80.6</td><td>90.3</td>
<td>58.1</td><td>79.6</td>
<td>74.2</td><td>52.7</td>
<td>80.9</td><td>73.3</td>
<td>87.8</td><td>83.8</td>
......
{
"dataset_name": "multiwoz21",
"data_dir": "unified_datasets/data/multiwoz21/all/context_window_size_0",
"output_dir": "unified_datasets/output/multiwoz21/all/context_window_size_0",
"zipped_model_path": "unified_datasets/output/multiwoz21/all/context_window_size_0/bertnlu_unified_multiwoz21_all_context0.zip",
"log_dir": "unified_datasets/output/multiwoz21/all/context_window_size_0/log",
"DEVICE": "cuda:0",
"seed": 2019,
"cut_sen_len": 40,
"use_bert_tokenizer": true,
"context_window_size": 0,
"model": {
"finetune": true,
"context": false,
"context_grad": false,
"pretrained_weights": "bert-base-uncased",
"check_step": 1000,
"max_step": 10000,
"batch_size": 128,
"learning_rate": 1e-4,
"adam_epsilon": 1e-8,
"warmup_steps": 0,
"weight_decay": 0.0,
"dropout": 0.1,
"hidden_units": 768
}
}
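The config above is the BERTNLU setup for training on all MultiWOZ 2.1 utterances without context. A minimal sketch of how a training script might read and sanity-check such a config file is shown below; `load_nlu_config` is a hypothetical helper, not ConvLab's actual API, and the embedded JSON is a trimmed copy of the fields in the config above.

```python
import json

def load_nlu_config(text: str) -> dict:
    """Parse a BERTNLU-style JSON config and check fields the trainer relies on."""
    config = json.loads(text)
    # A context window of 0 means only the current utterance is encoded.
    assert config["context_window_size"] >= 0
    # hidden_units sizes the classification head on top of BERT.
    assert config["model"]["hidden_units"] > 0
    return config

# Trimmed version of the context_window_size_0 config shown above.
example = """
{
  "dataset_name": "multiwoz21",
  "context_window_size": 0,
  "model": {
    "context": false,
    "pretrained_weights": "bert-base-uncased",
    "hidden_units": 768
  }
}
"""

cfg = load_nlu_config(example)
print(cfg["dataset_name"], cfg["model"]["hidden_units"])
```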
{
"dataset_name": "multiwoz21",
"data_dir": "unified_datasets/data/multiwoz21/all/context_window_size_3",
"output_dir": "unified_datasets/output/multiwoz21/all/context_window_size_3",
"zipped_model_path": "unified_datasets/output/multiwoz21/all/context_window_size_3/bertnlu_unified_multiwoz21_all_context3.zip",
"log_dir": "unified_datasets/output/multiwoz21/all/context_window_size_3/log",
"DEVICE": "cuda:0",
"seed": 2019,
"cut_sen_len": 40,
"use_bert_tokenizer": true,
"context_window_size": 3,
"model": {
"finetune": true,
"context": true,
"context_grad": true,
"pretrained_weights": "bert-base-uncased",
"check_step": 1000,
"max_step": 10000,
"batch_size": 128,
"learning_rate": 1e-4,
"adam_epsilon": 1e-8,
"warmup_steps": 0,
"weight_decay": 0.0,
"dropout": 0.1,
"hidden_units": 1536
}
}
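Comparing the two configs, the substantive differences are `context_window_size` (0 vs 3), the `context`/`context_grad` flags, and `hidden_units` (768 vs 1536). The doubling appears consistent with the context-aware variant concatenating a context encoding onto each 768-dim BERT-base token representation, but that rationale is an inference from these two files, not documented behavior. A sketch of that apparent rule:

```python
# Hidden size of bert-base-uncased, as used by both configs above.
BERT_BASE_HIDDEN = 768

def expected_hidden_units(use_context: bool) -> int:
    """Inferred rule: context models double the head's input width,
    matching a [token; context] concatenation. This is an assumption
    drawn from the two configs, not a documented ConvLab invariant."""
    return BERT_BASE_HIDDEN * (2 if use_context else 1)

print(expected_hidden_units(False))  # context_window_size_0 config
print(expected_hidden_units(True))   # context_window_size_3 config
```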