    Dataset Card for DailyDialog

    Dataset Summary

    DailyDialog is a high-quality multi-turn dialog dataset. The language is human-written and less noisy. The dialogues reflect the way people communicate in daily life and cover a variety of everyday topics. The dataset is also manually labeled with communication intention and emotion information.

    • How to get the transformed data from original data:
    • Main changes of the transformation:
      • Use topic annotation as domain. If duplicated dialogs are annotated with different topics, use the most frequent one.
      • Combine intent and domain annotation as binary dialogue acts.
    • Annotations:
      • intent, emotion
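
The two transformation rules above can be sketched as follows. This is a minimal illustration, not the actual preprocessing script: the function names `resolve_topics` and `make_binary_act`, the `(dialog_text, topic)` input shape, and the empty `slot` field in the resulting act are assumptions made for the example.

```python
from collections import Counter


def resolve_topics(dialogs):
    """Map each distinct dialog text to its most frequent topic.

    `dialogs` is an iterable of (dialog_text, topic) pairs (hypothetical
    input shape). Duplicated dialogs may carry different topics; the
    most frequent topic is kept as the domain.
    """
    counts = {}
    for text, topic in dialogs:
        counts.setdefault(text, Counter())[topic] += 1
    return {text: c.most_common(1)[0][0] for text, c in counts.items()}


def make_binary_act(intent, domain):
    """Combine an intent and a domain into one binary dialogue act.

    The dict layout (intent, domain, empty slot) is an assumed
    representation, used here only for illustration.
    """
    return {"intent": intent, "domain": domain, "slot": ""}


dialogs = [
    ("hi . how are you ?", "Ordinary Life"),
    ("hi . how are you ?", "Relationship"),
    ("hi . how are you ?", "Relationship"),
    ("i need a loan .", "Finance"),
]
topics = resolve_topics(dialogs)
act = make_binary_act("inform", topics["i need a loan ."])
```

In this sketch the duplicated dialog is assigned 'Relationship' because that topic appears twice among its copies.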

    Supported Tasks and Leaderboards

    NLU, NLG

    Languages

    English

    Data Splits

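    | split | dialogues | utterances | avg_utt | avg_tokens | avg_domains | cat slot match(state) | cat slot match(goal) | cat slot match(dialogue act) | non-cat slot span(dialogue act) |
    |------------|-------|--------|------|-------|---|---|---|---|---|
    | train      | 11118 | 87170  | 7.84 | 13.61 | 1 | - | - | - | - |
    | validation | 1000  | 8069   | 8.07 | 13.5  | 1 | - | - | - | - |
    | test       | 1000  | 7740   | 7.74 | 13.78 | 1 | - | - | - | - |
    | all        | 13118 | 102979 | 7.85 | 13.61 | 1 | - | - | - | - |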
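
As a quick consistency check, the `avg_utt` column can be recomputed from the other two columns. The sketch below assumes `avg_utt` means utterances per dialogue, rounded to two decimals; the card does not state this definition explicitly. The numbers are copied from the table.

```python
# (dialogues, utterances) per split, copied from the table above.
splits = {
    "train": (11118, 87170),
    "validation": (1000, 8069),
    "test": (1000, 7740),
}


def avg_utt(utterances, dialogues):
    """Average utterances per dialogue, rounded to two decimals (assumed definition)."""
    return round(utterances / dialogues, 2)


# Totals for the "all" row are the sums over the three splits.
all_dialogues = sum(d for d, _ in splits.values())
all_utterances = sum(u for _, u in splits.values())

per_split = {name: avg_utt(u, d) for name, (d, u) in splits.items()}
overall = avg_utt(all_utterances, all_dialogues)
```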

    10 domains: ['Ordinary Life', 'School Life', 'Culture & Education', 'Attitude & Emotion', 'Relationship', 'Tourism', 'Health', 'Work', 'Politics', 'Finance']

    • cat slot match: the percentage of categorical slot values that appear among the possible values listed in the ontology.
    • non-cat slot span: the percentage of non-categorical slot values that have a span annotation.
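
The two statistics defined above can be sketched as simple percentages. This is only an illustration of the definitions: the function names, the `(slot, value)` input shape, and the `start`/`end` keys used to detect a span are assumptions. Returning `None` for empty input presumably corresponds to the `-` entries in the table, since this dataset carries no slot annotations.

```python
def cat_slot_match(values, ontology_values):
    """Percentage of categorical slot values found in the ontology.

    `values` is a list of (slot, value) pairs and `ontology_values` maps
    each slot to its set of possible values (both hypothetical shapes).
    """
    if not values:
        return None  # nothing to measure
    hits = sum(1 for slot, value in values
               if value in ontology_values.get(slot, ()))
    return 100.0 * hits / len(values)


def non_cat_slot_span(annotations):
    """Percentage of non-categorical slot values that carry a span.

    Each annotation is a dict; a span is assumed to be present when
    both "start" and "end" keys exist.
    """
    if not annotations:
        return None  # nothing to measure
    with_span = sum(1 for a in annotations if "start" in a and "end" in a)
    return 100.0 * with_span / len(annotations)


match = cat_slot_match(
    [("area", "north"), ("area", "nowhere")],
    {"area": {"north", "south"}},
)
span = non_cat_slot_span(
    [{"value": "7pm", "start": 3, "end": 6}, {"value": "pizza"}]
)
```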

    Citation

    @InProceedings{li2017dailydialog,
        author = {Li, Yanran and Su, Hui and Shen, Xiaoyu and Li, Wenjie and Cao, Ziqiang and Niu, Shuzi},
        title = {DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset},
        booktitle = {Proceedings of The 8th International Joint Conference on Natural Language Processing (IJCNLP 2017)},
        year = {2017}
    }

    Licensing Information

    CC BY-NC-SA 4.0