  • Dialogue Term Extraction using Transfer Learning and Topological Data Analysis

    This repository contains the code for the paper "Dialogue Term Extraction using Transfer Learning and Topological Data Analysis".

    1. Data

    We use the Multi-WOZ 2.1 dataset and the Schema-Guided Dialogue (SGD) dataset. The preprocessed versions we use can be found in the data folder.

    2. Requirements

    Install requirements using:

    pip install -r requirements.txt

    3. BIO-tagging data

    All files needed for training and testing are provided in the data folder.
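
For readers unfamiliar with the tagging scheme, the following is a minimal sketch of BIO tagging in general, not the repository's file format. The helper `bio_tags`, the example utterance and the term annotation are all hypothetical: `B` marks the first token of a term, `I` marks a token that continues a term, and `O` marks every other token.

```python
def bio_tags(tokens, term_spans):
    """Return one BIO tag per token.

    term_spans holds (start, end) token indices with end exclusive.
    """
    tags = ["O"] * len(tokens)
    for start, end in term_spans:
        tags[start] = "B"                  # first token of the term
        for i in range(start + 1, end):
            tags[i] = "I"                  # remaining tokens of the term
    return tags


# Hypothetical utterance and annotation: "cheap" and "city centre" are terms.
tokens = "i need a cheap hotel in the city centre".split()
spans = [(3, 4), (7, 9)]
tags = bio_tags(tokens, spans)

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

A multi-token term such as "city centre" becomes `B I`, so a model that predicts one tag per token can still recover whole terms.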

    4. MLM scores

    Compute the MLM scores by running:

    python data/prep/get_MLM_scores.py --dataset ["multiwoz"|"SGD"]
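
As a rough illustration of what a masked-language-model score can look like, the sketch below assumes the score of a token is its negative log-probability under the model when that token is masked. This definition, the function names and the toy logits are assumptions made for illustration; get_MLM_scores.py may compute the score differently.

```python
import math


def masked_token_log_prob(logits, target_id):
    """Log-probability of target_id under softmax(logits)."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return logits[target_id] - log_z


def mlm_scores(logits_per_position, token_ids):
    """Negative log-probability of each token at its own masked position."""
    return [
        -masked_token_log_prob(logits, token_id)
        for logits, token_id in zip(logits_per_position, token_ids)
    ]


# Toy vocabulary of size 3 with made-up logits (no real model involved).
logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
token_ids = [0, 2]
scores = mlm_scores(logits, token_ids)
print(scores)
```

Under this assumed definition, a token the model predicts confidently gets a low score and a token it cannot predict from context gets a high one.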

    5. Topological features

    See the tda directory for instructions on how to generate the word vector embeddings, neighborhoods and persistence features.

    6. Models

    The model scripts for the MLM scores and the three TDA features (persistence image vectors, codensity and Wasserstein norm) are in the models directory.
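
To give an intuition for two of these features, here is a minimal sketch under stated assumptions, not the code in the tda or models directories. It assumes the Wasserstein norm is the p-Wasserstein distance from a persistence diagram to the empty diagram with the L-infinity ground metric, and that codensity is the distance to the k-th nearest neighbour. Conventions differ between libraries, and the function names and toy inputs are hypothetical. Computing the persistence diagrams themselves, and the persistence images, is left out.

```python
import math


def wasserstein_norm(diagram, p=2):
    """p-Wasserstein distance from a persistence diagram to the empty diagram.

    Assumes the L-infinity ground metric, under which a point (birth, death)
    lies (death - birth) / 2 away from the diagonal.
    """
    return sum(((death - birth) / 2) ** p for birth, death in diagram) ** (1 / p)


def codensity(points, index, k):
    """Distance from points[index] to its k-th nearest neighbour."""
    distances = sorted(
        math.dist(points[index], other)
        for j, other in enumerate(points)
        if j != index
    )
    return distances[k - 1]


# Toy persistence diagram: (birth, death) pairs.
diagram = [(0.0, 6.0), (1.0, 9.0)]
print(wasserstein_norm(diagram))

# Toy point cloud standing in for a neighbourhood of word vectors.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0)]
print(codensity(points, index=0, k=1))
print(codensity(points, index=0, k=2))
```

In this reading, a large Wasserstein norm means the neighbourhood has long-lived topological features, and a large codensity means the point sits in a sparse region of the embedding space.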

    7. Training

    The training script supports training on either Multi-WOZ or SGD. Run it with:

    python -m training.training_script --train_on ["multiwoz"|"SGD"]

    8. Evaluation

    Compute the predictions for each model using the get_tags scripts in the evaluation directory, then evaluate them using the evaluation script or the evaluation notebook.

    python -m evaluation.prediction_script --model_trained_on ["multiwoz"|"SGD"] --predictions_on ["multiwoz"|"SGD"]

    See the README for the evaluation/evalscript for instructions on how to run it.
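
The two flags of the prediction command can be combined freely, which gives two in-domain runs and two cross-dataset runs. The sketch below only builds and prints the four command lines; the helper `prediction_commands` is hypothetical, and only the module name and flags are taken from the command above.

```python
import itertools
import shlex

DATASETS = ("multiwoz", "SGD")


def prediction_commands():
    """Build one prediction command per (training set, prediction set) pair."""
    return [
        [
            "python", "-m", "evaluation.prediction_script",
            "--model_trained_on", trained_on,
            "--predictions_on", predicted_on,
        ]
        for trained_on, predicted_on in itertools.product(DATASETS, repeat=2)
    ]


commands = prediction_commands()
for command in commands:
    print(shlex.join(command))
```

Each list can be passed to `subprocess.run(command, check=True)`, assuming the call is made from the repository root so that the `evaluation` package can be resolved.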

    License

    This project is licensed under the Apache License, Version 2.0 (the "License"); you may not use the files except in compliance with the License. You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0