Added saved tSNE projection; edits to TDA jupyter notebook; added example logfile · 612fc8bb
Benjamin Ruppik authored 3 years ago

TDA code for Dialogue Term Extraction using Transfer Learning and Topological Data Analysis

This is the Topological Data Analysis portion of the code for the paper 'Dialogue Term Extraction using Transfer Learning and Topological Data Analysis'.

The scripts in this folder should be executed in the tda working directory.

Create embeddings

Precomputed SBERT embeddings for the ambient fastText vocabulary and for the joint MultiWOZ and SGD vocabulary are contained in the /data folder. These embeddings are the basis for computing neighborhoods. It is not necessary to recompute them; to go straight to neighborhood extraction and TDA features, skip ahead to the next section.

The following command loads the precomputed embeddings of the fastText vocabulary into an interactive python session:

python -i sbert_create_static_embeddings.py \
  --embeddings_config_path ./sbert_static_embeddings_config_50_0.yaml \
  --vocab_desc pretrained_cc_en \
  --load_embeddings

To compute and save embeddings of the joint MultiWOZ and SGD vocabulary:

python sbert_create_static_embeddings.py \
  --embeddings_config_path ./sbert_static_embeddings_config_50_0.yaml \
  --vocab_desc multiwoz_and_sgd \
  --save_embeddings

Build neighborhoods and extract persistence features

The Jupyter notebook sbert_ambient_static_neighborhoods_create_persistence_images.ipynb walks through the installation of the TDA dependencies, the creation of neighborhoods, the computation of persistence features via ripser, and the creation of persistence images. Along the way, the embedding space and the neighborhoods can be visualized via two-dimensional t-SNE projections.
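
As a rough illustration of what "neighborhood" and "persistence feature" mean here, the following standalone Python sketch builds a k-nearest-neighbor neighborhood around one point and computes its 0-dimensional persistence pairs. It is not the notebook's code: the notebook uses ripser, which also computes higher-dimensional features, whereas this sketch uses only the standard library and a union-find over sorted edges. The function names, the toy data, the Euclidean metric, and the value of k are all assumptions made for illustration.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_nearest_neighborhood(points, center_index, k):
    """Return the center point followed by its k nearest neighbors."""
    center = points[center_index]
    others = [p for i, p in enumerate(points) if i != center_index]
    others.sort(key=lambda p: euclidean(center, p))
    return [center] + others[:k]


def h0_persistence(points):
    """0-dimensional Vietoris-Rips persistence pairs (birth, death).

    Every point starts its own component at scale 0. A component dies at
    the length of the edge that merges it into another component. One
    component survives forever and gets death = infinity.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, processed from shortest to longest.
    edges = sorted(
        (euclidean(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    pairs = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j       # merge the two components
            pairs.append((0.0, length))   # one component dies here
    pairs.append((0.0, math.inf))         # the last surviving component
    return pairs


# Hypothetical toy "embeddings" in the plane (real embeddings are high-dimensional).
toy_embeddings = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (10.0, 10.0), (5.0, 0.0)]
neighborhood = k_nearest_neighborhood(toy_embeddings, center_index=0, k=3)
diagram = h0_persistence(neighborhood)
```

For this toy data the neighborhood excludes the far-away point (10, 10), and the finite components die at scales 1, 2 and 4. A persistence image is then a fixed-size rasterization of such a diagram, which makes it usable as a feature vector.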

License

This project is licensed under the Apache License, Version 2.0 (the "License"); you may not use the files except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0