Commit dedad8e7 authored Nov 23, 2022 by Carel van Niekerk
Fix bug in ontology extraction
parent d7ba8e0f
Showing 2 changed files with 10 additions and 4 deletions
convlab/dst/evaluate_unified_datasets.py: 1 addition, 2 deletions
convlab/dst/setsumbt/dataset/utils.py: 9 additions, 2 deletions
convlab/dst/evaluate_unified_datasets.py (+1 −2)
@@ -7,7 +7,6 @@ def evaluate(predict_result):
     metrics = {'TP': 0, 'FP': 0, 'FN': 0}
     acc = []
     for sample in predict_result:
         pred_state = sample['predictions']['state']
         gold_state = sample['state']
...
convlab/dst/setsumbt/dataset/utils.py (+9 −2)
@@ -16,6 +16,7 @@
 """Convlab3 Unified dataset data processing utilities"""
 import numpy
+import pdb
 from convlab.util import load_ontology, load_dst_data, load_nlu_data
 from convlab.dst.setsumbt.dataset.value_maps import VALUE_MAP, DOMAINS_MAP, QUANTITIES, TIME
@@ -68,7 +69,9 @@ def get_values_from_data(dataset: dict, data_split: str = "train") -> dict:
     data = load_dst_data(dataset, data_split='all', speaker='user')
     # Remove test data from the data when building training/validation ontology
-    if data_split in ['train', 'validation']:
+    if data_split == 'train':
+        data = {key: itm for key, itm in data.items() if key == 'train'}
+    elif data_split == 'validation':
         data = {key: itm for key, itm in data.items() if key in ['train', 'validation']}
     value_sets = {}
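The split filtering in this hunk can be sketched in isolation. This is only an illustration: it assumes `data` maps split names to lists of turns, and the helper name `filter_splits` and the toy `data` dict are hypothetical, not part of the repository.

```python
def filter_splits(data: dict, data_split: str) -> dict:
    """Keep only the splits allowed to contribute values to the ontology."""
    if data_split == 'train':
        # The training ontology is built from training data only
        return {key: itm for key, itm in data.items() if key == 'train'}
    if data_split == 'validation':
        # The validation ontology may also draw on validation data
        return {key: itm for key, itm in data.items() if key in ['train', 'validation']}
    # Any other split (for example 'test') keeps every split
    return data


# Toy stand-in for the dict returned by load_dst_data (assumed shape)
data = {'train': ['t1'], 'validation': ['v1'], 'test': ['x1']}
train_only = filter_splits(data, 'train')
train_val = filter_splits(data, 'validation')
```

Before this change a single condition kept both `train` and `validation` for either split, so values seen only in validation dialogues could reach the training ontology.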
@@ -76,13 +79,14 @@ def get_values_from_data(dataset: dict, data_split: str = "train") -> dict:
         for turn in dataset:
             for domain, substate in turn['state'].items():
                 domain_name = DOMAINS_MAP.get(domain, domain.lower())
-                if domain not in value_sets:
+                if domain_name not in value_sets:
                     value_sets[domain_name] = {}
                 for slot, value in substate.items():
                     if slot not in value_sets[domain_name]:
                         value_sets[domain_name][slot] = []
                     if value and value not in value_sets[domain_name][slot]:
                         value_sets[domain_name][slot].append(value)
+    # pdb.set_trace()
     return clean_values(value_sets)
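A small reproduction of why the membership check matters, as a sketch under assumptions: the `DOMAINS_MAP` entry, the toy `turns`, and the `collect` helper with its `check_mapped_name` flag are all hypothetical and exist only to contrast the old and new conditions.

```python
# Hypothetical mapping; the real DOMAINS_MAP is imported from value_maps
DOMAINS_MAP = {'Hotels_1': 'hotel'}

# Two toy turns that mention the same domain with different values
turns = [
    {'state': {'Hotels_1': {'area': 'north'}}},
    {'state': {'Hotels_1': {'area': 'south'}}},
]


def collect(turns: list, check_mapped_name: bool) -> dict:
    """Collect slot values, checking either the mapped or the raw domain name."""
    value_sets = {}
    for turn in turns:
        for domain, substate in turn['state'].items():
            domain_name = DOMAINS_MAP.get(domain, domain.lower())
            # Old behaviour checks the raw name, new behaviour the mapped one
            key = domain_name if check_mapped_name else domain
            if key not in value_sets:
                value_sets[domain_name] = {}
            for slot, value in substate.items():
                if slot not in value_sets[domain_name]:
                    value_sets[domain_name][slot] = []
                if value and value not in value_sets[domain_name][slot]:
                    value_sets[domain_name][slot].append(value)
    return value_sets


buggy = collect(turns, check_mapped_name=False)
fixed = collect(turns, check_mapped_name=True)
```

With the raw-name check, `'Hotels_1'` is never a key of `value_sets`, so the `'hotel'` entry is reset on every turn and only the last value survives. Checking `domain_name` keeps both values.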
@@ -165,6 +169,9 @@ def ontology_add_values(ontology_slots: dict, value_sets: dict, data_split: str
         if data_split in ['train', 'validation']:
             if domain not in value_sets:
                 continue
+            possible_values = [v for slot, vals in value_sets[domain].items() for v in vals]
+            if len(possible_values) == 0:
+                continue
         ontology[domain] = {}
         for slot in sorted(ontology_slots[domain]):
             if not ontology_slots[domain][slot]['possible_values']:
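The added guard skips domains that are present in `value_sets` but carry no values. A minimal sketch of that filter follows; the surrounding loop over `sorted(ontology_slots)`, the helper name `domains_to_keep`, and the toy inputs are assumptions made for illustration.

```python
def domains_to_keep(ontology_slots: dict, value_sets: dict) -> list:
    """Return the domains that have at least one observed slot value."""
    kept = []
    for domain in sorted(ontology_slots):
        if domain not in value_sets:
            continue
        # Flatten every slot's value list into one list for the domain
        possible_values = [v for slot, vals in value_sets[domain].items() for v in vals]
        if len(possible_values) == 0:
            continue
        kept.append(domain)
    return kept


# Toy inputs: 'taxi' has a slot but no values, 'train' is absent from value_sets
ontology_slots = {'hotel': {}, 'taxi': {}, 'train': {}}
value_sets = {'hotel': {'area': ['north']}, 'taxi': {'departure': []}}
kept = domains_to_keep(ontology_slots, value_sets)
```

Here only `hotel` is kept: `train` fails the membership check and `taxi` fails the new emptiness check.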