diff --git a/policy/FeudalGainPolicy.py b/policy/FeudalGainPolicy.py
index 71236cacbb4f06fe9c2ac23f5b79d6b8a65fe189..88f690b855a6012eec006497775a13f2528a6b5c 100644
--- a/policy/FeudalGainPolicy.py
+++ b/policy/FeudalGainPolicy.py
@@ -22,12 +22,15 @@
 
 
 '''
-FeudalGainPolicy.py - What Does The User Want? Information Gain for Hierarchical Dialogue Policy Optimisation
+FeudalGainPolicy.py - Information Gain for FeudalRL policies
 ==================================================
 
-Author: Christian Geishauser
+Copyright 2019-2021 HHU Dialogue Systems and Machine Learning Group
 
 The implementation of the FeudalGain algorithm that incorporates information gain as intrinsic reward in order to update a Feudal policy.
+Information gain is defined as the change in probability distributions between consecutive turns in the belief state. The distribution change is measured using the Jensen-Shannon divergence. FeudalGain builds upon the Feudal Dialogue Management architecture and optimises the information-seeking policy to maximise information gain. If the information-seeking policy for instance requests the area of a restaurant, the information gain reward is calculated by the Jensen-Shannon divergence of the value distributions for area before and after the request.
+
+
 The details can be found here: https://arxiv.org/abs/2109.07129
 
 '''
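
Reviewer note: the new docstring defines the information gain reward as the Jensen-Shannon divergence between a slot's value distributions in consecutive turns. The sketch below illustrates that definition only; it is not taken from FeudalGainPolicy.py, whose body is not part of this diff. The function names, the belief-state layout (a dict mapping a slot name to a list of probabilities over a fixed value ordering) and the choice of log base 2 are all assumptions made for illustration.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in bits. Terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two aligned distributions.

    With log base 2 (an assumption here) the result lies in [0, 1].
    """
    # Mixture distribution; m_i > 0 wherever p_i > 0 or q_i > 0,
    # so the KL terms below never divide by zero.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def information_gain_reward(belief_before, belief_after, slot):
    """Hypothetical helper: reward for requesting `slot`.

    Both belief states are assumed to map the slot to probability
    lists over the same value ordering.
    """
    return js_divergence(belief_before[slot], belief_after[slot])


# Example: the system requests "area" and the belief sharpens.
before = {"area": [0.25, 0.25, 0.25, 0.25]}
after = {"area": [0.70, 0.10, 0.10, 0.10]}
reward = information_gain_reward(before, after, "area")
```

An unchanged distribution yields a reward of zero, so under this reading a request that elicits no new information from the user earns no intrinsic reward.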