In this short tutorial you will install PyDial and walk through its most important parts, so that you can start using and extending the existing architecture.
First, clone the repository to your disk and cd into it:
git clone https://bitbucket.org/dialoguesystems/cued-pydial.git
cd cued-pydial
Now use the requirements.txt file to install the required dependencies via pip. If pip is not available, you can install it using easy_install. It is suggested to install all packages in your user directory by adding the --user option.
sudo easy_install pip
pip install --user -r requirements.txt
Note: PyDial doesn't depend on many packages; however, it is suggested to install it inside a virtual environment (http://docs.python-guide.org/en/latest/dev/virtualenvs/).
Let's start by playing around with the available system using the pydial.py script. This script is a wrapper that enables you to chat with, train and test architectures in PyDial.
First, change the permissions of the pydial.py script to make it executable, and add a symbolic link to it (e.g. pydial) in your local bin directory so that you can run the script from the terminal.
chmod 700 pydial.py
ln -s /path/to/repository/cued-pydial/pydial.py /usr/local/bin/pydial
To have a chat session run:
pydial chat config.cfg
where config.cfg is the configuration file that sets up the system you will talk to. Now you can find your favourite restaurant in Cambridge or book a room in a hotel. An example dialogue is shown below:
INFO :: 16:02:31: root Agent.py <_print_turn>453 : ** Turn 0 **
Sys > hello()
INFO :: 16:02:31: root Agent.py <_print_sys_act>463 : | Sys > hello()
INFO :: 16:02:31: root Agent.py <_agents_semo>512 : Domain with CONTROL: topicmanager
Prompt > Hello, welcome to the Cambridge Multi- Domain dialogue system. How may I help you?
User > Hello I want a cheap restaurant
Turn 1
INFO :: 16:02:42: root Agent.py <_print_turn>453 : ** Turn 1 **
INFO :: 16:02:42: root RuleTopicTrackers.py <infer_domain>144 : TT keyword found in: Hello I want a cheap restaurant
INFO :: 16:02:42: root TopicTracking.py <track_topic>124 : TopicTracker believes we switched domains. From topicmanager to CamRestaurants
INFO :: 16:02:42: root TopicTracking.py <track_topic>129 : After user_act - domain is now: CamRestaurants
INFO :: 16:02:42: root Agent.py <_hand_control>344 : Launching Dialogue Manager for domain: CamRestaurants
INFO :: 16:02:43: root SemI.py <decode>139 : [(u'inform(pricerange=cheap)|hello()|inform(type=restaurant)', 1.0)]
INFO :: 16:02:43: root SemI.py <_add_context_to_user_act>170 : Possibly adding context to user semi hyps: [(u'inform(pricerange=cheap)|hello()|inform(type=restaurant)', 1.0)]
INFO :: 16:02:43: root SemanticBeliefTracker.py <update_belief_state>48 : SemI > [(u'inform(pricerange=cheap)|hello()|inform(type=restaurant)', 1.0)]
Sys > request(area)
INFO :: 16:02:43: root Agent.py <_print_sys_act>463 : | Sys > request(area)
INFO :: 16:02:43: root Agent.py <_agents_semo>512 : Domain with CONTROL: CamRestaurants
Prompt > What part of town do you have in mind?
User > north
The conversation is divided into turns. By default the system starts with the hello() act. The generation module turns this act into a full sentence, which is shown after Prompt. The user then asks for a cheap restaurant, which finishes the first turn. In the log we can trace how the system identifies the appropriate topic via the Topic Tracker and how the user's sentence is transformed into a user act by the Semantic Decoder module:
[(u'inform(pricerange=cheap)|hello()|inform(type=restaurant)', 1.0)]
The system decides to ask for an area of town via a request act, which is transformed into a full sentence by the generation module. The user answers, and the second turn is finished.
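To make the structure of the user act above more concrete, here is a minimal Python sketch that splits such a hypothesis string into act types and slot-value pairs. It is only an illustration of the notation: the function name parse_user_act and the regular expression are assumptions made for this example and are not part of the PyDial API.

```python
import re

# One dialogue act has the form  acttype(slot=value,slot=value)
ACT_PATTERN = re.compile(r"^(?P<type>\w+)\((?P<args>.*)\)$")


def parse_user_act(hypothesis):
    """Split a '|'-separated hypothesis into (act_type, {slot: value}) pairs.

    Hypothetical helper written for this tutorial, not a PyDial function.
    """
    acts = []
    for raw in hypothesis.split("|"):
        match = ACT_PATTERN.match(raw.strip())
        if match is None:
            raise ValueError("not a dialogue act: %r" % raw)
        slots = {}
        args = match.group("args")
        if args:
            for item in args.split(","):
                # A slot without a value, as in request(area), maps to "".
                slot, _, value = item.partition("=")
                slots[slot.strip()] = value.strip()
        acts.append((match.group("type"), slots))
    return acts


# The semantic decoder output from the log: (hypothesis, confidence)
hypothesis, confidence = (
    "inform(pricerange=cheap)|hello()|inform(type=restaurant)", 1.0)
parsed = parse_user_act(hypothesis)
for act_type, slots in parsed:
    print(act_type, slots)
```

Each hypothesis is paired with a confidence score (1.0 here), so a decoder that is less certain could return several such pairs.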
Although playing with the ready-made system is nice, we would like to train our own. We will again use the pydial.py script, which enables us to train, test and analyse the results.
Dialogue may be seen as a control problem: given a distribution over possible belief states, we need to take an action that determines what the system says to the user. We can apply the reinforcement learning framework to this problem and look for the optimal policy. One example of such an algorithm is GP-SARSA (for a detailed explanation read the Policy module tutorial), which we will use now.
We can train our model with the command:
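As a rough intuition for the reinforcement learning view, the sketch below shows a single update of plain tabular SARSA. This is deliberately not GP-SARSA: GP-SARSA models the Q-function with a Gaussian process over belief states, whereas this example uses a lookup table. The state names, the action strings, the reward of -1 per turn and the values of alpha and gamma are all assumptions chosen for illustration.

```python
from collections import defaultdict


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.99):
    """One on-policy SARSA step: move Q(s, a) towards r + gamma * Q(s', a')."""
    td_target = reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Q-table with all values initialised to 0.0 (hypothetical states and actions).
q_table = defaultdict(float)
new_value = sarsa_update(
    q_table,
    state="area_unknown", action="request(area)",
    reward=-1.0,  # assumed per-turn penalty
    next_state="area_known", next_action="inform(name)",
)
print(new_value)
```

The update uses the action that is actually taken in the next state, which is what makes SARSA an on-policy method.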
pydial train config.cfg
Depending on the available computational resources, this might take a while. To test a particular iteration of learning, specify its number after the config file:
pydial test config.cfg
After training, we can analyse how the reward and the success rate improved as more and more dialogues were seen:
pydial plot logdir/*train*
which should look like this:
The GP framework allows for fast training of an effective policy. Thanks to modelling the dependencies in the belief state space, the convergence rate improves substantially: after only 250 dialogues, the policy achieves a 90% success rate with the default kernel parameters.
Setting the option --printtab also tabulates the performance data. All policy information is stored in poldir. Since pydial overrides some config parameters, the actual configs used for each run are recorded in cfgdir.
By default we train 10 batches of dialogues with 100 dialogues per batch. After each batch we evaluate our policy on 100 test dialogues. You can change these values, as well as many others, by editing your configuration file (specifically the [exec_config] section). The default set-up looks as follows:
[exec_config]
domain = CamRestaurants
policykind = gp
policydir = poldir # folder to store policies
configdir = cfgdir # folder to store configs
logfiledir = logdir # folder to store logfiles
numtrainbatches = 2 # num training batches (iterations)
traindialogsperbatch = 10 # num dialogs per batch
numbatchtestdialogs = 100 # num dialogs to eval each batch
trainsourceiteration = 0 # this creates a new policy in n batches where
# n=numtrainbatches, otherwise an existing policy is trained further
testiteration = 1 # policy iteration to test
numtestdialogs = 100 # num dialogs per test
trainerrorrate = 0 # train error rate in %
testerrorrate = 0 # test error rate in %
testeverybatch = True # enable batch testing
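To see how these settings combine, the following sketch reads a subset of the [exec_config] section with Python's standard configparser module and computes how many dialogues a training run would involve. It is a stand-alone illustration written for Python 3, not the way pydial.py itself loads the file; the variable names are assumptions.

```python
import configparser

# A subset of the [exec_config] section shown above.
EXAMPLE_CONFIG = """
[exec_config]
domain = CamRestaurants
policykind = gp
numtrainbatches = 2        # num training batches (iterations)
traindialogsperbatch = 10  # num dialogs per batch
numbatchtestdialogs = 100  # num dialogs to eval each batch
testeverybatch = True      # enable batch testing
"""

# inline_comment_prefixes strips the trailing "# ..." comments from values.
parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
parser.read_string(EXAMPLE_CONFIG)
section = parser["exec_config"]

batches = section.getint("numtrainbatches")
per_batch = section.getint("traindialogsperbatch")
total_training = batches * per_batch

# Batch testing only happens when testeverybatch is enabled.
if section.getboolean("testeverybatch"):
    total_batch_tests = batches * section.getint("numbatchtestdialogs")
else:
    total_batch_tests = 0

print("training dialogues:", total_training)
print("batch test dialogues:", total_batch_tests)
```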
You can also specify how noisy the input will be using trainerrorrate and testerrorrate, or change the directories where policies, configs and logfiles are stored. For convenience, many config parameters can be overridden on the command line, e.g.:
pydial train config.cfg --trainerrorrate=20
pydial test config.cfg 4 --testerrorrate=50
to train a policy at 20% error rate and test the 4th iteration at 50% error rate.
A range of test error rates can be specified as a triple (stErr,enErr,stepSize), e.g.:
pydial test config.cfg 5 --testerrorrate='(0,50,10)'
to test a policy at 0%, 10%, 20%, 30%, 40%, and 50% error rates. Logfiles for each train/test run are stored in logdir. The plot command scans one or more logfiles and extracts the information to plot. Here you can see how the error rate affects the performance of the policy learned in the previous section:
So far we have set up the general configuration needed to train and test our model. However, we can specify much more detailed information related to specific modules. Let's open an illustrative config file and look at specific sections:
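To make the triple notation for error rates concrete, the sketch below expands a (stErr,enErr,stepSize) string into the list of error rates it stands for, with the end value included as in the example above. The function name expand_error_rates is hypothetical, and this is an assumption about how the triple reads, not PyDial's own parsing code.

```python
import ast


def expand_error_rates(spec):
    """Expand '(start,end,step)' into a list of rates; a single number stays as is.

    Hypothetical helper written for this tutorial, not a PyDial function.
    """
    value = ast.literal_eval(spec)
    if isinstance(value, tuple):
        start, end, step = value
        # The end value is inclusive, matching the 0%..50% example.
        return list(range(start, end + 1, step))
    return [value]


rates = expand_error_rates("(0,50,10)")
print(rates)
```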
vi config/simulate_singledomain.cfg
The most important section is [GENERAL], where you specify the domain(s) ($\verb|domains|$), whether you operate in single-domain or multi-domain mode ($\verb|singledomain|$), and the path to the repository when you run on the grid ($\verb|root|$). In the section [agent] you can specify how often the policy is saved ($\verb|savefrequency|$) and the maximum number of turns that the system can take ($\verb|maxturns|$).
In the section [policy_CamRestaurants] you specify the type of belief tracker ($\verb|belieftype|$), the policy type ($\verb|policytype|$), whether the policy should be learnt ($\verb|learning|$), and the paths of the files from which the policy is loaded and to which it is saved ($\verb|inpolicyfile|$, $\verb|outpolicyfile|$).
You can also specify how the evaluation of the dialogue is performed ([eval]), how goals are generated by the user simulator ([goalgenerator]), and more. Further information can be found in config/OPTIONS.cfg.
To see the general help for the script, run:
pydial help
We have prepared several configuration files for testing the various modules of the PyDial library (see the config directory).