update tutorial

786d089a · zqwerty · d7a8f0b6 · 786d089a
Commit 786d089a authored 5 years ago by zqwerty
--- a/tutorials/Getting_Started.ipynb
+++ b/tutorials/Getting_Started.ipynb
@@ -598,7 +598,7 @@
      },
      "source": [
        "set_seed(20200131)\n",
-        "analyzer.compare_model(agent_list=[sys_agent1, sys_agent2], model_name=['sys_agent1', 'sys_agent2'], total_dialog=100)"
+        "analyzer.compare_models(agent_list=[sys_agent1, sys_agent2], model_name=['sys_agent1', 'sys_agent2'], total_dialog=100)"
      ],
      "execution_count": 0,
      "outputs": []

 %% Cell type:markdown id: tags:

 # Getting Started

 In this tutorial, you will learn how to
 - use the models in **ConvLab-2** to build a dialog agent.
 - build a simulator to chat with the agent and evaluate the performance.
 - try different module combinations.
 - use the analysis tool to diagnose your system.

 Let's get started!

 %% Cell type:markdown id: tags:

 ## Environment setup
 Run the command below to install ConvLab-2. Then restart the notebook and skip this command.

 %% Cell type:code id: tags:

 ``` 
 # first install ConvLab-2 and restart the notebook
 ! git clone https://github.com/thu-coai/ConvLab-2.git && cd ConvLab-2 && pip install -e .
 ```

 %% Cell type:markdown id: tags:

 ## Build an agent

 We use models adapted to the [MultiWOZ](https://www.aclweb.org/anthology/D18-1547) dataset to build our agent. This pipeline agent consists of NLU, DST, Policy, and NLG modules.

 First, import some models:

 %% Cell type:code id: tags:

 ``` 
 # common import: convlab2.$module.$model.$dataset
 from convlab2.nlu.jointBERT.multiwoz import BERTNLU
 from convlab2.nlu.milu.multiwoz import MILU
 from convlab2.dst.rule.multiwoz import RuleDST
 from convlab2.policy.rule.multiwoz import RulePolicy
 from convlab2.nlg.template.multiwoz import TemplateNLG
 from convlab2.dialog_agent import PipelineAgent, BiSession
 from convlab2.evaluator.multiwoz_eval import MultiWozEvaluator
 from pprint import pprint
 import random
 import numpy as np
 import torch
 ```

 %% Cell type:markdown id: tags:

 Then, create the models and build an agent:

 %% Cell type:code id: tags:

 ``` 
 # go to README.md of each model for more information
 # BERT nlu
 sys_nlu = BERTNLU()
 # simple rule DST
 sys_dst = RuleDST()
 # rule policy
 sys_policy = RulePolicy()
 # template NLG
 sys_nlg = TemplateNLG(is_user=False)
 # assemble
 sys_agent = PipelineAgent(sys_nlu, sys_dst, sys_policy, sys_nlg, name='sys')
 ```

 %% Cell type:markdown id: tags:

 That's all! Let's chat with the agent using its `response` function:

 %% Cell type:code id: tags:

 ``` 
 sys_agent.response("I want to find a moderate hotel")
 ```

 %% Cell type:code id: tags:

 ``` 
 sys_agent.response("Which type of hotel is it ?")
 ```

 %% Cell type:code id: tags:

 ``` 
 sys_agent.response("OK , where is its address ?")
 ```

 %% Cell type:code id: tags:

 ``` 
 sys_agent.response("Thank you !")
 ```

 %% Cell type:code id: tags:

 ``` 
 sys_agent.response("Try to find me a Chinese restaurant in south area .")
 ```

 %% Cell type:code id: tags:

 ``` 
 sys_agent.response("Which kind of food it provides ?")
 ```

 %% Cell type:code id: tags:

 ``` 
 sys_agent.response("Book a table for 5 , this Sunday .")
 ```

 %% Cell type:markdown id: tags:

 ## Build a simulator to chat with the agent and evaluate

 In many one-to-one task-oriented dialog systems, a simulator is essential for training an RL agent. In our framework, we don't distinguish between user and system: all speakers are **agents**. The simulator is also an agent, with a specific policy inside for accomplishing the user goal.

 We use the `Agenda` policy for the simulator. This policy requires dialog act input, which means we should set the DST argument of `PipelineAgent` to `None`. The `PipelineAgent` will then pass dialog acts to the policy directly. Refer to the `PipelineAgent` documentation for more details.
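
The data flow described above can be illustrated with a toy pipeline. This is only a sketch under stated assumptions: `ToyPipeline`, `toy_nlu`, `toy_dst`, and `toy_policy` are hypothetical stand-ins written for this example and are not ConvLab-2's `PipelineAgent` or its modules. It shows one plausible way an optional DST can be skipped so that the policy receives dialog acts directly.

```python
class ToyPipeline:
    """Hypothetical stand-in for a pipeline agent (not ConvLab-2's PipelineAgent)."""

    def __init__(self, nlu, dst, policy, nlg):
        self.nlu, self.dst, self.policy, self.nlg = nlu, dst, policy, nlg

    def response(self, utterance):
        # NLU: text -> dialog acts (skipped when nlu is None)
        acts = self.nlu(utterance) if self.nlu is not None else utterance
        # DST: dialog acts -> state; when dst is None the acts reach the policy unchanged
        policy_input = self.dst(acts) if self.dst is not None else acts
        out_acts = self.policy(policy_input)
        # NLG: dialog acts -> text (skipped when nlg is None)
        return self.nlg(out_acts) if self.nlg is not None else out_acts


def toy_nlu(text):
    # Made-up rule: detect a single price constraint
    return [["Inform", "Hotel", "Price", "moderate"]] if "moderate" in text else []


def toy_dst(acts):
    # Made-up tracker: wrap the acts in a state dictionary
    return {"belief": acts}


def toy_policy(policy_input):
    # Echo what the policy received so the data flow is visible
    return {"received": policy_input}


without_dst = ToyPipeline(toy_nlu, None, toy_policy, None).response("I want a moderate hotel")
with_dst = ToyPipeline(toy_nlu, toy_dst, toy_policy, None).response("I want a moderate hotel")
print(without_dst)
print(with_dst)
```

With `dst=None` the toy policy sees the raw dialog acts; with a tracker in place it sees the tracker's state instead.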

 %% Cell type:code id: tags:

 ``` 
 # MILU
 user_nlu = MILU()
 # no DST for the simulator: dialog acts go straight to the policy
 user_dst = None
 # agenda policy (RulePolicy with character='usr')
 user_policy = RulePolicy(character='usr')
 # template NLG
 user_nlg = TemplateNLG(is_user=True)
 # assemble
 user_agent = PipelineAgent(user_nlu, user_dst, user_policy, user_nlg, name='user')
 ```

 %% Cell type:markdown id: tags:


 Now we have a simulator and an agent. We will use an existing simple one-to-one conversation controller, `BiSession`. You can also define your own Session class for your specific needs.

 We add `MultiWozEvaluator` to evaluate the performance. It uses the parsed dialog act input and the policy's output dialog act to calculate **inform F1**, **book rate**, and whether the task is a **success**.
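
To make the inform metric concrete, here is a minimal sketch of how precision, recall, and F1 can be computed over sets of slots. It is not the `MultiWozEvaluator` implementation: the function name `precision_recall_f1`, the slot names, and the handling of empty sets are assumptions made for illustration.

```python
def precision_recall_f1(predicted, gold):
    """Set-based precision, recall, and F1 (illustrative; empty sets score 0.0 here)."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up example: slots the user asked for vs. slots the system informed
requested = {"hotel-address", "hotel-phone", "hotel-type"}
informed = {"hotel-address", "hotel-type", "hotel-stars"}

precision, recall, f1 = precision_recall_f1(informed, requested)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Two of the three informed slots were requested and two of the three requested slots were informed, so all three values equal 2/3 in this made-up case.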

 %% Cell type:code id: tags:

 ``` 
 evaluator = MultiWozEvaluator()
 sess = BiSession(sys_agent=sys_agent, user_agent=user_agent, kb_query=None, evaluator=evaluator)
 ```

 %% Cell type:markdown id: tags:

 Let's make these two agents chat! The key is the `next_turn` method of the `BiSession` class.

 %% Cell type:code id: tags:

 ``` 
 def set_seed(r_seed):
    random.seed(r_seed)
    np.random.seed(r_seed)
    torch.manual_seed(r_seed)

 set_seed(20200131)

 sys_response = ''
 sess.init_session()
 print('init goal:')
 pprint(sess.evaluator.goal)
 print('-'*50)
 for i in range(20):
    sys_response, user_response, session_over, reward = sess.next_turn(sys_response)
    print('user:', user_response)
    print('sys:', sys_response)
    print()
    if session_over is True:
        break
 print('task success:', sess.evaluator.task_success())
 print('book rate:', sess.evaluator.book_rate())
 print('inform precision/recall/f1:', sess.evaluator.inform_F1())
 print('-'*50)
 print('final goal:')
 pprint(sess.evaluator.goal)
 print('='*100)
 ```
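
The `set_seed` helper in the cell above seeds `random`, NumPy, and PyTorch, presumably so that the sampled goal and the simulated dialog can be repeated. The sketch below shows the same idea using only the standard library. The function `sample_domains` and the domain list are hypothetical and do not reflect how ConvLab-2 samples goals.

```python
import random

DOMAINS = ["hotel", "restaurant", "train", "taxi", "attraction"]  # illustrative list


def sample_domains(seed, k=5):
    """Re-seed the generator, then draw k domains (hypothetical stand-in for goal sampling)."""
    random.seed(seed)
    return [random.choice(DOMAINS) for _ in range(k)]


first = sample_domains(20200131)
second = sample_domains(20200131)
print(first == second)  # same seed, same draws
```

Calling the helper again with the same seed before each run is what keeps the two draws identical; skipping the re-seed would continue the random stream instead.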

 %% Cell type:markdown id: tags:

 ## Try different module combinations

 The pipeline agent's modules can be combined flexibly. We support joint models such as MDBT, TRADE, and SUMBT for word-DST and MDRG, HDSA, and LaRL for word-Policy, as long as each module's input and output match those of the previous and next modules. We also support End2End models such as Sequicity.

 Available models:

 - NLU: BERTNLU, MILU, SVMNLU
 - DST: RuleDST
 - Word-DST: SUMBT, TRADE, MDBT (set `sys_nlu` to `None`)
 - Policy: RulePolicy, Imitation, REINFORCE, PPO, GDPL
 - Word-Policy: MDRG, HDSA, LaRL (set `sys_nlg` to `None`)
 - NLG: Template, SCLSTM
 - End2End: Sequicity, DAMD, RNN_rollout (directly used as `sys_agent`)
 - Simulator policy: Agenda, VHUS (for `user_policy`)
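
The two constraints noted in the list (a word-DST replaces the NLU, a word-Policy replaces the NLG) can be written down as a small check. This is a sketch only: `check_combination` and the two name sets are hypothetical helpers for this tutorial, models are represented by their names as plain strings, and nothing here is part of ConvLab-2.

```python
WORD_DST = {"SUMBT", "TRADE", "MDBT"}    # joint NLU+DST models from the list above
WORD_POLICY = {"MDRG", "HDSA", "LaRL"}   # joint Policy+NLG models from the list above


def check_combination(nlu, dst, policy, nlg):
    """Return a list of problems for a pipeline given as model names (or None)."""
    problems = []
    if dst in WORD_DST and nlu is not None:
        problems.append(f"{dst} is a word-DST, so nlu should be None")
    if policy in WORD_POLICY and nlg is not None:
        problems.append(f"{policy} is a word-Policy, so nlg should be None")
    return problems


print(check_combination("BERTNLU", "RuleDST", "RulePolicy", "Template"))
print(check_combination("MILU", "TRADE", "LaRL", None))
```

The first call describes a plain four-module pipeline and reports no problems; the second keeps an NLU in front of a word-DST and reports exactly one.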

 %% Cell type:code id: tags:

 ``` 
 # available NLU models
 from convlab2.nlu.svm.multiwoz import SVMNLU
 from convlab2.nlu.jointBERT.multiwoz import BERTNLU
 from convlab2.nlu.milu.multiwoz import MILU
 # available DST models
 from convlab2.dst.rule.multiwoz import RuleDST
 from convlab2.dst.mdbt.multiwoz import MDBT
 from convlab2.dst.sumbt.multiwoz import SUMBT
 from convlab2.dst.trade.multiwoz import TRADE
 # available Policy models
 from convlab2.policy.rule.multiwoz import RulePolicy
 from convlab2.policy.ppo.multiwoz import PPOPolicy
 from convlab2.policy.pg.multiwoz import PGPolicy
 from convlab2.policy.mle.multiwoz import MLEPolicy
 from convlab2.policy.gdpl.multiwoz import GDPLPolicy
 from convlab2.policy.vhus.multiwoz import UserPolicyVHUS
 from convlab2.policy.mdrg.multiwoz import MDRGWordPolicy
 from convlab2.policy.hdsa.multiwoz import HDSA
 from convlab2.policy.larl.multiwoz import LaRL
 # available NLG models
 from convlab2.nlg.template.multiwoz import TemplateNLG
 from convlab2.nlg.sclstm.multiwoz import SCLSTM
 # available E2E models
 from convlab2.e2e.sequicity.multiwoz import Sequicity
 from convlab2.e2e.damd.multiwoz import Damd
 ```

 %% Cell type:markdown id: tags:

 NLU+RuleDST or Word-DST:

 %% Cell type:code id: tags:

 ``` 
 # NLU+RuleDST:
 sys_nlu = BERTNLU()
 # sys_nlu = MILU()
 # sys_nlu = SVMNLU()
 sys_dst = RuleDST()

 # or Word-DST:
 # sys_nlu = None
 # sys_dst = SUMBT()
 # sys_dst = TRADE()
 # sys_dst = MDBT()
 ```

 %% Cell type:markdown id: tags:

 Policy+NLG or Word-Policy:

 %% Cell type:code id: tags:

 ``` 
 # Policy+NLG:
 sys_policy = RulePolicy()
 # sys_policy = PPOPolicy()
 # sys_policy = PGPolicy()
 # sys_policy = MLEPolicy()
 # sys_policy = GDPLPolicy()
 sys_nlg = TemplateNLG(is_user=False)
 # sys_nlg = SCLSTM(is_user=False)

 # or Word-Policy:
 # sys_policy = LaRL()
 # sys_policy = HDSA()
 # sys_policy = MDRGWordPolicy()
 # sys_nlg = None
 ```

 %% Cell type:markdown id: tags:

 Assemble the Pipeline system agent:

 %% Cell type:code id: tags:

 ``` 
 sys_agent = PipelineAgent(sys_nlu, sys_dst, sys_policy, sys_nlg, 'sys')
 ```

 %% Cell type:markdown id: tags:

 Or directly use an end-to-end model:

 %% Cell type:code id: tags:

 ``` 
 # sys_agent = Sequicity()
 # sys_agent = Damd()
 ```

 %% Cell type:markdown id: tags:

 Configure a user agent similarly:

 %% Cell type:code id: tags:

 ``` 
 user_nlu = BERTNLU()
 # user_nlu = MILU()
 # user_nlu = SVMNLU()
 user_dst = None
 user_policy = RulePolicy(character='usr')
 # user_policy = UserPolicyVHUS(load_from_zip=True)
 user_nlg = TemplateNLG(is_user=True)
 # user_nlg = SCLSTM(is_user=True)
 user_agent = PipelineAgent(user_nlu, user_dst, user_policy, user_nlg, name='user')
 ```

 %% Cell type:markdown id: tags:

 ## Use analysis tool to diagnose the system
 We provide an analysis tool that presents rich statistics and summarizes common mistakes from simulated dialogues, which facilitates error analysis and system improvement. The analyzer generates an HTML report containing these statistics. For more information, please refer to `convlab2/util/analysis_tool`.
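
As a rough idea of what summarizing simulated dialogues can look like, the sketch below aggregates a few per-dialog records into a success rate and an average turn count. The records are made up, the `summarize` function is hypothetical, and the statistics the real analyzer reports are not shown in this tutorial, so treat this as an illustration rather than a description of the tool's output.

```python
from statistics import mean

# Made-up per-dialog records, for illustration only
dialogs = [
    {"success": True, "turns": 8},
    {"success": False, "turns": 20},
    {"success": True, "turns": 12},
    {"success": True, "turns": 10},
]


def summarize(records):
    """Aggregate per-dialog records into a few summary statistics."""
    return {
        "dialogs": len(records),
        "success_rate": mean(1.0 if r["success"] else 0.0 for r in records),
        "avg_turns": mean(r["turns"] for r in records),
    }


summary = summarize(dialogs)
print(summary)
```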

 %% Cell type:code id: tags:

 ``` 
 from convlab2.util.analysis_tool.analyzer import Analyzer

 # if sys_nlu!=None, set use_nlu=True to collect more information
 analyzer = Analyzer(user_agent=user_agent, dataset='multiwoz')

 set_seed(20200131)
 analyzer.comprehensive_analyze(sys_agent=sys_agent, model_name='sys_agent', total_dialog=100)
 ```

 %% Cell type:markdown id: tags:

 To compare several models:

 %% Cell type:code id: tags:

 ``` 
 set_seed(20200131)
-analyzer.compare_model(agent_list=[sys_agent1, sys_agent2], model_name=['sys_agent1', 'sys_agent2'], total_dialog=100)
+analyzer.compare_models(agent_list=[sys_agent1, sys_agent2], model_name=['sys_agent1', 'sys_agent2'], total_dialog=100)
 ```