
modeling_utils.py

    update train.py · 758860c5
    aaa123git authored
    * use `GPT2TokenizerFast` by default
    * fix cache
    * fix padding: pad `eos_id` instead of `0`
    * preprocess_batch: remove redundant padding tokens and pad the sequence length to a multiple of 8
    * use native `amp` instead of `apex`
    * fix bug during evaluation when `args.n_gpu > 1`
    * support `torch.optim._multi_tensor.AdamW`
    * support `gradient_checkpointing`
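
    The padding changes listed above can be illustrated with a minimal, framework-free sketch. It is not the code from this commit: the helper name `pad_batch` and its parameters are hypothetical, and it assumes the batch arrives as unpadded lists of token ids. Each sequence is padded with `eos_id` (rather than `0`) only up to the longest sequence in the batch, rounded up to the next multiple of 8.

```python
from typing import List


def pad_batch(sequences: List[List[int]], eos_id: int, multiple: int = 8) -> List[List[int]]:
    """Hypothetical helper: pad every sequence with eos_id to a shared length.

    The shared length is the longest sequence in the batch, rounded up to the
    next multiple of `multiple`, so no padding beyond that is added.
    """
    longest = max(len(seq) for seq in sequences)
    # Ceiling division, then scale back up to get the next multiple.
    target = -(-longest // multiple) * multiple
    return [seq + [eos_id] * (target - len(seq)) for seq in sequences]


# 50256 is the id of GPT-2's end-of-text token; used here only as an example value.
batch = pad_batch([[1, 2, 3], [4, 5, 6, 7, 8, 9, 10, 11, 12]], eos_id=50256)
```

    Here the longest sequence has 9 tokens, so both rows are padded to length 16. Keeping the length a multiple of 8 is a common choice for mixed-precision training, where such shapes tend to map well onto GPU tensor cores.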