natural-language-generation

Here are 295 public repositories matching this topic...

transformers
willfrey commented Jul 19, 2021

https://github.com/huggingface/transformers/blob/546dc24e0883e5e9f5eb06ec8060e3e6ccc5f6d7/src/transformers/models/gpt2/modeling_gpt2.py#L698

Assertions can't be relied upon for control flow because they can be disabled, as per the following:

$ python --help
usage: python [option] ... [-c cmd | -m mod | file | -] [arg] ...
...
-O     : remove assert and __debug__-dependent statem
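
The concern above can be illustrated with a minimal sketch. The function name `require_positive_batch_size` and the specific check are hypothetical, chosen only for illustration; they are not claimed to match the code at the linked line. The idea is that an explicit `raise` still runs under `python -O`, whereas an `assert` would be removed.

```python
def require_positive_batch_size(batch_size):
    """Hypothetical validation helper (an assumption, not the library's code)."""
    # Fragile form: `assert batch_size > 0, "..."` is removed when Python
    # runs with -O, so an invalid value would pass through unchecked.
    # Robust form: an explicit conditional raise is always executed.
    if batch_size is None or batch_size <= 0:
        raise ValueError("batch_size has to be defined and > 0")
    return batch_size


if __name__ == "__main__":
    print(require_positive_batch_size(4))
    try:
        require_positive_batch_size(0)
    except ValueError as err:
        print(f"rejected: {err}")
```

Raising `ValueError` is one reasonable choice for invalid arguments; a project may prefer its own exception type.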
gluon-nlp
preeyank5 commented Dec 3, 2020

Description

When calling tokenizers.create with the model and vocab files for a custom corpus, the code raises an error and fails to generate the BERT vocab file.

Error Message

ValueError: Mismatch vocabulary! All special tokens specified must be control tokens in the sentencepiece vocabulary.

To Reproduce

from gluonnlp.data import tokenizers
tokenizers.create('spm', model_p
