Tokenizers documentation
Encode Inputs
These types represent all the different kinds of input that a Tokenizer accepts
when using encode_batch().
TextEncodeInput
tokenizers.TextEncodeInput
Represents a textual input for encoding. Can be either:
- A single sequence: TextInputSequence
- A pair of sequences:
  - A Tuple of TextInputSequence
  - Or a List of TextInputSequence of size 2

Alias of Union[str, Tuple[str, str], List[str]].
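To make the accepted shapes concrete, here is a minimal sketch of values that fit this alias. The alias is re-declared locally so the snippet runs without the tokenizers package, and `describe_text_input` is a hypothetical helper written for illustration, not part of the library.

```python
from typing import List, Tuple, Union

# Local re-declaration of the alias described above (assumption: this mirrors
# tokenizers.TextEncodeInput; it is not imported from the library here).
TextEncodeInput = Union[str, Tuple[str, str], List[str]]


def describe_text_input(value: TextEncodeInput) -> str:
    """Return "single" or "pair" for a valid text input, raise otherwise.

    Hypothetical helper for illustration only.
    """
    if isinstance(value, str):
        return "single"
    is_two_items = isinstance(value, (tuple, list)) and len(value) == 2
    if is_two_items and all(isinstance(part, str) for part in value):
        return "pair"
    raise TypeError(f"not a valid text encode input: {value!r}")


# A single sequence is a plain string.
single = "How are you?"

# A pair can be written as a tuple or as a list of exactly two strings.
pair_as_tuple = ("What is a tokenizer?", "It splits text into tokens.")
pair_as_list = ["What is a tokenizer?", "It splits text into tokens."]
```

Note that a list of three or more strings does not match any of the shapes above, so the helper rejects it.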
PreTokenizedEncodeInput
tokenizers.PreTokenizedEncodeInput
Represents a pre-tokenized input for encoding. Can be either:
- A single sequence: PreTokenizedInputSequence
- A pair of sequences:
  - A Tuple of PreTokenizedInputSequence
  - Or a List of PreTokenizedInputSequence of size 2

Alias of Union[List[str], Tuple[str], Tuple[Union[List[str], Tuple[str]], Union[List[str], Tuple[str]]], List[Union[List[str], Tuple[str]]]].
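The nested alias is easier to read with sample values. The sketch below assumes that `Tuple[str]` in the alias is meant as "a tuple of strings of any length"; the helper names are hypothetical and only illustrate the shapes.

```python
from typing import List, Tuple, Union

# Assumption: a pre-tokenized sequence is a list or tuple of words.
PreTokenizedInputSequence = Union[List[str], Tuple[str, ...]]


def is_pretokenized_sequence(value: object) -> bool:
    """True if value is a list or tuple containing only strings."""
    return isinstance(value, (list, tuple)) and all(
        isinstance(word, str) for word in value
    )


def describe_pretokenized_input(value: object) -> str:
    """Return "single" or "pair" for a valid pre-tokenized input.

    Hypothetical helper for illustration only.
    """
    if is_pretokenized_sequence(value):
        return "single"
    is_two_items = isinstance(value, (list, tuple)) and len(value) == 2
    if is_two_items and all(is_pretokenized_sequence(part) for part in value):
        return "pair"
    raise TypeError(f"not a valid pre-tokenized encode input: {value!r}")


# A single sequence: the words are already split.
single = ["How", "are", "you", "?"]

# A pair: two word sequences, wrapped in a tuple or in a list of size 2.
pair_as_tuple = (["What", "is", "this", "?"], ["A", "tokenizer", "."])
pair_as_list = [["What", "is", "this", "?"], ["A", "tokenizer", "."]]
```

One consequence worth noticing: a value such as `["Hello", "world"]` is a single pre-tokenized sequence here, while the same value read as a text input would be a pair of two sequences.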
EncodeInput
tokenizers.EncodeInput
Represents all the possible types of input for encoding. Can be:
- When is_pretokenized=False: TextEncodeInput
- When is_pretokenized=True: PreTokenizedEncodeInput

Alias of Union[str, Tuple[str, str], List[str], Tuple[str], Tuple[Union[List[str], Tuple[str]], Union[List[str], Tuple[str]]], List[Union[List[str], Tuple[str]]]].
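Because the valid shape depends on `is_pretokenized`, it can help to check a batch before handing it to `encode_batch()`. The following is a minimal sketch of such a check. Both functions are hypothetical helpers, not library API, and the rules they apply are my reading of the type descriptions above rather than the library's own validation.

```python
from typing import Any, List, Sequence


def _is_word_sequence(value: Any) -> bool:
    """True if value is a list or tuple containing only strings."""
    return isinstance(value, (list, tuple)) and all(
        isinstance(word, str) for word in value
    )


def is_valid_encode_input(value: Any, is_pretokenized: bool = False) -> bool:
    """Check one batch item against the shape implied by is_pretokenized."""
    is_two_items = isinstance(value, (list, tuple)) and len(value) == 2
    if is_pretokenized:
        # Single word sequence, or a pair of word sequences.
        if _is_word_sequence(value):
            return True
        return is_two_items and all(_is_word_sequence(part) for part in value)
    # Single string, or a pair of strings.
    if isinstance(value, str):
        return True
    return is_two_items and all(isinstance(part, str) for part in value)


def invalid_batch_items(
    batch: Sequence[Any], is_pretokenized: bool = False
) -> List[int]:
    """Return the indices of batch items that do not match the expected shape."""
    return [
        index
        for index, item in enumerate(batch)
        if not is_valid_encode_input(item, is_pretokenized)
    ]


text_batch = ["Hello world", ("Question?", "Answer."), ["a", "b", "c"]]
word_batch = [["Hello", "world"], (["Question", "?"], ["Answer", "."]), "Hello"]
```

Here `invalid_batch_items(text_batch)` flags the three-string list, and `invalid_batch_items(word_batch, is_pretokenized=True)` flags the bare string, since each only fits the other mode.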