A single unit of text produced by tokenisation. Depending on the tokeniser that a language model or preprocessing pipeline uses, a token may be a whole word, a subword, a single character, or a symbol such as a punctuation mark.
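As a rough illustration of how the same text can yield different tokens, here is a minimal sketch of three tokenisation granularities. The function names and the `TOY_VOCAB` set are hypothetical and chosen only for this example. The subword routine is a simple greedy longest-match lookup, not the algorithm of any particular model or library.

```python
import re

# Hypothetical toy vocabulary, invented for this illustration only.
TOY_VOCAB = {"token", "isation", "un", "break", "able"}


def word_tokens(text):
    """Word-level: runs of word characters, plus each punctuation symbol."""
    return re.findall(r"\w+|[^\w\s]", text)


def char_tokens(text):
    """Character-level: every non-whitespace character is its own token."""
    return [c for c in text if not c.isspace()]


def subword_tokens(word, vocab):
    """Subword-level: greedy longest match against vocab.

    Falls back to a single character when no vocabulary entry matches,
    so the loop always advances.
    """
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, shrinking towards one character.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab or j == i + 1:
                tokens.append(word[i:j])
                i = j
                break
    return tokens


print(word_tokens("Tokens matter, really!"))
print(char_tokens("a b"))
print(subword_tokens("tokenisation", TOY_VOCAB))
```

With this toy vocabulary, the single word "tokenisation" is one token at the word level but two tokens ("token" and "isation") at the subword level, which is why token counts depend on the tokeniser in use.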