Difference between model_max_length and max_length

Tokenizer.model_max_length truncates input

Gundeep Singh
2 min read · Nov 19, 2024

`model_max_length` is a property of the tokenizer. When truncation is enabled, the tokenizer truncates the input sequence to this length and passes the truncated input to the model for generation.
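A quick way to see this (a minimal sketch, using the same google/flan-t5-base tokenizer as the full example below):

from transformers import AutoTokenizer

# With truncation=True and no explicit max_length argument, the
# tokenizer falls back to model_max_length and cuts the input there.
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
tokenizer.model_max_length = 4

inputs = tokenizer("write a long poem about ice cream",
                   return_tensors="pt", truncation=True)
print(inputs.input_ids.shape[-1])  # 4 — the prompt was cut down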

max_length limits the number of generated tokens

This one is more intuitive: it caps the number of tokens the model is allowed to generate. (Strictly speaking, for decoder-only models max_length counts the prompt as well, and max_new_tokens is the argument that limits only the newly generated tokens; for an encoder-decoder model like the one below, max_length bounds the generated output.)
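For instance (a minimal sketch with the same flan-t5-base model used below):

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

inputs = tokenizer("write a long poem about ice cream", return_tensors="pt")

# max_length caps the generated sequence; for an encoder-decoder
# model like flan-t5 it bounds the decoder output only.
output = model.generate(inputs.input_ids, max_length=20)
print(output.shape[-1])  # at most 20 tokens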

Confusion

Token generation is autoregressive: each newly generated token is fed back into the model as input. So what happens if the following (eq. 1) is true?

num_tokens(original input) + num_tokens(generated) > model_max_length    (eq. 1)

Will the tokenizer truncate the newly generated tokens?

The answer is: no.

These two parameters are independent and don’t interfere with each other.

Here’s how it works:

Newly generated tokens never pass back through the tokenizer, because generation operates directly on token IDs; model_max_length applies only once, when the prompt is encoded.

If eq. 1 does hold, what matters is the maximum context length supported by the model, not model_max_length: the way the model.generate method is implemented, the tokens furthest from the newly generated ones are the ones that drop out of context.

Example code to test this

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

# Function to test the model_max_length parameter
def test_model_max_length(max_length, input_text):
    tokenizer.model_max_length = max_length
    print(f"\nTesting with model_max_length={max_length}")
    print("-" * 50)

    # Tokenize input
    inputs = tokenizer(input_text, return_tensors="pt", truncation=True)
    input_sequence = tokenizer.convert_ids_to_tokens(inputs.input_ids[0])
    print(f"Input tokens length: {len(input_sequence)}")
    print("Tokenized sequence:", input_sequence)

    # Generate output with the model
    try:
        output = model.generate(inputs.input_ids, max_length=50)
        decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)
        print("Generated text:", decoded_output)
    except Exception as e:
        print(f"Error with model_max_length={max_length}: {e}")

# Sample text input for testing
input_text = "write a long poem about ice cream"

# Test different values of model_max_length
for max_len in [2, 8, 9, 10]:
    test_model_max_length(max_len, input_text)

The output shows that the model only sees the first model_max_length tokens of the input sequence.

Note how the generated text changes from “i love ice i love ice” to “i love ice cream i love ice cream” depending on whether the token ▁cream survives truncation.

Testing with model_max_length=2
--------------------------------------------------
Input tokens length: 2
Tokenized sequence: ['▁write', '</s>']
Generated text: i'm not sure if i'm a good writer or not, but i'm not sure if i'm a good writer.

Testing with model_max_length=8
--------------------------------------------------
Input tokens length: 8
Tokenized sequence: ['▁write', '▁', 'a', '▁long', '▁poem', '▁about', '▁', '</s>']
Generated text: i love you i love you i love you i love you i love you i love you i love you i love you i love you i love you i love you i love you

Testing with model_max_length=9
--------------------------------------------------
Input tokens length: 9
Tokenized sequence: ['▁write', '▁', 'a', '▁long', '▁poem', '▁about', '▁', 'ice', '</s>']
Generated text: i love ice i love ice i love ice i love ice i love ice i love ice i love ice i love ice i love ice i love

Testing with model_max_length=10
--------------------------------------------------
Input tokens length: 10
Tokenized sequence: ['▁write', '▁', 'a', '▁long', '▁poem', '▁about', '▁', 'ice', '▁cream', '</s>']
Generated text: i love ice cream i love ice cream i love ice cream i love ice cream i love ice cream i love ice cream i love ice cream i love ice cream
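
To double-check the independence claim, count tokens instead of eyeballing the text (a minimal sketch, reusing the tokenizer and model loaded in the example above):

# With model_max_length=2 the prompt is truncated to 2 tokens, yet
# eq. 1 triggers no further truncation: the model still generates
# up to the max_length=50 output tokens requested from generate().
tokenizer.model_max_length = 2
inputs = tokenizer("write a long poem about ice cream",
                   return_tensors="pt", truncation=True)
output = model.generate(inputs.input_ids, max_length=50)
print(inputs.input_ids.shape[-1], output.shape[-1])  # 2, and up to 50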

Hope this clarifies the confusion around these two parameters and saves someone from running similar code to figure it out.
