I am using the `meta-llama/Llama-2-70b-chat-hf` model on a data frame with 3000 rows, each containing a 500-token text. But after 10 rows are processed, I get the following error in `call_llama2_api(self, messages)`:
```
     79 def call_llama2_api(self, messages):
     80     huggingface.prompt_builder = "llama2"
---> 81     response = huggingface.ChatCompletion.create(
     82         model="meta-llama/Llama-2-70b-chat-hf",
     83         messages=messages,

/usr/local/lib/python3.10/dist-packages/easyllm/clients/huggingface.py in create(messages, model, temperature, top_p, top_k, n, max_tokens, stop, stream, frequency_penalty, debug)
    205     generated_tokens = 0
    206     for _i in range(request.n):
--> 207         res = client.text_generation(
    208             prompt,
    209             details=True,

/usr/local/lib/python3.10/dist-packages/huggingface_hub/inference/_client.py in text_generation(self, prompt, details, stream, model, do_sample, max_new_tokens, best_of, repetition_penalty, return_full_text, seed, stop_sequences, temperature, top_k, top_p, truncate, typical_p, watermark, decoder_input_details)
   1063             decoder_input_details=decoder_input_details,
   1064         )
-> 1065         raise_text_generation_error(e)
   1066
   1067     # Parse output

/usr/local/lib/python3.10/dist-packages/huggingface_hub/inference/_text_generation.py in raise_text_generation_error(http_error)
    472         raise IncompleteGenerationError(message) from http_error
    473     if error_type == "overloaded":
--> 474         raise OverloadedError(message) from http_error
    475     if error_type == "validation":
    476         raise ValidationError(message) from http_error

OverloadedError: Model is overloaded
```
Is there any way to fix this problem, such as increasing the rate limit?
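One common workaround for transient "Model is overloaded" errors is to retry the call with exponential backoff. Below is a minimal sketch of such a retry wrapper. The `OverloadedError` class is redefined here as a stand-in so the snippet is self-contained; in real code you would import it from `huggingface_hub.inference._text_generation` (as shown in the traceback) and pass a closure over your actual `huggingface.ChatCompletion.create(...)` call. The function name and parameters are illustrative, not part of either library's API.

```python
import time


class OverloadedError(Exception):
    """Stand-in for huggingface_hub's OverloadedError, so this sketch runs standalone."""


def retry_with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn(), retrying on OverloadedError with exponential backoff.

    Sleeps base_delay, 2*base_delay, 4*base_delay, ... between attempts,
    and re-raises the error once max_retries attempts have failed.
    """
    for attempt in range(max_retries):
        try:
            return fn()
        except OverloadedError:
            if attempt == max_retries - 1:
                raise  # out of retries, surface the error to the caller
            time.sleep(base_delay * (2 ** attempt))


# Hypothetical usage wrapping the call from the traceback:
# response = retry_with_backoff(
#     lambda: huggingface.ChatCompletion.create(
#         model="meta-llama/Llama-2-70b-chat-hf",
#         messages=messages,
#     )
# )
```

When looping over 3000 rows, this at least prevents one overloaded response from killing the whole run; pacing requests (e.g. a short sleep between rows) can also reduce how often the error occurs in the first place.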