While trying to run deepseek-ai's Janus in a Colab notebook, I ran into several flash-attention errors, including the one mentioned in installing-flash-attention.md:

```
NameError: name '_flash_supports_window_size' is not defined
```
I couldn't resolve this specific error but managed to get it working by disabling flash-attention entirely for this model.
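For anyone who wants the same workaround, flash-attention can be turned off at load time via the `attn_implementation` argument that `transformers` accepts in `from_pretrained`. A minimal sketch (note this is an assumption on my part that Janus's remote code honors the kwarg; `"eager"` is transformers' plain PyTorch attention):

```python
import torch
from transformers import AutoModelForCausalLM

# Request the standard ("eager") attention implementation instead of
# flash-attention, so the flash-attn package is never imported.
model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/Janus-1.3B",
    trust_remote_code=True,
    attn_implementation="eager",
)
model = model.to(torch.bfloat16).cuda().eval()
```

On recent transformers versions, `attn_implementation="sdpa"` is another flash-attention-free option.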
Fortunately, contributors at Xenova had already addressed this and opened a PR on Janus's Hugging Face Hub repository. You can pick up the fix by specifying the revision `refs/pr/7` when downloading the pretrained model. For example:
```python
import torch
from transformers import AutoModelForCausalLM
from janus.models import MultiModalityCausalLM, VLChatProcessor
from janus.utils.io import load_pil_images

# specify the path to the model and the PR revision with the fix
revision_id = "refs/pr/7"
model_path = "deepseek-ai/Janus-1.3B"

vl_chat_processor: VLChatProcessor = VLChatProcessor.from_pretrained(model_path)
tokenizer = vl_chat_processor.tokenizer

vl_gpt: MultiModalityCausalLM = AutoModelForCausalLM.from_pretrained(
    model_path, trust_remote_code=True, revision=revision_id
)
vl_gpt = vl_gpt.to(torch.bfloat16).cuda().eval()
```
I just tried it with the official Janus Colab demo, and it worked like a charm! I thought you might appreciate this.