Janus flash-attention issues solution #99

Open
oliveirabruno01 opened this issue Jan 17, 2025 · 0 comments
While trying to run deepseek-ai's Janus in a Colab notebook, I hit several flash-attention errors, including the one you mentioned in installing-flash-attention.md:

NameError: name '_flash_supports_window_size' is not defined

I couldn't resolve this specific error, but I got the model running by disabling flash-attention entirely.
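For reference, disabling it looks roughly like this. This is just a sketch assuming a transformers version that supports the attn_implementation keyword; whether the model's remote code actually respects it is another matter:

import torch
from transformers import AutoModelForCausalLM

# Force the eager attention path so flash_attn is never required
vl_gpt = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/Janus-1.3B",
    trust_remote_code=True,
    attn_implementation="eager",  # assumption: the remote code honors this kwarg
)
vl_gpt = vl_gpt.to(torch.bfloat16).cuda().eval()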

Fortunately, contributors from Xenova had already addressed this and opened a PR on Janus's Hugging Face Hub repository. You can use their fix by specifying the revision refs/pr/7 when downloading the pretrained model. For example:

import torch
from transformers import AutoModelForCausalLM
from janus.models import MultiModalityCausalLM, VLChatProcessor
from janus.utils.io import load_pil_images

# model path and the PR revision that carries the flash-attention fix
model_path = "deepseek-ai/Janus-1.3B"
revision_id = "refs/pr/7"

vl_chat_processor: VLChatProcessor = VLChatProcessor.from_pretrained(model_path)
tokenizer = vl_chat_processor.tokenizer

# load the model weights from the PR branch instead of main
vl_gpt: MultiModalityCausalLM = AutoModelForCausalLM.from_pretrained(
    model_path, trust_remote_code=True, revision=revision_id
)
vl_gpt = vl_gpt.to(torch.bfloat16).cuda().eval()

I just tried it with the official Janus Colab demo, and it worked like a charm! I thought you might appreciate this.
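For anyone who wants to reproduce it, this is roughly the inference snippet I ran on top of the PR revision, adapted from the Janus README (the image path and prompt are placeholders you'd swap for your own):

conversation = [
    {
        "role": "User",
        "content": "<image_placeholder>\nDescribe this image.",
        "images": ["./images/example.png"],  # placeholder path
    },
    {"role": "Assistant", "content": ""},
]

# load the images and prepare batched inputs
pil_images = load_pil_images(conversation)
prepare_inputs = vl_chat_processor(
    conversations=conversation, images=pil_images, force_batchify=True
).to(vl_gpt.device)

# run the vision encoder to get input embeddings, then generate
inputs_embeds = vl_gpt.prepare_inputs_embeds(**prepare_inputs)
outputs = vl_gpt.language_model.generate(
    inputs_embeds=inputs_embeds,
    attention_mask=prepare_inputs.attention_mask,
    pad_token_id=tokenizer.eos_token_id,
    bos_token_id=tokenizer.bos_token_id,
    eos_token_id=tokenizer.eos_token_id,
    max_new_tokens=512,
    do_sample=False,
    use_cache=True,
)

answer = tokenizer.decode(outputs[0].cpu().tolist(), skip_special_tokens=True)
print(answer)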
