-1 votes
1 answer
25 views

transformers datasets 4.0.0, load_dataset issue

datasets: 4.0.0 pytorch: 2.7.1+cu126 system: Ubuntu 22.04.5 LTS Following the official example I tried this code: import datasets print(datasets.__version__) import torch print(torch.__version__) ...
wei wang • 419
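For reference, a minimal load_dataset call under datasets 4.x looks like the sketch below; the dataset name is illustrative, not taken from the question.

```python
# Minimal sketch of datasets 4.x usage; "imdb" is an illustrative
# dataset name, not the one from the question.
import datasets
from datasets import load_dataset

print(datasets.__version__)
ds = load_dataset("imdb", split="train[:100]")
print(ds[0])
```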
-1 votes
0 answers
27 views

How to adapt the domain of a VLM? [closed]

What I want to do is to adapt one of the models (Qwen/Qwen2.5-VL-3B-Instruct or Qwen/Qwen2.5-VL-7B-Instruct) to the food domain. I may be wrong in choosing the model, correct me in that case, but those ...
Betydlig • 131
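As a starting point, loading one of the candidate models for later fine-tuning looks roughly like this; the class name assumes a transformers release recent enough to ship Qwen2.5-VL support.

```python
# Hedged sketch: load a Qwen2.5-VL checkpoint for later fine-tuning.
# Assumes a transformers version that includes Qwen2.5-VL support.
import torch
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

model_id = "Qwen/Qwen2.5-VL-3B-Instruct"
processor = AutoProcessor.from_pretrained(model_id)
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
```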
0 votes
0 answers
25 views

How can I get the pooled projected output from CLIP in the transformers library when I don't have token embeddings?

I want to use text_embeddings and combine them with the output of an intermediate layer of the text_encoder of CLIP. My input to the text_encoder is a learnable prompt embedding which is initialized ...
Abdullah Ejaz Janjua
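For context, CLIPTextModelWithProjection exposes both the pooled, projected embedding and the intermediate hidden states in one forward pass; a minimal sketch (the checkpoint name is illustrative):

```python
# Hedged sketch: get CLIP's pooled projected text embedding plus an
# intermediate encoder layer's hidden states in a single forward pass.
from transformers import CLIPTokenizer, CLIPTextModelWithProjection

ckpt = "openai/clip-vit-base-patch32"  # illustrative checkpoint
tokenizer = CLIPTokenizer.from_pretrained(ckpt)
model = CLIPTextModelWithProjection.from_pretrained(ckpt)

inputs = tokenizer(["a photo of a dish"], return_tensors="pt")
out = model(**inputs, output_hidden_states=True)

text_embeds = out.text_embeds        # pooled output after the projection head
intermediate = out.hidden_states[6]  # hidden states from a middle layer
```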
0 votes
0 answers
42 views

RuntimeError: PassManager::run failed when fine-tuning Phi-3.5-mini-bnb-4bit with TRL's SFTTrainer on Google Colab [closed]

I'm trying to fine-tune the unsloth/Phi-3.5-mini-instruct-bnb-4bit model on a custom text dataset in Google Colab using 4-bit quantization and TRL's SFTTrainer. There are no errors in my code, and I ...
F233063 Muhammad Ahsan
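For orientation, the usual Unsloth + TRL setup the question describes looks roughly like this; exact argument names vary across TRL versions, and the dataset is a placeholder.

```python
# Hedged sketch of the described setup; SFT arguments vary by TRL version.
from unsloth import FastLanguageModel  # unsloth is imported first by convention
from datasets import Dataset
from trl import SFTConfig, SFTTrainer

model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/Phi-3.5-mini-instruct-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

dataset = Dataset.from_dict({"text": ["example training line"]})  # placeholder data

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    args=SFTConfig(output_dir="outputs", max_steps=10, dataset_text_field="text"),
)
trainer.train()
```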
1 vote
1 answer
40 views

How come tokenization and generation of a model behave differently across different versions of transformers

I downloaded an old custom model based on Llava that runs on transformers 4.31.0 and I tried to use it together with a Qwen model which uses transformers 4.53.1. After updating transformers, the Llava ...
Raymond Li
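One common diagnostic here is to run the same tokenizer in each environment and diff the token IDs; a sketch (the checkpoint name is a stand-in for the custom Llava model):

```python
# Hedged diagnostic sketch: run in both environments and compare the output.
# Changed IDs point at the tokenizer rather than the generation loop.
import transformers
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("llava-hf/llava-1.5-7b-hf")  # stand-in checkpoint
print(transformers.__version__)
print(tok("USER: <image>\nDescribe the image. ASSISTANT:").input_ids)
```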
0 votes
0 answers
44 views

How to transcribe local audio File/Blob with Transformers.js pipeline? (JSON.parse error)

I'm working on a browser-based audio transcription app using Transformers.js by Xenova. I'm trying to transcribe a .wav file selected by the user using the following code: import { pipeline } from '@...
piyush • 1
1 vote
1 answer
62 views

Trained Huggingface EncoderDecoderModel.generate() produces only BOS tokens

I am working on a Huggingface transformers EncoderDecoderModel consisting of a frozen BERT encoder (answerdotai-ModernBERT-base) and a trainable GPT-2 decoder. Due to the different architectures for ...
soosmann • 119
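A frequent cause of BOS-only generation with EncoderDecoderModel is an unset or wrong decoder_start_token_id; a hedged sketch of wiring the pair up (the token IDs assume GPT-2's defaults, an assumption about this setup):

```python
# Hedged sketch: build a BERT-to-GPT2 EncoderDecoderModel and set the decoder
# start/pad IDs that generate() relies on. IDs assume GPT-2 defaults.
from transformers import AutoTokenizer, EncoderDecoderModel

model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "answerdotai/ModernBERT-base", "gpt2"
)
dec_tok = AutoTokenizer.from_pretrained("gpt2")

model.config.decoder_start_token_id = dec_tok.bos_token_id
model.config.pad_token_id = dec_tok.eos_token_id  # GPT-2 has no pad token
```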
-3 votes
0 answers
49 views

I'm getting an error when trying to load a Hugging Face dataset in Colab [closed]

from transformers import pipeline from datasets import load_dataset # 1. Load the summarization pipeline print("Loading summarization pipeline...") summarizer = pipeline("summarization" ...
Samanthika Rajapaksa
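For reference, a minimal version of that flow on current transformers/datasets releases; the dataset name is illustrative, not necessarily the one from the question.

```python
# Hedged sketch: summarization pipeline over a Hub dataset; names illustrative.
from datasets import load_dataset
from transformers import pipeline

summarizer = pipeline("summarization")
ds = load_dataset("cnn_dailymail", "3.0.0", split="validation[:2]")
print(summarizer(ds[0]["article"][:1000], max_length=60))
```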
1 vote
1 answer
42 views

Can I use a custom attention layer while still leveraging a pre-trained BERT model?

In the paper “Using Prior Knowledge to Guide BERT’s Attention in Semantic Textual Matching Tasks”, they multiply a similarity matrix with the attention scores inside the attention layer. I want to ...
Blockchain Kid
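One way to inject a prior without forking the model is to fold it into the additive attention mask that BERT's self-attention receives: since the mask is added to the scores before softmax, adding log(prior) multiplies the attention probabilities by the prior up to renormalization. The sketch below illustrates that idea; it is not the paper's exact implementation, and the prior matrix is hypothetical.

```python
# Hedged sketch: bias one BERT layer's attention with a precomputed prior by
# folding log(prior) into the additive attention mask. Not the paper's code.
import torch
from transformers import BertModel

model = BertModel.from_pretrained("bert-base-uncased")

class PriorBiasedAttention(torch.nn.Module):
    def __init__(self, inner, prior):
        super().__init__()
        self.inner = inner  # the original BertSelfAttention module
        self.register_buffer("bias", torch.log(prior + 1e-9))

    def forward(self, hidden_states, attention_mask=None, *args, **kwargs):
        mask = self.bias if attention_mask is None else attention_mask + self.bias
        return self.inner(hidden_states, mask, *args, **kwargs)

seq_len = 16
prior = torch.eye(seq_len).view(1, 1, seq_len, seq_len)  # hypothetical similarity prior
layer = model.encoder.layer[0].attention
layer.self = PriorBiasedAttention(layer.self, prior)

out = model(torch.randint(0, 30000, (1, seq_len)))  # smoke test
```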
1 vote
1 answer
72 views

(NVIDIA/nv-embed-v2) ImportError: cannot import name 'MISTRAL_INPUTS_DOCSTRING' from 'transformers.models.mistral.modeling_mistral'

My code: from transformers import AutoTokenizer, AutoModel model_name = "NVIDIA/nv-embed-v2" tokenizer = AutoTokenizer.from_pretrained(model_name) model = AutoModel.from_pretrained(...
6zL • 13
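This class of ImportError usually means the checkpoint's remote code targets an older transformers API. A hedged sketch of the usual workaround; the pinned version is an assumption, so check the model card for the tested release.

```python
# Hedged workaround sketch: pin an older transformers release that still
# exports the symbol the remote code imports (exact version is an assumption),
# then load with trust_remote_code, which nv-embed-v2 requires.
#   pip install "transformers==4.44.2"
from transformers import AutoModel, AutoTokenizer

model_name = "NVIDIA/nv-embed-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModel.from_pretrained(model_name, trust_remote_code=True)
```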
-1 votes
1 answer
39 views

Transformers.js pipeline error with React useState: "text.split is not a function" [closed]

I am trying to use transformers.js to build a simple chatbot to answer questions related to my resume. The error is as below: TypeError: text.split is not a function at closure._encode_text (...
Aditya Singh
1 vote
1 answer
61 views

How can I pass is_split_into_words option to LayoutLMv3Processor?

I'm fine-tuning a LayoutLMv3 model using HuggingFace Transformers. During preprocessing, I want to use is_split_into_words=True to ensure proper label alignment for token classification. My setup: I'...
gamba • 11
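Worth noting: when words and boxes are passed to LayoutLMv3Processor as lists, its tokenizer already treats the input as pre-split words and aligns word_labels to tokens, so a separate is_split_into_words flag isn't needed. A hedged sketch with placeholder data:

```python
# Hedged sketch: with apply_ocr=False, words/boxes go in as pre-split lists
# and word_labels are aligned to tokens by the processor. Data is placeholder.
from PIL import Image
from transformers import LayoutLMv3Processor

processor = LayoutLMv3Processor.from_pretrained(
    "microsoft/layoutlmv3-base", apply_ocr=False
)

image = Image.new("RGB", (224, 224), "white")  # placeholder page image
words = ["Invoice", "Total:", "100"]           # already split into words
boxes = [[10, 10, 80, 30], [10, 40, 60, 60], [70, 40, 110, 60]]  # 0-1000 scale
labels = [0, 1, 2]                             # per-word labels

enc = processor(image, words, boxes=boxes, word_labels=labels, return_tensors="pt")
print(enc["input_ids"].shape, enc["labels"].shape)
```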
0 votes
0 answers
32 views

Getting StopIteration or No Output while using HuggingFace InferenceClient with TinyLlama / Falcon models

I'm trying to run text generation using the Hugging Face InferenceClient in Python, but I keep getting either a StopIteration error or no output at all. Here's my setup: from huggingface_hub import ...
Sarvesh
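For comparison, a minimal InferenceClient call looks like this; serverless availability of specific checkpoints changes over time, which is a common cause of empty results. The model name and token are placeholders.

```python
# Hedged sketch: plain text_generation call via the serverless API.
# Model availability varies over time; the token is a placeholder.
from huggingface_hub import InferenceClient

client = InferenceClient(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0", token="hf_...")
print(client.text_generation("Explain what a tokenizer does.", max_new_tokens=64))
```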
0 votes
1 answer
59 views

Vertex AI TEI Deployment Fails for Private Hugging Face Model - "Could not download model artifacts"

I'm trying to deploy a Hugging Face model to Vertex AI using the Text Embeddings Inference (TEI) workflow, but I'm getting consistent errors during deployment. This same deployment approach worked for ...
George Atoyan
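For a private checkpoint the TEI container needs a Hub token at startup; one hedged approach is passing it through the serving container's environment when uploading the model. The names below and the exact token variable are assumptions (TEI versions have used HF_API_TOKEN and HF_TOKEN), so check the docs for your image.

```python
# Hedged sketch: upload a TEI serving container for a private Hub model,
# passing an access token via environment variables. Display name, model id,
# image URI, and the token env var name are all assumptions.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # placeholders

TEI_IMAGE_URI = "us-docker.pkg.dev/.../text-embeddings-inference:latest"  # placeholder

model = aiplatform.Model.upload(
    display_name="private-embedder",            # hypothetical
    serving_container_image_uri=TEI_IMAGE_URI,
    serving_container_environment_variables={
        "MODEL_ID": "my-org/my-private-model",  # hypothetical private repo
        "HF_API_TOKEN": "hf_...",               # token with read access
    },
)
```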
1 vote
0 answers
37 views

Fine-tuned LLaMA 2-7B with QLoRA, but reloading fails: missing 4-bit metadata. Likely saved after LoRA+resize. Need proper 4-bit save method

I've been working on fine-tuning LLaMA 2-7B using QLoRA with bitsandbytes 4-bit quantization and ran into a weird issue. I did adaptive pretraining on Arabic data with a custom tokenizer (vocab size ~...
orchid Ali
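A common pattern that avoids the missing-metadata problem is to keep the adapter separate: re-create the 4-bit base exactly as it was at training time (including the tokenizer-driven embedding resize), then attach the saved adapter. A hedged sketch with the paths as placeholders:

```python
# Hedged sketch: rebuild the 4-bit base as it was at training time, including
# the embedding resize driven by the custom tokenizer, then attach the adapter.
# Paths are placeholders.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained("path/to/custom_tokenizer")
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", quantization_config=bnb, device_map="auto"
)
base.resize_token_embeddings(len(tokenizer))  # must match training-time vocab
model = PeftModel.from_pretrained(base, "path/to/lora_adapter")
```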
