3,433 questions
-1
votes
1
answer
25
views
transformers + datasets 4.0.0: load_dataset issue
datasets: 4.0.0
pytorch: 2.7.1+cu126
system: Ubuntu 22.04.5 LTS
Following the official example I tried this code:
import datasets
print(datasets.__version__)
import torch
print(torch.__version__)
...
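A minimal sketch of the likely situation, assuming the failure comes from datasets 4.0.0 dropping script-based dataset loaders (an assumption; the excerpt truncates before the traceback). The dataset name below is a hypothetical stand-in:

import datasets
import torch
from datasets import load_dataset

print(datasets.__version__)  # e.g. 4.0.0
print(torch.__version__)     # e.g. 2.7.1+cu126

# Hub datasets stored as plain data files (Parquet/CSV) still load on 4.0.0:
ds = load_dataset("imdb", split="train[:100]")  # hypothetical stand-in dataset
print(ds)

# Script-based datasets were removed in 4.0.0 and need an older release:
#   pip install "datasets<4.0.0"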
-1
votes
0
answers
27
views
How to adapt the domain of a VLM? [closed]
What I want to do is adapt one of the models (Qwen/Qwen2.5-VL-3B-Instruct or Qwen/Qwen2.5-VL-7B-Instruct) to the food domain. I may be wrong in choosing the model; correct me in that case, but those ...
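A hedged sketch of one common route, LoRA fine-tuning on domain data; the Qwen2_5_VLForConditionalGeneration class assumes a recent transformers release, and the LoRA rank and target modules are illustrative choices, not a recipe from the question:

from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration
from peft import LoraConfig, get_peft_model

model_id = "Qwen/Qwen2.5-VL-3B-Instruct"
processor = AutoProcessor.from_pretrained(model_id)
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(model_id)

# Train only low-rank adapters on the attention projections (illustrative)
lora = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora)
model.print_trainable_parameters()
# ...then train on (image, text) pairs from the food domain.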
0
votes
0
answers
25
views
How can I get the pooled, projected output from CLIP in the transformers library when I don't have token embeddings?
I want to use text_embeddings and combine them with the output of an intermediate layer of the CLIP text_encoder. My input to the text_encoder is a learnable prompt embedding which is initialized ...
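A sketch of recovering the pooled, projected embedding by hand, assuming the standard CLIP convention that the pooled state is the hidden state at the EOT token (the position of the highest token id):

import torch
from transformers import CLIPTextModelWithProjection, CLIPTokenizer

name = "openai/clip-vit-base-patch32"
tokenizer = CLIPTokenizer.from_pretrained(name)
model = CLIPTextModelWithProjection.from_pretrained(name)

inputs = tokenizer(["a photo of a cat"], return_tensors="pt")
out = model(**inputs, output_hidden_states=True)

# Pool any hidden layer the same way CLIP pools the last one:
hidden = out.hidden_states[-4]                      # e.g. an intermediate layer
eot = inputs["input_ids"].argmax(dim=-1)            # EOT has the highest token id
pooled = hidden[torch.arange(hidden.size(0)), eot]  # (batch, width)
projected = model.text_projection(pooled)           # same projection as text_embeds

If the encoder is fed pure learnable inputs_embeds instead of token ids, there is no input_ids tensor to locate the EOT position, so that index has to be tracked on the side.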
0
votes
0
answers
42
views
RuntimeError: PassManager::run failed when fine-tuning Phi-3.5-mini-bnb-4bit with TRL's SFTTrainer on Google Colab [closed]
I'm trying to fine-tune the unsloth/Phi-3.5-mini-instruct-bnb-4bit model on a custom text dataset in Google Colab using 4-bit quantization and TRL's SFTTrainer. There are no errors in my code, and I ...
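PassManager::run failed typically comes out of the Triton compiler rather than the training code itself. A hedged sketch of a setup that avoids the compiled path; the SFTConfig fields assume a recent trl release, and the compile/precision toggles are assumptions about the trigger, not a confirmed fix:

from trl import SFTConfig, SFTTrainer

args = SFTConfig(
    output_dir="out",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    max_steps=100,
    fp16=True, bf16=False,   # Colab T4 GPUs have no bf16 support
    torch_compile=False,     # skip Triton compilation paths
)
# model and train_ds come from the question's setup
trainer = SFTTrainer(model=model, train_dataset=train_ds, args=args)
trainer.train()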
1
vote
1
answer
40
views
How come tokenization and generation of a model behave differently across different versions of transformers?
I downloaded an old custom model based on Llava that runs on transformers 4.31.0 and I tried to use it together with a Qwen model which uses transformers 4.53.1. After updating transformers, the Llava ...
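A small diagnostic sketch: since one environment cannot hold two transformers versions at once, pin each version in its own virtual environment and diff the tokenizer output for the same probe input (the model path is a placeholder):

# run once under transformers==4.31.0 and once under 4.53.1
import transformers
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("path/to/llava-model")  # placeholder path
ids = tok("the same probe sentence", return_tensors="pt").input_ids
print(transformers.__version__, ids.tolist())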
0
votes
0
answers
44
views
How to transcribe local audio File/Blob with Transformers.js pipeline? (JSON.parse error)
I'm working on a browser-based audio transcription app using Transformers.js by Xenova. I'm trying to transcribe a .wav file selected by the user using the following code:
import { pipeline } from '@...
1
vote
1
answer
62
views
Trained Huggingface EncoderDecoderModel.generate() produces only BOS tokens
I am working on a Huggingface transformers EncoderDecoderModel consisting of a frozen BERT-Encoder (answerdotai-ModernBERT-base) and a trainable GPT2-Decoder. Due to the different architectures for ...
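Degenerate all-BOS generations often trace back to unset decoder special tokens, which EncoderDecoderModel does not infer from its decoder. A sketch of the documented configuration step; the token choices assume a GPT-2 decoder, and model / decoder_tokenizer come from the question's setup:

# EncoderDecoderModel requires these to be set explicitly before generate()
model.config.decoder_start_token_id = decoder_tokenizer.bos_token_id
model.config.pad_token_id = (decoder_tokenizer.pad_token_id
                             or decoder_tokenizer.eos_token_id)  # GPT-2 has no pad token
model.config.eos_token_id = decoder_tokenizer.eos_token_id

out = model.generate(input_ids, max_new_tokens=64)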
-3
votes
0
answers
49
views
I'm getting an error when trying to load a Hugging Face dataset in Colab [closed]
from transformers import pipeline
from datasets import load_dataset
# 1. Load the summarization pipeline
print("Loading summarization pipeline...")
summarizer = pipeline("summarization"...
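A self-contained sketch of the intended flow, with a hypothetical public dataset standing in for the truncated one; version skew between the installed datasets release and the dataset's loading script is a common cause of such Colab errors (an assumption, since the traceback is cut off):

from transformers import pipeline
from datasets import load_dataset

# 1. Load the summarization pipeline
summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

# 2. Load a small slice of a dataset (hypothetical stand-in)
ds = load_dataset("cnn_dailymail", "3.0.0", split="validation[:5]")

print(summarizer(ds[0]["article"][:1000], max_length=60)[0]["summary_text"])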
1
vote
1
answer
42
views
Can I use a custom attention layer while still leveraging a pre-trained BERT model?
In the paper “Using Prior Knowledge to Guide BERT’s Attention in Semantic Textual Matching Tasks”, they multiply a similarity matrix with the attention scores inside the attention layer. I want to ...
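Yes, in principle: load the pretrained weights, then swap each layer's self-attention for a module that reuses those weights and mixes in the prior. A much-simplified sketch assuming the eager attention implementation; it omits dropout, head masking, and cross-attention, and routing the prior through the encoder (here via a module attribute) is a design choice, not the paper's code:

import math
import torch.nn as nn
from transformers import BertModel

class PriorSelfAttention(nn.Module):
    """Reuses a pretrained BertSelfAttention's Q/K/V and multiplies in a prior."""
    def __init__(self, orig, num_heads):
        super().__init__()
        self.query, self.key, self.value = orig.query, orig.key, orig.value
        self.h = num_heads
        self.d = orig.query.out_features // num_heads
        self.prior = None  # (batch, seq, seq) similarity matrix, set before forward

    def _heads(self, x):
        b, t, _ = x.shape
        return x.view(b, t, self.h, self.d).transpose(1, 2)

    def forward(self, hidden_states, attention_mask=None, *args, **kwargs):
        q = self._heads(self.query(hidden_states))
        k = self._heads(self.key(hidden_states))
        v = self._heads(self.value(hidden_states))
        scores = q @ k.transpose(-1, -2) / math.sqrt(self.d)
        if attention_mask is not None:
            scores = scores + attention_mask          # BERT's additive mask
        probs = scores.softmax(dim=-1)
        if self.prior is not None:                    # multiply in the prior
            probs = probs * self.prior.unsqueeze(1)
            probs = probs / probs.sum(-1, keepdim=True).clamp_min(1e-9)
        out = (probs @ v).transpose(1, 2).flatten(2)
        return (out,)

bert = BertModel.from_pretrained("bert-base-uncased", attn_implementation="eager")
for layer in bert.encoder.layer:
    layer.attention.self = PriorSelfAttention(layer.attention.self,
                                              bert.config.num_attention_heads)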
1
vote
1
answer
72
views
(NVIDIA/nv-embed-v2) ImportError: cannot import name 'MISTRAL_INPUTS_DOCSTRING' from 'transformers.models.mistral.modeling_mistral'
My code:
from transformers import AutoTokenizer, AutoModel
model_name = "NVIDIA/nv-embed-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(...
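The model's remote-code modeling file imports a private symbol that newer transformers releases removed, so the usual workaround is pinning transformers to a release that still exports it; the exact pin below is an assumption, so check the model card. Note that nv-embed-v2 also needs trust_remote_code=True:

# pip install "transformers==4.42.4"  # assumed to still export MISTRAL_INPUTS_DOCSTRING
from transformers import AutoTokenizer, AutoModel

model_name = "NVIDIA/nv-embed-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModel.from_pretrained(model_name, trust_remote_code=True)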
-1
votes
1
answer
39
views
Transformers.js pipeline error with React useState: "text.split is not a function" [closed]
I am trying to use transformers.js to build a simple chatbot to answer questions related to my resume. The error is as below:
TypeError: text.split is not a function
at closure._encode_text (...
1
vote
1
answer
61
views
How can I pass the is_split_into_words option to LayoutLMv3Processor?
I'm fine-tuning a LayoutLMv3 model using HuggingFace Transformers. During preprocessing, I want to use is_split_into_words=True to ensure proper label alignment for token classification.
My setup:
I'...
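LayoutLMv3Processor treats its text argument as pre-split words whenever it receives a list and the processor was built with apply_ocr=False, so no explicit is_split_into_words flag is needed; word_labels then handles label alignment. A sketch where the image and coordinates are placeholders:

from transformers import LayoutLMv3Processor

processor = LayoutLMv3Processor.from_pretrained("microsoft/layoutlmv3-base",
                                                apply_ocr=False)

words = ["Invoice", "number", "12345"]  # pre-split words
boxes = [[10, 10, 80, 30], [90, 10, 160, 30], [170, 10, 240, 30]]  # 0-1000 scale
labels = [0, 0, 1]                      # one label per word

# image is a PIL.Image of the document page (placeholder)
enc = processor(image, words, boxes=boxes, word_labels=labels,
                truncation=True, return_tensors="pt")
print(enc["labels"])  # extra subword positions are set to -100 automatically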
0
votes
0
answers
32
views
Getting StopIteration or No Output while using HuggingFace InferenceClient with TinyLlama / Falcon models
I'm trying to run text generation using the Hugging Face InferenceClient in Python, but I keep getting either a StopIteration error or no output at all. Here's my setup:
from huggingface_hub import ...
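A minimal sketch of the task-specific call path, which tends to be more reliable than raw post requests; it assumes the chosen model is actually deployed on the serverless Inference API, since models that are not deployed there return empty output in exactly this way:

from huggingface_hub import InferenceClient

client = InferenceClient(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
                         token="hf_...")  # placeholder token
out = client.text_generation("Explain RLHF in one sentence.", max_new_tokens=60)
print(out)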
0
votes
1
answer
59
views
Vertex AI TEI Deployment Fails for Private Hugging Face Model - "Could not download model artifacts"
I'm trying to deploy a Hugging Face model to Vertex AI using the Text Embeddings Inference (TEI) workflow, but I'm getting consistent errors during deployment. This same deployment approach worked for ...
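"Could not download model artifacts" for a private model usually means the serving container never received a Hub token. A sketch of passing one through the container environment; the env-var names are assumptions (check the TEI image docs), and the project, model id, and image URI are placeholders:

from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # placeholders
model = aiplatform.Model.upload(
    display_name="private-embedder",
    serving_container_image_uri=TEI_IMAGE_URI,  # the TEI container used before
    serving_container_environment_variables={
        "MODEL_ID": "my-org/my-private-model",  # placeholder
        "HF_API_TOKEN": "hf_...",               # read token; var name is an assumption
    },
)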
1
vote
0
answers
37
views
Fine-tuned LLaMA 2-7B with QLoRA, but reloading fails: missing 4-bit metadata. Likely saved after LoRA+resize. Need proper 4-bit save method
I’ve been working on fine-tuning LLaMA 2-7B using QLoRA with bitsandbytes 4-bit quantization and ran into a weird issue. I did adaptive pretraining on Arabic data with a custom tokenizer (vocab size ~...
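A sketch of the reload path that usually works here: save only the resized tokenizer and the LoRA adapter, then rebuild the 4-bit base model, resize its embeddings to match, and attach the adapter (paths are placeholders):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
                         bnb_4bit_compute_dtype=torch.bfloat16)

tok = AutoTokenizer.from_pretrained("out/tokenizer")  # placeholder path
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf",
                                            quantization_config=bnb,
                                            device_map="auto")
base.resize_token_embeddings(len(tok))  # must match the training-time resize
model = PeftModel.from_pretrained(base, "out/adapter")  # placeholder path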