1,733 questions
-1
votes
0
answers
21
views
Problems installing pytorch with Anaconda - InvalidArchiveError ("Error with archive ...//pytorch-2.6.0-cpu_mkl_py3)
I installed the latest Anaconda and updated everything. When I try to install bertopic or pytorch itself, I'm getting this error:
InvalidArchiveError("Error with archive C:\Users\myuser\AppData\...
1
vote
1
answer
42
views
Can I use a custom attention layer while still leveraging a pre-trained BERT model?
In the paper “Using Prior Knowledge to Guide BERT’s Attention in Semantic Textual Matching Tasks”, they multiply a similarity matrix with the attention scores inside the attention layer. I want to ...
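A minimal sketch of the idea behind the question: inject a prior similarity matrix into scaled dot-product attention by multiplying it with the pre-softmax scores. This is an illustrative assumption about where the multiplication happens; the paper's exact formulation may differ.

```python
import numpy as np

def attention_with_prior(q, k, v, prior):
    """Scaled dot-product attention with a prior similarity matrix.

    q, k, v: (seq_len, d) arrays; prior: (seq_len, seq_len) similarity matrix.
    """
    scores = (q @ k.T) / np.sqrt(q.shape[-1])       # standard attention scores
    scores = scores * prior                          # inject the prior knowledge
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ v

rng = np.random.default_rng(0)
q = k = v = rng.normal(size=(4, 8))
prior = np.ones((4, 4))                              # all-ones prior == vanilla attention
out = attention_with_prior(q, k, v, prior)
print(out.shape)  # (4, 8)
```

To reuse a pre-trained BERT, such a layer would have to be initialized from the checkpoint's existing query/key/value weights so the modification only changes the score computation, not the learned parameters.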
1
vote
0
answers
44
views
What are the appropriate shape and values of the tensor expected by ModernBertSequentialClassification with Candle in Rust?
I don't understand the appropriate shape and values of the tensor expected when fine-tuning ModernBertSequentialClassification with Candle in Rust.
Is there a formula to determine the appropriate shape and ...
0
votes
3
answers
253
views
Error: A KerasTensor is symbolic: it's a placeholder for a shape and a dtype. It doesn't have any actual numerical value
I have been trying to recreate this tutorial from TensorFlow's docs. However, I've been getting an error I cannot solve, and it seems to be related to the tutorial's own source code. Also, ...
1
vote
1
answer
48
views
Training a custom tokenizer with Huggingface gives weird token splits at inference
So I trained a tokenizer from scratch using Huggingface’s tokenizers library (not AutoTokenizer.from_pretrained, but actually trained a new one). Seemed to go fine, no errors. But when I try to use it ...
0
votes
0
answers
84
views
PyTorch with Docker issues: torch.cuda.is_available() = False
I'm having an issue with PyTorch in a Docker container where torch.cuda.is_available() returns False, but the same PyTorch version works correctly outside the container.
Environment
Host: Debian 12
...
0
votes
1
answer
116
views
How can I decide how many epochs to train for when re-training a model on the full dataset without a validation set?
I have a BERT model that I want to fine-tune. Initially, I use a training dataset, which I split into a training and validation set. During fine-tuning, I monitor the validation loss to ensure that ...
2
votes
1
answer
112
views
How to Identify Similar Code Parts Using CodeBERT Embeddings?
I'm using CodeBERT to compare how similar two pieces of code are. For example:
# Code 1
def calculate_area(radius):
    return 3.14 * radius * radius

# Code 2
def compute_circle_area(r):
    return 3.14159 * ...
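One common approach (an assumption here, not something the question states) is to mean-pool CodeBERT's last hidden states into one vector per snippet and compare the vectors with cosine similarity. A self-contained sketch with toy stand-in vectors in place of real CodeBERT embeddings:

```python
import numpy as np

def cosine_similarity(a, b):
    # cosine similarity between two pooled code embeddings
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# toy stand-ins for mean-pooled CodeBERT embeddings of the two snippets
emb1 = np.array([0.21, 0.88, 0.11, 0.40])
emb2 = np.array([0.25, 0.85, 0.15, 0.38])
print(cosine_similarity(emb1, emb2))  # near 1.0 for similar code
```

In a real pipeline the vectors would come from running each snippet through the CodeBERT encoder and averaging the token embeddings.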
0
votes
0
answers
31
views
How to change last layer in finetuned model?
When fine-tuning HuBERT to detect phonemes, I started from a fine-tuned ASR HuBERT model, removed the last two layers, and added a linear layer sized to the phoneme vocab_size in the config. What is ...
0
votes
0
answers
58
views
How many observations per class are necessary? - transfer learning with BERT fine-tuning
I seek advice on a classification problem in industry.
The rows in a dataset must be classified/labeled--it lacks a target column (labels have dot-separated levels like 'x.x.x.x.x.x.x')--during every ...
0
votes
0
answers
35
views
How to detect out-of-vocabulary words in a prompt
I need to detect words an LLM has no knowledge about, so I can add a RAG-based definition of each such word to the prompt, e.g.:
What is the best way to achieve slubalisme using the new fabridocium product ?, ...
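A minimal sketch of the idea, using a hypothetical known-vocabulary set (a real setup would more likely consult the model's tokenizer or an embedding-confidence signal than a plain word list):

```python
def flag_unknown(prompt, vocab):
    # flag words absent from a known-vocabulary set as candidates for RAG lookup
    words = [w.strip("?.,!").lower() for w in prompt.split()]
    return [w for w in words if w and w not in vocab]

# hypothetical vocabulary covering the ordinary words of the example prompt
vocab = {"what", "is", "the", "best", "way", "to", "achieve",
         "using", "new", "product"}
prompt = "What is the best way to achieve slubalisme using the new fabridocium product ?"
print(flag_unknown(prompt, vocab))  # ['slubalisme', 'fabridocium']
```

The flagged words are exactly the ones whose definitions would be retrieved and prepended to the prompt.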
0
votes
1
answer
164
views
Why is my BERT model producing NaN loss during training for multi-label classification on imbalanced data?
I’m running into a frustrating issue while training a BERT-based multi-label text classification model on an imbalanced dataset. After a few epochs, the training loss suddenly becomes NaN, and I can’t ...
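One frequent source of NaN loss in multi-label setups (an assumption here, not a diagnosis of this specific model) is computing sigmoid followed by log manually, which overflows to log(0) for large logits; the numerically stable form used by losses like torch.nn.BCEWithLogitsLoss avoids this:

```python
import numpy as np

def bce_naive(logit, target):
    # naive sigmoid + log: log(1 - p) hits log(0) for large logits -> NaN
    p = 1 / (1 + np.exp(-logit))
    return -(target * np.log(p) + (1 - target) * np.log(1 - p))

def bce_with_logits(logit, target):
    # numerically stable logits-based form: max(x, 0) - x*t + log(1 + exp(-|x|))
    return max(logit, 0) - logit * target + np.log1p(np.exp(-abs(logit)))

print(bce_naive(100.0, 1.0))        # nan: the unstable form blows up
print(bce_with_logits(100.0, 1.0))  # ~0.0: the stable form is fine
```

Class imbalance makes extreme logits more likely, which is why this failure often appears only after a few epochs.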
0
votes
1
answer
196
views
torch.OutOfMemoryError: CUDA out of memory. (Google Colab)
I tried to adapt the mBERT model to existing code. However, I get the following error even though I've tried different solutions.
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 20....
0
votes
0
answers
47
views
Is it possible to feed embeddings generated by BERT to an LSTM-based autoencoder to get the latent space?
I've just learned how BERT produces embeddings; I might not understand it fully.
I was thinking of doing a project that leverages those embeddings by feeding them to an autoencoder to generate latent ...
0
votes
1
answer
26
views
Is it possible to evaluate Machine Translations using Sentence BERT?
I'm not referring to BERTScore. BERTScore uses token-level word embeddings: you compute pairwise cosine similarity between word embeddings and obtain scores using greedy matching.
I'm referring to Sentence ...
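The token-level BERTScore procedure the question contrasts against can be sketched as follows (assuming unit-normalized token embeddings; a Sentence-BERT evaluation would instead compare one pooled vector per sentence):

```python
import numpy as np

def greedy_bertscore_f1(hyp, ref):
    """BERTScore-style F1 from token embeddings (rows assumed unit-normalized)."""
    sim = hyp @ ref.T                      # pairwise cosine similarities
    precision = sim.max(axis=1).mean()     # best match for each hypothesis token
    recall = sim.max(axis=0).mean()        # best match for each reference token
    return 2 * precision * recall / (precision + recall)

rng = np.random.default_rng(0)
hyp = rng.normal(size=(5, 16))
hyp /= np.linalg.norm(hyp, axis=1, keepdims=True)  # unit-normalize token rows
print(greedy_bertscore_f1(hyp, hyp))  # identical sentences score ~1.0
```

Replacing the two token matrices with single pooled sentence vectors reduces this to one cosine similarity, which is the Sentence-BERT-style comparison the question asks about.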