472 questions
0 votes, 0 answers, 37 views
Using DataLoader for efficient model prediction
I'm trying to understand the role/utility of batch_size in torch beyond model training. I already have a trained model, where the batch_size was optimized as a hyperparameter. I want to use the model ...
0 votes, 1 answer, 41 views
What size should an `IterableDataset` report when used in a multi-worker `DataLoader`?
Here's a simple dataset and a data loader that uses it:
import torch
from torch.utils.data import DataLoader, IterableDataset
class Dataset(IterableDataset):
    def __init__(self, size: int):
...
0 votes, 0 answers, 20 views
Batching temporal graphs with the PyTorch Geometric data loader
I'm conducting research on temporal graph data using PyTorch Geometric.
I'm running into memory issues when converting PyG data to dense format (with to_dense_batch() and to_dense_adj()).
I have ...
1 vote, 1 answer, 40 views
DataLoader workers (num_workers) do not run in parallel
from torch.utils.data import Dataset, DataLoader
import time
import multiprocessing as mp
import torch
class Sleep(Dataset):
    def __len__(self): return 20
    def __getitem__(self, i):
...
0 votes, 0 answers, 25 views
Significant overhead when calling DataLoader for a dataset within a FastAPI endpoint using multiprocessing
I am calling a machine learning model for a dataset that I have loaded using torch DataLoader:
class FilesDataset():
    def __init__(self, path):
        file_paths = glob.glob(os.path.join(path, "*....
0 votes, 1 answer, 35 views
Discrepancy in the number of elements returned by torch Dataset and DataLoader
I have a custom Subset:
class TestSubset2(Subset):
    def __init__(self, dataset, indices, days=False):
        super().__init__(dataset, indices)
        self.days = days
    def __getitem__(self, ...
0 votes, 0 answers, 26 views
PyTorch DataLoader gradually slowing down as training progresses
I noticed my dataset iteration gradually slows down as training progresses. I'm using an A100 Google Colab instance. I removed the model and all the training stuff to try to debug the dataset. With ...
1 vote, 0 answers, 85 views
How to trace PyTorch Dataloader workers with VizTracer?
I'm using VizTracer to debug performance issues in my PyTorch data loading pipeline. Specifically, I'm using a DataLoader with num_workers > 0 to load data in parallel using multiple subprocesses.
...
1 vote, 0 answers, 36 views
Image Tensors Return As Zero When num_workers > 0
I am facing an issue with multiprocessing. I am trying to load my .pt data through DataLoaders. Everything works fine when I set num_workers = 0. But when I set it to a value greater than 0, the tensor ...
0 votes, 1 answer, 32 views
Torch tensor dataloader shape issue
I have a simple application of torch's DataLoader that gets a nice performance boost. It's created by the tensor_loader in the following example.
from torch.utils.data import DataLoader, TensorDataset, ...
1 vote, 0 answers, 59 views
Error When Using Batch Size Greater Than 1 in PyTorch
I'm building a neural network to predict how an image will be partitioned during compression using VVC (Versatile Video Coding). The model takes a single Y-frame from a YUV420 image as input and uses ...
1 vote, 0 answers, 75 views
PyTorch Forecasting TimeSeriesDataSet Returns None in DataLoader Batch
I am working with pytorch-forecasting to create a TimeSeriesDataSet where I have 30 target variables that I want to predict.
However, when I pass this dataset to a DataLoader, I encounter an issue:
...
1 vote, 1 answer, 79 views
RuntimeError: Given groups=1, weight of size [64, 3, 3, 7, 7], expected input[1, 8, 3, 112, 112] to have 3 channels, but got 8 channels instead
import os
import shutil
import random
import torch
import torchvision.transforms as transforms
import cv2
import numpy as np
from torch.utils.data import Dataset, DataLoader
import torch.nn as nn
...
0 votes, 0 answers, 13 views
Does Modifying an Attribute of a Custom Dataset Affect Both Subsets After random_split in PyTorch?
I am working on a binary classification task using an audio dataset, which is already divided into training and testing sets. However, I also need a validation set, so I split the training set into ...
0 votes, 0 answers, 59 views
PyTorch DataLoader loops are slower than expected
I created a training loop with PyTorch's TensorDataset and DataLoader classes, but encountered some odd behavior. Progress intermittently halts every 10-15 batches for no apparent reason. I ...