-2 votes
0 answers
26 views

I would say I'm intermediately experienced in deep learning and computer vision. However, I have a task to mask certain textured objects for instance segmentation. I'm not sure how to train a better model....
Talha Aydın's user avatar
Advice
0 votes
0 replies
16 views

I'm training an HRNet model from scratch on the MPII dataset for human pose estimation. The model reached a high accuracy of around 0.93 PCKh@0.5; however, the visual results are not really promising....
msakni22's user avatar
1 vote
0 answers
17 views

I am training instance segmentation on a custom COCO dataset using PyTorch Mask R-CNN + FPN for 83 categories (+ background). What is the problem with the following setup, and why is the RPN head not ...
SavEng's user avatar
  • 11
0 votes
1 answer
32 views

I have a problem where I am trying to use PyCharm for PyTorch. I have installed Python separately (quite a task as it tried to install it in the Microsoft/AppData folder?). In PyCharm, I have to first ...
alexanderjansma's user avatar
0 votes
0 answers
14 views

I'm attempting to perform multi-task Gaussian process regression using GPyTorch. I have, for each of N training examples, its corresponding (T x T) cross-task noise covariance matrix. I aim to ...
SirAndy3000's user avatar
0 votes
0 answers
61 views

I'm having problems with Python code that uses PyTorch. The details are a bit complicated (the code is part of a quantum mechanical calculation) but the code structure is very straightforward and ...
kacper's user avatar
  • 77
0 votes
1 answer
232 views

I'm trying to use PyTorch with an NVIDIA GeForce RTX 5090 (Blackwell architecture, CUDA Compute Capability sm_120) on Windows 11, and I keep running into compatibility issues. PyTorch detects CUDA, ...
sajjadesmaili's user avatar
0 votes
0 answers
61 views

I am trying to visualize the latent representation produced by the VAE inside a Stable Diffusion pipeline.
from diffusers import StableDiffusionPipeline
import torch
# A CUDA ordinal is simply the ...
Yilmaz's user avatar
  • 51k
0 votes
0 answers
33 views

I'm trying to deploy a pre-trained PyTorch model to SageMaker using the Python SDK. I have a model.tar.gz file that is uploaded to S3, with the following structure: code/ code/requirements.txt code/...
RefresherM's user avatar
Tooling
0 votes
0 replies
56 views

I'm looking for a good package to train a linear quantile regression model, i.e. $\hat y = \sum_{i=1}^n w_i \cdot X_i$, where $X_i$ are the input features and $w_i$ are the bounded trainable weights. ...
student13's user avatar
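For readers skimming this entry, the model in the excerpt can be sketched without any package. The block below is a minimal, stdlib-only illustration, not a recommendation of a library: it assumes the quantile (pinball) loss as the training objective and handles the weight bounds by clipping after each step (projected subgradient descent). The function names, the learning rate, and the epoch count are all hypothetical choices made for the example.

```python
def predict(weights, features):
    """Linear prediction y_hat = sum_i w_i * x_i for one sample."""
    return sum(w * x for w, x in zip(weights, features))


def pinball_loss(y_true, y_pred, tau):
    """Quantile (pinball) loss for one sample at quantile level tau in (0, 1)."""
    diff = y_true - y_pred
    return tau * diff if diff >= 0 else (tau - 1.0) * diff


def fit_bounded_quantile(X, y, tau, lower, upper, lr=0.01, epochs=2000):
    """Projected subgradient descent (an assumed, simple optimizer choice)."""
    n_features = len(X[0])
    weights = [0.0] * n_features
    for _ in range(epochs):
        grad = [0.0] * n_features
        for features, target in zip(X, y):
            diff = target - predict(weights, features)
            # Subgradient of the pinball loss with respect to the prediction.
            slope = -tau if diff >= 0 else (1.0 - tau)
            for i, x in enumerate(features):
                grad[i] += slope * x / len(X)
        # Gradient step, then projection of every weight onto [lower, upper].
        weights = [min(upper, max(lower, w - lr * g)) for w, g in zip(weights, grad)]
    return weights
```

With a single constant feature and `tau=0.5`, the fitted weight settles near the median of the targets, and a tighter `upper` bound simply pins the weight at that bound.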
0 votes
0 answers
33 views

I am now trying to use FSDP in the Hugging Face Transformers Trainer. The training script is something like train_dataset = Mydataset(...) args = TrainingArguments(...) model = LlamaForCausalLM....
xuehao-049's user avatar
2 votes
1 answer
90 views

I am trying to follow along with this webpage: https://jtr13.github.io/cc21fall2/tutorial-on-r-torch-package.html I am trying to understand R's implementation of PyTorch. I am having some trouble with ...
Huy Pham's user avatar
  • 173
0 votes
0 answers
57 views

I am compiling NCCL 2.27.5-1 (I also tried 2.28.9-1) from source for a V100 GPU (sm_70). My goal is to have libnccl.so contain compute_70 PTX for every kernel. Despite passing explicit -gencode=arch=...
CiZ's user avatar
  • 9
1 vote
0 answers
113 views

I'm trying to set up a Python project using uv and pyproject.toml on Windows. I want to install the CUDA-enabled PyTorch, but after installing, when I check the version, it shows CPU-only. Here’s my ...
wonone11's user avatar
Advice
0 votes
0 replies
30 views

With Prioritized Experience Replay (PER), we use a beta parameter to compute the weights that offset the bias introduced by PER. Now, with PyTorch's TensorDictPrioritizedReplayBuffer, I ...
Bejo's user avatar
  • 13
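As background for this entry, the role of beta can be sketched with the standard PER importance-sampling formula: P(i) = p_i^alpha / sum_k p_k^alpha and w_i = (N * P(i))^(-beta), normalized by the largest weight. The block below is a plain-Python illustration of that formula only; it is not how TensorDictPrioritizedReplayBuffer is implemented, and the function name is hypothetical.

```python
def per_importance_weights(priorities, alpha, beta):
    """Importance-sampling weights for prioritized replay (textbook formula)."""
    # Sampling probabilities: P(i) = p_i^alpha / sum_k p_k^alpha.
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    probs = [s / total for s in scaled]

    # Raw weights: w_i = (N * P(i))^(-beta); beta = 1 fully corrects the bias.
    n = len(priorities)
    raw = [(n * p) ** (-beta) for p in probs]

    # Normalize by the maximum so the weights only ever scale updates down.
    max_weight = max(raw)
    return [w / max_weight for w in raw]
```

With `beta=0` every weight is 1 (no correction), and as beta grows toward 1 the frequently sampled, high-priority transitions are down-weighted more strongly.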
1 vote
2 answers
130 views

After moving module A to the CPU, does the original parameter tensor still stay on the GPU? When is it released? Is it wrong if I reuse the parameter? My code: import torch.nn as nn class A(nn.Module): ...
jiwei zhang's user avatar
2 votes
1 answer
28 views

In Torch, .view() reshapes the tensor. However, there are multiple ways to reshape a multi-dimensional tensor to a target shape. How does it decide between those different ways? For example, in Torch, ...
Sanchit's user avatar
  • 21
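The short version of the answer to this kind of question is that, for a contiguous tensor, .view() never reorders elements: it keeps the flat row-major buffer and only changes the sizes and strides used to index it. The block below is a pure-Python sketch of that idea, with no torch dependency; both helper names are hypothetical and exist only for illustration.

```python
def contiguous_strides(shape):
    """Row-major strides (in elements) for a contiguous array of `shape`."""
    strides = []
    step = 1
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return tuple(reversed(strides))


def view_as_nested(flat, shape):
    """Reinterpret a flat row-major buffer under `shape` without reordering."""
    count = 1
    for size in shape:
        count *= size
    if count != len(flat):
        raise ValueError("shape is incompatible with the number of elements")
    if len(shape) == 1:
        return list(flat)
    chunk = len(flat) // shape[0]
    # Each slice along the first dimension is a consecutive run of the buffer.
    return [
        view_as_nested(flat[i * chunk:(i + 1) * chunk], shape[1:])
        for i in range(shape[0])
    ]
```

For the buffer `[0, 1, 2, 3, 4, 5]`, the shape `(2, 3)` gives rows `[0, 1, 2]` and `[3, 4, 5]`, while `(3, 2)` gives `[0, 1]`, `[2, 3]`, `[4, 5]`: the same element order read with different strides.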
2 votes
1 answer
1k views

I'm trying to deploy a Python project on Windows Server 2019, but PyTorch fails to import with a DLL loading error. On my local machine (Windows 10, same Python version), everything works perfectly. ...
Rael Clariana's user avatar
1 vote
1 answer
61 views

I am building a wheel of PyTorch from source, based on their https://github.com/pytorch/pytorch/blob/v2.6.0/.ci/manywheel/build_common.sh CI build script. I tested on a "local" instance of a ...
Corneau's user avatar
  • 193
Advice
0 votes
2 replies
48 views

I have a U-Net model trained in PyTorch (by someone else) that produces quite different results in eval mode than in train mode (train mode results look good, eval mode results are rubbish). A bit of googling ...
user18504955's user avatar
0 votes
0 answers
55 views

I have encountered this issue and searched the forums, but I couldn't solve it. How can I solve this problem? I tried to add a CBAM module to YOLOv12 for my custom dataset to improve accuracy. I ...
partizal's user avatar
0 votes
0 answers
103 views

I'm trying to implement SimSiam using a ViT backbone on the CUB-200-2011 dataset. However, during training, the embeddings collapse to a single direction despite using stop-gradient. Here’s what I ...
p10's user avatar
  • 33
-1 votes
0 answers
25 views

I am trying to use the Emotion Llama model and to understand how to download the models from Hugging Face and place them in the right directory. It actually suggests downloading three models in ...
Jose Ramon's user avatar
  • 5,374
1 vote
1 answer
76 views

Is @ray.remote def run_experiment(...): (...) if __name__ == '__main__': ray.init() exp_config = sys.argv[1] params_tuples, num_cpus, num_gpus = load_exp_config(exp_config) ray.get(...
Blupon's user avatar
  • 1,091
0 votes
0 answers
48 views

I am currently trying to implement the Temporal Fusion Transformer using PyTorch. This paper (https://arxiv.org/pdf/1912.09363) is my reference. Currently I am stuck with the variable selection ...
Haifischbecken's user avatar
0 votes
0 answers
31 views

I’m trying to register a custom operator for ExecuTorch (v1.0, built from the PyTorch 2.5 source tree). My goal is to create a shared library that defines a few quantum operators and runs them from a ....
Melvin's user avatar
  • 1
0 votes
0 answers
49 views

I wrote a lot of scripts to test the compatibility of my system with PyTorch 2.4.1, and they all indicate I can run it. I don't have enough memory on my GPU, so I tried enabling expandable_segments so ...
N3.2's Channel's user avatar
1 vote
1 answer
150 views

I am managing a project with uv (v0.9.4) that requires a specific PyTorch CUDA build. The generic installation works, but using uv run causes a package conflict, despite the environment being correct. ...
ATILADE OKE's user avatar
0 votes
0 answers
82 views

I am currently experimenting with modifying the KV cache of the LLaVA model in order to perform controlled interventions during generation (similar to cache-steering methods in recent research). The ...
Pulkit Mittal's user avatar
0 votes
1 answer
35 views

I just spent hours figuring out that the following code fails when included in a ComfyUI custom node, but works perfectly fine outside (using the same Python venv). I finally found out that someone ...
user2845840's user avatar
0 votes
1 answer
80 views

I want to step into the linear function using VS Code's step-in, but it skips automatically when I click "step into". Could anyone help me with this? I used DEBUG=1 when compiling PyTorch. ...
Shui_'s user avatar
  • 33
1 vote
0 answers
68 views

I'm following the book "Deep Learning with PyTorch Step By Step" and I have a question about the predict method in the StepByStep class (from this repository: GitHub). The current ...
Matteo's user avatar
  • 93
1 vote
0 answers
186 views

I need to run a series of pre-trained, fine-tuned models from Hugging Face in a Jupyter notebook. I have updated to the latest version of both PyTorch and Transformers, but when I run the code from ...
Alex Colville's user avatar
Advice
2 votes
0 replies
89 views

I’m a Computer Science graduate preparing for ML/AI Engineer roles. I’m facing a dilemma about what to focus on, how much time to allocate to each area, and what exact roadmap to follow to prepare ...
syntaxprnv's user avatar
2 votes
0 answers
117 views

I have installed CUDA 13 on Fedora 42. When I use PyTorch locally, torch works fine, but when I create a virtualenv, PyTorch can't find the libcudnn files. I get the error ImportError: libcudnn.so.9: ...
TR SIXtree's user avatar
2 votes
2 answers
94 views

I am building a decoder-only transformer using PyTorch, and my dataset of choice is the fullEnglish dataset from Kaggle's Plaintext Wikipedia (full English). The problem is that my model output is ...
Kirito's user avatar
  • 13
0 votes
1 answer
77 views

I am trying to run Mistral-7B-Instruct-v0.2. Each run is PROMPT + details[i]. PROMPT has instructions on how to generate JSON based on details. As the prefix of each input is the same, kind of like a ...
acdhemtos's user avatar
2 votes
1 answer
39 views

I’m trying to use Cache-DiT to accelerate inference for the Wan2.2 model. However, when I run the example script, python run_wan_2.2_i2v.py --steps 28 --cache I get the following error. Namespace(...
傅靖茹's user avatar
0 votes
0 answers
39 views

I'm performing Gaussian process regression using GPyTorch. I'm modeling two correlated tasks as follows: class MyModel(gpytorch.models.ExactGP): def __init__(self, X, Y, likelihood): super(...
SirAndy3000's user avatar
2 votes
0 answers
59 views

I'm computing PDE residuals for The_Well datasets (e.g. turbulent_radiative_layer_2D and shear_flow) using finite differences, but the residuals are much larger than I expect. The data are generated ...
Kain's user avatar
  • 21
0 votes
1 answer
29 views

In my scenario I use multiple DataLoaders with multiple Datasets to evaluate models against each other (I want to test models with multiple resolutions, which means each dataset has a distinct ...
Yuval's user avatar
  • 3,598
1 vote
1 answer
125 views

So, PyTorch requires a rather large bundle of packages. The prebuilt Docker PyTorch GPU images (https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/running.html) are quite helpful in ...
helt's user avatar
  • 5,347
1 vote
0 answers
186 views

I’m using the PyTorch profiler to analyze sglang, and I noticed that in the CUDA timeline, some kernels show “Command Buffer Full”. This causes the cudaLaunchKernel time to become very long, as shown ...
plznobug's user avatar
  • 143
0 votes
0 answers
98 views

I'm trying to fine-tune the MAT (Masked Attention Transformer) model from the official repository: https://github.com/fenglinglwb/MAT However, I keep getting the following error during training: ...
kitten3032's user avatar
-1 votes
0 answers
38 views

I have an NVIDIA GeForce RTX 5060 with the "Blackwell" architecture and compute capability 12.0, which is why I have to use the nightly build pytorch=2.10.0.dev20251017+cu128, which has support for ...
msaLina's user avatar
  • 21
0 votes
0 answers
98 views

In the torch example provided here https://github.com/pytorch/examples/tree/main/word_language_model, the transformer only uses torch.TransformerEncoder and torch.TransformerDecoder is overwritten with a ...
cuneyttyler's user avatar
  • 1,395
0 votes
0 answers
40 views

I'm fine-tuning T5-small using PyTorch Lightning and encountering a strange issue during validation and test steps. The Problem: During validation_step and test_step, model.generate() consistently ...
GeraniumCat's user avatar
-1 votes
0 answers
75 views

I am trying to follow a tutorial, https://docs.pytorch.org/tutorials/intermediate/torchvision_tutorial.html, that works with .png files as segmentation masks. The png files can be found here: https://...
Paul Borowy's user avatar
2 votes
1 answer
123 views

I need a really fast vectorized maximal independent set algorithm implemented in PyTorch, so I can use it for tasks with thousands of nodes in reasonable time. I cannot use networkx; it is way too ...
Kemsikov's user avatar
  • 640
1 vote
0 answers
68 views

My proxy goal is to change LoRA from h = (W + BA)x to h = (W + BAP)x. Preliminary code is attached for your reference. My actual goal is to train a model with the following loss: $\tilde{\Theta} = \arg\min_{\hat{\Delta}} \| f_{(...
Jason Rich Darmawan's user avatar
