20,012 questions
-2
votes
0
answers
28
views
How to efficiently train a CNN based CV model? [closed]
I would say I'm intermediately experienced in deep learning and computer vision. However, I have a task to mask certain textured objects for instance segmentation. I'm not sure how to train a better model....
-4
votes
0
answers
29
views
Potential reason for my PINN's failure to learn the physics behind photovoltaic systems properly [closed]
I’m currently working on a PINN that predicts the maximum power point (MPP) of solar panels based on irradiance, shading, and temperature. When I train the model, the data loss behaves normally, but ...
2
votes
0
answers
31
views
Problem in forecast errors when combining tsCV and nnetar
I’m encountering an issue when combining tsCV with nnetar (with h = 2 and additional input variables). The forecast errors for both horizons (h = 1 and h = 2) terminate two observations before the end ...
Advice
1
vote
4
replies
86
views
Python library recommendation for the implementation of a neural network modification algorithm
I want to implement in python some algorithms from a paper that allow for a pre-trained neural network to be modified (adding or removing neurons or layers) conserving (theoretically) the outputs of ...
Advice
0
votes
0
replies
40
views
Large Kernel in ConvNets
I want to find a convolutional network with a large kernel (larger than 5x5 or 7x7). I want to perform kernel analysis, and to do this, I need to convert the model to the onnx format. I found ...
1
vote
1
answer
132
views
Torch Conv2d results in both dimensions convolved
I have input shape to a convolution (50, 1, 7617, 10). Here, 7617 is word vectors as rows, and 10 is the number of words in columns. I want to convolve column-wise and obtain (2631, 1, 7617, 1), 1 ...
1
vote
0
answers
330
views
Installation error while installing GroundingDino
I am trying to install the GroundingDino as instructed in the README file of their official GitHub repo, but I am facing the error below:
Obtaining file:///home/kgupta/workspace/Synthetic_Data_gen/...
0
votes
1
answer
127
views
Why does an LSTM PyTorch model yield constant values?
I am training an LSTM model with data from yfinance. The process is really standard. I get the data with yf.download(ticker=ticker) where ticker='AAPL' and do df.rolling(30, min_periods=1) to smooth the ...
0
votes
1
answer
126
views
Preventing GPU memory leak due to a custom neural network layer
I am using the MixStyle methodology for domain adaptation, and it involves using a custom layer that is inserted after every encoder stage. However, it is causing VRAM to grow linearly, which causes ...
-3
votes
1
answer
103
views
Can I visualize a neural network’s loss landscape to see if it’s stuck in a bad minimum? Any code example for this? [closed]
So, I’m trying to understand why sometimes neural networks get stuck during training. I heard people talk about ‘local minima’ and ‘saddle points,’ but I can’t really picture them. I want to actually ...
0
votes
0
answers
82
views
KFold cross-validation in Keras: model not resetting between folds (MobileNet backbone)
I am trying to perform KFold cross-validation on a Keras model. The first fold runs exactly as expected, but from the second fold onwards the model doesn’t seem to reset. The training behaves ...
2
votes
0
answers
171
views
TensorFlow/Keras model accumulates system and GPU RAM during training
I am training a model with TensorFlow 2.19.0 and Keras 3.10.0. During training, I monitor nvidia-smi and top, and both the system RAM and the GPU RAM increase over the training period....
0
votes
1
answer
91
views
Differentiable weight setting in flax NNX
I'm doing some experiments with Flax NNX (not Linen!).
What I'm trying to do is compute the weights of a network using another network:
A hypernetwork receives some input parameters W and outputs a ...
3
votes
1
answer
126
views
Neural Network built from scratch using numpy isn't learning
I'm building a neural network from scratch using only Python and numpy. It's meant for classifying the MNIST data set. I got everything to work, but the network isn't really learning; at epoch 0 it's ...
0
votes
1
answer
35
views
Model with ResNet blocks stuck at low accuracy
I am trying to implement classification of ECG segments from PTB-XL database (https://physionet.org/content/ptb-xl/1.0.3/). The architecture of the model which I am using is:
import torch
import torch....
0
votes
0
answers
66
views
Building NN from scratch, why does my NN not memorize a small sample size of training data? It ends up being a class distribution
No matter which input I give it after training, it still spits out the class distribution, whereas if I just remove the hidden layer and use a single-layer NN, it works much better.
I know the proper ...
0
votes
0
answers
80
views
Tensorflow speed of tf.nn.conv2D used instead of opencv GaussianBlur
I'm trying to move some computer vision tasks to tensorflow. The most intensive ops are convolutions, like GaussianBlur. The timings I get using timeit suggest that the GPU equivalent is >10 x ...
-1
votes
1
answer
77
views
Randomized Search CV with Neural Network
I want to tune a neural network, but since I would be doing a lot of test runs I decided to use Randomized Search CV.
I first started by freeing up some space.
# Resetting layers and freeing up some ...
2
votes
1
answer
104
views
Error forming mini-batch for network input
I am trying to predict vehicle trajectory, using the t-30 data sequence to predict the trajectory until t+30. The data I have created is a neighbour occupancy matrix for each frame for each car, this data ...
1
vote
1
answer
716
views
SFTTrainer: The specified `eos_token` ('<EOS_TOKEN>') is not found in the vocabulary of the given `processing_class` (Qwen2TokenizerFast)
I upgraded my Python trl package to version 0.18.1. I use the SFTTrainer of the package to finetune a Qwen2.5 LLM neural net. Previously, I used the TrainingArgument class to set additional params. I ...
0
votes
0
answers
28
views
Flat LR in NN model
Having problems with my flexible model - as I'm trying to find the reason why it outputs flat lr. Earlier, when I run "normal" [without the option to apply different numbers of layers etc] ...
2
votes
1
answer
101
views
Confused about r-torch syntax
I have been trying to tackle a regression problem by training a neural network to predict a continuous variable using r-torch. My question pertains to the syntax used to achieve this.
When ...
0
votes
0
answers
58
views
Layer-wise Relevance Propagation (LRP) not conserving total relevance in my BiLSTM
I'm implementing Layer-wise Relevance Propagation manually for a BiLSTM model, following Arras et al. implementation.
def lrp_bilstm(R_seq_3d, manual_bilstm_results, bilstm_weights, epsilon=1e-6):
...
0
votes
1
answer
57
views
TPU for Pix-to-Pix possible?
I have a Pix-to-Pix model. Not segmentation. Transfer from texture to texture. Loss = 'mse'.
And it works properly with GPU. But if I try to use TPU I get loss = 'nan'
I simplified the model by cutting ...
0
votes
0
answers
58
views
Neural network layer returns zeroed out matrix instead of proper matrix multiplication
I'm having a problem with the forward pass of my neural network.
In this step I'm not doing any backpropagation, just forward passes.
And it's not a fully connected layer. It's a sparse network.
I'm ...
2
votes
2
answers
265
views
Is it possible to save a PyTorch model and continue training the model later without needing the model class definition?
I am curious, as I have not found a way to save a PyTorch model and load the same model to continue training it on a new dataset without needing the original model definition.
So far, I know how to ...
0
votes
0
answers
54
views
How do I tell tensorflow to throw an error if I am trying to do a non-differentiable operation on a variable?
I am learning tensorflow and spent a good amount of time trying to find what is causing this error:
No gradients provided for any variable.
In the end I tracked that it was caused by using argmax at ...
0
votes
0
answers
103
views
How to convert discrete choices (of a color) from neural network output to colors directly in tensorflow?
Context: I am learning to use tensorflow and want to do a simple experiment where I provide a neural network with 4 color choices for each pixel. The network should learn to pick the best colors from ...
0
votes
0
answers
53
views
How to convert a PyTorch file from version 3 to version 2 without updating the PyTorch version
I have a model that I want to infer in the PyTorch v1.4.0 environment, but I get the error "Attempted to read a PyTorch file with version 3, but the maximum supported version for reading is 2. ...
0
votes
1
answer
87
views
How Can I Get Positive Responses Only from a Neural Net?
I'm using TensorFlow. I have many simulated datasets for which values are real. Each one consists of 200 rows and two columns (variables). I'm using a convolutional neural net. I want the simulated ...
-1
votes
1
answer
85
views
How does the Tensorflow Gradient Tape calculation work at a low level?
I have just seen some code which has sparked my interest.
with tf.GradientTape() as g:
    y = f(x)
dy_dx = g.gradient(y, x)
(Code loosely taken from this reference.)
There are two things which I ...
0
votes
0
answers
120
views
ray.tune: Tracked actor is not managed
from ray import tune
from ray.tune.search.optuna import OptunaSearch
from ray.tune.integration.pytorch_lightning import TuneReportCheckpointCallback
from pytorch_lightning import Trainer
...
0
votes
0
answers
45
views
How to create a custom LossFunction in deeplearning4j that uses the Pearson correlation coefficient as the metric
I asked ChatGPT to code a custom loss function for deeplearning4j and it keeps producing errors when I try to use it... Here is what ChatGPT coded up for me:
public class ...
0
votes
0
answers
34
views
Size Mismatch in MultiModal Feedback Model Using T5 + Audio/Visual Features - The size of tensor a (48) must match the size of tensor b (4) with T5
I’m working on a multimodal model that combines audio and visual features with a T5-based encoder for a feedback generation task. However, I’m facing an issue with batch size mismatch between the ...
0
votes
0
answers
63
views
Make my Tensorflow dataset generator more efficient in training
I have built a Tensorflow Dataset from a generator but it seems to be very slow when training. In testing I cut the size of the dataset greatly and the training is much faster when uploaded as a ...
0
votes
0
answers
46
views
Torch gradient estimates disagreeing with analytic and perturbation approximated gradients
I'm faced with a problem where, as the title says, I'm having trouble with the torch package's built-in automatic differentiation algorithms (or my usage?). I think it was meant to be used on mini-...
0
votes
2
answers
76
views
Define anonymous variables elsewhere in the code R
I am trying to render my code a bit less "unique" in the sense that I want this to be able to run regardless of what kind of dataset I use.
I'm essentially running different algorithms of ...
1
vote
1
answer
79
views
Regression error with Python tensorflow keras
Could someone please help me fix the following error:
[AttributeError: 'super' object has no attribute 'sklearn_tags']
based on my code:
from tensorflow import keras
from scikeras.wrappers ...
1
vote
1
answer
102
views
How to handle outputs of variable length when predicting with a Feed Forward Neural Network?
I am working on a regression problem where, given an input X of fixed size, the output Y can be a sequence of values of variable length.
Both input and output are normalised float values. Thus, we are ...
0
votes
1
answer
211
views
tuple has no attribute "rank" error when trying to build bayesian neural net
I'm trying to build a BNN but am encountering the error in the title. I tried to ensure I'm not passing a tuple to .shape.rank by:
using the functional API with explicit Input (should make first ...
1
vote
2
answers
183
views
Why do we reshape key, query, and value tensors in multi-head attention?
In my PyTorch implementation of multi-head attention, I have these in __init__():
class MultiHeadAttentionLayer(nn.Module):
    def __init__(self,d_in,d_out,context_length,dropout,num_heads,use_bias=...
0
votes
1
answer
111
views
Can't load the saved model YOLOV8Backbone from keras_cv after the fine-tuning
checkpoint_filepath='E:\model.yolo_v8_s_ft.h5'
# backbone = keras_cv.models.YOLOV8Backbone.from_preset("yolo_v8_m_backbone_coco")
backbone = keras.models.load_model(checkpoint_filepath)
...
0
votes
0
answers
63
views
Error: Arrays have incompatible sizes for this operation during Grad-CAM for 3D CNN (MATLAB)
I'm trying to apply the Grad Cam technique to a 3D CNN (ResNet101). To do this, I'm trying to use the following code:
dlImg = dlarray(single(img), 'SSSCB');
softmaxName = 'sofctmax';
...
2
votes
1
answer
134
views
How to Implement Backpropagation Without Auto-Differentiation for a Feedforward Neural Network?
I am working on a deep learning assignment that requires implementing a feedforward neural network (FNN) from scratch using only NumPy (without TensorFlow, PyTorch, or other auto-differentiation tools)...
1
vote
0
answers
30
views
XOR using Pre-Calculated Weights and Threshold in a Neural Network
I’m trying to implement a threshold-logic (step-activation) network in C++ that computes a 5‑input XOR function. There is no training whatsoever. I already have working code for a 3‑input XOR, but ...
-1
votes
1
answer
58
views
implement a differentiable L0 regularizer to keras layer
What is the appropriate way to implement a differentiable variant of L0 regularizer (count the non-zero values in a Conv layer / matrix) to keras layer?
I was thinking of using r(x) = tanh(abs(f*x)) ...
0
votes
0
answers
63
views
Why Does My High-Accuracy MNIST Classifier Fail on User-Drawn Digits in React.js?
I trained a simple binary classifier on the MNIST dataset, achieving 99%+ accuracy on test data. However, when I integrate it into a React.js app where users draw digits on a canvas, the model ...
0
votes
0
answers
42
views
How to load a Neural Network Model along with MinMaxScaler? [duplicate]
I have a simple neural network model, of 4 layers, that I trained on a numerical dataset of 25K data points.
It takes a good time to load the data, whenever I want to evaluate new features to python ...
0
votes
1
answer
100
views
Self-made backpropagation doesn't work in Python (2-neuron network)
I tried to build a neural network with two neurons as described in the book Why machines learn on page 330. This is my code and I don't know why it didn't work. I tried something like this before with ...
0
votes
1
answer
62
views
Tensorflow error unknown expected keyword during loading of trained model
I have trained a TensorFlow model and saved it on another machine and want to load it locally. When I try to load it, I get an error saying Agent.init() got an unexpected keyword argument 'name'. My ...