I have an array of shape (27, 36) with values between 0 and 255 and one missing value (NaN). I want to impute the missing value using the NIPALS algorithm. After searching, I found PLSRegression() in scikit-learn.

PLSRegression takes two matrices (or vectors), X and Y, for fitting, and then predicts Y from X:

from sklearn.cross_decomposition import PLSRegression
X = [[0., 0., 1.], [1.,0.,0.], [2.,2.,2.], [2.,5.,4.]]
Y = [[0.1, -0.2], [0.9, 1.1], [6.2, 5.9], [11.9, 12.3]]
pls2 = PLSRegression(n_components=2)
pls2.fit(X, Y)

Y_pred = pls2.predict(X)

This leaves me with two problems. First, neither matrix may contain NaN values (fitting throws an error if one of them does). Second, I have only one matrix, not two!

So how can I get PLSRegression to impute the missing data? Or, what algorithm should I follow to solve this problem (using NIPALS, of course)? A solution that uses rpy2 is fine too.
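
To make concrete what I mean by imputing with NIPALS on a single matrix, here is a minimal sketch of the kind of thing I am after. It is not part of scikit-learn, and the function name `nipals_impute` and its parameters are my own invention. It assumes a PCA-style NIPALS in which each regression step simply skips the missing cells, and then fills those cells from the low-rank reconstruction. I am not sure this is the standard way to do it.

```python
import numpy as np


def nipals_impute(X, n_components=2, max_iter=500, tol=1e-8):
    """Fill NaN cells of X from a NIPALS-style PCA fit on the observed cells.

    Hypothetical helper (not a scikit-learn API). Assumes every column has
    at least one observed value.
    """
    X = np.asarray(X, dtype=float)
    observed = ~np.isnan(X)
    W = observed.astype(float)            # 1 where observed, 0 where missing
    col_means = np.nanmean(X, axis=0)     # means over observed cells only
    # Centered residual; missing cells are set to 0 so they add nothing to sums.
    R = np.where(observed, X - col_means, 0.0)

    n, p = X.shape
    T = np.zeros((n, n_components))       # scores
    P = np.zeros((p, n_components))       # loadings

    for k in range(n_components):
        # Start from the residual column with the largest sum of squares.
        t = R[:, np.argmax((R ** 2).sum(axis=0))].copy()
        for _ in range(max_iter):
            # Loadings: regress each column on t, using observed rows only.
            denom = W.T @ (t ** 2)
            load = (R.T @ t) / np.where(denom > 0, denom, 1.0)
            norm = np.linalg.norm(load)
            if norm == 0:
                break
            load /= norm
            # Scores: regress each row on the loadings, observed columns only.
            denom = W @ (load ** 2)
            t_new = (R @ load) / np.where(denom > 0, denom, 1.0)
            converged = (np.linalg.norm(t_new - t)
                         <= tol * max(np.linalg.norm(t_new), 1.0))
            t = t_new
            if converged:
                break
        T[:, k] = t
        P[:, k] = load
        # Deflate the observed cells; keep missing cells at 0.
        R = np.where(observed, R - np.outer(t, load), 0.0)

    X_hat = col_means + T @ P.T           # low-rank reconstruction
    return np.where(observed, X, X_hat)   # keep observed values untouched


# Synthetic (27, 36) example with a single missing cell.
a = np.arange(1.0, 28.0)
b = 1.0 + 0.1 * np.arange(36)
data = np.outer(a, b)
data[5, 7] = np.nan
filled = nipals_impute(data, n_components=1)
print(filled[5, 7])
```

Observed values are returned unchanged, and only the NaN cell is replaced. How many components to use for real data is something I do not know how to choose.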
