
I'm new to Python and I'm trying to normalize each element of a list using preprocessing.normalize. However, it fails with ValueError: setting an array element with a sequence.

I then found the cause: the elements I pass to np.array have different lengths.
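A minimal reproduction of that error, assuming made-up readings rather than the real sensor files (the helper name try_to_array is only for illustration):

```python
import numpy as np

# Made-up readings of unequal length, standing in for the sensor columns.
ragged = [[0.1, 0.2, 0.3], [0.4, 0.5]]


def try_to_array(rows):
    """Return a float ndarray, or None if the rows do not form a rectangle."""
    try:
        return np.array(rows, dtype=float)
    except ValueError as exc:
        # NumPy cannot build a 2-D float array from rows of different lengths.
        print("ValueError:", exc)
        return None


print(try_to_array(ragged))                          # fails: rows differ in length
print(try_to_array([[0.1, 0.2], [0.4, 0.5]]).shape)  # works: rows have equal length
```

So the rows have to be brought to a common length before they can become a 2-D array.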

Here is my code,

import numpy as np
import pandas as pd
from sklearn import preprocessing

result = []

for url in target_url:
    sensor = pd.read_csv(url, header=None, delimiter=r"\s+")
    result.append(sensor[2])

result = np.array(result)
# I want to resample here before it goes to normalize.
result = preprocessing.normalize(result, norm='l1')

target_url holds the URLs I use to fetch sensor data from a web server. Each column I read is appended to the result list, which is then converted to an array with np.array.

For example, len(result[0]) is 121598 and len(result[1]) is 1215601. I want to make result[0] the same length as result[1], using resample to fill the missing values (NaN).

How can I do that?

Please help me out here.

Thanks in advance.

EDIT

After normalizing, I want to compute the correlation using corr().

Here is the code,

result = preprocessing.normalize(result, norm='l1')
ret = pd.DataFrame(result)
corMat = ret.T.corr()  # corr() already returns a DataFrame

1 Answer

Since you are using pandas to read the CSV files, you are off to a good start. One way to do it is to use pd.concat to join the Series in the result list (I assume sensor[2] is a Series) into one DataFrame. Shorter Series are padded with NaN. This is an example:

a = [pd.Series([1, 2, 3]), pd.Series([1, 2]), pd.Series([1, 2, 3, 4])]
pd.concat(a, axis=1)

Which gives:

     0    1  2
0  1.0  1.0  1
1  2.0  2.0  2
2  3.0  NaN  3
3  NaN  NaN  4

In the example provided by OP, this should suffice:

import pandas as pd
from sklearn import preprocessing

result = []

for url in target_url:
    sensor = pd.read_csv(url, header=None, delimiter=r"\s+")
    result.append(sensor[2])

# concatenate the Series (one column per sensor), then backward- and forward-fill the NaNs
result = pd.concat(result, axis=1).bfill().ffill()

# normalize works row by row, so transpose to get one row per sensor
result = preprocessing.normalize(result.T, norm='l1')

# correlation between sensors
pd.DataFrame(result).T.corr()
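To try the whole pipeline without the web server, here is a rough sketch on made-up data. The random Series stand in for sensor[2], and the L1 normalization is done with plain NumPy division instead of preprocessing.normalize so the snippet only needs pandas and NumPy. It assumes each sensor should end up as one row before normalizing.

```python
import numpy as np
import pandas as pd

# Made-up sensor columns of different lengths (stand-ins for sensor[2]).
rng = np.random.default_rng(0)
series_list = [
    pd.Series(rng.random(5)),
    pd.Series(rng.random(3)),
    pd.Series(rng.random(4)),
]

# One column per sensor; shorter columns are padded with NaN, then filled.
padded = pd.concat(series_list, axis=1).bfill().ffill()

# One row per sensor, each row divided by its L1 norm
# (plain NumPy here, in place of preprocessing.normalize(..., norm='l1')).
rows = padded.T.to_numpy()
normalized = rows / np.abs(rows).sum(axis=1, keepdims=True)

# Back to one column per sensor, so corr() compares sensors with each other.
cor_mat = pd.DataFrame(normalized).T.corr()
print(cor_mat.shape)
```

Note that the padded tail of a short Series just repeats its last value, which will influence the correlation. Whether that is acceptable depends on your data.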

Depending on what the Series indices look like, and on your application, you can do different types of concatenation. See the pandas documentation for pd.concat.
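As a small illustration of one such choice, the join parameter of pd.concat controls whether the union or the intersection of the indices is kept. This reuses the toy Series from the earlier example, not real sensor data:

```python
import pandas as pd

a = [pd.Series([1, 2, 3]), pd.Series([1, 2]), pd.Series([1, 2, 3, 4])]

# Default join="outer": keep every index label, pad missing values with NaN.
outer = pd.concat(a, axis=1)

# join="inner": keep only the labels present in every Series (no NaN at all).
inner = pd.concat(a, axis=1, join="inner")

print(outer.shape, inner.shape)
```

With join="inner" everything is cut down to the shortest Series, so nothing has to be filled, but the extra readings of the longer Series are dropped.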


6 Comments

Thanks for the answer! :) I'm actually trying to do correlation after normalizing. And I want to fill that NaN with bfill or ffill using resample. :'( Let me edit my question. :)
Why not just use fillna(method='bfill')?
Thanks for the comment and answer. I have one quick question: after running the code, I get AttributeError: 'numpy.ndarray' object has no attribute 'corr'
Oh sorry my bad. You should cast to DataFrame to use pandas's cov or just use np.cov(result). I updated the answer.
Thanks for the comment! :) it gives me an error but I figured out with using result_temp = [result.iloc[:,i].tolist() for i in range(0, lenghofColumn)] and then do correlation! :)