Use row's values as columns

Question

I would like to know how I can get something like this:

Net     123  21   41   42  12  21
123      1   0    1    0    0   0
21       0   0    0    0    0   1
41       0   0    1    1    0   0
42       0   0    1    1    0   0
12       0   0    0    0    1   0
21       0   1    0    0    0   0

from the original dataset:

Net     L
123    [123,41]
21     [21]
41     [41,42]
42     [42,41]
12     [12]
21     [21]

I thought of explode, but it works only on rows, not on columns.

Thanks, s3dev. Do you think stack/unstack could also work in this case?
– user12907213, Commented Aug 3, 2020 at 18:06
Not sure those functions would be a good fit, given that you are looking for the paired frequency. With a bit of simple data engineering, crosstab is what you're after.
– s3dev, Commented Aug 3, 2020 at 18:15
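
The explode-then-crosstab route hinted at in the comment above could look roughly like this. It is a minimal sketch, not code posted in the thread: it assumes L holds Python lists of integers, and the names long, wide, out and the helper column row are made up for illustration.

```python
import pandas as pd

# Sample data mirroring the question; 'L' is assumed to hold Python lists.
df = pd.DataFrame({
    "Net": [123, 21, 41, 42, 12, 21],
    "L": [[123, 41], [21], [41, 42], [42, 41], [12], [21]],
})

# One record per (original row, list element) pair.
# 'row' is a hypothetical helper column that remembers the original row position.
long = df["L"].explode().rename_axis("row").reset_index()
long["L"] = long["L"].astype(int)

# Count how often each list element occurs in each original row.
wide = pd.crosstab(long["row"], long["L"])

# Put the 'Net' column back in front of the indicator columns.
out = pd.concat([df["Net"], wide], axis=1)
print(out)
```

Tabulating against the row position rather than against Net keeps the two rows that share Net = 21 separate; tabulating against Net itself would merge them into one row.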

BENY · Accepted Answer · 2020-08-03 18:24:29Z

1

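We can use dot:

# keep only the 0/1 indicator columns (their labels must be strings)
s = df.drop(columns='Net')
# 0/1 times 'label,' keeps the labels of the 1-cells; trim the trailing comma, then split
df['New'] = s.dot(s.columns + ',').str[:-1].str.split(',')
df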
Out[283]: 
   Net  123  21  41  42  12    21        New
0  123    1   0   1   0   0     0  [123, 41]
1   21    0   0   0   0   0     1     [21.1]
2   41    0   0   1   1   0     0   [41, 42]
3   42    0   0   1   1   0     0   [41, 42]
4   12    0   0   0   0   1     0       [12]
5   21    0   1   0   0   0     0       [21]


1 Comment

user12907213 Over a year ago

Would it also be possible to use your code for strings instead of integers in the L column? I am asking because, after trying it with strings, I got this message: TypeError: can't multiply sequence by non-int of type 'str'. That is probably because of numpy.dot.
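
For string labels, one possible alternative is to join each list into a delimited string and let str.get_dummies split it again. This is a different technique from the dot approach in the answer, shown only as a sketch: it assumes L holds lists of strings and that no label contains the "|" delimiter, and the sample data and the names dummies and out are invented for illustration.

```python
import pandas as pd

# Hypothetical data with string labels instead of numbers.
df = pd.DataFrame({
    "Net": ["a", "b", "c"],
    "L": [["a", "c"], ["b"], ["c", "a"]],
})

# Join each list into one "|"-delimited string, then build 0/1 indicator columns.
dummies = df["L"].str.join("|").str.get_dummies(sep="|")

# Put the 'Net' column back in front of the indicator columns.
out = pd.concat([df["Net"], dummies], axis=1)
print(out)
```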

hoomant · Accepted Answer · 2020-08-03 18:23:15Z

0

I assume the values in your column 'L' are str (not list), and each value is separated by a comma. If so, you can:

# create a set of column names
columns = set()
for cols in df.L.unique():
    cols = cols.split(',')
    for col in cols:
        columns.add(col)

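# generate columns; compare whole values, since str.contains would also match '12' inside '123'
values = df.L.str.strip('[]').str.split(r',\s*')
for col in columns:
    df[col] = values.apply(lambda vals, col=col: col in vals)

# change False/True to 0/1 (a list is used here because newer pandas rejects a set as an indexer)
new_cols = sorted(columns)
df[new_cols] = df[new_cols].astype(int)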


4 Comments

user12907213 Over a year ago

Hi hoomant. In df.L.unique() unfortunately I get this: TypeError: unhashable type: 'list'.

hoomant Over a year ago

Is 'df' a pandas.DataFrame? If yes, what is the data type of the 'L' pandas.Series? (hint: df.L.dtype)

user12907213 Over a year ago

Thanks for replying. Yes, df is a dataframe. L is dtype('O')

hoomant Over a year ago

Try df['L'] = df.L.apply(str) before running the code above. (you may also need to change the 4th line to cols[1:-1].split(', '))
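
The workaround in the last comment can be sketched on a small sample as follows. This assumes L holds lists of integers; with lists of strings, str() would also leave quote characters around each value, which would need extra stripping. The sample frame is invented for illustration.

```python
import pandas as pd

# Hypothetical sample where 'L' holds Python lists (dtype object).
df = pd.DataFrame({"Net": [123, 21, 41], "L": [[123, 41], [21], [41, 42]]})

# Convert each list to its string form, e.g. [123, 41] -> "[123, 41]",
# so that the values become hashable and unique() works.
df["L"] = df["L"].apply(str)

# Collect the column names: drop the surrounding brackets, then split on ", ".
columns = set()
for cols in df["L"].unique():
    columns.update(cols[1:-1].split(", "))

print(sorted(columns, key=int))
```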
