I'm new to Python. I'm trying to create multiple columns in a for loop, but I'm having trouble with it. For each column in ohlcs, I want a new column that shows whether its values are greater than (or equal to) the values in each column in metrics. I can create one column at a time, but I want to save time since I plan on applying the same logic to other variables.

ohlcs = ['open', 'high', 'low', 'close']
metrics = ['vwap', '9EMA', '20EMA']
wip = []
for idx, row in master_df.iterrows():
    for ohlc in ohlcs:
        for metric in metrics:
            row[f'{ohlc} above {metric}'] = np.where(row[ohlc] >= row[metric], 1, 0)

This didn't do anything. I've also done this:

ohlcs = ['open', 'high', 'low', 'close']
metrics = ['vwap', '9EMA', '20EMA']
wip = []
for idx, row in master_df.iterrows():
    for ohlc in ohlcs:
        for metric in metrics:
            if master_df[ohlc] >= master_df[metric]:
                master_df[f'{ohlc} above {metric}'] = 1
            else:
                master_df[f'{ohlc} above {metric}'] = 0

That gave me an error.

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I did other things but I erased those as I worked on it. At this point I'm out of ideas. Please help!

I got it working now, but when I checked the values manually, they didn't line up.

[screenshot not included]

How do I fix it?

2 Answers

There is no need to iterate over the rows of the dataframe. This will give you the required result:

for ohlc in ohlcs:
    for metric in metrics:
        master_df[f'{ohlc} over {metric}'] = (master_df[ohlc] >= master_df[metric]).astype(int)

The astype(int) part just converts True and False into 1 and 0. If you are okay with a True/False representation, you can simply use master_df[f'{ohlc} over {metric}'] = master_df[ohlc] >= master_df[metric].

EDIT: Of course, (master_df[ohlc] >= master_df[metric]).astype(int) gives the same values as np.where(master_df[ohlc] >= master_df[metric], 1, 0); you can use either.
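
To make the equivalence concrete, here is a minimal sketch on made-up data. The real master_df isn't shown in the question, so the numbers below are only assumptions chosen for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real master_df is not shown in the question.
master_df = pd.DataFrame({
    "open": [10.0, 11.0, 9.5],
    "high": [10.5, 11.2, 10.0],
    "low": [9.8, 10.7, 9.0],
    "close": [10.2, 10.9, 9.9],
    "vwap": [10.1, 11.0, 9.7],
    "9EMA": [10.0, 11.1, 9.6],
    "20EMA": [9.9, 10.8, 9.8],
})

ohlcs = ["open", "high", "low", "close"]
metrics = ["vwap", "9EMA", "20EMA"]

for ohlc in ohlcs:
    for metric in metrics:
        # Comparing two columns gives a boolean Series, one value per row
        mask = master_df[ohlc] >= master_df[metric]
        master_df[f"{ohlc} over {metric}"] = mask.astype(int)
        # np.where on the same mask produces the same 0/1 values
        assert np.array_equal(mask.astype(int).to_numpy(), np.where(mask, 1, 0))

print(master_df["open over vwap"].tolist())  # [0, 1, 0]
```

The comparison runs on whole columns at once, so no row loop is involved.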

8 Comments

Thank you so much. I worked on this for hours and I always seem to get close. I just overcomplicate it haha.
I looked into the columns manually to see if it was right. It wasn't exactly. How do I fix that? I edited the post and attached a picture above. Thank you!
Hmm, that's strange behavior indeed. There may be a problem with the types of your columns. Try master_df.info() to see the datatypes of your columns; possibly some are being read as strings? If the datatypes are mismatched, you can try np.where(master_df[ohlc].astype(float) >= master_df[metric].astype(float), 1, 0).
It shows them both as float64. The results I got are integers but the columns used to get that were all float64. Is that weird?
That's not weird. The result of the comparison is 0 or 1, so we would expect an integer type. As for the wrong result of the comparison, I can't tell what the problem could be. Are you sure you're not setting the values of open above vwap somewhere else in the code? And are the other above columns correct? Perhaps you can provide some sample data.
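
One way to narrow this down, as a rough sketch only: recompute the comparison and list the rows where the stored flag disagrees. The helper name find_mismatches and the sample values are made up for illustration and are not part of pandas or the original post.

```python
import pandas as pd

def find_mismatches(df, ohlc, metric, flag_col):
    """Return rows whose stored 0/1 flag differs from a fresh comparison.

    Hypothetical helper for debugging, not part of pandas.
    """
    # Coerce both columns to numbers; values that cannot be parsed become NaN
    left = pd.to_numeric(df[ohlc], errors="coerce")
    right = pd.to_numeric(df[metric], errors="coerce")
    # A NaN on either side compares as False, so those rows get 0
    expected = (left >= right).astype(int)
    return df.loc[df[flag_col] != expected, [ohlc, metric, flag_col]]

# Made-up data: the flag in the second row is deliberately wrong
sample = pd.DataFrame({
    "open": [9.5, 10.0, 12.0],
    "vwap": [9.7, 10.0, 11.0],
    "open above vwap": [0, 0, 1],
})

print(sample.dtypes)
print(find_mismatches(sample, "open", "vwap", "open above vwap"))
```

If this returns no rows on the real data, the stored column agrees with the comparison, and the mismatch is more likely in how the values were read off manually (for example, rounding in the display).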

Consider itertools.product and the method form Series.ge to cover all pairwise combinations with a flatter loop:

from itertools import product
...

ohlcs = ['open', 'high', 'low', 'close']
metrics = ['vwap', '9EMA', '20EMA']

pairs = product(ohlcs, metrics)

for ohlc, metric in pairs:
    master_df[f"{ohlc} over {metric}"] = master_df[ohlc].ge(master_df[metric])
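
For reference, a self-contained version of the same idea on made-up data (the values are assumptions, not the asker's data). Note that .ge on its own produces True/False; the sketch adds .astype(int) in case 1/0 is wanted, as in the question:

```python
from itertools import product

import pandas as pd

# Hypothetical sample data standing in for master_df
master_df = pd.DataFrame({
    "open": [10.0, 11.0], "high": [10.5, 11.2],
    "low": [9.8, 10.7], "close": [10.2, 10.9],
    "vwap": [10.1, 11.0], "9EMA": [10.0, 11.1], "20EMA": [9.9, 10.8],
})

ohlcs = ["open", "high", "low", "close"]
metrics = ["vwap", "9EMA", "20EMA"]

# product() returns a one-shot iterator; list() lets the pairs be reused
pairs = list(product(ohlcs, metrics))

for ohlc, metric in pairs:
    # .ge() is the method form of >=; astype(int) turns True/False into 1/0
    master_df[f"{ohlc} over {metric}"] = master_df[ohlc].ge(master_df[metric]).astype(int)

print(len(pairs))  # 12
print(master_df["close over 9EMA"].tolist())  # [1, 0]
```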
