1

I have a dataframe with 6 columns and I wanted to use 5 columns of it (i.e FIRST TRAVEL, SECOND TRAVEL, THIRD TRAVEL) which contains date in it. From these 5 columns, I wanted to consider the greatest date and compare it against the given date "2020-09-25 00:00:00"

Following Condition must be met:

  • If the date is greater than the input date, then it is good. If not we will have to write as "Offer Expired" in a new column named RESULT.

,

  Customer Name        FIRST TRAVEL       SECOND TRAVEL        THIRD TRAVEL       FOURTH TRAVEL        FIFTH TRAVEL         RESULT
0         USER1                 NaT 2020-09-02 08:21:59                 NaT                 NaT                 NaT  Offer Expired
1         USER2                 NaT 2014-11-05 15:23:38                 NaT                 NaT                 NaT  Offer Expired
2         USER3                 NaT                 NaT                 NaT                 NaT                 NaT            NaN
3         USER4                 NaT                 NaT                 NaT                 NaT                 NaT            NaN
4         USER5                 NaT                 NaT                 NaT                 NaT                 NaT            NaN
5         USER6                 NaT                 NaT                 NaT                 NaT                 NaT            NaN
6         USER7                 NaT                 NaT                 NaT                 NaT                 NaT            NaN
7         USER8                 NaT                 NaT                 NaT                 NaT                 NaT            NaN
8         USER9                 NaT 2020-09-02 10:07:11                 NaT                 NaT                 NaT  Offer Expired
9        USER10 2020-03-16 00:00:00                 NaT                 NaT                 NaT                 NaT  Offer Expired
10       USER11 2019-12-11 00:00:00                 NaT                 NaT                 NaT                 NaT  Offer Expired
11       USER12 2020-09-26 00:00:00 2020-04-14 00:00:00                 NaT                 NaT                 NaT            NaN
12       USER13 2020-04-20 00:00:00 2019-10-18 00:00:00                 NaT                 NaT                 NaT  Offer Expired
13       USER14 2020-02-21 00:00:00 2020-04-20 00:00:00                 NaT                 NaT                 NaT  Offer Expired
14       USER15 2020-01-17 00:00:00 2019-10-17 00:00:00                 NaT                 NaT                 NaT  Offer Expired
15       USER16                 NaT 2020-04-20 00:00:00                 NaT                 NaT                 NaT  Offer Expired
16       USER17                 NaT 2019-08-24 00:00:00                 NaT                 NaT                 NaT  Offer Expired
17       USER18                 NaT 2019-11-01 00:00:00                 NaT                 NaT                 NaT  Offer Expired
18       USER19                 NaT 2019-09-13 00:00:00                 NaT                 NaT                 NaT  Offer Expired
19       USER20                 NaT 2020-01-13 00:00:00                 NaT                 NaT                 NaT  Offer Expired
20       USER21                 NaT 2019-09-13 00:00:00                 NaT                 NaT                 NaT  Offer Expired
21       USER22                 NaT 2020-04-20 00:00:00                 NaT                 NaT                 NaT  Offer Expired
22       USER23                 NaT 2020-02-12 00:00:00                 NaT                 NaT                 NaT  Offer Expired
23       USER24                 NaT 2019-10-18 00:00:00                 NaT                 NaT                 NaT  Offer Expired
24       USER25 2020-09-06 22:09:22 2020-04-07 00:00:00 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36  Offer Expired
25       USER26 2020-09-06 22:09:22 2020-04-21 00:00:00 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36  Offer Expired
26       USER27                 NaT                 NaT                 NaT 2020-09-04 17:03:20 2020-06-03 19:45:36  Offer Expired
27       USER28                 NaT                 NaT                 NaT 2020-09-04 17:03:20 2020-06-03 19:45:36  Offer Expired
28       USER29 2020-09-06 22:09:22 2020-04-17 00:00:00 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36  Offer Expired
29       USER30 2020-09-06 22:09:22                 NaT                 NaT                 NaT 2020-06-03 19:45:36  Offer Expired
30       USER31 2020-09-06 22:09:22                 NaT                 NaT                 NaT 2020-06-03 19:45:36  Offer Expired
31       USER32 2020-09-06 22:09:22                 NaT                 NaT                 NaT 2020-06-03 19:45:36  Offer Expired
32       USER33 2020-09-06 22:09:22                 NaT                 NaT                 NaT 2020-06-03 19:45:36  Offer Expired
33       USER34 2020-09-06 22:09:22 2020-10-27 00:00:00 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36            NaN
34       USER35 2020-09-06 22:09:22 2019-06-18 00:00:00 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36  Offer Expired
35       USER36 2020-09-06 22:09:22 2020-04-15 00:00:00 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36  Offer Expired
36       USER37 2020-09-06 22:09:22 2020-09-04 15:29:45 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36  Offer Expired
37       USER38 2020-09-06 22:09:22                 NaT                 NaT 2020-09-25 17:03:20 2020-06-03 19:45:36            NaN
38       USER39                 NaT                 NaT                 NaT 2020-09-04 17:03:20 2020-06-03 19:45:36  Offer Expired

Note: This has been simpler in Excel where we can use the below formula. However, I couldn't find a way to do this Pandas.

=IF(COUNTBLANK($B2:$F2)=5,"", IF(MAX($B2:$F2)>$H$1,"","Offer Expired"))

Any help is appreciated.

1
  • I would start by applying the pandas melt function to get a new dataframe with just two columns "User" and "Travel Date" see the following Pandas Melt with Multiple Value Vars for approach. Then you can sort by dates and group by users and easily find the travel dates for each user and determine if they fall into your time frame. Commented Nov 25, 2020 at 0:55

3 Answers 3

2
+100

This will do the trick (comments inline)

import numpy as np
import pandas as pd

# I'm assuming all the relevant columns are already converted:

dt = pd.to_datetime("2020-09-25 00:00:00") 

# you need to indicate somehow the columns to compare - using regex on column names:

dftravels = df.filter(regex=".* TRAVEL$", axis=1)

# NaT and any logical 2 argument operation on it always will evaluate to False
# hence you check only ones where there's not a single date after dt and where there's at least one date overall

df["Result"] = np.where(~dftravels.gt(dt).any(axis=1) & dftravels.any(axis=1), "Offer expired", "")

Sign up to request clarification or add additional context in comments.

2 Comments

Hello, thanks for helping. Is it possible to use without regex, because I have few other columns with the word TRAVEL in it and therefore I do not wanted to use regex.
Yup - just replace the assignment line: dftravels = df[[<column names for columns with dates>]]
2
import pandas as pd

# Date to compare with
my_date = pd.to_datetime("2020-09-25 00:00:00")

# Columns to search in
columns = [
  "FIRST TRAVEL", "SECOND TRAVEL", "THIRD TRAVEL",
  "FOURTH TRAVEL", "FIFTH TRAVEL"
]

# Function to find if the offer expired
def offer_expired(row):
  # Returns True if the offer expires
  date_found = False
  expired = True
  for column in columns:
    # Valid date
    if not pandas.isnull(row[column]):
      date_found = True
      # Expired and date newer than given date
      if row[column] >= my_date and expired:
        expired = False
  return expired and date_found

df["RESULT"] = df.apply(lambda row: "Offer Expired" if offer_expired(row) else "", axis=1)

Comments

0

Best way you can translate excel expressions is to use np.where():

given_date = pd.to_datetime("2020-09-25 00:00:00")
columns = df.columns.str.endswith('TRAVEL') # select as per you want you just need a list of columns to go with
df.RESULT = np.where(df.loc[:,columns].max(axis = 1) < given_date, 'Offer Valid', 'Offer Expired')

Also try to read the expression from left to right: where selected columns of df have row-wise max of dates less than given date then offer is valid else offer is expired.

It translates exactly to your question! 1 line, fully functional and highly intuitive! Pandas is fun!

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.