I'm trying to think of the most Pythonic way to accomplish this. Right now the only method I can think of is brute force.

The user passes a date on the command line in one of the following forms (e.g. ./mypy.py date='20110909.00 23'):

date='20110909'
date='20110909.00 23'
date='20110909.00 20110909.23'

All three examples should produce the same result. It doesn't matter whether it populates a list (which I can sort) such as

['20110909.00', '20110909.23']

or two separate sorted variables, but in all cases the format is YYYYMMDD.HH, and I need to make sure it is indeed a date and not text.

Any ideas?

Thank you.

+++++ EDIT +++++ After plugging away at this, I think I need to do a lot of date checking/manipulating first, which all seems to be working just great. But at the very end I run the list through the date validation and it fails every time, even when it should pass.

(I launch it with) ./test.py date='20110909.00 23'

(or any variation of date - i.e. date='20 22' or date='20110909' or date='20110909.00 23' etc.)

import sys, re, time, datetime

now = datetime.datetime.now()
tempdate=[]
strfirstdate=None
strtempdate=None

temparg2 = sys.argv
del temparg2[0]
tempdate = temparg2[0].replace('date=','')
date = tempdate.split(' ');

tempdate=[]
date.sort(key=len, reverse=True)
result = None

# If no date is passed then create list according to [YYYYMMDD.HH, YYYYMMDD.HH]
if date[0] == 'None':
    tempdate.extend([now.strftime('%Y%m%d.00'), now.strftime('%Y%m%d.%H')])


# If length of date list is 1 then see if it is YYYYMMDD only or HH only, and create list according to [YYYYMMDD.HH, YYYYMMDD.HH]
elif len(date) == 1:
    if len(date[0]) == 8:
        tempdate.extend([ date[0] + '.00', date[0] + '.23'])
    elif len(date[0]) == 2:
        tempdate.extend([now.strftime('%Y%m%d') + '.' + date[0], now.strftime('%Y%m%d') + '.' + date[0]])
    else:
        tempdate.extend([date[0], date[0]])


# Iterate through list, see if value is YYYYMMDD only or HH only or YYYYMMDD.HH, and create list according to [YYYYMMDD.HH, YYYYMMDD.HH] - maximum of 2 values
else:
    for _ in range(2):
        if len(date[_]) == 8:
            strfirstdate = date[0]
            tempdate.append(date[_] + '.00')  # append the string itself, not a one-element list
        elif len(date[_]) == 2:
            if _ == 0:  # both values passed could be hours only
                tempdate.append(now.strftime('%Y%m%d') + '.' + date[_])
            else:  # we must be at the 2nd value passed.
                if strfirstdate == None:
                    tempdate.append(now.strftime('%Y%m%d') + '.' + date[_])
                else:
                    tempdate.append(strfirstdate + '.' + date [_])
        else:
            strfirstdate = date[0][:8]
            tempdate.append(date[_])

tempdate.sort()


for s in tempdate:
    try:
        result = datetime.datetime.strptime(s, '%Y%m%d.%H')
    except:
        pass

if result is None:
    print 'Malformed date.'
else:
    print 'Date is fine.'

print tempdate

++++ Edit 2 ++++ If I remove the bottom part (after tempdate.sort()) and replace it with this.

strfirstdate = re.compile(r'([0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]+\.[0-9][0-9])')
for s in tempdate:
    if re.match(strfirstdate, s):
        result = "validated"
    else:
        print "#####################"
        print "#####################"
        print "##  error in date  ##"
        print "#####################"
        print "#####################"
        sys.exit(1)  # a bare `exit` is never called and does nothing

It will validate appropriately.

This entire method just doesn't seem to be very pythonic.

2
  • What do you mean by brute force? Obviously you have to implement some logic to separate the different cases you have shown. Just do that, show your code and we will help you to make it more Pythonic. Commented Sep 9, 2011 at 20:45
  • @Achim What I had started doing was to first look at the length of the item; if it was 2 long, then validate that it is a number. If not, then validate against the regex thedate = re.compile(r'([0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]+\.[0-9][0-9])') if re.match(thedate, item): print "validated" ... Commented Sep 9, 2011 at 22:27

4 Answers

7

You can create a mask and parse it, using try...except to determine whether the date string matches one of the many masks. I had this code for a project, so I've slightly modified it:

from datetime import datetime

date = '20110909.00 20110909.23'.split(' ')[0]
result = None

# Try each mask in turn; strptime raises ValueError when the string doesn't fit.
for format in ['%Y%m%d', '%Y%m%d.%H']:
  try:
    result = datetime.strptime(date, format)
  except ValueError:
    pass

if result is None:
  print 'Malformed date.'
else:
  print 'Date is fine.'

2 Comments

+1, though I'd use the simpler result = datetime.strptime(date, format)
Thank you @Blender. I found adding %H to the format formatted for hours too. One question though - and I guess I wasn't that clear, but if it's only the hour being passed I'd like the output to add the date to the hour. Likewise if it's just the date passed without the hour.
1

I found some problems when I attempted to use the try..except code example in my own parsing so here's a version with the fixes I added, and I also addressed the question of handling only the hour part:

from datetime import datetime

dates = ['20110909.00','20110909.23','13','20111212','20113131']

def dateTest(date):
  dateOk = False
  for format in ['%Y%m%d', '%Y%m%d.%H', '%H']:
    try:
      result = datetime.strptime(date, format)
      dateOk = (date == result.strftime(format)) # this makes sure the parsed date matches the original string
      if dateOk and format == '%H': # this handles the hour only case
        date = '%s.%s' % (datetime.now().strftime('%Y%m%d'), date)
    except ValueError: # strptime raises ValueError when the string doesn't fit the format
      pass

  if dateOk:
    print 'Date is fine.'
  else:
    print 'Malformed date.'
  return date

for date in dates:
  print date
  print dateTest(date)
  print ''
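
Building on the round-trip idea above, here is a rough sketch of how the three input forms from the question might be normalised into a sorted [start, end] pair. The names split_token and normalise are hypothetical, and the defaults (hour 00 for a missing start hour, 23 for a missing end hour, today's date when no date is given) are assumptions read off the question rather than a confirmed spec. It uses print() with a single argument so it runs on both Python 2 and 3.

```python
import re
from datetime import datetime


def split_token(token):
    """Return (date_part, hour_part); either may be None. Hypothetical helper."""
    m = re.match(r'^(\d{8})(?:\.(\d{2}))?$', token)
    if m:
        return m.group(1), m.group(2)
    if re.match(r'^\d{2}$', token):
        return None, token
    raise ValueError('unrecognised token: %r' % token)


def normalise(arg, today=None):
    """Turn 'YYYYMMDD', 'YYYYMMDD.HH HH', etc. into two YYYYMMDD.HH strings."""
    today = today or datetime.now().strftime('%Y%m%d')
    tokens = arg.split()
    if not 1 <= len(tokens) <= 2:
        raise ValueError('expected one or two values, got %d' % len(tokens))

    parts = [split_token(t) for t in tokens]
    if len(parts) == 1:
        parts = parts * 2  # a single value describes both ends of the range

    # Assumption: an hour-only value borrows the date of the other value,
    # falling back to today's date when neither value carries one.
    default_date = next((d for d, _ in parts if d), today)

    result = []
    for (date_part, hour_part), fallback_hour in zip(parts, ('00', '23')):
        stamp = '%s.%s' % (date_part or default_date, hour_part or fallback_hour)
        # strptime raises ValueError for impossible values such as day 35 or hour 74.
        datetime.strptime(stamp, '%Y%m%d.%H')
        result.append(stamp)
    return sorted(result)


for arg in ('20110909', '20110909.00 23', '20110909.00 20110909.23'):
    print(normalise(arg))  # each prints ['20110909.00', '20110909.23']
```

The regular expression only decides which shape a value has; the actual date check is left to strptime, so something like 20110935.08 is rejected even though it has the right number of digits.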


0

Take a look at the time module. Specifically, see the time.strptime() function.

There's also a pretty easy conversion between time values and datetime objects.
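
As a small illustration of both points (a sketch only, using the YYYYMMDD.HH layout from the question as the assumed format): time.strptime() returns a struct_time, whose first six fields can be passed straight to the datetime constructor, and datetime.timetuple() goes back the other way.

```python
import time
from datetime import datetime

stamp = '20110909.23'

# time.strptime returns a time.struct_time and raises ValueError on bad input.
parsed = time.strptime(stamp, '%Y%m%d.%H')

# The first six fields are year, month, day, hour, minute, second,
# which is what the datetime constructor expects.
as_datetime = datetime(*parsed[:6])

# Going the other way: timetuple() gives back a struct_time for time.strftime.
round_trip = time.strftime('%Y%m%d.%H', as_datetime.timetuple())

print(as_datetime)  # 2011-09-09 23:00:00
print(round_trip)   # 20110909.23
```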

5 Comments

@Achim - My goal wasn't to solve the problem. It was just to provide the right reference material so he could solve it himself. Teach a man to fish, and all that...
@Alex Smith, that's a better approach for questions tagged [homework], which this isn't
@Daenyth - Helping someone solve a problem themself is useful even if it's not a school assignment.
I'm not going to let it get a -1 vote. I appreciate the links.
@Chasester - Thanks. I'm glad you found them useful :)
0

Does this help?

from datetime import datetime
import re

reg = re.compile(r'(\d{4})(\d\d)(\d\d)'
                 r'(?:\.(\d\d)(\d\d)?(\d\d)? *'
                 r'(?:(\d{4})(\d\d)(\d\d)\.)?(\d\d)(\d\d)?(\d\d)? *)?')

for x in ('20110909',
          '20110909.00 23',
          '20110909.00 74',
          '20110909.00 20110909.23',
          '20110909.00 19980412.23',
          '20110909.08 20110909.23',
          '20110935.08 20110909.23',
          '20110909.08 19970609.51'):
    print x

    gr = reg.match(x).groups('000')

    try:
        x1 = datetime(*map(int,gr[0:6]))

        if gr[6]=='000':

            if gr[9]=='000':
                x2 = x1

            else:
                y = map(int,gr[0:3] + gr[9:12])
                try:
                    x2 = datetime(*y)
                except:
                    x2 = "The hour in the second part isn't in range(0,25)"

        else:
            y = map(int,gr[6:12])
            try:
                x2 = datetime(*y)
            except:
                x2 = "The second part doesn't represent a real date"
    except:
        x1 = "The first part dosen't represent a real date"
        x2 = '--'

    print [str(x1),str(x2)],'\n'

result

20110909
['2011-09-09 00:00:00', '2011-09-09 00:00:00'] 

20110909.00 23
['2011-09-09 00:00:00', '2011-09-09 23:00:00'] 

20110909.00 74
['2011-09-09 00:00:00', "The hour in the second part isn't in range(0,25)"] 

20110909.00 20110909.23
['2011-09-09 00:00:00', '2011-09-09 23:00:00'] 

20110909.00 19980412.23
['2011-09-09 00:00:00', '1998-04-12 23:00:00'] 

20110909.08 20110909.23
['2011-09-09 08:00:00', '2011-09-09 23:00:00'] 

20110935.08 20110909.23
["The first part dosen't represent a real date", '--'] 

20110909.08 19970609.51
['2011-09-09 08:00:00', "The second part doesn't represent a real date"]  


Note that groups('000') replaces None with '000' for every group that did not take part in the match.
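
A minimal illustration of that default argument, using a deliberately smaller, hypothetical two-group pattern rather than the full expression above:

```python
import re

# Hypothetical pattern: an 8-digit date, then an optional ".HH" suffix.
pattern = re.compile(r'(\d{8})(?:\.(\d\d))?')

match = pattern.match('20110909')

# Without a default, a group that did not take part in the match is None.
print(match.groups())       # ('20110909', None)

# With a default, that None is replaced by the given value.
print(match.groups('000'))  # ('20110909', '000')
```

Using '000' as the filler works here because int('000') is 0, and a three-character string can never be confused with a real two-digit group.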

