0

I have the following numpy array:

array(['00:00', '00:05', '00:15', '00:20', '00:25', '00:30', '00:35',
       '00:40', '00:45', '00:50', '00:55', '01:00', '01:05', '01:10',
       '01:15', '01:20', '01:40', '01:45', '01:55', '02:05', '02:10',
       '02:15', '02:35', '02:40', '02:45', '02:55', '03:05', '03:10',
       '03:30', '03:55', '04:00', '04:05', '04:25', '04:40', '04:55',
       '05:00', '05:05', '05:15', '05:20', '05:25', '05:30', '05:35',
       '05:50', '05:55', '06:05', '06:20', '06:25', '06:30', '06:35',
       '06:45', '06:50', '07:05', '07:15', '07:30', '07:40', '07:45',
       '07:50', '07:55', '08:10', '08:20', '08:25', '08:40', '08:45',
       '08:50', '09:15', '09:20', '09:45', '09:50', '09:55', '10:10',
       '10:15', '10:25', '10:30', '10:45', '10:50', '11:00', '11:05',
       '11:15', '11:25', '11:35', '11:45', '11:50', '11:55', '12:00',
       '12:10', '12:15', '12:25', '12:50', '12:55', '13:00', '13:40',
       '13:45', '13:50', '14:00', '14:10', '14:20', '14:35', '14:55',
       '15:05', '15:10', '15:15', '15:20', '15:25', '15:45', '15:55',
       '16:10', '16:15', '16:20', '16:25', '16:35', '16:45', '16:50',
       '16:55', '17:05', '17:30', '17:35', '17:45', '17:50', '18:00',
       '18:05', '18:10', '18:15', '18:20', '18:30', '18:35', '18:45',
       '19:00', '19:10', '19:20', '19:40', '19:50', '20:00', '20:15',
       '20:20', '20:35', '20:45', '20:55', '21:00', '21:05', '21:15',
       '21:20', '21:25', '21:30', '21:40', '21:45', '22:00', '22:10',
       '22:15', '22:25', '22:40', '22:45', '22:50', '22:55'], dtype='<U5')

I would like to automatically replace all values not ending in '00' with an empty string, so I would get:

array(['00:00', '', '', '', '', '', '',
       '', '', '', '', '01:00', '', '',
       '', '', '', '', '', '', '',
       ...
       '', '', '', '', '', ''], dtype='<U5')

Ideally using something which is part of the numpy library.

3
  • 5
    np.where(np.char.endswith(arr, ":00"), arr, "") Commented Jan 10, 2022 at 3:08
  • Why is this an array? Why not a list? Commented Jan 10, 2022 at 7:51
  • Note that the question asked for an array. But as stated in your answer it is clear that a list might be a better answer. Commented Jan 10, 2022 at 10:31
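
For completeness, here is a minimal runnable sketch of the `np.where` / `np.char.endswith` suggestion from the first comment, using an empty string as the replacement as the question asks. The name `arr` and the shortened sample data are assumptions for illustration; the full array from the question should behave the same way.

```python
import numpy as np

# Shortened sample of the question's data (assumption: any array of 'HH:MM' strings).
arr = np.array(['00:00', '00:05', '00:55', '01:00', '01:05', '02:10'])

# Boolean mask: True where the string ends with ':00'.
on_the_hour = np.char.endswith(arr, ':00')

# Keep matching entries and replace everything else with an empty string.
result = np.where(on_the_hour, arr, '')
print(result)
```

This returns a new array and leaves `arr` untouched, which may matter if the original labels are still needed elsewhere.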

3 Answers

5

You can use list comprehension with endswith:

# lst is assumed to hold the time strings from the question, e.g. lst = arr.tolist()
output = [s if s.endswith('00') else '' for s in lst]
print(output)
# ['00:00', '', '', '', '', '', '', '', '', '', '', '01:00', '', '', '', '', '',
#  '', '', '', '', '', '', '', '', '', '', '', '', '', '04:00', '', '', '',
#  '', '05:00', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
#  '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
#  '', '', '', '', '', '', '', '11:00', '', '', '', '', '', '', '', '12:00', '',
#  '', '', '', '', '13:00', '', '', '', '14:00', '', '', '', '', '', '', '', '',
#  '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '18:00',
#  '', '', '', '', '', '', '', '19:00', '', '', '', '', '20:00', '', '', '', '',
#  '', '21:00', '', '', '', '', '', '', '', '22:00', '', '', '', '', '', '', '']

5 Comments

This is useful, but the accepted answer would be @bb1's comment, since the question referred to numpy.
@M.E. Yes, I am aware of that. But I was reluctant to use a numpy array for non-numeric items. For example, with a list of length 15,300,000 (100000 * your list), the numpy approach took 15 seconds whereas the list comprehension approach took 2 seconds on my machine. I have a hunch that there is almost nothing to gain from using numpy for strings or general objects (although I am not really sure about this statement).
Starting with a list is 5x faster than starting with and returning an array.
@j1-lee thanks, that is a valuable comparison. I am using numpy arrays for matplotlib and I wanted to keep the same data types for everything (both numeric and non-numeric). I am not sure whether there are use cases where numpy is preferable to lists for strings; I suspect there might be, given how many scenarios you can face. Both answers are relevant to better understand numpy.
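
The timing comparison discussed in these comments could be reproduced with something like the sketch below. The helper names `with_list` and `with_numpy`, the sample data, and the repetition counts are all assumptions made for illustration; absolute timings depend on the machine and the NumPy version, so no numbers are claimed here.

```python
import timeit

import numpy as np

# Hypothetical test data: a short sample repeated to get a longer input.
arr = np.array(['00:00', '00:05', '00:55', '01:00', '01:05', '02:10'] * 10_000)
lst = arr.tolist()


def with_list(values):
    """List comprehension over plain Python strings."""
    return [s if s.endswith('00') else '' for s in values]


def with_numpy(values):
    """Vectorised string test followed by np.where."""
    return np.where(np.char.endswith(values, '00'), values, '')


# Both approaches should agree before their speed is compared.
assert with_list(lst) == with_numpy(arr).tolist()

print('list :', timeit.timeit(lambda: with_list(lst), number=5))
print('numpy:', timeit.timeit(lambda: with_numpy(arr), number=5))
```

Neither function modifies its input, so the measurement can be repeated safely.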
1

The traditional numpy way:

import numpy as np

narr = np.array(['00:00', '00:05', '00:15', '00:20', '00:25', '00:30', '00:35',
       '00:40', '00:45', '00:50', '00:55', '01:00', '01:05', '01:10',
       '01:15', '01:20', '01:40', '01:45', '01:55', '02:05', '02:10',
       '02:15', '02:35', '02:40', '02:45', '02:55', '03:05', '03:10',
       '03:30', '03:55', '04:00', '04:05', '04:25', '04:40', '04:55',
       '05:00', '05:05', '05:15', '05:20', '05:25', '05:30', '05:35',
       '05:50', '05:55', '06:05', '06:20', '06:25', '06:30', '06:35',
       '06:45', '06:50', '07:05', '07:15', '07:30', '07:40', '07:45',
       '07:50', '07:55', '08:10', '08:20', '08:25', '08:40', '08:45',
       '08:50', '09:15', '09:20', '09:45', '09:50', '09:55', '10:10',
       '10:15', '10:25', '10:30', '10:45', '10:50', '11:00', '11:05',
       '11:15', '11:25', '11:35', '11:45', '11:50', '11:55', '12:00',
       '12:10', '12:15', '12:25', '12:50', '12:55', '13:00', '13:40',
       '13:45', '13:50', '14:00', '14:10', '14:20', '14:35', '14:55',
       '15:05', '15:10', '15:15', '15:20', '15:25', '15:45', '15:55',
       '16:10', '16:15', '16:20', '16:25', '16:35', '16:45', '16:50',
       '16:55', '17:05', '17:30', '17:35', '17:45', '17:50', '18:00',
       '18:05', '18:10', '18:15', '18:20', '18:30', '18:35', '18:45',
       '19:00', '19:10', '19:20', '19:40', '19:50', '20:00', '20:15',
       '20:20', '20:35', '20:45', '20:55', '21:00', '21:05', '21:15',
       '21:20', '21:25', '21:30', '21:40', '21:45', '22:00', '22:10',
       '22:15', '22:25', '22:40', '22:45', '22:50', '22:55'], dtype='<U5')


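# readwrite: each element is read for the test and then overwritten in place
with np.nditer(narr, op_flags=['readwrite']) as it:
    for x in it:
        # str(x) is the element as text; blank it unless it ends in '00'
        if not str(x).endswith('00'):
            x[...] = ''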

print(narr)

3 Comments

What's the benefit of using nditer?
Better control over the array iteration, though most developers don't like the traditional ways.
It's much slower than the list comprehension answer (even for an array). And for some obscure reason it gives me overwrite errors when I try multiple timeit loops. Here you are simply iterating through the array, so I don't see the need for better control.
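
One plausible explanation for errors under repeated timing is that the loop edits `narr` in place, so later runs see entries that are already `''`; any test that parses the last two characters as an integer would then fail on an empty string. If in-place modification is wanted, a boolean mask does the same job without an explicit iterator. This is a sketch on a shortened sample, not the answer author's method.

```python
import numpy as np

# Shortened sample of the array from the answer above.
narr = np.array(['00:00', '00:05', '01:00', '01:40', '02:05'])

# True for every entry that does NOT end in '00'.
mask = ~np.char.endswith(narr, '00')

# Overwrite those entries in place.
narr[mask] = ''

# Running the same step again is harmless: '' does not end in '00',
# so already-blanked entries are simply set to '' once more.
narr[~np.char.endswith(narr, '00')] = ''
print(narr)
```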
0

You can try regex like:

import numpy as np
import re
x = np.array(['00:00', '00:05', '00:15', '00:20', '00:25', '00:30', '00:35',
       '00:40', '00:45', '00:50', '00:55', '01:00', '01:05', '01:10',
       '01:15', '01:20', '01:40', '01:45', '01:55', '02:05', '02:10',
       '02:15', '02:35', '02:40', '02:45', '02:55', '03:05', '03:10',
       '03:30', '03:55', '04:00', '04:05', '04:25', '04:40', '04:55',
       '05:00', '05:05', '05:15', '05:20', '05:25', '05:30', '05:35',
       '05:50', '05:55', '06:05', '06:20', '06:25', '06:30', '06:35',
       '06:45', '06:50', '07:05', '07:15', '07:30', '07:40', '07:45',
       '07:50', '07:55', '08:10', '08:20', '08:25', '08:40', '08:45',
       '08:50', '09:15', '09:20', '09:45', '09:50', '09:55', '10:10',
       '10:15', '10:25', '10:30', '10:45', '10:50', '11:00', '11:05',
       '11:15', '11:25', '11:35', '11:45', '11:50', '11:55', '12:00',
       '12:10', '12:15', '12:25', '12:50', '12:55', '13:00', '13:40',
       '13:45', '13:50', '14:00', '14:10', '14:20', '14:35', '14:55',
       '15:05', '15:10', '15:15', '15:20', '15:25', '15:45', '15:55',
       '16:10', '16:15', '16:20', '16:25', '16:35', '16:45', '16:50',
       '16:55', '17:05', '17:30', '17:35', '17:45', '17:50', '18:00',
       '18:05', '18:10', '18:15', '18:20', '18:30', '18:35', '18:45',
       '19:00', '19:10', '19:20', '19:40', '19:50', '20:00', '20:15',
       '20:20', '20:35', '20:45', '20:55', '21:00', '21:05', '21:15',
       '21:20', '21:25', '21:30', '21:40', '21:45', '22:00', '22:10',
       '22:15', '22:25', '22:40', '22:45', '22:50', '22:55'])

print(np.array(list(map(lambda v: re.sub(r'[0-9]{2}:(([1-9][0-9])|(0[1-9]))', '', v), x))))
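
To make the pattern easier to read: it matches two digits, a colon, and then any minutes value other than `00` (either a non-zero first digit, or `0` followed by a non-zero digit). A hedged variant of the same idea, with the pattern compiled once and a comprehension instead of `map`, might look like this; the name `non_zero_minutes` and the short sample are assumptions for illustration.

```python
import re

import numpy as np

# Shortened sample of the array above.
x = np.array(['00:00', '00:05', '00:50', '01:00', '10:10'])

# Two digits, a colon, then minutes other than '00'.
non_zero_minutes = re.compile(r'[0-9]{2}:(?:[1-9][0-9]|0[1-9])')

# Blank every entry whose whole text matches the pattern; keep the rest.
result = np.array(['' if non_zero_minutes.fullmatch(v) else v for v in x])
print(result)
```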

