How to split a string type array by value

Question

Say I have an array of str:

['12.5', '7', '45', '\n', '13.7', '52', '34.3', '\n']

And I want to split it by value, in this case by '\n', so it becomes:

[['12.5',  '7', '45'],
 ['13.7', '52', '34.3']]

I don't want to loop over every element myself, since that is time-consuming when the input is large. So I wonder if there are some functions or Python tricks that can easily achieve this.

P.S.

I've seen this question, but it doesn't help much, mainly because I don't quite understand how np.where() works with np.split(), and also because I'm working with str values.

Another thing that might be helpful: my final goal is to generate a matrix of numbers (probably float type), so I'd also be glad to know if there's any numpy function that can do this.

Even if you don't want to use a loop to iterate through your elements and you prefer using "some functions or python tricks that can easily achieve this", the tools you are looking for will use a loop internally. So why not write one yourself for such a basic operation?
– IMCoins, Commented Jan 22, 2018 at 8:36
@IMCoins I learned from some courses that many packages use the GPU for matrix computations, which is faster than implementing it myself with an explicit for loop.
– Amarth Gûl, Commented Jan 22, 2018 at 8:38
@AmarthGûl Unfortunately, most of the packages that do that are 3rd party packages, and a loop is usually your best bet because it is implemented in C.
– cs95, Commented Jan 22, 2018 at 8:45
@cᴏʟᴅsᴘᴇᴇᴅ Well, when implementing matrix computations, I found numpy functions to be much faster than operations I wrote myself. So I was actually hoping numpy could save me again. It now seems you're right: the answers below still use for loops.
– Amarth Gûl, Commented Jan 22, 2018 at 8:52

user2390182 · Accepted Answer · 2018-01-22 08:49:16Z

2

You can use itertools.groupby which, of course, does iterate the list, but is highly optimized:

from itertools import groupby

lst = ['12.5', '7', '45', '\n', '13.7', '52', '34.3', '\n']

[list(g) for k, g in groupby(lst, '\n'.__eq__) if not k]
# [['12.5', '7', '45'], ['13.7', '52', '34.3']]

Or, with float conversion:

[list(map(float, g)) for k, g in groupby(lst, '\n'.__eq__) if not k]
# [[12.5, 7.0, 45.0], [13.7, 52.0, 34.3]]
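
One thing to be aware of with the groupby approach: consecutive separators are merged into a single run, so an empty row is dropped rather than returned as an empty list, and a missing trailing '\n' makes no difference. A small sketch illustrating this (the helper name split_rows is made up for illustration, and it uses a lambda as the key so the separator can be passed as an argument):

```python
from itertools import groupby


def split_rows(items, sep='\n'):
    """Split a flat list on sep and convert each piece to floats."""
    # groupby yields one run per stretch of equal keys; the key is True
    # for separators, so only the False runs are kept.
    return [[float(x) for x in run]
            for is_sep, run in groupby(items, lambda x: x == sep)
            if not is_sep]


print(split_rows(['12.5', '7', '45', '\n', '13.7', '52', '34.3', '\n']))
# [[12.5, 7.0, 45.0], [13.7, 52.0, 34.3]]
print(split_rows(['1', '\n', '\n', '2']))
# [[1.0], [2.0]]  (the empty row between the two separators is dropped)
```

If empty rows need to be preserved, a plain loop that starts a new list at every separator is probably the simpler choice.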

edited Jan 22, 2018 at 8:49

answered Jan 22, 2018 at 8:34

user2390182


3 Comments

Mateen Ulhaq Over a year ago

Alternatively, one might also use pandas for similar functionality.

Kasramvd Over a year ago

Or [list(g) for k, g in groupby(lst, '\n'.__eq__) if not k]

user2390182 Over a year ago

@Kasramvd Very good point. Updated my answer. Maybe slightly less obvious to the beginner's eye, but definitely worth avoiding the lambda.

Mateen Ulhaq · Accepted Answer · 2018-01-22 09:01:58Z

1

Using numpy:

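import numpy as np
arr = np.array(['12.5', '7', '45', '\n', '13.7', '52', '34.3', '\n'])
rows = np.split(arr, np.where(arr == '\n')[0] + 1)[:-1]  # each row still ends with '\n'
mat = np.array(rows)[:, :-1].astype(float)  # drop the '\n' column; np.float was removed in NumPy 1.24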

Alternatively, if you're sure you're dealing with a matrix (every row has the same length and ends with '\n'), you could simply search for the first occurrence of '\n', reshape, and slice using that.

first = np.argmax(arr == '\n')  # index of the first '\n', i.e. the number of values per row
mat = arr.reshape(-1, first + 1)[:, :first].astype(float)  # np.float was removed in NumPy 1.24

This might be faster.
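
The reshape shortcut silently gives wrong results or raises an unrelated error if the rows are not all the same length. A hedged sketch that checks the layout before reshaping (the function name to_matrix and the error messages are my own, not part of NumPy):

```python
import numpy as np


def to_matrix(arr, sep='\n'):
    """Turn a flat string array with a separator after every row into a float matrix."""
    arr = np.asarray(arr)
    is_sep = arr == sep
    if not is_sep.any():
        raise ValueError("no separator found")
    width = int(np.argmax(is_sep)) + 1  # row length including the separator
    if arr.size % width != 0:
        raise ValueError("rows do not all have the same length")
    grid = arr.reshape(-1, width)
    # The last column must be all separators, and no separator may appear elsewhere.
    if not (grid[:, -1] == sep).all() or (grid[:, :-1] == sep).any():
        raise ValueError("rows do not all have the same length")
    return grid[:, :-1].astype(float)


mat = to_matrix(['12.5', '7', '45', '\n', '13.7', '52', '34.3', '\n'])
print(mat.shape)  # (2, 3)
```

The extra checks cost a few vectorized comparisons, so this should stay close to the speed of the bare reshape.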

edited Jan 22, 2018 at 9:01

answered Jan 22, 2018 at 8:56

Mateen Ulhaq


sytech · Accepted Answer · 2018-01-22 08:37:47Z

0

I made a thing for this once upon a time: a chunking module. It's made to work similarly to str.split.

pip install chunking

Then

>>> from chunking import split
>>> a_list = ["foo", 'bar', 'SEP', 'bacon', 'eggs']
>>> split(a_list, 'SEP')
[['foo', 'bar'], ['bacon', 'eggs']]

There's also chunking.iter_split, which is a generator variant of that.
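
If installing a package is not an option, a str.split-like generator is only a few lines of plain Python. This is a sketch of the idea, not the chunking module's actual implementation; the name iter_split_sketch is made up:

```python
def iter_split_sketch(items, sep):
    """Yield sublists of items, splitting at each occurrence of sep."""
    chunk = []
    for item in items:
        if item == sep:
            yield chunk
            chunk = []
        else:
            chunk.append(item)
    # Like str.split, always yield the final piece, even if it is empty.
    yield chunk


print(list(iter_split_sketch(["foo", "bar", "SEP", "bacon", "eggs"], "SEP")))
# [['foo', 'bar'], ['bacon', 'eggs']]
```

Note that, as with 'a,b,'.split(','), a trailing separator produces a trailing empty list, so input that ends with '\n' would need the last chunk dropped.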

answered Jan 22, 2018 at 8:37

sytech
