10

I have the following list of strings and want to sort it by the number embedded in each element. Plain sorted fails because it compares the strings lexicographically, so it gets the order wrong between numbers such as 10 and 3. I imagine I could do it with re, but that is not very interesting. Do you have any nice implementation ideas? Assume Python 3.x for this code.

names = [
'Test-1.model',
'Test-4.model',
'Test-6.model',
'Test-8.model',
'Test-10.model',
'Test-20.model'
]
number_sorted = get_number_sorted(names)
print(number_sorted)
'Test-20.model'
'Test-10.model'
'Test-8.model'
'Test-6.model'
'Test-4.model'
'Test-1.model'

7 Answers

7

the key is ... the key

sorted(names, key=lambda x: int(x.partition('-')[2].partition('.')[0]))

This separates out the numeric part of the string and converts it to an int, so that number is what determines the sort order.
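To make the partition chain easier to follow, here is a minimal sketch that spells out each step. The helper name number_key is made up for illustration, and it assumes every name has the form 'Test-<number>.model'.

```python
names = ['Test-10.model', 'Test-4.model', 'Test-1.model', 'Test-20.model']

def number_key(name):
    # 'Test-10.model'.partition('-') -> ('Test', '-', '10.model')
    after_dash = name.partition('-')[2]
    # '10.model'.partition('.') -> ('10', '.', 'model')
    digits = after_dash.partition('.')[0]
    return int(digits)

ascending = sorted(names, key=number_key)
descending = sorted(names, key=number_key, reverse=True)
print(ascending)
print(descending)
```

Passing reverse=True gives the largest-first order shown in the question.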

5

Some alternatives:

(1) Slicing by position:

sorted(names, key=lambda x: int(x[5:-6]))

(2) Stripping substrings:

sorted(names, key=lambda x: int(x.replace('Test-', '').replace('.model', '')))

Or better (Python 3.9 or later):

x.removeprefix('Test-').removesuffix('.model')

(3) Splitting characters (also possible via str.partition):

sorted(names, key=lambda x: int(x.split('-')[1].split('.')[0]))

(4) Map with np.argsort on any of (1)-(3) (requires import numpy as np):

list(map(names.__getitem__, np.argsort([int(x[5:-6]) for x in names])))
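As a sketch of option (2) using the string methods available from Python 3.9, the key could look like the following. The function name strip_key is hypothetical, and the fixed 'Test-' prefix and '.model' suffix are assumed from the question.

```python
names = ['Test-8.model', 'Test-10.model', 'Test-1.model']

def strip_key(name):
    # Requires Python 3.9+. Unlike str.replace, these methods only remove
    # a leading 'Test-' and a trailing '.model', not occurrences elsewhere.
    return int(name.removeprefix('Test-').removesuffix('.model'))

result = sorted(names, key=strip_key)
print(result)
```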

4 Comments

Given the goal is to sort the original strings, not just get the sorted numbers, using a key function to perform the transform would make more sense (and avoid an unnecessary genexpr), e.g. for your first example, sorted(names, key=lambda x: int(x[5:-6])), or for your second sorted(names, key=lambda x: int(x.replace('Test-', '').replace('.model', '')))
@ShadowRanger, yep I realise this now. I have edited my answer.
I like the multiple options now. That is innovative.
Since 3.9, x.removeprefix('Test-').removesuffix('.model') might be more appropriate than the .replace version. Doc for str.removeprefix and str.removesuffix
3

I found a solution myself in a similar question: Nonalphanumeric list order from os.listdir() in Python

import re
def sorted_alphanumeric(data):
    convert = lambda text: int(text) if text.isdigit() else text.lower()
    alphanum_key = lambda key: [ convert(c) for c in re.split('([0-9]+)', key) ] 
    return sorted(data, key=alphanum_key, reverse=True)
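To show what this kind of key actually produces, here is a small sketch of the same idea written as a plain function (the name alphanum_key is reused from the snippet above; the sample list is an assumption). Because re.split is given a capturing group, the digit runs are kept in the result, and string and integer parts always alternate in the same positions, so the resulting lists can be compared with each other.

```python
import re

def alphanum_key(text):
    # The capturing group keeps the digit runs in the split result:
    # 'Test-10.model' -> ['Test-', '10', '.model']
    parts = re.split(r'([0-9]+)', text)
    # Digit runs become ints, everything else is lower-cased text.
    return [int(p) if p.isdigit() else p.lower() for p in parts]

names = ['Test-10.model', 'Test-4.model', 'Test-20.model', 'Test-1.model']
key_example = alphanum_key('Test-10.model')
ordered = sorted(names, key=alphanum_key)
print(key_example)
print(ordered)
```

Unlike the fixed-format keys, this works for any file names that mix text and numbers, not only 'Test-N.model'.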

2

You can use re.findall in the key of the sort function:

import re
names = [
 'Test-1.model',
 'Test-4.model',
 'Test-6.model',
 'Test-8.model',
 'Test-10.model',
 'Test-20.model'
]
final_data = sorted(names, key=lambda x: int(re.findall(r'(?<=Test-)\d+', x)[0]), reverse=True)

Output:

['Test-20.model', 'Test-10.model', 'Test-8.model', 'Test-6.model', 'Test-4.model', 'Test-1.model']

1
def find_between( s, first, last ):
    try:
        start = s.index( first ) + len( first )
        end = s.index( last, start )
        return s[start:end]
    except ValueError:
        return ""

and then do something like

 sorted(names, key=lambda x: int(find_between(x, 'Test-', '.model')))
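One thing to watch: find_between returns "" when the markers are missing, and int("") raises ValueError. A possible way around that, assuming you want non-matching names sorted first, is sketched below. The number_key helper and its default of -1 are my own additions for illustration.

```python
def find_between(s, first, last):
    try:
        start = s.index(first) + len(first)
        end = s.index(last, start)
        return s[start:end]
    except ValueError:
        return ""

def number_key(name, default=-1):
    # Fall back to a default for names where no number was found,
    # since int("") would raise ValueError.
    text = find_between(name, 'Test-', '.model')
    return int(text) if text.isdigit() else default

names = ['Test-10.model', 'README.txt', 'Test-2.model']
result = sorted(names, key=number_key)
print(result)
```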

1

You can use the key parameter along with sorted() to accomplish this, assuming each string is formatted the same way:

def get_number_sorted(somelist):
    return sorted(somelist, key=lambda x: int(x.split('.')[0].split('-')[1]))

It looks like you might want your list reverse sorted (?), in which case you can add reverse=True as such:

def get_number_sorted(somelist):
    return sorted(somelist, key=lambda x: int(x.split('.')[0].split('-')[1]), reverse=True)
number_sorted = get_number_sorted(names)
print(number_sorted)
['Test-20.model', 'Test-10.model', 'Test-8.model', 'Test-6.model', 'Test-4.model', 'Test-1.model']

See related: Key Functions

1

Here is a regex-based approach. We can extract the test number from the string, convert it to int, and then sort by that.

import re

def grp(txt): 
    s = re.search(r'Test-(\d+)\.model', txt, re.IGNORECASE)
    if s:
        return int(s.group(1))
    else:
        return float('-inf')  # Sorts non-matching strings ahead of matching strings

names.sort(key=grp)

3 Comments

This still sorts string style (lexicographically), not numerically. You'd want to return int(s.group(1)) in the first case, and some filler numerical value, not str, in the else case (e.g. float('-inf'), to put strings not matching the pattern at the front of the resulting list).
@ShadowRanger No, even making those changes still doesn't fix it. I don't know Python, by the way. Feel free to edit this.
@TimBiegeleisen: list.sort runs in place and returns None (which means "has no return value"). Your test code reassigns names to None by assigning the result of names.sort, which is why it breaks. I removed the names = from names = names.sort(key=lambda l: grp(l)) (and simplified to names.sort(key=grp); no lambda wrapper needed since grp already has the correct prototype) and it works fine.
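
The difference between sorted and list.sort described in the last comment can be seen in a short sketch like this one (the sample list and the number_key helper are assumptions for illustration):

```python
names = ['Test-10.model', 'Test-4.model', 'Test-1.model']

def number_key(name):
    return int(name.split('-')[1].split('.')[0])

# sorted() builds and returns a new list; names itself is not modified here.
new_list = sorted(names, key=number_key)

# list.sort() reorders names in place and returns None,
# so writing names = names.sort(...) would throw the list away.
returned = names.sort(key=number_key)

print(new_list)
print(returned)
print(names)
```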
