Yesterday I posted a question similar to this one: Python Regex Named Groups. That approach works pretty well for simple things.
After some research I came across the pyparsing library, which seems well suited to my task. This is what I have so far:
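from pyparsing import (Group, Literal, OneOrMore, Optional, Suppress,
                       White, Word, ZeroOrMore, alphas)

text = '[@a eee, fff fff, ggg @b eee, fff, ggg @c eee eee, fff fff,ggg ggg@d]'

# start of a command: an optional opening bracket followed by '@'
command_s = Suppress(Optional('[') + Literal('@'))
# end of a command: the next '@' or the closing bracket
command_e = Suppress(Literal('@') | Literal(']'))
task = Word(alphas)
arguments = ZeroOrMore(
    Word(alphas) +
    Suppress(
        Optional(Literal(',') + White()) | Optional(White() + Literal('@'))
    )
)
command = Group(OneOrMore(command_s + task + arguments + command_e))
print(command.parseString(text))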
# This outputs only the first @a sequence:
# [['a', 'eee', 'fff', 'fff', 'ggg']]
# The structure should be something like:
[
    ['a', 'eee', 'fff fff', 'ggg'],
    ['b', 'eee', 'fff', 'ggg'],
    ['c', 'eee eee', 'fff fff', 'ggg ggg'],
    ['d']
]
An @ indicates the start of a sequence. The first word after it is a task (a), followed by optional comma-separated arguments (eee, fff fff, ggg). The problem is that @b, @c and @d are ignored by the above code. Also, "fff fff" is treated as two separate arguments, but it should be a single one.
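To make the expected result precise, here is a rough plain-Python sketch (no pyparsing) of the structure I am after. The helper name `split_commands` is made up for illustration, and it assumes that '@' and ',' never appear inside a task or an argument. It is only meant to pin down the target output, not to replace a proper grammar.

```python
def split_commands(text):
    """Split '[@task arg, arg @task ...]' into [[task, arg, ...], ...]."""
    # Drop surrounding whitespace and the enclosing square brackets.
    body = text.strip().strip('[]')
    commands = []
    # Every '@' starts a new command; the chunk before the first '@' is skipped.
    for chunk in body.split('@')[1:]:
        # The first word is the task, whatever follows is the argument list.
        parts = chunk.strip().split(None, 1)
        if not parts:
            continue
        task = parts[0]
        rest = parts[1] if len(parts) > 1 else ''
        # Arguments are comma-separated; words inside one argument stay together.
        args = [' '.join(arg.split()) for arg in rest.split(',') if arg.strip()]
        commands.append([task] + args)
    return commands


text = '[@a eee, fff fff, ggg @b eee, fff, ggg @c eee eee, fff fff,ggg ggg@d]'
result = split_commands(text)
print(result)
```

With the sample text, this should give one list per @ sequence, with "fff fff" kept as a single argument and @d yielding a list that contains only the task.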