#acl All:read
#format rst

Intended Audience
=================

Beginning to intermediate programmers.  A basic working knowledge of
Python is assumed.

Summary
=======

This tutorial will introduce beginning to intermediate programmers to
the many useful Python tools & techniques for text and data
processing.  Topics will include regular expressions, filtering data
with generators, and parsing.

Outline
=======

* Common data sources needing processing:

  - log files
  - CSV
  - tabular data
  - email
  - XML

* Tools & techniques:

  - lists & dictionaries
  - ``s.join(list)`` instead of accumulating
  - ``for line in file``
  - filters, large data sources: generators
  - decorate-sort-undecorate
  - StringIO

* Regular expressions:

  - pattern matching
  - filtering
  - substitution
  - splitting

* Parsing:

  - ``text.split()``
  - ``text.find()``
  - regular expressions
  - "real" parsers (including XML)
  - state machines

Please send feedback & ideas for further specific topics to the
trainer, David Goodger (`email `_, `home page `_).