@jayaddison

This changeset is intended to be a functional no-op. It may be easiest to review with whitespace changes ignored, since the indentation level of some of the loop's logic has been reduced.

The changes appear to result in a minor performance improvement (roughly 3% in the benchmark below, which is within one standard deviation). That said, there are likely larger benefits to be found through further simplification and refactoring, perhaps including fairly substantial logical restructuring.

Before (2c19b98)

.........................................
html_parse_etree: Mean +- std dev: 204 ms +- 10 ms

After (369a412)

.........................................
html_parse_etree: Mean +- std dev: 197 ms +- 9 ms

type in (StartTagToken, CharactersToken, SpaceCharactersToken))):
break

prev_token = new_token

NB: This is technically a behaviour change since prev_token could previously have referenced a ParseErrorToken. That said, I don't believe that the code that inspects the prev_token would ever be relevant for error tokens.
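
To make the behaviour change concrete, here is a minimal sketch, not the actual html5lib loop. The token classes are empty stand-ins and `pair_with_previous` and its `skip_errors` flag are hypothetical names introduced only for illustration; the sketch assumes that error tokens no longer update `prev_token` after this change.

```python
# Hypothetical stand-ins for the tokenizer's token classes.
class StartTagToken: ...
class CharactersToken: ...
class ParseErrorToken: ...


def pair_with_previous(tokens, skip_errors):
    """Return (previous type name, current type name) for each non-error token.

    skip_errors=False models the earlier behaviour, where prev_token could
    end up referencing a ParseErrorToken; skip_errors=True models the
    assumed behaviour after this change.
    """
    prev_token = None
    pairs = []
    for new_token in tokens:
        is_error = isinstance(new_token, ParseErrorToken)
        if not is_error:
            pairs.append((type(prev_token).__name__, type(new_token).__name__))
        # Only remember the token if it is not an error that we are skipping.
        if not (skip_errors and is_error):
            prev_token = new_token
    return pairs


tokens = [StartTagToken(), ParseErrorToken(), CharactersToken()]
old_pairs = pair_with_previous(tokens, skip_errors=False)
new_pairs = pair_with_previous(tokens, skip_errors=True)
```

In this sketch the two modes differ only in what the token following an error sees as its predecessor, which matches the concern raised above.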

…content phase

This is based on the idea that the code is likely easier to understand, and hopefully less fragile, if there is a single boolean with a readable name rather than repeated assignments to a variable that is later invoked as a method call.
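
A rough sketch of the two styles being contrasted, under the assumption that the loop chooses between two handlers per token. All function and variable names here are hypothetical and are not taken from html5lib.

```python
# Hypothetical handlers standing in for the parser's processing methods.
def process_in_body(token):
    return ("in_body", token)


def process_in_foreign_content(token):
    return ("foreign_content", token)


def dispatch_with_reassigned_method(tokens, is_foreign):
    """Earlier style: a variable is reassigned, then invoked as a call later."""
    results = []
    for token in tokens:
        method = process_in_body
        if is_foreign(token):
            method = process_in_foreign_content
        results.append(method(token))
    return results


def dispatch_with_named_boolean(tokens, is_foreign):
    """Proposed style: one readable boolean decides which handler runs."""
    results = []
    for token in tokens:
        use_foreign_content = is_foreign(token)
        if use_foreign_content:
            results.append(process_in_foreign_content(token))
        else:
            results.append(process_in_body(token))
    return results


def is_foreign(token):
    # Illustrative predicate only.
    return token == "svg"


tokens = ["p", "svg", "b"]
reassigned = dispatch_with_reassigned_method(tokens, is_foreign)
named = dispatch_with_named_boolean(tokens, is_foreign)
```

Both functions produce the same results; the second keeps the decision and the call in one place, so a reader does not need to trace earlier assignments to know which handler runs.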
@jayaddison

Cleaning up some old / stale pull requests; please let me know if this changeset is considered worthwhile and I'll reopen if so.

@jayaddison jayaddison closed this Dec 24, 2022