Extending the parser

Modules such as page3 extend the CSS 2.1 parser to add support for CSS 3 syntax. They do so by subclassing css21.CSS21Parser and overriding or extending some of its methods. In fact, the parser is made of methods in a class (rather than a set of functions) solely to enable this kind of subclassing.

tinycss is designed so that you can keep parser subclasses outside of tinycss, without monkey-patching. If, however, the syntax you added comes from a W3C specification, consider including your subclass in a new tinycss module and sending a pull request: see Hacking tinycss.

Example: star hack

The star hack uses invalid declarations that are only parsed by some versions of Internet Explorer. By default, tinycss ignores invalid declarations and logs an error.

>>> from tinycss.css21 import CSS21Parser
>>> css = '#elem { width: [W3C Model Width]; *width: [BorderBox Model]; }'
>>> stylesheet = CSS21Parser().parse_stylesheet(css)
>>> stylesheet.errors
[ParseError('Parse error at 1:35, expected a property name, got DELIM',)]
>>> [decl.name for decl in stylesheet.rules[0].declarations]
['width']

If, for example, a minifier based on tinycss wants to support the star hack, it can do so by extending the parser:

>>> class CSSStarHackParser(CSS21Parser):
...     def parse_declaration(self, tokens):
...         has_star_hack = (tokens[0].type == 'DELIM' and tokens[0].value == '*')
...         if has_star_hack:
...             tokens = tokens[1:]
...         declaration = super(CSSStarHackParser, self).parse_declaration(tokens)
...         declaration.has_star_hack = has_star_hack
...         return declaration
...
>>> stylesheet = CSSStarHackParser().parse_stylesheet(css)
>>> stylesheet.errors
[]
>>> [(d.name, d.has_star_hack) for d in stylesheet.rules[0].declarations]
[('width', False), ('width', True)]

This class extends the parse_declaration() method. It removes any * delimiter Token at the start of a declaration, and adds a has_star_hack boolean attribute on parsed Declaration objects: True if a * was removed, False for “normal” declarations.

Parser methods

In addition to the methods of the user API (see Parsing a stylesheet), here are the methods of the CSS 2.1 parser that can be overridden or extended:

CSS21Parser.parse_rules(tokens, context)

Parse a sequence of rules (rulesets and at-rules).

Parameters
  • tokens – An iterable of tokens.

  • context – Either 'stylesheet' or an at-keyword such as '@media'. (Most at-rules are only allowed in some contexts.)

Returns

A tuple of a list of parsed rules and a list of ParseError.

CSS21Parser.read_at_rule(at_keyword_token, tokens)

Read an at-rule from a token stream.

Parameters
  • at_keyword_token – The ATKEYWORD token that starts this at-rule. You may have read it already to distinguish the rule from a ruleset.

  • tokens – An iterator of subsequent tokens. Will be consumed just enough for one at-rule.

Returns

An unparsed AtRule.

Raises

ParseError if the head is invalid for the core grammar. The body is not validated. See AtRule.

CSS21Parser.parse_at_rule(rule, previous_rules, errors, context)

Parse an at-rule.

Subclasses that override this method must use super() and pass its return value for at-rules they do not know.

In CSS 2.1, this method handles @charset, @import, @media and @page rules.

Parameters
  • rule – An unparsed AtRule.

  • previous_rules – The list of at-rules and rulesets that have been parsed so far in this context. This list can be used to decide if the current rule is valid. (For example, @import rules are only allowed before anything but a @charset rule.)

  • context – Either 'stylesheet' or an at-keyword such as '@media'. (Most at-rules are only allowed in some contexts.)

Raises

ParseError if the rule is invalid.

Returns

A parsed at-rule
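The delegation rule described for parse_at_rule() can be sketched without tinycss itself. In this minimal sketch, ParseError, UnparsedAtRule, BaseParser and the '@example' at-rule are hypothetical stand-ins, not tinycss API; only the method signature follows the one documented here.

```python
class ParseError(ValueError):
    """Stand-in error type for this sketch (not tinycss's ParseError)."""


class UnparsedAtRule:
    """Stand-in for an unparsed at-rule: only the fields used below."""

    def __init__(self, at_keyword, head=(), body=None):
        self.at_keyword = at_keyword
        self.head = head
        self.body = body


class BaseParser:
    """Stand-in parent parser: rejects every at-rule it does not know."""

    def parse_at_rule(self, rule, previous_rules, errors, context):
        raise ParseError('unknown at-rule: %s' % rule.at_keyword)


class ExampleParser(BaseParser):
    """Adds a hypothetical '@example' at-rule, allowed at the top level only."""

    def parse_at_rule(self, rule, previous_rules, errors, context):
        if rule.at_keyword == '@example':
            if context != 'stylesheet':
                raise ParseError('@example rule not allowed in ' + context)
            # A real parser would build a rule object from head and body.
            return ('example-rule', rule.head)
        # Not ours: pass the parent's return value (or its error) through.
        return super().parse_at_rule(rule, previous_rules, errors, context)
```

Because the unknown case goes through super(), several such subclasses can be stacked and each only needs to know its own at-rules.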

CSS21Parser.parse_media(tokens)

For CSS 2.1, parse a list of media types.

Media Queries are expected to override this.

Parameters

tokens – A list of tokens

Raises

ParseError on invalid media types/queries

Returns

For CSS 2.1, a list of media types as strings

CSS21Parser.parse_page_selector(tokens)

Parse an @page selector.

Parameters

tokens – An iterable of tokens, typically from the head attribute of an unparsed AtRule.

Returns

A page selector. For CSS 2.1, this is 'first', 'left', 'right' or None.

Raises

ParseError on invalid selectors

CSS21Parser.parse_declarations_and_at_rules(tokens, context)

Parse a mixed list of declarations and at-rules, as found e.g. in the body of an @page rule.

Note that to add supported at-rules inside @page, CSSPage3Parser extends parse_at_rule(), not this method.

Parameters
  • tokens – An iterable of tokens, typically from the body attribute of an unparsed AtRule.

  • context – An at-keyword such as '@page'. (Most at-rules are only allowed in some contexts.)

Returns

A tuple of:

CSS21Parser.parse_ruleset(first_token, tokens)

Parse a ruleset: a selector followed by a declaration block.

Parameters
  • first_token – The first token of the ruleset (probably of the selector). You may have read it already to distinguish the rule from an at-rule.

  • tokens – an iterator of subsequent tokens. Will be consumed just enough for one ruleset.

Returns

a tuple of a RuleSet and an error list. The errors are recovered ParseError in declarations. (Parsing continues from the next declaration on such errors.)

Raises

ParseError if the selector is invalid for the core grammar. Note that a selector can be valid for the core grammar but not for CSS 2.1 or another level.
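Both read_at_rule() and parse_ruleset() take a token that was already read plus an iterator of the tokens that follow, and consume that iterator just far enough for one rule. The sketch below illustrates this calling pattern only. Token, read_one_rule() and split_rules() are hypothetical stand-ins written for this example, and the end-of-rule test is deliberately simplified.

```python
from collections import namedtuple

# Stand-in token: tinycss tokens carry more attributes than this.
Token = namedtuple('Token', 'type value')


def read_one_rule(first_token, tokens, end_types):
    """Collect first_token plus following tokens up to the first end marker."""
    rule = [first_token]
    for token in tokens:  # advances the caller's iterator
        rule.append(token)
        if token.type in end_types:
            break
    return rule


def split_rules(tokens):
    """Return a list of (kind, tokens) pairs, one per rule."""
    tokens = iter(tokens)
    rules = []
    for first in tokens:
        if first.type == 'S':
            continue  # white space between rules
        if first.type == 'ATKEYWORD':
            # An at-rule ends with a {} block or with ';'.
            rules.append(('at-rule', read_one_rule(first, tokens, ('{', ';'))))
        else:
            # A ruleset ends with its {} block.
            rules.append(('ruleset', read_one_rule(first, tokens, ('{',))))
    return rules


# Roughly: @import "a.css"; p { ... }
css_tokens = [
    Token('ATKEYWORD', '@import'), Token('S', ' '), Token('STRING', 'a.css'),
    Token(';', ';'), Token('S', ' '),
    Token('IDENT', 'p'), Token('S', ' '), Token('{', ['color', ':', 'red']),
]
kinds = [kind for kind, _ in split_rules(css_tokens)]  # ['at-rule', 'ruleset']
```

Sharing one iterator between the dispatcher and the readers is what lets each reader stop exactly where the next rule begins.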

CSS21Parser.parse_declaration_list(tokens)

Parse a ; separated declaration list.

You may want to use parse_declarations_and_at_rules() (or some other method that uses parse_declaration() directly) instead if you have not just declarations in the same context.

Parameters

tokens – an iterable of tokens. Should stop at (before) the end of the block, as marked by }.

Returns

a tuple of the list of valid Declaration and a list of ParseError
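The error recovery behind this return value (invalid declarations are collected as errors while parsing continues with the next declaration) can be sketched on plain strings. This is a simplified stand-in, not the tinycss implementation: it works on text rather than tokens, splits naively on ';', and its ParseError class and both functions are defined here for illustration only.

```python
import re


class ParseError(ValueError):
    """Stand-in error type for this sketch (not tinycss's ParseError)."""


def parse_declaration(text):
    """Parse 'name: value'; raise ParseError if the name is not an identifier."""
    name, colon, value = text.partition(':')
    name = name.strip()
    if not colon or not re.fullmatch(r'-?[A-Za-z_][\w-]*', name):
        raise ParseError('expected a property name, got %r' % name)
    return name.lower(), value.strip()


def parse_declaration_list(block):
    """Return (declarations, errors) for a ';'-separated block body."""
    declarations, errors = [], []
    for chunk in block.split(';'):
        if not chunk.strip():
            continue  # empty declaration: skipped, not an error
        try:
            declarations.append(parse_declaration(chunk))
        except ParseError as error:
            errors.append(error)  # recover: go on with the next declaration
    return declarations, errors


declarations, errors = parse_declaration_list(
    'width: 10px; *width: 12px; ; color: red')
```

Here the star-hack declaration ends up in errors, the empty declaration is skipped silently, and the two valid declarations are returned in order.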

CSS21Parser.parse_declaration(tokens)

Parse a single declaration.

Parameters

tokens – an iterable of at least one token. Should stop at (before) the end of the declaration, as marked by a ; or }. Empty declarations (i.e. consecutive ; with only white space in between) should be skipped earlier and not passed to this method.

Returns

a Declaration

Raises

ParseError if the tokens do not match the ‘declaration’ production of the core grammar.

CSS21Parser.parse_value_priority(tokens)

Separate any !important marker at the end of a property value.

Parameters

tokens – A list of tokens for the property value.

Returns

A tuple of the actual property value (a list of tokens) and the priority.
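As an illustration of what separating the marker involves, here is a minimal sketch that walks a token list from the end. Token and split_priority() are hypothetical stand-ins for this example. The real method may differ in details such as how it reports a value that consists of nothing but the marker.

```python
from collections import namedtuple

# Stand-in token: tinycss tokens carry more attributes than this.
Token = namedtuple('Token', 'type value')


def split_priority(tokens):
    """Return (value_tokens, priority); priority is 'important' or None."""
    value = list(tokens)
    # Ignore white space at the very end of the value.
    while value and value[-1].type == 'S':
        value.pop()
    if (value and value[-1].type == 'IDENT'
            and value[-1].value.lower() == 'important'):
        rest = value[:-1]
        # White space is allowed between '!' and 'important'.
        while rest and rest[-1].type == 'S':
            rest.pop()
        if rest and rest[-1] == Token('DELIM', '!'):
            rest.pop()
            # Drop the white space that separated the value from the marker.
            while rest and rest[-1].type == 'S':
                rest.pop()
            return rest, 'important'
    # No marker found: hand back the value unchanged.
    return list(tokens), None


value, priority = split_priority([
    Token('IDENT', 'red'), Token('S', ' '),
    Token('DELIM', '!'), Token('IDENT', 'important'),
])
```

An identifier named important that is not preceded by ! stays part of the value, which is why the sketch checks for the delimiter before stripping anything.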

Unparsed at-rules

class tinycss.css21.AtRule(at_keyword, head, body, line, column)

An unparsed at-rule.

at_keyword

The normalized (lower-case) at-keyword as a string, e.g. '@page'.

head

The part of the at-rule between the at-keyword and the { marking the body, or the ; marking the end of an at-rule without a body. A TokenList.

body

The content of the body between { and } as a TokenList, or None if there is no body (i.e. if the rule ends with ;).

The head was validated against the core grammar but not the body, as the body might contain declarations. In case of an error in a declaration, parsing should continue from the next declaration. The whole rule should not be ignored as it would be for an error in the head.

These at-rules are expected to be parsed further before reaching the user API.

Parsing helper functions

The tinycss.parsing module contains helper functions for parsing tokens into a more structured form:

tinycss.parsing.strip_whitespace(tokens)

Remove whitespace at the beginning and end of a token list.

Whitespace tokens in-between other tokens in the list are preserved.

Parameters

tokens – A list of Token or ContainerToken.

Returns

A new sub-sequence of the list.
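The behaviour described here, trimming only the outer whitespace tokens, can be sketched as follows. Token and strip_outer_whitespace() are hypothetical stand-ins for this example, and 'S' is assumed to be the type of a whitespace token.

```python
from collections import namedtuple

# Stand-in token: tinycss tokens carry more attributes than this.
Token = namedtuple('Token', 'type value')


def strip_outer_whitespace(tokens):
    """Return the slice of tokens without leading and trailing 'S' tokens."""
    start, end = 0, len(tokens)
    while start < end and tokens[start].type == 'S':
        start += 1
    while end > start and tokens[end - 1].type == 'S':
        end -= 1
    return tokens[start:end]  # a new list; the input is not modified


tokens = [
    Token('S', ' '), Token('IDENT', 'a'), Token('S', ' '),
    Token('IDENT', 'b'), Token('S', ' '),
]
stripped = strip_outer_whitespace(tokens)
```

The whitespace token between a and b survives, and the original list keeps its five tokens.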

tinycss.parsing.split_on_comma(tokens)

Split a list of tokens on commas, i.e. DELIM tokens whose value is ','.

Only “top-level” comma tokens are splitting points, not commas inside a function or other ContainerToken.

Parameters

tokens – An iterable of Token or ContainerToken.

Returns

A list of lists of tokens
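To show why only top-level commas act as splitting points, the sketch below models a function as a single container token that holds its own content. Token, Container and split_top_level() are hypothetical stand-ins for this example, not the tinycss classes.

```python
from collections import namedtuple

# Stand-ins: a plain token, and a container that nests other tokens.
Token = namedtuple('Token', 'type value')
Container = namedtuple('Container', 'type content')


def split_top_level(tokens):
    """Split on ',' DELIM tokens; nested content is never inspected."""
    parts = [[]]
    for token in tokens:
        if token.type == 'DELIM' and token.value == ',':
            parts.append([])  # start the next part
        else:
            parts[-1].append(token)
    return parts


# Roughly: a,rgb(1,2),b
rgb = Container('FUNCTION', [
    Token('INTEGER', 1), Token('DELIM', ','), Token('INTEGER', 2),
])
parts = split_top_level([
    Token('IDENT', 'a'), Token('DELIM', ','), rgb,
    Token('DELIM', ','), Token('IDENT', 'b'),
])
```

The result has three parts, and the comma inside the function stays within the container because the loop only looks at the outer sequence.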

tinycss.parsing.validate_value(tokens)

Validate a property value.

Parameters

tokens – an iterable of tokens

Raises

ParseError if there is any invalid token for the ‘value’ production of the core grammar.

tinycss.parsing.validate_block(tokens, context)
Raises

ParseError if there is any invalid token for the ‘block’ production of the core grammar.

Parameters
  • tokens – an iterable of tokens

  • context – a string for the ‘unexpected in …’ message

tinycss.parsing.validate_any(token, context)
Raises

ParseError if this is an invalid token for the ‘any’ production of the core grammar.

Parameters
  • token – a single token

  • context – a string for the ‘unexpected in …’ message