WebAnnotator Utilities

webstruct.webannotator provides functions for working with HTML pages annotated with the WebAnnotator Firefox extension.

webstruct.webannotator.to_webannotator(tree, entity_colors=None, url=None)

Convert a tree loaded by one of WebStruct loaders to WebAnnotator format.

If you want a predictable color assignment, use the entity_colors argument. It should be a mapping {'entity_name': (fg, bg, entity_idx)}, and entity names should be lowercased. You can use EntityColors to generate this mapping automatically:

>>> from webstruct.webannotator import EntityColors, to_webannotator
>>> # trees = ...
>>> entity_colors = EntityColors()
>>> wa_trees = [to_webannotator(tree, entity_colors) for tree in trees]  
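
When a fixed, hand-written mapping is preferred over EntityColors, it could be built along the lines of the sketch below. The helper build_entity_colors, the PALETTE list, and the hex color values are hypothetical and are not part of webstruct. The sketch only illustrates the {'entity_name': (fg, bg, entity_idx)} shape with lowercased names described above.

```python
# Hypothetical helper, not part of webstruct: build an explicit mapping
# in the {'entity_name': (fg, bg, entity_idx)} shape described above.
def build_entity_colors(names, palette):
    mapping = {}
    for name in names:
        key = name.lower()          # entity names should be lowercased
        if key in mapping:          # skip duplicates such as "ORG" / "org"
            continue
        idx = len(mapping)          # next free entity index
        fg, bg = palette[idx % len(palette)]  # cycle through the palette
        mapping[key] = (fg, bg, idx)
    return mapping


# Assumed (fg, bg) color pairs, chosen only for illustration.
PALETTE = [("#000000", "#33CCFF"), ("#000000", "#FF0000")]

entity_colors = build_entity_colors(["ORG", "City", "org"], PALETTE)
```

The resulting dictionary could then be passed as the entity_colors argument, which keeps the same entity on the same colors and index across runs.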

class webstruct.webannotator.EntityColors(**kwargs)

{"entity_name": ("fg_color", "bg_color", entity_index)} mapping that generates entries for new entities on first access.
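
The "generates entries on first access" behavior can be pictured with a small dict subclass. This is a minimal sketch of the general technique (Python's __missing__ hook), not the actual EntityColors implementation. The class name LazyColorMap and the palette values are assumptions made for illustration.

```python
class LazyColorMap(dict):
    """Illustrative stand-in: creates an entry the first time a key is read."""

    def __init__(self, palette):
        super().__init__()
        self._palette = list(palette)  # assumed list of (fg, bg) pairs

    def __missing__(self, key):
        # dict calls this hook when key is absent during a d[key] lookup.
        idx = len(self)                                  # next entity index
        fg, bg = self._palette[idx % len(self._palette)]
        value = (fg, bg, idx)
        self[key] = value                                # remember the entry
        return value


colors = LazyColorMap([("#000000", "#33CCFF"), ("#000000", "#FF9900")])
first = colors["org"]    # created on first access
second = colors["city"]  # gets the next index and palette entry
again = colors["org"]    # existing entry is reused, not regenerated
```

Because each entry is stored once created, repeated lookups of the same entity return the same colors and index.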

classmethod from_htmlbytes(html_bytes, encoding=None)
classmethod from_htmlfile(path, encoding=None)

Load the color mapping from a WebAnnotator-annotated HTML file.