webstruct.metrics contains metric functions that can be used for model development: on their own or as scoring functions for scikit-learn’s cross-validation and model selection.

webstruct.metrics.avg_bio_f1_score(y_true, y_pred)

Macro-averaged F1 score of lists of BIO-encoded sequences y_true and y_pred.

A named entity in a sequence from y_pred is considered correct only if it exactly matches the corresponding entity in y_true.

It requires to work.
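The exact-match rule and the averaging can be illustrated with a minimal pure-Python sketch. It is not webstruct's implementation: the helper names (extract_entities, sequence_f1, macro_avg_f1) are hypothetical, and it assumes that "macro-averaged" means the mean of per-sequence F1 scores and that a stray "I-" tag without a preceding "B-" is ignored.

```python
from typing import List, Set, Tuple

Entity = Tuple[int, int, str]  # (start, end_exclusive, entity type)


def extract_entities(tags: List[str]) -> Set[Entity]:
    """Collect (start, end, type) spans from one BIO-tagged sequence."""
    entities = set()
    start, etype = None, None
    # A trailing "O" sentinel closes an entity that runs to the end.
    for i, tag in enumerate(tags + ["O"]):
        prefix, _, label = tag.partition("-")
        continues = prefix == "I" and start is not None and label == etype
        if not continues:
            if start is not None:
                entities.add((start, i, etype))
                start, etype = None, None
            if prefix == "B":
                start, etype = i, label
            # Assumption: an "I-" tag with no open entity is ignored.
    return entities


def sequence_f1(y_true: List[str], y_pred: List[str]) -> float:
    """F1 over entities of one sequence; only exact span+type matches count."""
    true_ents = extract_entities(y_true)
    pred_ents = extract_entities(y_pred)
    if not true_ents and not pred_ents:
        return 1.0  # assumption: nothing to find and nothing predicted
    tp = len(true_ents & pred_ents)
    if tp == 0:
        return 0.0
    precision = tp / len(pred_ents)
    recall = tp / len(true_ents)
    return 2 * precision * recall / (precision + recall)


def macro_avg_f1(y_true: List[List[str]], y_pred: List[List[str]]) -> float:
    """Mean of the per-sequence F1 scores."""
    scores = [sequence_f1(t, p) for t, p in zip(y_true, y_pred)]
    return sum(scores) / len(scores)


y_true = [["B-PER", "I-PER", "O", "B-ORG"], ["O", "B-CITY", "O"]]
y_pred = [["B-PER", "O", "O", "B-ORG"], ["O", "B-CITY", "O"]]
score = macro_avg_f1(y_true, y_pred)
```

In the first sequence the predicted PER entity covers only one of the two true tokens, so it does not count as a match even though it overlaps the true entity.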

webstruct.metrics.bio_classification_report(y_true, y_pred)

Classification report for a list of BIO-encoded sequences. It computes token-level metrics and discards “O” labels.
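As a rough illustration of what "token-level metrics that discard O labels" can mean, the sketch below flattens the sequences and computes per-tag precision, recall and F1 for every tag except "O". It is a sketch under stated assumptions, not the library's code: token_level_report is a hypothetical name, the dictionary output format is invented for this example, and full tags such as "B-PER" and "I-PER" are assumed to be scored as separate labels.

```python
from collections import Counter
from itertools import chain


def token_level_report(y_true, y_pred, outside="O"):
    """Per-tag precision/recall/F1 over flattened sequences, skipping `outside`."""
    true_tokens = list(chain.from_iterable(y_true))
    pred_tokens = list(chain.from_iterable(y_pred))

    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(true_tokens, pred_tokens):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted tag p was wrong
            fn[t] += 1  # true tag t was missed

    # "O" tokens still take part in the counts above; they are only
    # left out of the reported labels.
    labels = sorted((set(true_tokens) | set(pred_tokens)) - {outside})
    report = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": actual,
        }
    return report


y_true = [["B-PER", "I-PER", "O"], ["O", "B-ORG"]]
y_pred = [["B-PER", "O", "O"], ["O", "B-ORG"]]
report = token_level_report(y_true, y_pred)
```

Unlike an entity-level score, this report gives credit for the correctly tagged "B-PER" token even though the full two-token entity was not recovered.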

webstruct.metrics.bio_f_score(y_true, y_pred)

F-score for BIO-tagging scheme, as used by CoNLL.

This F-score variant is used for evaluating named-entity recognition and related problems, where the goal is to predict segments of interest within sequences and mark these as a “B” (begin) tag followed by zero or more “I” (inside) tags. A true positive is then defined as a BI* segment in both y_true and y_pred, with false positives and false negatives defined similarly.

Support for tag schemes with classes (e.g. “B-NP”) is limited: reported scores may be too high for inconsistent labelings.


Parameters

y_true : array-like of strings, shape (n_samples,)

Ground truth labeling.

y_pred : array-like of strings, shape (n_samples,)

Sequence classifier’s predictions.


Returns

f : float
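The segment-based definition of true positives, false positives and false negatives can be sketched for a single flat sequence of plain "B", "I" and "O" tags. This is an illustrative approximation rather than the function's actual code: bi_segments and segment_f_score are hypothetical names, class suffixes such as "B-NP" are not handled, and an "I" with no open segment is assumed to be ignored.

```python
def bi_segments(tags):
    """Return the set of (start, end_exclusive) spans of B I* runs."""
    segments = set()
    start = None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:
                segments.add((start, i))  # a new B closes the open segment
            start = i
        elif tag != "I":
            if start is not None:
                segments.add((start, i))  # "O" closes the open segment
            start = None
        # An "I" either extends the open segment or, with none open, is ignored.
    if start is not None:
        segments.add((start, len(tags)))
    return segments


def segment_f_score(y_true, y_pred):
    """F1 where a true positive is a segment with identical boundaries in both."""
    true_segs = bi_segments(y_true)
    pred_segs = bi_segments(y_pred)
    tp = len(true_segs & pred_segs)
    fp = len(pred_segs - true_segs)  # predicted segments with no exact match
    fn = len(true_segs - pred_segs)  # true segments that were not recovered
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


y_true = ["B", "I", "O", "B", "O", "B", "I"]
y_pred = ["B", "I", "O", "B", "I", "B", "I"]
f = segment_f_score(y_true, y_pred)
```

Here the middle predicted segment is one token too long, so it counts as both a false positive and a false negative, leaving two of three segments matched on each side.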