Macro-averaged F1 score of lists of BIO-encoded sequences
A named entity in a sequence from y_pred is considered correct only if it is an exact match of the corresponding entity in y_true.
It requires https://github.com/larsmans/seqlearn to work.
Classification report for a list of BIO-encoded sequences. It computes token-level metrics and discards “O” labels.
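As an illustration of what "token-level metrics that discard O labels" can mean, here is a minimal sketch. It is not the library's implementation: the function name token_level_report and its return format are hypothetical, and it assumes y_true and y_pred are lists of tag lists of equal length.

```python
from collections import Counter


def token_level_report(y_true, y_pred):
    """Per-label (precision, recall, F1) over tokens, skipping the "O" label.

    Hypothetical helper for illustration only; assumes each sequence in
    y_true has the same length as its counterpart in y_pred.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for true_seq, pred_seq in zip(y_true, y_pred):
        for t, p in zip(true_seq, pred_seq):
            if t == p:
                if t != "O":
                    tp[t] += 1
            else:
                # A wrong prediction is a false positive for the predicted
                # label and a false negative for the true label.
                if p != "O":
                    fp[p] += 1
                if t != "O":
                    fn[t] += 1

    report = {}
    for label in sorted(set(tp) | set(fp) | set(fn)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = (precision, recall, f1)
    return report
```

Because each token is scored on its own, a prediction that gets the "B-" tag right but misses the following "I-" tags still earns partial credit here, unlike the exact-match entity scoring described above.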
F-score for BIO-tagging scheme, as used by CoNLL.
This F-score variant is used for evaluating named-entity recognition and related problems, where the goal is to predict segments of interest within sequences and mark these as a “B” (begin) tag followed by zero or more “I” (inside) tags. A true positive is then defined as a BI* segment that occurs with identical boundaries in both y_true and y_pred; a false positive is a segment found only in y_pred, and a false negative is a segment found only in y_true.
Support for tag schemes with classes (e.g. “B-NP”) is limited: reported scores may be too high for inconsistent labelings.
y_true : array-like of strings, shape (n_samples,)
Ground truth labeling.
y_pred : array-like of strings, shape (n_samples,)
Sequence classifier’s predictions.
f : float
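The segment-level scoring described above can be sketched as follows. This is an illustrative reimplementation, not seqlearn's code: the helper names bio_segments and bio_f1 are hypothetical, and the sketch makes its own assumptions about inconsistent labelings (an "I" tag with no open segment, or with a different class than the open segment, is ignored rather than starting a new segment).

```python
def bio_segments(tags):
    """Return the set of (start, end, kind) segments in a flat tag sequence.

    `end` is exclusive. Hypothetical helper; a stray "I" tag that does not
    continue an open segment of the same class is dropped (an assumption).
    """
    segments = set()
    start, kind = None, None
    # The trailing "O" is a sentinel that closes a segment ending at the
    # last position.
    for i, tag in enumerate(list(tags) + ["O"]):
        prefix, _, label = tag.partition("-")
        continues = prefix == "I" and start is not None and label == kind
        if not continues:
            if start is not None:
                segments.add((start, i, kind))
                start, kind = None, None
            if prefix == "B":
                start, kind = i, label
    return segments


def bio_f1(y_true, y_pred):
    """F1 over exact-match BI* segments of two flat tag sequences."""
    true_segs = bio_segments(y_true)
    pred_segs = bio_segments(y_pred)
    tp = len(true_segs & pred_segs)   # same boundaries and class in both
    fp = len(pred_segs - true_segs)   # predicted only
    fn = len(true_segs - pred_segs)   # ground truth only
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

For example, with y_true = ["B", "I", "O", "B", "O"] and y_pred = ["B", "I", "O", "B", "I"], the first segment matches exactly but the second has a different right boundary, so it counts as one false positive and one false negative, giving an F-score of 0.5.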