textflint.generation_layer.subpopulation.UT.lm
Extract samples with high perplexity or low perplexity
- class textflint.generation_layer.subpopulation.UT.lm.LMSubPopulation(intervals=['0%', '20%'], device='cpu', max_sent_size=512)
Bases: textflint.generation_layer.subpopulation.subpopulation.SubPopulation
Filter samples based on text perplexity.
Example:
sample 1: "I love textflint", score: 6.7
sample 2: "I love TextFlinet", score: 6.34
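For orientation, perplexity is conventionally the exponential of the mean negative log-likelihood per token. The sketch below computes it from a list of per-token log probabilities. The helper name `perplexity` and the input values are illustrative assumptions, not part of TextFlint, and this page does not state whether the scores in the example are perplexities or a related quantity such as a mean loss.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from natural-log token probabilities (hypothetical helper)."""
    if not token_log_probs:
        raise ValueError("need at least one token log probability")
    # Mean negative log-likelihood per token
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Illustrative values only: four tokens, each assigned probability 0.25
uniform = [math.log(0.25)] * 4
print(perplexity(uniform))
```

A model that spreads probability evenly over four choices at every step has a perplexity of about 4, which is why lower values indicate text the model finds more predictable.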
- class textflint.generation_layer.subpopulation.UT.lm.SubPopulation(intervals=None, **kwargs)
Bases: abc.ABC
An abstract class for extracting a subset of examples.
- text_processor = <textflint.common.preprocess.en_processor.EnProcessor object>
- score(sample, field, **kwargs)
Score the sample.
- Parameters
sample – data sample
field (str|list) – name of the field to score, or a list of field names
kwargs –
- Returns
score for the sample (int)
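As a rough illustration of the contract described above (a sample and a field go in, a number comes out), the sketch below scores a sample by its whitespace token count. It is not TextFlint code: samples are modelled as plain dicts, `length_score` is a hypothetical name, and summing over a list of fields is an assumption about how multiple fields might be combined.

```python
def length_score(sample, field):
    """Return the whitespace-token count of one or more text fields.

    Hypothetical stand-in for a `score` implementation; samples are
    assumed to be plain dicts mapping field names to strings.
    """
    # Accept either a single field name or a list of field names
    fields = [field] if isinstance(field, str) else list(field)
    return sum(len(sample[name].split()) for name in fields)


sample = {"x": "I love textflint", "y": "so do I"}
print(length_score(sample, "x"))         # 3
print(length_score(sample, ["x", "y"]))  # 6
```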
- get_slice(scores, dataset)
Select samples based on their scores.
- Parameters
scores (list) – list of int scores, one per sample
dataset – the Dataset to select from
- Returns
the selected subset of samples
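To make the `intervals` idea concrete, here is a minimal sketch of slicing by score percentile: samples are ranked by score, and those whose rank falls inside the percentage interval are kept. The function name `slice_by_interval`, the ascending sort order, and the flooring of interval bounds are all assumptions; the library's actual `get_slice` may behave differently.

```python
def slice_by_interval(scores, dataset, interval=("0%", "20%")):
    """Keep the samples whose score rank lies within a percentage interval.

    Hypothetical helper: ranks ascending by score and floors both bounds.
    """
    if len(scores) != len(dataset):
        raise ValueError("scores and dataset must have the same length")
    low_pct, high_pct = (float(bound.rstrip("%")) for bound in interval)
    if not 0 <= low_pct <= high_pct <= 100:
        raise ValueError("interval must satisfy 0% <= low <= high <= 100%")
    # Indices of samples ordered from lowest to highest score
    order = sorted(range(len(scores)), key=scores.__getitem__)
    start = int(low_pct * len(order) // 100)
    stop = int(high_pct * len(order) // 100)
    return [dataset[i] for i in order[start:stop]]


scores = [5, 1, 4, 2, 3, 9, 8, 7, 6, 10]
dataset = [f"sample-{s}" for s in scores]
print(slice_by_interval(scores, dataset))  # ['sample-1', 'sample-2']
```

With ten samples and the default interval of 0% to 20%, the two lowest-scoring samples are returned; passing `("80%", "100%")` would return the two highest-scoring ones instead.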
-