textflint.generation_layer.subpopulation.subpopulation

SubPopulation Abstract Class

class textflint.generation_layer.subpopulation.subpopulation.SubPopulation(intervals=None, **kwargs)

Bases: abc.ABC

An abstract class for extracting a subset of examples.

text_processor = <textflint.common.preprocess.en_processor.EnProcessor object>

score(sample, field, **kwargs)

Score the sample.

Parameters
  • sample – data sample

  • field (str|list) – field name, or list of field names, to score

  • kwargs

Returns

score for the sample (int)
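As an illustration of what a concrete score implementation might compute, here is a minimal standalone sketch. It assumes a sample can be treated as a plain dict of field name to text, which is a simplification and not textflint's sample type; the function name length_score and the token-count criterion are hypothetical.

```python
from typing import List, Union


def length_score(sample: dict, field: Union[str, List[str]]) -> int:
    """Hypothetical scorer: count whitespace-separated tokens in the field(s)."""
    # Accept either one field name or a list of field names, as documented.
    fields = [field] if isinstance(field, str) else list(field)
    return sum(len(str(sample[name]).split()) for name in fields)


# Toy sample; a plain dict stands in for a real sample object (assumption).
sample = {
    "premise": "A man is playing a guitar",
    "hypothesis": "A man plays music",
}
print(length_score(sample, "premise"))                  # 6
print(length_score(sample, ["premise", "hypothesis"]))  # 10
```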

get_slice(scores, dataset)

Select samples based on their scores.

Parameters
  • scores (list) – list of int scores

  • dataset – Dataset

Returns

subset of samples
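One plausible selection strategy is sketched below: rank the samples by ascending score and keep the ranks that fall inside an interval. The function slice_by_score and its left and right parameters are hypothetical, and the rank-based rule is an assumption; the documentation above does not state how the scores are turned into a subset.

```python
from typing import List, Sequence


def slice_by_score(scores: Sequence[int], dataset: Sequence,
                   left: int, right: int) -> List:
    """Hypothetical selection: keep samples whose score rank is in [left, right)."""
    if len(scores) != len(dataset):
        raise ValueError("scores and dataset must have the same length")
    # Indices of the samples, ordered from lowest to highest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return [dataset[i] for i in order[left:right]]


dataset = ["a b c d", "a", "a b c", "a b"]
scores = [4, 1, 3, 2]  # one score per sample, e.g. a token count
print(slice_by_score(scores, dataset, 0, 2))  # ['a', 'a b']
```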

slice_population(dataset, fields, **kwargs)

Extract a subset of samples.

Parameters
  • dataset – Dataset

  • fields (list) – list of field names (str)

  • kwargs

Returns

Subset Dataset
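The three methods suggest a score-then-select pipeline. The toy class below sketches that flow under loud assumptions: it is a self-contained stand-in rather than textflint's SubPopulation, samples are plain dicts, and the left and right rank bounds replace the real intervals argument, whose exact format is not described here.

```python
from abc import ABC, abstractmethod
from typing import List


class ToySubPopulation(ABC):
    """Stand-in for the documented interface; not textflint's implementation."""

    def __init__(self, left: int, right: int):
        # Hypothetical rank bounds: keep score ranks in [left, right).
        self.left, self.right = left, right

    @abstractmethod
    def score(self, sample: dict, fields: List[str]) -> int:
        """Return an integer score for one sample."""

    def get_slice(self, scores: List[int], dataset: List[dict]) -> List[dict]:
        # Rank samples by ascending score, then keep the requested ranks.
        order = sorted(range(len(scores)), key=lambda i: scores[i])
        return [dataset[i] for i in order[self.left:self.right]]

    def slice_population(self, dataset: List[dict], fields: List[str]) -> List[dict]:
        # Score every sample, then delegate the selection to get_slice.
        scores = [self.score(sample, fields) for sample in dataset]
        return self.get_slice(scores, dataset)


class ShortText(ToySubPopulation):
    """Hypothetical concrete subclass that scores by token count."""

    def score(self, sample: dict, fields: List[str]) -> int:
        return sum(len(sample[name].split()) for name in fields)


dataset = [{"x": "one two three four five"}, {"x": "one two"}, {"x": "one"}]
subset = ShortText(left=0, right=2).slice_population(dataset, ["x"])
print(subset)  # [{'x': 'one'}, {'x': 'one two'}]
```

Keeping score abstract while sharing get_slice and slice_population in the base class means a new subpopulation only has to define how a single sample is scored.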

static normalize_bound(limit, size)

Normalize the bound of a slice.

Parameters
  • limit (str|float|int) – left or right bound of an interval. It can be a percentile string such as 10% or 20%, a float between 0 and 1 such as 0.3, or an int index such as 50.

  • size – the number of samples

Returns

bound (int)
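The three accepted forms of limit can be normalized along the following lines. This is a sketch based only on the parameter description above; the name normalize_bound_sketch, the truncation toward zero, and the error handling are assumptions and may differ from the library's behavior.

```python
from typing import Union


def normalize_bound_sketch(limit: Union[str, float, int], size: int) -> int:
    """Hypothetical normalization of a slice bound to an integer index."""
    if isinstance(limit, str) and limit.endswith("%"):
        # Percentile string such as "10%": take that share of the samples.
        return int(float(limit[:-1]) / 100 * size)
    if isinstance(limit, float):
        # Fraction between 0 and 1 such as 0.3.
        if not 0.0 <= limit <= 1.0:
            raise ValueError(f"float bound must be in [0, 1], got {limit}")
        return int(limit * size)
    if isinstance(limit, int):
        # Already an index such as 50.
        return limit
    raise ValueError(f"unsupported bound: {limit!r}")


print(normalize_bound_sketch("50%", 200))  # 100
print(normalize_bound_sketch(0.25, 8))     # 2
print(normalize_bound_sketch(50, 200))     # 50
```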