textflint.generation_layer.generator.pos_generator

POS Generator Class

class textflint.generation_layer.generator.pos_generator.POSGenerator(task='POS', max_trans=1, fields='x', trans_methods=None, trans_config=None, return_unk=True, sub_methods=None, sub_config=None, attack_methods=None, validate_methods=None, **kwargs)

Bases: textflint.generation_layer.generator.generator.Generator

POS Generator applies POS data generation functions.

class textflint.generation_layer.generator.pos_generator.Generator(task='UT', max_trans=1, random_seed=1, fields='x', trans_methods=None, trans_config=None, return_unk=True, sub_methods=None, sub_config=None, attack_methods=None, validate_methods=None, **kwargs)

Bases: abc.ABC

Transformation controller that applies multiple transformations to each data sample.

__init__(task='UT', max_trans=1, random_seed=1, fields='x', trans_methods=None, trans_config=None, return_unk=True, sub_methods=None, sub_config=None, attack_methods=None, validate_methods=None, **kwargs)
Parameters
  • task (str) – The task that the data to transform belongs to.

  • max_trans (int) – Maximum number of transformed samples generated from one original sample per transformation.

  • random_seed (int) – Random number seed used to reproduce generation.

  • fields (str|list) – The fields to apply transformations to. Multi-field transformation applies only to some special tasks, such as SM and NLI.

  • trans_methods (list) – List of transformation names.

  • trans_config (dict) – Transformation class configs, used to control the behavior of transformations.

  • return_unk (bool) – Some transformations may generate unk labels, e.g., inserting a word into a sequence in the NER task. If set to False, these transformations are skipped.

  • sub_methods (list) – List of subpopulation names.

  • sub_config (dict) – Subpopulation class configs, used to control the behavior of subpopulations.

  • attack_methods (str) – Path to the Python file containing the Attack instances.

  • validate_methods (list) – Confidence calculation functions.
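
As a rough illustration of the parameters above, the sketch below collects keyword arguments for POSGenerator in a plain dictionary and checks them against the documented parameter names. It does not import textflint. The transformation names "Keyboard" and "Ocr" are assumptions used for illustration only; check the transformations available in your installed version.

```python
# Parameter names taken from the documented POSGenerator signature.
DOCUMENTED_PARAMS = {
    "task", "max_trans", "fields", "trans_methods", "trans_config",
    "return_unk", "sub_methods", "sub_config", "attack_methods",
    "validate_methods",
}

pos_generator_kwargs = {
    "task": "POS",
    "max_trans": 1,                        # at most one new sample per original, per transformation
    "fields": "x",
    "trans_methods": ["Keyboard", "Ocr"],  # assumed names, not confirmed by this page
    "trans_config": None,
    "return_unk": True,
    "sub_methods": None,
    "sub_config": None,
    "attack_methods": None,
    "validate_methods": None,
}

# Catch misspelled keys early instead of letting them fall into **kwargs.
unknown_keys = set(pos_generator_kwargs) - DOCUMENTED_PARAMS
if unknown_keys:
    raise ValueError(f"Undocumented parameters: {sorted(unknown_keys)}")

# With textflint installed, the arguments would presumably be passed as:
#   from textflint.generation_layer.generator.pos_generator import POSGenerator
#   generator = POSGenerator(**pos_generator_kwargs)
```

Because the constructor also accepts **kwargs, a misspelled key may not raise an error on its own, which is why the sketch validates the names up front.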

prepare(dataset)

Check the dataset.

Parameters

dataset (textflint.Dataset) – the input dataset

generate(dataset, model=None)

Returns a list of possible generated samples for dataset.

Parameters
  • dataset – the input dataset

  • model – the model to attack if given.

Returns

yield (original samples, new samples, generated function string).
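
The yielded triple can be consumed in a loop. The sketch below is a toy stand-in, not textflint code: toy_generate and the two lambda "transformations" are hypothetical, and how original and new samples line up inside each triple is an assumption. It only shows the documented shape (original samples, new samples, generated function string) and a cap in the spirit of max_trans.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Sample = List[str]  # a toy POS sample: just a list of tokens


def toy_generate(
    samples: List[Sample],
    transformations: Dict[str, Callable[[Sample], List[Sample]]],
    max_trans: int = 1,
) -> Iterator[Tuple[List[Sample], List[Sample], str]]:
    """Hypothetical stand-in that yields one triple per transformation."""
    for name, transform in transformations.items():
        originals, new_samples = [], []
        for sample in samples:
            # Keep at most max_trans new samples per original sample.
            candidates = transform(sample)[:max_trans]
            if candidates:
                originals.append(sample)
                new_samples.extend(candidates)
        if new_samples:
            yield originals, new_samples, name


samples = [["He", "runs"], ["dogs", "bark"]]
transformations = {
    "Upper": lambda tokens: [[t.upper() for t in tokens]],
    # Produces nothing when the sample is already lowercase.
    "Lower": lambda tokens: (
        [[t.lower() for t in tokens]]
        if any(t != t.lower() for t in tokens) else []
    ),
}

results = list(toy_generate(samples, transformations))
for originals, new_samples, name in results:
    print(name, len(originals), len(new_samples))
```

Since the results are yielded lazily, they can be processed one transformation at a time instead of holding every generated sample in memory.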

generate_by_transformations(dataset, **kwargs)

Generate samples by a list of transformation methods.

Parameters

dataset – the input dataset

Returns

(original samples, new samples, generated function string)

generate_by_subpopulations(dataset, **kwargs)

Generate samples by a list of subpopulation methods.

Parameters

dataset – the input dataset

Returns

the transformed dataset

generate_by_attacks(dataset, model=None, **kwargs)

Generate samples by a list of attack methods.

Parameters
  • dataset – the input dataset

  • model – the model to attack if given.

Returns

the transformed dataset