textflint.generation_layer.validator.max_words_perturbed

Max Perturbed Words Constraint

class textflint.generation_layer.validator.max_words_perturbed.MaxWordsPerturbed(origin_dataset, trans_dataset, fields, need_tokens=True)[source]

Bases: textflint.generation_layer.validator.validator.Validator

A constraint on the maximum number of perturbed words allowed.

The score is the length of the longest common subsequence (LCS) divided by the length of the sentence.

Parameters
  • origin_dataset (dataset) – the dataset of original samples

  • trans_dataset (dataset) – the dataset of transformed samples

  • fields (str|list) – the name(s) of the original field(s) to compare

  • need_tokens (bool) – whether the sentences need to be tokenized

validate(transformed_text, reference_text)[source]

Calculate the score

Parameters
  • transformed_text (str) – the transformed sentence

  • reference_text (str) – the original sentence

Return float

the score of the two sentences

static get_lcs(token1, token2)[source]

Calculate the length of the longest common subsequence of two token lists

Parameters
  • token1 (list) – the first token list

  • token2 (list) – the second token list

Return int

the length of the longest common subsequence
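The LCS-based score described above can be illustrated with a short standalone sketch. This is not textflint's implementation: the names `lcs_length` and `lcs_score` are hypothetical, whitespace tokenization is an assumption, and dividing by the reference sentence length is one possible reading of "the length of the sentence".

```python
from typing import Sequence


def lcs_length(tokens1: Sequence[str], tokens2: Sequence[str]) -> int:
    """Return the length of the longest common subsequence of two token lists."""
    # Classic dynamic programming, keeping only the previous row of the table.
    prev = [0] * (len(tokens2) + 1)
    for t1 in tokens1:
        curr = [0]
        for j, t2 in enumerate(tokens2, start=1):
            if t1 == t2:
                # Matching tokens extend the best subsequence ending before both.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the better of skipping either token.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_score(transformed_text: str, reference_text: str) -> float:
    """Hypothetical score: LCS length divided by the reference length."""
    # Assumption: sentences are tokenized on whitespace.
    trans_tokens = transformed_text.split()
    ref_tokens = reference_text.split()
    if not ref_tokens:
        return 0.0
    return lcs_length(trans_tokens, ref_tokens) / len(ref_tokens)
```

Under these assumptions, a score close to 1.0 means few words were perturbed, while a lower score means the transformed sentence shares fewer in-order tokens with the original.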

class textflint.generation_layer.validator.max_words_perturbed.Validator(origin_dataset, trans_dataset, fields, need_tokens=False)[source]

Bases: abc.ABC

An abstract class that computes the semantic similarity score between the original text and adversarial texts

Parameters
  • origin_dataset (dataset) – the dataset of original samples

  • trans_dataset (dataset) – the dataset of transformed samples

  • fields (str|list) – the name(s) of the original field(s) to compare

  • need_tokens (bool) – whether the sentences need to be tokenized

abstract validate(transformed_text, reference_text)[source]

Calculate the score

Parameters
  • transformed_text (str) – the transformed sentence

  • reference_text (str) – the original sentence

Return float

the score of the two sentences

check_data()[source]

Check whether the input data is valid

property score

Calculate the scores of the transformed sentences

Return list

a list of scores, one per transformed sentence
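The abstract-validator pattern documented here (an abstract validate method plus a score property that returns one score per transformed sentence) can be sketched as follows. This is a minimal illustration, not textflint's code: `SketchValidator`, `ExactMatchValidator`, and the `pairs` argument are hypothetical stand-ins, and the real class takes datasets and field names instead.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence, Tuple


class SketchValidator(ABC):
    """Hypothetical base class mirroring the documented interface."""

    def __init__(self, pairs: Sequence[Tuple[str, str]]):
        # Assumption: each pair is (transformed_text, reference_text).
        self.pairs = list(pairs)

    @abstractmethod
    def validate(self, transformed_text: str, reference_text: str) -> float:
        """Return the score for one transformed/original sentence pair."""

    @property
    def score(self) -> List[float]:
        # One score per transformed sentence, in input order.
        return [self.validate(trans, ref) for trans, ref in self.pairs]


class ExactMatchValidator(SketchValidator):
    """Toy subclass: scores 1.0 when the sentence is unchanged, else 0.0."""

    def validate(self, transformed_text: str, reference_text: str) -> float:
        return 1.0 if transformed_text == reference_text else 0.0
```

Because validate is abstract, the base class cannot be instantiated directly; a concrete subclass only needs to supply the pairwise scoring logic.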