textflint.generation_layer.validator.edit_distance

Levenshtein distance class

class textflint.generation_layer.validator.edit_distance.EditDistance(origin_dataset, trans_dataset, fields)[source]

Bases: textflint.generation_layer.validator.validator.Validator

A constraint on edit distance (Levenshtein distance). The score is the Levenshtein distance divided by the length of the sentence.

Parameters
  • origin_dataset (dataset) – the dataset of original samples

  • trans_dataset (dataset) – the dataset of transformed samples

  • fields (str|list) – the name(s) of the original field(s) to compare

validate(transformed_text, reference_text)[source]

Calculate the score

Parameters
  • transformed_text (str) – the transformed sentence

  • reference_text (str) – the original sentence

Return float

the score of the two sentences
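As a rough illustration of the length-normalized edit distance described for this class, the following is a minimal sketch in plain Python. It is not the textflint implementation: the helper names `levenshtein` and `normalized_edit_distance` are hypothetical, and dividing by the longer of the two strings is an assumption, since the text above does not say which sentence length is used.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance via row-by-row dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def normalized_edit_distance(transformed_text: str, reference_text: str) -> float:
    """Edit distance divided by the longer length (assumed normalization)."""
    longest = max(len(transformed_text), len(reference_text))
    if longest == 0:
        return 0.0  # two empty strings are identical
    return levenshtein(transformed_text, reference_text) / longest
```

Under this assumed normalization the value lies between 0 (identical strings) and 1 (nothing in common), so a threshold on it is comparable across sentences of different lengths.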

class textflint.generation_layer.validator.edit_distance.Validator(origin_dataset, trans_dataset, fields, need_tokens=False)[source]

Bases: abc.ABC

An abstract class that computes the semantic similarity score between the original text and the adversarial texts.

Parameters
  • origin_dataset (dataset) – the dataset of original samples

  • trans_dataset (dataset) – the dataset of transformed samples

  • fields (str|list) – the name(s) of the original field(s) to compare

  • need_tokens (bool) – whether the sentences need to be tokenized

abstract validate(transformed_text, reference_text)[source]

Calculate the score

Parameters
  • transformed_text (str) – the transformed sentence

  • reference_text (str) – the original sentence

Return float

the score of the two sentences

check_data()[source]

Check whether the input data is valid

property score

Calculate the scores of the transformed sentences

Return list

a list of scores, one per transformed sentence
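The relationship between the abstract validate method, check_data, and the score property can be sketched roughly as follows. This is a simplified stand-in, not the textflint class: `SketchValidator` and `LengthRatioValidator` are hypothetical names, plain lists of strings replace the dataset objects, and the pairing check inside `check_data` is an assumption about what a validity check might involve.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class SketchValidator(ABC):
    """Hypothetical stand-in for the validator pattern described above."""

    def __init__(self, origin_texts: Sequence[str], trans_texts: Sequence[str]):
        self.origin_texts = list(origin_texts)
        self.trans_texts = list(trans_texts)
        self.check_data()

    def check_data(self) -> None:
        # Assumed check: every original text has exactly one transformed text.
        if len(self.origin_texts) != len(self.trans_texts):
            raise ValueError("original and transformed texts must pair up")

    @abstractmethod
    def validate(self, transformed_text: str, reference_text: str) -> float:
        """Return the score for one (transformed, original) pair."""

    @property
    def score(self) -> List[float]:
        # One score per transformed sentence, in input order.
        return [
            self.validate(trans, origin)
            for origin, trans in zip(self.origin_texts, self.trans_texts)
        ]


class LengthRatioValidator(SketchValidator):
    """Toy subclass: scores a pair by the ratio of shorter to longer length."""

    def validate(self, transformed_text: str, reference_text: str) -> float:
        longest = max(len(transformed_text), len(reference_text))
        if longest == 0:
            return 1.0
        return min(len(transformed_text), len(reference_text)) / longest
```

A subclass only has to supply validate for a single pair; the base class handles the input check and the iteration over pairs.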