textflint.generation_layer.transformation.NLI.swap_ant

Replaces words in the input with antonyms provided by WordNet.

class textflint.generation_layer.transformation.NLI.swap_ant.SwapAnt(language='eng')[source]

Bases: textflint.generation_layer.transformation.transformation.Transformation

Transforms an input by replacing its words with antonyms provided by WordNet. Download nltk_data before running.

The implementation follows "Stress Test Evaluation for Natural Language Inference". To keep the transformation correct, a word is replaced with the antonym of its best sense in WordNet.

https://www.aclweb.org/anthology/C18-1198/

Example: {

hypothesis: I hate this book.
premise: This book is my favorite.
label: contradiction

}
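The idea can be sketched in a few lines of plain Python. This is a minimal illustration, not textflint's implementation: `TOY_ANTONYMS` and `swap_antonyms` are hypothetical names, the small dictionary stands in for WordNet lookups, and pairing the swapped sentence with the original under the label `contradiction` is an assumption based on the example above.

```python
# Hypothetical stand-in for WordNet antonym lookups (assumption, not textflint code).
TOY_ANTONYMS = {"love": "hate", "good": "bad", "happy": "sad"}


def swap_antonyms(tokens, antonyms=TOY_ANTONYMS, n=1):
    """Return up to n variants, each with exactly one token replaced by its antonym."""
    variants = []
    for i, tok in enumerate(tokens):
        if len(variants) >= n:
            break
        ant = antonyms.get(tok.lower())
        if ant is None:
            continue  # no known antonym for this token, leave it alone
        variants.append(tokens[:i] + [ant] + tokens[i + 1:])
    return variants


premise = "I love this book .".split()
hypotheses = swap_antonyms(premise)
# Assumption: the swapped sentence serves as the hypothesis of a contradiction pair.
pairs = [(" ".join(premise), " ".join(h), "contradiction") for h in hypotheses]
print(pairs)
```

Replacing one word per variant keeps each output close to the original sentence, so the antonym swap is the only thing that changes the meaning.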

__init__(language='eng')[source]
Parameters

language (string) – language of transformation

transform(sample, n=1, **kwargs)[source]

Transform data sample to a list of Sample.

Parameters
  • sample (~NLISample) – Data sample for augmentation

  • n (int) – Maximum number of unique augmented outputs. Default is 1.

  • **kwargs

Returns

Augmented data

class textflint.generation_layer.transformation.NLI.swap_ant.Transformation(**kwargs)[source]

Bases: abc.ABC

An abstract class for transforming a sequence of text to produce a list of potential adversarial examples.

processor = <textflint.common.preprocess.en_processor.EnProcessor object>
transform(sample, n=1, field='x', **kwargs)[source]

Transform data sample to a list of Sample.

Parameters
  • sample (Sample) – Data sample for augmentation.

  • n (int) – Maximum number of unique augmented outputs. Default is 1, per the signature above.

  • field (str|list) – Indicate which fields to apply transformations.

  • **kwargs (dict) – Other auxiliary parameters.

Returns

list of Sample

classmethod sample_num(x, num)[source]

Get ‘num’ samples from x.

Parameters
  • x (list) – list to sample

  • num (int) – sample number

Returns

At most ‘num’ unique samples.
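The "at most num" behaviour can be pictured with a standalone function. This is a sketch under assumptions rather than the library's code: it is a plain function instead of a classmethod, it samples positions without replacement, and it returns the whole list when fewer than `num` items are available.

```python
import random


def sample_num(x, num):
    """Return at most `num` items drawn from x without replacement."""
    if num <= 0:
        return []
    if num >= len(x):
        # Not enough items to sample from: return a copy of everything.
        return list(x)
    return random.sample(x, num)


candidates = ["a", "b", "c", "d"]
print(len(sample_num(candidates, 2)))   # two items, chosen at random
print(sample_num(candidates, 10))       # the full list, since only four items exist
```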

textflint.generation_layer.transformation.NLI.swap_ant.lesk(context_sentence, ambiguous_word, pos=None, synsets=None)[source]

Return a synset for an ambiguous word in a context.

Parameters
  • context_sentence (iter) – The context sentence where the ambiguous word occurs, passed as an iterable of words.

  • ambiguous_word (str) – The ambiguous word that requires WSD.

  • pos (str) – A specified Part-of-Speech (POS).

  • synsets (iter) – Possible synsets of the ambiguous word.

Returns

lesk_sense: the Synset() object with the highest signature overlap.

This function is an implementation of the original Lesk algorithm (1986) [1].

Usage example:

>>> lesk(['I', 'went', 'to', 'the', 'bank', 'to', 'deposit', 'money', '.'], 'bank', 'n')
Synset('savings_bank.n.02')

[1] Lesk, Michael. “Automatic sense disambiguation using machine readable dictionaries: how to tell a pine cone from an ice cream cone.” Proceedings of the 5th Annual International Conference on Systems Documentation. ACM, 1986. http://dl.acm.org/citation.cfm?id=318728
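The overlap idea behind Lesk can be illustrated without WordNet. The sketch below is a simplification under stated assumptions: `TOY_SENSES` and `simple_lesk` are hypothetical names, the two glosses are hand-written stand-ins for WordNet definitions, and the score is a plain count of words shared between the context and each gloss.

```python
# Hypothetical sense inventory: sense id -> gloss (stand-in for WordNet definitions).
TOY_SENSES = {
    "bank.financial": "an institution that accepts deposits of money and makes loans",
    "bank.river": "sloping land beside a body of water such as a river",
}


def simple_lesk(context_sentence, senses):
    """Pick the sense whose gloss shares the most words with the context."""
    context = {w.lower() for w in context_sentence}
    best_sense, best_overlap = None, -1
    for sense, gloss in senses.items():
        overlap = len(context & set(gloss.lower().split()))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


sentence = ["I", "went", "to", "the", "bank", "to", "deposit", "money", "."]
print(simple_lesk(sentence, TOY_SENSES))  # prints "bank.financial"
```

Here the only shared word is "money", which is enough to prefer the financial sense. Ties go to the sense listed first, so the order of the inventory matters in this simplified version.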