textflint.generation_layer.transformation.RE.swap_ent
Entity swap transformation (the SwapEnt class)
-
class textflint.generation_layer.transformation.RE.swap_ent.SwapEnt(type='lowfreq', **kwargs)
Bases: textflint.generation_layer.transformation.transformation.Transformation
Replace an entity mention with another entity of the same entity type.
-
replace_en(types, index, token)
Replace the entity with a random token span.
- Parameters
types (str) – entity type
index (list) – entity index [start, end]
token (list) – tokenized sentence
- Return Tuple(list, int)
the new sentence, and the number of words by which the new entity exceeds the old entity in length
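The span replacement and the returned length difference can be illustrated with a minimal, self-contained sketch. It does not call textflint: the function name `replace_span`, the inclusive reading of `[start, end]`, the reading of the second return value as a signed length difference, and the sample tokens are all assumptions made for illustration.

```python
def replace_span(tokens, index, new_entity):
    """Replace tokens[start..end] with new_entity.

    Assumes `index` is an inclusive [start, end] pair (an assumption,
    not confirmed by the textflint docs). Returns the new token list and
    how many more words the new entity has than the old one (negative if
    the new entity is shorter).
    """
    start, end = index
    old_len = end - start + 1
    new_tokens = tokens[:start] + new_entity + tokens[end + 1:]
    return new_tokens, len(new_entity) - old_len


tokens = ["Barack", "Obama", "was", "born", "in", "Hawaii", "."]
new_tokens, shift = replace_span(tokens, [0, 1], ["Angela", "Dorothea", "Merkel"])
print(new_tokens)
print(shift)
```

A caller can use the returned difference to shift the index of any entity that follows the replaced span.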
-
-
class textflint.generation_layer.transformation.RE.swap_ent.RESample(data, origin=None, sample_id=None)
Bases: textflint.input_layer.component.sample.sample.Sample
Transform and retrieve features of RESample.
-
check_data(data)
Check whether the type of data is correct.
- Parameters
data (dict) – data dict containing ‘x’, ‘subj’, ‘obj’ and ‘y’
-
load(data)
Convert a data dict containing the essential information to an RESample.
- Parameters
data (dict) – contains ‘token’, ‘subj’, ‘obj’ and ‘relation’ keys
-
get_dp()
Get the dependency parse.
- Return Tuple(list, list)
dependency tag of sentence and head of sentence
-
get_en()
Get the entity indices.
- Return Tuple(int, int, int, int)
start index of subject entity, end index of subject entity, start index of object entity and end index of object entity
-
get_type()
Get the entity types.
- Return Tuple(string, string)
entity type of subject and entity type of object
-
get_sent()
Get the tokenized sentence.
- Return Tuple(list, string)
tokenized sentence and relation
-
delete_field_at_indices(field, indices)
Delete the words at the given indices in the sentence.
- Parameters
field (string) – field to be operated on
indices (list) – a list of indices to delete
- Return dict
contains ‘token’, ‘subj’ and ‘obj’ keys
-
insert_field_after_indices(field, indices, new_item)
Insert words after the given indices in the sentence.
- Parameters
field (string) – field to be operated on
indices (list) – a list of indices to insert after
new_item (list) – list of items to be inserted
- Return dict
contains ‘token’, ‘subj’ and ‘obj’ keys
-
insert_field_before_indices(field, indices, new_item)
Insert words before the given indices in the sentence.
- Parameters
field (string) – field to be operated on
indices (list) – a list of indices to insert before
new_item (list) – list of items to be inserted
- Return dict
contains ‘token’, ‘subj’ and ‘obj’ keys
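Deleting or inserting tokens moves the subject and object entities, which is presumably why these methods return the updated ‘token’, ‘subj’ and ‘obj’ fields together. The sketch below shows that bookkeeping for deletion only. It is not textflint code: the helper name `delete_tokens`, the inclusive `[start, end]` span convention, and the choice to reject deletions inside an entity are assumptions made for illustration.

```python
def delete_tokens(tokens, spans, indices):
    """Delete tokens at `indices` and shift inclusive [start, end] spans.

    `spans` maps a name (e.g. 'subj') to [start, end]. Deleting a token
    inside an entity span is rejected to keep the sketch simple.
    """
    doomed = set(indices)
    for name, (start, end) in spans.items():
        if any(start <= i <= end for i in doomed):
            raise ValueError(f"index falls inside the {name} span")
    kept = [tok for i, tok in enumerate(tokens) if i not in doomed]
    shifted = {}
    for name, (start, end) in spans.items():
        # Every deleted position before the span moves it one step left.
        offset = sum(1 for i in doomed if i < start)
        shifted[name] = [start - offset, end - offset]
    return {"token": kept, **shifted}


tokens = ["Yesterday", ",", "Marie", "Curie", "visited", "Paris", "."]
result = delete_tokens(tokens, {"subj": [2, 3], "obj": [5, 5]}, [0, 1])
print(result)
```

Insertion works the same way in reverse: each token inserted before a span shifts that span one step to the right.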
-
-
class textflint.generation_layer.transformation.RE.swap_ent.Transformation(**kwargs)
Bases: abc.ABC
An abstract class for transforming a sequence of text to produce a list of potential adversarial examples.
-
processor = <textflint.common.preprocess.en_processor.EnProcessor object>
-
transform(sample, n=1, field='x', **kwargs)
Transform a data sample into a list of Sample objects.
- Parameters
sample (Sample) – Data sample for augmentation.
n (int) – Maximum number of unique augmented outputs; the signature default is 1.
field (str|list) – Indicates which fields to apply the transformation to.
**kwargs (dict) – other auxiliary params.
- Returns
list of Sample
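The pattern described here, an abstract base class whose public `transform` caps the number of unique outputs at `n`, can be sketched in plain Python. This is an illustration of the general pattern and not textflint's implementation: the class names `SketchTransformation` and `UpperOrTitle`, the `_transform` hook, and the use of plain strings in place of Sample objects are all assumptions.

```python
from abc import ABC, abstractmethod


class SketchTransformation(ABC):
    """Hypothetical stand-in for an abstract transformation class."""

    def transform(self, sample, n=1, **kwargs):
        # Keep at most n unique candidates that differ from the input,
        # preserving the order in which they were produced.
        unique = []
        for candidate in self._transform(sample, **kwargs):
            if len(unique) >= n:
                break
            if candidate != sample and candidate not in unique:
                unique.append(candidate)
        return unique

    @abstractmethod
    def _transform(self, sample, **kwargs):
        """Yield candidate transformed samples."""


class UpperOrTitle(SketchTransformation):
    def _transform(self, sample, **kwargs):
        yield sample.upper()
        yield sample.title()
        yield sample.upper()  # duplicate, filtered out by transform()


outputs = UpperOrTitle().transform("entity swap", n=5)
print(outputs)
```

Because `n` is an upper bound, asking for five outputs from a transformation that can only produce two distinct candidates returns two.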
-
-
textflint.generation_layer.transformation.RE.swap_ent.download_if_needed(folder_name)
The folder is cached at .cache/textflint/[folder_name]. If it does not exist on disk, the zip file is downloaded and extracted.
- Parameters
folder_name (str) – path to folder or file in cache
- Returns
path to the downloaded folder or file on disk
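The check-then-fetch behaviour described above can be sketched without any network access. This is a minimal illustration, not textflint's code: the function name `cached_path`, the injected `fetch` callback that stands in for the download-and-extract step, and the folder name `demo/resource` are hypothetical.

```python
import os
import tempfile


def cached_path(folder_name, cache_root, fetch):
    """Return the local path for folder_name, fetching it only if absent.

    `fetch(folder_name, target)` is a hypothetical callback standing in
    for the real download-and-extract step.
    """
    target = os.path.join(cache_root, folder_name)
    if not os.path.exists(target):
        os.makedirs(os.path.dirname(target), exist_ok=True)
        fetch(folder_name, target)
    return target


calls = []


def fake_fetch(name, target):
    # Stand-in for downloading and unzipping: just create the folder.
    calls.append(name)
    os.makedirs(target)


with tempfile.TemporaryDirectory() as root:
    first = cached_path("demo/resource", root, fake_fetch)
    second = cached_path("demo/resource", root, fake_fetch)
    fetched_once = first == second and len(calls) == 1
print(fetched_once)
```

The second call finds the folder on disk and returns the same path without invoking the fetch step again.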