textflint.generation_layer.transformation.RE.swap_employee

SwapEmployee class for employee-related transformation

class textflint.generation_layer.transformation.RE.swap_employee.SwapEmployee(**kwargs)[source]

Bases: textflint.generation_layer.transformation.transformation.Transformation

Entity position swap with paraphrase (employee related)

titles_dict = ['dancer', 'pathologist', 'ASTRONAUT', 'Lieutenant', 'preacher', 'Baron', 'astrophysicist', 'auditor', 'chief executive', 'broker', 'reader', 'top former', 'Editor', 'guide', 'photographer', 'Foreign Minister', 'Finance Minister', 'model', 'professor', 'vice minister', 'Reviewer', 'writer', 'secretary', 'biologist', 'Archivist', 'queen', 'insurer', 'Admiral', 'pop star', 'Cleric', 'deputy president', 'Artist', 'Emperor', 'Creator', 'managing editor', 'Producer', 'accountant', 'nurse', 'president', 'shooter', 'assistant conductor', 'police official', 'Sheikh', 'Coroner', 'Director-General', 'Marshal', 'Handler', 'interior minister', 'board chairman', 'police Detective Lt.', 'Author', 'police chief', 'scientist', 'pianist', 'executive chef', 'foreign minister', 'coach', 'header', 'chief operating officer', 'Lt', 'Superintendent', 'cleaner', 'Vice governor', 'Pilot', 'SCHOLAR', 'deputy Prime Minister', 'assistant dean', 'vice governor', 'police detective', 'recruiter', 'homeowner', 'Judge', 'Duke', 'painter', 'owner', 'BROKER', 'dealer', 'executor', 'creator', 'instructor', 'Police Chief', 'musician', 'prophet', 'Speaker', 'Prime Minister', 'State Treasurer', 'chef', 'chemical engineer', 'chief of staff Gen.', 'politician', 'bureau chief', 'chief justice', 'Model', 'spokeswoman 1st Lt.', 'mobile', 'deputy minister', 'Education Secretary', 'novelist', 'Deputy', 'filmmaker', 'stretcher', 'economist', 'chaplain', 'geologist', 'mobster', 'midfielder', 'Private', 'STUDENT', 'foster parent', 'Assistant Secretary of Homeland Security', 'Gov.', 'Trader', 'artist', 'marine spokesman Lt. 
Col.', 'corporal', 'Lt.', 'theatre director', 'Congressman', 'Caller', 'Prof.', 'attorney', 'marine Col.', 'Attorney General', 'prosecutor', 'police Chief', 'webmaster', 'librarian', 'state Sen.', 'executive', 'deputy secretary-general', 'representative', 'Chief Technology Officer', 'Rev.', 'Architect', 'actor', 'pastor', 'stand in', 'Mayor', 'shaker', 'chairman', 'Prosecutor', 'founder Sheikh', 'river', 'engineer', 'investment banker', 'FOUNDER', 'screenwriter', 'chemist', 'police Detective Lt', 'guard', 'deputy police chief', 'Defence Secretary', 'Technical Director', 'Secretary', 'Secretary of Treasury', 'singer-songwriter', 'assistant', 'scholar', 'Premier', 'deputy director general', 'spokeswoman 1st Lt', 'singer', 'entertainer', 'security chief', 'missionary', 'Blogger', 'public relations representative', 'policeman', 'consul', 'Marine', 'Rep.', 'lobbyist', 'curator', 'spokesman Lt. Col.', 'soprano', 'Messenger', 'defender', 'assistant teacher', 'salesmen', 'CEO', 'social worker', 'supervisor', 'Senator', 'Governor', 'composer', 'rabbi', 'cashier', 'congressman', 'LEADER', 'footballer', 'shopkeeper', 'statistician', 'Representative', 'general manager', 'superintendent', 'organizer', 'banker', 'Interior Secretary', 'architect', 'Professor', 'treasurer', 'doctor', 'Painter', 'PILOT', 'vice chairman', 'cleric', 'wrestler', 'dictator', 'programmer-analyst', 'carpenter', 'secretary of the Treasury', 'hair stylist', 'Director General', 'spokesperson', 'consultant', 'administrative assistant', 'Sgt.', 'artistic director', 'ambassador', 'flutist', 'activist lawyer', 'Jogger', 'research associate', 'intern', 'philosopher', 'oil minister', 'Chief executive', 'goalie', 'Defence Minister', 'lawyer', 'Spokesman', 'attorney general', 'executive deputy director', 'deputy director-general', 'lather', 'soldier', 'tanker', 'Prophet', 'President', 'cook', 'manager', 'Councilor', 'smoother', 'Mailer', 'comptroller', 'commander', 'boss', 'Reporter', 'Correspondent', 'Director', 
'spokeswoman', 'correspondent', 'athlete', 'lieutenant', 'technician', 'host', 'Filmmaker', 'pope', 'detective', 'systems analyst', 'assistant secretary of state', 'Physicist', 'mediator', 'managing director', 'marriage and family counselor', 'Chief Justice', 'Deputy Prime Minister', 'companion', 'pilot', 'violinist', 'chief economist', 'manipulator', 'Inspector General', 'House Speaker', 'lieutenant governor', 'editor', 'Secretary-General', 'police officer', 'anthropologist', 'state senator', 'deputy chairman', 'Oracle Developer', 'co-founder', 'industry minister', 'Lt. Col.', 'investigator', 'President-elect', 'caller', 'businesswoman', 'diplomat', 'Student', 'board president', 'vice-president', 'secretary general', 'Analyst', 'Deputy Director', 'Doctor', 'deputy', 'mechanical engineer', 'senator', 'Vice President', 'financier', 'executive vice president', 'chief of police', 'salesman', 'cab driver', 'optometrist', 'Chairman', 'political scientist', 'secretary-general', 'Charter', 'basketball player', 'publisher', 'producer', 'sales manager', 'physicist', 'envoy', 'director', 'WRITER', 'therapist', 'prime minister', 'Pastor', 'Assistant Foreign Minister', 'Secretary of State', 'servant', 'ARTIST', 'vendor', 'chairwoman', 'Colonel', 'actress', 'activist', 'merchant', 'founder', 'Builder', 'Commodore', 'JUDGE', 'Broker', 'astronaut', 'mayor', 'specialist', 'Pope', 'journalist', 'waitress', 'buyer', 'Ambassador', 'deputy director', 'superstar', 'Inspector', 'AIM activist', 'Chief Executive Officer', 'judge', 'Oracle', 'Mobile', 'magistrate', 'driver', 'Managing Director', 'historian', 'designer', 'Chef', 'spokesman', 'coroner', 'principal', 'provincial governor', 'Minister', 'dentist', 'poet', 'dresser', 'Attorney', 'governor', 'developer', 'point guard', 'executive director', 'minister', 'Party chief', 'Secretary of State candidate', 'general', 'conductor', 'author', 'marker', 'director general', 'analyst', 'contractor', 'structural engineers', 'blogger', 
'inspector general', 'czar', 'critic', 'Vice Minister', 'office director', 'Deputy Minister', 'goalkeeper', 'student', 'builder', 'guru', 'Developer', 'chief engineer', 'medical examiner', 'CFO', 'River', 'Hunter', 'monk', 'imam', 'manufacturer', 'leader', 'electrical engineer', 'Reader', 'Commerce Minister', 'Affairs Minister', 'negotiator', 'Vice Chairman', 'Leader', 'chief of staff', 'researcher', 'Lieutenant General', 'Queen', 'lawmaker', 'chairwoman Rep.', 'interpreter', 'vice president', 'Gen.', 'honorary chairman', 'Police Officer', 'Col.', 'interrogator', 'Software Engineer', 'Minority Leader', 'Manager', 'striker', 'Secretary General', 'candidate', 'Activist', 'count', 'art teacher', 'chief executive officer', 'columnist', 'Chief Executive', 'miner', 'Boxer', 'administrator', 'farmer', 'Journalist', 'reporter', 'butcher', 'King', 'construction worker', 'security adviser', 'counselor', 'assistant professor', 'Guard', 'entrepreneur', 'psychologist', 'commentator', 'clown', 'REPORTER', 'nun', 'operator', 'drifter', 'landlord', 'premier', 'Environment Minister', 'translator', 'chief financial officer', 'trainer', 'general counsel', 'Executive Vice President', 'Founder', 'General', 'strategist', 'Interior Minister', 'businessman', 'gofer', 'nanny', 'chief General', 'marshal', 'Vice Premier', 'charter', 'dean', 'bachelor', 'special agent', 'captain', 'lieutenant colonel', 'Companion', 'printer', 'Defense Secretary', 'broadcaster', 'executive producer', 'teacher']
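
How SwapEmployee consults titles_dict is not described on this page. Purely as an illustration, a list like this could be used to locate a job title, including multi-word ones, in a token sequence. The helper name find_title and its inclusive [start, end] return convention are assumptions and not part of textflint.

```python
def find_title(tokens, titles):
    """Hypothetical helper: return the inclusive [start, end] span of the
    first title found in `tokens`, or None if no title occurs."""
    title_set = set(titles)
    if not title_set:
        return None
    # Longest title, measured in whitespace-separated words.
    max_len = max(len(title.split()) for title in title_set)
    for start in range(len(tokens)):
        longest = min(max_len, len(tokens) - start)
        # Try the longest candidate first so that "chief executive officer"
        # wins over "chief executive".
        for length in range(longest, 0, -1):
            candidate = " ".join(tokens[start:start + length])
            if candidate in title_set:
                return [start, start + length - 1]
    return None


# A small excerpt of titles, used here only for the demonstration.
titles = ["CEO", "chief executive", "chief executive officer"]
tokens = "Tim Cook , chief executive officer of Apple".split()
span = find_title(tokens, titles)
```

Matching in this sketch is case-sensitive, so case variants such as 'judge', 'Judge' and 'JUDGE' each need their own entry.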
split_sent(head_pos, tail_pos, words)[source]

Split the sentence into three pieces: left, middle and right.

Parameters
  • head_pos (list) – position of subject entity

  • tail_pos (list) – position of object entity

  • words (list) – sentence tokens

Returns

  • bool – whether to reverse the position of the subject entity and the object entity

  • list – entity placed on the left

  • list – entity placed on the right

  • list – token indices placed between the left entity and the right entity

  • list – tokens placed between the left entity and the right entity

  • list – tokens placed to the left of the left entity

  • list – tokens placed to the right of the right entity
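
The split described above can be sketched as follows. This is a minimal illustration and not the library's implementation: the function name split_sentence is hypothetical, and it assumes that each position is an inclusive [start, end] pair of token indices and that the two entities do not overlap.

```python
def split_sentence(head_pos, tail_pos, words):
    """Hypothetical stand-in for split_sent; spans are inclusive [start, end]."""
    # The entity that starts first becomes the "left" one. `reverse` is True
    # when the object (tail) entity precedes the subject (head) entity.
    reverse = tail_pos[0] < head_pos[0]
    left, right = (tail_pos, head_pos) if reverse else (head_pos, tail_pos)

    # Tokens strictly between the two entities.
    middle_pos = list(range(left[1] + 1, right[0]))
    middle_words = [words[i] for i in middle_pos]

    # Tokens outside the two entities.
    left_words = words[:left[0]]
    right_words = words[right[1] + 1:]
    return reverse, left, right, middle_pos, middle_words, left_words, right_words


words = "Yesterday Tim Cook , CEO of Apple , spoke".split()
result = split_sentence([1, 2], [6, 6], words)
```

Here the subject "Tim Cook" comes first, so the reverse flag is False and the middle piece is the three tokens between the entities.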

assert_attributive(left, right, words, heads, middle_words, middle_pos)[source]

Judge whether sentence piece between entities is attributive or not.

Parameters
  • left (list) – entity placed on the left

  • right (list) – entity placed on the right

  • words (list) – sentence tokens

  • middle_pos (list) – token indices placed between the left entity and the right entity

  • middle_words (list) – tokens placed between the left entity and the right entity

Return bool

whether the middle part is attributive or not

generate_new_item(reverse, left, right, left_words, right_words, middle_words, title_pos)[source]

Build the new list of words from the split pieces and locate the subject and object entities in it.

Parameters
  • reverse (bool) – whether the positions of the head and tail entities are reversed

  • left (list) – entity placed on the left

  • right (list) – entity placed on the right

  • left_words (list) – tokens placed to the left of the left entity

  • right_words (list) – tokens placed to the right of the right entity

  • middle_words (list) – tokens placed between the left entity and the right entity

  • title_pos (list) – the position of TITLE

Returns

  • list – new list of words

  • list – the position of the subject entity

  • list – the position of the object entity

class textflint.generation_layer.transformation.RE.swap_employee.RESample(data, origin=None, sample_id=None)[source]

Bases: textflint.input_layer.component.sample.sample.Sample

transform and retrieve features of RESample

check_data(data)[source]

check whether type of data is correct

Parameters

data (dict) – data dict containing ‘x’, ‘subj’, ‘obj’ and ‘y’

Validate whether the sample is legal
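
A check of this kind might look like the sketch below. The function name looks_like_re_sample is hypothetical, and the field types (a token list for ‘x’, inclusive [start, end] index pairs for ‘subj’ and ‘obj’, a relation string for ‘y’) are assumptions inferred from the return types of get_en() and get_sent(), not a statement of what check_data actually enforces.

```python
def looks_like_re_sample(data):
    """Hypothetical validator for a relation-extraction data dict."""
    required = ("x", "subj", "obj", "y")
    if not isinstance(data, dict) or any(key not in data for key in required):
        return False

    tokens = data["x"]
    if not isinstance(tokens, list) or not isinstance(data["y"], str):
        return False

    for key in ("subj", "obj"):
        span = data[key]
        if not (isinstance(span, (list, tuple)) and len(span) == 2):
            return False
        start, end = span
        if not (isinstance(start, int) and isinstance(end, int)):
            return False
        # Assumed convention: inclusive indices that stay inside the sentence.
        if not (0 <= start <= end < len(tokens)):
            return False
    return True


good = {"x": ["Tim", "Cook", "leads", "Apple"],
        "subj": [0, 1], "obj": [3, 3], "y": "employee_of"}
missing_label = {"x": ["Tim", "Cook"], "subj": [0, 1], "obj": [1, 1]}
out_of_range = dict(good, obj=[3, 7])
```

The relation label "employee_of" is made up for the example.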

get_sent_ids()[source]

Generate sentence ID

Returns

string: sentence ID

load(data)[source]

Convert data dict which contains essential information to RESample.

Parameters

data (dict) – contains ‘token’, ‘subj’, ‘obj’ and ‘relation’ keys

get_dp()[source]

get dependency parsing

Return Tuple(list, list)

dependency tag of sentence and head of sentence

get_en()[source]

get entity index

Return Tuple(int, int, int, int)

start index of subject entity, end index of subject entity, start index of object entity and end index of object entity

get_type()[source]

get entity type

Return Tuple(string, string)

entity type of subject and entity type of object

get_sent()[source]

get tokenized sentence

Return Tuple(list, string)

tokenized sentence and relation

delete_field_at_indices(field, indices)[source]

delete the words at the given indices in the sentence

Parameters
  • field (string) – field to be operated on

  • indices (list) – list of indices to be deleted

Return dict

contains ‘token’, ‘subj’ and ‘obj’ keys
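
Deleting tokens shifts every later index, so the entity positions have to be adjusted along with the token list. The sketch below shows that bookkeeping on plain lists. The helper delete_tokens is hypothetical, and it assumes inclusive [start, end] spans and that no deleted index falls inside an entity.

```python
def delete_tokens(tokens, subj, obj, indices):
    """Hypothetical helper: drop tokens at `indices` and shift entity spans."""
    drop = set(indices)
    kept = [tok for i, tok in enumerate(tokens) if i not in drop]

    def shift(span):
        start, end = span
        # Every deleted index before a boundary moves that boundary left by one.
        return [start - sum(i < start for i in drop),
                end - sum(i < end for i in drop)]

    return {"token": kept, "subj": shift(subj), "obj": shift(obj)}


tokens = "Tim Cook , the CEO of Apple".split()
result = delete_tokens(tokens, subj=[0, 1], obj=[6, 6], indices=[3])
```

After removing "the" at index 3, the object entity "Apple" moves from index 6 to index 5 while the subject span is unchanged.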

insert_field_after_indices(field, indices, new_item)[source]

insert words after the given indices in the sentence

Parameters
  • field (string) – field to be operated on

  • indices (list) – list of indices after which to insert

  • new_item (list) – list of items to be inserted

Return dict

contains ‘token’, ‘subj’ and ‘obj’ keys

insert_field_before_indices(field, indices, new_item)[source]

insert words before the given indices in the sentence

Parameters
  • field (string) – field to be operated on

  • indices (list) – list of indices before which to insert

  • new_item (list) – list of items to be inserted

Return dict

contains ‘token’, ‘subj’ and ‘obj’ keys
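
Going by the method names, insert_field_before_indices and insert_field_after_indices differ only in which side of the index the new items land on. The sketch below shows that difference on a plain token list. The helper insert_tokens is hypothetical and handles a single index, whereas the methods above take a list of indices and also return entity positions.

```python
def insert_tokens(tokens, index, new_items, after=False):
    """Hypothetical helper: insert `new_items` before or after `index`."""
    # Inserting "after" index i is the same as inserting "before" index i + 1.
    offset = index + 1 if after else index
    return tokens[:offset] + list(new_items) + tokens[offset:]


tokens = ["Cook", "leads", "Apple"]
before = insert_tokens(tokens, 0, ["Tim"])
after = insert_tokens(tokens, 2, ["Inc."], after=True)
```

Any entity span that starts at or after the insertion offset would need to be shifted right by the number of inserted items.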

replace_sample_fields(data)[source]

replace sample fields for RE transformation

Parameters

data (dict) – contains transformed x, subj, obj keys

Return RESample

transformed sample

stan_ner_transform()[source]

Generate ner list

Return list

ner tags

get_pos()[source]

get pos tagging of sentence

Return list

pos tags

dump()[source]

output data sample

Return dict

containing x, subj, obj, y and sample_id

class textflint.generation_layer.transformation.RE.swap_employee.Transformation(**kwargs)[source]

Bases: abc.ABC

An abstract class for transforming a sequence of text to produce a list of potential adversarial examples.

processor = <textflint.common.preprocess.en_processor.EnProcessor object>
transform(sample, n=1, field='x', **kwargs)[source]

Transform data sample to a list of Sample.

Parameters
  • sample (Sample) – Data sample for augmentation.

  • n (int) – Max number of unique augmented outputs, default is 1.

  • field (str|list) – Indicate which fields to apply transformations.

  • **kwargs (dict) –

    other auxiliary params.

Returns

list of Sample

classmethod sample_num(x, num)[source]

Get ‘num’ samples from x.

Parameters
  • x (list) – list to sample

  • num (int) – sample number

Returns

at most ‘num’ unique samples.
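
One way to read "at most ‘num’ unique samples" is sketched below. This is an assumption about the intended behaviour rather than the library's code: the helper sample_up_to is hypothetical, it removes duplicates before sampling, and it requires the items to be hashable.

```python
import random


def sample_up_to(x, num):
    """Hypothetical helper: draw at most `num` distinct items from `x`."""
    # dict.fromkeys removes duplicates while keeping first-seen order.
    unique = list(dict.fromkeys(x))
    # random.sample raises ValueError if asked for more items than exist,
    # so the request is capped at the number of unique items.
    return random.sample(unique, min(num, len(unique)))


candidates = ["a", "b", "b", "c"]
picked = sample_up_to(candidates, 2)
everything = sample_up_to(candidates, 10)
```

Asking for more samples than there are unique items simply returns all of them in random order.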

textflint.generation_layer.transformation.RE.swap_employee.download_if_needed(folder_name)[source]

Folder name will be saved as .cache/textflint/[folder_name]. If it doesn’t exist on disk, the zip file will be downloaded and extracted.

Parameters

folder_name (str) – path to folder or file in cache

Returns

path to the downloaded folder or file on disk
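
The cache lookup half of this behaviour can be sketched without any network access. The helper resolve_cached and its cache_root parameter are hypothetical, and placing .cache/textflint under the user's home directory is an assumption. The download and unzip step is deliberately left out.

```python
from pathlib import Path


def resolve_cached(folder_name, cache_root=None):
    """Hypothetical helper: return (path, present) for a cache entry."""
    # Assumption: the cache lives under the user's home directory unless
    # an explicit root is given.
    root = Path(cache_root) if cache_root else Path.home() / ".cache" / "textflint"
    path = root / folder_name
    # A caller would download and extract the archive only when `present`
    # is False, then return `path` either way.
    return path, path.exists()


default_path, _ = resolve_cached("example_folder")
```

The name "example_folder" is a placeholder and does not refer to a real textflint resource.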