textflint.generation_layer.transformation.RE.insert_clause
InsertClause class for the transformation that adds an entity description clause
-
class
textflint.generation_layer.transformation.RE.insert_clause.InsertClause(**kwargs)
Bases: textflint.generation_layer.transformation.transformation.Transformation
Add an extra entity-related clause to the text.
-
class
textflint.generation_layer.transformation.RE.insert_clause.Client(base_url: str = 'https://www.wikidata.org/', opener: Optional[urllib.request.OpenerDirector] = None, datavalue_decoder: Optional[Union[Decoder, Callable[[Client, str, Mapping[str, object]], object]]] = None, entity_type_guess: bool = True, cache_policy: wikidata.cache.CachePolicy = <wikidata.cache.NullCachePolicy object>, repr_string: Optional[str] = None)
Bases: object
Wikidata client session.
- Parameters
base_url (str) – The base URL of Wikidata. WIKIDATA_BASE_URL is used by default.
opener (urllib.request.OpenerDirector) – The opener for urllib.request. If omitted or None, the default opener is used.
entity_type_guess (bool) – Whether to guess the type of an Entity from its id, which reduces the number of HTTP requests. True by default.
cache_policy (CachePolicy) – A caching policy for API calls. No cache (NullCachePolicy) by default.
New in version 0.5.0: The cache_policy option.
Changed in version 0.3.0: The meaning of the base_url parameter changed. It originally meant https://www.wikidata.org/wiki/, which contained the trailing path wiki/, but now it means only https://www.wikidata.org/.
New in version 0.2.0: The entity_type_guess option.
-
entity_type_guess = True
(bool) Whether to guess the type of an Entity from its id, which reduces the number of HTTP requests.
New in version 0.2.0.
-
cache_policy = <wikidata.cache.NullCachePolicy object>
(CachePolicy) A caching policy for API calls.
New in version 0.5.0.
-
get(entity_id: EntityId, load: bool = False) → wikidata.entity.Entity
Get a Wikidata entity by its EntityId.
- Parameters
entity_id – The id of the Entity to find.
load (bool) – Eager loading on True. Lazy loading (False) by default.
- Returns
The found entity.
- Return type
Entity
New in version 0.3.0: The load option.
-
guess_entity_type(entity_id: EntityId) → Optional[wikidata.entity.EntityType]
Guess the EntityType from the given EntityId. It may return None when it fails to guess.
Note
It always fails to guess when entity_type_guess is configured to False.
- Returns
The guessed EntityType, or None if it fails to guess.
- Return type
Optional[EntityType]
New in version 0.2.0.
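The guessing behaviour described above can be illustrated with a small standalone sketch. It is not the wikidata library's code: the `EntityType` enum and the `guess_entity_type` function below are hypothetical stand-ins, and the sketch only assumes the Wikidata ID convention that item IDs start with `Q` and property IDs start with `P`. The `enabled` flag mimics the effect of `entity_type_guess`.

```python
from enum import Enum
from typing import Optional


class EntityType(Enum):
    """Hypothetical stand-in for the two entity kinds distinguished here."""
    item = "item"
    property = "property"


def guess_entity_type(entity_id: str, enabled: bool = True) -> Optional[EntityType]:
    """Guess the entity type from the ID prefix, without any HTTP request.

    Returns None when guessing is disabled or the ID is not recognised.
    """
    if not enabled or not entity_id:
        return None
    prefix, digits = entity_id[0], entity_id[1:]
    if not digits.isdigit():
        return None  # not of the form <letter><number>
    if prefix == "Q":
        return EntityType.item
    if prefix == "P":
        return EntityType.property
    return None


print(guess_entity_type("Q42"))                 # an item ID
print(guess_entity_type("P31"))                 # a property ID
print(guess_entity_type("Q42", enabled=False))  # guessing switched off
```

With guessing disabled, the function always returns None, mirroring the note above.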
-
exception
textflint.generation_layer.transformation.RE.insert_clause.FlintError
Bases: RuntimeError
Default error thrown by textflint functions. FlintError will be raised if you do not give any error type specification.
-
class
textflint.generation_layer.transformation.RE.insert_clause.RESample(data, origin=None, sample_id=None)
Bases: textflint.input_layer.component.sample.sample.Sample
Transform and retrieve features of RESample.
-
check_data(data)
Check whether the type of data is correct.
- Parameters
data (dict) – data dict containing ‘x’, ‘subj’, ‘obj’ and ‘y’
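A minimal sketch of what such a check might look like is given below. It is not textflint's implementation: `check_re_data` is a hypothetical helper, and it assumes that ‘x’ is a list of string tokens, that ‘subj’ and ‘obj’ are inclusive [start, end] index pairs into ‘x’, and that ‘y’ is a relation label string. The example relation label is made up.

```python
def check_re_data(data):
    """Raise ValueError if `data` does not look like an RE sample dict."""
    required = ("x", "subj", "obj", "y")
    missing = [key for key in required if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")

    tokens = data["x"]
    if not isinstance(tokens, list) or not all(isinstance(t, str) for t in tokens):
        raise ValueError("'x' must be a list of string tokens")

    for key in ("subj", "obj"):
        span = data[key]
        if (not isinstance(span, (list, tuple)) or len(span) != 2
                or not all(isinstance(i, int) for i in span)):
            raise ValueError(f"'{key}' must be a [start, end] pair of ints")
        start, end = span
        # Assumed convention: both ends are inclusive token indices.
        if not 0 <= start <= end < len(tokens):
            raise ValueError(f"'{key}' span is out of range")

    if not isinstance(data["y"], str):
        raise ValueError("'y' must be a relation label string")
    return True


sample = {
    "x": ["Marie", "Curie", "was", "born", "in", "Warsaw"],
    "subj": [0, 1],   # "Marie Curie"
    "obj": [5, 5],    # "Warsaw"
    "y": "born_in",   # hypothetical relation label
}
print(check_re_data(sample))
```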
-
load(data)
Convert a data dict that contains the essential information to an RESample.
- Parameters
data (dict) – contains ‘token’, ‘subj’, ‘obj’, ‘relation’ keys.
-
get_dp()
Get the dependency parse.
- Return Tuple(list, list)
dependency tag of sentence and head of sentence
-
get_en()
Get the entity indices.
- Return Tuple(int, int, int, int)
start index of subject entity, end index of subject entity, start index of object entity and end index of object entity
-
get_type()
Get the entity types.
- Return Tuple(string, string)
entity type of subject and entity type of object
-
get_sent()
Get the tokenized sentence.
- Return Tuple(list, string)
tokenized sentence and relation
-
delete_field_at_indices(field, indices)
Delete the words at the given indices in the sentence.
- Parameters
field (string) – field to be operated on
indices (list) – a list of indices to be deleted
- Return dict
contains ‘token’, ‘subj’, ‘obj’ keys
-
insert_field_after_indices(field, indices, new_item)
Insert words after the given indices in the sentence.
- Parameters
field (string) – field to be operated on
indices (list) – a list of indices after which to insert
new_item (list) – list of items to be inserted
- Return dict
contains ‘token’, ‘subj’, ‘obj’ keys
-
insert_field_before_indices(field, indices, new_item)
Insert words before the given indices in the sentence.
- Parameters
field (string) – field to be operated on
indices (list) – a list of indices before which to insert
new_item (list) – list of items to be inserted
- Return dict
contains ‘token’, ‘subj’, ‘obj’ keys
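Inserting tokens into a relation-extraction sentence means the subject and object spans have to be shifted so that they still point at the same words. The sketch below shows one way to do that for a single insertion point. It is an illustration, not textflint's code: `insert_tokens` is a hypothetical helper, spans are assumed to be inclusive [start, end] pairs, and stretching a span when the insertion lands inside it is a design choice made here.

```python
def insert_tokens(tokens, spans, index, new_tokens, after=False):
    """Insert `new_tokens` before (or after) `index` and shift inclusive spans.

    `spans` maps a name such as 'subj' or 'obj' to an inclusive [start, end] pair.
    Returns the new token list and the adjusted spans; inputs are not modified.
    """
    pos = index + 1 if after else index          # list position of the insertion
    new = tokens[:pos] + list(new_tokens) + tokens[pos:]
    shift = len(new_tokens)

    shifted = {}
    for name, (start, end) in spans.items():
        if start >= pos:                         # span lies entirely after the insertion
            start, end = start + shift, end + shift
        elif end >= pos:                         # insertion falls inside the span
            end += shift
        shifted[name] = [start, end]
    return new, shifted


tokens = ["Marie", "Curie", "was", "born", "in", "Warsaw"]
spans = {"subj": [0, 1], "obj": [5, 5]}

# Insert a clause after the subject entity (after index 1).
clause = [",", "a", "physicist", ","]
new_tokens, new_spans = insert_tokens(tokens, spans, 1, clause, after=True)
print(" ".join(new_tokens))
print(new_spans)
```

In this example the subject span is unchanged and the object span moves right by the length of the inserted clause, so it still covers "Warsaw".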
-
-
class
textflint.generation_layer.transformation.RE.insert_clause.Transformation(**kwargs)
Bases: abc.ABC
An abstract class for transforming a sequence of text to produce a list of potential adversarial examples.
-
processor = <textflint.common.preprocess.en_processor.EnProcessor object>
-
transform(sample, n=1, field='x', **kwargs)
Transform a data sample into a list of Sample.
- Parameters
sample (Sample) – Data sample for augmentation.
n (int) – Maximum number of unique augmented outputs; default is 1.
field (str|list) – Indicates which fields to apply transformations to.
**kwargs (dict) – other auxiliary params.
- Returns
list of Sample
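The general pattern of an abstract base class whose public transform method delegates to a subclass hook can be sketched as follows. This is only an illustration under assumptions: `SketchTransformation`, its `_transform` hook, and `UppercaseFirstToken` are hypothetical names, samples are plain dicts rather than Sample objects, and the filtering of duplicates and unchanged outputs is a choice made for this sketch.

```python
from abc import ABC, abstractmethod


class SketchTransformation(ABC):
    """Hypothetical stand-in for an abstract transformation base class."""

    def transform(self, sample, n=1, field="x", **kwargs):
        # Delegate to the subclass, then keep at most n unique outputs
        # that actually differ from the input sample.
        candidates = self._transform(sample, n=n, field=field, **kwargs)
        unique = []
        for candidate in candidates:
            if candidate != sample and candidate not in unique:
                unique.append(candidate)
        return unique[:n]

    @abstractmethod
    def _transform(self, sample, n=1, field="x", **kwargs):
        """Return a list of candidate transformed samples."""


class UppercaseFirstToken(SketchTransformation):
    """Toy transformation: upper-case the first token of the chosen field."""

    def _transform(self, sample, n=1, field="x", **kwargs):
        tokens = sample[field]
        if not tokens:
            return []
        changed = dict(sample)
        changed[field] = [tokens[0].upper()] + tokens[1:]
        return [changed, changed]  # duplicate on purpose, removed by transform()


sample = {"x": ["hello", "world"], "y": "greeting"}
result = UppercaseFirstToken().transform(sample, n=1)
print(result)
```

Because the base class declares an abstract method, it cannot be instantiated directly; only concrete subclasses can.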
-