textflint.generation_layer.transformation.DP.add_sub_tree

Add a subtree to the sentence.

class textflint.generation_layer.transformation.DP.add_sub_tree.AddSubTree(**kwargs)[source]

Bases: textflint.generation_layer.transformation.transformation.Transformation

Transforms the input sentence by adding a subordinate clause from WikiData.

Example::

original: “And it left mixed signals for London.”

transformed: “And it left mixed signals for London, which is a capital and largest city of the United Kingdom.”
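The effect in the example above can be illustrated at the string level. This is only a sketch: the real transformation works on a dependency-parsed sample, and `append_clause` is a hypothetical helper, not part of textflint.

```python
def append_clause(sentence: str, entity: str, clause: str) -> str:
    """Insert ', <clause>' right after the first occurrence of `entity`.

    Hypothetical helper for illustration; it ignores tokenization and
    dependency labels, which the real transformation has to maintain.
    """
    start = sentence.find(entity)
    if start == -1:
        return sentence  # entity not present: leave the sentence unchanged
    end = start + len(entity)
    rest = sentence[end:]
    # Close the clause with a comma unless punctuation or the end follows.
    closing = "" if (not rest or rest[0] in ".,;:!?") else ","
    return f"{sentence[:end]}, {clause}{closing}{rest}"


result = append_clause(
    "And it left mixed signals for London.",
    "London",
    "which is a capital and largest city of the United Kingdom",
)
print(result)
```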

search_list(query)[source]

Search on Wikidata for the associated entries with the given query.

Parameters

query (str) – The words of the entity name, joined by ‘%20’.

Returns

A list of the information of entries searched.
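As an illustration of the expected query format, the sketch below joins the words of an entity name with ‘%20’. `build_query` is a hypothetical helper; the standard library's `urllib.parse.quote` is used so that other unsafe characters are also percent-encoded, which is an assumption about what the caller wants.

```python
from urllib.parse import quote


def build_query(words):
    """Join entity words with '%20', percent-encoding each word first."""
    return "%20".join(quote(word, safe="") for word in words)


query = build_query(["United", "Kingdom"])
print(query)
```

For plain ASCII words this gives the same string as `quote("United Kingdom")`, since `quote` encodes a space as `%20`.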

clause_generate(entity_id)[source]

Generate a subordinate clause for the given Wikidata entry.

Parameters

entity_id (str) – The ID of the given Wikidata entry.

Returns

The subordinate clause generated.

get_clause(query)[source]

Generate a subordinate clause for the given query.

Parameters

query (str) – The words of the entity name, joined by ‘%20’.

Returns

The subordinate clause generated.
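One plausible reading of how the three methods fit together is that get_clause searches for entries and then generates a clause for a matching entry. The sketch below shows that composition with stand-in callables. The function name, the `"id"` key on each entry, and the "first non-empty clause wins" rule are all assumptions made for illustration, not the documented behaviour.

```python
from typing import Callable, Mapping, Optional, Sequence


def get_clause_sketch(
    query: str,
    search: Callable[[str], Sequence[Mapping[str, str]]],
    generate: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return the first non-empty clause produced for the search results."""
    for entry in search(query):
        clause = generate(entry["id"])  # assumed: entries carry an "id" field
        if clause:
            return clause
    return None


# Stand-ins for search_list and clause_generate (no network access).
def fake_search(query):
    return [{"id": "Q0"}, {"id": "Q84"}] if query == "London" else []


def fake_generate(entity_id):
    return {"Q84": "which is a capital city"}.get(entity_id)


print(get_clause_sketch("London", fake_search, fake_generate))
```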

find_entity(sample)[source]

Find an entity in the sentence.

Parameters

sample (DPSample) – The sample whose sentence is searched for entities.

Returns

A list of entities, ordered from longest to shortest.
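The "long to short" ordering can be sketched as a sort on entity length. Representing each entity as a list of words is an assumption here; the actual return type is not specified above.

```python
def order_long_to_short(entities):
    """Sort entities by word count, longest first (ties keep input order)."""
    return sorted(entities, key=len, reverse=True)


candidates = [["London"], ["United", "Kingdom"], ["New", "York", "City"]]
ordered = order_long_to_short(candidates)
print(ordered)
```

Trying the longest candidate first means a multi-word name is considered before any shorter name contained in it.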

class textflint.generation_layer.transformation.DP.add_sub_tree.Client(base_url: str = 'https://www.wikidata.org/', opener: Optional[urllib.request.OpenerDirector] = None, datavalue_decoder: Optional[Union[Decoder, Callable[[Client, str, Mapping[str, object]], object]]] = None, entity_type_guess: bool = True, cache_policy: wikidata.cache.CachePolicy = <wikidata.cache.NullCachePolicy object>, repr_string: Optional[str] = None)[source]

Bases: object

Wikidata client session.

Parameters
  • base_url (str) – The base URL of Wikidata. WIKIDATA_BASE_URL is used by default.

  • opener (urllib.request.OpenerDirector) – The opener for urllib.request. If omitted or None the default opener is used.

  • entity_type_guess (bool) – Whether to guess the type of an Entity from its id, which saves HTTP requests. True by default.

  • cache_policy – A caching policy for API calls. No cache (NullCachePolicy) by default.

New in version 0.5.0: The cache_policy option.

Changed in version 0.3.0: The meaning of base_url parameter changed. It originally meant https://www.wikidata.org/wiki/ which contained the trailing path wiki/, but now it means only https://www.wikidata.org/.

New in version 0.2.0: The entity_type_guess option.

entity_type_guess = True

(bool) Whether to guess the type of an Entity from its id, which saves HTTP requests.

New in version 0.2.0.

cache_policy = <wikidata.cache.NullCachePolicy object>

(CachePolicy) A caching policy for API calls.

New in version 0.5.0.

get(entity_id: EntityId, load: bool = False) → wikidata.entity.Entity[source]

Get a Wikidata entity by its EntityId.

Parameters
  • entity_id – The id of the Entity to find.

  • load (bool) – Eager loading on True. Lazy loading (False) by default.

Returns

The found entity.

Return type

Entity

New in version 0.3.0: The load option.
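The difference between lazy and eager loading can be shown with a small standalone sketch. `LazyRecord`, `get_record`, and the `fetch` callable are hypothetical names used only to illustrate the pattern; they are not the wikidata library's classes.

```python
class LazyRecord:
    """Holds an id and fetches its data only when first needed."""

    def __init__(self, record_id, fetch):
        self.id = record_id
        self._fetch = fetch
        self._loaded = False
        self._data = None

    def load(self):
        # Fetch once; later calls are no-ops.
        if not self._loaded:
            self._data = self._fetch(self.id)
            self._loaded = True

    @property
    def data(self):
        self.load()
        return self._data


def get_record(record_id, fetch, load=False):
    """Return a record; fetch immediately only when load is True."""
    record = LazyRecord(record_id, fetch)
    if load:
        record.load()  # eager loading
    return record
```

With the default `load=False`, no fetch happens until `data` is first read; with `load=True`, the fetch happens before the record is returned.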

guess_entity_type(entity_id: EntityId) → Optional[wikidata.entity.EntityType][source]

Guess EntityType from the given EntityId. It could return None when it fails to guess.

Note

It always fails to guess when entity_type_guess is configured to False.

Returns

The guessed EntityType, or None if it fails to guess.

Return type

Optional[EntityType]

New in version 0.2.0.
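Guessing a type from an id without an HTTP request can be sketched as follows. The sketch relies on the Wikidata convention that item ids start with Q and property ids start with P; `EntityKind` and `guess_kind` are hypothetical names, and whether the library uses exactly this rule is an assumption.

```python
import enum
from typing import Optional


class EntityKind(enum.Enum):
    item = "item"
    property = "property"


def guess_kind(entity_id: str, enabled: bool = True) -> Optional[EntityKind]:
    """Guess the kind from the id prefix; None means 'could not guess'."""
    if not enabled or not entity_id:
        return None  # guessing is disabled, or there is nothing to inspect
    prefix = entity_id[0]
    if prefix == "Q":
        return EntityKind.item
    if prefix == "P":
        return EntityKind.property
    return None
```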

decode_datavalue(datatype: str, datavalue: Mapping[str, object]) → object[source]

Decode the given datavalue using the configured datavalue_decoder.

New in version 0.3.0.

class textflint.generation_layer.transformation.DP.add_sub_tree.Entity(id: EntityId, client: Client)[source]

Bases: collections.abc.Mapping, collections.abc.Hashable

Wikidata entity. Can be an item or a property. Its attributes can be lazily loaded.

To get an entity use Client.get() method instead of the constructor of Entity.

Note

Although it implements Mapping[EntityId, object], it is actually a multidict. See also the getlist() method.

Changed in version 0.2.0: Implemented Mapping[EntityId, object] protocol for easy access of statement values.

Changed in version 0.2.0: Implemented Hashable protocol and == / != operators for equality tests.

state

(EntityState) The loading state.

New in version 0.7.0.

label

Define accessor to a multilingual attribute of entity.

description

Define accessor to a multilingual attribute of entity.

getlist(key: wikidata.entity.Entity) → Sequence[object][source]

Return all values associated to the given key property in sequence.

Parameters

key (Entity) – The property entity.

Returns

A sequence of all values associated to the given key property. It can be empty if nothing is associated to the property.

Return type

Sequence[object]

lists() → Sequence[Tuple[wikidata.entity.Entity, Sequence[object]]][source]

Similar to items(), except that each returned pair holds the full list of values rather than a single value.

Returns

The pairs of (key, values) where values is a sequence.

Return type

Sequence[Tuple[Entity, Sequence[object]]]
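The multidict behaviour described for getlist() and lists() can be sketched with a standalone Mapping. `MultiValueMapping` is a hypothetical class using string keys; returning the first value from plain item access is an assumption about how a single value is chosen.

```python
from collections.abc import Mapping


class MultiValueMapping(Mapping):
    """A mapping whose keys may each carry several values."""

    def __init__(self, data):
        # Keep only keys that have at least one value.
        self._data = {k: list(v) for k, v in data.items() if v}

    def __getitem__(self, key):
        return self._data[key][0]  # assumed: plain access yields the first value

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def getlist(self, key):
        """All values for the key; empty if nothing is associated."""
        return list(self._data.get(key, []))

    def lists(self):
        """Pairs of (key, values), where values is a list."""
        return [(k, list(v)) for k, v in self._data.items()]


statements = MultiValueMapping({"P31": ["city", "capital"], "P17": ["United Kingdom"]})
print(statements["P31"], statements.getlist("P31"))
```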

property type

(EntityType) The type of entity, item or property.

New in version 0.2.0.

exception textflint.generation_layer.transformation.DP.add_sub_tree.FlintError[source]

Bases: RuntimeError

Default error thrown by textflint functions. FlintError is raised when no more specific error type is given.

class textflint.generation_layer.transformation.DP.add_sub_tree.Transformation(**kwargs)[source]

Bases: abc.ABC

An abstract class for transforming a sequence of text to produce a list of potential adversarial examples.

processor = <textflint.common.preprocess.en_processor.EnProcessor object>

transform(sample, n=1, field='x', **kwargs)[source]

Transform data sample to a list of Sample.

Parameters
  • sample (Sample) – Data sample for augmentation.

  • n (int) – Maximum number of unique augmented outputs. The signature above gives a default of 1.

  • field (str|list) – Indicate which fields to apply transformations.

  • **kwargs (dict) – Other auxiliary parameters.

Returns

list of Sample
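The contract described above (an abstract base whose transform returns at most n unique results) can be sketched as follows. `TransformationSketch`, its `_candidates` hook, and the use of plain strings in place of Sample objects are all assumptions made for illustration; this is not textflint's implementation.

```python
from abc import ABC, abstractmethod


class TransformationSketch(ABC):
    """Base class: subclasses propose candidates, transform() caps them at n."""

    def transform(self, sample, n=1, **kwargs):
        candidates = self._candidates(sample, **kwargs)
        unique = list(dict.fromkeys(candidates))  # drop duplicates, keep order
        return unique[:n]

    @abstractmethod
    def _candidates(self, sample, **kwargs):
        """Return a list of transformed candidates for the sample."""


class ChangeCase(TransformationSketch):
    def _candidates(self, sample, **kwargs):
        return [sample.upper(), sample.upper(), sample.title()]


print(ChangeCase().transform("hello world", n=5))
```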

classmethod sample_num(x, num)[source]

Get ‘num’ samples from x.

Parameters
  • x (list) – list to sample

  • num (int) – sample number

Returns

At most ‘num’ unique samples.
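A minimal sketch of drawing at most ‘num’ unique samples from a list is shown below. `sample_at_most` and its `seed` parameter are hypothetical, and de-duplicating the input before sampling is an assumption about what "unique" means here; the items must be hashable for this version to work.

```python
import random


def sample_at_most(x, num, seed=None):
    """Return up to `num` distinct items drawn at random from x."""
    rng = random.Random(seed)
    unique = list(dict.fromkeys(x))  # remove duplicates, keep first occurrences
    count = max(0, min(num, len(unique)))  # never ask for more than exist
    return rng.sample(unique, count)


picked = sample_at_most(["a", "b", "b", "c"], 2, seed=0)
print(picked)
```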