textflint.input_layer.component.field.text_field
Text Field Class
A helper class that represents an input string to be modified.
-
class textflint.input_layer.component.field.text_field.TextField(field_value, mask=None, is_one_sent=False, split_by_space=False, **kwargs)
Bases: textflint.input_layer.component.field.field.Field
A helper class that represents an input string to be modified.
TextField holds the text that a Sample parsed from a data set contains, and provides multiple methods for the Sample to modify it. It supports sentence-level and word-level modification; the word-level API is the default.
-
text_processor = <textflint.common.preprocess.en_processor.EnProcessor object>
-
__init__(field_value, mask=None, is_one_sent=False, split_by_space=False, **kwargs)
- Parameters
field_value (str|list) – sentence string or list of tokenized words.
mask (list) – list of mask values.
is_one_sent (bool) – whether the input is a single sentence.
split_by_space (bool) – whether to tokenize the sentence by splitting on spaces.
kwargs –
-
pos_of_word_index(desired_word_idx)
Get the POS tag of the word at the given index.
- Parameters
desired_word_idx (int) – index of the word whose POS tag is wanted.
- Returns
POS tag of the word at desired_word_idx.
-
replace_at_indices(indices, new_items)
Replace words at indices and set their mask to MODIFIED_MASK.
- Parameters
indices ([int|list|slice]) –
- each index can be an int, which replaces a single item; a list of such indices looks like [1, 2, 3].
- each index can be a list or tuple like (0, 3), which replaces the items from 0 up to but not including 3; a list of such indices looks like [(0, 3), (5, 6)].
- each index can be a slice, which is converted to a list.
new_items ([str|list|tuple]) – replacement items, one per entry in indices.
- Returns
Replaced TextField object.
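The index forms above can be illustrated on a plain token list. This is a minimal sketch of the documented behaviour, not textflint's implementation: `to_span`, `replace_spans`, and the numeric value of `MODIFIED_MASK` are assumptions made for the example, and slices are assumed to have step 1.

```python
MODIFIED_MASK = 2  # hypothetical value; textflint defines its own constant


def to_span(index, length):
    """Normalize an int, a (start, end) pair, or a slice to (start, end)."""
    if isinstance(index, int):
        return index, index + 1
    if isinstance(index, slice):
        start, stop, _step = index.indices(length)  # step assumed to be 1
        return start, stop
    start, end = index  # list or tuple such as (0, 3)
    return start, end


def replace_spans(tokens, mask, indices, new_items):
    """Return new token and mask lists with each span replaced."""
    tokens, mask = list(tokens), list(mask)
    pairs = [(to_span(i, len(tokens)), item) for i, item in zip(indices, new_items)]
    # Work from right to left so earlier spans keep their positions.
    for (start, end), item in sorted(pairs, key=lambda p: p[0][0], reverse=True):
        item = [item] if isinstance(item, str) else list(item)
        tokens[start:end] = item
        mask[start:end] = [MODIFIED_MASK] * len(item)
    return tokens, mask


tokens = ["I", "like", "green", "apples", "a", "lot"]
mask = [0] * len(tokens)
new_tokens, new_mask = replace_spans(
    tokens, mask, [1, (2, 4)], ["love", ["red", "pears"]]
)
print(new_tokens)  # ['I', 'love', 'red', 'pears', 'a', 'lot']
print(new_mask)    # [0, 2, 2, 2, 0, 0]
```

Applying the spans from right to left is one simple way to keep the remaining indices valid when a replacement changes the length of the list.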
-
replace_at_index(index, new_items)
Replace words at index and set their mask to MODIFIED_MASK.
- Parameters
index (int|list|slice) –
- can be an int, which replaces a single item, or a list of ints like [1, 2, 3].
- can be a list or tuple like (0, 3), which replaces the items from 0 up to but not including 3, or a list of such ranges like [(0, 3), (5, 6)].
- can be a slice, which is converted to a list.
new_items (str|list|tuple) – replacement items for index.
- Returns
Replaced TextField object.
-
delete_at_indices(indices)
Delete words at indices and remove their mask values.
- Parameters
indices ([int|list|slice]) –
- each index can be an int, which deletes a single item; a list of such indices looks like [1, 2, 3].
- each index can be a list or tuple like (0, 3), which deletes the items from 0 up to but not including 3; a list of such indices looks like [(0, 3), (5, 6)].
- each index can be a slice, which is converted to a list.
- Returns
Modified TextField object.
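Deletion has to keep the token list and the mask list aligned. The sketch below shows one way to do that on plain lists; `delete_spans` is a hypothetical helper written for illustration, not part of textflint, and the mask values are arbitrary.

```python
def delete_spans(tokens, mask, spans):
    """Drop every token (and its mask value) whose position is in a span."""
    doomed = set()
    for span in spans:
        if isinstance(span, int):
            doomed.add(span)
        elif isinstance(span, slice):
            doomed.update(range(*span.indices(len(tokens))))
        else:
            start, end = span  # (start, end), end not included
            doomed.update(range(start, end))
    keep = [i for i in range(len(tokens)) if i not in doomed]
    return [tokens[i] for i in keep], [mask[i] for i in keep]


tokens = ["She", "really", "really", "likes", "tea", "."]
mask = [0, 0, 2, 0, 0, 0]
kept_tokens, kept_mask = delete_spans(tokens, mask, [1, (4, 5)])
print(kept_tokens)  # ['She', 'really', 'likes', '.']
print(kept_mask)    # [0, 2, 0, 0]
```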
-
delete_at_index(index)
Delete words at index and remove their mask values.
- Parameters
index (int|list|slice) –
- can be an int, which deletes a single item, or a list of ints like [1, 2, 3].
- can be a list or tuple like (0, 3), which deletes the items from 0 up to but not including 3, or a list of such ranges like [(0, 3), (5, 6)].
- can be a slice, which is converted to a list.
- Returns
Modified TextField object.
-
insert_before_indices(indices, new_items)
Insert words before indices.
- Parameters
indices ([int]) –
- each index can be an int, which marks a single position; a list of such indices looks like [1, 2, 3].
- each index can be a list or tuple like (0, 3), which marks the positions from 0 up to but not including 3; a list of such indices looks like [(0, 3), (5, 6)].
- each index can be a slice, which is converted to a list.
new_items ([str|list|tuple]) – items to insert, one per entry in indices.
- Returns
New TextField object.
-
insert_before_index(index, new_items)
Insert words before index and remove their mask value.
- Parameters
index (int) –
- can be an int, which marks a single position, or a list of ints like [1, 2, 3].
- can be a list or tuple like (0, 3), which marks the positions from 0 up to but not including 3, or a list of such ranges like [(0, 3), (5, 6)].
- can be a slice, which is converted to a list.
new_items (str|list|tuple) – items to insert at index.
- Returns
New TextField object.
-
insert_after_indices(indices, new_items)
Insert words after indices.
- Parameters
indices ([int]) –
- each index can be an int, which marks a single position; a list of such indices looks like [1, 2, 3].
- each index can be a list or tuple like (0, 3), which marks the positions from 0 up to but not including 3; a list of such indices looks like [(0, 3), (5, 6)].
- each index can be a slice, which is converted to a list.
new_items ([str|list|tuple]) – items to insert, one per entry in indices.
- Returns
New TextField object.
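The difference between inserting before and after an index comes down to a one-position offset. A minimal sketch on a plain token list, assuming int indices only; `insert_at` and its `after` flag are hypothetical names for this example and do not belong to textflint.

```python
def insert_at(tokens, indices, new_items, after=False):
    """Insert new_items[i] before (or after) position indices[i]."""
    result = list(tokens)
    # Process the largest index first so smaller indices stay valid.
    pairs = sorted(zip(indices, new_items), key=lambda p: p[0], reverse=True)
    for index, item in pairs:
        item = [item] if isinstance(item, str) else list(item)
        pos = index + 1 if after else index
        result[pos:pos] = item
    return result


tokens = ["the", "cat", "sat"]
before = insert_at(tokens, [1, 2], ["black", ["quietly"]])
after = insert_at(tokens, [0, 2], [["very", "old"], "down"], after=True)
print(before)  # ['the', 'black', 'cat', 'quietly', 'sat']
print(after)   # ['the', 'very', 'old', 'cat', 'sat', 'down']
```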
-
insert_after_index(index, new_items)
Insert words after index and remove their mask value.
- Parameters
index (int) –
- can be an int, which marks a single position, or a list of ints like [1, 2, 3].
- can be a list or tuple like (0, 3), which marks the positions from 0 up to but not including 3, or a list of such ranges like [(0, 3), (5, 6)].
- can be a slice, which is converted to a list.
new_items (str|list|tuple) – items to insert at index.
- Returns
New TextField object.
-
swap_at_index(first_index, second_index)
Swap the items at first_index and second_index of origin_list.
- Parameters
first_index (int) – index of the first item.
second_index (int) – index of the second item.
- Returns
Modified TextField object.
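On a plain list the swap is a single tuple assignment. The helper below is a hypothetical illustration that returns a copy rather than mutating its input; it is not textflint's code.

```python
def swap_tokens(tokens, first_index, second_index):
    """Return a copy of tokens with the two positions exchanged."""
    swapped = list(tokens)
    swapped[first_index], swapped[second_index] = (
        swapped[second_index],
        swapped[first_index],
    )
    return swapped


original = ["dogs", "chase", "cats"]
result = swap_tokens(original, 0, 2)
print(result)    # ['cats', 'chase', 'dogs']
print(original)  # ['dogs', 'chase', 'cats']
```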
-
property pos_tagging
Get POS tags.
Example:
given sentence 'All things in their being are good for something.'
>> [('All', 'DT'), ('things', 'NNS'), ('in', 'IN'), ('their', 'PRP$'), ('being', 'VBG'), ('are', 'VBP'), ('good', 'JJ'), ('for', 'IN'), ('something', 'NN'), ('.', '.')]
- Returns
A list of (token, POS tag) tuples.
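A list of (token, tag) pairs in this shape is easy to filter or index. The sketch below works on the example output copied from above rather than calling textflint; the variable names are chosen for illustration.

```python
# Example output copied from the pos_tagging documentation.
tagged = [
    ("All", "DT"), ("things", "NNS"), ("in", "IN"), ("their", "PRP$"),
    ("being", "VBG"), ("are", "VBP"), ("good", "JJ"), ("for", "IN"),
    ("something", "NN"), (".", "."),
]

# Noun tags in this tag set start with "NN" (NN, NNS, ...).
nouns = [token for token, tag in tagged if tag.startswith("NN")]
print(nouns)  # ['things', 'something']

# Looking up the tag of one position, which is what a per-index
# accessor such as pos_of_word_index would be expected to return.
tag_of_fifth = tagged[4][1]
print(tag_of_fifth)  # VBG
```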
-
property ner
Get NER tags.
Example:
given sentence 'Lionel Messi is a football player from Argentina.'
>> [('Lionel Messi', 0, 2, 'PERSON'), ('Argentina', 7, 8, 'LOCATION')]
- Returns
A list of tuples, (entity, start, end, label)
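Judging from the example, start and end look like token offsets with an exclusive end. The sketch below checks that reading against a hand-written tokenization of the sentence; the token list is an assumption, since the real tokens come from the text processor.

```python
# Assumed tokenization of the example sentence.
tokens = ["Lionel", "Messi", "is", "a", "football", "player",
          "from", "Argentina", "."]
# Example output copied from the ner documentation.
entities = [("Lionel Messi", 0, 2, "PERSON"), ("Argentina", 7, 8, "LOCATION")]

# Slice the tokens with each (start, end) pair and group them by label.
by_label = {label: tokens[start:end] for _, start, end, label in entities}
print(by_label)
# {'PERSON': ['Lionel', 'Messi'], 'LOCATION': ['Argentina']}

# Under this reading, joining the slice reproduces the entity text.
matches = [" ".join(tokens[s:e]) == text for text, s, e, _ in entities]
print(matches)  # [True, True]
```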
-
property dependency_parsing
Dependency parsing.
Example:
given sentence: 'The quick brown fox jumps over the lazy dog.'
>>
The DT 4 det
quick JJ 4 amod
brown JJ 4 amod
fox NN 5 nsubj
jumps VBZ 0 root
over IN 9 case
the DT 9 det
lazy JJ 9 amod
dog NN 5 obl
- Returns
A list of tuples, (token, pos, target, type)
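In the example, target appears to be the 1-based position of the head token, with 0 marking the root. Under that assumption, the sketch below rebuilds the (token, pos, target, type) tuples from the example text and resolves each head by name; it does not call textflint.

```python
# Example output copied from the dependency_parsing documentation.
flat = (
    "The DT 4 det quick JJ 4 amod brown JJ 4 amod fox NN 5 nsubj "
    "jumps VBZ 0 root over IN 9 case the DT 9 det lazy JJ 9 amod "
    "dog NN 5 obl"
).split()

# Group every four fields into one (token, pos, target, type) tuple.
rows = [
    (flat[i], flat[i + 1], int(flat[i + 2]), flat[i + 3])
    for i in range(0, len(flat), 4)
]

# The root is the token whose target is 0.
root = next(token for token, _, target, _ in rows if target == 0)
print(root)  # jumps

# Resolve each 1-based target to the head token's text.
arcs = [
    (token, rows[target - 1][0] if target else "ROOT", rel)
    for token, _, target, rel in rows
]
print(arcs[3])  # ('fox', 'jumps', 'nsubj')
print(arcs[5])  # ('over', 'dog', 'case')
```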
-