textflint.common.preprocess.cn_processor

CnProcessor Class

class textflint.common.preprocess.cn_processor.CnProcessor(*args, **kwargs)

Bases: object

Text processor class for Chinese text that implements tokenization, NER, and POS tagging.

static tokenize(sent)

Tokenize function.

Parameters

sent (str) – the sentence to be tokenized

Returns

list – the tokens in the sentence

get_ner(sentence)

NER function.

Parameters

sentence (str) – the sentence to run NER on

Returns

two forms of tags. The first is the triple form (tags, start, end). The second is the list form, which marks the NER label of each character, such as 周小明去玩 ['Nh', 'Nh', 'Nh', 'O', 'O'].
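How the two forms relate can be sketched in plain Python. This is an illustration only, not textflint's implementation: `triples_to_labels` is a hypothetical helper, and it assumes that `start` and `end` are character offsets with `end` inclusive, which this page does not state.

```python
def triples_to_labels(length, triples, outside="O"):
    """Expand (tag, start, end) triples into one label per position.

    Assumption: start and end are offsets into the sentence and end is
    inclusive. Use range(start, end) instead if end is exclusive.
    """
    labels = [outside] * length
    for tag, start, end in triples:
        for i in range(start, end + 1):
            labels[i] = tag
    return labels


sentence = "周小明去玩"
triples = [("Nh", 0, 2)]  # hand-written input for illustration, not real output
print(triples_to_labels(len(sentence), triples))
```

Under these assumptions, the triple `("Nh", 0, 2)` expands to the list form given for 周小明去玩.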

get_pos_tag(sentence)

POS tagging function.

Parameters

sentence (str) – the sentence to be POS-tagged

Returns

the triple form (tags, start, end)
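When inspecting triples of this shape, it can help to pair each tag with the text it covers. The sketch below is a hedged illustration, not part of textflint: `attach_text` is a hypothetical helper, the tag names are placeholders, and the code assumes `start` and `end` are character offsets with `end` inclusive.

```python
def attach_text(sentence, triples):
    """Pair each tag with the substring its (start, end) offsets cover.

    Assumption: offsets index characters of the sentence and end is
    inclusive, hence the end + 1 in the slice.
    """
    return [(tag, sentence[start:end + 1]) for tag, start, end in triples]


sentence = "周小明去玩"
# Hand-written triples for illustration; real tagger output may differ.
triples = [("nh", 0, 2), ("v", 3, 3), ("v", 4, 4)]
for tag, text in attach_text(sentence, triples):
    print(tag, text)
```

If the offsets turn out to be token indices or `end` is exclusive, only the slice expression needs to change.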