POS tagger in Python without NLTK
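I want to build a part-of-speech tagger for determiners and prepositions in Sorani Kurdish. I am using the code below to put a tag after each preposition or determiner in a Kurdish text.
import os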
SOR = open("SOR-1.txt", "r+", encoding='utf-8')
old_text = SOR.read()
punkt = [".", "!", ",", ":", ";"]
text = ""
for i in old_text:
    if i in punkt:
        text += " " + i
    else:
        text += i
d = {"DET":["????" , "????" , "???" , "???" , "?????" , "?????", "????" ], "PREP":["??","??","?????","??","????","?????","??????","?????","??????","??????","?????","?????","??","??","???","????","?????","???","??","??","???????","??????","???????","???????","????","???????","?????","?????","????","??????","??????","?????","???????","?????","?????","???","????????","?????","?????","???","?????","???","???","???","???","" ], "punkt":[".", ",", "!"]}
text = text.split()
for w in text:
    for pos in d:
        if w in d[pos]:
            SOR.write(w + "/" + pos + " ")
SOR.close()
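What I want is to add the POS tag in the text right after each word that appears in the dictionary defined above, but what I get instead is a separate list of words and POS tags at the end of the file.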
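A likely cause, offered as a reading of the code rather than a confirmed diagnosis: the file is opened in "r+" mode, so after read() the file position sits at the end and every write() lands there, and only the words found in the dictionary are written at all. One way to get inline tags is to rebuild the whole token sequence, tagging the matches and keeping every other token, and then write the result out in one go. The sketch below does that. The function names (separate_punctuation, tag_tokens, tag_text, tag_file) and the output path are hypothetical, and the demo lexicon uses English placeholder words because the Kurdish entries are not readable in the post.

```python
def separate_punctuation(text, marks=(".", "!", ",", ":", ";")):
    """Put a space before each punctuation mark so split() isolates it."""
    return "".join(" " + ch if ch in marks else ch for ch in text)


def tag_tokens(tokens, lexicon):
    """Return every token, with '/TAG' appended to those found in the lexicon."""
    # Invert the tag -> words mapping into word -> tag for direct lookup.
    word_to_tag = {}
    for tag, words in lexicon.items():
        for word in words:
            if word:  # ignore empty entries
                word_to_tag.setdefault(word, tag)
    return [
        tok + "/" + word_to_tag[tok] if tok in word_to_tag else tok
        for tok in tokens
    ]


def tag_text(text, lexicon):
    """Tag a whole text and return it as a single space-joined string."""
    tokens = separate_punctuation(text).split()
    return " ".join(tag_tokens(tokens, lexicon))


def tag_file(src_path, dst_path, lexicon):
    """Read src_path, tag its contents, and write the result to dst_path."""
    with open(src_path, encoding="utf-8") as src:
        text = src.read()
    # Opening with "w" replaces the destination content instead of appending.
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(tag_text(text, lexicon))


if __name__ == "__main__":
    # Placeholder lexicon (assumption): stands in for the Kurdish word lists.
    demo_lexicon = {"DET": ["the"], "PREP": ["in"], "punkt": [".", ",", "!"]}
    print(tag_text("the cat sat in the box.", demo_lexicon))
```

With the real dictionary d from the question, a call such as tag_file("SOR-1.txt", "SOR-1-tagged.txt", d) would leave the original file untouched and put the tagged text in a second file. Writing to a separate file is a design choice here; overwriting the source in place would also work but makes it harder to rerun the tagger.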