How to install NLTK in Python: installation and basic usage

Published: 2024/1/23 python 30 豆豆

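Several third-party libraries are available for natural language processing (NLP) in Python:

·NLTK (the Natural Language Toolkit for Python) is used for tasks such as tokenization, lemmatization, stemming, parsing, and POS tagging. The library has tools for almost every NLP task.

·spaCy is NLTK's main competitor. The two libraries can be used for the same tasks.

·Scikit-learn provides a large library for machine learning, and also includes tools for text preprocessing.

·Gensim is a toolkit for topic modeling, vector space modeling, and document similarity.

·Pattern serves primarily as a web mining module, so it supports NLP only as a secondary task.

·Polyglot is another Python toolkit for NLP. It is less popular, but it can also be used for a variety of NLP tasks.

We start with NLTK.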

1. Installing NLTK

NLTK is installed the same way as other third-party Python libraries. Run this on the command line: pip install nltk
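To confirm that the install landed in the interpreter you are actually using, a small check like the one below can help. It is only a sketch: the helper name installed_version is made up for illustration, and it relies solely on the standard library (importlib.metadata needs Python 3.8 or later).

```python
import importlib.util
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it cannot be imported.

    `installed_version` is a hypothetical helper, not part of NLTK.
    """
    if importlib.util.find_spec(package) is None:
        return None
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata is registered under this name.
        return "unknown"


version = installed_version("nltk")
if version is None:
    print("nltk is not installed; run: pip install nltk")
else:
    print("nltk is installed, version:", version)
```

If this reports that nltk is missing right after a successful pip install, pip and python probably belong to different environments.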

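2. It won't run?

Once the installation has finished, you may want to try the following code, which splits a piece of English text into tokens:

import nltk
text = nltk.word_tokenize("PierreVinken , 59 years old , will join as a nonexecutive director on Nov. 29 .")
print(text)

Running it fails with the following error: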

...
    raise LookupError(resource_not_found)
LookupError:
**********************************************************************
  Resource punkt not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('punkt')

  For more information see: https://www.nltk.org/data.html

  Attempted to load tokenizers/punkt/english.pickle

  Searched in:
    - 'C:\\Users\\Administrator/nltk_data'
    - 'C:\\Users\\Administrator\\Desktop\\meatwice\\venv\\nltk_data'
    - 'C:\\Users\\Administrator\\Desktop\\meatwice\\venv\\share\\nltk_data'
    - 'C:\\Users\\Administrator\\Desktop\\meatwice\\venv\\lib\\nltk_data'
    - 'C:\\Users\\Administrator\\AppData\\Roaming\\nltk_data'
    - 'C:\\nltk_data'
    - 'D:\\nltk_data'
    - 'E:\\nltk_data'
    - ''
**********************************************************************
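The directories listed under "Searched in" come from the list nltk.data.path. If the data files already exist somewhere else on the machine, one option is to add that directory to the list instead of downloading again. The sketch below assumes this setup; the helper name add_nltk_data_dir and the example directory are hypothetical.

```python
def add_nltk_data_dir(directory):
    """Add `directory` to NLTK's data search path if it is not already listed.

    `add_nltk_data_dir` is a hypothetical helper, not part of NLTK.
    Returns a copy of the resulting search path.
    """
    import nltk  # deferred so that defining the helper does not require NLTK

    if directory not in nltk.data.path:
        nltk.data.path.append(directory)
    return list(nltk.data.path)


# Example call (the directory is a placeholder; use your own location):
# add_nltk_data_dir(r"D:\shared\nltk_data")
```

The change only lasts for the current Python process, so the call has to run before the first tokenizer call in each script.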

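3. Solution

There is no need to worry: the exception message already says how to fix this.

Open the Python interactive shell from the command line and run:

import nltk
nltk.download()

A window opens. Click Models, find punkt, and double-click it to download it.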
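On a machine without a graphical desktop, or inside a script, the downloader window is inconvenient. The exception message itself suggests nltk.download('punkt'), which downloads the resource by name without opening a window. A minimal sketch that downloads only when the resource is missing could look like this; the helper name ensure_resource is made up for illustration, and newer NLTK releases may name a different resource (for example punkt_tab) in the error message, in which case the names from that message should be used instead.

```python
def ensure_resource(resource_path, package_id):
    """Download `package_id` only if `resource_path` is not found locally.

    `ensure_resource` is a hypothetical helper, not part of NLTK.
    Returns True when the resource is available afterwards.
    """
    import nltk  # deferred so that defining the helper does not require NLTK

    try:
        # nltk.data.find raises LookupError when the resource is missing.
        nltk.data.find(resource_path)
    except LookupError:
        # nltk.download returns True on success, False otherwise.
        return bool(nltk.download(package_id))
    return True


# Example call, using the names from the error message above:
# ensure_resource("tokenizers/punkt", "punkt")
```

The download needs network access the first time; later calls find the local copy and return immediately.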

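Then run the Python code from the beginning again to tokenize the text:

import nltk
text = nltk.word_tokenize("PierreVinken , 59 years old , will join as a nonexecutive director on Nov. 29 .")
print(text)

This time it runs without the error and prints the list of tokens.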

4. Basic usage of NLTK

The code above was one simple example of using NLTK. The sections below look at its tokenizers in more detail.

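4.1 Splitting text into sentences: sent_tokenize()

from nltk.tokenize import sent_tokenize
text = "Welcome readers. I hope you find it interesting. Please do reply."
print(sent_tokenize(text))

The text is split at sentence-ending punctuation, and the result is a list of sentences.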

4.2 Splitting sentences into words: word_tokenize()

from nltk.tokenize import word_tokenize
text = "Welcome readers. I hope you find it interesting. Please do reply."
print(word_tokenize(text))

The text is split into individual words, with punctuation marks as separate tokens.
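To get a feel for what word tokenization does, the effect on this simple sentence can be approximated with a regular expression. This is not NLTK's algorithm, only a rough standard-library stand-in; the function name simple_word_tokenize is made up for illustration.

```python
import re


def simple_word_tokenize(text):
    """Rough stand-in for a word tokenizer (hypothetical, not NLTK's method).

    Matches either a run of word characters or a single punctuation mark.
    """
    return re.findall(r"\w+|[^\w\s]", text)


text = "Welcome readers. I hope you find it interesting. Please do reply."
tokens = simple_word_tokenize(text)
print(tokens)
# ['Welcome', 'readers', '.', 'I', 'hope', 'you', 'find', 'it',
#  'interesting', '.', 'Please', 'do', 'reply', '.']
```

The approximation breaks down quickly on real text: it splits "don't" into "don", "'" and "t", whereas NLTK's tokenizers have dedicated rules for contractions and abbreviations. That is the main reason to use the library rather than a one-line pattern.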

4.3.1 Tokenizing with TreebankWordTokenizer

from nltk.tokenize import TreebankWordTokenizer
tokenizer = TreebankWordTokenizer()
# Adjacent string literals are joined by Python, so the long text can span several lines.
print(tokenizer.tokenize(
    "What is Love? I know this question exists in each human being's mind including myself. "
    "If not it is still waiting to be discovered deeply in your heart. What do I think of love? "
    "For me, I believe love is a priceless diamond, because a diamond has thousands of reflections, "
    "and each reflection represents a meaning of love."
))

This also splits the text into individual words.
