Automating PDFs in Python
Modules used:
In this script, we will use the PyPDF2 module, which provides functions to read a PDF file, extract its data, split it, and write a new file.
Download PyPDF2:
General way: pip install PyPDF2
PyCharm users: Open the Python project interpreter settings and install it from there.
Functions provided by PyPDF2:
PyPDF2.PdfFileReader(): Reads our PDF and returns a reader object, which we store in a variable (here, Pdf_Data).
Pdf_Data.isEncrypted: An attribute that tells us whether the PDF file is encrypted.
Pdf_Data.decrypt("<password>"): Decrypts the PDF file using the password passed as the argument.
Pdf_Data.numPages: An attribute holding the number of pages the PDF contains.
Pdf_Data.getPage(0): Returns the first page. Pages are numbered like Python indexes, so 0 is the first page and 1 is the second.
Pdf_Writer = PyPDF2.PdfFileWriter(): Creates a writer object that we use to build a new PDF file.
Pdf_Writer.addPage(<page data>): Adds a page to the new PDF being built by the writer.
Note: Text extraction works only on PDF files that actually contain text; a scanned PDF made only of images, for example, yields no text.
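The list above mentions isEncrypted and decrypt(), but the scripts in this article never call them. The following is a minimal sketch of how the two might be combined, not a definitive implementation. The helper name unlock is hypothetical, and it assumes the same legacy PyPDF2 names used in this article (PyPDF2 releases before 3.0; later releases renamed them, for example PdfReader and is_encrypted).

```python
def unlock(reader, password):
    """Make sure an already-opened PDF reader can be read.

    `reader` is assumed to be a PyPDF2.PdfFileReader (legacy API).
    `unlock` is a hypothetical helper name, not part of PyPDF2.
    Returns True if the document is readable afterwards, False if the
    password was rejected.
    """
    # Unencrypted files need no password at all.
    if not reader.isEncrypted:
        return True

    # decrypt() returns 0 when the password does not match,
    # and a non-zero value when it does.
    return reader.decrypt(password) != 0
```

With a real file, this could be called as unlock(Pdf_Data, "<password>") right after creating Pdf_Data with PyPDF2.PdfFileReader(), and before calling getPage().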
Python code to read the file and extract the text
# import the module
import PyPDF2

# open the file in binary read mode
Pdf_Open = open("/home/abhinav/Downloads/CS_Defination-converted.pdf", "rb")

# read the file and store the content
Pdf_Data = PyPDF2.PdfFileReader(Pdf_Open)

# get the number of pages
print(Pdf_Data.numPages)

# extract the data for the first page;
# getPage uses 0 for the 1st page
First_page = Pdf_Data.getPage(0)

# print the text
print(First_page.extractText())

Output:
The script prints the number of pages, followed by the text extracted from the first page of the PDF we gave as input. In this way, we can extract text from a PDF.
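The script above reads only the first page. As a rough sketch, the same calls can be put in a loop to collect the text of every page. The function name extract_all_text is hypothetical, and the reader is assumed to be a legacy PyPDF2.PdfFileReader like Pdf_Data above.

```python
def extract_all_text(reader):
    """Return a list holding the extracted text of every page, in order.

    `reader` is assumed to be a PyPDF2.PdfFileReader (legacy API).
    `extract_all_text` is a hypothetical helper name, not part of PyPDF2.
    """
    texts = []
    # Page indexes start at 0 and stop at numPages - 1.
    for index in range(reader.numPages):
        page = reader.getPage(index)
        texts.append(page.extractText())
    return texts
```

Calling extract_all_text(Pdf_Data) would return one string per page; pages without a text layer are expected to come back as empty strings.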
Now we will create a new PDF file and add the first and last pages of an existing PDF to it.
Let's see the code:
# import the module
import PyPDF2

# open the file in binary read mode
Pdf_Open = open("/home/abhinav/Downloads/Abhinav_Gangrade.pdf", "rb")

# read the file and store the content
Pdf_Data = PyPDF2.PdfFileReader(Pdf_Open)

# get the number of pages
print(Pdf_Data.numPages)

# create a pdf writer
pdf_writer = PyPDF2.PdfFileWriter()

# take the first page of the above pdf
first_page = Pdf_Data.getPage(0)

# take the last page of the above pdf;
# its index is the total number of pages - 1
last_page = Pdf_Data.getPage(Pdf_Data.numPages - 1)

# add the pages to the new pdf
pdf_writer.addPage(first_page)
pdf_writer.addPage(last_page)

# create a blank file
New_pdf = open("/home/abhinav/Downloads/Hello.pdf", "wb")

# write the content to the blank file
pdf_writer.write(New_pdf)

# now close the files
New_pdf.close()
Pdf_Open.close()

The code above creates a new PDF from an existing one: it takes the first and last pages of the existing PDF, combines them, and writes them to the new file. In this way, we can build a PDF out of the pages of existing PDFs.
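The first-and-last-page script generalizes to any selection of pages. The sketch below is one possible way to do that, assuming the same legacy reader and writer objects as above. The name copy_pages and its support for negative indexes are illustrative choices, not something PyPDF2 provides.

```python
def copy_pages(reader, writer, indices):
    """Append the pages of `reader` listed in `indices` to `writer`.

    `reader` and `writer` are assumed to be PyPDF2.PdfFileReader and
    PyPDF2.PdfFileWriter objects (legacy API). `copy_pages` is a
    hypothetical helper name. Negative indexes count from the end,
    so -1 means the last page.
    """
    total = reader.numPages
    for index in indices:
        # Translate a negative index into its 0-based position.
        position = index + total if index < 0 else index
        if not 0 <= position < total:
            raise IndexError("page index %d out of range" % index)
        writer.addPage(reader.getPage(position))
    return writer
```

With the objects from the script above, copy_pages(Pdf_Data, pdf_writer, [0, -1]) would add the same first and last pages, after which pdf_writer.write(New_pdf) saves the result as before.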
Translated from: https://www.includehelp.com/python/automating-pdfs.aspx