Implementing resumable uploads to a GCS bucket in Python: uploading a file in chunks
Environment: Python 3.6
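I have a question about upload speed when using resumable uploads to Google Cloud Storage. I wrote a Python client that uploads large files to GCS (it has some special features, which is why gsutil is not suitable for my company). In tests run about two months ago it made good use of the available bandwidth, reaching about 20 Mbps on a 25 Mbps connection. The project was then frozen for nearly two months. Now that it has been reopened, the same client uploads very slowly, at about 1.4 Mbps on the same 25 Mbps connection. I wrote a simple Python script to check whether it hits the same problem; it is slightly faster, but still only about 2 Mbps. The gsutil tool performs almost the same as my Python script. I also ran the test on a different network with more than 50 Mbps of upload bandwidth, and there the results were very good.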
Reference: https://googleapis.dev/python/google-resumable-media/latest/resumable_media/requests.html#resumable-uploads
import io
import os

import google.auth
import google.auth.transport.requests as tr_requests
from google.resumable_media.requests import ResumableUpload

# Uploading needs write access; a read-only scope is not sufficient.
rw_scope = 'https://www.googleapis.com/auth/devstorage.read_write'
credentials, _ = google.auth.default(scopes=(rw_scope,))
transport = tr_requests.AuthorizedSession(credentials)

bucket_name = 'xxxxxxx'  # bucket name
csvfile_name = 'xxxxxxxxxxxxxxxxxxxx'  # path of the file to upload
blob_name = csvfile_name  # object name in the bucket (assumed to match the local path)
content_type = 'text/csv'  # assumed from the variable name; adjust to the real file type

url_template = (
    'https://www.googleapis.com/upload/storage/v1/b/{bucket}/o?'
    'uploadType=resumable')
upload_url = url_template.format(bucket=bucket_name)

# Size of each chunk; it must be a multiple of 256 KiB.
chunk_size = 1024 * 1024 * 33  # 33 MB

upload = ResumableUpload(upload_url, chunk_size)

with open(csvfile_name, 'rb') as stream:
    # Start the resumable session; no file data is sent yet.
    metadata = {'name': blob_name}
    response = upload.initiate(transport, stream, metadata, content_type)
    print(response)
    print(upload.resumable_url == response.headers['Location'])
    print(upload.total_bytes == os.path.getsize(csvfile_name))
    upload_id = response.headers['X-GUploader-UploadID']
    print(upload_id)
    print(upload.resumable_url == upload_url + '&upload_id=' + upload_id)

    # Send the file one chunk at a time. For example, a 50 MB file with
    # 33 MB chunks needs two calls to transmit_next_chunk().
    while not upload.finished:
        response = upload.transmit_next_chunk(transport)
        print(response, upload.bytes_uploaded, upload.finished)

# The response to the final chunk carries the metadata of the new object.
print(upload.bytes_uploaded == upload.total_bytes)
json_response = response.json()
print(json_response['bucket'] == bucket_name)
print(json_response['name'] == blob_name)
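The comment about a 50 MB file needing two 33 MB chunks can be checked without touching the network. The following is a minimal sketch, not part of the original client: `plan_chunks` and `CHUNK_ALIGNMENT` are hypothetical names introduced here for illustration. It lists the byte ranges that each `transmit_next_chunk()` call would cover, and it rejects chunk sizes that are not a multiple of 256 KiB, which is the alignment GCS requires for resumable upload chunks.

```python
CHUNK_ALIGNMENT = 256 * 1024  # resumable upload chunks must be a multiple of 256 KiB


def plan_chunks(total_bytes, chunk_size):
    """Return a list of (start, end) byte offsets, end exclusive, one per chunk.

    Hypothetical helper for illustration only; it does not call any GCS API.
    """
    if chunk_size <= 0 or chunk_size % CHUNK_ALIGNMENT != 0:
        raise ValueError(
            'chunk_size must be a positive multiple of {} bytes'.format(CHUNK_ALIGNMENT))
    if total_bytes < 0:
        raise ValueError('total_bytes must not be negative')
    # The last chunk is simply whatever remains, so it may be shorter.
    return [(start, min(start + chunk_size, total_bytes))
            for start in range(0, total_bytes, chunk_size)]


if __name__ == '__main__':
    mb = 1024 * 1024
    for start, end in plan_chunks(50 * mb, 33 * mb):
        print('bytes {}-{} ({} bytes)'.format(start, end - 1, end - start))
```

With a 50 MB file and 33 MB chunks this yields two ranges, a full 33 MB chunk followed by a 17 MB remainder, which matches the two transmit calls described above. An empty file yields no ranges, so that case would need separate handling.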