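Automatically translating video subtitles
One day, while watching a movie, I found that it only came with English subtitles. After a long search I could not find suitable Chinese subtitles, so I decided to build bilingual Chinese-English subtitles myself. The initial design had three steps: 1. extract the embedded English subtitles; 2. get a translation of the subtitles; 3. write the translation into the subtitle file. In the end I successfully produced bilingual subtitles.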
1. Extract the embedded English subtitles
Install and configure ffmpeg:
https://www.ffmpeg.org/download.html
After installing it, call it from Python; the get_srt function in main.py at the end of this post does this.
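The extraction step comes down to a single ffmpeg invocation. Below is a minimal sketch of how it might be wrapped, assuming ffmpeg is on the PATH and the embedded subtitles are a text-based stream (bitmap subtitle formats cannot be converted to .srt this way). The function names and the stream_index parameter are illustrative and are not part of the original script.

```python
import subprocess


def build_extract_command(video_path, srt_path, ffmpeg="ffmpeg", stream_index=0):
    """Build the ffmpeg argument list that writes one subtitle stream to a file."""
    return [
        ffmpeg,
        "-y",                            # overwrite the output file if it exists
        "-i", video_path,                # input video
        "-map", f"0:s:{stream_index}",   # Nth subtitle stream of input 0
        srt_path,                        # output format follows the .srt extension
    ]


def extract_subtitles(video_path, srt_path, **kwargs):
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_extract_command(video_path, srt_path, **kwargs), check=True)
```

Passing the arguments as a list rather than one concatenated string means file names with spaces need no extra quoting.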
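2. Get the translation of the subtitles
The subtitle text is too large for the free translation APIs, so I chose to drive Baidu Translate through selenium instead. For how to set up selenium, see: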
https://blog.csdn.net/tk1023/article/details/109078613
Once it is configured, wrap it in a Python class; see the Browser class in trans.py at the end of this post.
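The translated text only appears after the page has finished rendering, so the wrapper has to poll for it. The helper below is a hedged sketch of that polling idea on its own, independent of selenium: wait_for_text and its parameters are hypothetical names, and read_text stands for any zero-argument callable that returns the current result text.

```python
import time


def wait_for_text(read_text, timeout=10.0, interval=1.0, sleep=time.sleep):
    """Call read_text() until it returns non-blank text or the timeout passes.

    Exceptions raised by read_text are treated as "not ready yet".
    Returns the stripped text, or '' if nothing appeared in time.
    """
    waited = 0.0
    while True:
        try:
            text = read_text()
        except Exception:
            text = ""
        if text and text.strip():
            return text.strip()
        if waited >= timeout:
            return ""
        sleep(interval)
        waited += interval
```

With selenium, read_text could be something like `lambda: driver.find_element(By.XPATH, xpath).text`. Selenium's own WebDriverWait is an alternative to a hand-written loop.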
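3. Write the translation into the subtitle file
This step is fairly simple: write a Python script that follows the layout of the subtitle text (index line, timing line, text lines, blank line).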
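As an illustration of this step, here is a small sketch that splits the SRT text into cues and emits each cue twice, once with the English text and once with its translation. It takes a different route from the line-by-line state machine in main.py (it splits on blank lines), and parse_srt, build_bilingual_srt and the translate callback are hypothetical names used only for this example.

```python
def parse_srt(text):
    """Split SRT text into (index, timing, text_lines) cues."""
    cues = []
    cleaned = text.lstrip("\ufeff").replace("\r\n", "\n").strip()
    for block in cleaned.split("\n\n"):
        lines = [ln for ln in block.split("\n") if ln.strip()]
        if len(lines) >= 3:  # index + timing + at least one text line
            cues.append((lines[0].strip(), lines[1].strip(), lines[2:]))
    return cues


def build_bilingual_srt(text, translate):
    """Return SRT text in which every cue is followed by a translated copy.

    translate is any callable mapping an English string to its translation.
    """
    out = []
    for index, timing, lines in parse_srt(text):
        english = " ".join(ln.strip() for ln in lines)
        out.append("\n".join([index, timing, *lines]))
        out.append("\n".join([index, timing, translate(english)]))
    return "\n\n".join(out) + "\n"
```

The translated cue reuses the index and timing of the original, so a player shows both lines at the same time. Multi-line English text is joined into one sentence before translation so that the translator sees the full context.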
Result
Original subtitles:
1
00:00:55,255 --> 00:00:57,557
My planet
Arrakis is so beautiful

2
00:00:57,624 --> 00:00:59,192
when the sun is low.
Generated subtitles:
1
00:00:55,255 --> 00:00:57,557
My planet
Arrakis is so beautiful

1
00:00:55,255 --> 00:00:57,557
我的星球阿拉基斯是如此美麗

2
00:00:57,624 --> 00:00:59,192
when the sun is low.

2
00:00:57,624 --> 00:00:59,192
當太陽低的時候。
Full code
trans.py:
import time

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By


class Browser(object):
    def __init__(self, xls_name='https://fanyi.baidu.com/?aldtype=16047#auto/zh'):
        self.xls_name = xls_name
        self.browser = webdriver.Edge()

    def tran(self, text):
        # Open the translation page with the English text in the URL
        self.browser.get("https://fanyi.baidu.com/?aldtype=16047#en/zh/" + text)
        result = ''
        for i in range(10):
            time.sleep(1)
            try:
                b = self.browser.find_element(By.XPATH, "/html/body/div[1]/div[3]/div/div/div[1]/div[2]/div[1]/div[2]/div/div/div[1]/p[2]/span")
            except NoSuchElementException:
                continue  # result not rendered yet, try again
            result = b.text  # read the translation result
            if result.strip() != '':
                break
        return result

main.py:

import subprocess

from trans import Browser


def get_srt(file, outfile):
    # Extract the first subtitle stream of the video into outfile
    ffmpeg = "ffmpeg-2021-10-21-git-2aa343bb6f-full_build\\bin\\ffmpeg"
    subprocess.run([ffmpeg, '-i', file, '-map', '0:s:0', outfile])


def add_eng(src, dst):
    state = 'null'
    box1 = []  # original cue: index, timing, English lines
    box2 = []  # translated cue: index, timing, Chinese line
    eng = ''
    b = Browser()
    with open(src, encoding='utf-8-sig') as f:
        for line in f.readlines():
            if state == 'null':
                # this line is the cue index
                state = 'get_count'
                box1.append(line)
                box2.append(line)
            elif state == 'get_count':
                # this line is the timing
                state = 'get_time'
                box1.append(line)
                box2.append(line)
            elif state == 'get_time':
                if len(line.strip()) > 0:
                    # text line: collect it for translation
                    eng += line.strip() + ' '
                    box1.append(line)
                else:
                    # blank line ends the cue: translate and write both cues
                    state = 'null'
                    try:
                        cn = b.tran(eng)
                    except Exception:
                        cn = ''
                    box2.append(cn)
                    with open(dst, 'a+', encoding='utf-8') as frr:
                        for i in box1:
                            frr.write(i)
                        frr.write('\n')
                        for i in box2:
                            frr.write(i)
                        frr.write('\n\n')
                    box1 = []
                    box2 = []
                    eng = ''


file = r'xxx.mkv'
mid = 'subs.srt'
dst = 'new.srt'

get_srt(file, mid)
add_eng(mid, dst)