Python Development [Chapter 5]: Common Modules
I. Module Introduction:
1. Module definition
A module is used to logically organize Python code (variables, functions, classes; the logic that implements one piece of functionality); essentially it is just a Python file ending in .py.
Categories: built-in modules, open-source (third-party) modules, and custom modules.
2. Importing modules
Essence: importing a module essentially means interpreting that Python file once; importing a package essentially means running the __init__.py file under that package once.
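A minimal sketch of this point (the file names my_module.py and my_pkg are made up): every top-level statement in a module runs the first time the module is imported, and a package's __init__.py runs when the package is imported.

# my_module.py
print("my_module is being imported")   # runs once, at import time
VERSION = "1.0"

# my_pkg/__init__.py
print("initializing package my_pkg")   # runs when the package is imported

# main.py
import my_module          # prints: my_module is being imported
import my_module          # prints nothing: the module is already cached in sys.modules
import my_pkg             # prints: initializing package my_pkg
print(my_module.VERSION)  # 1.0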
① Importing modules in the same directory
# import between modules in the same directory
import module_name                   # import a module directly
import module_name, module2_name     # import several modules; use them as module_name.function_name
from module_name import *            # import all functions, variables, etc. from the module; not recommended
from module_name import m1, m2, m3   # import only m1, m2, m3 from the module; use them directly as m1, m2, m3
from module_name import m1 as m      # import m1 from module_name and rebind it to the name m; use it directly as m

② Importing modules from a different directory
# import across directories, current file: main.py
# Directory structure:
# ├── Credit_card
# │   ├── core
# │   │   ├── __init__.py
# │   │   └── main.py      # current file
# │   └── conf
# │       ├── __init__.py
# │       ├── settings.py
# │       └── lzl.py
import sys, os

creditcard_path = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))  # absolute path two levels above this file, i.e. the Credit_card directory
sys.path.insert(0, creditcard_path)   # add the Credit_card directory to the module search path
print(sys.path)                       # print the search path
# ['C:\\Users\\L\\PycharmProjects\\s14\\Day5\\Creditcard', .......]

# import settings                     # cannot be imported directly from here
# ImportError: No module named 'settings'
from conf import settings             # from <package> import <module>
settings.set()                        # call the function defined in settings
# in the settings

③ Chained imports across different directories
Multiple modules in different directories importing one another: why introduce this concept? The teacher did not cover it, but it matters a lot; it was a huge pitfall when writing the ATM program........
Directory structure:
├── Credit_card
│   ├── core
│   │   ├── __init__.py
│   │   └── main.py      # current file
│   └── conf
│       ├── __init__.py
│       ├── settings.py
│       └── lzl.py

Files in the conf directory:
lzl.py:

#!/usr/bin/env python
# -*- coding:utf-8 -*-
# -Author-Lian
# current file: lzl.py
def name():
    print("name is lzl")

settings.py (calls the lzl.py module):

import lzl              # import the lzl module

def set():
    print("in the settings")
    lzl.name()           # call the function defined in lzl

set()                    # run set()
# in the settings
# name is lzl

Running settings.py at this point works without any problem; it is simply importing between modules in the same directory. Now comes the key question: what happens when main.py in the core directory imports the settings module?!
Files in the core directory:
# chained import across directories, current file: main.py
import sys, os

creditcard_path = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))  # absolute path two levels above this file, i.e. the Credit_card directory
sys.path.insert(0, creditcard_path)   # add the Credit_card directory to the module search path

from conf import settings
settings.set()                        # call the function defined in settings
# import lzl                          # import the lzl module
# ImportError: No module named 'lzl'

You can see it fails immediately: ImportError: No module named 'lzl'. Think about why the error occurs. As said earlier, importing a module essentially executes its contents once, so when main.py imports the settings module, everything inside settings is executed as well, including import lzl; but from main.py's point of view lzl cannot be imported directly, which produces the error above. So how can this be solved?!
Modify settings.py in the conf directory:
# current file: settings, calls the lzl.py module
from . import lzl        # import the lzl module via a relative import

def set():
    print("in the settings")
    lzl.name()           # call the function defined in lzl

set()                    # run set()
# in the settings
# name is lzl

Now run main.py:
# chained import across directories, current file: main.py
import sys, os

creditcard_path = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))  # absolute path two levels above this file, i.e. the Credit_card directory
sys.path.insert(0, creditcard_path)   # add the Credit_card directory to the module search path

from conf import settings
settings.set()                        # call the function defined in settings
# in the settings
# name is lzl
# in the settings
# name is lzl

No error at all. We only changed how settings imports the lzl module, yet the result is completely different. (The output appears twice because set() runs once when the settings module is imported and once more when main.py calls settings.set().) Here from . import lzl is a relative import inside the package, and that is exactly where relative imports shine.
④ Multiple modules in different directories importing each other, via the package path

Directory structure:
Day5
├── Credit_card
│   ├── README.md
│   ├── core
│   │   ├── __init__.py
│   │   └── main.py      # current file
│   └── conf
│       ├── __init__.py
│       ├── settings.py
│       └── lzl.py

Files in the conf directory:
lzl.py:

#!/usr/bin/env python
# -*- coding:utf-8 -*-
# -Author-Lian
# current file: lzl.py
def name():
    print("name is lzl")

settings.py (calls the lzl.py module):

#!/usr/bin/env python
# -*- coding:utf-8 -*-
# -Author-Lian
# current file: settings, calls the lzl.py module
from . import lzl        # import the lzl module via a relative import

def set():
    print("in the settings")
    lzl.name()           # call the function defined in lzl

set()                    # run set()
# in the settings
# name is lzl

Files in the core directory:
# chained import across directories, current file: main.py, via the package path
from Day5.Credit_card.conf import settings

settings.set()           # call the function defined in settings
# in the settings
# name is lzl
# in the settings
# name is lzl

lzl.py and settings.py are unchanged; main.py drops the whole tedious sys.path manipulation and simply runs from Day5.Credit_card.conf import settings, importing through the package path, which is much cleaner and more convenient!
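One caveat, as a minimal sketch: an import through the package path only resolves if the directory containing Day5 is on the module search path, for example because the IDE adds the project root automatically. The path below is the project root seen in the earlier sys.path output and is only illustrative.

# run main.py from the directory that contains Day5, or add that directory yourself:
import sys
sys.path.insert(0, r"C:\Users\L\PycharmProjects\s14")   # illustrative project root that contains Day5

from Day5.Credit_card.conf import settings
settings.set()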
II. Built-in Modules
1. The time and datetime modules
Time-related operations. Time can be represented in three ways:
- Timestamp: seconds since January 1, 1970, e.g. time.time()
- Formatted string: 2014-11-11 11:11, e.g. time.strftime('%Y-%m-%d')
- Structured time: a time.struct_time tuple containing the year, day, weekday, and so on, e.g. time.localtime()
The time module:
# time module
import time

print(time.time())                    # timestamp
# 1472037866.0750718

print(time.localtime())               # structured time (local)
# time.struct_time(tm_year=2016, tm_mon=8, tm_mday=25, tm_hour=8, tm_min=44, tm_sec=46, tm_wday=3, tm_yday=238, tm_isdst=0)

print(time.strftime('%Y-%m-%d'))      # formatted string
# 2016-08-25
print(time.strftime('%Y-%m-%d', time.localtime()))
# 2016-08-25

print(time.gmtime())                  # structured time (UTC)
# time.struct_time(tm_year=2016, tm_mon=8, tm_mday=25, tm_hour=3, tm_min=8, tm_sec=48, tm_wday=3, tm_yday=238, tm_isdst=0)

print(time.strptime('2014-11-11', '%Y-%m-%d'))   # structured time parsed from a string
# time.struct_time(tm_year=2014, tm_mon=11, tm_mday=11, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=1, tm_yday=315, tm_isdst=-1)

print(time.asctime())                 # Thu Aug 25 11:15:10 2016
print(time.asctime(time.localtime())) # Thu Aug 25 11:15:10 2016
print(time.ctime(time.time()))        # Thu Aug 25 11:15:10 2016

Structured time (struct_time) fields:
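The original post showed the struct_time fields as an image; they can also be inspected directly, as in this small sketch:

import time

t = time.localtime()
print(t.tm_year, t.tm_mon, t.tm_mday)    # year, month (1-12), day of month (1-31)
print(t.tm_hour, t.tm_min, t.tm_sec)     # hour (0-23), minute (0-59), second
print(t.tm_wday, t.tm_yday, t.tm_isdst)  # weekday (0 = Monday), day of year (1-366), DST flag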
Converting between timestamp, formatted string, and structured time:
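A minimal sketch of these conversions (the original showed them as a diagram):

import time

struct = time.localtime(time.time())             # timestamp        -> struct_time
stamp = time.mktime(struct)                      # struct_time      -> timestamp
s = time.strftime('%Y-%m-%d %H:%M:%S', struct)   # struct_time      -> formatted string
struct2 = time.strptime(s, '%Y-%m-%d %H:%M:%S')  # formatted string -> struct_time

print(stamp)   # e.g. 1472037866.0
print(s)       # e.g. 2016-08-25 08:44:46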
datetime:
import datetime

print(datetime.date)       # the class representing a date; common attributes: year, month, day
# <class 'datetime.date'>
print(datetime.time)       # the class representing a time; common attributes: hour, minute, second, microsecond
# <class 'datetime.time'>
print(datetime.datetime)   # the class representing a date and time
# <class 'datetime.datetime'>
print(datetime.timedelta)  # the class representing a time interval, i.e. the length between two points in time
# <class 'datetime.timedelta'>

print(datetime.datetime.now())                               # 2016-08-25 14:21:07.722285
print(datetime.datetime.now() - datetime.timedelta(days=5))  # 2016-08-20 14:21:28.275460

More: https://zhuanlan.zhihu.com/p/23679915?utm_source=itdadao&utm_medium=referral
Time comparison:

import time

str1 = '2017-03-26 3:12'
str2 = '2017-05-26 13:12'
date1 = time.strptime(str1, '%Y-%m-%d %H:%M')
date2 = time.strptime(str2, '%Y-%m-%d %H:%M')
if time.mktime(date1) <= time.time() <= time.mktime(date2):
    print('cccccccc')

import datetime

str1 = '2017-03-26 3:12'
str2 = '2017-05-26 13:12'
date1 = datetime.datetime.strptime(str1, '%Y-%m-%d %H:%M')
date2 = datetime.datetime.strptime(str2, '%Y-%m-%d %H:%M')
datenow = datetime.datetime.now()
if datenow < date1:
    print('dddddd')
2. The random module
Generating random numbers:
# random module
import random

print(random.random())            # random float in [0, 1)
# 0.7308387398872364

print(random.randint(1, 3))       # random integer from 1 to 3 (inclusive)
# 3

print(random.randrange(1, 3))     # random integer from 1 to 2; 3 is excluded
# 2

print(random.choice("hello"))     # pick one random character from the string
# e

print(random.sample("hello", 2))  # pick a given number of random characters
# ['l', 'h']

items = [1, 2, 3, 4, 5, 6, 7]
random.shuffle(items)             # shuffle the list in place
print(items)
# [2, 3, 1, 6, 4, 7, 5]

Verification code:
import random

checkcode = ''
for i in range(4):
    current = random.randrange(0, 4)
    if current != i:
        temp = chr(random.randint(65, 90))   # a random uppercase letter A-Z
    else:
        temp = random.randint(0, 9)          # a random digit 0-9
    checkcode += str(temp)
print(checkcode)
# 51T6
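The same idea can also be written with a character pool; a hedged alternative sketch (not from the original) using the string module:

import random
import string

pool = string.ascii_uppercase + string.digits               # candidate characters: A-Z and 0-9
checkcode = ''.join(random.choice(pool) for _ in range(4))  # pick 4 characters at random
print(checkcode)   # e.g. Q7K2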
3. The os module
Provides operating-system-level operations.
# os module
import os

os.getcwd()                          # get the current working directory, i.e. the directory the Python script runs in
os.chdir("dirname")                  # change the current working directory; like cd in a shell
os.curdir                            # the current directory: '.'
os.pardir                            # the parent-directory string: '..'
os.makedirs('dirname1/dirname2')     # create a multi-level (recursive) directory tree
os.removedirs('dirname1')            # if the directory is empty, remove it, then recurse upward removing empty parents
os.mkdir('dirname')                  # create a single directory; like mkdir dirname in a shell
os.rmdir('dirname')                  # remove a single empty directory; raises an error if it is not empty; like rmdir dirname
os.listdir('dirname')                # list all files and subdirectories in the given directory, including hidden ones, as a list
os.remove()                          # delete a file
os.rename("oldname", "newname")      # rename a file/directory
os.stat('path/filename')             # get file/directory information
os.sep                               # the OS-specific path separator: '\\' on Windows, '/' on Linux
os.linesep                           # the line terminator of the current platform: '\r\n' on Windows, '\n' on Linux
os.pathsep                           # the separator used in path lists such as PATH
os.name                              # string identifying the platform: Windows -> 'nt', Linux -> 'posix'
os.system("bash command")            # run a shell command; the output is shown directly, the return value is the exit status
os.environ                           # the system environment variables
os.path.abspath(path)                # return the normalized absolute version of path
os.path.split(path)                  # split path into a (directory, filename) tuple
os.path.dirname(path)                # return the directory part of path, i.e. the first element of os.path.split(path)
os.path.basename(path)               # return the final component of path; empty if path ends with / or \; the second element of os.path.split(path)
os.path.exists(path)                 # True if path exists, otherwise False
os.path.isabs(path)                  # True if path is an absolute path
os.path.isfile(path)                 # True if path is an existing file, otherwise False
os.path.isdir(path)                  # True if path is an existing directory, otherwise False
os.path.join(path1[, path2[, ...]])  # join multiple path components; components before the last absolute path are discarded
os.path.getatime(path)               # return the last access time of the file or directory path points to
os.path.getmtime(path)               # return the last modification time of the file or directory path points to
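A small sketch combining a few of these calls (the directory name demo_dir is made up):

import os

base = os.getcwd()                                    # current working directory
target = os.path.join(base, "demo_dir", "sub")        # build a platform-independent path
if not os.path.exists(target):
    os.makedirs(target)                               # create the directory tree if it does not exist yet
print(os.listdir(os.path.join(base, "demo_dir")))     # ['sub']
print(os.path.isdir(target), os.path.isfile(target))  # True False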
4. The sys module
Provides interpreter-related operations.
# sys module
import sys

sys.argv           # list of command-line arguments; the first element is the path of the program itself
sys.exit(n)        # exit the program; exit(0) for a normal exit
sys.version        # version information of the Python interpreter
sys.maxint         # largest int value (Python 2 only; in Python 3 use sys.maxsize)
sys.path           # module search path, initialized from the PYTHONPATH environment variable
sys.platform       # name of the operating-system platform

sys.stdout.write('please:')
val = sys.stdin.readline()[:-1]

Details: http://www.cnblogs.com/lianzhilei/p/5724847.html
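A minimal sys.argv sketch (the script name argv_demo.py and its arguments are made up):

# argv_demo.py
import sys

print(sys.argv)      # e.g. ['argv_demo.py', 'start', '8080'] when run as: python argv_demo.py start 8080
if len(sys.argv) < 2:
    sys.exit(1)      # exit with a non-zero status when no action is given
action = sys.argv[1]
print("action:", action)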
5. The shutil module
A high-level module for handling files, directories, and archives.
① shutil.copyfileobj: copy the contents of one file object into another; copying only part of the content is possible.
The standard-library implementation (quoted in the original; Python 2 era source):

def copyfileobj(fsrc, fdst, length=16*1024):
    """copy data from file-like object fsrc to file-like object fdst"""
    while 1:
        buf = fsrc.read(length)
        if not buf:
            break
        fdst.write(buf)

Usage:

# shutil file-object copy
import shutil

f1 = open("fsrc", encoding="utf-8")
f2 = open("fdst", "w", encoding="utf-8")
shutil.copyfileobj(f1, f2)    # copy the contents of f1 into f2
② shutil.copyfile: copy a file.

def copyfile(src, dst):
    """Copy data from src to dst"""
    if _samefile(src, dst):
        raise Error("`%s` and `%s` are the same file" % (src, dst))

    for fn in [src, dst]:
        try:
            st = os.stat(fn)
        except OSError:
            # File most likely does not exist
            pass
        else:
            # XXX What about other special files? (sockets, devices...)
            if stat.S_ISFIFO(st.st_mode):
                raise SpecialFileError("`%s` is a named pipe" % fn)

    with open(src, 'rb') as fsrc:
        with open(dst, 'wb') as fdst:
            copyfileobj(fsrc, fdst)

Usage:

# shutil.copyfile: copy a file
import shutil

shutil.copyfile("f1", "f2")   # copy the contents of file f1 into file f2
③ shutil.copymode(src, dst): copy only the permission bits; content, group, and owner stay unchanged.

def copymode(src, dst):
    """Copy mode bits from src to dst"""
    if hasattr(os, 'chmod'):
        st = os.stat(src)
        mode = stat.S_IMODE(st.st_mode)
        os.chmod(dst, mode)
④ shutil.copystat(src, dst): copy the stat information: mode bits, atime, mtime, flags.

def copystat(src, dst):
    """Copy all stat info (mode bits, atime, mtime, flags) from src to dst"""
    st = os.stat(src)
    mode = stat.S_IMODE(st.st_mode)
    if hasattr(os, 'utime'):
        os.utime(dst, (st.st_atime, st.st_mtime))
    if hasattr(os, 'chmod'):
        os.chmod(dst, mode)
    if hasattr(os, 'chflags') and hasattr(st, 'st_flags'):
        try:
            os.chflags(dst, st.st_flags)
        except OSError, why:
            for err in 'EOPNOTSUPP', 'ENOTSUP':
                if hasattr(errno, err) and why.errno == getattr(errno, err):
                    break
            else:
                raise
⑤ shutil.copy(src, dst): copy the file data and the permission mode.

def copy(src, dst):
    """Copy data and mode bits ("cp src dst").

    The destination may be a directory.

    """
    if os.path.isdir(dst):
        dst = os.path.join(dst, os.path.basename(src))
    copyfile(src, dst)
    copymode(src, dst)
⑥ shutil.copy2(src, dst): copy the file data and the stat information.

def copy2(src, dst):
    """Copy data and all stat info ("cp -p src dst").

    The destination may be a directory.

    """
    if os.path.isdir(dst):
        dst = os.path.join(dst, os.path.basename(src))
    copyfile(src, dst)
    copystat(src, dst)
⑦ shutil.copytree(src, dst, symlinks=False, ignore=None): recursively copy a directory tree (multiple levels of directories).

def ignore_patterns(*patterns):
    """Function that can be used as copytree() ignore parameter.

    Patterns is a sequence of glob-style patterns
    that are used to exclude files"""
    def _ignore_patterns(path, names):
        ignored_names = []
        for pattern in patterns:
            ignored_names.extend(fnmatch.filter(names, pattern))
        return set(ignored_names)
    return _ignore_patterns

def copytree(src, dst, symlinks=False, ignore=None):
    """Recursively copy a directory tree using copy2().

    The destination directory must not already exist.
    If exception(s) occur, an Error is raised with a list of reasons.

    If the optional symlinks flag is true, symbolic links in the
    source tree result in symbolic links in the destination tree; if
    it is false, the contents of the files pointed to by symbolic
    links are copied.

    The optional ignore argument is a callable. If given, it
    is called with the `src` parameter, which is the directory
    being visited by copytree(), and `names` which is the list of
    `src` contents, as returned by os.listdir():

        callable(src, names) -> ignored_names

    Since copytree() is called recursively, the callable will be
    called once for each directory that is copied. It returns a
    list of names relative to the `src` directory that should
    not be copied.

    XXX Consider this example code rather than the ultimate tool.

    """
    names = os.listdir(src)
    if ignore is not None:
        ignored_names = ignore(src, names)
    else:
        ignored_names = set()

    os.makedirs(dst)
    errors = []
    for name in names:
        if name in ignored_names:
            continue
        srcname = os.path.join(src, name)
        dstname = os.path.join(dst, name)
        try:
            if symlinks and os.path.islink(srcname):
                linkto = os.readlink(srcname)
                os.symlink(linkto, dstname)
            elif os.path.isdir(srcname):
                copytree(srcname, dstname, symlinks, ignore)
            else:
                # Will raise a SpecialFileError for unsupported file types
                copy2(srcname, dstname)
        # catch the Error from the recursive copytree so that we can
        # continue with other files
        except Error, err:
            errors.extend(err.args[0])
        except EnvironmentError, why:
            errors.append((srcname, dstname, str(why)))
    try:
        copystat(src, dst)
    except OSError, why:
        if WindowsError is not None and isinstance(why, WindowsError):
            # Copying file access times may fail on Windows
            pass
        else:
            errors.append((src, dst, str(why)))
    if errors:
        raise Error, errors
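A hedged usage sketch (the directory names src_dir and dst_dir are made up); shutil.ignore_patterns builds an ignore callable from glob-style patterns:

import shutil

# copy src_dir into dst_dir (dst_dir must not exist yet), skipping *.pyc files and tmp* names
shutil.copytree("src_dir", "dst_dir", ignore=shutil.ignore_patterns("*.pyc", "tmp*"))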
⑧ shutil.rmtree(path[, ignore_errors[, onerror]]): recursively delete a directory tree.

def rmtree(path, ignore_errors=False, onerror=None):
    """Recursively delete a directory tree.

    If ignore_errors is set, errors are ignored; otherwise, if onerror
    is set, it is called to handle the error with arguments (func,
    path, exc_info) where func is os.listdir, os.remove, or os.rmdir;
    path is the argument to that function that caused it to fail; and
    exc_info is a tuple returned by sys.exc_info().  If ignore_errors
    is false and onerror is None, an exception is raised.

    """
    if ignore_errors:
        def onerror(*args):
            pass
    elif onerror is None:
        def onerror(*args):
            raise
    try:
        if os.path.islink(path):
            # symlinks to directories are forbidden, see bug #1669
            raise OSError("Cannot call rmtree on a symbolic link")
    except OSError:
        onerror(os.path.islink, path, sys.exc_info())
        # can't continue even if onerror hook returns
        return
    names = []
    try:
        names = os.listdir(path)
    except os.error, err:
        onerror(os.listdir, path, sys.exc_info())
    for name in names:
        fullname = os.path.join(path, name)
        try:
            mode = os.lstat(fullname).st_mode
        except os.error:
            mode = 0
        if stat.S_ISDIR(mode):
            rmtree(fullname, ignore_errors, onerror)
        else:
            try:
                os.remove(fullname)
            except os.error, err:
                onerror(os.remove, fullname, sys.exc_info())
    try:
        os.rmdir(path)
    except os.error:
        onerror(os.rmdir, path, sys.exc_info())
⑨ shutil.move(src, dst): recursively move a file or directory.

def move(src, dst):
    """Recursively move a file or directory to another location. This is
    similar to the Unix "mv" command.

    If the destination is a directory or a symlink to a directory, the source
    is moved inside the directory. The destination path must not already
    exist.

    If the destination already exists but is not a directory, it may be
    overwritten depending on os.rename() semantics.

    If the destination is on our current filesystem, then rename() is used.
    Otherwise, src is copied to the destination and then removed.
    A lot more could be done here...  A look at a mv.c shows a lot of
    the issues this implementation glosses over.

    """
    real_dst = dst
    if os.path.isdir(dst):
        if _samefile(src, dst):
            # We might be on a case insensitive filesystem,
            # perform the rename anyway.
            os.rename(src, dst)
            return

        real_dst = os.path.join(dst, _basename(src))
        if os.path.exists(real_dst):
            raise Error, "Destination path '%s' already exists" % real_dst
    try:
        os.rename(src, real_dst)
    except OSError:
        if os.path.isdir(src):
            if _destinsrc(src, dst):
                raise Error, "Cannot move a directory '%s' into itself '%s'." % (src, dst)
            copytree(src, real_dst, symlinks=True)
            rmtree(src)
        else:
            copy2(src, real_dst)
            os.unlink(src)

⑩ shutil.make_archive(base_name, format, ...): create an archive (e.g. zip or tar) and return its file path; a usage sketch follows the parameter list.
- base_name: file name of the archive, which may also be a path. If it is only a file name, the archive is saved in the current directory; otherwise it is saved to the given path,
  e.g. www                 => saved in the current directory
  e.g. /Users/wupeiqi/www  => saved under /Users/wupeiqi/
- format: archive type, one of "zip", "tar", "bztar", "gztar"
- root_dir: path of the directory to archive (defaults to the current directory)
- owner: user, defaults to the current user
- group: group, defaults to the current group
- logger: used for logging, usually a logging.Logger object
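A hedged usage sketch (the path /data/project and the archive name www are only illustrative):

import shutil

# pack everything under /data/project into www.zip in the current directory
ret = shutil.make_archive("www", "zip", root_dir="/data/project")
print(ret)   # absolute path of the archive that was created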
shutil handles archives by delegating to the ZipFile and TarFile classes; in more detail:
# zipfile: compress and decompress
import zipfile

# compress
z = zipfile.ZipFile('laxi.zip', 'w')
z.write('a.log')
z.write('data.data')
z.close()

# decompress
z = zipfile.ZipFile('laxi.zip', 'r')
z.extractall()
z.close()

# tarfile: pack and unpack
import tarfile

# pack
tar = tarfile.open('your.tar', 'w')
tar.add('/Users/wupeiqi/PycharmProjects/bbs2.zip', arcname='bbs2.zip')
tar.add('/Users/wupeiqi/PycharmProjects/cmdb.zip', arcname='cmdb.zip')
tar.close()

# unpack
tar = tarfile.open('your.tar', 'r')
tar.extractall()   # an extraction directory can be passed here
tar.close()

ZipFile and TarFile source: the full implementations of these two classes can be read in the standard library's zipfile.py and tarfile.py.
6. The json and pickle modules
A file can only store bytes or strings, not other Python types, which is why the following two serialization modules are used:
json: converts between strings and Python data types; it serializes data into a string form that every language can understand (dicts, variables, lists).
pickle: converts between Python-specific types and Python data types; it serializes data into a form that only Python can understand (functions, classes).
① The json module:
# json: serialization and deserialization
import json

info = {            # a dict
    "name": "lzl",
    "age": "18"
}

with open("test", "w") as f:
    f.write(json.dumps(info))      # serialize info with json and write it to the file "test"

with open("test", "r") as f:
    info = json.loads(f.read())
    print(info["name"])
# lzl

② The pickle module:
# pickle: serialization and deserialization
import pickle       # pickle supports all Python-specific types

def func():         # a function
    info = {"name": "lzl", "age": "18"}
    print(info, type(info))

func()              # {'age': '18', 'name': 'lzl'} <class 'dict'>

with open("test", "wb") as f:
    f.write(pickle.dumps(func))    # write func to the file "test" with pickle; json would raise an error here

with open("test", "rb") as f:
    func_new = pickle.loads(f.read())
    func_new()      # {'age': '18', 'name': 'lzl'} <class 'dict'>

More: http://openskill.cn/article/472
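json also has dump() and load(), which work on file objects directly and save the extra read()/write() step. Below is a minimal sketch (the file name test_json and the sample data are made up for illustration); it also shows that json refuses anything that is not a basic type, which is exactly why the func example above has to use pickle:

import json

info = {"name": "lzl", "age": 18, "scores": [90, 85]}

with open("test_json", "w") as f:
    json.dump(info, f)            # same as f.write(json.dumps(info)), but written straight to the file

with open("test_json", "r") as f:
    print(json.load(f))           # same as json.loads(f.read())
# {'name': 'lzl', 'age': 18, 'scores': [90, 85]}

# json only understands basic types (dict, list, str, int, ...); a function raises TypeError
try:
    json.dumps(len)
except TypeError as e:
    print("json cannot serialize it:", e)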
7. The shelve module
shelve模塊內部對pickle進行了封裝,shelve模塊是一個簡單的k,v將內存數據通過文件持久化的模塊,可以持久化任何pickle可支持的python數據格式
#!/usr/bin/env python
# -*- coding:utf-8 -*-
# -Author-Lian

import shelve

# store data as key/value pairs
s = shelve.open("shelve_test")          # open a file
tuple = (1, 2, 3, 4)
list = ['a', 'b', 'c', 'd']
info = {"name": "lzl", "age": 18}
s["tuple"] = tuple                      # persist the tuple
s["list"] = list
s["info"] = info
s.close()

# fetch values by key
d = shelve.open("shelve_test")          # open the file
print(d["tuple"])                       # read
print(d.get("list"))
print(d.get("info"))
# (1, 2, 3, 4)
# ['a', 'b', 'c', 'd']
# {'name': 'lzl', 'age': 18}
d.close()

# loop over the keys
s = shelve.open("shelve_test")          # open the file
for k in s.keys():                      # iterate over the keys
    print(k)
# list
# tuple
# info
s.close()

# update the value of a key
s = shelve.open("shelve_test")          # open the file
s.update({"list": [22, 33]})            # reassign, or s["list"] = [22, 33]
print(s["list"])
# [22, 33]
s.close()
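One thing to watch with shelve: mutating a stored object in place (for example s["info"]["age"] = 19) is not written back to disk by default; you either reassign the whole value, as update() does above, or open the shelf with writeback=True. A small sketch, assuming the same shelve_test file:

import shelve

with shelve.open("shelve_test", writeback=True) as s:   # writeback caches entries and flushes them on close()
    s["info"] = {"name": "lzl", "age": 18}
    s["info"]["age"] = 19         # the in-place change is persisted because of writeback=True

with shelve.open("shelve_test") as s:
    print(s["info"]["age"])       # 19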
8、xml模塊
xml是實現不同語言或程序之間進行數據交換的協議,跟json差不多,但json使用起來更簡單,不過,古時候,在json還沒誕生的黑暗年代,大家只能選擇用xml呀,至今很多傳統公司如金融行業的很多系統的接口還主要是xml
The xml format looks like the following; data structures are delimited by <> nodes:
<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>

XML is supported in every language; in Python it can be handled with the following module:
import xml.etree.ElementTree as ET

tree = ET.parse("xmltest.xml")
root = tree.getroot()
print(root.tag)

# traverse the xml document
for child in root:
    print(child.tag, child.attrib)
    for i in child:
        print(i.tag, i.text)

# traverse only the year nodes
for node in root.iter('year'):
    print(node.tag, node.text)

Modifying and deleting xml content:
import xml.etree.ElementTree as ET

tree = ET.parse("xmltest.xml")
root = tree.getroot()

# modify
for node in root.iter('year'):
    new_year = int(node.text) + 1
    node.text = str(new_year)
    node.set("updated", "yes")

tree.write("xmltest.xml")

# delete nodes
for country in root.findall('country'):
    rank = int(country.find('rank').text)
    if rank > 50:
        root.remove(country)

tree.write('output.xml')

Creating an xml document from scratch:
import xml.etree.ElementTree as ET

new_xml = ET.Element("namelist")
name = ET.SubElement(new_xml, "name", attrib={"enrolled": "yes"})
age = ET.SubElement(name, "age", attrib={"checked": "no"})
sex = ET.SubElement(name, "sex")
sex.text = '33'
name2 = ET.SubElement(new_xml, "name", attrib={"enrolled": "no"})
age = ET.SubElement(name2, "age")
age.text = '19'

et = ET.ElementTree(new_xml)                     # build the document object
et.write("test.xml", encoding="utf-8", xml_declaration=True)

ET.dump(new_xml)                                 # print the generated xml
string = ET.tostring(new_xml)                    # convert the xml object to a string
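Besides ET.parse() for files, ElementTree can parse xml straight from a string with ET.fromstring(), which is handy when the xml comes from a network response rather than a file. A minimal sketch using a shortened version of the country data above:

import xml.etree.ElementTree as ET

xml_text = """<data>
    <country name="Liechtenstein"><rank>2</rank></country>
    <country name="Singapore"><rank>5</rank></country>
</data>"""

root = ET.fromstring(xml_text)              # fromstring() returns the root Element directly
for country in root.findall("country"):
    print(country.get("name"), country.find("rank").text)
# Liechtenstein 2
# Singapore 5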
9. The configparser module
Used for generating and modifying common configuration files; in Python 3.x the module was renamed to configparser.
來看一個好多軟件的常見文檔格式如下:
[DEFAULT]
ServerAliveInterval = 45
Compression = yes
CompressionLevel = 9
ForwardX11 = yes

[bitbucket.org]
User = hg

[topsecret.server.com]
Port = 50022
ForwardX11 = no

How would you generate a file like this with Python?
import configparser

config = configparser.ConfigParser()
config["DEFAULT"] = {'ServerAliveInterval': '45',
                     'Compression': 'yes',
                     'CompressionLevel': '9'}

config['bitbucket.org'] = {}
config['bitbucket.org']['User'] = 'hg'
config['topsecret.server.com'] = {}
topsecret = config['topsecret.server.com']
topsecret['Port'] = '50022'          # mutates the parser
topsecret['ForwardX11'] = 'no'       # same here
config['DEFAULT']['ForwardX11'] = 'yes'
with open('example.ini', 'w') as configfile:
    config.write(configfile)

Once it has been written, it can be read back again:
>>> import configparser
>>> config = configparser.ConfigParser()
>>> config.sections()
[]
>>> config.read('example.ini')
['example.ini']
>>> config.sections()
['bitbucket.org', 'topsecret.server.com']
>>> 'bitbucket.org' in config
True
>>> 'bytebong.com' in config
False
>>> config['bitbucket.org']['User']
'hg'
>>> config['DEFAULT']['Compression']
'yes'
>>> topsecret = config['topsecret.server.com']
>>> topsecret['ForwardX11']
'no'
>>> topsecret['Port']
'50022'
>>> for key in config['bitbucket.org']: print(key)
...
user
compressionlevel
serveraliveinterval
compression
forwardx11
>>> config['bitbucket.org']['ForwardX11']
'yes'

configparser add/delete/modify/query syntax:
[section1]
k1 = v1
k2:v2

[section2]
k1 = v1

import configparser

config = configparser.ConfigParser()
config.read('i.cfg')

# ########## read ##########
# secs = config.sections()
# print(secs)
# options = config.options('group2')
# print(options)

# item_list = config.items('group2')
# print(item_list)

# val = config.get('group1', 'key')
# val = config.getint('group1', 'key')

# ########## modify ##########
# sec = config.remove_section('group1')
# config.write(open('i.cfg', "w"))

# sec = config.has_section('wupeiqi')
# sec = config.add_section('wupeiqi')
# config.write(open('i.cfg', "w"))

# config.set('group2', 'k1', '11111')
# config.write(open('i.cfg', "w"))

# config.remove_option('group2', 'age')
# config.write(open('i.cfg', "w"))
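configparser also has typed getters, getint()/getfloat()/getboolean(), and get() accepts a fallback value for missing options, so you do not have to convert the strings by hand. A small sketch, assuming the example.ini generated above (the Password option is deliberately one that does not exist):

import configparser

config = configparser.ConfigParser()
config.read('example.ini')

port = config.getint('topsecret.server.com', 'Port')             # '50022' -> 50022 (int)
x11 = config.getboolean('topsecret.server.com', 'ForwardX11')    # 'no' -> False
user = config.get('bitbucket.org', 'Password', fallback='none')  # missing option -> fallback
print(port, x11, user)
# 50022 False none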
10. The hashlib module
用于加密相關的操作,3.x里代替了md5模塊和sha模塊,主要提供?SHA1, SHA224, SHA256, SHA384, SHA512 ,MD5 算法
import hashlib

m = hashlib.md5()
m.update(b"Hello")
m.update(b"It's me")
print(m.digest())
m.update(b"It's been a long time since last time we ...")

print(m.digest())               # hash in binary format
print(len(m.hexdigest()))       # length of the hash in hex format
'''
def digest(self, *args, **kwargs): # real signature unknown
    """ Return the digest value as a string of binary data. """
    pass

def hexdigest(self, *args, **kwargs): # real signature unknown
    """ Return the digest value as a string of hexadecimal digits. """
    pass
'''

import hashlib

# ######## md5 ########
hash = hashlib.md5()
hash.update(b'admin')
print(hash.hexdigest())

# ######## sha1 ########
hash = hashlib.sha1()
hash.update(b'admin')
print(hash.hexdigest())

# ######## sha256 ########
hash = hashlib.sha256()
hash.update(b'admin')
print(hash.hexdigest())

# ######## sha384 ########
hash = hashlib.sha384()
hash.update(b'admin')
print(hash.hexdigest())

# ######## sha512 ########
hash = hashlib.sha512()
hash.update(b'admin')
print(hash.hexdigest())

Not enough? Python also has an hmac module, which processes the key and the message together internally before producing the digest:
import hmac

h = hmac.new(b'wueiqi', digestmod='md5')    # in Python 3 the key must be bytes; digestmod is required from 3.8 on
h.update(b'hellowo')
print(h.hexdigest())
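When an hmac digest is used to verify a message (for example between two services that share a key), the comparison should be done with hmac.compare_digest() rather than ==, to avoid timing attacks. A minimal sketch; the key and message here are just placeholders:

import hmac

key = b"wueiqi"
msg = b"hellowo"
sig = hmac.new(key, msg, digestmod="sha256").hexdigest()        # sender computes the signature

# receiver recomputes the signature with the shared key and compares in constant time
expected = hmac.new(key, msg, digestmod="sha256").hexdigest()
print(hmac.compare_digest(sig, expected))    # True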
11、re模塊
re模塊用于對python的正則表達式的操作
'.' 默認匹配除\n之外的任意一個字符,若指定flag DOTALL,則匹配任意字符,包括換行 '^' 匹配字符開頭,若指定flags MULTILINE,這種也可以匹配上(r"^a","\nabc\neee",flags=re.MULTILINE) '$' 匹配字符結尾,或e.search("foo$","bfoo\nsdfsf",flags=re.MULTILINE).group()也可以 '*' 匹配*號前的字符0次或多次,re.findall("ab*","cabb3abcbbac") 結果為['abb', 'ab', 'a'] '+' 匹配前一個字符1次或多次,re.findall("ab+","ab+cd+abb+bba") 結果['ab', 'abb'] '?' 匹配前一個字符1次或0次 '{m}' 匹配前一個字符m次 '{n,m}' 匹配前一個字符n到m次,re.findall("ab{1,3}","abb abc abbcbbb") 結果'abb', 'ab', 'abb'] '|' 匹配|左或|右的字符,re.search("abc|ABC","ABCBabcCD").group() 結果'ABC' '(...)' 分組匹配,re.search("(abc){2}a(123|456)c", "abcabca456c").group() 結果 abcabca456c '[a-z]' 匹配a到z任意一個字符 '[^()]' 匹配除()以外的任意一個字符 r' ' 轉義引號里的字符 針對\字符 詳情查看⑦ '\A' 只從字符開頭匹配,re.search("\Aabc","alexabc") 是匹配不到的 '\Z' 匹配字符結尾,同$ '\d' 匹配數字0-9 '\D' 匹配非數字 '\w' 匹配[A-Za-z0-9] '\W' 匹配非[A-Za-z0-9] '\s' 匹配空白字符、\t、\n、\r , re.search("\s+","ab\tc1\n3").group() 結果 '\t''(?P<name>...)' 分組匹配 re.search("(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})","371481199306143242").groupdict("city") 結果{'province': '3714', 'city': '81', 'birthday': '1993'} re.IGNORECASE 忽略大小寫 re.search('(\A|\s)red(\s+|$)',i,re.IGNORECASE)?標志位:
# flags
I = IGNORECASE = sre_compile.SRE_FLAG_IGNORECASE   # ignore case
L = LOCALE = sre_compile.SRE_FLAG_LOCALE           # assume current 8-bit locale
U = UNICODE = sre_compile.SRE_FLAG_UNICODE         # assume unicode locale
M = MULTILINE = sre_compile.SRE_FLAG_MULTILINE     # make anchors look for newline
S = DOTALL = sre_compile.SRE_FLAG_DOTALL           # make dot match newline
X = VERBOSE = sre_compile.SRE_FLAG_VERBOSE         # ignore whitespace and comments

① match
從起始位置開始根據模型去字符串中匹配指定內容:
# match
import re

obj = re.match('\d+', '123uua123sf')    # match one or more digits starting from the first character
print(obj)
# <_sre.SRE_Match object; span=(0, 3), match='123'>

if obj:                                  # only runs if something was matched; otherwise obj is None
    print(obj.group())                   # print the matched content
# 123

Matching an IP address:
import re

ip = '255.255.255.253'
result = re.match(r'^([1-9]?\d|1\d\d|2[0-4]\d|25[0-5])\.([1-9]?\d|1\d\d|2[0-4]\d|25[0-5])\.'
                  r'([1-9]?\d|1\d\d|2[0-4]\d|25[0-5])\.([1-9]?\d|1\d\d|2[0-4]\d|25[0-5])$', ip)
print(result)
# <_sre.SRE_Match object; span=(0, 15), match='255.255.255.253'>
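In Python 3.4+ the same check can be written with re.fullmatch(), which only succeeds if the whole string matches, so the ^ and $ anchors become unnecessary. A small sketch reusing the octet alternation above:

import re

octet = r'([1-9]?\d|1\d\d|2[0-4]\d|25[0-5])'
print(re.fullmatch(octet + r'(\.' + octet + r'){3}', '255.255.255.253'))   # match object
print(re.fullmatch(octet + r'(\.' + octet + r'){3}', '256.1.1.1'))         # None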
根據模型去字符串中匹配指定內容(不一定是最開始位置),匹配最前
# search
import re

obj = re.search('\d+', 'a123uu234asf')   # match one or more digits anywhere in the string
print(obj)
# <_sre.SRE_Match object; span=(1, 4), match='123'>

if obj:                                   # only runs if something was matched
    print(obj.group())                    # print the matched content
# 123

import re

obj = re.search('\([^()]+\)', 'sdds(a1fwewe2(3uusfdsf2)34as)f')   # match the content of the innermost ()
print(obj)
# <_sre.SRE_Match object; span=(13, 24), match='(3uusfdsf2)'>

if obj:                                   # only runs if something was matched
    print(obj.group())                    # print the matched content
# (3uusfdsf2)

③ The difference between group and groups
#group與groups的區別 import re a = "123abc456" b = re.search("([0-9]*)([a-z]*)([0-9]*)", a) print(b) #<_sre.SRE_Match object; span=(0, 9), match='123abc456'> print(b.group()) #123abc456 print(b.group(0)) #123abc456 print(b.group(1)) #123 print(b.group(2)) #abc print(b.group(3)) #456 print(b.groups()) #('123', 'abc', '456')④、findall
上述兩中方式均用于匹配單值,即:只能匹配字符串中的一個,如果想要匹配到字符串中所有符合條件的元素,則需要使用?findall;findall沒有group用法
⑤ sub
用于替換匹配的字符串(pattern, repl, string, count=0, flags=0)
# sub
import re

content = "123abc456"
new_content = re.sub('\d+', 'ABC', content)
print(new_content)
# ABCabcABC
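sub() also takes the count argument shown in the signature above, and repl can be a function that receives each match object, which is useful when the replacement has to be computed. A small sketch:

import re

content = "123abc456"
print(re.sub('\d+', 'ABC', content, count=1))                      # only the first match: ABCabc456
print(re.sub('\d+', lambda m: str(int(m.group()) * 2), content))   # computed replacement: 246abc912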
根據指定匹配進行分組(pattern, string, maxsplit=0, flags=0)
# split
import re

content = "1 - 2 * ((60-30+1*(9-2*5/3+7/3*99/4*2998+10*568/14))-(-4*3)/(16-3*2) )"
new_content = re.split('\*', content)          # split on *, returning a list
print(new_content)
# ['1 - 2 ', ' ((60-30+1', '(9-2', '5/3+7/3', '99/4', '2998+10', '568/14))-(-4', '3)/(16-3', '2) )']

content = "'1 - 2 * ((60-30+1*(9-2*5/3+7/3*99/4*2998+10*568/14))-(-4*3)/(16-3*2) )'"
new_content = re.split('[\+\-\*\/]+', content)
# new_content = re.split('\*', content, 1)
print(new_content)
# ["'1 ", ' 2 ', ' ((60', '30', '1', '(9', '2', '5', '3', '7', '3', '99', '4', '2998', '10', '568', '14))',
#  '(', '4', '3)', '(16', '3', "2) )'"]

inpp = '1-2*((60-30 +(-40-5)*(9-2*5/3 + 7 /3*99/4*2998 +10 * 568/14 )) - (-4*3)/ (16-3*2))'
inpp = re.sub('\s*', '', inpp)                 # strip all whitespace
print(inpp)
new_content = re.split('\(([\+\-\*\/]?\d+[\+\-\*\/]?\d+){1}\)', inpp, 1)
print(new_content)
# ['1-2*((60-30+', '-40-5', '*(9-2*5/3+7/3*99/4*2998+10*568/14))-(-4*3)/(16-3*2))']

⑦ More on r' ' and escaping
fdfdsfds\fds
sfdsfds& @$
lzl.txt

First, be clear about this: when the program reads a \ character from the file, the list's repr shows it as \\:
import re, sys

li = []
with open('lzl.txt', 'r', encoding="utf-8") as file:
    for line in file:
        li.append(line)

print(li)        # note: the single backslash in the file shows up as a double backslash here
# ['fdfdsfds\\fds\n', 'sfdsfds& @$']
print(li[0])     # print still shows a single backslash
# fdfdsfds\fds

The point of the r prefix is that \ is not treated as an escape; it appears only as a literal character:
import re, sys

li = []
with open('lzl.txt', 'r', encoding="utf-8") as file:
    for line in file:
        print(re.findall(r's\\f', line))     # first way to match
        # print(re.findall('\\\\', line))    # second way to match
        li.append(line)

print(li)        # note: the single backslash in the file shows up as a double backslash here
# ['s\\f']
# []
# ['fdfdsfds\\fds\n', 'sfdsfds& @$']

One more thing: the code below may leave you even more confused
import re

re.findall(r'\\', line)   # in a regex it must be written like this; r'\' is not allowed
print(r'\\')              # must be written like this, not r'\'; backslashes must come in pairs
# \\
# to print a single \ write it like this:
print('\\')               # backslashes must come in pairs
# \

Summary: a single backslash \ in the file shows up as \\ in the program's repr and prints as a single \. To match a single backslash from the file with a regex, use r'\\' (two backslashes), or, without the r prefix, use '\\\\' (four backslashes).
⑧ The compile function
說明:
Python通過re模塊提供對正則表達式的支持。使用re的一般步驟是先使用re.compile()函數,將正則表達式的字符串形式編譯為Pattern實例, 然后使用Pattern實例處理文本并獲得匹配結果(一個Match實例),最后使用Match實例獲得信息,進行其他的操作舉一個簡單的例子,在尋找一個字符串中所有的英文字符:
import re

pattern = re.compile('[a-zA-Z]')
result = pattern.findall('as3SiOPdj#@23awe')
print(result)
# ['a', 's', 'S', 'i', 'O', 'P', 'd', 'j', 'a', 'w', 'e']

Matching an IP address (255.255.255.255):
import re

pattern = re.compile(r'^(([1-9]?\d|1\d\d|2[0-4]\d|25[0-5])\.){3}([1-9]?\d|1\d\d|2[0-4]\d|25[0-5])$')
result = pattern.match('255.255.255.255')
print(result)
# <_sre.SRE_Match object; span=(0, 15), match='255.255.255.255'>
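The point of compile() is that the pattern is parsed once and the Pattern object is then reused, which is also a natural place to attach flags. A minimal sketch (the word list is made up for illustration):

import re

pattern = re.compile(r'\bred\b', re.IGNORECASE)    # compile once, with flags, reuse many times
for text in ["Red apple", "bored", "a red car"]:
    print(bool(pattern.search(text)))
# True
# False
# True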
12. The urllib module
在能使用的各種網絡函數庫中,功能最強大的可能是urllib和urllib2(python2.0)了。通過他們在網絡上訪問文件,就好像訪問本地電腦的文件一樣,通過一個簡單的函數調用,幾乎可以把任何URL執行的東西用做程序的輸入,想象一下這個模塊和re模塊集合:可以下載web頁面,提前信息,以及自動生成報告等。
#!/usr/bin/env python
# -*- coding:utf-8 -*-
# -Author-Lian

import urllib.request

def getdata():
    url = "http://www.baidu.com"
    data = urllib.request.urlopen(url).read()
    data = data.decode("utf-8")
    print(data)

getdata()

The file-like object returned by urlopen supports the close, read, readline and readlines methods.
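urllib.request can also send a request with custom headers and with query parameters built by urllib.parse.urlencode(), which many sites expect before they answer a plain script. A minimal sketch; the URL, parameters and header value here are only placeholders for illustration:

import urllib.request
import urllib.parse

params = urllib.parse.urlencode({"wd": "python"})          # build the query string: wd=python
url = "http://www.baidu.com/s?" + params                   # placeholder search URL
req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})   # add a browser-like header

with urllib.request.urlopen(req, timeout=10) as resp:      # the response is also a file-like object
    print(resp.status, resp.reason)                        # e.g. 200 OK
    data = resp.read().decode("utf-8", errors="ignore")
    print(len(data))                                       # length of the downloaded page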
More:
logging模塊-》》http://www.cnblogs.com/lianzhilei/p/6016543.html
轉載于:https://www.cnblogs.com/lianzhilei/p/5794402.html
總結
以上是生活随笔為你收集整理的Python开发【第五章】:常用模块的全部內容,希望文章能夠幫你解決所遇到的問題。