(Part 1) Python 3 + Selenium 3: Learn Selenium Faster by Reading the Framework's Implementation
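This article is indebted to the following documents for reference:
Selenium-Python Chinese documentation
Selenium Documentation
Webdriver reference
If you spot any errors, please point them out in the comments and the author will correct them promptly.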
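Environment
- Operating system: Windows 7 SP1 64-bit
- Python version: 3.7.7
- Browser: Google Chrome
- Browser version: 80.0.3987 (64-bit)
- ChromeDriver: the driver version must match the browser version; each browser version needs its own driver version, so download the matching one.
- If you use Firefox instead, check your Firefox version and download the driver from the Firefox driver releases page on GitHub (each release states which browser versions it supports, so read the notes before downloading).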
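Introduction
Selenium is an umbrella project covering a range of tools and libraries that support the automation of web browsers. When automation runs, the operations are performed the way a real user would perform them.
Selenium has three versions: Selenium 1.0, Selenium 2.0, and Selenium 3.0.
Selenium 1.0 worked mainly by injecting JavaScript into the browser. Selenium's creator, Jason Huggins, first built JavaScriptTestRunner as a testing tool and demonstrated it to several colleagues (he is quite an interesting character, too). As the name suggests, the tool ran its tests through JavaScript, and it was the forerunner of Selenium.
Selenium 2.0 operates on browser elements through the API provided by WebDriver. WebDriver can be described as a testing framework, or as an integrated library of API interfaces.
Selenium 3.0 extends Selenium 2.0 and differs little from it. This article uses Selenium 3.0 throughout.
The official introduction says this about browser support: "Through WebDriver, Selenium supports all major browsers on the market such as Chrom(ium), Firefox, Internet Explorer, Opera, and Safari."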
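Getting Started
Once the environment is set up, a few lines of Selenium are enough to make the browser open the CSDN home page.
One thing to watch during setup: the driver must be on the system PATH, or placed in the root directory of your Python installation.
First, import webdriver:

    from selenium.webdriver import Chrome

This works as well:

    from selenium import webdriver

Which import you use is a matter of preference; each one leads to a slightly different way of creating the instance.

    from selenium.webdriver import Chrome

    driver = Chrome()

or

    from selenium import webdriver

    driver = webdriver.Chrome()

General Python syntax is not explained further in what follows.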
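As mentioned earlier, the driver needs to be on the system PATH. If for some reason the driver's location cannot be added to PATH, there is a workaround: pass the driver's location to the constructor.
The executable_path parameter specifies where the driver is; in the author's case it pointed to the location where the author's own driver was stored. You can choose any location that suits you and build the path more flexibly; the author used an absolute path only to keep the explanation simple.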
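As a minimal sketch of this workaround, assuming Selenium 3.x (where the Chrome constructor accepts executable_path): the helper below first looks for the driver on PATH and only then falls back to an explicit location. The names resolve_driver_path and open_csdn, and the example path in the comment, are hypothetical and not part of Selenium.

```python
import os
import shutil


def resolve_driver_path(name="chromedriver", fallback=None):
    """Return a usable driver path: PATH first, then an explicit fallback.

    Hypothetical helper for this sketch; it is not part of Selenium.
    """
    found = shutil.which(name)              # search the system PATH
    if found:
        return found
    if fallback and os.path.isfile(fallback):
        return fallback
    raise FileNotFoundError(
        "%s is not on PATH and no valid fallback was given" % name)


def open_csdn(driver_path):
    """Start Chrome with an explicit driver path (Selenium 3.x keyword)."""
    from selenium import webdriver          # imported lazily; needs selenium installed
    driver = webdriver.Chrome(executable_path=driver_path)
    driver.get("http://www.csdn.net")
    return driver


# Usage (hypothetical location; replace it with wherever your driver lives):
# driver = open_csdn(resolve_driver_path(fallback=r"C:\drivers\chromedriver.exe"))
# driver.quit()
```

Resolving the path in one place keeps the rest of a script independent of where the driver happens to be installed.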
Firefox:

    from selenium import webdriver

    driver = webdriver.Firefox()
    driver.get("http://www.csdn.net")

Chrome:

    from selenium import webdriver

    driver = webdriver.Chrome()
    driver.get("http://www.csdn.net")

Firefox and Chrome differ only in how the driver is instantiated; every other operation is the same.
The code imports webdriver at the top, instantiates a browser object, and then calls the get method to open the desired URL.
Implementation Walkthrough
Look at the implementation in webdriver.py (reached through from selenium import webdriver):
    import warnings

    from selenium.webdriver.remote.webdriver import WebDriver as RemoteWebDriver
    from .remote_connection import ChromeRemoteConnection
    from .service import Service
    from .options import Options


    class WebDriver(RemoteWebDriver):
        """
        Controls the ChromeDriver and allows you to drive the browser.

        You will need to download the ChromeDriver executable from
        http://chromedriver.storage.googleapis.com/index.html
        """

        def __init__(self, executable_path="chromedriver", port=0,
                     options=None, service_args=None,
                     desired_capabilities=None, service_log_path=None,
                     chrome_options=None, keep_alive=True):
            """
            Creates a new instance of the chrome driver.

            Starts the service and then creates new instance of chrome driver.

            :Args:
             - executable_path - path to the executable. If the default is used it assumes the executable is in the $PATH
             - port - port you would like the service to run, if left as 0, a free port will be found.
             - options - this takes an instance of ChromeOptions
             - service_args - List of args to pass to the driver service
             - desired_capabilities - Dictionary object with non-browser specific
               capabilities only, such as "proxy" or "loggingPref".
             - service_log_path - Where to log information from the driver.
             - chrome_options - Deprecated argument for options
             - keep_alive - Whether to configure ChromeRemoteConnection to use HTTP keep-alive.
            """
            if chrome_options:
                warnings.warn('use options instead of chrome_options',
                              DeprecationWarning, stacklevel=2)
                options = chrome_options

            if options is None:
                # desired_capabilities stays as passed in
                if desired_capabilities is None:
                    desired_capabilities = self.create_options().to_capabilities()
            else:
                if desired_capabilities is None:
                    desired_capabilities = options.to_capabilities()
                else:
                    desired_capabilities.update(options.to_capabilities())

            self.service = Service(
                executable_path,
                port=port,
                service_args=service_args,
                log_path=service_log_path)
            self.service.start()

            try:
                RemoteWebDriver.__init__(
                    self,
                    command_executor=ChromeRemoteConnection(
                        remote_server_addr=self.service.service_url,
                        keep_alive=keep_alive),
                    desired_capabilities=desired_capabilities)
            except Exception:
                self.quit()
                raise
            self._is_remote = False

        def launch_app(self, id):
            """Launches Chrome app specified by id."""
            return self.execute("launchApp", {'id': id})

        def get_network_conditions(self):
            return self.execute("getNetworkConditions")['value']

        def set_network_conditions(self, **network_conditions):
            self.execute("setNetworkConditions", {
                'network_conditions': network_conditions
            })

        def execute_cdp_cmd(self, cmd, cmd_args):
            return self.execute("executeCdpCommand", {'cmd': cmd, 'params': cmd_args})['value']

        def quit(self):
            try:
                RemoteWebDriver.quit(self)
            except Exception:
                # We don't care about the message because something probably has gone wrong
                pass
            finally:
                self.service.stop()

        def create_options(self):
            return Options()

The docstring states what the constructor does: it "creates a new instance of the chrome driver", and to do so it "starts the service and then creates new instance of chrome driver".
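Only the parameter used in this article is listed here:
- executable_path: the path to the executable. If the default value is used, the executable is assumed to be on PATH, the system environment variable that lists the directories searched for executables.
A necessary step when automating with Selenium is starting the service. The __init__ method contains the following code:

    self.service = Service(
        executable_path,
        port=port,
        service_args=service_args,
        log_path=service_log_path)
    self.service.start()

This code instantiates the Service class with the relevant parameters and then starts the service. The most important parameter here is executable_path, which names the driver to launch. Now look at the Service class (selenium.webdriver.chrome.service):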
    from selenium.webdriver.common import service


    class Service(service.Service):
        """
        Object that manages the starting and stopping of the ChromeDriver
        """

        def __init__(self, executable_path, port=0, service_args=None,
                     log_path=None, env=None):
            """
            Creates a new instance of the Service

            :Args:
             - executable_path : Path to the ChromeDriver
             - port : Port the service is running on
             - service_args : List of args to pass to the chromedriver service
             - log_path : Path for the chromedriver service to log to"""

            self.service_args = service_args or []
            if log_path:
                self.service_args.append('--log-path=%s' % log_path)

            service.Service.__init__(self, executable_path, port=port, env=env,
                                     start_error_message="Please see https://sites.google.com/a/chromium.org/chromedriver/home")

        def command_line_args(self):
            return ["--port=%d" % self.port] + self.service_args

Next, look at how the base class implements start (the base class is too long to show in full; it lives in selenium.webdriver.common.service):
    def start(self):
        """
        Starts the Service.

        :Exceptions:
         - WebDriverException : Raised either when it can't start the service
           or when it can't connect to the service
        """
        try:
            cmd = [self.path]
            cmd.extend(self.command_line_args())
            self.process = subprocess.Popen(cmd, env=self.env,
                                            close_fds=platform.system() != 'Windows',
                                            stdout=self.log_file,
                                            stderr=self.log_file,
                                            stdin=PIPE)
        except TypeError:
            raise
        except OSError as err:
            if err.errno == errno.ENOENT:
                raise WebDriverException(
                    "'%s' executable needs to be in PATH. %s" % (
                        os.path.basename(self.path), self.start_error_message)
                )
            elif err.errno == errno.EACCES:
                raise WebDriverException(
                    "'%s' executable may have wrong permissions. %s" % (
                        os.path.basename(self.path), self.start_error_message)
                )
            else:
                raise
        except Exception as e:
            raise WebDriverException(
                "The executable %s needs to be available in the path. %s\n%s" %
                (os.path.basename(self.path), self.start_error_message, str(e)))
        count = 0
        while True:
            self.assert_process_still_running()
            if self.is_connectable():
                break
            count += 1
            time.sleep(1)
            if count == 30:
                raise WebDriverException("Can not connect to the Service %s" % self.path)

Within it, the relevant part is:
    try:
        cmd = [self.path]
        cmd.extend(self.command_line_args())
        self.process = subprocess.Popen(cmd, env=self.env,
                                        close_fds=platform.system() != 'Windows',
                                        stdout=self.log_file,
                                        stderr=self.log_file,
                                        stdin=PIPE)
    except TypeError:
        raise
    except OSError as err:
        if err.errno == errno.ENOENT:
            raise WebDriverException(
                "'%s' executable needs to be in PATH. %s" % (
                    os.path.basename(self.path), self.start_error_message)
            )
        elif err.errno == errno.EACCES:
            raise WebDriverException(
                "'%s' executable may have wrong permissions. %s" % (
                    os.path.basename(self.path), self.start_error_message)
            )
        else:
            raise
    except Exception as e:
        raise WebDriverException(
            "The executable %s needs to be available in the path. %s\n%s" %
            (os.path.basename(self.path), self.start_error_message, str(e)))
    count = 0
    while True:
        self.assert_process_still_running()
        if self.is_connectable():
            break
        count += 1
        time.sleep(1)
        if count == 30:
            raise WebDriverException("Can not connect to the Service %s" % self.path)

This launches the driver in a child process. If launching fails, the exception is caught and re-raised as a WebDriverException with an explanatory message. The loop then polls once a second, up to 30 times, until the driver accepts connections. The running driver is what opens the browser.
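The "wait until the driver answers" step can be sketched with the standard library alone. This is only an illustration of the polling idea, not Selenium's own code: the function names are chosen for this example, the check is a plain TCP connection attempt, and the check that the child process is still alive is left out.

```python
import socket
import time


def free_port():
    """Ask the OS for an unused TCP port (the idea behind port=0)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def is_connectable(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=1):
            return True
    except OSError:
        return False


def wait_until_connectable(port, attempts=30, delay=1.0):
    """Poll the port, sleeping between attempts; raise once attempts run out."""
    for _ in range(attempts):
        if is_connectable(port):
            return
        time.sleep(delay)
    raise RuntimeError("Can not connect to the service on port %d" % port)
```

With the defaults (30 attempts, one second apart) the wait has the same overall budget as the loop in start.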
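With the exception handling covered, we now know how Selenium starts the service. Next, follow how get requests a URL.
In the WebDriver base class (selenium.webdriver.remote.webdriver), find the get method.
The get method calls execute, passing Command.GET and the url.
Looking up the class that Command.GET belongs to, Command (selenium.webdriver.remote.command), shows that Command holds the constants for the standard WebDriver commands, GET among them.
Judging from the file, this is the class that defines which commands can be executed.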
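To make the delegation concrete, here is a toy model of it. It assumes that Command.GET is the string "get", as in Selenium 3.x; MiniDriver is a hypothetical stand-in that records what would be sent instead of contacting a browser.

```python
class Command(object):
    """Stand-in for selenium.webdriver.remote.command.Command (one constant only)."""
    GET = "get"   # assumed value, as in Selenium 3.x


class MiniDriver(object):
    """Toy driver: records commands instead of talking to a real browser."""

    def __init__(self):
        self.sent = []

    def execute(self, driver_command, params=None):
        # The real execute forwards the command to the driver over HTTP;
        # this version only remembers what it was asked to send.
        self.sent.append((driver_command, params))
        return {"value": None}

    def get(self, url):
        # Same shape as the method described above: delegate to execute
        # with a command constant and the url as a parameter.
        self.execute(Command.GET, {"url": url})


driver = MiniDriver()
driver.get("http://www.csdn.net")
print(driver.sent)
```

The point of the constant is that get never builds a request itself; it only names the command and leaves the transport to execute.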
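First, a recap of the flow so far:
- Start the service → call the get method
The get method itself proceeds as follows:
- get calls execute with Command.GET and the url as arguments; Command.GET is one of the standard command constants.
The execute method is implemented as follows: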
    def execute(self, driver_command, params=None):
        """
        Sends a command to be executed by a command.CommandExecutor.

        :Args:
         - driver_command: The name of the command to execute as a string.
         - params: A dictionary of named parameters to send with the command.

        :Returns:
          The command's JSON response loaded into a dictionary object.
        """
        if self.session_id is not None:
            if not params:
                params = {'sessionId': self.session_id}
            elif 'sessionId' not in params:
                params['sessionId'] = self.session_id

        params = self._wrap_value(params)
        response = self.command_executor.execute(driver_command, params)
        if response:
            self.error_handler.check_response(response)
            response['value'] = self._unwrap_value(
                response.get('value', None))
            return response
        # If the server doesn't send a response, assume the command was
        # a success
        return {'success': 0, 'value': None, 'sessionId': self.session_id}

The core of it is:
    params = self._wrap_value(params)
    response = self.command_executor.execute(driver_command, params)
    if response:
        self.error_handler.check_response(response)
        response['value'] = self._unwrap_value(
            response.get('value', None))
        return response

The line to focus on is:
    self.command_executor.execute(driver_command, params)

Here command_executor is an instance created during initialization. In the derived class webdriver (from selenium import webdriver), command_executor is instantiated like this:
    RemoteWebDriver.__init__(
        self,
        command_executor=ChromeRemoteConnection(
            remote_server_addr=self.service.service_url,
            keep_alive=keep_alive),
        desired_capabilities=desired_capabilities)

Now look at the ChromeRemoteConnection class (selenium.webdriver.chrome.remote_connection):
    from selenium.webdriver.remote.remote_connection import RemoteConnection


    class ChromeRemoteConnection(RemoteConnection):

        def __init__(self, remote_server_addr, keep_alive=True):
            RemoteConnection.__init__(self, remote_server_addr, keep_alive)
            self._commands["launchApp"] = ('POST', '/session/$sessionId/chromium/launch_app')
            self._commands["setNetworkConditions"] = ('POST', '/session/$sessionId/chromium/network_conditions')
            self._commands["getNetworkConditions"] = ('GET', '/session/$sessionId/chromium/network_conditions')
            self._commands['executeCdpCommand'] = ('POST', '/session/$sessionId/goog/cdp/execute')

It calls the base class initializer and only adds a few Chrome-specific commands. In the base class, execute is implemented as:
    def execute(self, command, params):
        """
        Send a command to the remote server.

        Any path subtitutions required for the URL mapped to the command should be
        included in the command parameters.

        :Args:
         - command - A string specifying the command to execute.
         - params - A dictionary of named parameters to send with the command as
           its JSON payload.
        """
        command_info = self._commands[command]
        assert command_info is not None, 'Unrecognised command %s' % command
        path = string.Template(command_info[1]).substitute(params)
        if hasattr(self, 'w3c') and self.w3c and isinstance(params, dict) and 'sessionId' in params:
            del params['sessionId']
        data = utils.dump_json(params)
        url = '%s%s' % (self._url, path)
        return self._request(command_info[0], url, body=data)

    def _request(self, method, url, body=None):
        """
        Send an HTTP request to the remote server.

        :Args:
         - method - A string for the HTTP method to send the request with.
         - url - A string for the URL to send the request to.
         - body - A string for request body. Ignored unless method is POST or PUT.

        :Returns:
          A dictionary with the server's parsed JSON response.
        """
        LOGGER.debug('%s %s %s' % (method, url, body))

        parsed_url = parse.urlparse(url)
        headers = self.get_remote_connection_headers(parsed_url, self.keep_alive)
        resp = None
        if body and method != 'POST' and method != 'PUT':
            body = None

        if self.keep_alive:
            resp = self._conn.request(method, url, body=body, headers=headers)

            statuscode = resp.status
        else:
            http = urllib3.PoolManager(timeout=self._timeout)
            resp = http.request(method, url, body=body, headers=headers)

            statuscode = resp.status
            if not hasattr(resp, 'getheader'):
                if hasattr(resp.headers, 'getheader'):
                    resp.getheader = lambda x: resp.headers.getheader(x)
                elif hasattr(resp.headers, 'get'):
                    resp.getheader = lambda x: resp.headers.get(x)

        data = resp.data.decode('UTF-8')
        try:
            if 300 <= statuscode < 304:
                return self._request('GET', resp.getheader('location'))
            if 399 < statuscode <= 500:
                return {'status': statuscode, 'value': data}
            content_type = []
            if resp.getheader('Content-Type') is not None:
                content_type = resp.getheader('Content-Type').split(';')
            if not any([x.startswith('image/png') for x in content_type]):

                try:
                    data = utils.load_json(data.strip())
                except ValueError:
                    if 199 < statuscode < 300:
                        status = ErrorCode.SUCCESS
                    else:
                        status = ErrorCode.UNKNOWN_ERROR
                    return {'status': status, 'value': data.strip()}

                # Some of the drivers incorrectly return a response
                # with no 'value' field when they should return null.
                if 'value' not in data:
                    data['value'] = None
                return data
            else:
                data = {'status': 0, 'value': data}
                return data
        finally:
            LOGGER.debug("Finished Request")
            resp.close()

From this implementation we can see that execute sends a request to the remote server. The _request method it calls sends the HTTP request and returns the parsed result; the request itself is acted on by the browser, and the response comes back through the driver.
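The URL-building part of execute can be tried without a browser. The sketch below is an approximation under stated assumptions: the two COMMANDS entries are modeled on the _commands table, the base address http://127.0.0.1:9515 and the session id "abc" are made up, and json.dumps stands in for utils.dump_json. Unlike the original, it copies params rather than deleting sessionId in place.

```python
import json
import string

# Two entries modeled on the _commands table; treat them as illustrative.
COMMANDS = {
    "get": ("POST", "/session/$sessionId/url"),
    "getTitle": ("GET", "/session/$sessionId/title"),
}


def build_request(base_url, command, params, w3c=True):
    """Return (method, url, body) for a command without sending anything."""
    method, path_template = COMMANDS[command]
    # Fill $sessionId in the path template from the params dictionary.
    path = string.Template(path_template).substitute(params)
    payload = dict(params)                  # copy so the caller's dict is untouched
    if w3c:
        payload.pop("sessionId", None)      # W3C mode keeps the id in the URL only
    return method, base_url + path, json.dumps(payload)


method, url, body = build_request(
    "http://127.0.0.1:9515",                # hypothetical driver address
    "get",
    {"sessionId": "abc", "url": "http://www.csdn.net"},
)
print(method, url, body)
```

Seen this way, a call such as driver.get(...) is an ordinary HTTP POST to the local driver process, with the target address carried in a JSON body.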
The official documentation explains how this communication works:
At its minimum, WebDriver talks to a browser through a driver.
Communication is two way: WebDriver passes commands to the browser through the driver, and receives information back via the same route.
The driver is specific to the browser, such as ChromeDriver for Google’s Chrome/Chromium, GeckoDriver for Mozilla’s Firefox, etc. The driver runs on the same system as the browser. This may, or may not be, the same system where the tests themselves are executing.
This simple example above is direct communication. Communication to the browser may also be remote communication through Selenium Server or RemoteWebDriver. RemoteWebDriver runs on the same system as the driver and the browser.
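In short, we talk to the browser through WebDriver, and the browser responds.
The walkthrough above shows that sending a request to the remote server with execute means interacting with the browser through WebDriver, and that sending one of the predefined command constants can return information about the current session.
Because the object we instantiate in our code is a webdriver instance, look in the WebDriver base class (selenium.webdriver.remote.webdriver) for functions that retrieve such information. The following turn up: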
    @property
    def title(self):
        """Returns the title of the current page.

        :Usage:
            title = driver.title
        """
        resp = self.execute(Command.GET_TITLE)
        return resp['value'] if resp['value'] is not None else ""

    @property
    def current_url(self):
        """
        Gets the URL of the current page.

        :Usage:
            driver.current_url
        """
        return self.execute(Command.GET_CURRENT_URL)['value']

    @property
    def page_source(self):
        """
        Gets the source of the current page.

        :Usage:
            driver.page_source
        """
        return self.execute(Command.GET_PAGE_SOURCE)['value']

This is not the full list. Let's briefly try these out; each docstring already shows the usage. The following retrieves title (the page title), current_url (the current URL), and page_source (the page's source code):
    from selenium import webdriver

    driver = webdriver.Chrome()
    driver.get("http://www.csdn.net")
    print(driver.title)
    print(driver.current_url)
    print("Author's blog: https://blog.csdn.net/A757291228")
    # Support original work; please link to the original article when reposting.
    # print(driver.page_source)

Running this prints the page title and the current URL.
Next, try page_source by uncommenting the last line.
The page source is printed successfully as well.
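A small wrapper can gather these properties and make sure the browser is closed afterwards. This is a sketch rather than anything from Selenium itself: page_summary is a hypothetical helper, and it works with any object that exposes title, current_url, page_source, and quit().

```python
def page_summary(driver):
    """Read a few properties from a driver-like object, then always quit."""
    try:
        return {
            "title": driver.title,
            "url": driver.current_url,
            "source_length": len(driver.page_source),
        }
    finally:
        # quit() closes the browser and stops the driver service,
        # even if reading one of the properties raised an error.
        driver.quit()


# Usage with a real browser (needs selenium and a matching driver):
# from selenium import webdriver
# driver = webdriver.Chrome()
# driver.get("http://www.csdn.net")
# print(page_summary(driver))
```

Calling quit() in a finally block matters because, as the quit method shown earlier makes clear, it is what stops the driver service; without it the driver process can be left running.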