Getting started with chromedp: simulating a browser from Golang

Published: 2023/12/20

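Background

A recent project needed a headless browser. I had used Selenium from Python before and wanted to try a Go equivalent this time. I hit a pitfall right at the start: some of the code found online targets old versions of chromedp and does not work with the current one. These notes record what I learned.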

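What is chromedp?

PhantomJS, once a widely used headless browser solution, has announced that it is no longer maintained, and headless Chrome is the recommended replacement.

So what is headless Chrome? It is Chrome running without a user interface: your program can use the features Chrome supports without opening a browser window. It renders the target page like any other modern browser, and it can take screenshots of pages, read cookies, fetch HTML, and so on.

Using headless Chrome from a Golang program requires an open-source library. Several libraries can talk to headless Chrome; this article uses chromedp, whose interface resembles Selenium's and is easy to pick up.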

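Normal mode

Normal mode opens a browser window on your machine, so you can watch the effect of your code in the browser. The browser has to be closed once the run is finished.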

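Chrome headless mode

Chrome headless mode does not open a browser window. If you run go run main.go repeatedly and a run is interrupted, the headless Chrome process in the background may fail to exit, and the next local debugging run then fails. The fix is to end the leftover chrome process by hand.
For that reason, headless mode is not recommended while debugging Go code.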

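Some browser flags

--no-first-run skips the first-run setup
--no-default-browser-check skips the default-browser check
--disable-gpu disables GPU acceleration; servers usually have no graphics card
--remote-debugging-port sets the port of the Chrome debugging interface (the author gives 9222 as the default used by Golang chromedp and recommends not changing it)
--no-sandbox turns off the sandbox, which can reduce resource use on the server but lowers its security; use it together with --remote-debugging-address=127.0.0.1
--disable-plugins disables Chrome plugins
--remote-debugging-address sets the address the debugging interface listens on; 0.0.0.0 allows access from outside but is less secure, so the default 127.0.0.1 is recommended
--window-size sets the window size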
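
To see how these flags fit together on a command line, here is a small sketch that builds the argument list for starting Chrome by hand. The function chromeArgs and its parameters are made up for illustration, and the values (port 9222, a 1280x800 window) are assumptions, not recommendations from the article.

```go
package main

import (
	"fmt"
	"strings"
)

// chromeArgs builds an argument list from the flags listed above.
// The function name and parameters are illustrative only.
func chromeArgs(port, width, height int, headless bool) []string {
	args := []string{
		"--no-first-run",
		"--no-default-browser-check",
		"--disable-gpu",
		"--remote-debugging-address=127.0.0.1",
		fmt.Sprintf("--remote-debugging-port=%d", port),
		fmt.Sprintf("--window-size=%d,%d", width, height),
	}
	if headless {
		args = append(args, "--headless")
	}
	return args
}

func main() {
	// Print the arguments as they would appear after the Chrome binary name.
	fmt.Println(strings.Join(chromeArgs(9222, 1280, 800, true), " "))
}
```

When Chrome is launched through chromedp instead, the same flags are set as allocator options rather than as raw strings.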

Using them in code:

    opts := append(chromedp.DefaultExecAllocatorOptions[:],
        // false disables headless mode, so a browser window is shown
        chromedp.Flag("headless", false),
        // send traffic through a proxy
        chromedp.ProxyServer("http://10.10.1.1:21869"),
        // false leaves audio on; pass true to mute it
        chromedp.Flag("mute-audio", false),
    )
    allocCtx, cancel := chromedp.NewExecAllocator(context.Background(), opts...)
    defer cancel()

Common methods:

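Get familiar with the most commonly used functions:
chromedp.NewContext() creates the chromedp context; later operations on the page all use this context
chromedp.Run() runs a sequence of actions in Chrome
chromedp.Navigate() navigates the browser to a page
chromedp.WaitVisible() waits until an element is visible before continuing
chromedp.Click() simulates a mouse click on an element
chromedp.Value() reads an element's value
chromedp.ActionFunc() runs a custom function as an action on the current page
chromedp.Text() reads an element's text
chromedp.Evaluate() evaluates a piece of JavaScript, like typing it into the console
network.SetExtraHTTPHeaders() adds extra headers to outgoing requests
chromedp.SendKeys() simulates keyboard input into an element
chromedp.Nodes() finds the elements matching a selector (for example an XPath) and stores them in a slice
chromedp.NewRemoteAllocator connects to an already running browser through its remote debugging URL
chromedp.OuterHTML() reads an element's outer HTML
chromedp.Screenshot() takes a screenshot of an element
page.CaptureScreenshot() captures a screenshot of the page
chromedp.Submit() submits a form
chromedp.WaitNotPresent() waits until an element is no longer present, for example a "Searching..." indicator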
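
chromedp.NewRemoteAllocator needs the address of a browser that is already running with remote debugging enabled. As a rough sketch, that address can be read from Chrome's /json/version endpoint, which returns a JSON document with a webSocketDebuggerUrl field. The helper parseDebuggerURL, the versionInfo type, and the address http://127.0.0.1:9222 are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"log"
	"net/http"
)

// versionInfo holds the fields of the /json/version response used here.
type versionInfo struct {
	Browser              string `json:"Browser"`
	WebSocketDebuggerURL string `json:"webSocketDebuggerUrl"`
}

// parseDebuggerURL extracts the WebSocket debugger URL from the JSON body
// returned by /json/version. The helper name is illustrative.
func parseDebuggerURL(body []byte) (string, error) {
	var v versionInfo
	if err := json.Unmarshal(body, &v); err != nil {
		return "", err
	}
	if v.WebSocketDebuggerURL == "" {
		return "", errors.New("no webSocketDebuggerUrl in response")
	}
	return v.WebSocketDebuggerURL, nil
}

func main() {
	// Assumes Chrome was started with --remote-debugging-port=9222.
	resp, err := http.Get("http://127.0.0.1:9222/json/version")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	wsURL, err := parseDebuggerURL(body)
	if err != nil {
		log.Fatal(err)
	}
	// This URL is what would be handed to chromedp.NewRemoteAllocator.
	fmt.Println(wsURL)
}
```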

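In short: 1. set the options and launch the browser; 2. the browser performs the actions you specified. Let's go straight to a code example.

Example: launch the browser and visit a site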

    package main

    import (
        "context"
        "log"
        "time"

        "github.com/chromedp/chromedp"
    )

    func main() {
        // Disable headless mode so the browser window is visible.
        opts := append(chromedp.DefaultExecAllocatorOptions[:],
            chromedp.Flag("headless", false),
        )
        allocCtx, cancel := chromedp.NewExecAllocator(context.Background(), opts...)
        defer cancel()

        // Create a browser context on top of the allocator.
        ctx, cancel := chromedp.NewContext(allocCtx,
            chromedp.WithLogf(log.Printf),
        )
        defer cancel()

        // Give the whole run a timeout.
        ctx, cancel = context.WithTimeout(ctx, 5*time.Second)
        defer cancel()

        // Navigate to a page, wait for it to render, then type into an element.
        sel := `//*[@id="username"]`
        err := chromedp.Run(ctx,
            chromedp.Navigate(`https://github.com/awake1t`),
            chromedp.WaitVisible("body"),
            // Pause briefly.
            chromedp.Sleep(2*time.Second),
            // BySearch lets the selector be an XPath expression.
            chromedp.SendKeys(sel, "username", chromedp.BySearch),
        )
        if err != nil {
            log.Fatal(err)
        }
        log.Println("done")
    }

Visit a site and take a screenshot

    package main

    import (
        "context"
        "log"
        "math"
        "os"
        "time"

        "github.com/chromedp/cdproto/emulation"
        "github.com/chromedp/cdproto/page"
        "github.com/chromedp/chromedp"
    )

    func main() {
        // Disable headless mode so the browser window is visible.
        opts := append(chromedp.DefaultExecAllocatorOptions[:],
            chromedp.Flag("headless", false),
        )
        allocCtx, cancel := chromedp.NewExecAllocator(context.Background(), opts...)
        defer cancel()

        // Create a browser context on top of the allocator.
        ctx, cancel := chromedp.NewContext(allocCtx,
            chromedp.WithLogf(log.Printf),
        )
        defer cancel()

        // Give the whole run a timeout.
        ctx, cancel = context.WithTimeout(ctx, 15*time.Second)
        defer cancel()

        // Capture the whole page. The format defaults to png; the quality
        // value only matters for jpeg output.
        var buf []byte
        if err := chromedp.Run(ctx, fullScreenshot(`https://github.com/awake1t`, 90, &buf)); err != nil {
            log.Fatal(err)
        }
        if err := os.WriteFile("./Screenshot.png", buf, 0644); err != nil {
            log.Fatal(err)
        }
        log.Println("screenshot written")
    }

    // fullScreenshot takes a screenshot of the entire page.
    // Liberally copied from puppeteer's source.
    // Note: this will override the viewport emulation settings.
    // Newer chromedp releases also ship a built-in chromedp.FullScreenshot action.
    func fullScreenshot(urlstr string, quality int64, res *[]byte) chromedp.Tasks {
        return chromedp.Tasks{
            chromedp.Navigate(urlstr),
            chromedp.ActionFunc(func(ctx context.Context) error {
                // Get the layout metrics. The number of values returned by
                // GetLayoutMetrics has changed across cdproto releases, so
                // adjust the blank identifiers to the version you use.
                _, _, contentSize, err := page.GetLayoutMetrics().Do(ctx)
                if err != nil {
                    return err
                }

                width, height := int64(math.Ceil(contentSize.Width)), int64(math.Ceil(contentSize.Height))

                // Force viewport emulation to the full content size.
                err = emulation.SetDeviceMetricsOverride(width, height, 1, false).
                    WithScreenOrientation(&emulation.ScreenOrientation{
                        Type:  emulation.OrientationTypePortraitPrimary,
                        Angle: 0,
                    }).
                    Do(ctx)
                if err != nil {
                    return err
                }

                // Capture the screenshot, clipped to the content area.
                *res, err = page.CaptureScreenshot().
                    WithQuality(quality).
                    WithClip(&page.Viewport{
                        X:      contentSize.X,
                        Y:      contentSize.Y,
                        Width:  contentSize.Width,
                        Height: contentSize.Height,
                        Scale:  1,
                    }).Do(ctx)
                if err != nil {
                    return err
                }
                return nil
            }),
        }
    }

More examples:

https://github.com/chromedp/examples

Official documentation:

https://godoc.org/github.com/chromedp/chromedp
