A problem with parsing HTTPS pages using Jsoup
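For parsing HTTPS pages with Jsoup, a piece of source code circulating online does not produce a successful request when run: the response is a "Request Rejected" page rather than the site's content.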
    import java.net.MalformedURLException;
    import java.net.URL;
    import java.security.SecureRandom;
    import java.security.cert.CertificateException;
    import java.security.cert.X509Certificate;
    import java.util.Map;

    import javax.net.ssl.HostnameVerifier;
    import javax.net.ssl.HttpsURLConnection;
    import javax.net.ssl.SSLContext;
    import javax.net.ssl.SSLSession;
    import javax.net.ssl.X509TrustManager;

    import org.jsoup.Connection;
    import org.jsoup.helper.HttpConnection;
    import org.jsoup.nodes.Document;

    public class HTTPCommonUtil {

        // Disables hostname and certificate checks for every HttpsURLConnection in the JVM.
        // This is insecure and only suitable for experiments.
        public static void trustEveryone() {
            try {
                HttpsURLConnection.setDefaultHostnameVerifier(new HostnameVerifier() {
                    public boolean verify(String hostname, SSLSession session) {
                        return true;
                    }
                });
                SSLContext context = SSLContext.getInstance("TLS");
                context.init(null, new X509TrustManager[] { new X509TrustManager() {
                    public void checkClientTrusted(X509Certificate[] chain, String authType)
                            throws CertificateException {
                    }

                    public void checkServerTrusted(X509Certificate[] chain, String authType)
                            throws CertificateException {
                    }

                    public X509Certificate[] getAcceptedIssuers() {
                        return new X509Certificate[0];
                    }
                } }, new SecureRandom());
                HttpsURLConnection.setDefaultSSLSocketFactory(context.getSocketFactory());
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

        // Returns the response headers plus the page title, or null on failure.
        public static Object getHttpHeaders(URL url, int timeout) {
            try {
                trustEveryone();
                Connection conn = HttpConnection.connect(url);
                conn.timeout(timeout);
                conn.header("Accept-Encoding", "gzip,deflate,sdch");
                conn.header("Connection", "close");
                // Reuse the parsed Document instead of parsing the response again.
                Document doc = conn.get();
                // String result = conn.response().body();
                Map<String, String> result = conn.response().headers();
                result.put("title", doc.title());
                return result;
            } catch (Exception e) {
                e.printStackTrace();
            }
            return null;
        }

        public static void main(String[] args) {
            try {
                URL url = new URL("https", "www.icbc-axa.com", -1, "");
                System.out.println(getHttpHeaders(url, 10000));
            } catch (MalformedURLException e) {
                e.printStackTrace();
            }
        }
    }

Output:

    {Content-Length=187, Connection=close, Pragma=no-cache, Cache-Control=no-cache, title=Request Rejected}
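A better solution is still needed. HTTPS access does work with the JDK's built-in HttpsURLConnection; in fact, the trustEveryone() function in the code above also goes through HttpsURLConnection. I have not yet found an effective way to combine this with jsoup, so for now I am switching to htmlparser.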
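As a rough illustration of the HttpsURLConnection route mentioned above, the sketch below fetches a page with JDK classes only and returns the markup as a string. It is not the original author's code: the class name HttpsFetchSketch, the helpers readAll and fetch, the User-Agent value, and the UTF-8 charset are all assumptions made for the example, and it has not been checked against the site used in this post.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

import javax.net.ssl.HttpsURLConnection;

// Hypothetical helper class; every name here is illustrative.
class HttpsFetchSketch {

    // Reads the whole stream, closes it, and decodes the bytes with the given charset.
    static String readAll(InputStream in, Charset charset) {
        try (InputStream input = in;
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = input.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Fetches the response body over HTTPS using the JVM's default trust settings.
    static String fetch(URL url, int timeoutMillis) throws IOException {
        HttpsURLConnection conn = (HttpsURLConnection) url.openConnection();
        conn.setConnectTimeout(timeoutMillis);
        conn.setReadTimeout(timeoutMillis);
        // Assumption: a browser-like User-Agent; the original code sets none.
        conn.setRequestProperty("User-Agent", "Mozilla/5.0");
        try {
            // Assumption: the page is UTF-8 encoded.
            return readAll(conn.getInputStream(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        URL url = new URL("https", "www.icbc-axa.com", -1, "");
        String html = fetch(url, 10000);
        System.out.println(html.length());
    }
}
```

With jsoup on the classpath, the returned string can then be handed to Jsoup.parse(html, url.toString()) to get a Document, which keeps the network access in HttpsURLConnection and uses jsoup only as a parser.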