Writing user-information scraping code in Java: a Java-based Zhihu crawler that scrapes basic Zhihu user information...
[1]. [Code] Java code
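Author: 臥顏沉默
Link: https://www.zhihu.com/question/36909173/answer/97643000
Source: Zhihu
Copyright belongs to the author. For commercial reproduction, contact the author for authorization; for non-commercial reproduction, credit the source.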
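// Imports for this excerpt (Apache HttpClient 4.x). HttpClientUtil is the author's own
// helper class and is not shown here. JSONObject is assumed to be org.json.JSONObject.
import java.io.UnsupportedEncodingException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import org.apache.http.NameValuePair;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.client.protocol.HttpClientContext;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.message.BasicNameValuePair;
import org.json.JSONObject;

/**
 * Logs in to Zhihu with an email address and password.
 *
 * @param httpClient HTTP client
 * @param context HTTP context
 * @return true if the login succeeded, false otherwise
 */
public boolean login(CloseableHttpClient httpClient, HttpClientContext context) {
    // Load the sign-in page before posting the login form
    HttpGet getRequest = new HttpGet("https://www.zhihu.com/#signin");
    HttpClientUtil.getWebPage(httpClient, context, getRequest, "utf-8", false);
    HttpPost request = new HttpPost("https://www.zhihu.com/login/email");
    List<NameValuePair> formParams = new ArrayList<>();
    // The captcha is read by eye and typed in by the user
    String yzm = yzm(httpClient, context, "https://www.zhihu.com/captcha.gif?type=login");
    formParams.add(new BasicNameValuePair("captcha", yzm));
    formParams.add(new BasicNameValuePair("_xsrf", "")); // this parameter can be omitted
    formParams.add(new BasicNameValuePair("email", "your email"));
    formParams.add(new BasicNameValuePair("password", "your password"));
    formParams.add(new BasicNameValuePair("remember_me", "true"));
    UrlEncodedFormEntity entity;
    try {
        entity = new UrlEncodedFormEntity(formParams, "utf-8");
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
        return false; // the form body could not be built, so do not send the request
    }
    request.setEntity(entity);
    // Submit the login form
    String loginState = HttpClientUtil.getWebPage(httpClient, context, request, "utf-8", false);
    JSONObject jo = new JSONObject(loginState);
    if ("0".equals(jo.get("r").toString())) {
        System.out.println("Login succeeded");
        getRequest = new HttpGet("https://www.zhihu.com");
        HttpClientUtil.getWebPage(httpClient, context, getRequest, "utf-8", false); // visit the home page
        // Serialize the Zhihu cookies so the next login can reuse them directly
        HttpClientUtil.serializeObject(context.getCookieStore(), "resources/zhihucookies");
        return true;
    } else {
        System.out.println("Login failed: " + loginState);
        return false;
    }
}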
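The login method persists the cookie store through HttpClientUtil.serializeObject, but that helper is not shown in this excerpt. Below is a minimal sketch of what such a helper might look like, using only java.io object serialization. The class name ObjectFiles and both method signatures are hypothetical and are not the author's code. The sketch assumes the object passed in implements Serializable, which HttpClient 4.x's BasicCookieStore does.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not the author's HttpClientUtil.
class ObjectFiles {

    /** Writes a Serializable object to the given file, creating parent directories as needed. */
    static void serializeObject(Object object, Path file) throws IOException {
        Path parent = file.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        try (OutputStream out = Files.newOutputStream(file);
             ObjectOutputStream oos = new ObjectOutputStream(out)) {
            oos.writeObject(object);
        }
    }

    /** Reads back an object previously written by serializeObject. */
    static Object deserializeObject(Path file) throws IOException, ClassNotFoundException {
        try (InputStream in = Files.newInputStream(file);
             ObjectInputStream ois = new ObjectInputStream(in)) {
            return ois.readObject();
        }
    }
}
```

On a later run, the restored object could be cast to CookieStore and installed with context.setCookieStore(...) before the first request, which may avoid the captcha step for as long as the saved cookies are still accepted. Java deserialization should only be used on files the program wrote itself.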
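/**
 * Reads the captcha by eye: downloads the image, then waits for the user to type its text.
 *
 * @param httpClient HTTP client
 * @param context HTTP context
 * @param url captcha URL
 * @return the captcha text typed by the user
 */
public String yzm(CloseableHttpClient httpClient, HttpClientContext context, String url) {
    // Save the captcha image as d:/test/1.gif so it can be opened and read
    HttpClientUtil.downloadFile(httpClient, context, url, "d:/test/", "1.gif", true);
    // The Scanner is not closed on purpose: closing it would also close System.in
    Scanner sc = new Scanner(System.in);
    // Trim stray whitespace from the typed answer
    return sc.nextLine().trim();
}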
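The yzm method hard-codes d:/test/ and reads from System.in directly, which ties it to one machine and makes it hard to test. The sketch below shows one way the same manual step might be structured: the image bytes go to a caller-chosen directory, and the answer is read from whatever InputStream is passed in. CaptchaPrompt, saveImage, and readAnswer are hypothetical names that do not appear in the original crawler. Fetching the image bytes over HTTP is left to the existing client code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Scanner;

// Hypothetical helper illustrating the manual captcha step; not part of the original crawler.
class CaptchaPrompt {

    /** Saves the captcha image into dir under fileName, replacing any previous image. */
    static Path saveImage(InputStream image, Path dir, String fileName) throws IOException {
        Files.createDirectories(dir);
        Path target = dir.resolve(fileName);
        Files.copy(image, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }

    /** Reads one line typed by the user and strips surrounding whitespace. */
    static String readAnswer(InputStream input) {
        // The Scanner is deliberately not closed, so a System.in argument stays open.
        Scanner scanner = new Scanner(input, StandardCharsets.UTF_8.name());
        return scanner.hasNextLine() ? scanner.nextLine().trim() : "";
    }
}
```

In the crawler, the answer would then be obtained with CaptchaPrompt.readAnswer(System.in), while a test can pass a ByteArrayInputStream instead of waiting for keyboard input.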