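If you see a message like this, your IP has been blacklisted.
So even if your access is not malicious (one request every few seconds hardly counts, does it?), you have to go through a proxy.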
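import org.openqa.selenium.Proxy;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;

// Create the Chrome options (headless mode is left disabled here)
ChromeOptions options = new ChromeOptions();
// options.addArguments("--headless");
String proxyServer = "93.170.6.26:8080";
// Route both HTTP and HTTPS traffic through the proxy
Proxy proxy = new Proxy().setHttpProxy(proxyServer).setSslProxy(proxyServer);
options.setProxy(proxy);
WebDriver wd = new ChromeDriver(options);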
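Done. Now the site can be accessed happily again.
If you need to configure Chrome further, for example to set the download path or to skip downloading images, you can pass custom options.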
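As a minimal sketch of such custom options, the helper below builds a preference map for a download directory and for blocking images. The class name `ChromePrefs`, the method `build`, and the path `/tmp/downloads` are made up for illustration. The preference keys are the ones commonly used with Chrome through Selenium, so treat them as an assumption and check them against your Chrome version.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not part of Selenium): collects Chrome preferences in a plain map.
final class ChromePrefs {
    static Map<String, Object> build(String downloadDir, boolean loadImages) {
        Map<String, Object> prefs = new HashMap<>();
        // Directory where Chrome saves downloaded files.
        prefs.put("download.default_directory", downloadDir);
        // Save directly instead of asking where to put each file.
        prefs.put("download.prompt_for_download", false);
        // Assumed content-setting values: 1 = allow images, 2 = block images.
        prefs.put("profile.managed_default_content_settings.images", loadImages ? 1 : 2);
        return prefs;
    }
}
```

With Selenium on the classpath, the map would be handed over with `options.setExperimentalOption("prefs", ChromePrefs.build("/tmp/downloads", false));` before the `ChromeDriver` is created.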
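My own impression from searching on Baidu is that Python dominates the search results. Java is not a great way to write crawlers.
There is one more question: if the proxy IP stops working, how do you swap it out dynamically?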
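One possible approach, sketched here under some assumptions: keep a list of candidate proxies, probe each one with a plain TCP connect, and when the current proxy stops responding, quit the driver and create a new one with the next candidate that answers. Chrome takes its proxy setting when the session starts, which is why the driver is recreated. The class `ProxyPool`, its methods, and the timeout value are hypothetical. A successful TCP connect only shows that the port is open, not that the proxy actually forwards traffic.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical helper: rotates through "host:port" proxy candidates.
final class ProxyPool {
    private final List<String> candidates;
    private final int timeoutMillis;
    private int next = 0;

    ProxyPool(List<String> candidates, int timeoutMillis) {
        this.candidates = new ArrayList<>(candidates);
        this.timeoutMillis = timeoutMillis;
    }

    // True if a TCP connection to "host:port" succeeds within the timeout.
    static boolean isReachable(String hostPort, int timeoutMillis) {
        int colon = hostPort.lastIndexOf(':');
        if (colon <= 0) {
            return false; // not in host:port form
        }
        try (Socket socket = new Socket()) {
            String host = hostPort.substring(0, colon);
            int port = Integer.parseInt(hostPort.substring(colon + 1));
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException | IllegalArgumentException e) {
            // Refused, timed out, unresolvable host, or malformed port.
            return false;
        }
    }

    // Returns the next reachable candidate, trying each one at most once per call.
    Optional<String> nextWorking() {
        for (int tried = 0; tried < candidates.size(); tried++) {
            String candidate = candidates.get(next);
            next = (next + 1) % candidates.size();
            if (isReachable(candidate, timeoutMillis)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

With Selenium, the returned address would go into the same `new Proxy().setHttpProxy(...).setSslProxy(...)` call used earlier, after calling `wd.quit()` on the old driver and before constructing a new `ChromeDriver`.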
For example, a run can fail with an exception like this:
Caused by: java.net.ConnectException: Failed to connect to localhost/0:0:0:0:0:0:0:1:6012
at okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.java:247)
at okhttp3.internal.connection.RealConnection.connect(RealConnection.java:165)
at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:257)
at okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:135)
at okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:114)
at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:126)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:200)
at okhttp3.RealCall.execute(RealCall.java:77)
at org.openqa.selenium.remote.internal.OkHttpClient.execute(OkHttpClient.java:105)
at org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:155)
at org.openqa.selenium.remote.service.DriverCommandExecutor.execute(DriverCommandExecutor.java:83)
... 34 more
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)