The requirement: a database table holds nearly a thousand URL records, and each URL points to an image. I need to request each URL and save the image locally. At first I wrote this (pseudocode):
    import requests

    # urls and save_image() are placeholders (pseudocode)
    for url in urls:
        try:
            r = requests.get(url).content
            save_image(r)
        except Exception as e:
            print(e)
However, when I ran this on the server, every other request or so failed with an error similar to the following:
HTTPConnectionPool(host='wx.qlogo.cn', port=80):
Max retries exceeded with url: /mmopen/aTVWntpJLCAr2pichIUx8XMevb3SEbktTuLkxJLHWVTwGfkprKZ7rkEYDrKRr5icyDGIvU4iasoyRrqsffbe3UUQXT5EfMEbYKg/0 (
Caused by <class 'socket.error'>: [Errno 104] Connection reset by peer)
The likely cause is that I was sending requests too frequently, so the server closed some of the connections. I changed the code to pause between requests and to retry on failure:
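    import time

    import requests

    for url in urls:
        for i in range(10):  # at most 10 attempts per URL
            try:
                r = requests.get(url).content
            except Exception as e:
                if i >= 9:
                    do_some_log()  # 10th failure: give up and log it
                else:
                    time.sleep(0.5)  # wait before retrying
            else:
                time.sleep(0.1)  # short pause between successful requests
                save_image(r)  # save only when the request succeeded
                break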
The code is very simple, but it illustrates the general solution. Adding a delay between requests avoids most of the rejections.
Some requests are still rejected, so each rejected request is retried after a short wait.
After 10 failed attempts, the script gives up on that URL and records it in the log.
In practice, with a delay of 0.1 s between requests, the rejection rate dropped considerably.
No image needed more than 3 retries, and all the images were downloaded successfully.
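The retry loop above can also be pulled out into a small helper. The following is only a sketch under some assumptions: the name fetch_with_retry and its parameters are hypothetical, and it swaps the fixed 0.5 s wait for exponential backoff (0.5 s, 1 s, 2 s, ...), which is a common variation rather than what the code above does. The fetch function is passed in, so the helper does not depend on any particular HTTP library.

```python
import time


def fetch_with_retry(fetch, url, attempts=10, base_delay=0.5, max_delay=8.0,
                     sleep=time.sleep):
    """Call fetch(url), retrying with exponential backoff on any exception.

    Returns whatever fetch returns, or None once every attempt has failed.
    All names and defaults here are illustrative, not from the original post.
    """
    for attempt in range(attempts):
        try:
            return fetch(url)
        except Exception as exc:
            if attempt == attempts - 1:
                # Last attempt failed: report it and give up on this URL.
                print(f"giving up on {url}: {exc}")
                return None
            # Double the wait after each failure, capped at max_delay.
            sleep(min(max_delay, base_delay * 2 ** attempt))
```

With requests, it could be called as fetch_with_retry(lambda u: requests.get(u, timeout=10).content, url), keeping the 0.1 s pause between successful downloads in the outer loop.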