Tag Archives: Python Error

[Solved] Python Error: a bytes-like object is required, not ‘str’

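Core code (corrected for Python 3):

import csv

import requests
from bs4 import BeautifulSoup

def ipPools(numPage):
    headers = randomHeads()  # helper from the author's script (not shown); returns request headers
    url = 'http://www.xicidaili.com/nn/'
    # csv.writer writes str in Python 3, so open the file in text mode ('w').
    # Opening it with 'wb' is what raised the error in the title.
    saveFsvFile = open('ips.csv', 'w', newline='')
    writer = csv.writer(saveFsvFile)
    for num in range(1, numPage + 1):
        full_url = url + str(num)
        response = requests.get(full_url, headers=headers)
        soup = BeautifulSoup(response.text, 'lxml')
        res = soup.find(id="ip_list").find_all('tr')
        for item in res:
            try:
                temp = []
                tds = item.find_all('td')
                proxyIp = tds[1].text      # already str; do not encode to bytes
                proxyPort = tds[2].text
                temp.append(proxyIp)
                temp.append(proxyPort)
                writer.writerow(temp)
                print('Saved to CSV successfully!')
            except IndexError:
                # rows without enough <td> cells (such as the header row) are skipped
                pass
    saveFsvFile.close()

Points to note:
   In Python 3.6, csv.writer expects a file opened in text mode.
   Change open('ips.csv', 'wb') to open('ips.csv', 'w', newline=''). The 'wb' mode is where I got the error.
   Write the values as str. Calling str.encode("utf-8") turns them into bytes, which csv would then write as b'...' literals.
   If you run into the same error, use this as a reference.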
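
The difference between the two file modes can be reproduced without the crawler. The following is a minimal sketch, not the author's code: the helper names save_rows, load_rows and binary_mode_error are made up for illustration, and the sample row is arbitrary.

```python
import csv


def save_rows(path, rows):
    """Write rows to a CSV file opened in text mode, as csv.writer expects in Python 3."""
    with open(path, 'w', newline='', encoding='utf-8') as f:
        csv.writer(f).writerows(rows)


def load_rows(path):
    """Read the CSV file back as a list of lists of str."""
    with open(path, newline='', encoding='utf-8') as f:
        return list(csv.reader(f))


def binary_mode_error(path):
    """Try the same write with 'wb' and return the TypeError message, or None if it worked."""
    try:
        with open(path, 'wb') as f:
            csv.writer(f).writerow(['1.2.3.4', '80'])
    except TypeError as err:
        return str(err)
    return None
```

Calling binary_mode_error on any writable path returns the message from the title, while save_rows with the same data succeeds.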

[Solved] Python Error: /usr/bin/python^M: bad interpreter: No such file or directory

The cause is the ^M character, which is a carriage return (\r).

It comes from different line-ending conventions: .sh and .py files edited on Windows end each line with CRLF (\r\n), while Linux expects LF (\n). The extra \r is invisible in most editors, but on Linux it becomes part of the interpreter path on the first line of the script, so the system looks for /usr/bin/python^M and reports the error above.

Solution:

1) Conversion in Windows:

Use an editor such as UltraEdit or EditPlus to convert the script's line endings first, then copy it to Linux and run it. In UltraEdit: File -> Conversions -> DOS to UNIX.

2) Direct replacement under Linux

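sed -i 's/^M//g' filename

(Type ^M as Ctrl+V followed by Ctrl+M, not as a caret and the letter M. With GNU sed, sed -i 's/\r$//' filename does the same without the special keystroke.)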

3) It can also be converted in Linux

First, make sure that the file has executable permissions

#sh> chmod a+x filename

Then change the file format

#sh> vi filename

Use the following command to view the file format

:set ff or :set fileformat

You can see the following information

fileformat=dos or fileformat=unix

Use the following command to modify the file format

:set ff=unix or :set fileformat=unix

:wq (save and exit)

Finally, execute the file

#sh> ./filename
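
The same conversion can also be done from Python, which helps when several scripts need fixing at once. This is a minimal sketch rather than part of the original solution; the function name crlf_to_lf is made up for illustration, and it assumes the file is small enough to read into memory.

```python
from pathlib import Path


def crlf_to_lf(path):
    """Rewrite the file with Unix line endings; return True if anything changed."""
    target = Path(path)
    data = target.read_bytes()
    # Replace every Windows line ending (\r\n) with a Unix one (\n).
    fixed = data.replace(b"\r\n", b"\n")
    if fixed != data:
        target.write_bytes(fixed)
        return True
    return False
```

Working on bytes avoids any dependence on the file's text encoding.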

How to Solve Python Error: “HTTP Error 403: Forbidden”

Question:

When the following statement is executed

import urllib.request

def set_IPlsit():
    url = 'https://www.whatismyip.com/'
    response = urllib.request.urlopen(url)
    html = response.read().decode('utf-8')

The following exception occurred:

C:\Users\54353\AppData\Local\Programs\Python\Python36\python.exe "C:/Users/54353/PycharmProjects/untitled/crawler/pic.py"
Traceback (most recent call last):
  File "C:/Users/54353/PycharmProjects/untitled/crawler/pic.py", line 100, in <module>
    ip = set_IPlsit2()
  File "C:/Users/54353/PycharmProjects/untitled/crawler/pic.py", line 95, in set_IPlsit2
    response = ure.urlopen(url)
  File "C:\Users\54353\AppData\Local\Programs\Python\Python36\lib\urllib\request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Users\54353\AppData\Local\Programs\Python\Python36\lib\urllib\request.py", line 532, in open
    response = meth(req, response)
  File "C:\Users\54353\AppData\Local\Programs\Python\Python36\lib\urllib\request.py", line 642, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Users\54353\AppData\Local\Programs\Python\Python36\lib\urllib\request.py", line 570, in error
    return self._call_chain(*args)
  File "C:\Users\54353\AppData\Local\Programs\Python\Python36\lib\urllib\request.py", line 504, in _call_chain
    result = func(*args)
  File "C:\Users\54353\AppData\Local\Programs\Python\Python36\lib\urllib\request.py", line 650, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

Process finished with exit code 1

Analysis:

When a URL is opened with urllib.request.urlopen, the server receives only a bare request for the page. urllib identifies itself with a default User-Agent of the form Python-urllib/3.x, so the request carries none of the browser, operating system, or hardware platform information that a normal browser sends. Requests without such information usually come from automated access, such as crawlers.

To block this kind of access, some websites check the User-Agent in the request headers. If the User-Agent is missing or does not look like a browser, the request is rejected.

Solution:

Add a User-Agent header to the request. The code is as follows:

import urllib.request

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:23.0) Gecko/20100101 Firefox/23.0'}
req = urllib.request.Request(url=url, headers=headers)  # url is the address being fetched
html = urllib.request.urlopen(req).read().decode('utf-8')
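
The fix can be wrapped in a small helper that also reports the HTTP status when the server still refuses the request. This is a sketch under assumptions: the names DEFAULT_UA, build_request and fetch are made up for illustration, and a site may apply other checks besides the User-Agent.

```python
import urllib.error
import urllib.request

# Browser-like User-Agent string taken from the solution above.
DEFAULT_UA = ('Mozilla/5.0 (Windows NT 6.1; WOW64; rv:23.0) '
              'Gecko/20100101 Firefox/23.0')


def build_request(url, user_agent=DEFAULT_UA):
    """Return a Request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={'User-Agent': user_agent})


def fetch(url, timeout=10):
    """Return the decoded page, or None if the server answers with an HTTP error."""
    try:
        with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
            return response.read().decode('utf-8', errors='replace')
    except urllib.error.HTTPError as err:
        print('Request failed:', err.code, err.reason)
        return None
```

Keeping build_request separate makes it possible to inspect the headers before anything is sent.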

Python Error – (sklearn) ImportError: No module named cross_validation

from sklearn.cross_validation import train_test_split
ERROR:
ImportError: No module named sklearn.cross_validation

Solution:
This is caused by the renaming of the cross_validation submodule to model_selection (cross_validation was deprecated in scikit-learn 0.18 and removed in 0.20). Substitute cross_validation -> model_selection.

train_test_split is now in model_selection. Just type:

from sklearn.model_selection import train_test_split

Reference:
https://stackoverflow.com/questions/30667525/importerror-no-module-named-sklearn-cross-validation

How to Solve Python Error: HTTP Error 403: Forbidden (using proxies so the crawler's IP is not blocked)

When writing a crawler to collect data, we often run into the following message

HTTP Error 403: Forbidden

I previously wrote a note about rotating multiple headers, but that approach still uses a single IP; it only disguises the crawler as different browsers. To further reduce the chance of being blocked, the IP also has to be changed from time to time. This note records how to crawl through proxies in Python. PS: try not to crawl too frequently.

Go straight to the code:

import random
import urllib.request

# These are the proxy IPs I used at the time; replace them with ones that currently work.
proxy_list = [
    '202.106.169.142:80',
    '220.181.35.109:8080',
    '124.65.163.10:8080',
    '117.79.131.109:8080',
    '58.30.233.200:8080',
    '115.182.92.87:8080',
    '210.75.240.62:3128',
    '211.71.20.246:3128',
    '115.182.83.38:8080',
    '121.69.8.234:8080',
]

# Next, bind a randomly chosen proxy to urllib (urllib.request in Python 3, urllib2 in Python 2).
proxy = random.choice(proxy_list)
urlhandle = urllib.request.ProxyHandler({'http': proxy})
opener = urllib.request.build_opener(urlhandle)
urllib.request.install_opener(opener)

# Then use urllib as usual; listurl and headers are defined elsewhere in the crawler.
req = urllib.request.Request(listurl, headers=headers)
content = urllib.request.urlopen(req).read()

Notes from my experience crawling time.com and Douban movies:
– Free proxies are not very stable. If you crawl heavily or for a long time, it is worth paying a little for proxies; they are very cheap.
– If you do use free proxy IPs, choose high-anonymity ones. Recommend this site
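
Because free proxies fail often, it can help to try several of them before giving up. The following is a rough sketch under assumptions, not the author's code: the names fetch_via_proxy and fetch_with_rotation and the max_tries parameter are made up for illustration.

```python
import random
import urllib.request


def fetch_via_proxy(url, proxy, timeout=10):
    """Fetch url through one HTTP proxy given as 'host:port'."""
    handler = urllib.request.ProxyHandler({'http': proxy})
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=timeout) as response:
        return response.read()


def fetch_with_rotation(url, proxies, fetch=fetch_via_proxy, max_tries=3):
    """Try up to max_tries distinct proxies in random order.

    Returns (proxy, body) for the first proxy that works, or None if all fail.
    """
    candidates = random.sample(proxies, k=min(max_tries, len(proxies)))
    for proxy in candidates:
        try:
            return proxy, fetch(url, proxy)
        except OSError:
            # urllib.error.URLError and socket timeouts are OSError subclasses.
            continue
    return None
```

The fetch parameter is there so the rotation logic can be exercised with a stand-in function instead of real network calls.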

How to Solve Python Error: python __file__ is not defined

 

__file__ is a variable that Python sets when a module is imported. In some contexts, such as an interactive session, __file__ is not defined and cannot be used. So how do you get the path of the current file?

Method 1

import inspect, os.path

filename = inspect.getframeinfo(inspect.currentframe()).filename
path     = os.path.dirname(os.path.abspath(filename))

Method 2

import inspect
import os

os.path.abspath(inspect.getsourcefile(lambda:0))

Link: https://stackoverflow.com/questions/2632199/how-do-i-get-the-path-of-the-current-executed-file-in-python/18489147#18489147
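
The two approaches can be combined so that __file__ is used when it exists and the inspect-based lookup is only a fallback. This is a small sketch; the function name current_dir is made up for illustration.

```python
import inspect
import os


def current_dir():
    """Return the directory of the current file, falling back to inspect if __file__ is missing."""
    try:
        return os.path.dirname(os.path.abspath(__file__))
    except NameError:
        # __file__ is not defined here; ask the current frame for its filename instead.
        filename = inspect.getframeinfo(inspect.currentframe()).filename
        return os.path.dirname(os.path.abspath(filename))
```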

How to Solve Python Error: slice indices must be integers or None or have an __index__ method

My code:

The content is treated as a string

content[len(content)/2:len(content)/2+5]

Error:

TypeError: slice indices must be integers or None or have an __index__ method

After some searching, I found the cause: in Python 3 the "/" operator always returns a floating-point number, even when both operands are integers, and a float cannot be used as a slice index. Change "/" to "//" (integer division) and the code runs.
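
A short illustration of the difference, using an arbitrary sample string (the variable names mid and snippet are only for this example):

```python
content = "abcdefghijklmnop"

# "/" yields a float even though both operands are integers.
half_float = len(content) / 2

# "//" yields an int, which is valid as a slice index.
mid = len(content) // 2
snippet = content[mid:mid + 5]
```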

Python Error: TypeError: takes 2 positional arguments but 3 were given

Error:

Today, while writing a simple Python class definition, I ran into this error: TypeError: drive() takes 2 positional arguments but 3 were given

The code is as follows

class Car:
    speed = 0
    def drive(self,distance):
        time = distance/self.speed
        print(time)

bike = Car()
bike.speed=60
bike.drive(60,80)

After investigation, the cause turned out to be the self parameter in the method definition def drive(self, distance).

Now let's take a brief look at self in Python.

self refers to the instance on which the method is called, so attributes can be bound to the instance through self. When you call a method on an instance, you pass only the arguments that follow self in the definition; the Python interpreter passes the instance as self automatically. The call bike.drive(60, 80) therefore supplies three arguments (bike, 60 and 80) to a method that accepts only two.

So there are two solutions.

Method 1: pass only one argument. If you want to pass two arguments, see Method 2.

class Car:
    speed = 0
    def drive(self,distance):
        time = distance/self.speed
        print(time)

bike = Car()
bike.speed=60
bike.drive(80)

Method 2:

class Car:
    speed = 0
    def drive(self,distance,speed):
        time = distance/speed
        print(time)
bike = Car()
bike.drive(80,50)
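
To see where the third argument comes from, compare a call on the instance with the equivalent call on the class. This is an illustrative sketch based on the Car class above; drive returns the value instead of printing it so the results can be compared.

```python
class Car:
    speed = 0

    def drive(self, distance):
        return distance / self.speed


bike = Car()
bike.speed = 60

# These two calls are equivalent: Python passes bike as self in the first one.
via_instance = bike.drive(120)
via_class = Car.drive(bike, 120)

# Passing two values on the instance means three arguments reach drive().
try:
    bike.drive(60, 80)
except TypeError as err:
    message = str(err)
```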

Python Error: Exception Value: can only concatenate str (not “bytes”) to str

Source code that raised the error:

#Receive request data
def search(request):
    request.encoding = 'utf-8'
    if 'q' in request.GET:
        message = 'You searched for: ' +request.GET['q'].encode('utf-8')
    else:
        message = 'You submitted an empty form'
    return HttpResponse(message)

In the line that builds message inside the if branch (marked in red in the original post), the encode function is used to transcode. encode returns data of type bytes, which cannot be concatenated directly with data of type str.

Since the request's encoding is already set in the first statement of the function, we remove the encode call here, and the error is resolved.

The updated code is:

#Receive request data
def search(request):
    request.encoding = 'utf-8'
    if 'q' in request.GET:
        message = 'You searched for: ' +request.GET['q']
    else:
        message = 'You submitted an empty form'
    return HttpResponse(message)
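
The same error can be reproduced outside Django with plain strings. This is a minimal sketch; the variable names and the sample query are made up for illustration.

```python
prefix = 'You searched for: '
query = 'python'

# str + bytes raises TypeError, as in the view above.
try:
    prefix + query.encode('utf-8')
except TypeError as err:
    error_text = str(err)

# Either concatenate the str directly...
message = prefix + query

# ...or, if the value really is bytes, decode it back to str first.
from_bytes = prefix + query.encode('utf-8').decode('utf-8')
```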