Tag Archives: Scrapy

Detailed solution to zsh: command not found: scrapy (adding the Scrapy environment variable to zsh on macOS)

Background: originally, I planned to create a crawler project with Scrapy, but it showed zsh: command not found: scrapy. After reading many blogs, I solved the problem and decided to record it.

Main reference blogs:

    https://www.jianshu.com/p/51196153f804

    https://stackoverflow.com/questions/17607944/scrapy-not-installed-correctly-on-mac

Problem analysis:

When I reinstalled Scrapy, pip showed:

WARNING: The script scrapy is installed in 
'/Library/Frameworks/Python.framework/Versions/3.9/bin' which is not on PATH.
  Consider adding this directory to PATH 
  or, if you prefer to suppress this warning, use --no-warn-script-location.

Note that /Library/Frameworks/Python.framework/Versions/3.9/bin is not on PATH. We need to add this directory to the PATH environment variable ("Consider adding this directory to PATH").

Solution:

Step 1: add source ~/.bash_profile at the end of the .zshrc file

    Open Finder, press Command + Shift + G, enter ~/.zshrc to open the .zshrc file, and then write source ~/.bash_profile at the end of the file.

    Press Command + S to save.

    Open the terminal and enter source ~/.zshrc to apply the file.
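If you prefer the terminal over Finder, Step 1 can also be done with two commands (a sketch, assuming your zsh config lives at the default ~/.zshrc):

    # Append the source line to ~/.zshrc (same effect as editing it in Finder)
    echo 'source ~/.bash_profile' >> ~/.zshrc
    # Reload the config in the current shell
    source ~/.zshrc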

Step 2: add the environment variable to the .bash_profile file

    Open Finder, press Command + Shift + G, enter ~/.bash_profile, and open the .bash_profile file.

    Write on the last line:

    export PATH="/Library/Frameworks/Python.framework/Versions/3.9/bin:$PATH"

    Be careful ⚠️: the number after Versions is the Python version number and should be modified to match your own Python version. If your version is 2.7, change it to:

    export PATH="/Library/Frameworks/Python.framework/Versions/2.7/bin:$PATH"

    Press Command + S to save.

    Open the terminal and enter source ~/.bash_profile to apply the file.

    Finally, you can enter echo $PATH in the terminal to check whether the environment variable was added.

    You can see that /Library/Frameworks/Python.framework/Versions/2.7/bin has been added (along with many others).
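For reference, all of Step 2 can also be done from the terminal (a sketch, assuming the 3.9 path from the warning above; substitute your own version):

    # Append the PATH export to ~/.bash_profile
    echo 'export PATH="/Library/Frameworks/Python.framework/Versions/3.9/bin:$PATH"' >> ~/.bash_profile
    # Reload, then check that the directory now appears in PATH
    source ~/.bash_profile
    echo $PATH | tr ':' '\n' | grep Python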

    Finally, enter scrapy and you can finally use it!!!
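A quick way to confirm the command now resolves: which scrapy shows the path zsh found, and scrapy version prints the installed version.

    which scrapy
    scrapy version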

Spider Error: Scrapy Request Timeout [How to Solve]

Previously, I handled timeout exceptions in the downloader middleware, but it was always laborious.

Today, I checked the documentation and found that they can be handled in an errback callback:

import scrapy
from scrapy.spidermiddlewares.httperror import HttpError
from twisted.internet.error import DNSLookupError
from twisted.internet.error import TimeoutError, TCPTimedOutError


# when building the request, point errback at the handler below
yield scrapy.Request(url=full_url, errback=self.error_httpbin, dont_filter=True,
                     callback=self.parse_list, meta={"hd": header})


def error_httpbin(self, failure):
    # failure.request is the Request object; if you need to retry, just yield it
    # if failure.check(HttpError):
    #     # these exceptions come from the HttpError spider middleware
    #     # you can get the non-200 response
    #     response = failure.value.response
    #     self.logger.error('HttpError on %s', response.url)

    if failure.check(DNSLookupError):
        print("DNSLookupError------->")
        # this is the original request
        request = failure.request
        yield request
        # self.logger.error('DNSLookupError on %s', request.url)
    elif failure.check(TimeoutError, TCPTimedOutError):
        print("timeout------->")
        request = failure.request
        yield request
        # self.logger.error('TimeoutError on %s', request.url)
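One caveat worth noting: yielding the failed request back unconditionally can retry forever if the site keeps timing out. A minimal sketch of capping retries with a custom meta counter (the my_retry_count key and the limit of 3 are my own illustrative choices, not from the original post):

def error_httpbin(self, failure):
    if failure.check(DNSLookupError, TimeoutError, TCPTimedOutError):
        request = failure.request
        # my_retry_count is a custom meta key used only by this sketch
        retries = request.meta.get("my_retry_count", 0)
        if retries < 3:
            # re-issue a copy of the request with the counter bumped
            yield request.replace(
                dont_filter=True,
                meta={**request.meta, "my_retry_count": retries + 1},
            )
        else:
            self.logger.error("gave up on %s after %d retries", request.url, retries)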

Recording this here, since I had never handled timeout exceptions this way before.

Installing Scrapy on Windows: solving the installation error

The system is Windows 10, 64-bit.
Python is 3.5.2.
Today I ran pip install scrapy to install it, and it failed with:
Microsoft Visual C++ 14.0 is required

Microsoft Visual C++ 14.0 was actually already installed on the computer, but the installation still failed every time.

Later, the solution was to install from downloaded files:

1. Download the Scrapy installation file (a .whl wheel)

2. Install the wheel tool using the command pip install wheel

3. Switch CMD to the directory where the downloaded Scrapy file is located and install it with pip install followed by the downloaded .whl filename

4. You can run pip install scrapy again to check whether the installation succeeded (it should report the requirement as already satisfied); the steps are sketched together below.
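Putting the steps together as a CMD sketch (the folder and the wheel filename are illustrative; use the file you actually downloaded):

    :: step 2: install the wheel tool
    pip install wheel
    :: step 3: switch to the folder holding the downloaded wheel (path is illustrative)
    cd C:\Users\me\Downloads
    pip install Scrapy-1.1.1-py2.py3-none-any.whl
    :: step 4: this should now report the requirement as already satisfied
    pip install scrapy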

 

By the way, posting a tutorial article: an introduction to Scrapy

Solution to the Anaconda Scrapy installation error

Today, I hit a pit when installing Scrapy with Anaconda. Here is the solution for your reference:

Problem Description:

I installed Scrapy in Anaconda using the conda install scrapy command. After the installation completed, executing scrapy on the command line reported an error, as shown in the figure:

Under Windows it looks like this… DLL load failed

Solution:

When you run scrapy directly after installing, it prompts that the lxml module is not installed properly. Manual reinstallation is required.

1. Find the lxml file

Address: https://www.lfd.uci.edu/~gohlke/pythonlibs/#lxml

2. Download the corresponding lxml file. For Windows 64-bit, I downloaded lxml-4.2.4-cp36-cp36m-win_amd64.whl

3. After downloading, open CMD, enter the file's directory, and execute:

    pip install lxml-4.2.4-cp36-cp36m-win_amd64.whl


OK, the prompt shows the installation succeeded.
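To double-check lxml from the command line, a quick sketch (LXML_VERSION is an attribute of lxml.etree that prints the bundled libxml2 version):

    python -c "from lxml import etree; print(etree.LXML_VERSION)"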

4. Verify the result and execute scrapy again:


OK, success.

5. Create a crawler project and try it out; a sketch follows below:


Done
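For reference, creating a project and a first spider looks like this (the project and spider names are illustrative):

    scrapy startproject myproject
    cd myproject
    scrapy genspider example example.com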


Attached:

Anaconda image source address of Tsinghua University:

   https://mirrors.tuna.tsinghua.edu.cn/help/anaconda/

Download address of lxml and other installation packages:

   https://www.lfd.uci.edu/~gohlke/pythonlibs/#lxml

 

Python pip install scrapy reports a Twisted error

Scrapy relies on the following packages:
lxml: an efficient XML and HTML parser
w3lib: a multi-purpose helper for handling URLs and web page encodings
twisted: an asynchronous networking framework
cryptography and pyOpenSSL: handling various network-level security needs
——————
1. First, run pip install scrapy
2. During the installation, all the dependent packages except Twisted should install; Twisted reports an error

Then download Twisted yourself. Note: it must match your Python version number and your system's bitness.
I use Python 3.7 and the system is 64-bit.
https://www.lfd.uci.edu/~gohlke/pythonlibs/

3. After downloading, install it with pip: pip install [file path]\Twisted-18.9.0-cp37-cp37m-win_amd64.whl
4. Finally, run pip install scrapy again and this time it installs successfully, as sketched below.
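The whole sequence as a CMD sketch (the download path is illustrative; substitute wherever you saved the wheel):

    :: first attempt installs every dependency except Twisted
    pip install scrapy
    :: install the Twisted wheel downloaded from the page above (path is illustrative)
    pip install C:\Users\me\Downloads\Twisted-18.9.0-cp37-cp37m-win_amd64.whl
    :: second attempt now succeeds
    pip install scrapy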

——————
Copyright notice: this article is an original article by CSDN blogger "Sagittarius32" and follows the CC 4.0 BY-SA copyright agreement. Please attach the original source link and this notice when reprinting.
Original link: https://blog.csdn.net/sagittarius32/article/details/85345142