Scrapy "no such host" crawler
I am using this crawler as the base for my crawler.
It is designed to catch 404 errors and save the domains that produce them. I wanted to modify it a bit so that it also catches the "no such host" error, which is error 12002.
However, with this code Scrapy never receives any response (because there is no host to send a response back), and when it encounters such a domain it only reports:
>not found: [Errno 11001] getaddrinfo failed.
How can I catch this error and save the domain name?
Exceptions raised during the processing of a request are handled in downloader middleware, through the process_exception(request, exception, spider) method. A middleware like the following will log all exceptions it receives (where IgnoreRequest is not raised) to a log file:
    class ExceptionLog(object):
        def process_exception(self, request, exception, spider):
            # Append every exception seen by this middleware to a log file
            with open('exceptions.log', 'a') as f:
                f.write(str(exception) + "\n")
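Since the question asks to save the domain name for "no such host" failures specifically, a hedged variation could filter on the DNS lookup error and record only the host of the failed request. This is a sketch, assuming the failure surfaces as twisted.internet.error.DNSLookupError (which is how Scrapy's Twisted-based downloader typically reports unresolvable hosts); the class name and output file are made up for illustration:

    from urllib.parse import urlparse
    from twisted.internet.error import DNSLookupError

    class FailedDomainLog(object):
        def process_exception(self, request, exception, spider):
            # DNSLookupError is raised when the host cannot be resolved,
            # e.g. "[Errno 11001] getaddrinfo failed" on Windows.
            if isinstance(exception, DNSLookupError):
                with open('no_such_host.log', 'a') as f:
                    # Save just the domain of the request that failed
                    f.write(urlparse(request.url).netloc + "\n")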
Extend it to use signals to call spider_opened() and spider_closed() for better file handling, or to pull settings from your settings.py file (such as a custom EXCEPTIONS_LOG =); a sketch follows below.
...
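Wired up that way, the middleware might look roughly like the following. This is only a sketch: the setting name EXCEPTIONS_LOG is the custom setting the answer suggests, and the rest of the structure follows Scrapy's standard from_crawler/signals pattern rather than anything given in the original answer:

    from scrapy import signals

    class ExceptionLog(object):
        def __init__(self, log_path):
            self.log_path = log_path
            self.file = None

        @classmethod
        def from_crawler(cls, crawler):
            # Read the log path from settings.py (custom EXCEPTIONS_LOG setting)
            mw = cls(crawler.settings.get('EXCEPTIONS_LOG', 'exceptions.log'))
            # Open and close the file once per crawl via spider signals
            crawler.signals.connect(mw.spider_opened, signal=signals.spider_opened)
            crawler.signals.connect(mw.spider_closed, signal=signals.spider_closed)
            return mw

        def spider_opened(self, spider):
            self.file = open(self.log_path, 'a')

        def spider_closed(self, spider):
            if self.file:
                self.file.close()

        def process_exception(self, request, exception, spider):
            self.file.write(str(exception) + "\n")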
Add this to your DOWNLOADER_MIDDLEWARES in your settings file. Take note of where you place it in the chain of middlewares, though! Too close to the engine, and you may miss exceptions that are logged or handled elsewhere; too far from the engine, and you may log exceptions that would have been retried or otherwise resolved. Where you put it will depend on where you need it.
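As a minimal sketch of that settings entry (assuming the middleware lives in a hypothetical myproject/middlewares.py; the priority 555 is arbitrary and only illustrates placing it somewhere in the middle of the chain):

    # settings.py
    DOWNLOADER_MIDDLEWARES = {
        'myproject.middlewares.ExceptionLog': 555,
    }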