How To Fix Scrapy Error Response.xpath Syntax Error Near Unexpected Token

Scrapy shell is an interactive console for fetching a web URL and extracting data from the page elements. Recently, when I used the Scrapy shell to extract a web page item with XPath, I got the error message -bash: syntax error near unexpected token. This article explains what causes it and how to fix it.

1. The Symptoms Of The Error Response.xpath Syntax Error Near Unexpected Token.

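  1. Below is the detailed error output. When we run the command scrapy shell https://www.indeed.com/jobs?q=python+developer&l=&ts=1610810564897&rq=1&rsIdx=6 , the Scrapy shell prints its startup banner but does not display the prompt (>>>) as it normally would. Note that the request line shows only https://www.indeed.com/jobs?q=python+developer, so everything after the first & is missing.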
    $ scrapy shell https://www.indeed.com/jobs?q=python+developer&l=&ts=1610810564897&rq=1&rsIdx=6
    ......
    [s]   request    <GET https://www.indeed.com/jobs?q=python+developer>
    [s]   response   <200 https://www.indeed.com/q-python-developer-jobs.html>
    [s]   settings   <scrapy.settings.Settings object at 0x7f82254c23c8>
    [s]   spider     <DefaultSpider 'default' at 0x7f82267634a8>
    [s] Useful shortcuts:
    [s]   fetch(url[, redirect=True]) Fetch URL and update local objects (by default, redirects are followed)
    [s]   fetch(req)                  Fetch a scrapy.Request and update local objects 
    [s]   shelp()           Shell help (print this help)
    [s]   view(response)    View response in a browser
    
    
  2. When we then type the command response.xpath('/html/head/title/text()').extract(), we get the error message -bash: syntax error near unexpected token `'/html/head/title/text()''. The -bash: prefix shows that bash, not the Scrapy shell, read the line, and the last line of the output lists the Scrapy shell as a stopped background job.
    [s]   view(response)    View response in a browser
    response.xpath('/html/head/title/text()').extract()
    -bash: syntax error near unexpected token `'/html/head/title/text()''
    
    [7]+  Stopped                 scrapy shell https://www.indeed.com/jobs?q=python+developer
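
The truncated request URL and the stopped job both suggest that bash split the command at the & characters. The following is a minimal sketch of that splitting, not what bash literally runs: it uses Python's shlex module (assuming Python 3.8 or later) to approximate shell tokenization, and the helper name shell_tokens is made up for this illustration.

```python
import shlex

def shell_tokens(command_line):
    """Split a command line roughly the way a POSIX shell would,
    keeping control operators such as & as separate tokens.
    (Illustrative helper, not part of Scrapy or bash.)"""
    lexer = shlex.shlex(command_line, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True  # split words only on whitespace and operators
    return list(lexer)

URL = "https://www.indeed.com/jobs?q=python+developer&l=&ts=1610810564897&rq=1&rsIdx=6"

# Unquoted: the URL is cut at the first &, just like the request line in the log.
unquoted = shell_tokens("scrapy shell " + URL)
print(unquoted[:4])
# ['scrapy', 'shell', 'https://www.indeed.com/jobs?q=python+developer', '&']

# Single-quoted: the whole URL stays one argument.
quoted = shell_tokens("scrapy shell '" + URL + "'")
print(quoted[2] == URL)
# True
```

In the unquoted case the third token matches the request URL that Scrapy logged, and the & after it is what sends the command to the background.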

2. How To Fix -Bash: Syntax Error Near Unexpected Token.

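  1. From the above Scrapy shell output, we can see these two log lines:
    [s]   request    <GET https://www.indeed.com/jobs?q=python+developer>
    [s]   response   <200 https://www.indeed.com/q-python-developer-jobs.html>
  2. The request URL was cut off at the first &. In bash, an unquoted & ends a command and runs it in the background, so Scrapy only received https://www.indeed.com/jobs?q=python+developer (which the site apparently redirected to https://www.indeed.com/q-python-developer-jobs.html), and the remaining pieces of the URL were handled by bash as separate commands. Because the Scrapy shell was running as a background job, the next line I typed went to bash, which reported the syntax error.
  3. So I ran the command scrapy shell https://www.indeed.com/q-python-developer-jobs.html instead. This URL contains no & characters, so the Scrapy shell stays in the foreground and displays the prompt (>>>). Enclosing the original URL in single quotes, as in scrapy shell 'https://www.indeed.com/jobs?q=python+developer&l=&ts=1610810564897&rq=1&rsIdx=6', should avoid the problem in the same way.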
    $ scrapy shell https://www.indeed.com/q-python-developer-jobs.html
    ......
    [s]   scrapy     scrapy module (contains scrapy.Request, scrapy.Selector, etc)
    [s]   crawler    <scrapy.crawler.Crawler object at 0x7f81733f1160>
    [s]   item       {}
    [s]   request    <GET https://www.indeed.com/q-python-developer-jobs.html>
    [s]   response   <200 https://www.indeed.com/q-python-developer-jobs.html>
    [s]   settings   <scrapy.settings.Settings object at 0x7f81733edb70>
    [s]   spider     <DefaultSpider 'default' at 0x7f81737754e0>
    [s] Useful shortcuts:
    [s]   fetch(url[, redirect=True]) Fetch URL and update local objects (by default, redirects are followed)
    [s]   fetch(req)                  Fetch a scrapy.Request and update local objects 
    [s]   shelp()           Shell help (print this help)
    [s]   view(response)    View response in a browser
    >>> 
    
  4. Now when I run the command response.xpath('/html/head/title/text()').extract() in the Scrapy shell console, it returns the correct web element data.
    >>> response.xpath('/html/head/title/text()').extract()
    ['Python Developer Jobs, Employment | Indeed.com']
    >>> 
    
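  5. So the error comes from the web page URL, not from the XPath expression: bash treats each unquoted & in the URL as a command separator. Always quote a URL that contains & or other shell special characters when you pass it to scrapy shell.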
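
If the scrapy shell command line is generated by a script, the quoting can be automated. Below is a small sketch using the standard library function shlex.quote; the helper name scrapy_shell_command is hypothetical, and the result is only meant for POSIX shells such as bash.

```python
import shlex

def scrapy_shell_command(url):
    """Return a `scrapy shell` command line with the URL quoted for a POSIX shell.
    (Hypothetical helper for illustration.)"""
    return "scrapy shell " + shlex.quote(url)

URL = "https://www.indeed.com/jobs?q=python+developer&l=&ts=1610810564897&rq=1&rsIdx=6"

command = scrapy_shell_command(URL)
print(command)
# scrapy shell 'https://www.indeed.com/jobs?q=python+developer&l=&ts=1610810564897&rq=1&rsIdx=6'

# Round trip: splitting the quoted command gives the full URL back as one argument.
print(shlex.split(command) == ["scrapy", "shell", URL])
# True
```

Another option is to start scrapy shell without a URL and call fetch() with the URL as a Python string at the >>> prompt, so that bash never sees the & characters.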
