in Data Science by (18.4k points)

I am new to data science and wanted to try web scraping. I am a beginner, learning to implement web scraping in a Jupyter notebook.

I tried the gscholar.py file that I found on one of the websites, but I am getting the error shown below:

query() takes at least 2 arguments (1 given)

So I passed 2 arguments as follows:

gscholar.query("my query", allresults=True)

But I still got an error, as follows:

query() takes at least 2 arguments (2 given)

So I gave another argument and ended up with a huge error that was hundreds of lines long. One of my friends suggested BeautifulSoup, so I tried using it, but that didn't give the expected results either. I ended up with different errors.

So I went looking for a way to fix my error. I found some code on a website and tried it, but I quickly got blocked by Google. Now I am tired of trying. Can anyone help me solve this?

1 Answer

by (36.8k points)

I suggest you not use random libraries found on websites. Use well-tested, widely adopted libraries such as BeautifulSoup.

To fetch pages from the website, you can use a URL opener class with a custom user agent, so that the request identifies itself like a regular browser. This is a safer way to do web scraping.

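# Python 2 import; on Python 3 the class lives in urllib.request (deprecated since 3.3)
from urllib import FancyURLopener

class MyOpener(FancyURLopener):
    # User-Agent string sent with every request made through this opener
    version = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.152 Safari/537.36'

# openurl(url) returns a file-like object for the page
openurl = MyOpener().open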

Then download the page at the URL you need:

openurl(url).read()
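
On Python 3, where FancyURLopener is deprecated, roughly the same effect can be had with urllib.request.Request. The following is only a minimal sketch: the names USER_AGENT, build_request and fetch are made up for illustration and are not part of the answer's code, and fetch performs a real network request when called.

```python
from urllib.request import Request, urlopen

# Same browser-like User-Agent string as in the opener class above
USER_AGENT = ('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) '
              'AppleWebKit/537.36 (KHTML, like Gecko) '
              'Chrome/33.0.1750.152 Safari/537.36')


def build_request(url):
    """Return a Request object that carries the custom User-Agent header."""
    return Request(url, headers={'User-Agent': USER_AGENT})


def fetch(url, timeout=10):
    """Download the page body as bytes (hypothetical helper, hits the network)."""
    with urlopen(build_request(url), timeout=timeout) as response:
        return response.read()
```

Calling fetch(url) would then play the role of openurl(url).read().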

To retrieve Google Scholar results, use the URL below, replacing ${query} with your search terms:

http://scholar.google.se/scholar?hl=en&q=${query}
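
Search terms with spaces or special characters need to be encoded before they go into that URL. A small sketch of one way to do it with the standard library, assuming Python 3; scholar_url and SCHOLAR_BASE are hypothetical names used only here:

```python
from urllib.parse import urlencode

# Base address taken from the URL above
SCHOLAR_BASE = 'http://scholar.google.se/scholar'


def scholar_url(query, lang='en'):
    """Build the search URL, encoding the query text for use in a URL."""
    return SCHOLAR_BASE + '?' + urlencode({'hl': lang, 'q': query})


print(scholar_url('web scraping'))
# http://scholar.google.se/scholar?hl=en&q=web+scraping
```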

To get a piece of information from the retrieved HTML, you can use the code below:

from bs4 import SoupStrainer, BeautifulSoup

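# Parse only the <div id="gs_ab_md"> element; naming the parser avoids bs4's "no parser specified" warning
page = BeautifulSoup(openurl(url).read(), 'html.parser', parse_only=SoupStrainer('div', id='gs_ab_md'))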
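
To try the SoupStrainer step without sending any request (and so without risking a block), the same parsing can be run on a local string. This is a sketch only: SAMPLE_HTML is an invented stand-in, not real Google Scholar markup, and extract_div is a hypothetical helper; only the gs_ab_md id comes from the code above.

```python
from bs4 import BeautifulSoup, SoupStrainer

# Invented sample page; real Scholar HTML will look different
SAMPLE_HTML = """
<html><body>
  <div id="gs_top">navigation</div>
  <div id="gs_ab_md">About 1,234 results</div>
</body></html>
"""


def extract_div(html, div_id='gs_ab_md'):
    """Parse only the div with the given id and return its text, or None."""
    only_div = SoupStrainer('div', id=div_id)
    page = BeautifulSoup(html, 'html.parser', parse_only=only_div)
    div = page.find('div', id=div_id)
    return div.get_text(strip=True) if div else None


print(extract_div(SAMPLE_HTML))
```

Because parse_only discards everything except the matching div, the rest of the page is never built into the tree, which keeps parsing light on large pages.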
