How to use requests library to webscrape a list of links already scraped

Question

asked Jul 31, 2019 in Python by Rajesh Malhotra (19.9k points)

I have scraped a set of links off a website (https://www.gmcameetings.co.uk): all the links that include the word "meetings", i.e. the meeting papers, which are now contained in 'meeting_links'. I now need to follow each of those links to scrape some more links within them.

I've gone back to using the requests library and tried:

r2 = requests.get("meeting_links")

But it returns the following error:

MissingSchema: Invalid URL 'list_meeting_links': No schema supplied.
Perhaps you meant http://list_meeting_links?

I changed it to that, but it made no difference.

This is my code so far and how I got the links from the first url that I wanted.

# importing libraries and defining
import requests
import urllib.request
import time
from bs4 import BeautifulSoup as bs
# set url
url = "https://www.gmcameetings.co.uk/"
# grab html
r = requests.get(url)
page = r.text
soup = bs(page,'lxml')
# creating folder to store PDFs - if not, create separate folder
folder_location = r'E:\Internship\WORK'
# getting all meeting href off url
# href=True matches every <a> tag that has an href attribute
meeting_links = soup.find_all('a', href=True)
for link in meeting_links:
    print(link['href'])
    if link['href'].find('/meetings/') > 1:
        print("Meeting!")
#second set of links
r2 = requests.get("meeting_links")

Do I need to do something with the 'meeting_links' before I can start using the requests library again? I'm completely lost.

1 Answer

Anirudh Singh · Answer 1 · 2019-07-31T07:33:43+0000

It looks like you are passing the literal string "meeting_links" to requests.get() rather than one of the URLs you scraped. requests.get() expects a single URL string, like this:

requests.get('https://example.com')
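To see why the original call fails, it can help to check whether a string has a URL scheme at all. The snippet below is only an illustration using the standard library's urlparse, not the check that requests itself runs, and the helper name has_scheme is made up for this example:

```python
from urllib.parse import urlparse


def has_scheme(url):
    """Return True if the string starts with a URL scheme such as http or https."""
    # urlparse leaves .scheme empty when the string has no "scheme:" prefix
    return bool(urlparse(url).scheme)


# The literal string from the question has no scheme, so it is not a usable URL.
print(has_scheme("meeting_links"))
# A full URL does have one.
print(has_scheme("https://www.gmcameetings.co.uk/"))
```

A string without a scheme is what triggers the MissingSchema error in the question.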

You should change your code to look like this:

for link in meeting_links:
    if link['href'].find('/meetings/')>1:
        r2 = requests.get(link['href'])
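One thing to watch for: if some of the href values are relative (for example /meetings/...), requests.get() will raise the same MissingSchema error for them. Here is a minimal sketch of one way to handle that, assuming the site mixes relative and absolute links. The helper name absolute_meeting_urls and the BASE_URL constant are made up for this example, and it uses a plain substring test instead of find(...) > 1:

```python
from urllib.parse import urljoin

# Base URL taken from the question; used to resolve relative hrefs.
BASE_URL = "https://www.gmcameetings.co.uk/"


def absolute_meeting_urls(hrefs, base_url=BASE_URL):
    """Return an absolute URL for every href that contains '/meetings/'."""
    urls = []
    for href in hrefs:
        if "/meetings/" in href:
            # urljoin resolves relative hrefs against base_url and
            # returns already-absolute URLs unchanged
            urls.append(urljoin(base_url, href))
    return urls
```

You could build the input with [link['href'] for link in meeting_links] and then call requests.get(url) on each returned URL in a loop. If every href on the page is already absolute, the simpler loop above is enough.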
