I am writing a Python script to scrape the PDFs of meetings from this website: https://www.gmcameetings.co.uk The PDF links are nested inside links, which are themselves inside links. I have the first set of links from the page above, and now I need to scrape the links within those new URLs. When I do this I get the following error:
AttributeError: ResultSet object has no attribute 'find_all'. You're
probably treating a list of items like a single item. Did you call
find_all() when you meant to call find()?
This is my code so far, which I've checked in a Jupyter notebook and it all works fine:
# import libraries
import requests
import urllib.request
import time
from bs4 import BeautifulSoup as bs
# set url
url = "https://www.gmcameetings.co.uk/"
# grab html
r = requests.get(url)
page = r.text
soup = bs(page,'lxml')
# folder to store the pdfs - create it separately if it doesn't exist
folder_location = r'E:\Internship\WORK'
# get all meeting hrefs off the url
meeting_links = soup.find_all('a', href=True)
for link in meeting_links:
    print(link['href'])
    if '/meetings/' in link['href']:
        print("Meeting!")
This is the line that then throws the error:
second_links = meeting_links.find_all('a', href=True)
I have tried find() as Python suggests, but that doesn't work either, and I understand that meeting_links can't be treated as a single item.
So basically, how do I search for links within each item of the new variable (meeting_links)?
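To show what I mean, this is roughly the shape of thing I think I need - loop over each tag in meeting_links, request its href, and collect the links on that page (just a sketch; I'm assuming each href is a full URL and that the meeting pages use ordinary <a> tags):

second_links = []
for link in meeting_links:
    if '/meetings/' in link['href']:
        # fetch each meeting page and parse it separately
        meeting_page = requests.get(link['href'])
        meeting_soup = bs(meeting_page.text, 'lxml')
        # collect every link on that page
        second_links.extend(meeting_soup.find_all('a', href=True))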
I already have code to download the PDFs once I have the second set of URLs, which seems to work fine, but obviously I need to get those URLs first. Hopefully this makes sense and I've explained it OK - I only properly started using Python on Monday, so I'm a complete beginner.
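For context, the download step I mean is along these lines (a simplified sketch rather than my exact code; it assumes the second-level pages link directly to .pdf files and that folder_location already exists):

import os

for link in second_links:
    href = link['href']
    if href.lower().endswith('.pdf'):
        # download the pdf and save it under its original filename
        pdf = requests.get(href)
        filename = os.path.join(folder_location, href.split('/')[-1])
        with open(filename, 'wb') as f:
            f.write(pdf.content)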