I am learning data science online, and I am trying to pull the tables from an HTML page into a Jupyter notebook. The problem is that when I select with class='table', I get the contents of every tab and every table on the page, which is very messy.
This is the code I am using:
import requests
import lxml.html as lh
import pandas as pd
import csv
from bs4 import BeautifulSoup
url = 'https://www.worldometers.info/coronavirus/#countries'
page = requests.get(url)
print(page.status_code)  # Check the HTTP response status code; it should be 200
soup = BeautifulSoup(page.content, 'html.parser')
print(soup.prettify())
all_tables=soup.find_all("table")
right_table = soup.find('table',{'class':'table'})
col_headers = [th.getText() for th in right_table.findAll('th')]
data = [[td.getText() for td in right_table.findAll('td')] for tr in right_table()]
I expect 13 columns, but when I combine the rows with col_headers, it reports 2990 columns. Please help me solve this.
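
A likely cause, sketched under some assumptions: in the `data` line, the inner comprehension calls `right_table.findAll('td')` on the whole table for every iteration, so each "row" ends up holding every cell in the table. In addition, `right_table()` with no arguments is shorthand for `find_all()`, which returns every descendant tag rather than only the rows. The sketch below collects cells per `<tr>` instead, and picks one table by `id` rather than by the shared class. The HTML string, the ids `summary` and `countries`, and the sample values are all made up so the example runs offline; the real page's table id would need to be checked in the page source.

```python
from bs4 import BeautifulSoup

# Tiny stand-in for the real page (hypothetical markup, ids, and values),
# so that this sketch runs without a network request.
html = """
<table id="summary" class="table">
  <tr><th>Metric</th><th>Value</th></tr>
  <tr><td>Pages</td><td>3</td></tr>
</table>
<table id="countries" class="table">
  <thead><tr><th>Country</th><th>Total Cases</th><th>Total Deaths</th></tr></thead>
  <tbody>
    <tr><td>Aland</td><td>100</td><td>5</td></tr>
    <tr><td>Borduria</td><td>250</td><td>12</td></tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Select one specific table by id instead of by the class both tables share.
right_table = soup.find("table", id="countries")

# Headers: the <th> cells of this table only.
col_headers = [th.get_text(strip=True) for th in right_table.find_all("th")]

# Rows: for each <tr>, collect only the <td> cells inside that row.
data = []
for tr in right_table.find_all("tr"):
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if cells:  # the header row contains no <td>, so it yields an empty list
        data.append(cells)

print(col_headers)
print(data)
```

With rows built this way, each row has the same length as `col_headers`, so `pd.DataFrame(data, columns=col_headers)` should line up, provided the real table has a single header row and no cells that span several columns.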