I am trying to extract a value from a span, but that span has another span nested inside it. How do I get the text of only the outer span rather than the text of both?

from bs4 import BeautifulSoup

some_price = page_soup.find("div", {"class":"price_FHDfG large_3aP7Z"})

some_price.span

# that code returns this:

'''

<span>$289<span class="rightEndPrice_6y_hS">99</span></span>

'''

# BUT I only want the $289 part, not the 99 associated with it

After making this adjustment:

some_price.span.text

the interpreter returns

$28999

Would it be possible to somehow remove the '99' at the end? Or to only extract the first part of the span?

Any help/suggestions would be appreciated!

1 Answer


Use the .contents attribute of the outer span tag. Its first element is the text node that precedes the nested span, which is the value you want:

from bs4 import BeautifulSoup as soup

html = '''

 <span>$289<span class="rightEndPrice_6y_hS">99</span></span>

'''

result = soup(html, 'html.parser').find('span').contents[0]

The value of result is then:

'$289'
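
Indexing contents[0] relies on the dollar amount being the very first child of the span. If the markup might contain leading whitespace or other nodes, a slightly more defensive variant is to join only the text nodes that are direct children of the tag. The sketch below is one way to do that, not the only one; own_text is a hypothetical helper name, and the surrounding div is assumed from the class names in the question.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

# Sample markup assumed from the question: the price span wrapped in its div.
html = (
    '<div class="price_FHDfG large_3aP7Z">'
    '<span>$289<span class="rightEndPrice_6y_hS">99</span></span>'
    '</div>'
)


def own_text(tag):
    """Return only the text sitting directly inside `tag`, skipping nested tags.

    Hypothetical helper, not part of BeautifulSoup itself.
    """
    return "".join(
        child for child in tag.children if isinstance(child, NavigableString)
    ).strip()


page_soup = BeautifulSoup(html, "html.parser")
some_price = page_soup.find("div", {"class": "price_FHDfG large_3aP7Z"})

dollars = own_text(some_price.span)              # text of the outer span only
cents = some_price.span.find("span").get_text()  # text of the nested span

print(dollars, cents)
```

By contrast, some_price.span.text returns $28999 because .text concatenates the strings of every descendant, including the nested span.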
