Get data from HTML page

I have some data from an HTML page, as follows:

<span class="some class abc-vc"> 123</span>
<span class="some class vde-bc"> 435</span>
<span class="some class v9mo-04mg"> 456 </span>

I would like to match only the

some class

part of the class attribute so that I can store the values one by one.

How can I achieve this?

code:

from urllib.request import Request, urlopen
import bs4 
url = 'url'
page = urlopen(url).read()
soup = bs4.BeautifulSoup(page, 'html.parser')
data = soup.find('span',{'class':'some class'})
print (data.text)

You can use a regular expression to match the class attribute. Try the code below.

from bs4 import BeautifulSoup
import re

data='''<span class="some class abc-vc"> 123</span>
<span class="some class vde-bc"> 435</span>
<span class="some class v9mo-04mg"> 456 </span>'''
soup=BeautifulSoup(data,'html.parser')

for item in soup.find_all('span',class_=re.compile('some class')):
    print(item.text)

Output:

123
435
456 
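If the goal is to store the values rather than print them, a minimal sketch along the same lines could collect them into a list. The names html and values are only illustrative, and the conversion to int assumes every span holds a whole number:

```python
import re
from bs4 import BeautifulSoup

# Sample markup copied from the question.
html = '''<span class="some class abc-vc"> 123</span>
<span class="some class vde-bc"> 435</span>
<span class="some class v9mo-04mg"> 456 </span>'''

soup = BeautifulSoup(html, 'html.parser')

# get_text(strip=True) trims the surrounding whitespace before conversion.
# int() assumes each span contains only digits; drop it to keep strings.
values = [
    int(item.get_text(strip=True))
    for item in soup.find_all('span', class_=re.compile('some class'))
]
print(values)
```

Each entry can then be read by index, for example values[0].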


In HTML, distinct classes are separated by spaces. So that bottom span for example has three classes: some, class, and v9mo-04mg.

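To match on those individual classes, use a list as your dictionary value. Note that a list matches any tag that carries at least one of the listed classes, so this finds spans with the class some or the class class (every span above has both):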

data = soup.find('span', {'class':['some', 'class']})

If you need every matching tag rather than only the first, replace the .find() method with .find_all().
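As a rough illustration of how a list value behaves with find_all(), the sketch below adds a hypothetical second span (class "some other", not from the question) that has only one of the two classes. The set comparison is just one possible way to keep only tags that carry both:

```python
from bs4 import BeautifulSoup

# The second span is a made-up example carrying only the class "some".
html = '''<span class="some class abc-vc"> 123</span>
<span class="some other"> 999</span>'''

soup = BeautifulSoup(html, 'html.parser')

# A list value matches tags that have ANY of the listed classes.
any_match = soup.find_all('span', {'class': ['some', 'class']})

# Keep only tags whose class list contains BOTH names.
both_match = [
    tag for tag in any_match
    if {'some', 'class'} <= set(tag.get('class', []))
]

any_text = [tag.get_text(strip=True) for tag in any_match]
both_text = [tag.get_text(strip=True) for tag in both_match]
print(any_text)
print(both_text)
```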


These are compound classes, so you can join them with "." into a CSS selector and pass it to select():

elements = soup.select('.some.class')
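A short sketch of the selector approach applied to the markup from the question, with the span tag name added to the selector as an assumption. The unpacking into first, second and third assumes exactly three matches and is only there to show storing the values one by one:

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question.
html = '''<span class="some class abc-vc"> 123</span>
<span class="some class vde-bc"> 435</span>
<span class="some class v9mo-04mg"> 456 </span>'''

soup = BeautifulSoup(html, 'html.parser')

# 'span.some.class' requires a span that has both classes, in any order.
elements = soup.select('span.some.class')

# Unpacking raises ValueError unless exactly three spans matched.
first, second, third = (el.get_text(strip=True) for el in elements)
print(first, second, third)
```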


Comments
  • soup.find('span', class_='some class')
  • Like mentioned here?: stackoverflow.com/q/52816683/4636715
  • Thank you for the answer. A regular expression is actually what I was looking for, but it turns out I was looking for the wrong tag. The correct data that I need is: <span style="font-size:90%"><b>100</b></span> Would you know how to get this?
  • @user3702643 : so you need the value of 100 from this tag right?
  • Yes correct. Each row has the exact same tag as follows <span style="font-size:90%"><b>100</b></span> <span style="font-size:90%"><b>200</b></span> <span style="font-size:90%"><b>300</b></span>
  • You don't need a regular expression for that. You can do it this way: for item in soup.find_all('span'): print(item.find('b').text)
  • for item in soup.find_all('span',style=re.compile('font-size:90%')): print(item.find('b').text) This should work
  • I am still unable to get the required data. I get AttributeError: 'NoneType' object has no attribute 'text'
  • @user3702643 That means the find was invalid. I tested this with the HTML you posted in your question and it worked fine. Perhaps print(soup) just to make sure it's all working fine
  • You were right. Upon closer inspection of the soup, I realized that the data I needed was actually in the tag below. Any help for this? <span style="font-size:90%"><b>100</b>
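
For the follow-up in the comments, one possible sketch is to match the style attribute as an exact string and skip any span without a b child, which is what would otherwise raise the AttributeError on None. The sample markup below is assembled from the comments plus one hypothetical span that has no b tag, and the last span is left unclosed as in the final comment:

```python
from bs4 import BeautifulSoup

# Spans taken from the comments; the third one is a made-up case without <b>.
html = '''<span style="font-size:90%"><b>100</b></span>
<span style="font-size:90%"><b>200</b></span>
<span style="font-size:90%">no bold here</span>
<span style="font-size:90%"><b>300</b>'''

soup = BeautifulSoup(html, 'html.parser')

numbers = []
for item in soup.find_all('span', style='font-size:90%'):
    bold = item.find('b')
    # find() returns None when the span has no <b>, so guard before .text.
    if bold is not None:
        numbers.append(bold.get_text(strip=True))
print(numbers)
```

If the real page still yields nothing, printing the soup as suggested above is a reasonable way to check what markup was actually downloaded.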