VBA web scraping contents without class name or ID
I would like to scrape dividend futures prices from HKEX.
Specifically, I want to scrape the Prev.Day settlement price of the "Dec-19" contract via VBA. However, the row has no class name or id, so I have no idea how to access the information.
<tr>
    <td>Dec-19</td>
    <td>-</td>
    <td>-</td>
    <td>413.78</td>
    <td> - <br> - </td>
    <td>-</td>
    <td> - <br> - </td>
    <td>-</td>
    <td>17,330</td>
</tr>
How can I scrape this via VBA?
Locating a specific item is a real pain when it has no distinguishing attribute attached to it. However, I've written this script without hardcoding any index to the elements. Give it a shot and get your desired values:
Sub Hkex_Data()
    Dim IE As New InternetExplorer, html As HTMLDocument
    Dim posts As Object, Row As Long

    With IE
        .Visible = False
        .navigate "http://www.hkex.com.hk/Market-Data/Futures-and-Options-Prices/Equity-Index/HSCEI-Dividend-Futures?sc_lang=en#&product=DHH"
        ' Wait until the browser reports that the page has loaded
        Do Until .readyState = READYSTATE_COMPLETE
            DoEvents
        Loop
        Set html = .document
    End With

    ' Fixed pause so the page's scripts have time to fill in the prices
    Application.Wait (Now + TimeValue("0:00:05"))

    ' For each element with class "hsirowcon", read the first and last child
    ' of the node two siblings after it and write them to columns A and B
    For Each posts In html.getElementsByClassName("hsirowcon")
        Row = Row + 1
        Cells(Row, 1) = posts.NextSibling.NextSibling.FirstChild.innerText
        Cells(Row, 2) = posts.NextSibling.NextSibling.LastChild.innerText
    Next posts

    IE.Quit
End Sub
References to add to the library:
Microsoft Internet Controls
Microsoft HTML Object Library
Use getElementsByTagName. Identify your table, then go through each row (tr) and each cell (td) in the rows. Something like this:
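Dim objTR As HTMLTableRow
Dim objTD As HTMLTableCell
Dim objTable As HTMLTable

' objTable must first be set to the table element that holds the data

For Each objTR In objTable.getElementsByTagName("tr")
    ' Loop over the cells of the row, not over the row object itself
    For Each objTD In objTR.getElementsByTagName("td")
        'do something with objTD.innerText
    Next objTD
Next objTR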
Or you can declare your variables as Object if you prefer late binding.
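As a rough sketch of how the tag-name loop could pick out the row from the question, the function below looks for the first row whose first cell reads "Dec-19" and returns its 4th cell. The names FindSettlementPrice and DemoFindSettlementPrice are made up for illustration, and the demo parses a simplified copy of the row quoted in the question rather than the live page, so treat it as a starting point under those assumptions and not a tested solution for the HKEX site.

```vba
' Requires a reference to "Microsoft HTML Object Library".
Option Explicit

' Hypothetical helper: returns the text of the 4th cell of the first row whose
' first cell equals contractLabel, or an empty string if no row matches.
Public Function FindSettlementPrice(ByVal html As MSHTML.HTMLDocument, _
                                    ByVal contractLabel As String) As String
    Dim trNodes As Object, tdNodes As Object
    Dim i As Long

    Set trNodes = html.getElementsByTagName("tr")
    For i = 0 To trNodes.Length - 1
        Set tdNodes = trNodes.Item(i).getElementsByTagName("td")
        ' Skip header or spacer rows that have fewer than four cells
        If tdNodes.Length >= 4 Then
            If Trim$(tdNodes.Item(0).innerText) = contractLabel Then
                FindSettlementPrice = Trim$(tdNodes.Item(3).innerText)
                Exit Function
            End If
        End If
    Next i
End Function

Public Sub DemoFindSettlementPrice()
    Dim html As New MSHTML.HTMLDocument

    ' Simplified sample markup modelled on the row quoted in the question
    html.body.innerHTML = "<table><tr><td>Dec-19</td><td>-</td><td>-</td><td>413.78</td>" & _
                          "<td>-</td><td>-</td><td>-</td><td>-</td><td>17,330</td></tr></table>"

    Debug.Print FindSettlementPrice(html, "Dec-19")
End Sub
```

Matching on the contract label in the first cell avoids hardcoding a row index, so the lookup should keep working if rows are added or reordered, as long as the Prev.Day settlement price stays in the 4th column.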
You could also simply use a CSS selector and no loop:
This method is fragile. If the style on the page changes this may break.
[Screenshot of the relevant part of the page: first contract year, with headers for context and with the chart between contract years removed]
The associated HTML for contract year 2019 is:
Prev.Day Settlement Price is the 4th td within this, i.e. it is matched by a CSS selector ending in td:nth-child(4).
This pattern is repeated for all contract years, so you can return a nodeList of all matches to this (i.e. every matching td:nth-child(4)).
Year 2019 is at index position 1; this is the second element in a 0-based nodeList, so you access it at index 1.
[Image: CSS query result, first few results]
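A minimal sketch of the nth-child idea, under loud assumptions: the selector "tr td:nth-child(4)" is a stand-in and not the selector from this answer, the markup is hypothetical, and the Dec-18 value is invented purely so that the list has two entries. The procedure name DemoNthChildSelector is also made up. It only illustrates collecting every 4th cell with querySelectorAll and reading one match by its 0-based index.

```vba
' Requires a reference to "Microsoft HTML Object Library".
Option Explicit

Public Sub DemoNthChildSelector()
    Dim html As New MSHTML.HTMLDocument
    Dim matches As Object
    Dim i As Long

    ' Hypothetical sample markup; the Dec-18 figure is invented for illustration
    html.body.innerHTML = "<table>" & _
        "<tr><td>Dec-18</td><td>-</td><td>-</td><td>400.00</td></tr>" & _
        "<tr><td>Dec-19</td><td>-</td><td>-</td><td>413.78</td></tr>" & _
        "</table>"

    ' Assumed selector: the 4th cell of every row in the document
    Set matches = html.querySelectorAll("tr td:nth-child(4)")

    ' List every match with its 0-based position in the nodeList
    For i = 0 To matches.Length - 1
        Debug.Print i, matches.Item(i).innerText
    Next i

    ' Read a single match by index; index 1 is the second match
    If matches.Length > 1 Then Debug.Print matches.Item(1).innerText
End Sub
```

On the real page a selector this broad would probably also match cells in unrelated rows, so it would need narrowing to the price table before a fixed index could be relied on.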