VBA web scraping contents without class name or ID


I would like to scrape Dividend Future Prices from HKEX.

Here's the URL of the page: http://www.hkex.com.hk/Market-Data/Futures-and-Options-Prices/Equity-Index/HSCEI-Dividend-Futures?sc_lang=en#&product=DHH

I want to scrape the Prev.Day Settlement Price of the "Dec-19" contract via VBA. However, the cell doesn't have any class name or id, so I have no idea how to access the information.

<tr>
  <td>Dec-19</td>
  <td>-</td>
  <td>-</td>
  <td>413.78</td>
  <td>
    -
    <br>
    -
  </td>
  <td>-</td>
  <td>
    -
    <br>
    -
  </td>
  <td>-</td>
  <td>17,330</td>
</tr>

How can I scrape this via VBA?


Finding a specific item that has no distinguishing attribute attached to it is awkward to automate. However, I've written this script without hardcoding an index to the elements: it anchors on the nearby "hsirowcon" class and walks to the sibling row from there. Give this a shot and get your desired values:

Sub Hkex_Data()

    Dim IE As New InternetExplorer, html As HTMLDocument
    Dim posts As Object, Row As Long

    With IE
        .Visible = False
        .navigate "http://www.hkex.com.hk/Market-Data/Futures-and-Options-Prices/Equity-Index/HSCEI-Dividend-Futures?sc_lang=en#&product=DHH"
        ' Yield to the browser while the page loads instead of spinning in an empty loop
        Do While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Loop
        Set html = .document
    End With
    ' Extra pause in case the table is filled in after the page reports it is loaded
    Application.Wait (Now + TimeValue("0:00:05"))

    ' Each "hsirowcon" element is an anchor; the data row is reached via two NextSibling hops
    For Each posts In html.getElementsByClassName("hsirowcon")
        Row = Row + 1: Cells(Row, 1) = posts.NextSibling.NextSibling.FirstChild.innerText
        Cells(Row, 2) = posts.NextSibling.NextSibling.LastChild.innerText
    Next posts

    IE.Quit
End Sub

Result:

19-Dec  17,330

References to add to the project (Tools > References):

Microsoft Internet Controls
Microsoft HTML Object Library



Use getElementsByTagName. Identify your table and then go through each row and each td in the rows. Something like this:

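Dim objTable As HTMLTable
Dim objTR As HTMLTableRow
Dim objTD As HTMLTableCell

' Assumes html is the loaded HTMLDocument; index 0 is a placeholder for the table you need
Set objTable = html.getElementsByTagName("table")(0)

For Each objTR In objTable.getElementsByTagName("tr")
    For Each objTD In objTR.getElementsByTagName("td")
        'do something with objTD.innerText
    Next objTD
Next objTR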

Or you can declare your variables as Object if you prefer late binding.
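For what it's worth, here is a minimal late-bound sketch of that row/cell loop. It doesn't touch the live site: it loads a cut-down copy of the row from the question into an in-memory "htmlfile" document, so no references are needed. The Sub name and the shortened sample markup are just for illustration.

```vba
Sub LateBoundRowDemo()
    ' Late binding: everything is declared As Object, so no library references are required
    Dim doc As Object, tr As Object, td As Object

    ' In-memory HTML document (illustrative stand-in for the live page)
    Set doc = CreateObject("htmlfile")
    doc.body.innerHTML = "<table><tr><td>Dec-19</td><td>-</td><td>-</td><td>413.78</td></tr></table>"

    ' Walk every row, then every cell within that row
    For Each tr In doc.getElementsByTagName("tr")
        For Each td In tr.getElementsByTagName("td")
            Debug.Print td.innerText
        Next td
    Next tr
End Sub
```

The trade-off with late binding is that you lose IntelliSense and compile-time checks, but the code runs on machines where the references haven't been ticked.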



You could also simply use a CSS selector and no loop:

html.querySelectorAll("td:nth-child(4)")(1).innerText

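This method is fragile. If the table layout on the page changes (for example, a column is added or the rows are reordered), the selector or the index may point at the wrong cell.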


CSS selector:

On the page, each contract year is shown as a row under the column headers (with a chart between contract years). The associated HTML for contract year 2019 is the tr shown in the question.

Prev.Day Settlement Price is the 4th td within this row, i.e. CSS selector td:nth-child(4).

This pattern is repeated for all contract years so you can return a nodeList of all matches to this (i.e. every td:nth-child(4) with the .querySelectorAll method).

Year 2019 is at index position 1; this is the second element in the 0-based nodeList, so you access it with .querySelectorAll("td:nth-child(4)")(1).
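Put together, the selector approach might look something like the sketch below. It is late bound, so READYSTATE_COMPLETE is declared locally. The 5 second pause is borrowed from the other answer, and index 1 is the position described above; both are assumptions about the page as it stood, so check them against the live site. The Sub name is just for illustration.

```vba
Sub PrevDaySettlementViaSelector()
    Const READYSTATE_COMPLETE As Long = 4   ' value of the IE enum, declared here for late binding
    Dim ie As Object, cells As Object

    Set ie = CreateObject("InternetExplorer.Application")
    With ie
        .Visible = False
        .navigate "http://www.hkex.com.hk/Market-Data/Futures-and-Options-Prices/Equity-Index/HSCEI-Dividend-Futures?sc_lang=en#&product=DHH"
        Do While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Loop
        ' Assumed pause so the price table has time to fill in
        Application.Wait Now + TimeValue("0:00:05")

        ' Every 4th td on the page; index 1 is assumed to be the 2019 contract row
        Set cells = .document.querySelectorAll("td:nth-child(4)")
        If cells.Length > 1 Then
            Debug.Print cells.Item(1).innerText
        Else
            Debug.Print "Expected cell not found; the page layout may have changed"
        End If
        .Quit
    End With
End Sub
```

Checking .Length before indexing means a layout change gives you a message rather than a run-time error.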


