VBA HTML Listing Info Pull
I am looking to follow a series of URLs found in column A (example: https://www.ebay.com/itm/Apple-iPhone-7-GSM-Unlocked-Verizon-AT-T-TMobile-Sprint-32GB-128GB-256GB/352381131997?epid=225303158&hash=item520b8d5cdd:m:mWgYDe4a79NeLuAlV-RmAQA:rk:7:pf:0) and pull the following information from each of them:
- Title
- Price
- Description
I think there are multiple issues with my code. For one, I can't get the program to follow the URLs listed in the Excel sheet (it only works if I specify one within the code). Also, pulling multiple fields has given me issues.
    Option Explicit

    Public Sub ListingInfo()
        Dim ie As New InternetExplorer, ws As Worksheet, t As Date
        Dim i As Integer
        i = 0
        Do While Worksheets("Sheet1").Cells(i, 1).Value <> ""
            Const MAX_WAIT_SEC As Long = 10
            Set ws = ThisWorkbook.Worksheets("Sheet1")
            With ie
                .Visible = True
                .Navigate2 Worksheets("Sheet1").Cells(i, 1).Value
                While .Busy Or .readyState < 4: DoEvents: Wend
                Dim Links As Object, i As Long, count As Long
                t = Timer
                Do
                    On Error Resume Next
                    Set Title = .document.querySelectorAll("it-ttl")
                    Set price = .document.querySelectorAll("notranslate")
                    Set Description = .document.querySelectorAll("ds_div")
                    count = Links.Length
                    On Error GoTo 0
                    If Timer - t > MAX_WAIT_SEC Then Exit Do
                Loop While count = 0
                For i = 0 To Title.Length - 1
                    ws.Cells(i + 1, 1) = Title.item(i)
                    ws.Cells(i + 1, 2) = price.item(i)
                    ws.Cells(i + 1, 3) = Description.item(i)
                Next
                .Quit
                i = i + 1
        Loop
            End With
    End Sub
I would use late binding for MSXML2.XMLHTTP and set a reference to the Microsoft HTML Object Library for the HTMLDocument.
querySelector() references the first item it finds that matches its search string.
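To make the difference concrete, here is a minimal sketch that runs against a small, made-up HTML string rather than a live page. The sub name SelectorDemo and the markup are assumptions for illustration only, and it needs the same reference to the Microsoft HTML Object Library mentioned above.

```vba
Option Explicit

' Requires a reference to the Microsoft HTML Object Library.
Public Sub SelectorDemo()
    Dim doc As MSHTML.HTMLDocument
    Set doc = New MSHTML.HTMLDocument

    ' Hypothetical markup, not taken from any real listing page.
    doc.body.innerHTML = "<div id=""price"" class=""note"">first</div>" & _
                         "<div class=""note"">second</div>"

    ' querySelector returns one element: the first match, or Nothing.
    Dim firstNote As Object
    Set firstNote = doc.querySelector(".note")
    If Not firstNote Is Nothing Then Debug.Print firstNote.innerText

    ' querySelectorAll returns a list of every match; Length is the match count.
    Dim allNotes As Object
    Set allNotes = doc.querySelectorAll(".note")
    Debug.Print allNotes.Length

    ' "#price" selects by id, ".note" by class, and a bare word such as
    ' "note" would be a type selector looking for a <note> tag.
    Dim byType As Object
    Set byType = doc.querySelector("note")
    Debug.Print (byType Is Nothing)
End Sub
```

This is also why selectors such as "it-ttl" in the question find nothing: without a leading "." or "#" they are read as tag names.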
Here is the short version:
    Public Sub ListingInfo()
        Dim cell As Range
        With ThisWorkbook.Worksheets("Sheet1")
            For Each cell In .Range("A1", .Cells(.Rows.Count, 1).End(xlUp))
                Dim Document As MSHTML.HTMLDocument
                With CreateObject("MSXML2.XMLHTTP")
                    .Open "GET", cell.Value, False
                    .send
                    Set Document = New MSHTML.HTMLDocument
                    Document.body.innerHTML = .responseText
                End With
                cell.Offset(0, 1).Value = Document.getElementByID("itemTitle").innerText
                cell.Offset(0, 2).Value = Document.getElementByID("prcIsum").innerText
                If Not Document.querySelector(".viSNotesCnt") Is Nothing Then
                    cell.Offset(0, 3).Value = Document.querySelector(".viSNotesCnt").innerText
                Else
                    'Try Something Else
                End If
            Next
        End With
    End Sub
A more elaborate solution would be to break the code up into smaller routines and load the data into an Array. The main advantage of this is that you can test each subroutine separately.
    Option Explicit

    Public Type tListingInfo
        Description As String
        Price As Currency
        Title As String
    End Type

    Public Sub ListingInfo()
        Dim source As Range
        Dim data As Variant
        With ThisWorkbook.Worksheets("Sheet1")
            Set source = .Range("A1:D1", .Cells(.Rows.Count, 1).End(xlUp))
            data = source.Value
        End With

        Dim r As Long
        Dim record As tListingInfo
        Dim url As String
        For r = 1 To UBound(data)
            url = data(r, 1)
            record = getListingInfo(url)
            ' Columns B:D receive Title, Price, Description
            With record
                data(r, 2) = .Title
                data(r, 3) = .Price
                data(r, 4) = .Description
            End With
        Next
        source.Value = data
    End Sub

    Public Function getListingInfo(url As String) As tListingInfo
        Dim ListingInfo As tListingInfo
        Dim Document As MSHTML.HTMLDocument
        Set Document = getHTMLDocument(url)
        ' The request failed: return an empty record
        If Document Is Nothing Then Exit Function
        With ListingInfo
            .Title = Document.getElementByID("itemTitle").innerText
            ' Price text is split on spaces and the second token is used
            .Price = Split(Document.getElementByID("prcIsum").innerText)(1)
            .Description = Document.querySelectorAll(".viSNotesCnt")(0).innerText
            Debug.Print .Title, .Price, .Description
        End With
        ' Hand the filled record back to the caller
        getListingInfo = ListingInfo
    End Function

    Public Function getHTMLDocument(url As String) As MSHTML.HTMLDocument
        Const READYSTATE_COMPLETE As Long = 4
        Dim Document As MSHTML.HTMLDocument
        With CreateObject("MSXML2.XMLHTTP")
            .Open "GET", url, False
            .send
            If .readyState = READYSTATE_COMPLETE And .Status = 200 Then
                Set Document = New MSHTML.HTMLDocument
                Document.body.innerHTML = .responseText
                Set getHTMLDocument = Document
            Else
                MsgBox "URL: " & url & vbCrLf & "Ready state: " & .readyState & vbCrLf & "HTTP request status: " & .Status, vbInformation, "URL Not Responding"
            End If
        End With
    End Function
There are a lot of things to fix in your code. It is late here so I will just give pointers (and update fully later) and working code below:
- Declare all variables and use appropriate type
- Review For loops and how Transpose can be used to create a 1D array of URLs pulled from the sheet to loop over
- Review the difference between querySelector and querySelectorAll methods
- Review CSS selectors (you are specifying everything as type selector when in fact you are not selecting by tag for the elements of interest; nor by your stated text)
- Think about placement of your IE object creation and of your .Navigate2 to make use of existing object
- Make sure to use distinct loop counters
- Be sure not to overwrite values in sheet
    Option Explicit

    Public Sub ListingInfo()
        Dim ie As New InternetExplorer, ws As Worksheet
        Dim i As Long, urls(), rowCounter As Long
        Dim title As Object, price As Object, description As Object
        Set ws = ThisWorkbook.Worksheets("Sheet1")
        urls = Application.Transpose(ws.Range("A1:A2").Value) '<= Adjust
        With ie
            .Visible = True
            For i = LBound(urls) To UBound(urls)
                If InStr(urls(i), "http") > 0 Then
                    rowCounter = rowCounter + 1
                    .Navigate2 urls(i)
                    While .Busy Or .readyState < 4: DoEvents: Wend
                    Set title = .document.querySelector(".it-ttl")
                    Set price = .document.querySelector("#prcIsum")
                    Set description = .document.querySelector("#viTabs_0_is")
                    ws.Cells(rowCounter, 3) = title.innerText
                    ws.Cells(rowCounter, 4) = price.innerText
                    ws.Cells(rowCounter, 5) = description.innerText
                    Set title = Nothing: Set price = Nothing: Set description = Nothing
                End If
            Next
            .Quit
        End With
    End Sub
Here's an approach using Web Requests, using MSXML. It should be significantly faster than using IE, and I'd encourage you to strongly consider using this approach wherever possible.
You'll need references to Microsoft HTML Object Library and Microsoft XML v6.0 to get this working.
    Option Explicit

    Public Sub SubmitRequest()
        Dim URLs As Excel.Range
        Dim URL As Excel.Range
        Dim LastRow As Long
        Dim wb As Excel.Workbook: Set wb = ThisWorkbook
        Dim ws As Excel.Worksheet: Set ws = wb.Worksheets(1)
        Dim ListingDetail As Variant
        Dim i As Long
        Dim j As Long
        Dim html As HTMLDocument
        ReDim ListingDetail(0 To 2, 0 To 10000)

        'Get URLs
        With ws
            LastRow = .Cells(.Rows.Count, 1).End(xlUp).Row
            Set URLs = .Range(.Cells(1, 1), .Cells(LastRow, 1))
        End With

        'Update the ListingDetail
        For Each URL In URLs
            Set html = getHTML(URL.Value2)
            ListingDetail(0, i) = html.getElementByID("itemTitle").innertext 'Title
            ListingDetail(1, i) = html.getElementByID("prcIsum").innertext 'Price
            ListingDetail(2, i) = html.getElementsByClassName("viSNotesCnt")(0).innertext 'Seller Notes
            i = i + 1
        Next

        'Resize array
        ReDim Preserve ListingDetail(0 To 2, 0 To i - 1)

        'Dump in Column T,U,V of existing sheet
        ws.Range("T1:V" & i).Value = WorksheetFunction.Transpose(ListingDetail)
    End Sub

    Private Function getHTML(ByVal URL As String) As HTMLDocument
        'Add a reference to Microsoft HTML Object Library
        Set getHTML = New HTMLDocument
        With New MSXML2.XMLHTTP60
            'False = synchronous, so .send returns only once the response has arrived
            .Open "GET", URL, False
            .send
            getHTML.body.innerHTML = .responseText
        End With
    End Function
- On the first iteration of your loop, i is going to be equal to zero. Also, this code shouldn't compile, as you have a duplicate declaration in your scope (you're declaring i twice).
- You're also using Option Explicit, but I don't see where you declare Description. What am I missing here? This sub shouldn't even be able to run.
- I like this! One small issue that just came to light for me... There seem to be some listings without Document.querySelector(".viSNotesCnt").innerText, which then causes an error. Any ideas on how to pull these too, or add a column that pulls this description as well and leave the #N/A's as blanks? Example: ebay.com/itm/Apple-iPhone-X-256GB-Space-Gray-Unlocked-GSM/…
- @RCarmody I modified the short version to escape the error. I'm not sure what information that you would want to use to replace it.
- So that worked for the one, but when I try and apply it elsewhere (say, for the prcISum) then I get a type mismatch error. Is this because one is looking at class and the other ID?
- @RCarmody the proper way to do it is to create a variable for each item and test if the variable is Nothing after setting it.
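
A minimal sketch of that pattern, assuming a reference to the Microsoft HTML Object Library. The helper name SafeInnerText, the demo sub, and the sample markup are hypothetical and only illustrate the "set a variable, then test it for Nothing" idea; they are not part of any of the answers above.

```vba
Option Explicit

' Hypothetical helper: returns the matched element's text,
' or an empty string when the selector matches nothing.
Public Function SafeInnerText(ByVal doc As MSHTML.HTMLDocument, _
                              ByVal selector As String) As String
    Dim element As Object
    Set element = doc.querySelector(selector)
    ' Only touch .innerText once we know an element was found.
    If Not element Is Nothing Then SafeInnerText = element.innerText
End Function

Public Sub SafeInnerTextDemo()
    Dim doc As MSHTML.HTMLDocument
    Set doc = New MSHTML.HTMLDocument

    ' Made-up markup: it has a price element but no seller notes.
    doc.body.innerHTML = "<span id=""prcIsum"">US $199.99</span>"

    Debug.Print "[" & SafeInnerText(doc, "#prcIsum") & "]"
    Debug.Print "[" & SafeInnerText(doc, ".viSNotesCnt") & "]"
End Sub
```

Because the same test works for id selectors ("#prcIsum") and class selectors (".viSNotesCnt"), each field can be written with one call, for example cell.Offset(0, 2).Value = SafeInnerText(Document, "#prcIsum"), and missing fields are left blank.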
- One more question... I got everything working, but this is taking approx. 21 min per 1000 URLs. I have thousands of URLs to go through. Is there any way to speed this up?