The renewal maintenance has officially ended for Progress iMacros effective November 30, 2023.
This Wiki site will also no longer be moderated from the Progress side.
Thank you again for your business and support.
Sincerely, The Progress Team
VBS looping
Looping through a table that spans several pages (using a VBS script)
This tutorial describes how to extract data from lists that span more than one page. The method works well for collecting data from typical master-detail pages (e.g. a search listing where you need to click on a result link, collect some more details, and then go back to the search results).
Here is what it will look like in the end:
(source code is given below)
The site in question
We start with a real estate site that lists certain locations.
(Note that we are not in any way affiliated with any of the companies that appear in the screenshots.)
Step 1: Record extraction macro
When recording the extraction, we find that the field containing the text "Date Posted: ... days" can be used as an anchor for addressing the result line we are interested in. Experimenting with the POS value of the corresponding TAG command, we find that "1" gives the first line, "4" the second, "7" the third, and so on (the POS value increases by 3 for each line).
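The POS values described above (1, 4, 7, ...) form an arithmetic sequence, so the anchor position for the n-th result line can be computed directly. The following is a minimal sketch of that arithmetic; the function name AnchorPos is hypothetical, and the step of 3 is an observation about this particular page layout that will likely differ on other sites.

```vbscript
Option Explicit

' Hypothetical helper: returns the anchor's POS value for the n-th result
' line (1-based). The step of 3 matches the layout observed on this site.
Function AnchorPos(rowNumber)
    AnchorPos = 1 + 3 * (rowNumber - 1)
End Function

Dim row
For row = 1 To 3
    ' Prints POS=1, POS=4 and POS=7 for rows 1, 2 and 3
    WScript.Echo "Row " & row & " -> POS=" & AnchorPos(row)
Next
```

Run with cscript or wscript, this shows the same 1, 4, 7 progression found by experimenting with the recorded macro.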
Step 2: Minimal script around macro
We translate the macro into a VBS script, which extracts the first entry when run.
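A minimal script for this step might look like the sketch below. It is reconstructed from the full source code at the end of this page, with the anchor's POS value fixed at 1, so treat the exact macro lines as an assumption about what was recorded in Step 1.

```vbscript
Option Explicit
Dim iim1, iret
Set iim1 = CreateObject("iMacros")
iret = iim1.iimOpen("-ng", False) ' connect to open iMacros browser window

' Extraction macro with a fixed anchor position (POS=1 = first result line)
Dim macro
macro = macro & "VERSION BUILD=6110122 " & vbNewLine
macro = macro & "TAB T=1" & vbNewLine
macro = macro & "TAB CLOSEALLOTHERS" & vbNewLine
macro = macro & "TAG POS=1 TYPE=TD ATTR=TXT:*Date<SP>Posted:*days" & vbNewLine
macro = macro & "TAG POS=R1 TYPE=TD ATTR=TXT:* EXTRACT=TXT" & vbNewLine
macro = macro & "TAG POS=R1 TYPE=TD ATTR=TXT:*Bath* EXTRACT=TXT" & vbNewLine
macro = macro & "TAG POS=R2 TYPE=TD ATTR=TXT:* EXTRACT=TXT " & vbNewLine
macro = macro & "TAG POS=R1 TYPE=TD ATTR=TXT:$* EXTRACT=TXT"

' Play the macro once and show the extracted values of the first entry
iret = iim1.iimPlayCode(macro)
MsgBox iim1.iimGetExtract()
```

The relative positions (POS=R1, POS=R2) are resolved from the anchor cell, so changing only the anchor's POS value is enough to address a different result line.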
Step 3: Looping through all results on one page
Now, we put the macro into a loop which
- sets the anchor's POS value by the variable "counter"
- increments the counter (i.e. the anchor's POS value) by 3 after each extraction
- loops until the extraction throws an error (return value iret < 0)
Additionally, we shorten the !TIMEOUT value so the macro does not wait 60s before returning the error when the end of the list is reached.
The script now extracts not only the first entry, but all entries on that results page.
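The single-page loop described above can be sketched as follows. This is derived from the full source code at the end of this page with the page-turning part left out; the guard around MsgBox (showing the extract only when the macro succeeded) is an addition for illustration and is not part of the original script.

```vbscript
Option Explicit
Dim iim1, iret
Set iim1 = CreateObject("iMacros")
iret = iim1.iimOpen("-ng", False) ' connect to open iMacros browser window

Dim macro
macro = macro & "TAB T=1" & vbNewLine
macro = macro & "SET !TIMEOUT 6" & vbNewLine ' fail quickly at the end of the list
macro = macro & "TAG POS={{counter}} TYPE=TD ATTR=TXT:*Date<SP>Posted:*days" & vbNewLine
macro = macro & "TAG POS=R1 TYPE=TD ATTR=TXT:* EXTRACT=TXT" & vbNewLine
macro = macro & "TAG POS=R1 TYPE=TD ATTR=TXT:*Bath* EXTRACT=TXT" & vbNewLine
macro = macro & "TAG POS=R2 TYPE=TD ATTR=TXT:* EXTRACT=TXT " & vbNewLine
macro = macro & "TAG POS=R1 TYPE=TD ATTR=TXT:$* EXTRACT=TXT"

Dim counter
counter = 1
Do While Not (iret < 0)
    iim1.iimSet "counter", counter      ' fills the {{counter}} placeholder
    iret = iim1.iimPlayCode(macro)
    If Not (iret < 0) Then
        MsgBox iim1.iimGetExtract()     ' assumption: only show successful extractions
    End If
    counter = counter + 3               ' anchor of the next result line
Loop
```

The loop stops as soon as the anchor for the next line cannot be found, which is why the shortened !TIMEOUT matters here.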
Step 4: Looping through all the pages
Finally, we want the script to scrape all result pages.
On every page, there is a "Next" link:
Recording a click on this link produces the following TAG command:
TAG POS=1 TYPE=A ATTR=TXT:Next
This command should run when the loop reaches the end of the current page (i.e. when the extraction macro fails). So we add code at the end of the loop that
- checks whether the end of the list is reached
- tries to move to next page
- in case there is a next page, the loop starts again (counter is reset, return value is not negative)
- in case there is no next page, the TAG fails, and the loop ends
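The four points above can be sketched as a single procedure. The name ScrapeAllPages and its parameters are hypothetical; the sketch assumes that iim is an iMacros object that is already connected and that macroText is the extraction macro containing the {{counter}} placeholder. Unlike the original script, it increments the counter only after a successful extraction, which is a small restructuring for readability.

```vbscript
Option Explicit

' Hypothetical wrapper around the loop logic described above.
'   iim       - connected iMacros object (assumed to be opened by the caller)
'   macroText - extraction macro using the {{counter}} placeholder
Sub ScrapeAllPages(iim, macroText)
    Dim iret, counter
    iret = 0
    counter = 1
    Do While Not (iret < 0)
        iim.iimSet "counter", counter
        iret = iim.iimPlayCode(macroText)
        If iret < 0 Then
            ' End of the list on this page: try to move to the next page.
            ' If the link is missing, the TAG fails and the loop ends.
            iret = iim.iimPlayCode("TAG POS=1 TYPE=A ATTR=TXT:Next")
            counter = 1 ' start again at the first line of the new page
        Else
            MsgBox iim.iimGetExtract()
            counter = counter + 3
        End If
    Loop
End Sub
```

With the object and macro set up as in the full script, the call would be ScrapeAllPages iim1, macro.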
And here we are: the script scrapes the first results page, then moves on to the next one, and thus fetches the data from all items on all pages.
The Script's Source Code
Option Explicit
Dim iim1, iret
Set iim1 = CreateObject("iMacros")
iret = iim1.iimOpen("-ng", False) 'connect to open iMacros browser window

Dim macro
macro = macro & "VERSION BUILD=6110122 " & vbNewLine
macro = macro & "TAB T=1" & vbNewLine
macro = macro & "TAB CLOSEALLOTHERS" & vbNewLine
macro = macro & "SET !TIMEOUT 6" & vbNewLine
macro = macro & "TAG POS={{counter}} TYPE=TD ATTR=TXT:*Date<SP>Posted:*days" & vbNewLine
macro = macro & "TAG POS=R1 TYPE=TD ATTR=TXT:* EXTRACT=TXT" & vbNewLine
macro = macro & "TAG POS=R1 TYPE=TD ATTR=TXT:*Bath* EXTRACT=TXT" & vbNewLine
macro = macro & "TAG POS=R2 TYPE=TD ATTR=TXT:* EXTRACT=TXT " & vbNewLine
macro = macro & "TAG POS=R1 TYPE=TD ATTR=TXT:$* EXTRACT=TXT"

Dim counter
counter = 1
Do While Not (iret < 0)
    iim1.iimSet "counter", counter
    iret = iim1.iimPlayCode(macro)
    msgbox (iim1.iimGetExtract())
    counter = counter + 3
    If (iret < 0) Then 'end of list reached -> next page
        iret = iim1.iimPlayCode("TAG POS=1 TYPE=A ATTR=TXT:Next")
        counter = 1
    End If
Loop