[FREE] Advanced WebScrapper Extension - Scrape Data From Web Sites

WebScrapper

Scrape websites with a selector. Selector reference: https://is.gd/selectorkullanimi

💡  Current Version 1.0
📁  File Size 505.83 KB
📦  com.ruwis.WebScrapper
📅   Created On 2021-08-30

Method Blocks


MethodBlock1
GetAttribute - Get an attribute from a single element
element input type any
attributes input type text

MethodBlock2
GetElement - Get an element from a single element
element input type any
selector input type text

MethodBlock3
GetElements - Get elements from a single element
element input type any
selector input type text

MethodBlock4
GetText - Get the text from a single element
element input type any

MethodBlock5
ScrapeData - Scrape a website using a selector (see Selector in the jsoup Java HTML Parser 1.15.3 API)
url input type text
selector input type text
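
To make the selector idea concrete, here is a rough sketch in Python (standard library only) of what "select elements, then read their text or attributes" means. This is not the extension's code and not jsoup: `SimpleSelect` and `select` are hypothetical helpers that only understand `tag` or `tag.class` selectors, and the HTML snippet is made up for the illustration.

```python
from html.parser import HTMLParser


class SimpleSelect(HTMLParser):
    """Collect elements matching a 'tag' or 'tag.class' selector.

    Hypothetical helper for illustration only. It assumes the selected
    tag has a closing tag (so it is not suited to void tags like <img>).
    """

    def __init__(self, selector):
        super().__init__()
        self.tag, _, self.cls = selector.partition(".")
        self.matches = []  # one dict per matched element
        self._open = []    # open elements of the selected tag (None = no match)

    def handle_starttag(self, tag, attrs):
        if tag != self.tag:
            return
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        element = None
        if not self.cls or self.cls in classes:
            element = {"attrs": attrs, "text": ""}
            self.matches.append(element)
        self._open.append(element)

    def handle_endtag(self, tag):
        if tag == self.tag and self._open:
            self._open.pop()

    def handle_data(self, data):
        # Text belongs to every matched element that is currently open.
        for element in self._open:
            if element is not None:
                element["text"] += data


def select(html, selector):
    """Roughly what GetElements does: return all matching elements."""
    parser = SimpleSelect(selector)
    parser.feed(html)
    parser.close()
    return parser.matches


# Made-up sample page.
HTML = """
<ul>
  <li class="item"><a class="link" href="/a">First</a></li>
  <li class="item"><a class="link" href="/b">Second</a></li>
</ul>
"""

links = select(HTML, "a.link")
texts = [e["text"].strip() for e in links]      # like GetText
hrefs = [e["attrs"].get("href") for e in links]  # like GetAttribute
print(texts, hrefs)
```

In the extension itself the selector string is handed to jsoup, which supports far richer selectors than this sketch.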

Event Blocks

EventBlock6
ErrorOccurred - Fires when an error occurs and returns the error message
error output type text

EventBlock7
GotElements - Fires when data has been retrieved and returns it as a list
elements output type list


Does not work on sites that load their content with JS / AJAX.

com.ruwis.WebScrapper.aix (505.8 KB)
WebScrapperExample.aia (504.8 KB)

Thank you for your work.
I see there is a description of the blocks, but I still don't know how to use them. Could you please create a sample project that will scrape some data from any page?

I added an example project :blush:

Hey, nice extension, but some websites require JavaScript to be turned on.
This is an example website.
Is it possible to get the source code for this one?


I tried something like this, but it did not work. I got HTML, but not the source code I wanted: it did not include everything on that page, and the source contains the message "please turn on javascript and reload the page".
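
That result is what you would expect from any scraper that does not run JavaScript: it only sees the HTML the server sends, before any script has filled in the page. A rough Python sketch of the effect follows, using only the standard library. `PAGE` is an invented page, and `VisibleText` and `static_text` are hypothetical helpers, not part of the extension.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text outside <script> and <style>, without running any script."""

    def __init__(self):
        super().__init__()
        self._skip = 0    # depth inside <script>/<style>
        self.chunks = []  # non-empty text fragments found in the static HTML

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())


# Invented page: the score only exists after the script has run in a browser.
PAGE = """
<html><body>
<noscript>Please turn on JavaScript and reload the page.</noscript>
<div id="scores"></div>
<script>document.getElementById("scores").textContent = "TOTAL: 150";</script>
</body></html>
"""


def static_text(html):
    """Return the text a client that does not execute JavaScript would find."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return parser.chunks


text = static_text(PAGE)
print(text)  # only the <noscript> notice; the score is never in the HTML
```

The `div` stays empty because nothing executes the script, so a selector pointed at it has nothing to return.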

Thank you for your work. I am still quite confused. Can you help me get cricket scores from cricclubs.com, for example, from this page: League: WEST INDIES vs IRELAND - CricClubs
I want to display the scores as:
TOTAL:
WICKETS:
OVERS:
TARGET:

Is this allowed according to the TOS? CricClubs-Cricket Like Never Before!

That web page has a link to download an Excel file.

The downloaded file name ends in .do, but it can be opened with a text editor like Notepad++, where it can be seen to be just text, with NL line delimiters, comma field separators, and tab (\t) prefixes on some lines.

You should be able to parse that download if you proceed cautiously, taking care not to index past the end of any row list after the CSV table conversion.
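
As a rough sketch of that parsing, written in Python rather than blocks: strip the tab prefixes, convert each line as CSV, and guard every lookup against short rows. The `SAMPLE` text and its column names are invented; the real download's layout has to be checked in a text editor first. `parse_download` and `cell` are hypothetical helper names.

```python
import csv

# Invented sample in the described shape: NL-delimited lines, comma-separated
# fields, and a tab prefix on some lines. The last row is deliberately short.
SAMPLE = "\tName,Runs,Balls\n\tPlayer A,10,12\nPlayer B,7\n"


def parse_download(text):
    """Strip leading tabs, then convert each line into a list of fields."""
    cleaned = (line.lstrip("\t") for line in text.splitlines())
    return [row for row in csv.reader(cleaned) if row]  # drop blank lines


def cell(rows, r, c, default=""):
    """Return rows[r][c], or default when the row or column does not exist."""
    if 0 <= r < len(rows) and 0 <= c < len(rows[r]):
        return rows[r][c]
    return default


rows = parse_download(SAMPLE)
print(cell(rows, 1, 1))  # Runs value of the second line
print(cell(rows, 2, 2))  # short row: falls back to the default, no error
```

Python lists start at index 0 while App Inventor lists start at 1, so the same length check in blocks compares the index against the length of the row list before selecting the item.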

Hi, will this extension scrape Bootstrap 5 Offcanvas components with AI2?

Ajax scraping?

@S11 If you had bothered to read the first post, you would have seen that it does not work on JS / AJAX sites.
