No-Code Web Crawling and Scraping Software and Services

No-Code Web App

The crwl.io web app is a no-code tool that lets you configure crawlers to your needs without any programming knowledge.

Our user interface enables you to define crawling and scraping procedures using pre-made and configurable building blocks known as "Steps".

Once you run your crawler, it performs the steps you defined and provides you with the desired data.

Custom Extensions for Maximum Flexibility

If you need functionality for your crawlers that is not included in the pre-made steps, you can create your own steps and install them as extensions in the web app.*

Instructions for programming your custom steps can be found in the documentation of the crwlr.software open-source library. You can then conveniently share your own code through a (private) GitHub repository. You'll find more detailed information on this feature in the web app.

* This feature is available starting from the S plan and is not included in the XS plan.

More than Just HTTP and HTML

Typically, the term "web scraping" refers to extracting content from (HTML) websites, which is why many services focus solely on that. However, in practice, there are often cases where data needs to be extracted from other formats like JSON, XML, or CSV. With crwl.io, that's no problem.
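To illustrate why these formats are easy to treat alike, here is a minimal Python sketch using only the standard library. It is not how crwl.io works internally; the function names and the sample records are invented for this example. It shows the same record being read from JSON, XML, and CSV input into one common shape.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def records_from_json(text: str) -> list[dict]:
    # Assumes the document is a JSON array of flat objects.
    return json.loads(text)


def records_from_xml(text: str) -> list[dict]:
    # Assumes one child element per record, with one sub-element per field.
    root = ET.fromstring(text)
    return [{field.tag: field.text for field in item} for item in root]


def records_from_csv(text: str) -> list[dict]:
    # The first row is treated as the header row.
    return list(csv.DictReader(io.StringIO(text)))


# Hypothetical sample data: the same record in three formats.
json_doc = '[{"title": "Widget", "price": "9.90"}]'
xml_doc = "<items><item><title>Widget</title><price>9.90</price></item></items>"
csv_doc = "title,price\nWidget,9.90\n"

for records in (
    records_from_json(json_doc),
    records_from_xml(xml_doc),
    records_from_csv(csv_doc),
):
    print(records)
```

Whatever the source format, the extracted data ends up as the same list of field/value records.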

JavaScript Execution or Performance - Maximum Flexibility

Many web crawling and scraping libraries and services only offer loading websites with a so-called headless browser (a regular web browser that is automated and runs without a user interface). However, in most cases, a browser is not actually necessary.

In most situations, a simple HTTP client that loads only the HTML source code of a website, without the linked assets (such as images, CSS, and JavaScript), is sufficient. Because it skips those assets, the HTTP client is much faster and uses far fewer resources, which is why it is the default choice in the crwl.io web app. If needed, you can always switch to a headless browser.
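As a rough illustration of the plain HTTP client approach, the following Python sketch fetches only the HTML document and reads links from it, without requesting images, stylesheets, or scripts. It uses only the standard library; the function names and the User-Agent string are made up for this example and have nothing to do with the crwl.io implementation.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Send a single GET request and return the response body as text.

    Linked assets (images, CSS, JavaScript) are never requested.
    """
    request = Request(url, headers={"User-Agent": "example-crawler/0.1"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class LinkCollector(HTMLParser):
    """Collects the href attribute of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html: str) -> list[str]:
    collector = LinkCollector()
    collector.feed(html)
    return collector.links
```

Calling extract_links(fetch_html(url)) costs exactly one HTTP request per page. A headless browser would additionally download and execute every linked asset, which is only worth it when the content you need is rendered by JavaScript.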

Scheduling

Of course, you can start your crawlers not only manually on demand but also schedule them to run automatically at the times you prefer. This way, you keep your crawling data up to date continuously.

Flexible Data Export

After a crawler has run successfully, you can easily download the collected data as a JSON, XML, or CSV file. If you wish to integrate crwl.io crawlers into your own or third-party applications, you can also retrieve your data through our REST API. Combined with webhooks, this lets you fully automate the integration into your applications.

Webhooks

Webhooks are the final ingredient for smoothly integrating the data collected by crwl.io into your own applications. When you set up a webhook URL (a URL that is part of your application) for a crawler, crwl.io notifies your application after each successful run. The request to the webhook URL carries the data needed to retrieve the results of that crawler run.
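On the receiving side, a webhook endpoint can be very small. The sketch below uses Python's standard library and rests on two assumptions that are not confirmed here: that the notification arrives as an HTTP POST, and that its body is a JSON object. The names parse_notification, WebhookHandler, received_runs, and serve are invented for this example; check the web app for the actual payload format.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# In-memory list of received notifications; a real application would
# persist these or hand them to a job queue.
received_runs = []


def parse_notification(raw_body: bytes) -> dict:
    """Decode a webhook body, assuming (not confirmed) it is a JSON object."""
    payload = json.loads(raw_body.decode("utf-8"))
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    return payload


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            received_runs.append(parse_notification(self.rfile.read(length)))
        except ValueError:
            # Malformed body: invalid UTF-8, invalid JSON, or not an object.
            self.send_response(400)
        else:
            # Acknowledge quickly; fetch the run's results afterwards.
            self.send_response(204)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Blocks forever and handles incoming webhook calls."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Calling serve() starts the listener. Your application would then use the data from each stored notification to retrieve the results of the run through the REST API.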

Built on Open Source Software

The foundation of the crwl.io web app is the free and open-source web crawling and scraping library from crwlr.software. Therefore, you can always see in detail how the crawlers and the steps available in the app work and, if necessary, contribute improvements or changes.

Pricing

Limitation/Feature	XS	S	M	L
Requests/Day¹	5.000	15.000	60.000	250.000
Requests/Month	150.000	450.000	1.800.000	7.500.000
Storage²	1 GB	5 GB	20 GB	50 GB
Private Instance³	No	Yes	Yes	Yes
Custom Extensions⁴	No	Yes	Yes	Yes
Price including VAT (monthly billing)	€ 36 per month	€ 90 per month	€ 300 per month	€ 900 per month
Price including VAT (yearly billing)	€ 396 per year	€ 990 per year	€ 3.300 per year	€ 9.900 per year

1) Refers to HTTP requests executed by your crawlers. Requests sent via a headless browser are weighted by a factor of five because they require significantly more resources (see the section on JavaScript execution above). The daily limit is based on the average number of daily HTTP requests within a month. Exceeding the limit on individual days is not an issue as long as the daily average remains below it.

2) The storage space required for the data collected by your crawlers, including the response cache.

3) In the XS plan, all crawlers run on a shared infrastructure. From the S plan upward, each customer gets their own instance of the crwl.io app.

4) For the same reason (shared infrastructure in the XS plan), it is only possible to install custom extensions in the app starting from the S plan.
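The request accounting described in footnote 1 can be worked through with a little arithmetic. The Python sketch below is only an illustration with hypothetical function names and usage figures. It applies the factor of five to headless browser requests and compares the monthly daily average with the plan limit. Treating an average exactly equal to the limit as acceptable is an assumption.

```python
HEADLESS_WEIGHT = 5  # headless browser requests count five times (footnote 1)


def weighted_requests(http_requests: int, headless_requests: int) -> int:
    """Number of requests counted against the plan limit for one day."""
    return http_requests + HEADLESS_WEIGHT * headless_requests


def within_daily_limit(daily_usage: list[tuple[int, int]], daily_limit: int) -> bool:
    """daily_usage holds one (http_requests, headless_requests) pair per day.

    Only the average over the month matters, so single days may exceed
    the limit. An average equal to the limit passes (an assumption).
    """
    total = sum(weighted_requests(http, headless) for http, headless in daily_usage)
    return total / len(daily_usage) <= daily_limit


# Hypothetical month on the XS plan (5.000 requests/day):
# 29 quiet days plus one heavy day that uses a headless browser.
month = [(4000, 0)] * 29 + [(2000, 3000)]

print(weighted_requests(2000, 3000))    # 17000, far above the daily limit
print(within_daily_limit(month, 5000))  # True, the monthly average is about 4433
```

The heavy day counts as 17.000 requests on its own, yet the plan limit is still met because the average over the 30 days stays below 5.000.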

Automate Collecting Data From The Web