Automate Collecting Data
From The Web

crwl.io is a web crawling and scraping service.
Use our no-code tool to configure crawlers and scrapers yourself.
Or simply have us build your crawlers for you.

No-Code Web-App

The crwl.io web app is a no-code tool that lets you configure crawlers to your needs without any programming knowledge.

Our user interface enables you to define crawling and scraping procedures using pre-made and configurable building blocks known as "Steps".

Once you run your crawler, it performs the steps you defined and provides you with the desired data.

Custom Extensions for Maximum Flexibility

If you need additional functionality for your crawlers that is not already included in the pre-made steps, you have the option to create your own steps and install them as extensions in the web app. *

Instructions for programming your custom steps can be found in the documentation of the crwlr.software open-source library. You can then conveniently share your own code through a (private) GitHub repository. You'll find more detailed information on this feature in the web app.

* This feature is available starting from the S plan and is not included in the XS plan.

More than Just HTTP and HTML

Typically, the term "web scraping" refers to extracting content from (HTML) websites, which is why many services focus solely on that. However, in practice, there are often cases where data needs to be extracted from other formats like JSON, XML, or CSV. With crwl.io, that's no problem.
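To make the point about non-HTML formats concrete, here is a minimal sketch in Python (standard library only) that pulls the same field out of JSON, XML, and CSV input. The sample data and the function names are made up for illustration; this does not show how crwl.io steps are implemented.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Made-up sample documents that all carry the same two records.
SAMPLE_JSON = '[{"title": "A"}, {"title": "B"}]'
SAMPLE_XML = "<items><item><title>A</title></item><item><title>B</title></item></items>"
SAMPLE_CSV = "title,price\nA,1\nB,2\n"


def titles_from_json(text):
    """Read a JSON array of objects and return each object's "title"."""
    return [item["title"] for item in json.loads(text)]


def titles_from_xml(text):
    """Return the text of every <title> element in the XML document."""
    root = ET.fromstring(text)
    return [element.text for element in root.iter("title")]


def titles_from_csv(text):
    """Read CSV with a header row and return the "title" column."""
    return [row["title"] for row in csv.DictReader(io.StringIO(text))]


print(titles_from_json(SAMPLE_JSON))
print(titles_from_xml(SAMPLE_XML))
print(titles_from_csv(SAMPLE_CSV))
```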

JavaScript Execution or Performance - Maximum Flexibility

Many web crawling and scraping libraries and services only offer to load websites with a so-called headless browser (a regular web browser that is automated and runs without a user interface). However, in most cases, a browser is not actually necessary.

In most situations, a simple HTTP client that loads only the HTML source code of a website, without the linked assets (such as images, CSS, and JavaScript), is sufficient. Because it skips those assets and does not render the page, the HTTP client is much more efficient, and it is the default choice in the crwl.io web app. If needed, you can always switch to using a headless browser.
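As a rough illustration of the plain HTTP client approach, the following Python sketch (standard library only) loads just the HTML document and reads links from its source. The helper names and the sample markup are assumptions made for this example; it is not the code crwl.io runs.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag found in the HTML source."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def fetch_html(url):
    """Request only the HTML document. Linked images, CSS and
    JavaScript files are never downloaded or executed."""
    with urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_links(html):
    """Parse HTML source and return all link targets in document order."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


# Made-up sample page: the <img> is ignored, only the links are read.
SAMPLE_HTML = '<html><body><a href="/a">A</a><img src="big.png"><a href="/b">B</a></body></html>'
print(extract_links(SAMPLE_HTML))
```

This only works when the data is present in the HTML source. Content that a page builds with JavaScript after loading is the case where a headless browser is needed.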

Scheduling

You can start your crawlers manually on demand or schedule them to run automatically at the times you prefer. This way, your crawled data stays up to date.

Flexible Data Export

After a crawler has run successfully, you can download the collected data as a JSON, XML, or CSV file. If you wish to integrate crwl.io crawlers into your own or third-party applications, you can also retrieve your data through our REST API. Combined with webhooks, this lets you fully automate the integration into your applications.

Webhooks

Webhooks are the final ingredient to smoothly integrate the data collected by crwl.io into your own applications. When you set up a webhook URL (a URL that is part of your application) for a crawler, the crawler notifies your application after each successful run. The request to the webhook URL carries the data needed to retrieve the results of that crawler run.
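A webhook receiver on your side could look roughly like the Python sketch below. Everything about the request format is an assumption made for illustration: that the notification arrives as an HTTP POST with a JSON body, and that the body contains a field named "resultUrl". The actual payload is described in the web app, not here.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def result_url_from_payload(body):
    """Extract the location of the run results from the webhook body.
    ASSUMPTION: JSON body with a hypothetical "resultUrl" field."""
    payload = json.loads(body)
    return payload["resultUrl"]


class WebhookHandler(BaseHTTPRequestHandler):
    """Handles the notification sent after a successful crawler run."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            url = result_url_from_payload(body)
        except (ValueError, KeyError, TypeError):
            # Malformed or unexpected payload: reject the request.
            self.send_response(400)
            self.end_headers()
            return
        # A real application would now fetch the results from `url`.
        print("Crawler run finished, results at", url)
        self.send_response(204)
        self.end_headers()


def serve(port=8000):
    """Start the receiver; blocks until the process is stopped."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Calling `serve()` starts the receiver on port 8000. The URL of this endpoint is what you would enter as the webhook URL for a crawler.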

Built on Open Source Software

The foundation of the crwl.io web app is the free and open-source web crawling and scraping library from crwlr.software. Therefore, you can always see in detail how the crawlers and the steps available in the app work and, if necessary, contribute improvements or changes.

Pricing

Plan                             XS        S         M           L
Requests/Day 1)                  5,000     15,000    60,000      250,000
Requests/Month                   150,000   450,000   1,800,000   7,500,000
Storage 2)                       1 GB      5 GB      20 GB       50 GB
Private Instance 3)              no        yes       yes         yes
Extensions 4)                    no        yes       yes         yes
Price per month, including VAT   € 36      € 72      € 240       € 720

1) Refers to HTTP requests executed by your crawlers. Note that requests sent via a headless browser are weighted by a factor of five, as they demand significantly more resources (see "JavaScript Execution or Performance" above). The daily limit is based on the average number of daily HTTP requests within a month. Exceeding the limit on individual days is not an issue, as long as the daily average remains below it.

2) Covers the storage space required for the data collected by your crawlers, as well as for the response cache.

3) In the XS plan, all crawlers run on shared infrastructure. From the S plan upwards, each customer gets their own instance of the crwl.io app.

4) For the same reason (shared infrastructure in the XS plan), it is only possible to install custom extensions in the app starting from the S plan.
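The request accounting described in footnote 1 can be sketched as a small calculation. The function names are made up for this example, and treating an average exactly equal to the limit as still acceptable is an assumption, since the footnote only says the average has to remain below the limit.

```python
# Headless browser requests count five times (footnote 1).
BROWSER_WEIGHT = 5


def weighted_requests(http_requests, browser_requests):
    """Number of requests counted against the plan for one day."""
    return http_requests + BROWSER_WEIGHT * browser_requests


def within_daily_limit(daily_counts, daily_limit):
    """daily_counts: one (http_requests, browser_requests) pair per day.
    The limit applies to the average over all days, not to each day.
    ASSUMPTION: an average exactly at the limit still passes."""
    total = sum(weighted_requests(http, browser) for http, browser in daily_counts)
    average = total / len(daily_counts)
    return average <= daily_limit


# XS plan: 5,000 requests per day on average.
days = [(6000, 0), (2000, 0), (1000, 200)]
print(weighted_requests(1000, 200))    # 1000 + 5 * 200 = 2000
print(within_daily_limit(days, 5000))  # average is 10000 / 3, below 5000
```

In this example the first day exceeds the XS limit of 5,000 on its own, but the average over the three days stays below it, so the limit is respected.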

Beta Signup

The crwl.io app is currently in closed beta.
You can pre-register for an invitation here.