
Guide: Web Scraping Multi-Page Websites with uProc and n8n

uProc saved time and technical resources by using n8n to collect and process data from a multi-page website.

Miquel Colomer describes himself as a "passionate IT, data, and open-source guy." He is also the founder of uProc (and creator of the uProc node), a company that provides data solutions such as collection, cleaning, and automation.

One of his projects involved collecting and processing data from a multi-page website – a task he completed using n8n. Let's explore the common challenges in web scraping, how Miquel built a low-code workflow for this use case, and his advice for getting started with process automation.

Use Case: Collecting Banking Information

uProc builds tools that simplify accessing and collecting data (about individuals, companies, products, etc.) and using the Internet as a data source. In one project, Miquel had to create two tools for bank-related data: one that returns financial data for a given Swift code, and one that returns the Swift code for a given IBAN.
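To make the two inputs concrete, here is a minimal sketch of what a Swift code and an IBAN look like structurally. It is not uProc's implementation. It assumes the standard layouts: an 8- or 11-character Swift code (bank, country, location, optional branch) and the IBAN mod-97 checksum. The function names are hypothetical.

```python
import re

# Assumed layout: 4 letters (bank), 2 letters (country),
# 2 alphanumerics (location), optional 3 alphanumerics (branch).
SWIFT_PATTERN = re.compile(r"[A-Z]{4}[A-Z]{2}[A-Z0-9]{2}([A-Z0-9]{3})?")
IBAN_PATTERN = re.compile(r"[A-Z0-9]{5,34}")


def split_swift_code(code):
    """Split a Swift code into its parts, or return None if it is malformed."""
    code = code.strip().upper()
    if not SWIFT_PATTERN.fullmatch(code):
        return None
    return {
        "bank": code[0:4],
        "country": code[4:6],
        "location": code[6:8],
        "branch": code[8:11] or None,  # empty for 8-character codes
    }


def iban_checksum_ok(iban):
    """Return True if the IBAN passes the mod-97 checksum."""
    iban = iban.replace(" ", "").upper()
    if not IBAN_PATTERN.fullmatch(iban):
        return False
    # Move the country code and check digits to the end,
    # then map letters to numbers (A=10 ... Z=35).
    rearranged = iban[4:] + iban[:4]
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1
```

A lookup tool would typically run checks like these before querying any data source, so that malformed input is rejected early.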

The benefit of using the Internet as a data source is that it provides an enormous amount of data, so Miquel could find the information he needed. However, when using this information in his application, he encountered several challenges.

Challenges in Web Scraping

When trying to collect data from the Internet, Miquel frequently faced three main challenges:

  • Data is scattered across different sources, which makes collection and maintenance difficult.
  • Data comes in different formats (e.g., HTML, RSS, CSV, XML), which makes combining and processing it complex.
  • Data is sometimes outdated, which makes it hard to build reliable and useful applications.

Eventually, Miquel found the Swift codes he needed for his application at https://www.theswiftcodes.com. The next step was to collect this data in a structured way. Initially, he used Python scripts with dedicated web-crawling libraries like Scrapy.

Although the scripts could complete the task, writing them was repetitive and time-consuming: it meant selecting the right tags and selectors, then formatting and processing the data to make it usable in the final application.
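To give a sense of that hand-written work, the following is a minimal sketch, not Miquel's script. It uses Python's standard-library html.parser instead of Scrapy so that it runs without extra dependencies. The table layout and the field names passed to extract_rows are assumptions made for illustration.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip rows without <td> cells, e.g. header rows
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html, field_names):
    """Turn table rows into dicts keyed by the given field names."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return [dict(zip(field_names, row)) for row in parser.rows]


# Hypothetical markup; the real site's structure may differ.
SAMPLE = """
<table>
  <tr><th>Bank</th><th>City</th><th>Swift code</th></tr>
  <tr><td>Example Bank</td><td>Barcelona</td><td><a href="/x">EXAMESBB</a></td></tr>
</table>
"""

records = extract_rows(SAMPLE, ["bank", "city", "swift"])
```

Even for one simple table, the script has to track tags, assemble cell text, and map cells to fields, and every change to the page layout means revisiting that code.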

To avoid writing lengthy code by hand, Miquel turned to process automation with n8n.

Low-Code Solution for Multi-Page Website Scraping

Miquel built a 22-node low-code workflow to collect data from paginated static websites. This workflow extracts data from each country page on the website https://www.theswiftcodes.com/browse-by-country/ and stores the collected information in MongoDB.

Figure 1: uProc-web-scraping-workflow.png

To perform the individual tasks in the web scraping process, this workflow uses two regular nodes (MongoDB and uProc) and ten core nodes:

  • Execute Command node (to automatically create a local cache directory before starting the web scraping process and avoid re-scraping already scraped pages)
  • HTTP Request node (to access data from the website https://www.theswiftcodes.com)
  • HTML Extract node (to extract the desired content from the website based on its HTML tags)
  • Function and Function Item nodes (to run custom JavaScript code, for example, to set up additional pages to scrape)
  • Set node (to set required fields before transferring data)
  • IF node (to filter information by conditional logic, for example, checking if a Swift code already exists in the database)
  • Read Binary File and Write Binary File nodes (to read and write data collected from the website)
  • Split In Batches node (to iterate through data)
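The control flow these nodes implement can be sketched in plain Python. This is an illustration under assumptions, not an export of the actual workflow: the comments mapping steps to nodes are a rough reading of the list above, a dict stands in for MongoDB, and the fetch and parse callables are hypothetical hooks supplied by the caller.

```python
import hashlib
from pathlib import Path


def cached_fetch(url, cache_dir, fetch):
    """Return the page body for url, reading from cache_dir when possible."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)  # like the Execute Command step
    name = hashlib.sha256(url.encode("utf-8")).hexdigest() + ".html"
    path = cache_dir / name
    if path.exists():                             # like Read Binary File
        return path.read_text(encoding="utf-8")
    body = fetch(url)                             # like HTTP Request
    path.write_text(body, encoding="utf-8")       # like Write Binary File
    return body


def scrape(start_url, cache_dir, fetch, parse, store):
    """Walk paginated pages from start_url and store records not seen before.

    parse(body) must return (records, next_url_or_None), where each record
    is a dict with a "swift" key. store is a dict keyed by Swift code.
    """
    url = start_url
    visited = set()
    while url and url not in visited:             # stop at the last page or a loop
        visited.add(url)
        body = cached_fetch(url, cache_dir, fetch)
        records, url = parse(body)                # like HTML Extract + Function
        for record in records:                    # like Split In Batches
            if record["swift"] not in store:      # like the IF check
                store[record["swift"]] = record
    return store
```

Passing fetch and parse in as arguments keeps the sketch runnable offline. In practice, fetch would issue the HTTP request and parse would extract the rows and the link to the next page.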

With this workflow, Miquel not only completed his project but also saved valuable time and resources by automating repetitive programming work.

Getting Started with Process Automation

Process automation tools like n8n help you design powerful automated workflows, increasing productivity and reducing human error. You can combine apps, services, and core functions to automate common small tasks, enhance workflows with a few lines of JavaScript code, and even create products supported by automation.

Miquel describes his own approach this way:

"Use no-code or low-code solutions to create MVPs or quick jobs. I avoid programming, only programming what I need."

His advice for anyone wanting to use process automation is:

"Use your imagination to create side projects. Think of a problem you need to solve and try to solve it with n8n."

