If you are looking for cloud-based web scrapers or crawlers, you might have heard of both Import.io and TeraCrawler.io.
Many products in this domain look more or less the same, but their ideal use cases can vary tremendously.
Here are some of the differences between the two.
The main use case
TeraCrawler is ideal when you just need to download a large amount of data regularly and want to set rules such as crawling depth, URL patterns to download or block, and the types of files or documents to download.
Generally, what you will get as an output is a big fat file that you can download and process locally at your convenience.
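To make the kind of rules mentioned above more concrete (crawling depth, URL patterns to download or block, and file types), here is a minimal Python sketch of how such a filter could work. It is only an illustration and not TeraCrawler's actual API or configuration format: the names `CrawlRules` and `should_download`, the rule fields, and the `example.com` URLs are all hypothetical.

```python
import posixpath
import re
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class CrawlRules:
    """Hypothetical rule set; not TeraCrawler's real configuration."""
    max_depth: int = 2                                   # how many links deep to follow
    allow_patterns: list = field(default_factory=list)   # regexes a URL must match (if any are given)
    block_patterns: list = field(default_factory=list)   # regexes that exclude a URL
    allowed_extensions: set = field(
        default_factory=lambda: {"", ".html", ".pdf"}     # "" covers URLs with no file extension
    )


def should_download(url: str, depth: int, rules: CrawlRules) -> bool:
    """Return True if a URL found at the given depth passes every rule."""
    if depth > rules.max_depth:
        return False
    # Block patterns take priority over allow patterns.
    if any(re.search(p, url) for p in rules.block_patterns):
        return False
    if rules.allow_patterns and not any(re.search(p, url) for p in rules.allow_patterns):
        return False
    # Compare the file extension of the URL path against the allowed types.
    extension = posixpath.splitext(urlparse(url).path)[1].lower()
    return extension in rules.allowed_extensions


# Example rules: stay on one (assumed) site, skip login pages, keep HTML and PDF.
rules = CrawlRules(
    max_depth=2,
    allow_patterns=[r"^https://example\.com/"],
    block_patterns=[r"/login"],
)

for candidate, depth in [
    ("https://example.com/docs/report.pdf", 1),
    ("https://example.com/login", 1),
    ("https://example.com/images/logo.png", 1),
]:
    print(candidate, should_download(candidate, depth, rules))
```

In this sketch the block patterns are checked before the allow patterns, so a URL that matches both is skipped. That ordering is a design choice made for the example, not something the service is known to do.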
Import.io is ideal when you want to scrape data directly from the web. It fits when you know exactly what you want and want the crawling and scraping done in the cloud, with the results ready for final consumption. That way, with Import.io you skip downloading and processing the raw data yourself.
Who is it for?
TeraCrawler is ideal for developers who want control over how they scrape the data but don't want to deal with setting up and monitoring resources or with the pitfalls of crawling.
Import.io is ideal for end customers who are not developers and just want the data in an exportable rows-and-columns format.
Company focus
TeraCrawler is in the business of crawling data efficiently, quickly, predictably, and at scale for developers who deal with the problem of large-scale web crawling. It's a fully automated SaaS offering for both mid-market and enterprise-grade customers.
Import.io has shifted its focus to managed data services and is more focused on the enterprise segment.