The Dexi.io web scraping service has reworked its functionality by adding [paid plan] addons. Addons make more features available to customers, e.g. more step types and pipe actions. They also allow scrape results to be delivered to data stores and endpoints such as PostgreSQL, MySQL, Amazon S3 and others.

Each addon unlocks the corresponding functionality as robot (Pipe, Extractor) actions. However, most of the addons require a Professional plan (see all plans). The major addon categories are the following:

  • Captcha solving (integration with third-party solving services)
  • Extractors
  • Geography (communication with social networks for geo info)
  • Image processing
  • Integrations (AWS, Google Drive, Google Sheets, Box, (S)FTP, Webhook)
  • Machine learning (Text analysis with machine learning – MonkeyLearn service integration)
  • Math
  • Social media
  • Text analysis

We’ll try two addons below. You are welcome to share your own experience with Dexi.io addons in the comments.

Image Manipulation addon

First, add the Image Manipulation addon. Note that this addon performs image manipulations only in Pipe robots.

  1. Create an Extractor robot to get images from AliExpress or any other source.
  2. Create a Pipe robot and include an Execute robot action (or node), linking it to the previously created robot.
  3. From the Transforms actions choose the As Fields node to treat the scraped images as data fields.
  4. Now, with the Image Manipulation functionality unlocked, add Resize image from the Images action set; see the picture below.

 

dexi-io-image-functionality-unlocked

Now run the Pipe robot (creating a configuration for it) and get the execution results.
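The Resize image action is configured in the Dexi UI, and its exact options are not covered here. As a rough illustration of what an aspect-preserving resize computes, here is a minimal Python sketch. The function name `fit_within` and the bounding-box behaviour are assumptions for illustration, not Dexi's implementation.

```python
def fit_within(width, height, max_width, max_height):
    """Scale (width, height) to fit inside a bounding box, keeping the aspect ratio.

    Hypothetical helper: it only mirrors what a typical "resize image" step
    computes and is not taken from Dexi.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    # The smaller of the two ratios guarantees both sides fit in the box.
    scale = min(max_width / width, max_height / height)
    # Never return a zero-sized side for very thin images.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(1200, 800, 300, 300))  # (300, 200)
```

Note that this sketch also scales small images up to the box; a real resize step may or may not do that.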

Google GEO addon

Get Google Geocoding API credentials

To use this addon, go to your Google Cloud console and get a developer API key.

dexi-maps-geo-addon-google-api-key

Now we can use that API key in an addon configuration.
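For orientation, this is roughly the kind of request such an API key authorizes. The sketch below builds a Google Geocoding API request URL and reads the coordinates out of a response. It makes no network call, `YOUR_API_KEY` is a placeholder, the sample response is hand-written with dummy coordinates, and how the Dexi addon itself calls the API is not documented here, so treat this as an assumption.

```python
import json
from urllib.parse import urlencode

# Public JSON endpoint of the Google Geocoding API.
GEOCODE_ENDPOINT = "https://maps.googleapis.com/maps/api/geocode/json"


def build_geocode_url(address, api_key):
    """Return the request URL for one address lookup."""
    return GEOCODE_ENDPOINT + "?" + urlencode({"address": address, "key": api_key})


def extract_location(response_text):
    """Return (lat, lng) of the first result, or None if nothing was found."""
    payload = json.loads(response_text)
    if payload.get("status") != "OK" or not payload.get("results"):
        return None
    location = payload["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


# Hand-written, trimmed sample in the shape of a Geocoding response.
# The coordinates are dummy values, not real data.
SAMPLE = '{"status": "OK", "results": [{"geometry": {"location": {"lat": 1.5, "lng": -2.5}}}]}'

print(build_geocode_url("Toronto, Canada", "YOUR_API_KEY"))
print(extract_location(SAMPLE))
```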

Configure addon

Open the Google Maps Geocoding addon and edit it, inserting the API key that you obtained from the Google Cloud console (Credentials):

dexi-maps-geo-addon-configure

Ok. Now the addon functionality is available among the Pipe actions.

Create data type and load a data set

To test the addon, we assemble a list of cities in Canada and save it as a data set. But first we need to create a data type that matches the new data set.

Create a new data type Cities Simple containing 3 fields: name, latitude, longitude.

Then I created a new data set named Canada cities and loaded data from a CSV file into it.

dexi-io-maps-geo-load-data-set

After the import, the data set looks like this:

dexi-io-geo-cities-in-data-set
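The CSV file used for the import is not shown, so the layout below is an assumption: one header row matching the three Cities Simple fields, with the coordinates left blank for the addon to fill in. A minimal Python sketch for checking such a file before uploading it might look like this (the sample rows and the `load_rows` helper are illustrative only):

```python
import csv
import io

# Field names of the Cities Simple data type described above.
FIELDS = ("name", "latitude", "longitude")

# Illustrative sample only; the real CSV from the walkthrough is not shown.
SAMPLE_CSV = """name,latitude,longitude
Toronto,,
Montreal,,
Vancouver,,
"""


def load_rows(csv_text):
    """Parse CSV text and return one dict per row, limited to FIELDS."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [f for f in FIELDS if f not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"CSV is missing columns: {missing}")
    # Short rows yield None for absent cells, so normalise those to "".
    return [{f: (row[f] or "").strip() for f in FIELDS} for row in reader]


rows = load_rows(SAMPLE_CSV)
print([r["name"] for r in rows])  # ['Toronto', 'Montreal', 'Vancouver']
```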

Pipe robot to process Geo info

Now it’s time to make a new Pipe robot.

Open a new Pipe robot and add the following Pipe actions to it.

  1. From the Dexi.io actions category choose and add the action For each row in data set. Configure it by selecting the Canada cities data set.
  2. Add an As Fields action, so that each row from the data set is treated as a set of corresponding fields.
  3. From the Geography actions choose Address lookup.
    dexi-io-geo-address-lookup
    Configure it with your existing Google Maps Geocoding addon
    dexi-maps-geo-addon-insert
    and connect only the name field (from the As Fields node) to the Address lookup node input called address.
  4. Now let’s turn the Address lookup rows into fields. Add another As Fields node.

See the full Pipe diagram.

dexi-maps-geo-full-pipe

If we need only the city name, latitude and longitude, we can add a data type node at the end to restrict the output to the fields of that type.

dexi-io-geo-maps-restrict-output
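The data flow of this pipe can be sketched in Python as follows. This is only a conceptual model of the nodes described above, not Dexi code: `run_pipe` and `fake_lookup` are hypothetical names, and the lookup is a stub with dummy coordinates instead of a real Geocoding call.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# A lookup takes an address string and returns (lat, lng) or None.
Lookup = Callable[[str], Optional[Tuple[float, float]]]

OUTPUT_FIELDS = ("name", "latitude", "longitude")  # the Cities Simple data type


def run_pipe(rows: Iterable[Dict], lookup: Lookup) -> List[Dict]:
    """Model of: For each row -> As Fields -> Address lookup -> data type."""
    output = []
    for row in rows:                        # "For each row in data set"
        fields = dict(row)                  # "As Fields"
        # "Address lookup": only the name field feeds the address input.
        location = lookup(fields["name"])
        if location is not None:
            fields["latitude"], fields["longitude"] = location
        # Data type node: keep only the fields of the chosen type.
        output.append({key: fields.get(key) for key in OUTPUT_FIELDS})
    return output


def fake_lookup(address: str) -> Optional[Tuple[float, float]]:
    """Stub standing in for the geocoding addon; coordinates are dummies."""
    return {"Toronto": (1.0, 2.0)}.get(address)


print(run_pipe([{"name": "Toronto", "extra": "x"}, {"name": "Nowhere"}], fake_lookup))
```

Rows for which the lookup finds nothing keep empty coordinates, and any field outside the data type (such as `extra` here) is dropped from the output.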

Save the robot, close it and create a new run. Dexi will offer to make a new configuration. Name it and start the execution.

The results are excellent (below we show all non-empty Address lookup columns/fields).

dexi-io-geo-map-execution-results

Conclusion

The Dexi web scraping tool is an excellent mix of data extraction robots (Extractor, Crawler), data post-processing (image processing, geo info, cloud integrations, etc.) and social media retrieval, all made possible through addon-provided features. It works well for both medium-size and enterprise-level extractions.