Creating a list of unique sites based on keywords

BlackWidow scans websites (it's a site ripper). It can download an entire website, or download portions of a site.
yamraj
Posts: 2
Joined: Thu May 17, 2012 1:12 pm

Post by yamraj » Thu May 17, 2012 1:59 pm

Hi support,
I am trying to spider a few sites and save a list of all the unique sites they link to, filtered by a few keywords.

Basically, I want to start a scan on a short list of sites, follow their external links, and build a unique list of the sites those links point to. If possible, I also want to decide whether to scan each external site based on a few keywords. Is this scenario possible?

Also, is there any API or scheduler that I can use to trigger scans? For example, can I start, stop, or pause scans at scheduled times?

Many Thanks!

Support
Site Admin
Posts: 1854
Joined: Sun Oct 02, 2011 10:49 am

Re: Creating a list of unique sites based on keywords

Post by Support » Thu May 17, 2012 4:56 pm

Hi,

It's possible. In the Filters window, uncheck "Scan external links", but check "Verify only". This will list all of the external links in the "Ext Links" tab.

As for doing this by keywords, BW doesn't provide such a thing, nor does it provide scheduling!
Your support team.
http://SoftByteLabs.com
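
For readers who end up with a long list of external links and want one entry per site, here is a minimal Python sketch. It is not a BlackWidow feature. It assumes the URLs from the "Ext Links" tab have already been copied into a plain list of strings (how to get them out of BlackWidow is not covered here), and the function name unique_sites is made up for this example. Treating "www.example.com" and "example.com" as the same site is also a choice of this sketch, not something from the thread.

```python
from urllib.parse import urlsplit


def unique_sites(urls):
    """Return one hostname per site, in first-seen order."""
    seen = set()
    sites = []
    for url in urls:
        try:
            # hostname is already lowercased; it is None for relative or malformed URLs
            host = urlsplit(url.strip()).hostname
        except ValueError:
            continue  # e.g. a broken IPv6 literal
        if not host:
            continue
        # Assumption of this sketch: a leading "www." does not make a different site
        if host.startswith("www."):
            host = host[4:]
        if host not in seen:
            seen.add(host)
            sites.append(host)
    return sites


if __name__ == "__main__":
    links = [
        "http://www.example.com/page1",
        "https://example.com/page2",
        "http://example.org/",
    ]
    for site in unique_sites(links):
        print(site)
```

The result keeps the order in which sites were first seen, which makes it easy to compare against the original list.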

yamraj
Posts: 2
Joined: Thu May 17, 2012 1:12 pm

Re: Creating a list of unique sites based on keywords

Post by yamraj » Thu May 17, 2012 5:26 pm

Okay, this should work. Thanks for your super quick response. You are the best!

Btw, it would be awesome if there were an API for keywords or a scheduler.


**** Update ****
I tried unchecking "Scan external links" and checking "Verify only", but it only verifies the external links found on the target website. It doesn't follow those external links to find more unique external links.

Can this be achieved with BlackWidow?

Support
Site Admin
Posts: 1854
Joined: Sun Oct 02, 2011 10:49 am

Re: Creating a list of unique sites based on keywords

Post by Support » Thu May 17, 2012 6:02 pm

No problem :)

In that case, you can check Ext Links instead and set the depth to 1. But the links will go in the Structure this time, not in the Ext Links list.

Our BrownRecluse is a programmable spider. That one can do exactly what you want, but it needs to be scripted. There is a way to schedule the run as well.
Your support team.
http://SoftByteLabs.com
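
To make the scenario discussed in this thread concrete, here is a rough Python sketch of the logic: start from a few pages, record every external host they link to, and follow an external page one level deeper only if it mentions one of the keywords. This is not BrownRecluse script and not anything BlackWidow provides. The names crawl, fetch, host_of and LinkParser are invented for the example, the keyword test is a plain substring match, and internal links on the start sites are not followed, to keep it short.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch(url):
    """Download a page as text (default fetcher for this sketch; swap in your own)."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def host_of(url):
    return urlsplit(url).hostname or ""


def crawl(start_urls, keywords, max_depth=1, fetch=fetch):
    """Return the set of external hosts reachable within max_depth hops."""
    keywords = [k.lower() for k in keywords]
    start_hosts = {host_of(u) for u in start_urls}
    external_hosts = set()
    visited = set()
    queue = deque((u, 0) for u in start_urls)
    while queue:
        url, depth = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        try:
            html = fetch(url)
        except (OSError, ValueError):
            continue  # unreachable or unsupported URL
        # Keyword decision: an external page is scanned only if it mentions a keyword
        if depth > 0 and not any(k in html.lower() for k in keywords):
            continue
        parser = LinkParser()
        parser.feed(html)
        for href in parser.hrefs:
            link = urljoin(url, href)
            host = host_of(link)
            # Skip non-web links, links within the current site, and links back to a start site
            if not host or host == host_of(url) or host in start_hosts:
                continue
            external_hosts.add(host)
            if depth < max_depth:
                queue.append((link, depth + 1))
    return external_hosts
```

A call such as crawl(["http://example.com/"], ["widgets"], max_depth=1) would list the hosts linked from the start page, plus the hosts linked from those external pages that mention "widgets". Passing your own fetch function makes it possible to add delays, caching, or robots.txt handling, none of which this sketch does.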
