restart a crawl?

BlackWidow scans websites (it's a site ripper). It can download an entire website, or download portions of a site.
alpha2
Posts: 31
Joined: Tue Oct 30, 2012 8:24 am

Post by alpha2 » Sat Nov 03, 2012 1:36 pm

Hi,

Crawling now works fantastically. Thanks for the hints.

Sorry, I have another question. The crawl detects a lot of links. Since this is a shared computer and the internet connection is slow, I cannot wait until everything is crawled. Is there any way to save the current state, including the links already identified, and continue crawling later, e.g. the next day? I have to shut down the computer between crawls. Not only the "to do" list needs to be restored, but also the list of HTML files already crawled.

Best regards,

Alpha2
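
For readers wondering what "saving the to-do list and the already-crawled list" involves in general, here is a minimal Python sketch. It does not describe BlackWidow's internals or its bw6 file format; the file name `crawl_state.json` and the functions `save_state` and `load_state` are hypothetical names chosen for illustration.

```python
import json
from collections import deque
from pathlib import Path


def save_state(path, pending, visited):
    """Write the to-do queue and the set of crawled URLs to a JSON file."""
    state = {"pending": list(pending), "visited": sorted(visited)}
    tmp = Path(str(path) + ".tmp")
    tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
    # Rename over the old file only after the new one is fully written,
    # so an interrupted save does not destroy the previous state.
    tmp.replace(path)


def load_state(path):
    """Return (pending, visited); both are empty if no state file exists."""
    p = Path(path)
    if not p.exists():
        return deque(), set()
    state = json.loads(p.read_text(encoding="utf-8"))
    return deque(state["pending"]), set(state["visited"])
```

A crawler built this way would call `load_state("crawl_state.json")` at startup and `save_state(...)` when paused or at regular intervals. Restoring the URL lists does not restore a login session, so pages behind an expired session may still fail after a resume.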

Support
Site Admin
Posts: 1851
Joined: Sun Oct 02, 2011 10:49 am

Re: restart a crawl?

Post by Support » Sat Nov 03, 2012 1:58 pm

Go to the Structure and save the scan to a bw6 file. Later, re-open it (from the Structure) and start the scan; it should resume from where it left off.
Your support team.
http://SoftByteLabs.com

alpha2
Posts: 31
Joined: Tue Oct 30, 2012 8:24 am

Re: restart a crawl?

Post by alpha2 » Sat Nov 03, 2012 5:11 pm

It restarted somehow. But before, the list showed 400 remaining. When I restarted it, it showed 0 remaining. Somehow the remaining files got lost.

Support
Site Admin
Posts: 1851
Joined: Sun Oct 02, 2011 10:49 am

Re: restart a crawl?

Post by Support » Sat Nov 03, 2012 6:07 pm

Well, this means it doesn't resume. But what if you put your system in sleep mode? This will retain all data and keep all programs open. All you have to do is Pause the scan (do not exit BW) and put the computer to sleep. When you turn it back on, go to the BW browser and log in to the site, then click Scan to continue scanning.
Your support team.
http://SoftByteLabs.com

alpha2
Posts: 31
Joined: Tue Oct 30, 2012 8:24 am

Re: restart a crawl?

Post by alpha2 » Sat Nov 03, 2012 6:22 pm

So there is no way to export the "remaining" list? I would be interested in having a look at this list anyway. So you can neither show, export, nor import the remaining items?

The sleep mode is not so easy, as I'm not the only user of the PC. Other users might want to shut down both BW and the PC...

Support
Site Admin
Posts: 1851
Joined: Sun Oct 02, 2011 10:49 am

Re: restart a crawl?

Post by Support » Sat Nov 03, 2012 6:31 pm

I guess not! But I know that BW v5 does save the scan so you can resume. Perhaps it got dropped in v6 because most sites now use session IDs that expire after a certain number of minutes if not used, so resuming would produce nothing but 404 errors. You can install v5 alongside v6, in a different folder, and use v5 for this site and v6 for other sites that don't require a resume. If you own v6, you can get a registration for v5 at no cost.
Your support team.
http://SoftByteLabs.com

alpha2
Posts: 31
Joined: Tue Oct 30, 2012 8:24 am

Re: restart a crawl?

Post by alpha2 » Sat Nov 03, 2012 6:54 pm

Where do I get BW version 5? I'm still experimenting with BW in the trial period.

Support
Site Admin
Posts: 1851
Joined: Sun Oct 02, 2011 10:49 am

Re: restart a crawl?

Post by Support » Sat Nov 03, 2012 7:00 pm

You can get it from our download page...

http://softbytelabs.com/us/downloads.html
Your support team.
http://SoftByteLabs.com
