restart a crawl?

Posted: Sat Nov 03, 2012 1:36 pm
by alpha2
Hi,

Crawling now works fantastically. Thanks for the hints.

Sorry, I have another question: The crawl detects a lot of links. As it is a shared computer and the internet connection is slow, I cannot wait until everything is crawled. Is there any way to store the current status, including the links already identified, and continue the crawling process later, e.g. the next day? I have to shut down the computer between crawls. Not only the "to do list" should be restored, but also the list of already crawled HTML files.

Best regards,

Alpha2

Re: restart a crawl?

Posted: Sat Nov 03, 2012 1:58 pm
by Support
Go to the Structure and save the scan to a bw6 file. Later, re-open it (from the Structure) and start the scan; it should resume from where it left off.

Re: restart a crawl?

Posted: Sat Nov 03, 2012 5:11 pm
by alpha2
It restarted somehow. But before, the list showed 400 remaining. When I restarted it, there were 0 remaining. Somehow the remaining files got lost.

Re: restart a crawl?

Posted: Sat Nov 03, 2012 6:07 pm
by Support
Well, this means it doesn't resume. But what if you put your system in sleep mode? This will retain all data and keep all programs open. All you have to do is pause the scan (do not exit BW) and put the computer to sleep. When you turn it back on, go to the BW browser, log in to the site, then click Scan to continue scanning.

Re: restart a crawl?

Posted: Sat Nov 03, 2012 6:22 pm
by alpha2
So there is no way to somehow export the "remaining"? I would be interested in having a look at this list anyway. So you can neither show nor export nor import the remaining items?

The sleep mode is not so easy, as I'm not the only user of the PC. Other users might want to shut down both BW and the PC...
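
For readers wondering what "export and import the remaining" would mean in general terms, here is a minimal Python sketch of saving and restoring crawl state. It is not how BW works internally and does not use any BW file format; the function names `save_state` and `load_state`, the JSON layout, and the file name in the usage note are all assumptions made for illustration.

```python
import json
import os
import tempfile
from collections import deque


def save_state(path, pending, crawled):
    """Write the pending queue and the crawled set to a JSON file."""
    state = {"pending": list(pending), "crawled": sorted(crawled)}
    # Write to a temporary file in the same folder, then rename it over
    # the target, to reduce the risk of a half-written state file if the
    # computer is shut down during the save.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(state, handle, indent=2)
    os.replace(tmp_path, path)


def load_state(path):
    """Return (pending, crawled); both are empty if no state file exists."""
    if not os.path.exists(path):
        return deque(), set()
    with open(path, encoding="utf-8") as handle:
        state = json.load(handle)
    return deque(state["pending"]), set(state["crawled"])
```

A crawler built this way would call `load_state("crawl_state.json")` at startup, skip any URL already in the crawled set, and call `save_state` when paused or at regular intervals. Because the state is plain JSON, the remaining list can also be opened and inspected in a text editor.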

Re: restart a crawl?

Posted: Sat Nov 03, 2012 6:31 pm
by Support
I guess not! But I know that BW v5 does save the scan so you can resume. Perhaps it got dropped in v6 because most sites now use session IDs, which expire after x number of minutes if not used, so resuming would produce nothing but 404 errors. You can install v5 alongside v6, into a different folder, and use v5 for this site and v6 for other sites that don't require a resume. If you own v6, you can get a registration for v5 at no cost.
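
To illustrate the session ID problem mentioned above: a saved URL that carries an expired session ID in its query string may no longer be valid the next day. One generic workaround, sketched below in Python, is to strip such parameters before storing a URL for later. This is not a BW feature; the function name `strip_session_params` and the parameter names in `SESSION_PARAMS` are assumptions, and whether a given site accepts URLs without its session parameter depends entirely on that site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of session parameter names; real sites vary widely.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}


def strip_session_params(url):
    """Return the URL without query parameters that look like session IDs."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

For example, `strip_session_params("http://example.com/page?id=5&sid=abc123")` returns `"http://example.com/page?id=5"`. On a site that requires a login, a fresh login would still be needed before resuming.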

Re: restart a crawl?

Posted: Sat Nov 03, 2012 6:54 pm
by alpha2
Where do I get BW version 5? I'm still experimenting with BW in the trial period.

Re: restart a crawl?

Posted: Sat Nov 03, 2012 7:00 pm
by Support
You can get it from our download page...

http://softbytelabs.com/us/downloads.html