How to scan and download page and directory with same name

BlackWidow scans websites (it's a site ripper). It can download an entire website, or download portions of a site.
kingsley
Posts: 1
Joined: Mon Jul 27, 2015 10:21 pm

Post by kingsley » Mon Jul 27, 2015 10:39 pm

Hi,

Great utility (I've just found it) and it looks really useful. However, I am having an issue scanning one of the sites I run.

I am trying to scan and download a site with the following structure, and because of the clashing names I can't work out how to get both the index page and the directories below it.

Our site uses an old custom PHP CMS that produces sections in the following format:

http://site.com.au/ (an alias to http://site.com.au/index.php)
http://site.com.au/section/ (an alias to http://site.com.au/section/index.php)
Then (unfortunately) the pages in each section have the format:
http://site.com.au/section/index.php/1
http://site.com.au/section/index.php/2
http://site.com.au/section/index.php/3
and so on, depending on the number of pages in that section. (The page numbers may skip, and the site hierarchy is limited to this level, e.g. there is never section/index.php/3/somethingelse.)


The issue I am having is that the crawler can't save both the file /section/index.php and the directory /section/index.php/ because the names clash, so when I save the results the index.php file is skipped.
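
To make the clash concrete, here is a minimal Python sketch that checks a list of URLs for paths that would have to exist on disk as both a file and a directory. It is only an illustration of the problem and says nothing about how BlackWidow works internally; the function name find_name_clashes is made up for this example.

```python
from urllib.parse import urlsplit


def find_name_clashes(urls):
    """Return URL paths that would need to be both a file and a directory."""
    paths = {urlsplit(u).path for u in urls}
    clashes = set()
    for path in paths:
        parts = path.strip("/").split("/")
        # Check every parent prefix of this path, for example
        # /section/index.php for /section/index.php/1.
        for i in range(1, len(parts)):
            parent = "/" + "/".join(parts[:i])
            if parent in paths:
                # The parent is requested as a page of its own, but it must
                # also be a directory in order to hold this child.
                clashes.add(parent)
    return sorted(clashes)


if __name__ == "__main__":
    crawled = [
        "http://site.com.au/section/",
        "http://site.com.au/section/index.php",
        "http://site.com.au/section/index.php/1",
        "http://site.com.au/section/index.php/2",
    ]
    print(find_name_clashes(crawled))
```

For the structure described above, the only path reported is /section/index.php, which is the file that gets skipped on save.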

Is there a way I can tell the crawler to save the index.php files as e.g. index.html and keep the /index.php/ directory to preserve the links within the site?

Happy to provide more information if needed.

Cheers,
Kingsley

Support
Site Admin
Posts: 1679
Joined: Sun Oct 02, 2011 10:49 am

Re: How to scan and download page and directory with same name

Post by Support » Tue Jul 28, 2015 11:44 am

Hi,

All you have to do is set the "Default index page name" to any name you like, such as index.html. Any URL ending with / will then be treated as if it were actually index.html.
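
As a rough illustration of what a default index page name does, the Python sketch below maps a URL to a local save path and appends the default name whenever the URL path ends with /. This is an assumption about the general technique, not BlackWidow's actual code; local_path and default_index are hypothetical names.

```python
from urllib.parse import urlsplit


def local_path(url, default_index="index.html"):
    """Map a URL to a relative save path, naming directory URLs explicitly."""
    path = urlsplit(url).path or "/"
    if path.endswith("/"):
        # A URL ending with / is saved under the default index page name.
        path += default_index
    return path.lstrip("/")


if __name__ == "__main__":
    for url in (
        "http://site.com.au/",
        "http://site.com.au/section/",
        "http://site.com.au/section/index.php/1",
    ):
        print(url, "->", local_path(url))
```

With this mapping, http://site.com.au/section/ is saved as section/index.html, while the numbered pages stay under section/index.php/. Note that in this sketch a link written explicitly as /section/index.php would still map to the clashing name.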
Your support team.
http://SoftByteLabs.com
