no pictures

BlackWidow scans websites (it's a site ripper). It can download an entire website, or download portions of a site.
alpha2
Posts: 31
Joined: Tue Oct 30, 2012 8:24 am

no pictures

Post by alpha2 » Tue Nov 20, 2012 4:10 pm

Hi,

(By the way, I bought BW in the meantime... ;-) ) I want to download from vorname.com. The files are pretty big. Is there any chance of downloading the files without the pictures? They are within the HTML file, not separate.

Regards,

Alpha

Support
Site Admin
Posts: 1851
Joined: Sun Oct 02, 2011 10:49 am

Re: no pictures

Post by Support » Tue Nov 20, 2012 4:16 pm

Thanks :D

Yes, it's possible. But can you give me the URL of a page? It's all in German and I don't know where to click or what to get!
Your support team.
http://SoftByteLabs.com

alpha2
Posts: 31
Joined: Tue Oct 30, 2012 8:24 am

Re: no pictures

Post by alpha2 » Tue Nov 20, 2012 4:38 pm

www.vorname.com/name,elias.html, but as 99% of the site consists of pages of that kind, I scan the whole site anyway, with these parameters:

[BlackWidow v6.00 filters]
URL = ...
[ ] Expert mode
[x] Scan everything
[x] Scan whole site
Local depth: 0
[x] Scan external links
[ ] Only verify external links
External depth: 2
Default index page: index.html
Browser user agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; (R1 1.6); .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET CLR 1.1.4322; InfoPath.1; .NET4.0C; .NET4.0E; MS-RTC LM 8; OfficeLiveConnector.1.5; OfficeLivePatch.1.3; .NET CLR 2.0.50727; Creative AutoUpdate v1.41.05)
Startup referrer: http://www.vorname.com/jungennamen.html
[x] Slow down by 1:2 seconds
6 threads
[x] Add *vorname*.com/*name* from URL using wildcard
[x] Do not follow * using wildcard
[end]

Support
Site Admin
Posts: 1851
Joined: Sun Oct 02, 2011 10:49 am

Re: no pictures

Post by Support » Tue Nov 20, 2012 5:02 pm

So you want to download the whole site without the images?

alpha2
Posts: 31
Joined: Tue Oct 30, 2012 8:24 am

Re: no pictures

Post by alpha2 » Tue Nov 20, 2012 5:37 pm

Yes. By limiting the scan to pages that also contain "name" after the slash, I am trying to avoid any surprises.

Support
Site Admin
Posts: 1851
Joined: Sun Oct 02, 2011 10:49 am

Re: no pictures

Post by Support » Tue Nov 20, 2012 7:27 pm

Here are the filters for you. They should scan and list only the ".../name,...html" pages and nothing else. Copy that block and click the Paste Settings button in the filters window...

[BlackWidow v6.00 filters]
URL = http://www.vorname.com/jungennamen.html
[ ] Expert mode
[ ] Scan everything
[x] Scan whole site
Local depth: 0
[ ] Scan external links
[ ] Only verify external links
External depth: 1
Default index page: 
Startup referrer: 
[ ] Slow down by 2:2 seconds
4 threads
[x] Follow ^http://www\.vorname\.com/jungennamen,.?,\d+\.html$ using regular expression
[x] Add ^http://www\.vorname\.com/name,[^/]+\.html$ from URL using regular expression
[end]
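
To see what the two regular expressions in the block above would let through, here is a minimal Python sketch. It assumes that Python's re module treats these simple patterns the same way BlackWidow's regex engine does, and that "Follow" means a link is crawled while "Add" means it is listed; both are my reading of the filter block, not documented behaviour. The classify helper and every sample URL except the name,elias page are hypothetical.

```python
import re

# The two patterns copied from the filter block above.
FOLLOW = re.compile(r"^http://www\.vorname\.com/jungennamen,.?,\d+\.html$")
ADD = re.compile(r"^http://www\.vorname\.com/name,[^/]+\.html$")


def classify(url):
    """Return 'add', 'follow' or 'skip' for a URL (hypothetical helper)."""
    if ADD.match(url):
        return "add"      # a name page that would end up in the list
    if FOLLOW.match(url):
        return "follow"   # an index page that is only crawled for links
    return "skip"         # everything else, including image files


if __name__ == "__main__":
    samples = [
        "http://www.vorname.com/name,elias.html",       # page named in this thread
        "http://www.vorname.com/jungennamen,a,2.html",  # assumed index page layout
        "http://www.vorname.com/images/logo.png",       # assumed image URL
    ]
    for url in samples:
        print(classify(url), url)
```

Under these assumptions, image URLs match neither pattern, so they would be neither followed nor listed.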

alpha2
Posts: 31
Joined: Tue Oct 30, 2012 8:24 am

Re: no pictures

Post by alpha2 » Wed Nov 21, 2012 2:34 pm

That's not what I meant. I don't have problems finding the links. The question is whether I can download only the pages, but not the pictures within the pages. For example, the pages have a size of 250 kB on average, but most of these 250 kB are images. I would like to download each page with only the HTML, so only about 30 kB. The remaining 220 kB are the images, which I don't want.

Support
Site Admin
Posts: 1851
Joined: Sun Oct 02, 2011 10:49 am

Re: no pictures

Post by Support » Wed Nov 21, 2012 4:13 pm

If the file is an HTML file of 250 kB, that's because the HTML text is that size, not because it includes a picture. Pictures are not normally part of an HTML file; they are only links and come back individually as jpg, png, etc. The code above will scan and list only the HTML, without pictures. Or perhaps I'm still not clear on what you are trying to do.
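
The point about images being links can be checked on a saved page with a small Python sketch like the one below, using only the standard library. It collects every img src attribute and separates ordinary links (fetched as separate files) from inline data: URIs, which are the one case I know of where image bytes do sit inside the HTML text. The sample markup and the names ImageRefCollector and image_report are made up for illustration.

```python
from html.parser import HTMLParser


class ImageRefCollector(HTMLParser):
    """Collects the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.refs.append(src)


def image_report(html_text):
    """Report HTML size, linked images, and bytes used by inline images."""
    collector = ImageRefCollector()
    collector.feed(html_text)
    collector.close()
    linked = [s for s in collector.refs if not s.startswith("data:")]
    inline = [s for s in collector.refs if s.startswith("data:")]
    return {
        "html_bytes": len(html_text.encode("utf-8")),
        "linked": linked,  # separate downloads, not part of the HTML size
        "inline_bytes": sum(len(s.encode("utf-8")) for s in inline),
    }


# Made-up sample page: one linked image and one tiny inline image.
SAMPLE = (
    '<html><body><h1>Elias</h1>'
    '<img src="/img/elias.jpg">'
    '<img src="data:image/png;base64,AAAA">'
    '</body></html>'
)

if __name__ == "__main__":
    print(image_report(SAMPLE))
```

If inline_bytes is close to zero for a real page, the size of the HTML file comes from markup, scripts and text, not from pictures.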

alpha2
Posts: 31
Joined: Tue Oct 30, 2012 8:24 am

Re: no pictures

Post by alpha2 » Mon Nov 26, 2012 9:37 am

You are right, sorry. I didn't understand the size. But looking into the code, I see that there is a lot of stuff, even without the pictures.
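
For anyone who wants to see where that "stuff" goes, here is a rough Python sketch that splits the bytes of an HTML page into script content, style content, visible text and remaining markup. It is an approximation: character references are decoded by the parser before they are counted, so the markup figure can be slightly off on real pages. The sample page and the names SizeBreakdown and breakdown are made up for illustration.

```python
from html.parser import HTMLParser


class SizeBreakdown(HTMLParser):
    """Sums the bytes of script content, style content and other text."""

    def __init__(self):
        super().__init__()
        self._current = None  # "script" or "style" while inside such an element
        self.sizes = {"script": 0, "style": 0, "text": 0}

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        key = self._current or "text"
        self.sizes[key] += len(data.encode("utf-8"))


def breakdown(html_text):
    """Return byte counts per category; 'markup' is whatever is left over."""
    parser = SizeBreakdown()
    parser.feed(html_text)
    parser.close()
    total = len(html_text.encode("utf-8"))
    sizes = dict(parser.sizes)
    sizes["markup"] = total - sum(parser.sizes.values())
    sizes["total"] = total
    return sizes


# Made-up sample page.
SAMPLE = (
    "<html><head><style>p{color:red}</style>"
    "<script>var x=1;</script></head>"
    "<body><p>Hi</p></body></html>"
)

if __name__ == "__main__":
    print(breakdown(SAMPLE))
```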
