Hyperion Booklets

Posted: Thu May 03, 2012 2:23 pm
by music_lover
Wondering if you could create a BW web file for http://www.hyperion-records.co.uk/notes/

Clicking here: http://www.hyperion-records.co.uk/notes ... 7580-B.pdf will get me to a PDF version of a CD booklet. I'd like to grab them all.

Thanks!

Re: Hyperion Booklets

Posted: Thu May 03, 2012 2:28 pm
by Support
Clicking on those links you provided gives me an error that I'm 'hot linking'. From the main page, where do I click to get to these 2 links?

Re: Hyperion Booklets

Posted: Thu May 03, 2012 2:35 pm
by music_lover
The first link gives an error. The second link should not. I hope that's enough. I cannot answer your question as I just don't know.

Update: well they seem to have changed things from this morning... might be a result of a keen IT person watching me attempt to get to these files.

Is there anything that can be done?

Another Update: It would seem that getting to them is as easy as this... click on "catalogue indexes" then "artists" then (for example) "singers" then (for example) "all singers" then any one of the singers listed then any one of their CDs. On the left there's a "View sleeve notes/artwork (PDF)" link. That gets you to the PDF.

Does that help?

Re: Hyperion Booklets

Posted: Thu May 03, 2012 2:43 pm
by Support
Can you backtrack the links? I mean, which page are they in?

Re: Hyperion Booklets

Posted: Thu May 03, 2012 2:50 pm
by music_lover
See my edit above and let me know if that's good enough.

Re: Hyperion Booklets

Posted: Thu May 03, 2012 2:54 pm
by Support
OK, that works. Do you want to scan the entire site for PDFs, or just one artist?

Re: Hyperion Booklets

Posted: Thu May 03, 2012 3:12 pm
by music_lover
The entire site for .PDFs.

Thanks!

Re: Hyperion Booklets

Posted: Thu May 03, 2012 3:13 pm
by Support
ok, let me work on it and I'll post the filters here. Give me a few hours.

Re: Hyperion Booklets

Posted: Thu May 03, 2012 5:59 pm
by Support
Here are the filters. Copy them and, in the Filters window, click on the "Paste Settings" button and start the scan.

[BlackWidow v6.00 filters]
URL = http://www.hyperion-records.co.uk/ai.asp?ai=A_Ind_10_1&vw=dc
[ ] Expert mode
[ ] Scan everything
[x] Scan whole site
Local depth: 0
[x] Scan external links
[ ] Only verify external links
External depth: 0
Default index page: 
Startup referrer: 
[ ] Slow down by 10:60 seconds
4 threads
[x] Follow /a\.asp\?a=A[^&]+&vw=dc$ using regular expression
[x] Follow /dc\.asp\?dc=D_[^&]+&vw=dc$ using regular expression
[x] Add \.pdf$ from URL using regular expression
[end]
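
For anyone who wants to check what the three regular expressions in these filters would match, here is a rough Python sketch. It only tests the patterns against a few URLs; it does not reproduce how BlackWidow applies them. Every sample URL except the start URL is made up for illustration, and Python's `re` matching is case-sensitive, which may or may not be how BlackWidow behaves.

```python
import re

# The three patterns, copied verbatim from the filter settings above.
FOLLOW_ARTIST = re.compile(r"/a\.asp\?a=A[^&]+&vw=dc$")
FOLLOW_DISC = re.compile(r"/dc\.asp\?dc=D_[^&]+&vw=dc$")
ADD_PDF = re.compile(r"\.pdf$")


def classify(url: str) -> str:
    """Return 'add', 'follow' or 'ignore' for a URL (illustrative only)."""
    if ADD_PDF.search(url):
        return "add"
    if FOLLOW_ARTIST.search(url) or FOLLOW_DISC.search(url):
        return "follow"
    return "ignore"


BASE = "http://www.hyperion-records.co.uk"

# Hypothetical URLs, except the start URL taken from the filters.
SAMPLES = [
    BASE + "/ai.asp?ai=A_Ind_10_1&vw=dc",   # start URL from the filters
    BASE + "/a.asp?a=A123&vw=dc",           # made-up artist page
    BASE + "/dc.asp?dc=D_EXAMPLE1&vw=dc",   # made-up disc page
    BASE + "/notes/example-B.pdf",          # made-up booklet PDF
    BASE + "/notes/example-B.PDF",          # upper-case extension
]

if __name__ == "__main__":
    for url in SAMPLES:
        print(classify(url), url)
```

Note that the start URL itself matches neither Follow pattern (it is `/ai.asp`, not `/a.asp`); it is only the entry point of the scan.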

Re: Hyperion Booklets

Posted: Thu May 03, 2012 6:24 pm
by music_lover
OK thanks... it's doing its thing. I'll let you know how it works out!

Re: Hyperion Booklets

Posted: Thu May 03, 2012 6:51 pm
by music_lover
Well that just didn't work. I do appreciate your efforts, but absolutely nothing downloads. I have download while scanning checked and a valid download folder selected.

It scans and scans and scans... 2153s time but nothing downloads.

Any ideas?

Re: Hyperion Booklets

Posted: Thu May 03, 2012 7:36 pm
by Support
That's because it goes through every index page from A to Z, and each of those lists a ton of names. Only after that does it reach the PDFs. So you need to let it run for a long while!

Re: Hyperion Booklets

Posted: Thu May 03, 2012 8:26 pm
by music_lover
Hmm... I'm pretty sure it stopped doing much of anything after scanning the 2000+ links but I'll try it again ;)

Re: Hyperion Booklets

Posted: Thu May 03, 2012 8:53 pm
by music_lover
OK so I tried again and it scanned 2,153 links and stopped. Nothing downloaded.

What's next?

Re: Hyperion Booklets

Posted: Thu May 03, 2012 10:07 pm
by Support
ok, this one works. Here's how to use it...

In the top/right corner of the Filters window, click on the "Expert" button and paste the script below into it. Then start the scan from this URL...

http://www.hyperion-records.co.uk/ai.as ... 10_1&vw=dc

case ScannerEvent of

  BeforeFetch:
  begin
    AcceptEvent =
      (DocumentURL ~= '/a\.asp\?a=A[^&]+&vw=dc$') or
      (DocumentURL ~= '/dc\.asp\?dc=D_[^&]+&vw=dc$')
    ;
  end;

  AfterFetch:
  begin
    for each matching('href="(dc\.asp\?dc=D_[^&]+&vw=dc)"') in Document as aLink do begin
      aLink.ResolveRelative(DocumentURL);
      Scanlink(aLink);
    end;
    for each matching("href='([^']*\.pdf)'") in Document as aLink do begin
      aLink.ResolveRelative(DocumentURL);
      Scanlink(aLink);
    end;
  end;

  FoundLink:
  begin
    AcceptEvent =
      (FoundLinkURL ~= '/a\.asp\?a=A[^&]+&vw=dc$') or
      (FoundLinkURL ~= '/dc\.asp\?dc=D_[^&]+&vw=dc$') or
      (FoundLinkURL ~= '\.pdf$')
    ;
  end;

  BeforeAdding:
  begin
    AcceptEvent = (DocumentType ~= 'pdf');
  end;

else
  AcceptEvent = No;

end;
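
The link extraction done in the AfterFetch step can be approximated outside BlackWidow with the rough Python sketch below. It reuses the two `href` patterns from the script and resolves each match against the page URL with `urljoin`, which stands in for `ResolveRelative` here (an assumption about what that call does). The sample HTML and the page URL are invented for illustration; the real pages may be laid out differently.

```python
import re
from urllib.parse import urljoin

# The two href patterns, copied from the AfterFetch block of the script.
# The first expects double quotes, the second single quotes.
DISC_LINK = re.compile(r'href="(dc\.asp\?dc=D_[^&]+&vw=dc)"')
PDF_LINK = re.compile(r"href='([^']*\.pdf)'")


def extract_links(page_url: str, html: str) -> list[str]:
    """Collect disc-page and PDF links from html as absolute URLs."""
    found = []
    for pattern in (DISC_LINK, PDF_LINK):
        for relative in pattern.findall(html):
            # Turn the relative href into an absolute URL.
            found.append(urljoin(page_url, relative))
    return found


# Hypothetical page content; not copied from the real site.
SAMPLE_HTML = """
<a href="dc.asp?dc=D_EXAMPLE1&vw=dc">A disc page</a>
<a href='notes/example-B.pdf'>View sleeve notes/artwork (PDF)</a>
<a href="contact.asp">Contact</a>
"""

PAGE_URL = "http://www.hyperion-records.co.uk/a.asp?a=A1&vw=dc"

if __name__ == "__main__":
    for link in extract_links(PAGE_URL, SAMPLE_HTML):
        print(link)
```

Links that match neither pattern, like the contact link in the sample, are simply skipped.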

Re: Hyperion Booklets

Posted: Fri May 04, 2012 12:50 pm
by music_lover
Well that worked perfectly. Woke up this morning to a folder filled with pdfs.
Awesome!

Thanks!

Re: Hyperion Booklets

Posted: Fri May 04, 2012 2:23 pm
by Support
You are welcome.