Only download url with *Server*?

BlackWidow scans websites (it's a site ripper). It can download an entire website, or download portions of a site.
Klobbe
Posts: 2
Joined: Mon Apr 08, 2013 6:23 am

Post by Klobbe »

Hi!

There is a page that I want to download. It's private and requires a login, so I can't show it here. The site mainly uses JavaScript and has URLs like: http://www.page.com/blablaServer?view=i ... id=1-99999.

So basically I want to download every page whose URL matches *Server?* and ignore everything else. I made this:

case ScannerEvent of

  BeforeParsing:
    begin
      for each matching('(.*)Server?(.*)') in Document as aLink do begin
        aLink.ResolveRelative(DocumentURL); // resolve links like ../foo/bar/
        Scanlink(aLink); // add the link to the scan queue.
      end;
    end;

end;
But it only downloads the starting page and never follows the links on it. What is wrong?

Support
Site Admin
Posts: 1892
Joined: Sun Oct 02, 2011 10:49 am

Re: Only download url with *Server*?

Post by Support »

You must use \? instead of ?, because ? is a special character in regular expressions and must be escaped with a backslash to match a literal question mark. The .* is also wrong here: it matches the surrounding HTML content, not just the link. Instead, check whether the link is in single or double quotes and use...

'([^']+server\?[^']+)'

or

"([^"]+server\?[^"]+)"

If you use the single-quote pattern, change the quotes around it from matching('...') to matching("...") so that the quotes inside the pattern do not end the string early.
Your support team.
http://SoftByteLabs.com
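
For anyone who wants to see the difference between the two kinds of pattern outside BlackWidow, here is a minimal sketch in Python. It uses Python's re module as a stand-in and is not BlackWidow script. The HTML fragment and its URLs are made up for illustration, and the IGNORECASE flag is an assumption so that the lowercase "server" in the suggested patterns also matches "Server"; BlackWidow's own matching may handle case differently.

```python
import re

# Hypothetical HTML fragment; the URLs are invented for illustration.
html = '''<a href="http://www.page.com/blablaServer?view=item&id=1">one</a>
<a href='http://www.page.com/otherServer?view=list&id=2'>two</a>
<a href="http://www.page.com/about.html">about</a>'''

# Original pattern: the unescaped "?" makes the preceding "r" optional,
# and the greedy ".*" swallows the markup on both sides of the link.
loose = re.compile(r'(.*)Server?(.*)')

# Suggested patterns: "\?" is a literal question mark, and [^"]+ / [^']+
# cannot run past the quote that delimits the attribute value.
# IGNORECASE is an assumption (see the note above).
double_quoted = re.compile(r'"([^"]+server\?[^"]+)"', re.IGNORECASE)
single_quoted = re.compile(r"'([^']+server\?[^']+)'", re.IGNORECASE)

# Each loose match is a whole line of HTML, which is not a usable URL.
loose_matches = [m.group(0) for m in loose.finditer(html)]

# The quoted patterns capture only the URL between the quotes.
links = double_quoted.findall(html) + single_quoted.findall(html)

print(loose_matches)
print(links)
```

With the loose pattern, each match is an entire line of markup including the tags, which would explain why nothing useful reaches the scan queue. The quoted patterns return only the two URLs that contain "Server?" and skip the about.html link.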
