BrownRecluse is a programmable web spider. It scans a web site and retrieves the information you need. For example, you could scan a real estate web site, collect all of the agents' addresses, phone numbers, and emails, and save the data to a tab-delimited file that you can then import into Excel.
Support
Site Admin
Posts: 2813 Joined: Sun Oct 02, 2011 10:49 am
by Support » Sun Oct 02, 2011 4:53 pm
Here is a YellowPages.com ( http://www.yellowpages.com/ ) script that pulls the business name, street address, city, state, ZIP code, phone number, and website URL for each result of a search.
Code:
// Edit this section as you see fit.
State = 'tx';
Business = 'used-Computers';
OutputFile = ScriptPath + Business + '.txt';
//--------------------------------------------------------------------------------------------------------------------
PerlRegEx = Yes;
Output.Clear;
Link = New(URL);
rx = New(RegEx);
sx = New(RegEx);
OutFile = New(File);
OutFile.Open(OutputFile);
OutFile.Truncate;
OutFile.Write('BusinessName'+tab+'Street'+tab+'City'+tab+'State'+tab+'ZipCode'+tab+'Phone'+tab+'Website'+crlf);
Abort = No;
CurPage = 1;
LastPage = 0;
while not Abort do begin
  lnk = 'http://www.yellowpages.com/'+State+'/'+Business+'?page='+CurPage+'&sort=alpha';
  Link.Get(lnk);
  // Read the last page number once, to size the progress bar.
  if LastPage = 0 then begin
    LastPage = Val(WildGet(Link.Data, '<a href="[^"]+\?page=(\d+)[^"]*">Last</a>'));
    Progress.Maximum = LastPage;
  end;
  Progress.Position = CurPage;
  // The "Next" link carries the next page number; it is missing on the last page.
  CurPage = Decode(WildGet(Link.Data, '<a href="[^"]+\?page=(\d+)[^"]+">Next</a>'));
  sx.Data = Link.Data;
  sx.Mask = '<div class="info">(.*?)</ul>';
  // Each match of sx is one business listing.
  while (sx.Match) and (not Abort) do begin
    rx.Data = sx.Value[1];
    BusName = Trim(Decode(WildGet(rx.Data, '<a\s+[^>]+>(.*?)</a>')));
    // ([^<]*) captures the field text up to the next tag.
    Street = Trim(Decode(WildGet(rx.Data, 'class="street-address">([^<]*)')));
    City = Trim(Decode(WildGet(rx.Data, 'class="locality">([^<]*)')));
    st = Trim(Decode(WildGet(rx.Data, 'class="region">([^<]*)')));
    ZipCode = Trim(Decode(WildGet(rx.Data, 'class="postal-code">([^<]*)')));
    Phone = Trim(Decode(WildGet(rx.Data, 'phone">([^<]*)')));
    Website = Decode(WildGet(rx.Data, '<li><a href="([^"]+)" class="track-visit-website'));
    DataLine =
      BusName +tab+
      Street +tab+
      City +tab+
      st +tab+
      ZipCode +tab+
      Phone +tab+
      Website
    ;
    Output(DataLine);
    OutFile.Write(DataLine+crlf);
  end;
  // Stop when there is no "Next" page.
  if CurPage = Nothing then Break;
end;
function OnStop();
begin
  Result = @Abort;
  @Abort = Yes;
end;

function OnTerminate();
begin
  if @OutFile then @OutFile.Close;
  Display('Data saved to...'+crlf+@OutputFile);
end;
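For readers who want to try the same two-pass idea outside BrownRecluse, here is a minimal Python sketch. It is not part of the script above, and it makes some assumptions: the sample markup is hypothetical (the real YellowPages.com HTML may differ), and the names `BLOCK_RE`, `FIELD_RES`, `first_group`, and `extract_rows` are made up for illustration. An outer pattern isolates each listing, and inner patterns pull one field each from that listing.

```python
import html
import re

# Hypothetical sample markup, loosely modelled on the patterns used in the
# script above. The real site's HTML may look different.
SAMPLE_HTML = """
<div class="info"><h3><a href="/biz/1">Acme &amp; Sons Computers</a></h3>
<span class="street-address">12 Main St</span>
<span class="locality">Austin</span>
<span class="region">TX</span>
<span class="postal-code">78701</span>
<span class="phone">(512) 555-0100</span>
<ul><li><a href="http://acme.example" class="track-visit-website">Website</a></li></ul>
</div>
<div class="info"><h3><a href="/biz/2">Byte Barn</a></h3>
<span class="street-address">9 Elm Ave</span>
<span class="locality">Dallas</span>
<span class="region">TX</span>
<span class="postal-code">75201</span>
<span class="phone">(214) 555-0199</span>
<ul></ul>
</div>
"""

# Outer pattern: one match per business listing.
BLOCK_RE = re.compile(r'<div class="info">(.*?)</ul>', re.S)

# Inner patterns: one capture group per output column, in column order.
FIELD_RES = {
    "BusinessName": re.compile(r'<a\s+[^>]+>(.*?)</a>', re.S),
    "Street": re.compile(r'class="street-address">([^<]*)'),
    "City": re.compile(r'class="locality">([^<]*)'),
    "State": re.compile(r'class="region">([^<]*)'),
    "ZipCode": re.compile(r'class="postal-code">([^<]*)'),
    "Phone": re.compile(r'phone">([^<]*)'),
    "Website": re.compile(r'<li><a href="([^"]+)" class="track-visit-website'),
}


def first_group(pattern, text):
    """Return the first capture group, HTML-decoded and trimmed, or ''."""
    match = pattern.search(text)
    return html.unescape(match.group(1)).strip() if match else ""


def extract_rows(page):
    """Return one list of column values per listing found in the page."""
    rows = []
    for block in BLOCK_RE.finditer(page):
        rows.append([first_group(rx, block.group(1)) for rx in FIELD_RES.values()])
    return rows


rows = extract_rows(SAMPLE_HTML)
print("\t".join(FIELD_RES))   # header line
for row in rows:
    print("\t".join(row))     # one tab-delimited line per business
```

A listing without a website link simply gets an empty last column, so every line keeps the same number of tabs.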
Alienizer
Posts: 57 Joined: Sun Oct 02, 2011 11:41 am
by Alienizer » Sun Oct 02, 2011 7:38 pm
Thanks, Support, for the script. I didn't know BrownRecluse could do that! I only found out today while looking at your new forum.
Can we use BrownRecluse to harvest emails as well?
by Support » Sun Oct 02, 2011 7:45 pm
That's one reason why we upgraded it. The old version was too dated.
Yes, you can harvest emails with BrownRecluse, even addresses that spell out the word "at" in place of @ to deter spiders. That trick will not stop BrownRecluse. In fact, you can harvest anything you want from any site.
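To illustrate the "at" word case, here is a rough Python sketch (not BrownRecluse script). It assumes only a few common disguises, namely `name at host dot com`, `[at]`/`[dot]`, and `(at)`/`(dot)`; the pattern and function names are made up for this example, and a pattern this loose can produce false positives on ordinary sentences.

```python
import re

# Plain addresses such as name@example.com.
PLAIN_RE = re.compile(r'[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+')

# Assumed disguises for "@" and ".": the bare word, or the word in brackets.
AT = r'\s*(?:\[at\]|\(at\)|\s+at\s+)\s*'
DOT = r'\s*(?:\[dot\]|\(dot\)|\s+dot\s+)\s*'
DOT_RE = re.compile(DOT, re.I)

# Group 1 is the local part, group 2 the still-disguised domain.
OBFUSCATED_RE = re.compile(
    rf'([A-Za-z0-9._%+-]+){AT}([A-Za-z0-9-]+(?:{DOT}[A-Za-z0-9-]+)+)',
    re.I,
)


def harvest_emails(text):
    """Return plain and de-obfuscated addresses, without duplicates."""
    found = PLAIN_RE.findall(text)
    for match in OBFUSCATED_RE.finditer(text):
        domain = DOT_RE.sub('.', match.group(2))  # "shop dot example" -> "shop.example"
        found.append(f'{match.group(1)}@{domain}')
    # dict.fromkeys keeps the first occurrence of each address, in order.
    return list(dict.fromkeys(found))


SAMPLE_TEXT = (
    "Contact: sales@acme.example or john [at] byte-barn [dot] example, "
    "or mary at shop dot example dot org."
)
print(harvest_emails(SAMPLE_TEXT))
```

The sample text and addresses are invented. Plain addresses are listed first, followed by the de-obfuscated ones.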
by Alienizer » Sun Oct 02, 2011 7:48 pm
Cool
How hard is it to script this thing? I can program a little bit in JavaScript, but I'm good at PHP.
by Support » Sun Oct 02, 2011 7:51 pm
It's very easy if you know PHP. It's like Basic, but easier. Have you looked at our free scripts?
http://softbytelabs.com/Support/scripts ... on=default
The best way to learn scripting in BrownRecluse is to look at other scripts. But of course you can ask questions here on the forum.
by Alienizer » Sun Oct 02, 2011 7:54 pm
Yep, I was just looking at them. Didn't pay attention to them for the longest time!
I'm gonna go play with this now, and if I have any questions, I'll let you know.
Thanks for the quick support!
by Support » Sun Oct 02, 2011 7:56 pm
You are welcome.
by Support » Tue Dec 18, 2012 8:31 pm
It wasn't designed to work on that one. A new script would have to be made specifically for that site.
BRuser
Posts: 1 Joined: Mon Mar 02, 2015 5:24 pm
by BRuser » Mon Mar 02, 2015 5:29 pm
The YP script doesn't seem to be working. Are there any plans to update it? Also, the scripts pages on the website don't seem to be working.
Thanks
by Support » Mon Mar 02, 2015 7:00 pm
The scripts we provide are as-is and were functional at the time they were made. After many years they may no longer work, and we hope our users will update them, as it is too much work for us to handle. That's why we removed the scripts page: most of the scripts were never updated.