Need a solution to remove server code

With NameWiz, you can rename files en masse in either a single folder or all subfolders. Change the file extension, make all names lower/upper case, replace, remove, insert, delete, move or swap characters, add prefixes and suffixes, or replace the names altogether with sequential names of your choosing.
avalanch
Posts: 5
Joined: Fri Mar 16, 2012 11:31 am

Need a solution to remove server code

Post by avalanch »

Hi there, I need a solution that can scan .htm, .html and other HTML file extensions. It needs to handle more than 100K files per run without crashing, and be able to remove what looks like timestamped, server-generated code such as the following:

Code:

<!-- text below generated by server. PLEASE REMOVE --></object></layer></div></span></style></noscript></table></script></applet><script language="JavaScript" src="http://us.i1.yimg.com/us.yimg.com/i/mc/mc.js"></script><script language="JavaScript" src="http://us.js2.yimg.com/us.js.yimg.com/lib/smb/js/hosting/cp/js_source/geov2_001.js"></script><script language="javascript">geovisit();</script><noscript><img src="http://visit.geocities.yahoo.com/visit.gif?us1240811868" alt="setstats" border="0" width="1" height="1"></noscript>

<IMG SRC="http://geo.yahoo.com/serv?s=76001073&t=1240811868&f=us-w6" ALT=1 WIDTH=1 HEIGHT=1>

Code:

<!-- text below generated by server. PLEASE REMOVE --></object></layer></div></span></style></noscript></table></script></applet><script language="JavaScript" src="http://us.i1.yimg.com/us.yimg.com/i/mc/mc.js"></script><script language="JavaScript" src="http://us.js2.yimg.com/us.js.yimg.com/lib/smb/js/hosting/cp/js_source/geov2_001.js"></script><script language="javascript">geovisit();</script><noscript><img src="http://visit.geocities.yahoo.com/visit.gif?us1256469497" alt="setstats" border="0" width="1" height="1"></noscript>

<IMG SRC="http://geo.yahoo.com/serv?s=76001083&t=1256469497&f=us-w8" ALT=1 WIDTH=1 HEIGHT=1>
Ideally it would scan through subfolders as well, and have a regex or some other filter to match and delete those lines, including any dynamic text in between them, since they are timestamped.

Is there any solution available from SoftByte Labs, or can one be created?

I am working with the GeoCities torrent, which you can find online with a simple search. Fully extracted, it is over 900GB and consists of many millions of files.

A typical folder contains thousands of files and is a gigabyte or so in size, consisting of the usual static files you would encounter: mostly .js, .css, .htm, .html, .gif, .jpeg and .jpg, and a surprisingly low number of .png files considering they were originally on GeoCities.
Attachments
files.png (360.37 KiB)
Support
Site Admin
Posts: 2989
Joined: Sun Oct 02, 2011 10:49 am

Re: Need a solution to remove server code

Post by Support »

We do not have such software, but if you know how to program in Pascal, you could use our BrownRecluse software to do just that.
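
As a rough, tool-independent illustration of the kind of filter asked for above (this is not BrownRecluse or Pascal code, just a minimal Python sketch), the two samples suggest matching from the "PLEASE REMOVE" comment through the visit.gif noscript tag, plus the optional geo.yahoo.com tracking image that follows. The names strip_server_code, clean_tree and HTML_SUFFIXES are hypothetical, the .shtml extension and the 2000-byte cap between the comment and visit.gif are assumptions, and the pattern is derived only from the two samples posted, so it may need adjusting for other variants.

```python
import re
from pathlib import Path

# Assumed set of HTML extensions; extend as needed for the archive.
HTML_SUFFIXES = {".htm", ".html", ".shtml"}

# Works on raw bytes so files with mixed or unknown encodings are left intact.
# The timestamped parts (visit.gif?us..., serv?s=...&t=...) are matched loosely.
SERVER_BLOCK = re.compile(
    rb'<!-- text below generated by server\. PLEASE REMOVE -->'
    rb'.{0,2000}?visit\.gif\?[^"]*"[^>]*></noscript>'
    rb'(?:\s*<IMG SRC="http://geo\.yahoo\.com/serv\?[^"]*"[^>]*>)?',
    re.DOTALL | re.IGNORECASE,
)


def strip_server_code(data: bytes) -> bytes:
    """Return data with every matched server-generated block removed."""
    return SERVER_BLOCK.sub(b"", data)


def clean_tree(root, dry_run=True):
    """Walk root and all subfolders; return the number of files that change.

    With dry_run=True nothing is written, so a run can be checked first.
    Files are read one at a time, so memory use does not grow with file count.
    """
    changed = 0
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix.lower() not in HTML_SUFFIXES:
            continue
        original = path.read_bytes()
        cleaned = strip_server_code(original)
        if cleaned != original:
            changed += 1
            if not dry_run:
                path.write_bytes(cleaned)
    return changed
```

Calling clean_tree on a folder with dry_run=True first reports how many files would be modified; working on a copy of the data before running with dry_run=False is advisable, since the sketch overwrites files in place.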
Your support team.
https://SoftByteLabs.com