This is a bot designed to crawl 1001freefonts.com and download all of the Windows fonts. It was my first bot, and I wasn’t much of a Linux admin yet, so it’s incredibly crude.
REQUIREMENTS
- wget: almost every Linux distro has it
- unzip: the script calls it to unpack each downloaded archive
- php: most of the bigger distros have it. Check php.net to get it if you don’t.
Here’s the code. It’s a little messy, but it works. It’s easiest to run from the command line with the PHP CGI binary, but if you want to go through all the hassle of sorting out permissions, you can run it through Apache instead.
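#!/usr/bin/php
<?php
// Crawl one listing page: download and unpack every Windows font zip on it,
// then follow the "NEXT PAGE" link to the next listing page.
function fontbot($startpage)
{
    if (trim($startpage) == '') {
        $startpage = 'http://www.1001freefonts.com/fonts/afonts.htm';
    }
    $url = $startpage;
    echo 'url = ' . $url . "\n";
    $page = file($url);
    if ($page === false) {
        return; // the page could not be fetched
    }
    // Split the page so that each chunk starts with the target of one link.
    $page = implode(' ', $page);
    $page = explode('<a href=', $page);
    foreach ($page as $line) {
        if (strstr($line, 'winfonts') || strstr($line, '"NEXT PAGE"')) {
            // Everything before the first '>' is the href value; strip its quotes.
            $parts = explode('>', $line);
            $parts[0] = trim(str_replace('"', '', $parts[0]));
            $url = 'http://www.1001freefonts.com' . $parts[0];
            if (strstr($parts[0], 'winfonts')) {
                // Font archive: fetch it, unpack it into fonts/, delete the zip.
                $list = explode('/', $parts[0]);
                $zipfile = array_pop($list);
                exec('wget ' . escapeshellarg($url));
                exec('unzip -o ' . escapeshellarg($zipfile) . ' -d fonts/');
                exec('rm ' . escapeshellarg($zipfile));
            } elseif (!strstr($parts[0], 'afonts.htm')) {
                // Next-page link: recurse, but never back to the start page.
                fontbot($url);
            }
        }
    }
}
fontbot('');
?>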
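If you want to poke at the link-scraping step without hitting the site, the snippet below is a minimal sketch of it on its own. It uses the same explode-on-'<a href=' trick as the bot. The function name extract_hrefs, the sample markup, and the file names in it are made up for illustration, and the real pages may be laid out differently.

```php
<?php
// Hypothetical helper (not part of the original bot): pull href values out of
// a chunk of HTML by splitting on '<a href=' the same way fontbot() does.
function extract_hrefs($html)
{
    $hrefs = array();
    $chunks = explode('<a href=', $html);
    array_shift($chunks); // the text before the first link holds no href
    foreach ($chunks as $chunk) {
        // The href value is everything up to the first '>'; drop its quotes.
        $parts = explode('>', $chunk);
        $hrefs[] = trim(str_replace('"', '', $parts[0]));
    }
    return $hrefs;
}

// Made-up sample markup; the real site may look different.
$sample = '<p>Fonts</p>'
    . '<a href="/fonts/winfonts/example.zip">Windows</a> '
    . '<a href="/fonts/bfonts.htm"><img alt="NEXT PAGE"></a>';

foreach (extract_hrefs($sample) as $href) {
    echo $href . "\n";
}
```

Like the bot, this only copes with links written exactly as <a href="...">. Anything with extra attributes or different quoting would need a real HTML parser.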