Herself's Webtools

Scripts, HowTos, Templates, Plugins, Widgets, Tips

Archive for November, 2007

Email yourself when someone gets a 404 error on your site

It used to be that web hosts gave you full access to log files and error files. Somewhere along the line that changed, and now many of them want you to pay extra for log file access. But we have PHP, so we can do our own logging.

When a user requests a page that is not on your site he is directed to your 404 page. Most web hosts let you totally customize this page.

On a regular website add the following code to your 404 page:

<?php
$url = $_SERVER['REQUEST_URI'];
$message = "URL: $url";
// mail() is PHP's built-in mailer, so no WordPress is needed here.
mail('youremail@yahoo.com', '404 error', $message);
?>

Change youremail@yahoo.com to whatever address should receive these alerts. I don’t recommend using your main email account. If some badly written spider gets tangled in your web, you don’t want 500 emails landing in the inbox you use to communicate with friends and family.
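If a flood of emails is the worry, the same 404 page can append each miss to a log file instead of, or alongside, sending mail. What follows is only a minimal sketch: the helper names format_404_line and log_404 are made up for illustration, and it assumes PHP 7 or later.

```php
<?php
// Build one tab-separated log line for a missing page.
// format_404_line and log_404 are hypothetical helper names.
function format_404_line(string $url, string $referrer, string $agent, int $time): string
{
    // Replace tabs and newlines so each request stays on one line.
    $clean = function (string $s): string {
        return str_replace(["\t", "\r", "\n"], ' ', $s);
    };
    return implode("\t", [
        date('Y-m-d H:i:s', $time),
        $clean($url),
        $clean($referrer),
        $clean($agent),
    ]) . "\n";
}

// Append the current request to $logFile.
// Returns the number of bytes written, or false if the write fails.
function log_404(string $logFile)
{
    $line = format_404_line(
        $_SERVER['REQUEST_URI'] ?? '',
        $_SERVER['HTTP_REFERER'] ?? '',
        $_SERVER['HTTP_USER_AGENT'] ?? '',
        time()
    );
    // FILE_APPEND adds to the end of the file; LOCK_EX avoids interleaved writes.
    return file_put_contents($logFile, $line, FILE_APPEND | LOCK_EX);
}
```

Calling log_404() with a path of your choosing from the 404 page would record one line per missing page. Pick a location that visitors and search engines cannot read.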

If you are using WordPress, add the following code to the 404 template (404.php) in your theme:

<?php
$url = $_SERVER['REQUEST_URI'];
$message = "URL: $url";

wp_mail(get_option('admin_email'), sprintf(__('[%s] 404 Error'), get_option('blogname')), $message);
?>

Now what happens if you get some spider or human that refuses to believe you do not have a copy of ‘nakedmoviestar.html’ and you no longer want 300 emails every day about this? You could do a 301 redirect to your local church or you can change your code like so:

<?php
$url = $_SERVER['REQUEST_URI'];
$message = "URL: $url";
// Send the email only when the requested URL does not contain the nuisance file name.
if (strpos($url, 'nakedmoviestar.html') === false) {
    wp_mail(get_option('admin_email'), sprintf(__('[%s] 404 Error'), get_option('blogname')), $message);
}
?>

The if (strpos($url, 'nakedmoviestar.html') === false) { } check can be used on regular as well as WP setups. Ah, but what if you have more than one idiot out there and you want to block emails about ‘nakedmoviestar.html’ and ‘windowshack.html’? Just add more conditions, joined with &&, so the email goes out only when none of the names match:

if (strpos($url, 'nakedmoviestar.html') === false && strpos($url, 'windowshack.html') === false) {
    wp_mail(get_option('admin_email'), sprintf(__('[%s] 404 Error'), get_option('blogname')), $message);
}

You can add as many && strpos($url, 'filename') === false conditions as you need.
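Once the list of nuisance names grows, chaining conditions gets unwieldy. A minimal sketch of an alternative keeps the names in an array and checks them in a loop. The function name should_skip_404_email and the variable $ignored are made up for illustration, and it assumes PHP 7 or later. The function_exists check is only there so the sketch does nothing outside WordPress.

```php
<?php
// Return true when $url contains any of the ignored file names.
// should_skip_404_email and $ignored are hypothetical names.
function should_skip_404_email(string $url, array $ignored): bool
{
    foreach ($ignored as $name) {
        // strpos returns false only when $name is absent from $url.
        if (strpos($url, $name) !== false) {
            return true;
        }
    }
    return false;
}

$ignored = ['nakedmoviestar.html', 'windowshack.html'];
$url = $_SERVER['REQUEST_URI'] ?? '';

// Email only for URLs that are not on the ignore list.
if (!should_skip_404_email($url, $ignored) && function_exists('wp_mail')) {
    wp_mail(get_option('admin_email'), sprintf(__('[%s] 404 Error'), get_option('blogname')), "URL: $url");
}
```

Adding another nuisance file then means adding one string to $ignored rather than editing the condition. On a non-WordPress site the wp_mail() call would be swapped for mail().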

Now whenever someone requests a page that is not on your server, you’ll get an email so you can fix it ASAP.

Written by Linda MacPhee-Cobb

November 21st, 2007 at 5:00 am

Posted in how to,security

What are those links doing in your log files?

Ever wonder about them? I did. Years ago I noticed all sorts of URLs showing up in my access-log files. Along with them were some strange referrals from sites that had nothing to do with my site’s subject matter. Most of the referrals were from porn sites. I’d go back to the site to see why they were linking to my site and there’d be no link there. Things got quiet for a while, but I’ve noticed them starting to show up again.

What is happening is that access-logs are sometimes not protected and therefore viewable by the public and by search engines. A link is a link is a link so the less reputable sites would stuff your log files with links to themselves so as to improve their search engine ratings. Having your site associated with porn sites does not do you any favors, unless of course you are running a porn site as well.

How is this done? Bots are sent out with user agent strings that are links to the porn sites. 1×1 pixel images are linked to your site so every time a page load happens your site is referenced from them. Scrapers load up every page on your site, leaving a link in your access file to the porn site for each file fetched. Also browsers are hijacked and every time a hijacked browser visits you a link is left in the access file.

The best way to discourage this is to make sure your access-logs are not public and that you have blocked search engines from crawling them (see Robots).

If you have a persistent problem with specific sites, use your .htaccess file to block them or send them somewhere more appropriate, say whitehouse.gov.
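Where editing .htaccess is not an option, a similar block can be sketched in PHP at the top of a page. This is a swapped-in technique, not the .htaccess rule itself. The function name is_blocked_referrer and the domains spam.example and junk.example are made up for illustration, and it assumes PHP 7 or later.

```php
<?php
// Return true when the referrer's host is a blocked domain or a subdomain of one.
// is_blocked_referrer and the domains below are hypothetical.
function is_blocked_referrer(string $referrer, array $blockedHosts): bool
{
    $host = parse_url($referrer, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return false; // missing or unparsable referrer
    }
    $host = strtolower($host);
    foreach ($blockedHosts as $blocked) {
        $suffix = '.' . strtolower($blocked);
        // Match the domain itself or any subdomain of it.
        if ($host === strtolower($blocked) || substr($host, -strlen($suffix)) === $suffix) {
            return true;
        }
    }
    return false;
}

$blockedHosts = ['spam.example', 'junk.example'];

if (is_blocked_referrer($_SERVER['HTTP_REFERER'] ?? '', $blockedHosts)) {
    http_response_code(403); // refuse the request
    exit;
}
```

To send such visitors elsewhere instead of refusing them, the 403 line could be replaced with a header('Location: ...') redirect. Note that the referrer is supplied by the client, so this only discourages the lazier offenders.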

Written by Linda MacPhee-Cobb

November 19th, 2007 at 5:00 am