Posted by ljmacphee on August 3, 2007 under blogger, how to, perl, tools |
This is a very simple Perl script that grabs your RSS feed, pulls the link for each post, downloads that page and writes the HTML to your computer’s hard drive. It creates a separate directory for each year and month and stores each HTML page in the directory for the month it was published.
So for this website it would create a root directory ‘herselfswebtools.com’, a directory under that for ‘2007’, and under 2007 it would create directories for ‘01’, ‘02’, ‘03’, ‘04’, ‘05’, ‘06’, ‘07’. The full page, including CSS, sidebars, etc., will then be written in the proper month’s directory. As of now it does not download and save images.
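Here is a minimal sketch of how the year and month directory could be worked out from a post’s RSS pubDate. It is not the code from the backup script itself; the function name month_dir and the sample date are made up for illustration, and it assumes the feed uses the usual "Fri, 03 Aug 2007 14:05:00 +0000" date style.

```perl
use strict;
use warnings;
use File::Spec;

# Map RSS month abbreviations to two-digit month numbers.
my %MONTH = (
    Jan => '01', Feb => '02', Mar => '03', Apr => '04',
    May => '05', Jun => '06', Jul => '07', Aug => '08',
    Sep => '09', Oct => '10', Nov => '11', Dec => '12',
);

# Hypothetical helper: build "root/YYYY/MM" from a pubDate string.
# Returns nothing if the date does not look like "03 Aug 2007".
sub month_dir {
    my ($root, $pub_date) = @_;
    my ($mon, $year) = $pub_date =~ /\b\d{1,2}\s+([A-Z][a-z]{2})\s+(\d{4})\b/
        or return;
    my $mm = $MONTH{$mon} or return;
    return File::Spec->catdir($root, $year, $mm);
}

my $dir = month_dir('herselfswebtools.com', 'Fri, 03 Aug 2007 14:05:00 +0000');
print "$dir\n";
```

Passing the result to make_path from File::Path would create the root, year and month directories in one call if they do not exist yet.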
This first script is intended to be general and able to back up any website with an RSS feed. (There are two scripts for Blogger on the sidebar as well; details on them are coming Monday and Wednesday. Or you can just download them and read the notes in the scripts.)
There are two things you’ll need to change, both on this line (63):
$content = get("http://www.blogger.com/feeds/9999999999999/posts/default?max-results=500&alt=rss");
You need to change that series of 9s to your blog ID number, and if you have more than 500 posts you’ll want to raise max-results to a larger number. If you are backing up a non-Blogger website, you should be able to just use the RSS feed URL for that site instead.
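If you would rather not edit the URL by hand, the two values can be pulled out into variables. This is only a sketch: blogger_feed_url, $blog_id and $max_results are names I made up for the example, not names used in the script.

```perl
use strict;
use warnings;

# Hypothetical helper: build the Blogger RSS feed URL from a blog ID
# and the maximum number of posts to ask for (defaults to 500).
sub blogger_feed_url {
    my ($blog_id, $max_results) = @_;
    $max_results = 500 unless defined $max_results;
    die "blog id must be digits only\n"      unless $blog_id     =~ /^\d+$/;
    die "max-results must be digits only\n"  unless $max_results =~ /^\d+$/;
    return "http://www.blogger.com/feeds/$blog_id/posts/default"
         . "?max-results=$max_results&alt=rss";
}

# 9999999999999 is the placeholder ID from the post; use your own.
my $url = blogger_feed_url('9999999999999', 1000);
print "$url\n";
```

The resulting string is what you would hand to get() on line 63 in place of the hard-coded URL.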
Backup Blogger Posts Perl script
You might need to install a Perl module or two. Just follow the directions if you are not familiar with how to do so.
Posted by ljmacphee on July 23, 2007 under perl |
I’m going to be writing some more Perl scripts to make website maintenance easier. Today I hunted down some information on RSS feeds and using Perl to download them.
But since I had not done much with Perl on this computer, I got caught in the seemingly endless ‘Can’t locate BlahBlah/Blah.pm’ errors. Egads. For every module I found and installed, another one needed to be hunted down.
The easiest way to do this and retain your sanity is to use the CPAN module that comes with Perl.
As root, so that you have permission to install modules in the library path, type:
perl -MCPAN -e shell;
It will ask if you want to do the manual configuration. Hit enter, and for almost every question you’ll be able to just hit enter again and accept the option it chose. There will be a couple where you have to give it a response other than enter, so pay attention as you go through the questions.
Once that is done you’ll be dropped into a cpan shell:
cpan>
Now all you have to do is try to run your program in one terminal window and install missing modules in the cpan window. For example:
./rss2html.pl
Can’t locate LWP/Simple.pm in @INC blah blah blah
So in the cpan window type
cpan> install LWP::Simple
Just replace the slashes with :: and drop the .pm. It will locate, compile and install the module for you. Occasionally a module’s tests will fail and you will have to force the install:
cpan> force install LWP::Simple
Perl is case sensitive, so type module names exactly as they appear in the error message. This is the most painless way to collect all the modules you will need to run Perl scripts.
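The slash-to-:: rule is easy to script if you get tired of doing it in your head. A small sketch, where module_from_error is just a name I picked for the example:

```perl
use strict;
use warnings;

# Hypothetical helper: turn a "Can't locate Foo/Bar.pm in @INC ..."
# message into the module name "Foo::Bar" that the cpan shell expects.
sub module_from_error {
    my ($message) = @_;
    my ($path) = $message =~ m{Can't locate (\S+)\.pm\b} or return;
    (my $module = $path) =~ s{/}{::}g;   # slashes become ::
    return $module;
}

print module_from_error(q{Can't locate LWP/Simple.pm in @INC}), "\n";
```

Note that it expects a plain straight apostrophe in "Can't", which is what Perl prints in a terminal.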
* You will probably need to install a module or two to use the Perl Blogger backup scripts I posted in the top left sidebar last weekend.
More information:
Using RSS News Feeds
Posted by ljmacphee on July 2, 2007 under blogger, how to, perl, tools, useful sites, wordpress |
Google Analytics is yet another free statistics collector for your website. It tallies up the data once a day. You’ll get incoming link information, number of visitors, location of visitors, time spent on your site and the pages each visitor viewed.
Analytics will give you more data than most of the free statistics tools out there. But I’m still a fan of StatCounter too. They update data much more often.
I waited a long time to use Google Analytics because you have to add the JavaScript with your code to every page on your website. Now that TimesToCome is cleaned up and much of it moved to blog format, I decided it was time. Still, there were about 500 files that needed the code added to them.
So I wrote a small Perl script ( permanent link on top left of page ) to do this. It searches every file in the directory you place it in for ‘</body>’ and adds the Google Analytics code just before that tag. Read the notes in the script before using it. You’ll need to enter your personal analytics code number into the script.
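The core of the idea fits in a few lines. This is a sketch of the technique rather than the script itself: add_before_body_close is a made-up name, and the snippet here is only a placeholder comment, so paste in the real code that Google Analytics gives you.

```perl
use strict;
use warnings;

# Hypothetical helper: return $html with $snippet inserted just before
# the first </body> tag. Pages that already contain the snippet, or that
# have no </body> tag, come back unchanged.
sub add_before_body_close {
    my ($html, $snippet) = @_;
    return $html if index($html, $snippet) >= 0;   # already added
    $html =~ s{(</body\s*>)}{$snippet\n$1}i;
    return $html;
}

# Placeholder only; the real snippet comes from your Analytics account.
my $snippet = '<!-- analytics snippet goes here -->';
my $page    = "<html><body><p>Hello</p></body></html>";
print add_before_body_close($page, $snippet), "\n";
```

Checking for the snippet first means you can run it over the same directory twice without doubling up the code.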
Of course for your Blogger and WordPress blogs you need only change the template and enter the code just above the </body> tag. One entry is needed for Blogger. You may have to change index.php, single.php and page.php for WordPress, depending on your theme. Wherever you find </body> in your template files you need to add the script.
Posted by ljmacphee on March 13, 2007 under cgi, perl, useful sites |
I found CGI quite confusing to begin with. I had thought CGI was a language. CGI (Common Gateway Interface) is actually a way for information to be passed between web pages and programs running on the server. You may use any programming language; usually C or Perl is used. You need CGI permission from your web hosting company to use CGI scripts on your website. CGI scripts are major security risks, and not all hosting companies are set up to deal with those risks. Remember to chmod your scripts to 755 after you upload them.
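To make that concrete, here is about the smallest Perl CGI script there is. It is a sketch for illustration: cgi_response is a made-up name, and the path on the first line assumes Perl lives at /usr/bin/perl on your host. All a CGI program has to do is print a header line, a blank line, and then the page.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build a complete CGI response, which is a
# Content-Type header, a blank line, and then the page body.
sub cgi_response {
    my ($body) = @_;
    return "Content-Type: text/html\r\n\r\n" . $body;
}

print cgi_response("<html><body><p>Hello from Perl CGI</p></body></html>\n");
```

Upload it to the directory your host sets aside for CGI scripts, chmod it to 755, and the web server runs it each time the page is requested.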
Perl (Practical Extraction and Report Language) is a scripting language. It is good for small tasks you need done, and it is particularly suited to parsing text, sorting, and finding and replacing text in files. You can use Perl for interactive user web pages or for website maintenance. There are several scripts on this site for doing just that.
Obtain Perl here
Off site tutorials
A CGI Tutorial
Beginner’s Guide to CGI Scripting with Perl
Perl Communities
O’Reilly Perl Pages
Perl Monks