Made the Switch

The SPAM finally got to me. I don’t know if WordPress will be any better at handling SPAM or not, but I can dream :). The design is still very much off the shelf, so I’ll have to play with that in the months to come and, of course, I still have to learn how to use this software. But I’m hoping that it will be much better than MovableType was. In my brief playing while evaluating and setting it up, I think I like it better. MovableType 3.0 may have been much better, but I didn’t wanna have to pay for it.

So, if you’re curious, read on to find out how I set it up. It really was quite simple. I first downloaded and installed WordPress using their 5-minute install guide. After that, I followed the instructions for importing from MovableType. The most difficult part was making sure my permalinks from MovableType (like the archives) still worked. To do that, I created an .htaccess file to leverage mod_rewrite. It looks like this:

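RewriteEngine on
# Old MovableType entry pages: drop the zero padding and pass the entry ID to WordPress
RewriteRule ^archives/0*(\d+)\.html /~jake/blog/index.php?p=$1
# Old feed URLs map onto the WordPress feeds
RewriteRule index\.rdf /~jake/blog/index.php?feed=rdf
RewriteRule index\.xml /~jake/blog/index.php?feed=rss2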
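
To sanity-check the first rule without touching Apache, here is a rough Python sketch of the same idea. The function name `rewrite_old_permalink` is made up for illustration, the dot before `html` is escaped, and Python's `re` module only approximates how mod_rewrite matches, so treat it as a way to test the pattern rather than a description of what Apache does internally.

```python
import re

# Roughly the pattern from the first RewriteRule: skip the zero padding
# and capture the remaining digits of the entry ID.
OLD_PERMALINK = re.compile(r"^archives/0*(\d+)\.html")

def rewrite_old_permalink(path):
    """Return the WordPress URL for an old archive path, or None if it does not match."""
    match = OLD_PERMALINK.search(path)
    if match is None:
        return None
    return "/~jake/blog/index.php?p=" + match.group(1)

print(rewrite_old_permalink("archives/000123.html"))
print(rewrite_old_permalink("about.html"))
```

The zero-padded entry ID in the old file name comes out as a plain number in the `p` query parameter, and anything that isn't an old archive page is left alone.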

So, we’ll have to see how well this really works for me, but at this point I’d have to say that I’ve already committed to it.

Are Spammers Retarded?

It seems that the average spammer must have the IQ of a first grader. Never mind the fact that they are constantly trying to get around people’s spam filters in email, an action that only serves to upset those from whom they want money; I’m not even gonna go there right now. What’s got me upset is that they keep hitting me with trackback spam. I never used to get that. I used to get my share of comment spam, but ever since I installed the SCode plugin, I’ve been hit with a constant deluge of trackback spams. My guess is that their script noticed the error message when they tried to post a comment spam and switched over to trackback instead. What’s really kinda strange, however, is that when I just had comments off, that didn’t happen. It was only after I took the step of installing SCode and turning old comments back on that this happened. Now, the idiocy comes in because I also have the Bayesian filtering plugin installed and it marks all these trackback pings as spam as soon as they’re posted (in fact, it has a tendency to mark things as spam even when they’re not… probably because my spam corpus is so much larger than my non-spam). Not one single time has one of these trackback spams actually appeared visible on my site. Not once. You’d think they’d kinda want that to happen in order for it to even be worth their time in constantly spamming me. But what do I know, I’m just a lowly consumer.

Problems with Bayesian Comment Filtering

As much as it sounded like a good idea, my previous solution to comment spam had a few issues. Well, not so much issues as annoyances. The original author of that plugin posted his list of issues, so I knew about those going in, though I was unconcerned. To me, the biggest issue is simply the fact that comment spam still gets posted. Sure, it doesn’t appear to the user, nor does it appear to the search engine when it comes to index my site, but I still get an email for each comment and I still have to sort through it on the backend. So, I decided to exclude blind people from being able to post comments on my blog (sorry). I’m now using the SCode plugin instead of (well, truthfully, in addition to) the Bayesian filter.

I did run into a couple of problems when trying to install it. After following the instructions in the README file, it still wasn’t working. So, I ran the scodetest.cgi script. It kept telling me that my temporary directory wasn’t writable by the webserver, which I knew to be incorrect. So I did what any reasonable person trying to install a plugin would do: I started looking at the code. It really is a simple module, so it didn’t take long to figure out what was going on. The scodetest.cgi script as well as the mt-scode.cgi image generation script were calling MT::SCode::scode_get($code) to retrieve a security code. There was an if block in that subroutine that would call scode_generate() and return that subroutine’s return value as the return value for scode_get(). That seemed like the right thing to do, but it wasn’t. The problem is that scode_generate() doesn’t make any effort whatsoever to save the value that it generated so that it will be associated with the $code originally passed to scode_get(). This is what was causing the scodetest.cgi script to always say that my tmp directory wasn’t writable (that script simply calls the scode_get() routine twice and checks to see if it got the same result both times). So, I modified my scode_get() routine to call scode_create() (which calls scode_generate() and saves the value) instead of scode_generate(). The relevant portion of that subroutine now looks like this:

# Random number back...if have not initialized
if ($code <= 0 || $code > $scode_maxtmp || !-e $tmpdir.$code) {
    return scode_create($code);
}

Because I’m still trying to return this random value to whatever called scode_get(), I needed to modify the scode_create() subroutine to return the generated code. My new scode_create() routine looks like:

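sub scode_create {
    my $code = shift;

    # Nothing to do if a value has already been saved for this code
    return if (-e $tmpdir.$code);

    my $scode = scode_generate();
    if ($code > 0 && $code <= $scode_maxtmp) {
        # Save the value so later lookups of $code return the same thing
        if (open(my $out, '>', "${tmpdir}${code}")) {
            print $out $scode;
            close($out);
        }
    }
    return $scode;
}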
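
For anyone who doesn't read Perl, the shape of the fix can be sketched in a few lines of Python. This is only an illustration of the generate-versus-create distinction, not the plugin's code: the names `generate`, `create`, and `get` and the six-digit code format are assumptions made for the example.

```python
import os
import random
import tempfile

TMPDIR = tempfile.mkdtemp()  # stand-in for the plugin's temporary directory

def generate():
    """Make a fresh random security value (the role scode_generate plays)."""
    return "%06d" % random.randrange(10**6)

def create(code):
    """Generate a value AND save it under `code` (the role scode_create plays)."""
    value = generate()
    with open(os.path.join(TMPDIR, str(code)), "w") as handle:
        handle.write(value)
    return value

def get(code):
    """Return the saved value for `code`, creating and saving it on first use."""
    path = os.path.join(TMPDIR, str(code))
    if not os.path.exists(path):
        return create(code)  # the fix: save the value instead of only generating it
    with open(path) as handle:
        return handle.read()

first = get(7)
second = get(7)
print(first == second)
```

A version of `get` that called `generate()` directly would hand back a new value on nearly every call, which is the mismatch the test script was reporting as an unwritable directory.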

And now everything works. The only other issue I noticed was an incorrect usage of the alt attribute for an image in the installation instructions. The alt attribute is not supposed to be used as a tooltip, but as alternate text to display when the image is unable to be displayed. As such, it should say something like: “Image required in order to post.” The attribute that should be used for the text “Please enter the security code you see here” (what the plugin author put in the alt attribute) is the title attribute.

Bayesian Filtering for Comments

Bayesian filtering for MovableType? I didn’t even know such a thing existed! I found a plugin written by James Seng. At the time, I was actually looking for one of those little images that aren’t OCR readable and the user has to type whatever’s in the image (I’ve since discovered that’s called a CAPTCHA and is implemented using the SCode plugin). I knew that this method introduced usability problems for the visually impaired and people who use text-only web browsers, but I felt that it was a fair trade-off in order to stop SPAM. But, in the process of looking for this, I figured I may as well look at other solutions, too (especially considering that GD isn’t working on my machine right now). Wouldn’t you know it, I found one I liked even better. I’ve liked the theory behind Bayesian filtering ever since I read Paul Graham’s A Plan For Spam. Mozilla Thunderbird uses this method of filtering, and while at first it’s quite inaccurate, it gets better with time (actually, with a larger sample of both good and bad stuff). Anyway, I’ve now implemented this method for comments on this site. I should now be able to re-open all my entries to comments, undoing the change I made back in November 2003.
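
To give a feel for the theory, here is a minimal sketch of a word-counting Bayesian filter in Python. It is a plain naive Bayes classifier with add-one smoothing, not James Seng's plugin and not the exact formula from A Plan For Spam; the class name `TinyBayes` and the training sentences are invented for the example.

```python
import math
from collections import Counter

class TinyBayes:
    """Toy naive Bayes spam filter trained on whitespace-separated words."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record the words of one example under 'spam' or 'ham'."""
        self.words[label].update(text.lower().split())
        self.docs[label] += 1

    def spam_probability(self, text):
        """Estimate the probability that `text` is spam."""
        vocab = set(self.words["spam"]) | set(self.words["ham"])
        total_docs = self.docs["spam"] + self.docs["ham"]
        log_score = {}
        for label in ("spam", "ham"):
            total_words = sum(self.words[label].values())
            # Smoothed prior: how common this class is in the training data
            score = math.log((self.docs[label] + 1) / (total_docs + 2))
            for word in text.lower().split():
                count = self.words[label][word]
                # Add-one smoothing so unseen words never give a zero probability
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_score[label] = score
        # Turn the two log scores into a probability without overflowing
        top = max(log_score.values())
        spam = math.exp(log_score["spam"] - top)
        ham = math.exp(log_score["ham"] - top)
        return spam / (spam + ham)

bayes = TinyBayes()
bayes.train("cheap pills buy now", "spam")
bayes.train("buy cheap watches now", "spam")
bayes.train("great post about movable type", "ham")
bayes.train("thanks for the rewrite rule tip", "ham")

print(bayes.spam_probability("buy cheap pills") > 0.5)
print(bayes.spam_probability("thanks for the post") > 0.5)
```

The more examples of both kinds it sees, the better the word counts reflect reality, which is why this sort of filter starts out shaky and improves with a larger sample of good and bad messages.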

Facelift For My Weblog

Minor though it may be, I’ve now given my blog a bit of a face lift. The design itself is still pretty much the default from MovableType, but I’ve done a little more with colors. I also added a little bit of Mozilla-specific CSS. While the page is certainly still functional in Internet Explorer, it looks better in Mozilla-based browsers. CSS really is a beautiful thing once you figure out how it works (I remember it taking quite a while for me to even want to make the transition, let alone actually do it… I probably still haven’t, completely). In addition to the front end improvements, I’ve also made a couple on the semi-backend (i.e., not the MT code, but not just CSS). I finally figured out how to make it so links show up in syndicated versions of this blog (this was quite a problem in my last post). In case you’re curious, it was as simple as changing <$MTEntryExcerpt encode_xml="1"$> to <$MTEntryBody encode_xml="1"$> in the RSS templates. That also has the side effect of having the entire post be syndicated (at the option of the syndicating site) instead of just the first few characters. The other backend thing I did was make the new archive pages (’03 and ’04) much easier to build. Using template modules and include statements, I was able to greatly reduce the amount of “code” required to make a calendar. In fact, all I need is:

<!-- JANUARY_2003 -->
<table border="0" cellspacing="4" cellpadding="0">
<caption class="calendarhead">January 2003</caption>
<$MTInclude module="CalendarDOW"$>
<MTCalendar month="200301">
<$MTInclude module="CalendarDays"$>
</MTCalendar>
</table>
<!-- /JANUARY_2003 -->

That’s still a lot more than should be required had a better templating system been used. It would be nice to have all that layout type stuff in one template module so I could just say something like <MTInclude module="Calendar" month="200301"> and still be able to get my own layout, but this is still better than what MovableType uses by default.

UPDATE: I was just looking at the 2004 Archive page. With this entry, I’ve now posted more in the month of December than I have for the entire other 11 months of 2004. Kinda sad, isn’t it?

Rescratching My Own Itch

Last July, as in July ’03, I posted about some modifications I made to MovableType. One thing I had done was a plugin for SETI stats, the other was an actual code modification for a “recently archived” sidebar item instead of recently posted. Well, somewhere along the way I seem to have lost that code modification as I noticed my “recently archived” item wasn’t working right. And wouldn’t you know it, I seem to have made a really boneheaded mistake of not saving the code anywhere! To make a long story short, I figured out, once again, how to do it. It’s my goal to make sure I don’t ever have to spend another couple hours figuring this one out, so I’m gonna post the patch here. As an added bonus, that means other people can use it, too :).

--- Context.pm.orig     2004-12-14 23:11:57.000000000 -0500
+++ Context.pm  2004-12-15 01:15:09.000000000 -0500
@@ -626,6 +626,19 @@
             $args{direction} = 'descend';
             $args{limit} = $last;
             $args{offset} = $args->{offset} if $args->{offset};
+        } elsif (my $days_offset = $args->{days_offset}) {
+            my $sec = 3600 * 24;  # Seconds in a day
+            my $days = $args->{days};
+            my @ago = offset_time_list(time - $sec * $days_offset,
+                $ctx->stash('blog_id'));
+            my $ago_s = sprintf "%04d%02d%02d%02d%02d%02d",
+                $ago[5]+1900, $ago[4]+1, @ago[3,2,1,0];
+            @ago = offset_time_list(time - $sec * $days_offset - $sec * $days,
+                $ctx->stash('blog_id'));
+            my $ago_e = sprintf "%04d%02d%02d%02d%02d%02d",
+                $ago[5]+1900, $ago[4]+1, @ago[3,2,1,0];
+            $terms{created_on} = [ $ago_e, $ago_s ];
+            %args = ( range => { created_on => 1 } );
         } elsif (my $days = $args->{days}) {
             my @ago = offset_time_list(time - 3600 * 24 * $days,
                 $ctx->stash('blog_id'));

Mostly it’s a duplicate of the block for $args->{days}, but with some modifications to make it do a starting and ending date. The template block to get this in my sidebar is (I may tweak these numbers in the future, but it’s just a config option):

<div class="sidetitle">
Recently Archived
</div>

<div class="side">
<MTEntries days="60" days_offset="21">
<a href="<$MTEntryPermalink$>"><$MTEntryTitle$></a><br />
</MTEntries>
</div>

This will put any entry that is 21 to 81 days old under the “recently archived” heading. This is, of course, a modification to the version of MovableType that I’m running, which is version 2.64. If you’re running a different version, your results may vary.
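
The arithmetic behind that 21 to 81 day window can be sketched in Python. This is just the date math under simplifying assumptions: the function name `archive_window` is made up, and it ignores the per-blog time zone adjustment that offset_time_list applies in the patch.

```python
from datetime import datetime, timedelta

def archive_window(now, days, days_offset):
    """Return (oldest, newest) creation times for the "recently archived" range."""
    # The window ends `days_offset` days ago...
    newest = now - timedelta(days=days_offset)
    # ...and reaches back a further `days` days from there.
    oldest = newest - timedelta(days=days)
    return oldest, newest

now = datetime(2004, 12, 15, 1, 15, 9)
oldest, newest = archive_window(now, days=60, days_offset=21)

# Same 14-digit timestamp layout the patch builds with sprintf
print(oldest.strftime("%Y%m%d%H%M%S"), newest.strftime("%Y%m%d%H%M%S"))
```

So `days` is the width of the window and `days_offset` is how far back it starts, which is why 60 and 21 add up to an oldest entry of 81 days.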

Closing off Comments

Well, the spammers have finally done it. I’ve gotten so sick of dealing with spam that I’ve decided to no longer allow comments on my blog. I will slowly but surely remove the ability for older posts to have comments and will not be allowing comments on any new posts. I’m deeply disappointed in having to do this, but I’m sick of constantly having to delete the crap they put on my site. If you do have something to say, feel free to email it to me and if you give permission (and I think about it), I’ll post it here.

Scratching my own itch

Open source software is unique in that if there’s a feature you want added to an application you have everything you need (except maybe the skill) to add it yourself. That probably isn’t news to anybody reading this entry. Of course, “unique” may not be the right word… there are applications that aren’t really open source, per se, but still come with source code that can be modified. MovableType is one of those applications. You may have noticed that the past couple days I’ve had my SETI@home stats on the left side of this page. That was done using MovableType’s plug-in architecture and really doesn’t rely on the fact that MovableType comes with the source code. I’ll probably submit the very simple code for that to MT’s plug-in database in the near future. Slightly more difficult and less noticeable was the change from “Recent Entries” to “Recently Archived” over on the left (of the main index). It never made sense to me that there were links on the left side for entries that were right there on the front page. I wanted instead to have links to the entries that just recently moved off the front page. As I started digging into it I realized that it would require some code changes. Once I figured out exactly what to change, it wasn’t all that difficult and I now have links to items that are 15-23 days old (and therefore recently moved off the index, for which the threshold is 14 days).