Posts for June 2003

2003-06-01: Mixed stable/unstable

One thing that I forgot to mention about setting up a mixed stable/unstable Debian system: after you've made the change to apt.conf, added unstable to the sources.list file, and run apt-get update, the command to install a package specifically from unstable is apt-get install package/unstable.

Due to dependencies, one may have to specify multiple packages at once to pull in a whole set. If libc is different in unstable from testing, this can get rather annoying, of course.
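
When a whole set of packages has to come from unstable at once, the /unstable suffix can be generated rather than typed for each one. The following is a minimal sketch, not a tested recipe: the helper names (run, from_unstable, install_from_unstable) and the DRY_RUN switch are hypothetical, invented for illustration, and it assumes apt.conf and sources.list are already set up as described above. By default it only prints the command it would run.

```shell
#!/bin/sh
# Sketch only. DRY_RUN=1 (the default) prints commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Turn plain package names into "name/unstable" arguments.
from_unstable() {
    for pkg in "$@"; do
        printf '%s/unstable ' "$pkg"
    done
}

# Install every named package from unstable in a single apt-get call,
# so that apt can resolve the whole set of dependencies together.
install_from_unstable() {
    # Word splitting of the substitution is intentional here.
    run apt-get install $(from_unstable "$@")
}
```

For example, install_from_unstable mozilla-browser mozilla-mailnews (package names chosen only as placeholders) prints apt-get install mozilla-browser/unstable mozilla-mailnews/unstable; setting DRY_RUN=0 would run it for real.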

The only problem I've encountered with Mozilla so far is that for some reason WindowMaker can't figure out how to start it in the right virtual desktop. Not sure what's up with that. I should probably submit a bug; not sure which package the bug would be against.

2003-06-01: Backgammon

I'm clearly not nearly a good enough backgammon player, since playing against the computer, while initially fun, is just frustrating after a few games. I really don't know, though, why the computer in gnubg has a tendency to give up during the end game when it's almost certain that it will finish before I will.

In other news, I'm having a bad day for communicating with people (I just don't feel like it), and I'm coming down with a cold.

Oh well.

2003-06-06: Domestic advice

The domestic tip of the week, courtesy of my mother. If you're having great difficulty getting a comb really clean, just put it in the dishwasher. Works like a charm.

I don't think I ever would have thought of that, and I have no idea why not.

2003-06-07: bogofilter

Now that I have a new, much faster desktop machine, I finally took the plunge and set up Bayesian spam filtering. Hand-maintained rules were missing more and more, not enough to really bother me that much for my regular mail, but enough to bother me for group-advice and newgroups mail. I was losing a lot of stuff in the spam deluge.

I've now got Gnus 5.10.2 all set up using the Debian bogofilter package and it works like a charm. So far, it's been perfect. (I pre-seeded it with the past six months of group-advice and newgroups mail, among some other things, and apparently didn't make too many errors.) The only drawback is that registering new spam or non-spam takes a little while, even with a fairly fast machine, but not long enough that I think it will bother me.

The documentation has some real problems, though, so here are a few additional notes:

The documentation really wants you to use customize, preferably on each group. I don't like putting a lot of stuff into group properties, so I instead set the general variables. gnus-spam-newsgroup-contents takes a list of pairs, each mapping a regex that matches group names (including the nnml: part) to either gnus-group-spam-classification-spam or gnus-group-spam-classification-ham. Example:

;; Backslashes are doubled because the regexes live inside Lisp strings.
(custom-set-variables
 '(gnus-spam-newsgroup-contents
   '(("^nnml:mail\\.spam.*" gnus-group-spam-classification-spam)
     ("^nnml:mail\\.\\(eyrie\\|rra\\)" gnus-group-spam-classification-ham)
     ("^nnml:work\\.\\(personal\\|news\\)" gnus-group-spam-classification-ham)
     ("^nnml:project\\..*" gnus-group-spam-classification-ham))))

gnus-spam-process-newsgroups takes a list of pairs, each mapping a similar regex to a list of processing functions, which is different from what the documentation says. Example:

;; Backslashes are doubled because the regexes live inside Lisp strings.
;; Ham groups list both processors; the spam group needs only the spam one.
(custom-set-variables
 '(gnus-spam-process-newsgroups
   '(("^nnml:project\\..*"
      (gnus-group-spam-exit-processor-bogofilter
       gnus-group-ham-exit-processor-bogofilter))
     ("^nnml:mail\\.\\(rra\\|eyrie\\)"
      (gnus-group-spam-exit-processor-bogofilter
       gnus-group-ham-exit-processor-bogofilter))
     ("^nnml:work\\.\\(personal\\|news\\)"
      (gnus-group-spam-exit-processor-bogofilter
       gnus-group-ham-exit-processor-bogofilter))
     ("^nnml:mail\\.spam.*"
      (gnus-group-spam-exit-processor-bogofilter)))))

Another point that's very unclear from the documentation, and that I only found out through a lot of fiddling: you need to add both the spam and ham bogofilter processors to the list of processors for each of your ham groups. Otherwise, the spam that you mark never actually gets registered with bogofilter.

I played with the option to move marked spam into a different group but decided that I didn't like it. I would then have to go read that group and see that spam again to get it to actually expire. So I just leave it in the group and let it expire with the rest of the traffic.

Note that the spam processing stuff does not play well with groups marked auto-expire, since the expirable mark E is not one of the marks that causes a message to be registered as non-spam. I switched all of my auto-expire groups over to total-expire since that's how I use them anyway, and since the registration functions like read marks (r or R) much better than expirable marks. You can change the definition of ham-marks to include expirable messages, but the problem there is that each time you go back into the group to look at old messages, those old messages will all get re-registered as non-spam, which skews the counts.

Other than that, the documentation is okay (you want to use the spam.el package, and I recommend bogofilter over the built-in spam-stats stuff since the latter is going to be a lot slower). Do pay attention to the bit about not running messages through bogofilter in groups where you're not going to use bogofilter filtering; for example, all the groups that I split out before I apply spam checks also aren't registered as non-spam groups because it would skew the counts for the things that bogofilter actually gets to see.

One final note: bogoutil is the program that lets you fiddle with the databases. One useful thing to do is to clean out all the tokens that have only one occurrence, since they're unlikely to recur. I'm guessing I'll probably do that every three months or so. (Note that you can also use db4.0_dump -r and then db4.0_load to dump and restore the database, which is the only way to get it to shrink in size even if you've taken things out of it.)
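
The dump-and-reload step can be sketched as a small shell function. This is only an illustration under assumptions: the function name compact_db, the DUMP and LOAD variables, and the .dump and .new file names are hypothetical, and nothing should be writing to the database while it runs. It relies on the Berkeley DB 4.0 utilities named above and on db_load's -f option for reading the dump from a file.

```shell
#!/bin/sh
# Sketch only: rebuild a Berkeley DB file so that it shrinks on disk.
# DUMP and LOAD default to the Debian db4.0 tool names; override as needed.
DUMP="${DUMP:-db4.0_dump}"
LOAD="${LOAD:-db4.0_load}"

compact_db() {
    db="$1"
    # Dump to a file first, so a failed dump never replaces the original.
    "$DUMP" -r "$db" > "$db.dump" || return 1
    # Load the dump into a fresh database file next to the original.
    "$LOAD" -f "$db.dump" "$db.new" || return 1
    # Swap the rebuilt database into place and remove the dump.
    mv "$db.new" "$db" && rm -f "$db.dump"
}
```

It would be called as compact_db path/to/database.db, once for each database file bogofilter keeps; keeping a backup copy of the original first is a sensible precaution.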

I should probably stick this all up somewhere more permanent on my web pages....

2003-06-07: Bookstore haul

My parents and I went to the Stanford Bookstore, and of course I can't walk into a bookstore without coming away with a bunch of books. So here's the haul of the day.

Iain M. Banks -- Inversions
Iain M. Banks -- Look to Windward
David Brin -- Infinity's Shore
Orson Scott Card -- Ender's Shadow
Terry Goodkind -- Faith of the Fallen
Joe Haldeman -- The Forever War
Mercedes Lackey -- Take a Thief
Neal Stephenson -- The Big U
Connie Willis -- To Say Nothing of the Dog

I also picked up Unicode Demystified for some non-fiction reading, and the DVD of Frank Herbert's Dune (which I got a chance to watch earlier this year and rather liked).

Mmm, more books. Of course, I've not been reading the past few days since my parents have been in town and we've been busy, but I have quite a lot of it to get back to.

2003-06-07: Minor layout fiddling

The category of posts is now noted in the footer. Eventually I want to generate category indices and then link the categories to those, but I'll save that for another day when I feel like doing more substantial things. I've also added the recent comments to the sidebar (in a smaller font so they don't wrap all the time); thanks, piranha, for the idea!

I've also added some notifications; this is in part a test to see how they look and whether they work properly. :)

2003-06-08: More bogofilter

I'm extremely happy with this. It's performing significantly better than my old rules were for much less effort. So far, in the days of use I've put it to, there have been no false positives whatsoever, and only three false negatives. (Two of which were spam through the Kerberos bug submission interface, which is going to be a bit harder to train.)

The delay in registering spam with bogofilter has turned out not to bother me at all. It's actually kind of fun to think that the system is accumulating more information about spam and how to detect it.

My only (minor) worry is that I'm not feeding enough legitimate mail through it. I apply my spam filtering rules so late in the game that the mail that reaches the filtering is about 90% spam. But at least so far that doesn't seem to be skewing the results.

2003-06-09: tripwire in Debian

Debian continues to just really impress me. Most times when I install a package and start looking at how it works, it's clear that someone really thought about how everything should work together and has it configured so that the obvious just does the right thing.

I first tried to play with aide, since I'm not very fond of the Tripwire people, but aide is ugly and just doesn't do what I want. It could probably be made to do what I want, but it's just not worth the hassle. So on to Tripwire.

Tripwire has apparently now put together a mechanism where even if you store your databases on local disk, there's some degree of security. It looks like it generates a key pair and uses that to sign the configuration and policy file and also prevent modifications to the database. Of course, someone could always just replace the whole database and all its keys, but then the next time I went to update Tripwire, I'd notice that the password was different. (And if they did that, they could just replace the Tripwire binary itself, of course.)

It looks like the old -update mode is gone, which I do miss, and which means that the new version of Tripwire is probably not suitable for widespread use on our servers. However, the new -interactive mode (which is now --check --interactive) is really cool. It gives you the list of files that have changed in a "ballot" format in an editor and lets you review them there and delete the x in front of the ones that you don't want to update. Extremely convenient to use.

One thing that I definitely don't like is the report format, though. The old report was far more readable and convenient. I suppose it's GPL'd software and I could fix that if it continues bothering me, though. (This new one would be much harder to process automatically, plus one has to page down just to see if there are any inconsistencies.)

But overall, it just works, with some minor tweaking to the policy file.

2003-06-10: spin 1.30 released

Today was a great day. I ended up spending most of it working on adapting my web page generation software so that we can use it with the ITSS web page templates. I made a ton of progress, too; I've already started to convert the pubsw web pages.

Anyway, after fixing up a few minor things that I needed to add to spin, I took a break to add basic table support to it. You can get the latest version from my web page generation tools page.

Tomorrow, back to converting the pubsw pages (and figuring out what to do about the automatically generated ones).

2003-06-13: Latest haul

My Amazon order came in today (went with them instead of Powells, since I wanted to get some DVDs and some of the stuff I wanted wasn't available from Powells for some reason). The results:

Kim Stanley Robinson -- The Years of Rice and Salt
China Miéville -- The Scar
C.S. Lewis -- Till We Have Faces
C.S. Lewis -- Chronicles of Narnia
C.S. Lewis -- The Signature Collection

The last is the standard set of five (Miracles, Mere Christianity, The Great Divorce, The Screwtape Letters, and The Problem of Pain) plus A Grief Observed.

I also picked up DVDs, namely the Sports Night complete set and the first season of Buffy.

2003-06-19: Sports Night

Sabre got me hooked on this show, and I'm still grateful. After watching the first DVD from Sabre's copy, I ordered the complete collection, and I'm currently through the first four discs.

For those who haven't heard of it, Sports Night is the show that Aaron Sorkin did before West Wing. Marketed as a comedy, it isn't really. It's more a slice of life drama inhabited by lots of snarky people who are extremely good at banter (rather like West Wing, but with less important things going on in the background). It's extremely good. It has the same sort of cast magic that West Wing has, is frequently hilarious, and is always highly entertaining. And is good training in banter to boot.

It only ran two seasons, which means that the complete collection of every episode is rather inexpensive. Highly, highly recommended.

2003-06-19: postfaq 1.11

I've released a new version of postfaq that supports posting via an external program. I needed that in order to sign FAQs using PGPMoose for groups where all articles are signed.

I thought about adding PGPMoose support to postfaq directly, but it looked too annoying and complicated to do, given that postfaq doesn't already use News::Article.

Last spun 2024-01-01 from thread modified 2013-01-04