Posts for October 2006

2006-10-01: Done

Yesterday I posted the evaluation of the new Big Eight management system over the past five months and a personal goodbye to news.groups. It's still really surprising to me how hard that was. Intellectually, anything that I've put that much time and effort into would logically have that kind of hold, but for some reason I never internalized that. The emotions leading up to this have been very interesting, and at times hard to deal with.

Now, I mostly feel a sense of relief, of getting something over with, and a focus shift to getting the things done I need to get done for my annual long vacation (which is coming up fast). But today was still a bit of an emotional backlash day. I sorted through paper mail for the first time in a while, but otherwise didn't get much done (other than venting my spleen a bit at the latest step in LJ's inevitable slide into becoming just another soulless corporation, with special bonus screwing over your co-workers by turning them into public scapegoats).

There are considerably more cool things in the world than I have time to take in and appreciate. I've been feeling that about books for some time now (and video games for nearly as long, particularly since appreciating them has almost completely stalled for me), but I'm now feeling the same thing about The Teaching Company's courses. I save all of their catalogs and magazines until I have a chance to read through them with a notepad handy to record which courses I want to get, and every time I find four or five more that I really want to listen to. I have, since April, managed to listen to only one and a third.

This is yet another thing that I can't do anything about until after vacation, since I don't want to order courses and leave them sitting on my porch for weeks.

I'm down to the last actual work week before vacation. Next week doesn't count; I am determined to hold it open to do comprehensive catch-up and an attempt to dispose of every mail message in one way or another before I go. The agenda for this week is to add examine, enable, and disable support to the kadmind proxy we'll be running on our K4 servers, install and test that, finish converting all of the scripts for our internal mail mangling system from assuming qmail to Postfix, and test and upload Shibboleth SP packages for Debian if OpenSAML makes it through NEW. I thought I was going to get a head start on some of that yesterday or today. Given that I was also posting the Usenet documents, that was rather silly of me to expect.

2006-10-02: Today sucked

So, let's see. What happened today.

First, I started trying to hack support for continuing to do examine and enable/disable through the K4 kadmind that will proxy password changes to K5. We need to have all the password changes go to K5 (which will then propagate them back to K4), but the propagation code doesn't handle enable/disable (I believe). We also have interfaces that depend on examine working properly, and while I can just have that interface print out fake K4 information (or parse kas output), it would be nice to have that work. That was an incredibly annoying bit of coding, though, as XDR started fighting between AFS and K5 and I had to suck in additional bits of our old broken kadmind.

Then, once I thought I had that working, I tried to build it on Solaris (which is where we'll have to run it since our VLDB servers are currently Solaris), which required a bunch more annoying and fragile changes.

In the meantime, I discover that tomorrow I have to go to a meeting in which we get to discuss whether we're going to shoot ourselves in the foot by deciding we're no longer officially supporting a major part of our infrastructure because it's too hard, where things that I said were taken out of context and used to justify this position.

Then, the kadmind doesn't run on Solaris. Or rather it starts, but then each time one connects to it to try to do something, that fails and the forked child process crashes. I start tracking that down to it trying to open files that it really has no business wanting to open, and then discover that I can't easily restart it for testing. Despite the fact that it's trying to set SO_REUSEADDR, it's not working, and each time it crashes it leaves sockets in TIME_WAIT.
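For reference, here is a minimal sketch of the usual SO_REUSEADDR pattern, written in Python for brevity. The helper name `make_listener` is made up for illustration, and nothing here is a claim about what was actually wrong with the proxy; it only shows the one ordering detail that commonly bites, which is that the option has to be set before `bind()`.

```python
import socket


def make_listener(host="127.0.0.1", port=0):
    """Create a listening TCP socket with SO_REUSEADDR set before bind().

    make_listener is a hypothetical helper for this sketch, not part of
    any kadmind code. Port 0 asks the kernel to pick a free port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The option must be set before bind(); setting it afterwards does not
    # change the outcome of a bind that has already happened.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(5)
    return sock
```

With the option set this way, a restarted server can normally rebind its port even while old connections from the previous run are still sitting in TIME_WAIT.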

Frustrated by having to wait five minutes to try a new build, I decide to just reboot the system, since that will take less than five minutes to come back up. Except that, upon reboot, the system decides that it can no longer read its kernel off disk.

After a half-hour of fighting with that, I try to boot it off the network so that I can at least mount the drive and see if I can repair things, given that this is the test environment that I've spent the last month setting up and some of the setup pieces are annoying and took me a while to get right. But it can't find a RARP server. So I try to go add it to our Jumpstart server, which is also a RARP server, discover that I can't remember the right syntax, fight back and forth with that for a while (it takes a minute each time to add or remove the client because Jumpstart is just that slow), and then discover it still doesn't work.

Oh, and in the meantime, the new OpenAFS release candidate immediately segfaults on AMD64 2.6.18 kernels because the kernel folks have moved things around in yet another new and exciting way.

Giving up on the Solaris system for right now, I decide to just reconstruct my test cell on Linux, where it's much faster, at which point I discover that our local kaserver build doesn't support -noauth so I can't bootstrap a new K4 realm. Build the latest version, still no luck. Finally figure out that it's looking for the NoAuth file in a stupid place, get that bootstrapped, get the environment set up for running kadmind, build the kadmind proxy on Linux, and it segfaults. Further analysis with a debugger reveals that it's corrupting its own K5 context while calling one of the internal functions that you're not supposed to use but that this code has to. Wondering if I broke something, I go back to the original code before I hacked the K4 stuff into it. Same thing. Note that this was all working on Solaris and Solaris had other issues.

I've given up and mailed the developer for help. Oh, and Debian is now starting an Apache 2.2 transition, but I can't build new WebAuth packages for 2.2 because apache2.2-common is uninstallable. Bug filed, which has yet to show up in the Debian BTS for reasons that escape me.

This is really one of the worst days that I can remember having in quite a while. I will be going to bed tonight with more work left to do than I had when I started this morning. Oh, did I mention that I have no line manager and other services that my group runs are currently having serious problems that are tapping out the other people in my group? And tomorrow I get to go talk about whether we should blow a hole in the other foot. Yay.

2006-10-03: Much better

Today was a much better day. It was only a moderately better day for most of the day, although the meeting went far better than I feared (I do like working here; the people are generally reasonable). But this afternoon I managed to kick myself in gear sufficiently to finish converting and documenting the rest of the devnull scripts. Andrew found and rescued my development environment, and then with some additional help from the developer, I got the kadmin proxy working. And even better, I got my modified proxy that supports examine and enable/disable working.

Right now, I'm finally finishing a patch for Debian's openssh package that will let it take over ssh-krb5 so that we don't have to release etch with two copies of the OpenSSH source code. Hopefully the openssh maintainers will take it.

I've worked twelve hours a day for the past couple of days. Maybe I can take it easier for the rest of the week, particularly since just about everything is done that I needed to do before going on vacation.

2006-10-04: kstart 3.6

Nothing exciting in this release, just some pending bug fixes that I'd committed a while back, mostly related to libkafs support. Debian is getting close to a freeze, though, and I'm getting close to a vacation, so I'm using this as an excuse to flush out some of the pending releases and updates I have.

You can get the latest version from the kstart distribution page.

2006-10-05: pam-krb5 2.4

The main changes in this version are again Heimdal compilation fixes, but I also refactored the code to parse options and extract information from krb5.conf to make it much more readable. Error logging has also been improved, and pam-krb5 now complains about unknown options in the PAM configuration, which will hopefully make it easier to catch typos.
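As a rough illustration of the unknown-option check, here is a short Python sketch. It is not the pam-krb5 code: the function `parse_options`, the logger name, and the option names in `KNOWN_OPTIONS` are all placeholders chosen for the example.

```python
import logging

# Hypothetical option names, used only for this illustration.
KNOWN_OPTIONS = {"debug", "forwardable", "minimum_uid"}


def parse_options(args, log=logging.getLogger("pam-sketch")):
    """Parse PAM-style arguments, warning about any unrecognized option.

    Each argument is either a bare flag ("debug") or name=value
    ("minimum_uid=1000"). Unknown names are logged and skipped.
    """
    options = {}
    for arg in args:
        name, sep, value = arg.partition("=")
        if name not in KNOWN_OPTIONS:
            # A typo lands here instead of being silently ignored.
            log.warning("unknown option %s", arg)
            continue
        # Bare flags become True; name=value keeps the value as a string.
        options[name] = value if sep else True
    return options
```

A misspelled flag such as `forwardble` would then produce a log message and be left out of the result, rather than quietly having no effect.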

There are a few new features I want to add (my TODO file for the package is small enough it's tempting to knock everything off of it), but that will have to wait until after vacation. I don't want to do a lot of feature development right before the Debian freeze.

You can get the latest version from the pam-krb5 distribution page.

2006-10-06: lbcd 3.3.0

Sparked by a user problem report, I finally dug into the lbcd logic for returning load and protocol two packets (which are the most important packets for lbcd since lbnamed only speaks protocol two) and documented the behavior. It turns out that the default load service does a fairly complex calculation of logged-in users, /tmp space, and load, not just a simple load calculation, but the documentation didn't say that anywhere. It also blanks out all the user numbers for protocol two, which is probably correct in general (since it forces the local weight algorithm to be used), but one needs to have a way to turn that off.

I've now added a new -S option that disables the special protocol two handling and tells the truth, at the cost of not returning any of the service information to protocol two queries.

There was also an embarrassing bug in the Linux support that caused lbcd to always return 0 for all the logged-in user numbers. That's now fixed, and I added support to detect Linux console users.

Finally, lbcdclient was sending a single-byte packet before the real query for no good reason. I have no idea why it was written that way originally, but it's now not doing that.

You can get the latest version from the lbcd distribution page.

2006-10-12: On vacation

I'm off until November, at which point I'll have a huge backlog of book reviews to post that will probably fill much of the month. Y'all have fun while I'm gone!

I've disabled comments globally for the whole journal while I'm gone since I won't be around to screen spam. They'll be turned back on after I get back.

Last spun 2024-06-13 from thread modified 2013-01-04