Debian Server Best Practices

Table of Contents

Introduction
File System Layout
Packages vs. Templates
Packaging Issues
Template Layout
The Build Script
The Bundle
Log Rotation and Filtering

Introduction

This document describes the standards for running infrastructure servers on Debian GNU/Linux at Stanford University. The policies and standards here are in many respects specific to Stanford, although other sites may find some of our decisions of interest. It will be kept up-to-date here as we modify and evolve our procedures.

One of the goals in moving to Debian is to minimize the amount of custom work that we need to do for servers. Under Solaris, we built nearly all software that we used from source and made extensive changes to the operating system configuration, mostly not using the software that came with Solaris. With Debian, the goal is to use already-packaged software as much as possible, to use the native Debian configuration methods and file locations, and to do only the work that is necessary and directly related to the differences in Stanford's environment.

One implication of this is that software will be managed as Debian packages wherever possible. Not only does this let us easily move to the native Debian package when our local customized package isn't necessary, but it lets us use the Debian packaging system and package tools to manage upgrades and system maintenance rather than inventing our own procedures. As a general rule, any compiled software should be built as a Debian package, as should any scripts and configuration which are usable by more than one service. Those Debian packages should be managed like any other Debian package, following the Debian policy standard to the degree that it is applicable, with proper changelog entries. This will make it easier for us to contribute packages back to Debian where possible.

Exceptions to this policy may be made for software that is extremely difficult to package for some reason (an Oracle server, for instance), or software that uses its own automatic update features and for which maintaining a package would just mean pointless extra work (such as Sophos PureMessage).

Production servers are expected to be running the current stable version of Debian, including whatever updates have been released. Some test and development boxes may run Debian unstable or testing. Special attention must be given to keeping those systems up to date with relevant security fixes.

File System Layout

There are innumerable different ways of organizing software and services in the file system that all make sense. Some are better in some situations and others are better in other situations. However, standardization on a workable single layout across all services has more advantages than picking a layout particularly well-suited for a particular application. Furthermore, Debian itself already enforces a standard for all native packages.

Accordingly, we will stick as closely as possible to the Filesystem Hierarchy Standard for all software managed as Debian packages. This will make our local packages and configuration consistent with what Debian does, make it easier to contribute packages back to Debian, allow us to use normal checking tools like lintian that assume FHS compliance, and make it easier to train new staff.

Specifically, this means that user binaries will go in /usr/bin, daemons and other sysadmin-only binaries will go in /usr/sbin, architecture-independent data goes in /usr/share, libraries and architecture-dependent support binaries, modules, or data go in /usr/lib, data modified by the service during normal operation will go into the appropriate directory in /var, PID files will go into /var/run, and logs will go into /var/log. See the standard document referenced above for all of the exact details.

If we have scripts that aren't worth packaging as a regular Debian package, they can be installed in /usr/local/bin with manual pages in /usr/local/man. However, this should be minimized; where reasonable, scripts should be generalized and packaged as regular Debian packages to allow for easier upgrades, testing, and revision control. Scripts that do live in /usr/local/bin may look for configuration files in either /etc or in /usr/local/etc as seems the most appropriate for that script.

Executables should not have extensions indicating what language they're written in. Users don't care that a given executable is implemented as a Perl script, and invocations of executables end up in other scripts and aliases that would then break if the implementation language changes. Avoid any extensions like .pl or .sh.
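
A quick audit for this rule could look something like the following. This is only a sketch, assuming a find that supports -maxdepth (as GNU find on Debian does); the function name is hypothetical and not part of any existing tooling.

```shell
# Hypothetical helper: list executable files directly under a directory
# whose names end in one of the language extensions mentioned above.
find_language_extensions() {
    dir=$1
    # -perm -u=x matches files whose owner-execute bit is set.
    find "$dir" -maxdepth 1 -type f \( -name '*.pl' -o -name '*.sh' \) \
        -perm -u=x | sort
}

# Example: find_language_extensions /usr/local/bin
```

Anything the function prints is a candidate for renaming before other scripts and aliases start depending on the extension.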

Normally, we won't have extensive service-specific data that will be served off local disk and won't fit naturally into /var/lib or the like. However, some applications will have this (such as a Subversion server or a streaming media server). The FHS has a provision for these sorts of applications, namely the /srv directory, which is a location for site data that should be independent of the package management system. This is a fairly new provision, but we should try to use it and only move away from it if it really doesn't work.

The following exceptions to the FHS are permitted:

/service

We run svscan on all of our systems to provide a simple and easy way of running monitored services. Rather than try to modify daemontools to make it FHS-compliant, we use it stock, which means that the control files for svscan will be in /service.

/command

To avoid too many unnecessary divergences, we let djb's daemontools install itself in /command (and likewise with other djb software that has been modified to use that packaging system).

/apps

For Java applications that we have not yet packaged or that are too difficult to package for some reason, we may still install them into /apps with their own startup scripts and log directory. This is not a good long-term solution. Long-term, we need to figure out how to properly package Java applications so that we can use all of our normal packaging infrastructure to configure and deploy them. However, in the short term, this will likely be more trouble than it's worth.

Any additional exceptions should be kept to a minimum and only used for cases where it's very difficult to modify the software to comply with FHS and there is little gain in doing so. For internally-developed software, we should strive for making it FHS-compliant as much as possible.

Note that while we are maintaining symlinks in /etc/leland for backward compatibility reasons, we're moving away from /etc/leland towards putting configuration files in /etc like all normal Debian software does. This particularly includes the Kerberos configuration files, which work fine with the standard Debian Kerberos packages. All of krb5.conf, krb5.keytab, krb.conf, krb.realms, and srvtab must be present in /etc as well as (only if needed for backward compatibility) /etc/leland so that the native Kerberos packages will work properly.
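
One way to verify this on a system is sketched below. The function name is hypothetical; it simply reports which of the five files named above are absent from a given directory.

```shell
# Hypothetical check: print the name of each required Kerberos file that
# is missing from a configuration directory (defaults to /etc).
missing_krb_files() {
    etc=${1:-/etc}
    for file in krb5.conf krb5.keytab krb.conf krb.realms srvtab; do
        [ -e "$etc/$file" ] || echo "$file"
    done
}

# Example: missing_krb_files /etc
```

No output means all five files are present.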

For right now, we're also using /etc/leland as a place to put additional srvtabs and keytabs used by the system. This isn't an unreasonable place to put such files, although this may change later to something like /etc/keytabs, depending on whether the migration trouble seems worth it.

Packages vs. Templates

Each of our services will have a template, in /afs/ir/service, which contains the scripts, bundles, and configuration files to build a new instance of that service. For each service we run under Debian, we have to decide what files and configuration should go into the template and what should go into Debian packages.

As stated above, the basic rule of thumb is that all compiled software and non-trivial scripts should go into Debian packages so that we can use the package management system to upgrade and maintain them. Similarly, any software that's used across multiple services should be made into a Debian package so that it can easily be reused. For other components of a service, the question can be somewhat more ambiguous.

If a service needs to have a Debian package (generally named something like stanford-service to make it clear that it's Stanford-specific), it's easiest to put everything service-related that can be put into that package there. That way, as much of the configuration as possible is all in the same place. The exception is that we don't want to produce multiple versions of the Debian package for different tiers of servers (production, preprod, test, development), so any configuration that needs to change according to the tier of the service needs to live in the template. (This is far easier than prompting during the installation of the package.) The service should be structured so that the quantity of those configuration files is kept to an absolute minimum, preferably only a single file, so that as much of the configuration as possible can be kept generic.

The Debian package for a service should also only contain things specific to that service, not more general system configuration. Most of the generic configuration will be maintained in the stanford-server Debian package and in the bundles installed off of the Debian build system, but if configuration for other daemons (such as lbcd) is required, this should go into the template rather than in the Debian package for a service. Also, all SSL certificates, srvtabs, and keytabs should be downloaded by the build script in the template area rather than by a Debian package.

Having one Debian package modify files that are installed by other packages is difficult and prone to trouble, and should be avoided unless necessary. As a result, it is preferable to maintain overrides to configuration files for other packages in the template rather than in the Debian package. If a Debian package does modify the configuration of another Debian package, it must use dpkg-divert appropriately to avoid packaging conflicts.

If the service doesn't warrant its own package (if, for example, we can use stock Debian packages), it's fine to put more borderline configuration or small scripts into the template area to avoid the overhead of maintaining a separate Debian package. Beware of letting the template become too large, though; as soon as it has acquired comprehensive configuration or extensive scripts that are infrequently updated and are better maintained by a Debian package, a Debian package for that service should be created. Note that having a Debian package is also convenient if the packages that make up a service are complex, since the Debian package for the service can depend on all of the other required packages, making it easier to install.

Packaging Issues

Debian has excellent documentation of how to create Debian packages, and anyone learning how to create Debian packages should read that documentation first. The three primary guides are the New Maintainer's Guide, the Debian Policy Manual, and the Debian Developer's Reference. The New Maintainer's Guide is where you start; Policy is more of a detailed reference manual. The Developer's Reference contains useful information about packaging, but also explains how Debian itself is managed, how to use Debian's bug tracking system, and similar topics.

All of our packages, even the ones that are purely local to Stanford, should comply as closely as possible to Debian policy, since that's what makes Debian a consistent operating system and keeps the overall quality high.

All of our packages should be lintian-clean, which means that running lintian on the package should not produce any output. Some lintian errors are unavoidable, particularly for packages that cannot comply with Debian policy for whatever reason. Those packages should include lintian overrides to silence the error messages.
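
As a rough sketch of what "lintian-clean" means in practice, the following function treats any output from lintian as a failure. The function name and the LINTIAN override variable are hypothetical, introduced only so the check can be tried without lintian installed.

```shell
# Hypothetical check: succeed only if lintian prints nothing for the
# given file (for example, a .changes or .deb file).
is_lintian_clean() {
    # Capture both stdout and stderr; ignore lintian's own exit status,
    # since the rule here is simply "no output".
    output=$(${LINTIAN:-lintian} "$1" 2>&1) || true
    [ -z "$output" ]
}

# Example: is_lintian_clean ../stanford-weblogin_1.0-1_i386.changes
```

The file name in the example is made up. A package with unavoidable errors would pass this check only once the corresponding lintian overrides are in place.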

Reusable software

If there is any software that we've developed locally that would be useful for people at other sites, we should always try to clean it up, make it sufficiently generic that someone else could use it, and package it independently of anything Stanford-specific. Even if we can't get it completely generic, it's worthwhile to get as far as we can without spending too much time on it, since someone else may always have the chance to finish it later.

By making it generic, we mean that it should expect configuration files in /etc or a directory in /etc like normal Debian software, comply with Policy and the FHS (although all of our packages should do that), have an appropriate license and a correct copyright file, have manual pages, and have all of the other components of a regular Debian package.

For software we maintain locally and make releases of with regular version numbers (things like newsyslog, kstart, or remctl), we maintain the Debian packaging files in a debian directory in the regular CVS tree for that package. Any changes outside of the debian directory warrant a new, regular release of the package and a new Debian build with Debian version -1. Changes only to the files in debian can be made without a regular release and with a regular increment of the Debian version. The Debian packaging files are not included in the regular release of the package, following Debian packaging best practices, but it's convenient to have them in the same CVS repository.
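
The versioning rule above can be illustrated with two small helpers. These are hypothetical and assume the Debian revision is a plain number after the last hyphen, which is not true of every Debian version string.

```shell
# A new regular release starts over at Debian revision 1.
first_debian_version() {
    echo "$1-1"
}

# A change only to files in debian increments the Debian revision and
# leaves the upstream version alone.
next_debian_revision() {
    upstream=${1%-*}      # everything before the last hyphen
    revision=${1##*-}     # everything after the last hyphen
    echo "$upstream-$((revision + 1))"
}

# first_debian_version 3.1     prints 3.1-1
# next_debian_revision 3.1-1   prints 3.1-2
```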

If we use software that we don't maintain but that hasn't been packaged for Debian, we should package it following the normal rules and best practices for packaging software for Debian. These packages should be maintained using svn-buildpackage and the Debian Subversion repository on subversion.stanford.edu.

Stanford-specific packages

Packages specific to Stanford should generally have names beginning with "stanford-" and should only contain the Stanford-specific bits of configuration and scripts that are not generally usable at other sites. When possible, more generic software should live in separate packages with more generic names and the Stanford-specific package should depend on the more generic packages.

One very useful use of Stanford-specific packages is to have them depend on all of the regular packages that a particular service needs. This is useful enough that it may even be worthwhile creating an empty Stanford-specific package for some services, just to manage the dependencies.

While it is not as important for Stanford-specific packages to comply with Debian policy in every respect, every effort should still be made to keep them compliant. This is both for consistency and so that standard packaging tools like lintian can be used to check for real errors without producing a lot of noise.

Overriding configuration with dpkg-divert

Much Debian software supports configuration by dropping additional files into a particular directory (see, for instance, /etc/apache2/conf.d), precisely so that multiple packages can add their own relevant configuration without having to modify the main configuration files. This mechanism should be used whenever possible, since it's a significant additional hassle to manage cases where multiple packages provide the same configuration file.

Sometimes, however, it's necessary to override a configuration file provided by another package with one that contains Stanford-specific configuration. In this case, the dpkg-divert command should be run in the preinst and postrm maintainer scripts of the Stanford-specific package to add and remove a diversion for that configuration file.

The basic idea of a diversion is to allow one package to provide the same file as is provided by another package by forcing the other package's file to be installed under a different name. For more information, see the dpkg-divert manual page.

Here is an example preinst script from a Stanford-specific Debian package named stanford-weblogin, which diverts the files /etc/webkdc/token.acl and /etc/webkdc/webkdc.conf:

    #!/bin/sh
    # preinst for stanford-weblogin.  Divert some of the standard
    # configuration files from libapache2-webkdc and libwebkdc-perl so
    # that we can install the Stanford ones.

    set -e

    if [ "$1" = install -o "$1" = upgrade ] ; then
        dpkg-divert --add --package stanford-weblogin --rename \
            --divert /etc/webkdc/token.acl.generic /etc/webkdc/token.acl
        dpkg-divert --add --package stanford-weblogin --rename \
            --divert /etc/webkdc/webkdc.conf.generic /etc/webkdc/webkdc.conf
    fi

    #DEBHELPER#

    exit 0

Here is the corresponding postrm script:

    #!/bin/sh
    # postrm for stanford-weblogin.  Remove the diversions added for the
    # generic configuration files.

    set -e

    if [ "$1" = remove ] ; then
        dpkg-divert --remove --package stanford-weblogin --rename \
            --divert /etc/webkdc/token.acl.generic /etc/webkdc/token.acl
        dpkg-divert --remove --package stanford-weblogin --rename \
            --divert /etc/webkdc/webkdc.conf.generic /etc/webkdc/webkdc.conf
    fi

    #DEBHELPER#

    exit 0

When creating packages that need diversions, use the above examples as models, changing the package name of the Stanford-specific package and, of course, the file names to divert as appropriate. You can also do the equivalent of the above in the build script if a Stanford-specific package isn't warranted.

Template Layout

All services should have a template in /afs/ir/service. See the instructions in the README file in that directory for more information on the layout and how it should be used.

Inside the template for a particular service should be at least the following elements: a script to install the files needed for a particular service, named something like build-service; a bundle to install whatever files are maintained in the template area, traditionally named setup.b; and a (possibly very brief) README file that explains the purpose of the template.

The preferred layout of the template is to organize files within the template as a mirror of where they will be installed on disk. In other words, a replacement for /etc/syslog.conf on the system would be in a directory named etc in the template, and scripts to be installed in /usr/local/bin should be in a directory named usr/local/bin in the template. This has several advantages: it makes it easier to find the file in the template corresponding to a system file, it requires less thought in figuring out how to organize the template and therefore makes template organization more consistent, and it potentially allows templates to be easily turned into Debian packages using rsync or the like.
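
To make the mirroring concrete, the sketch below lists where each file in a mirrored template would land on disk. The function name is hypothetical, and the sketch assumes a find that supports -mindepth (as GNU find does) so that top-level files such as the build script, bundle, and README are skipped.

```shell
# Hypothetical helper: for every file in a subdirectory of the template,
# print "template-path -> installed-path".
list_template_targets() {
    (
        # Subshell, so the caller's working directory is untouched.
        cd "$1" && find . -mindepth 2 -type f | sort |
            while IFS= read -r path; do
                echo "${path#./} -> /${path#./}"
            done
    )
}

# Example, run from the top of a template: list_template_targets .
```

Because the template path and the installed path differ only by the leading slash, no per-file mapping has to be maintained by hand.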

Some older templates instead have directories like system (for system configuration like log rotation), scripts, acl, and so forth and use the bundle to put things in the right place. This is still acceptable, and may be preferable if there are a lot of scripts or ACL files, but is not as commonly used and probably shouldn't be used for new templates without a good reason.

The Build Script

The build script should be a Bourne or bash shell script that does everything required to turn a generic build into a server for a particular service. This includes installing all required packages, running the appropriate bundles in the template area, installing the required keytabs, srvtabs, and SSL certificates, and installing iptables rules as required by that service.

The build script must never change directories into the template directory, since there may be multiple versions of the template and the one being installed may not be the current version. Instead, it should assume that it's being run from the top level of the service template, and any directory changes that it has to do internally should be done in a subshell (in other words, wrap the set of commands requiring a directory change in parentheses).
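
The subshell idiom looks like this. The scratch directory is only a stand-in for whatever directory the script needs to work in; the variable names are illustrative.

```shell
top=$(pwd)               # where the script was started
scratch=$(mktemp -d)     # stand-in for a directory the script must enter

# The parentheses run these commands in a subshell, so the cd does not
# affect anything after the closing parenthesis.
(
    cd "$scratch"
    touch unpacked
)

# Still in the starting directory here, so relative paths into the
# template keep working.
echo "working directory: $(pwd)"
```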

The build script should start with set -e so that it will exit automatically if any command it runs fails. This will make it easier to pinpoint failures and to recover from them.

The build script should begin with:

    . /usr/pubsw/lib/build-scripts/functions

so that it can use the various generic functions we've developed for use in these build scripts.

Avoid defining new functions in the build script and then calling them; instead, just do everything in linear order. This is easier to read. If there are particular things that are just crying out to be functions, put them into the global function library instead.

If the service has multiple supported tiers (production, preprod, etc.), the build script should either prompt for the tier (making sure that the entered value is valid) or, more commonly, should prompt for the hostname that the system is being built as and then use that to determine the appropriate tier. The build script can then generate appropriate configuration files or run appropriate bundles based on the tier, as well as use the tier information to choose the right keytabs, srvtabs, and SSL certificates to install.
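
Deriving the tier from the hostname might be sketched as follows. The hostname patterns are invented for illustration and are not an actual naming convention, and a helper like this would belong in the shared function library rather than in an individual build script.

```shell
# Hypothetical mapping from hostname to tier. Unrecognized names are
# rejected rather than silently treated as production.
tier_for_host() {
    case $1 in
        *-dev[0-9]*)     echo dev ;;
        *-test[0-9]*)    echo test ;;
        *-preprod[0-9]*) echo preprod ;;
        *[0-9])          echo prod ;;
        *)
            echo "cannot determine tier for '$1'" >&2
            return 1
            ;;
    esac
}

# In a build script:
#   tier=$(tier_for_host "$hostname")
```

Since the build script runs under set -e, a hostname that matches no pattern stops the build at the assignment instead of continuing with an empty tier.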

Keytabs and srvtabs should be installed with the get_keytab function. This function takes three arguments: the name of the principal (must be in K4 format, not K5 format), the path to where to put the keytab, and the path to where to put the srvtab. The last argument may be omitted if no srvtab is wanted (for webauth principals, for instance). For example:

    get_keytab webauth.www /etc/webauth/keytab

The needed Debian packages should be installed with apt-get. Normally, apt-get update should be run before package installation to make sure to get the latest versions and to make sure that one doesn't get "file not found" errors from apt-get install due to old package indices.
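
A minimal sketch of that sequence follows. The function name and the APT_GET override are hypothetical (the override exists only so the sequence can be dry-run), and the package name in the example is just the one used elsewhere in this document.

```shell
# Refresh the package indices, then install the named packages.
install_packages() {
    ${APT_GET:-apt-get} update
    # -y answers yes to apt-get's prompts so the build is not interrupted.
    ${APT_GET:-apt-get} install -y "$@"
}

# Example: install_packages stanford-weblogin
```

Setting APT_GET=echo before the call prints the two command lines instead of running them.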

SSL certificates should be installed with the get_cert function. This function takes three arguments: the base path of the certificate to install, the name under which to install the certificate on local disk, and which Comodo root certificate to use (optional, assumes 2012 if not given). The base path should be in a certs directory under the relevant service directory, one up from the template directory, and the name of the cert should generally be the machine name or class. For example, the base path for the certificate for the production weblogin pool would be:

    /afs/ir/service/auth/certs/weblogin

.cert.pgp and .key.pgp will be appended to find the encrypted files, and the certificate will be installed in /etc/apache2/ssl. The second argument should generally be the name of the system or pool.

The build script should also run:

    /afs/ir/service/jumpstart/scripts/build-iptables -i
    iptables-restore < /etc/iptables/eth0

to set up the iptables rules for the system. If there are any additional iptables modules needed for the system, add them to the build-iptables command line.

At the end of the build script, it's best to print out a summary of what (if anything) has to be done next to produce a running system, and also a summary of all variables that the script prompted for or figured out for itself while running. This serves as a useful final sanity check on the build process.
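
The settings portion of such a summary could be produced by something like the helper below. It is a hypothetical sketch, and would live in the shared function library rather than in each build script.

```shell
# Hypothetical helper: print each named shell variable with its value.
print_settings() {
    for name in "$@"; do
        # Indirect lookup: copy the value of the variable called $name.
        eval "value=\$$name"
        printf '  %-10s %s\n' "$name" "$value"
    done
}

# At the end of a build script:
#   echo "Settings used for this build:"
#   print_settings hostname tier
```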

Note that the build script should ideally be something one can re-run repeatedly.
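
One common obstacle to re-running is a step that appends to a file. A sketch of an append that is safe to repeat follows; the function name is hypothetical.

```shell
# Append a line to a file only if that exact line is not already there,
# so a second run of the build script does not duplicate it.
ensure_line() {
    line=$1
    file=$2
    touch "$file"
    # -x matches whole lines only; -F treats the line as a fixed string.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example: ensure_line "some setting" /path/to/file
```

The same idea applies to other steps: prefer commands such as mkdir -p that succeed whether or not the work has already been done.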

The Bundle

Any configuration not installed by the general build system or by packages installed by the build script should be managed by bundle. This allows easy application of configuration changes, verification against the template, and diffs against the installed configuration.

The main bundle for a template should generally be named setup.b. An exception is if the same template installs multiple systems, in which case one can name the bundles after the systems they install, with or without a setup- prefix as seems appropriate. Try to keep everything together in one bundle as much as possible; if there are shared files between all of the systems using that template (and if not, why are they together in one template?), having a single setup.b that does the shared work and then separate supplemental bundles for the different services is the best approach. Try to avoid breaking particular subsystems off into separate bundles unless required for some reason; it's much easier to not have to figure out which bundle needs applying after making a change.

All references to template files in the bundle must be relative (unless the bundle is, for some reason, pulling files from outside of that template -- that should almost never be the case, since such shared files should be installed as a Debian package). The bundle may assume that it will be run with a working directory of the top of the template area. Absolute paths should not be used since there may be multiple copies of the same template corresponding to different version numbers or work in progress on a new template.

It's harder to update systems if the main setup.b bundle requires that variables be set on the command-line, so instead setup.b should just do everything that can be done without requiring knowledge of any variable settings and which is the same across all possible installation tiers. Separate bundles should then be used for each installation tier (or for the master and the slaves if the service is broken down like that), and for the portion of the installation that requires variables be set. That way, the build script can invoke all of the bundles as appropriate, with the right variable settings, and for the most normal case of an update that doesn't change anything that's tier-specific or dependent on a variable, a systems administrator can just run the setup.b bundle. Tier-specific bundles should be named setup-tier.b, such as setup-prod.b, setup-dev.b, etc. Try to avoid variables whenever possible, even in supplemental bundles, since they make it very hard to apply selected changes without re-running the build script.

If it can possibly be helped, do not install files via bundle that contain variables and then filter the files after installation to replace those variables with something else. If you must do this, set younger=1 when installing those files in bundle so that they don't always show up as changed, and be aware of the side effects of that choice. A far better tactic is to avoid the need for this sort of customization. For example, it should never be necessary to substitute the system name or IP address into an Apache configuration. The need for the IP address can be avoided by using name-based virtual hosts with the wildcard cert for SSL, and the system name can be handled with tier-specific bundles or, if necessary, be written into a one-line Apache configuration fragment in the build script rather than installed via bundle.
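
Writing such a one-line fragment from the build script might look like the sketch below. The function name, the fragment file name, and the choice of the ServerName directive are assumptions made for illustration.

```shell
# Hypothetical helper: write a one-line Apache fragment naming the server.
# Overwriting the file (rather than appending) keeps the step repeatable.
write_servername_fragment() {
    printf 'ServerName %s\n' "$1" > "$2"
}

# In a build script:
#   write_servername_fragment "$hostname" /etc/apache2/conf.d/servername
```

The rest of the Apache configuration can then be installed unmodified via bundle, with no post-installation filtering.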

Log Rotation and Filtering

We use newsyslog for log rotation and use filter-syslog for syslog analysis instead of the standard Debian packages. This is for a variety of reasons, mostly related to staff familiarity with newsyslog and its improved support for saving logs into AFS. At some point, we may modify logrotate to do what we need, but it's not a high priority project.

Normally, all system logs will go to /var/log/syslog rather than using Debian's default log layout, so that everything can be found in the same place. This log should be filtered with filter-syslog against a list of known expected log entries, so that we're notified of anything unusual in that log. This is set up automatically by the Debian build system; the only thing that each template has to do is install additional filtering rules in /etc/filter-syslog if necessary. At least two weeks of old logs should be kept on local disk, but this log should not be saved into AFS. The analysis should be sent to the local root account, which should be forwarded to the appropriate alerts address for the service so that the people maintaining the service will see the mail (this is set up by the Debian build system).

If the service logs through syslog and is verbose or provides logs that we want to keep for metric information, those entries should be directed to a different file with a modified /etc/syslog.conf. That file should be rotated into AFS and in some cases may also need to be analyzed.

newsyslog is configured to run all of the configuration fragments in /etc/newsyslog.daily, /etc/newsyslog.weekly, and /etc/newsyslog.monthly on those intervals, and other configuration can be dropped into the appropriate directory to rotate application and web server logs.

Last spun 2007-09-19 from thread modified 2005-12-19