Debian Server Best Practices

Introduction
Software Packaging
File System Layout
Packages vs. Puppet
Packaging Issues
Management Interfaces
Log Rotation and Filtering

Introduction

This document describes the standards for running infrastructure servers on Debian GNU/Linux at Stanford University as they stood in 2014, when I left. They're provided here because I think they're fairly good standards and may be of use to other people. I'm no longer developing or updating them, so they may slowly drift out of date. A few policies and standards here are specific to Stanford, although other sites may find some of our decisions of interest.

Software Packaging

One of the goals in moving to Debian is to minimize the amount of custom work that we need to do for servers. Under Solaris, we built nearly all software that we used from source and made extensive changes to the operating system configuration, mostly not using the software that came with Solaris. With Debian, the goal is to use already-packaged software as much as possible, to use the native Debian configuration methods and file locations, and to do only the work that is necessary and directly related to the differences in Stanford's environment.

One implication of this is that software will be managed as Debian packages wherever possible. Not only does this let us easily move to the native Debian package when our local customized package isn't necessary, but it lets us use the Debian packaging system and package tools to manage upgrades and system maintenance rather than inventing our own procedures. As a general rule, any compiled software should be built as a Debian package, as should any scripts installed on the user's path or more complex than a simple /etc/cron.daily or init script. Those Debian packages should be managed like any other Debian package, following the Debian policy standard to the degree that it is applicable, with proper changelog entries. This will make it easier for us to contribute packages back to Debian where possible and to use Debian tools to do package analysis.

Exceptions to this policy may be made for software that is extremely difficult to package for some reason (an Oracle server, for instance). Even software that uses its own automatic update features (such as Sophos PureMessage) should be wrapped in a packaged installer.

Production servers are expected to be running the current stable version of Debian, including whatever updates have been released. Some test and development boxes may run Debian unstable or testing. Special attention must be given to keeping those systems up to date with relevant security fixes.

File System Layout

There are innumerable ways of organizing software and services in the file system that all make sense. Some are better in some situations and others are better in other situations. However, standardizing on a single workable layout across all services has more advantages than picking a layout particularly well-suited to one application. Furthermore, Debian itself already enforces a standard for all native packages.

Accordingly, we will stick as closely as possible to the Filesystem Hierarchy Standard for all software managed as Debian packages. This will make our local packages and configuration consistent with what Debian does, make it easier to contribute packages back to Debian, allow us to use normal checking tools like lintian that assume FHS compliance, and make it easier to train new staff.

Specifically, this means that user programs will go in /usr/bin, daemons and other sysadmin-only programs will go in /usr/sbin, architecture-independent data goes in /usr/share, libraries and architecture-dependent support binaries, modules, or data go in /usr/lib, data modified by the service during normal operation will go into the appropriate directory in /var, PID files will go into /var/run, and logs will go into /var/log. See the standard document referenced above for all of the exact details. All scripts that appear on the user's path must go into a regular Debian package and be installed into /usr/bin or /usr/sbin.
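The placement rules above can be written down as a small lookup, which is handy when reviewing what a local package installs. This is only a sketch: the category names used as dictionary keys and the follows_layout helper are hypothetical, and the prefixes are the ones listed in the paragraph above.

```python
from pathlib import PurePosixPath

# Expected install prefixes, taken from the layout described above.
# The category names (the dictionary keys) are hypothetical labels.
FHS_PREFIXES = {
    "user-program": "/usr/bin",
    "daemon": "/usr/sbin",
    "arch-independent-data": "/usr/share",
    "library": "/usr/lib",
    "variable-data": "/var",
    "pid-file": "/var/run",
    "log": "/var/log",
}


def follows_layout(kind: str, path: str) -> bool:
    """Return True if path lies somewhere under the prefix expected for kind."""
    prefix = PurePosixPath(FHS_PREFIXES[kind])
    candidate = PurePosixPath(path)
    return prefix in candidate.parents
```

For example, follows_layout("daemon", "/usr/local/sbin/frobd") is false, because /usr/local/sbin is not under /usr/sbin.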

Executables should not have extensions indicating what language they're written in. Users don't care that a given executable is implemented as a Perl script, and invocations of executables end up in other scripts and aliases that would then break if the implementation language changes. Avoid any extensions like .pl or .sh.
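A check like this one could be run against a package's staging directory to catch stray extensions before the package is built. It is a minimal sketch: the function name is made up, and only .pl and .sh come from the text above; the other extensions in the set are assumed additions.

```python
import os
from pathlib import Path

# Extensions that reveal the implementation language. Only .pl and .sh are
# named in the text; the others are assumed additions for illustration.
LANGUAGE_EXTENSIONS = {".pl", ".sh", ".py", ".rb"}


def executables_with_extensions(directory: str) -> list[str]:
    """List executable files in directory whose names end in a language extension."""
    offenders = []
    for entry in sorted(Path(directory).iterdir()):
        if (
            entry.is_file()
            and os.access(entry, os.X_OK)
            and entry.suffix in LANGUAGE_EXTENSIONS
        ):
            offenders.append(entry.name)
    return offenders
```

Non-executable files are ignored on purpose, since a library or data file named something.pl is not invoked by name from other scripts or aliases.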

Normally, we won't have extensive service-specific data that will be served off local disk and won't fit naturally into /var/lib or the like. However, some applications will have this (such as a Subversion server or a streaming media server). The FHS has a provision for these sorts of applications, namely the /srv directory, which is a location for site data that should be independent of the package management system.

The following exception to the FHS is permitted:

/service: We run svscan on all of our systems to provide a simple and easy way of running monitored services. Rather than try to modify daemontools to make it FHS-compliant, we use it stock, which means that the control files for svscan will be in /service. However, now that daemontools has been released as free software and packaged properly for Debian, the contents of /service should be migrated over time to /etc/service. We don't as yet have a transition plan for doing this.

Some systems with old versions of daemontools may still have /command, but this is deprecated.

Kerberos keytabs for a service should go into the configuration directory in /etc for that service. For example, an HTTP/* keytab for Apache should go into /etc/apache2/keytab. Local services that use keytabs are encouraged to create a subdirectory in /etc for them and for other configuration.

Packages vs. Puppet

The primary configuration and package installation for all of our services is handled by Puppet. For each service we run under Debian, we have to decide what files, configuration, and package dependencies should go into Puppet and what should go into Debian packages.

As stated above, the basic rule of thumb is that all compiled software and non-trivial scripts should go into Debian packages so that we can use the package management system to upgrade and maintain them. For other components of a service, the question can be somewhat more ambiguous. We follow these principles:

All compiled binaries, scripts, man pages, and generally anything that isn't a configuration file in the Debian sense and installed under /etc must be packaged. More generic software, such as anything that we could release as open source to the rest of the world, should be packaged under the name of the package. Anything specific to this particular service at Stanford should go into a Stanford-local package named stanford-server-service.
Any software that's relatively generic (could be released as open source meaningfully to the rest of the world) should be packaged as if it were a regular Debian package, including shipping a generic configuration as needed, documentation for how to configure the software, and possibly even debconf questions if needed. If that generic configuration can be used as-is for our service, that may be the only source of that configuration.
Any configuration that's specific to our servers or that changes frequently should be in Puppet. Puppet may be overwriting a generic static configuration installed by the package. This should be done by just replacing the configuration file using Puppet, not using diversions or anything complex, and letting dpkg's normal configuration file handling work. This includes any configuration for integration with other tools specific to our standards, such as filter-syslog rules or newsyslog log rotation configuration files. However, remctl configuration files (but generally not ACL files) should go into the same package as the remctl backend scripts.
Additional packages needed by a service can be handled by either dependencies in the stanford-server-service package or by package installation rules in Puppet. The general rule of thumb is that package dependencies are appropriate for anything used by scripts in the package and for anything without which the package would be meaningless. Package installation rules in Puppet can be used for more ancillary additional packages, such as development packages needed for testing the software. If a service requires a lot of packages, it may be useful to manage them as package dependencies, because aptitude is somewhat more efficient at resolving dependencies in one pass than at processing multiple independent installation commands from Puppet.

This follows from the principles above, since it would be Stanford-specific configuration, but it is worth stating explicitly: Debian packages must not do any keying of systems (downloading keytabs, installing SSL certificates, and so forth). That should always be handled via Puppet.

Packaging Issues

Debian has excellent documentation of how to create Debian packages, and anyone learning how to create Debian packages should read that documentation first. The three primary guides are the New Maintainer's Guide, the Debian Policy Manual, and the Debian Developer's Reference. The New Maintainer's Guide is where you start; Policy is more of a detailed reference manual. The Developer's Reference contains useful information about packaging, but also explains how Debian itself is managed, how to use Debian's bug tracking system, and similar topics.

All of our packages, even the ones that are purely local to Stanford, should comply as closely as possible with Debian policy, since that's what makes Debian a consistent operating system and keeps the overall quality high.

All of our packages should be Lintian-clean, which means that running lintian -iI on the package should not produce any output. Some Lintian errors are unavoidable, particularly for packages that cannot comply with Debian policy for whatever reason. Those packages should include Lintian overrides to silence the error messages.

Reusable software

If there is any software that we've developed locally that would be useful for people at other sites, we should always try to clean it up, make it sufficiently generic that someone else could use it, and package it independently of anything Stanford-specific. Even if we can't get it completely generic, it's worthwhile to get as far as we can without spending too much time on it, since someone else may always have the chance to finish it later.

By making it generic, I mean that it should expect configuration files in /etc or a directory in /etc like normal Debian software, comply with Policy and the FHS (although all of our packages should do that), have an appropriate license and a correct copyright file, have manual pages, and have all of the other components of a regular Debian package. The standard license for all Perl modules written locally is:

This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

The standard license for anything else is:

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

(This is the standard Expat license, also often referred to as the MIT License, although that's ambiguous since there are multiple MIT licenses.)

For software we maintain locally and make releases of with regular version numbers (things like newsyslog, kstart, or remctl), we maintain the Debian packaging files on a separate branch of the Git repository for the package following my Git packaging layout for combined packaging and upstream. Any changes outside of the debian directory warrant a new, regular release of the package and a new Debian build with Debian version -1. Changes only to the files in debian can be made without a regular release and with a regular increment of the Debian version. The Debian packaging files are not included in the regular release of the package, following Debian packaging best practices.
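The versioning rule in the previous paragraph can be sketched as a small function. The function name is hypothetical, and the sketch assumes plain numeric Debian revisions (versions of the form upstream-N); it makes no attempt to cover everything Debian version syntax allows.

```python
from typing import Optional


def next_debian_version(current: str, new_upstream: Optional[str] = None) -> str:
    """Compute the next package version from the current one.

    current is "upstream-revision", for example "3.2-1". Pass new_upstream
    when files outside debian/ changed and a new regular release was made.
    """
    upstream, _, revision = current.rpartition("-")
    if not upstream or not revision.isdigit():
        raise ValueError(f"expected upstream-N, got {current!r}")
    if new_upstream is not None:
        # New regular release: the Debian revision starts again at 1.
        return f"{new_upstream}-1"
    # Packaging-only change: keep the upstream version, bump the revision.
    return f"{upstream}-{int(revision) + 1}"
```

So a change confined to debian takes 3.2-1 to 3.2-2, while releasing 3.3 resets the package version to 3.3-1.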

If we use software that we don't maintain but that hasn't been packaged for Debian, we should package it following the normal rules and best practices for packaging software for Debian. These packages should be maintained using Git and git-buildpackage, with the Git repositories stored on git.stanford.edu, using my Git Debian packaging layout.

Stanford-specific packages

Packages specific to Stanford should generally have names beginning with "stanford-" and should only contain the Stanford-specific scripts, configuration, and layout that are not generally usable at other sites. When possible, more generic software should live in separate packages with more generic names and the Stanford-specific package should depend on the more generic packages.

While it is not as important for Stanford-specific packages to comply with Debian policy in every respect, every effort should still be made to keep them compliant. This is both for consistency and so that standard packaging tools like Lintian can be used to check for real errors without producing a lot of noise.

Stanford versions of standard packages often can't take a different name due to dependencies in other packages that must be maintained. In that case, where we have to package a standard package with different patches or different build options, it should instead have a version formed by appending su and a number to the Debian revision of the package on which it's based. In this case, also consider creating a separate specialized repository for the service requiring the different package build to reduce the risk of the change on other unrelated services.
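As an illustration of that version scheme, the following sketch appends su and a build number to a Debian version and strips it again. Both function names are hypothetical, the version 2.4.10-10 used below is a made-up example, and writing the suffix with no separator is an assumption based on the wording above.

```python
import re


def stanford_version(debian_version: str, local_build: int = 1) -> str:
    """Append su<N> to the Debian revision of the package we rebuilt."""
    if "-" not in debian_version:
        raise ValueError("native package version has no Debian revision")
    if local_build < 1:
        raise ValueError("local build number must be at least 1")
    return f"{debian_version}su{local_build}"


def debian_base(version: str) -> str:
    """Strip a trailing su<N> to recover the Debian version it was based on."""
    return re.sub(r"su\d+$", "", version)
```

With these, stanford_version("2.4.10-10") gives 2.4.10-10su1, and debian_base applied to that string returns 2.4.10-10.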

Management Interfaces

All services should provide management interfaces through remctl where possible. The stanford-server package, installed on all infrastructure servers, sets up some basic interfaces to control Puppet and to upgrade packages and maintains remctl ACL files for staff. Services should take advantage of those ACL files and add additional remctl interfaces to control the operation of the service or retrieve information, where possible and appropriate. Any operation that has to routinely be done with a service and which would otherwise require logging on to the system should be provided via a remctl interface.

There are two remctl ACL files used for internal interfaces:

idg: all IDG staff
idg-root: all IDG staff, root instances only

Any operation that is equivalent to having root on the system (such as running Puppet or installing packages) should be limited to idg-root, as should any interface that allows access to restricted or private data, such as spooled e-mail.

Normally, a service should provide a single remctl backend script that implements a group of related actions to control a specific portion of the service. This script is generally named service-backend and installed in /usr/sbin (since normally it requires root privileges for at least some of its operations). It is then configured in a remctl configuration fragment under /etc/remctl/conf.d named service and containing something like:

    service ALL /usr/sbin/service-backend \
        /etc/remctl/acl/idg

(or one of the other ACLs as mentioned above). This sends any commands of the type service to this script.

Such a script should support a help command which provides a short summary (preferably one line per command) of the available commands. Alternatively, it can provide a text conversion of the script documentation, but a short summary is preferred.

All input to such a backend should be carefully checked and treated as untrusted except as required for the operation, even if initially access is limited to UNIX Systems staff. This makes it easier to open up access later to Help Desk staff or others when appropriate.
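A backend along these lines might be structured as in the sketch below. Everything specific in it is hypothetical: the status and flush actions, the messages, and the domain pattern are invented for illustration, and it assumes the subcommand arrives as the first command-line argument after the program name.

```python
import re

# Hypothetical actions for an imaginary service; the real set depends on
# what the service needs.
COMMANDS = {
    "help": "show this summary",
    "status": "report whether the service is running",
    "flush": "flush the queue for one domain: flush <domain>",
}

# Accept only conservative hostnames so that client input never reaches a
# shell or a file path unchecked.
DOMAIN_RE = re.compile(r"[a-z0-9]([a-z0-9.-]{0,251}[a-z0-9])?")


def dispatch(args: list[str]) -> tuple[int, str]:
    """Return (exit status, output) for one backend invocation."""
    if not args or args[0] not in COMMANDS:
        return 1, "unknown command; try help"
    command, rest = args[0], args[1:]
    if command == "help":
        # One line per command, as recommended above.
        lines = [f"{name:<8} {text}" for name, text in COMMANDS.items()]
        return 0, "\n".join(lines)
    if command == "status":
        if rest:
            return 1, "status takes no arguments"
        return 0, "status check not implemented in this sketch"
    # Only "flush" remains; validate its single argument strictly.
    if len(rest) != 1 or not DOMAIN_RE.fullmatch(rest[0]):
        return 1, "usage: flush <domain>"
    return 0, f"would flush queue for {rest[0]}"


def main(argv: list[str]) -> int:
    """Entry point; a real script would end with sys.exit(main(sys.argv))."""
    status, output = dispatch(argv[1:])
    print(output)
    return status
```

Keeping the dispatch logic separate from the entry point means the argument checks can be exercised without going through remctl at all.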

Log Rotation and Filtering

We use newsyslog for log rotation and use filter-syslog for syslog analysis instead of the standard Debian packages. This is for a variety of reasons, mostly related to staff familiarity with newsyslog and its improved support for saving logs into AFS. At some point, we may modify logrotate to do what we need, but it's not a high priority project.

Normally, all system logs will go to /var/log/messages rather than using Debian's default log layout, so that everything can be found in the same place. This log should be filtered with filter-syslog against a list of known expected log entries, so that we're notified of anything unusual in that log. This is set up automatically by Puppet; the only thing that each service Puppet configuration or module has to do is install additional filtering rules in /etc/filter-syslog if necessary. At least two weeks of old logs should be kept on local disk, but this log should not be saved into AFS. The analysis should be sent to the local root account, which should be forwarded to the appropriate alerts address for the service so that the people maintaining the service will see the mail (this is set up by Puppet).
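The idea of filtering against known expected entries can be shown in a few lines. This is not filter-syslog and does not use its rule format; it is a stand-in sketch of the same approach, with a hypothetical function name and made-up patterns.

```python
import re
from typing import Iterable

# Hypothetical patterns for log entries we expect and do not want reported.
EXPECTED = [
    re.compile(r"sshd\[\d+\]: Accepted gssapi-with-mic for \S+ from \S+"),
    re.compile(r"CRON\[\d+\]: \(root\) CMD "),
]


def unexpected_entries(lines: Iterable[str]) -> list[str]:
    """Return the non-blank log lines that match none of the expected patterns."""
    return [
        line.rstrip("\n")
        for line in lines
        if line.strip() and not any(p.search(line) for p in EXPECTED)
    ]
```

Whatever is returned is the unusual remainder that would be mailed out; an empty list means the log contained nothing outside the known patterns.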

If the service logs through syslog and is verbose or provides logs that we want to keep for metric information, those entries should be directed to a different file with a modified /etc/syslog.conf. That file should be rotated into AFS and in some cases may also need to be analyzed.

newsyslog is configured to run all of the configuration fragments in /etc/newsyslog.daily, /etc/newsyslog.weekly, and /etc/newsyslog.monthly on those intervals, and other configuration can be dropped into the appropriate directory to rotate application and web server logs.

Last spun 2022-02-06 from thread modified 2017-07-15