Russ Allbery > Technical Notes > Debian | Debian Private Repositories > |
Introduction
Software Packaging
File System Layout
Packages vs. Puppet
Packaging Issues
Management Interfaces
Log Rotation and Filtering
This document describes the standards for running infrastructure servers on Debian GNU/Linux at Stanford University as of 2014, when I left. They're provided here because I think they're fairly good standards and may be of use to other people. I'm no longer developing or updating them, so they may slowly drift out of date. A few policies and standards here are specific to Stanford, although other sites may find some of our decisions of interest.
One of the goals in moving to Debian is to minimize the amount of custom work that we need to do for servers. Under Solaris, we built nearly all software that we used from source and made extensive changes to the operating system configuration, mostly not using the software that came with Solaris. With Debian, the goal is to use already-packaged software as much as possible, to use the native Debian configuration methods and file locations, and to do only the work that is necessary and directly related to the differences in Stanford's environment.
One implication of this is that software will be managed as Debian
packages wherever possible. Not only does this let us easily move to the
native Debian package when our local customized package isn't necessary,
but it lets us use the Debian packaging system and package tools to manage
upgrades and system maintenance rather than inventing our own procedures.
As a general rule, any compiled software should be built as a Debian
package, as should any scripts installed on the user's path or more
complex than a simple /etc/cron.daily or init script. Those Debian
packages should be managed like any other Debian package, following the
Debian policy standard to the degree that it is
applicable, with proper changelog entries. This will make it easier for
us to contribute packages back to Debian where possible and to use Debian
tools to do package analysis.
Exceptions to this policy may be made for software that is extremely difficult to package for some reason (an Oracle server, for instance). Even software that uses its own automatic update features (such as Sophos PureMessage) should be wrapped in a packaged installer.
Production servers are expected to be running the current stable version of Debian, including whatever updates have been released. Some test and development boxes may run Debian unstable or testing. Special attention must be given to keeping those systems up to date with relevant security fixes.
There are innumerable different ways of organizing software and services in the file system that all make sense. Some are better in some situations and others are better in other situations. However, standardization on a workable single layout across all services has more advantages than picking a layout particularly well-suited for a particular application. Furthermore, Debian itself already enforces a standard for all native packages.
Accordingly, we will stick as closely as possible to the Filesystem Hierarchy Standard for all software managed as Debian packages. This will make our local packages and configuration consistent with what Debian does, make it easier to contribute packages back to Debian, allow us to use normal checking tools like lintian that assume FHS compliance, and make it easier to train new staff.
Specifically, this means that user programs will go in /usr/bin,
daemons and other sysadmin-only programs will go in /usr/sbin,
architecture-independent data goes in /usr/share, libraries and
architecture-dependent support binaries, modules, or data go in
/usr/lib, data modified by the service during normal operation will
go into the appropriate directory in /var, PID files will go into
/var/run, and logs will go into /var/log. See the standard
document referenced above for all of the exact details. All scripts that
appear on the user's path must go into a regular Debian package and be
installed into /usr/bin or /usr/sbin.
Executables should not have extensions indicating what language they're
written in. Users don't care that a given executable is implemented as a
Perl script, and invocations of executables end up in other scripts and
aliases that would then break if the implementation language changes.
Avoid any extensions like .pl or .sh.
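The naming rule above is easy to check mechanically. The following is a minimal sketch, not part of the original standard: the function names and the extension list beyond .pl and .sh are assumptions chosen for illustration.

```python
import os

# Extensions that reveal the implementation language. Only .pl and .sh
# come from the text; the others are illustrative assumptions.
LANGUAGE_EXTENSIONS = {".pl", ".sh", ".py", ".rb"}


def has_language_extension(path):
    """Return True if the executable's name ends in a language extension."""
    _, ext = os.path.splitext(os.path.basename(path))
    return ext.lower() in LANGUAGE_EXTENSIONS


def find_offenders(paths):
    """Return the paths whose names should be changed before packaging."""
    return [p for p in paths if has_language_extension(p)]
```

A check like this could be run over the list of files a package installs into /usr/bin and /usr/sbin.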
Normally, we won't have extensive service-specific data that will be
served off local disk and won't fit naturally into /var/lib or the
like. However, some applications will have this (such as a Subversion
server or a streaming media server). The FHS has a provision for these
sorts of applications, namely the /srv directory, which is a
location for site data that should be independent of the package
management system.
The following exception to the FHS is permitted:
/service
We run svscan on all of our systems to provide a simple and
easy way of running monitored services. Rather than try to modify
daemontools to make it FHS-compliant, we use it stock, which means
that the control files for svscan will be in /service.
However, now that daemontools has been released as free software and
packaged properly for Debian, the contents of /service should
be migrated over time to /etc/service. We don't as yet have a
transition plan for doing this.
Some systems with old versions of daemontools may still have
/command, but this is deprecated.
Kerberos keytabs for a service should go into the configuration directory
in /etc for that service. For example, an HTTP/* keytab for
Apache should go into /etc/apache2/keytab. Local services that use
keytabs are encouraged to create a subdirectory in /etc for them
and for other configuration.
The primary configuration and package installation for all of our services is handled by Puppet. For each service we run under Debian, we have to decide what files, configuration, and package dependencies should go into Puppet and what should go into Debian packages.
As stated above, the basic rule of thumb is that all compiled software and non-trivial scripts should go into Debian packages so that we can use the package management system to upgrade and maintain them. For other components of a service, the question can be somewhat more ambiguous. We follow these principles:
All compiled binaries, scripts, man pages, and generally anything that
isn't a configuration file in the Debian sense and installed under
/etc must be packaged. More generic software, such as anything
that we could release as open source to the rest of the world, should
be packaged under the name of the package. Anything specific to this
particular service at Stanford should go into a Stanford-local package
named stanford-server-service.
Any software that's relatively generic (could be released as open source meaningfully to the rest of the world) should be packaged as if it were a regular Debian package, including shipping a generic configuration as needed, documentation for how to configure the software, and possibly even debconf questions if needed. If that generic configuration can be used as-is for our service, that may be the only source of that configuration.
Any configuration that's specific to our servers or that changes frequently should be in Puppet. Puppet may be overwriting a generic static configuration installed by the package. This should be done by just replacing the configuration file using Puppet, not using diversions or anything complex, and letting dpkg's normal configuration file handling work. This includes any configuration for integration with other tools specific to our standards, such as filter-syslog rules or newsyslog log rotation configuration files. However, remctl configuration files (but generally not ACL files) should go into the same package as the remctl backend scripts.
Additional packages needed by a service can be handled either by dependencies in the stanford-server-service package or by package installation rules in Puppet. The general rule of thumb is that package dependencies are appropriate for anything used by scripts in the package and for anything without which the package would be meaningless. Package installation rules in Puppet can be used for more ancillary additional packages, such as development packages needed for testing the software. If a service requires a lot of packages, it may be useful to manage them as package dependencies, because aptitude is somewhat more efficient at resolving one set of dependencies than at processing multiple independent installation commands from Puppet.
It's fairly clear from the above principles, since this would be Stanford-specific configuration, but Debian packages must not do any keying of systems (downloading keytabs, installing SSL certificates, and so forth). That should always be handled via Puppet.
Debian has excellent documentation of how to create Debian packages, and anyone learning how to create Debian packages should read that documentation first. The three primary guides are the New Maintainer's Guide, the Debian Policy Manual, and the Debian Developer's Reference. The New Maintainer's Guide is where you start; Policy is more of a detailed reference manual. The Developer's Reference contains useful information about packaging, but also explains how Debian itself is managed, how to use Debian's bug tracking system, and similar topics.
All of our packages, even the ones that are purely local to Stanford, should comply as closely as possible with Debian policy, since that's what makes Debian a consistent operating system and keeps the overall quality high.
All of our packages should be Lintian-clean, which means that running
lintian -iI on the package should not produce any output. Some
Lintian errors are unavoidable, particularly for packages that cannot
comply with Debian policy for whatever reason. Those packages should
include Lintian overrides to silence the error messages.
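As a rough illustration of treating "no output" as the pass condition, the check can be wrapped in a small helper for a build script. This is a sketch under assumptions: the function names are hypothetical, only standard output is inspected, and the `run` parameter exists purely so the logic can be exercised on a machine without lintian installed.

```python
import subprocess


def lintian_command(package_file):
    """Build the lintian invocation named in the text (lintian -iI)."""
    return ["lintian", "-iI", package_file]


def is_lintian_clean(package_file, run=subprocess.run):
    """Return True when lintian prints nothing on stdout for the package.

    `run` defaults to subprocess.run; a stand-in can be passed for testing.
    """
    result = run(
        lintian_command(package_file),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip() == ""
```

A package that needs overrides would still pass this check once the overrides silence the corresponding tags.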
If there is any software that we've developed locally that would be useful for people at other sites, we should always try to clean it up, make it sufficiently generic that someone else could use it, and package it independently of anything Stanford-specific. Even if we can't get it completely generic, it's worthwhile to get as far as we can without spending too much time on it, since someone else may always have the chance to finish it later.
By making it generic, I mean that it should expect configuration files in
/etc or a directory in /etc like normal Debian software,
comply with Policy and the FHS (although all of our packages should do
that), have an appropriate license and a correct copyright file,
have manual pages, and have all of the other components of a regular
Debian package. The standard license for all Perl modules written locally
is:
This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.
The standard license for anything else is:
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
(This is the standard Expat license, also often referred to as the MIT License, although that's ambiguous since there are multiple MIT licenses.)
For software we maintain locally and make releases of with regular version
numbers (things like newsyslog, kstart, or remctl), we maintain the Debian
packaging files on a separate branch of the Git repository for the package
following my Git packaging layout for combined
packaging and upstream. Any changes outside of the debian
directory warrant a new, regular release of the package and a new Debian
build with Debian version -1. Changes only to the files in debian
can be made without a regular release and with a regular increment of the
Debian version. The Debian packaging files are not included in the
regular release of the package, following Debian packaging best practices.
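The versioning rule in this paragraph can be stated as a small function. This is a minimal sketch with hypothetical names (`next_package_version`, `new_upstream`); it assumes versions of the simple form UPSTREAM-REVISION with a numeric revision and ignores epochs and other Debian version syntax.

```python
def next_package_version(current, new_upstream=None):
    """Return the next package version under the rule described above.

    current: a version such as "3.2-1" (upstream release, Debian revision).
    new_upstream: the new release number if anything outside the debian
    directory changed, or None if only packaging files changed.
    """
    upstream, sep, revision = current.rpartition("-")
    if not sep or not revision.isdigit():
        raise ValueError(f"expected UPSTREAM-N, got {current!r}")
    if new_upstream is not None:
        # A new regular release restarts the Debian revision at 1.
        return f"{new_upstream}-1"
    # Packaging-only change: same upstream release, next Debian revision.
    return f"{upstream}-{int(revision) + 1}"
```

For example, a packaging-only fix to 3.2-1 yields 3.2-2, while a new 3.3 release yields 3.3-1.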
If we use software that we don't maintain but that hasn't been packaged for Debian, we should package it following the normal rules and best practices for packaging software for Debian. These packages should be maintained using Git and git-buildpackage, with the Git repositories stored on git.stanford.edu, using my Git Debian packaging layout.
Packages specific to Stanford should generally have names beginning with "stanford-" and should only contain the Stanford-specific scripts, configuration, and layout that are not generally usable at other sites. When possible, more generic software should live in separate packages with more generic names and the Stanford-specific package should depend on the more generic packages.
While it is not as important for Stanford-specific packages to comply with Debian policy in every respect, every effort should still be made to keep them compliant. This is both for consistency and so that standard packaging tools like Lintian can be used to check for real errors without producing a lot of noise.
Stanford versions of standard packages often can't take a different name
due to dependencies in other packages that must be maintained. In that
case, where we have to package a standard package with different patches
or different build options, it should instead have a version formed by
appending su and a number to the Debian revision of the package on
which it's based. In this case, also consider creating a separate
specialized repository for the service requiring the different package
build to reduce the risk of the change on other unrelated services.
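To make the version scheme concrete, here is a minimal sketch of forming such a version string. The function name is hypothetical, and the handling of an existing suffix (incrementing the number on a rebuild) is an assumption about how repeated local builds would be numbered, not something the standard spells out.

```python
import re

# Matches a version that already carries a local suffix, e.g. "2.4.10-1su1".
_SU_SUFFIX = re.compile(r"^(?P<base>.+-[^-]+?)su(?P<n>\d+)$")


def local_version(version):
    """Append su1 to a Debian version, or bump an existing suN suffix."""
    match = _SU_SUFFIX.match(version)
    if match:
        return f"{match.group('base')}su{int(match.group('n')) + 1}"
    if "-" not in version:
        raise ValueError(f"no Debian revision in {version!r}")
    return f"{version}su1"
```

So a local rebuild of a package at Debian version 2.4.10-1 would be versioned 2.4.10-1su1, and a second local rebuild 2.4.10-1su2.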
All services should provide management interfaces through remctl where possible. The stanford-server package, installed on all infrastructure servers, sets up some basic interfaces to control Puppet and to upgrade packages and maintains remctl ACL files for staff. Services should take advantage of those ACL files and add additional remctl interfaces to control the operation of the service or retrieve information, where possible and appropriate. Any operation that has to routinely be done with a service and which would otherwise require logging on to the system should be provided via a remctl interface.
There are two remctl ACL files used for internal interfaces:
idg: all IDG staff
idg-root: all IDG staff, root instances only
Any operation that is equivalent to having root on the system (such as
running Puppet or installing packages) should be limited to
idg-root, as should any interface that allows access to
restricted or private data, such as spooled e-mail.
Normally, a service should provide a single remctl backend script that
implements a group of related actions to control a specific portion of the
service. This script is generally named service-backend and
installed in /usr/sbin (since normally it requires root privileges
for at least some of its operations). It is then configured in a remctl
configuration fragment under /etc/remctl/conf.d named
service and containing something like:
    service ALL /usr/sbin/service-backend \
        /etc/remctl/acl/idg
(or one of the other ACLs as mentioned above). This sends any commands of the type service to this script.
Such a script should support a help command which provides a short
summary (preferably one line per command) of the available commands.
Alternately, it can provide a text conversion of the script documentation,
but a short summary is preferred.
All input to such a backend should be carefully checked and treated as untrusted except as required for the operation, even if access is initially limited to UNIX Systems staff. This makes it easier to open up access later to Help Desk staff or others when appropriate.
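The shape of such a backend, with a help command and strict input checking, might look like the following. This is only a sketch: the command names (status, flush), the queue-name rule, and the assumption that the backend receives the subcommand as its first argument are all illustrative and should be checked against the remctld documentation for the version in use.

```python
import re

# Hypothetical commands for illustration; a real backend lists its own.
COMMANDS = {
    "flush": "flush <queue>    Flush the named queue",
    "help": "help             Show this summary",
    "status": "status           Report whether the service is running",
}

# Accept only short, conservative names; everything else is rejected.
_SAFE_NAME = re.compile(r"[a-z0-9][a-z0-9_-]{0,31}")


def help_text():
    """Return a one-line-per-command summary."""
    return "\n".join(COMMANDS[name] for name in sorted(COMMANDS))


def dispatch(args):
    """Return (exit_status, output) for one invocation of the backend."""
    if not args or args[0] not in COMMANDS:
        return 1, "unknown command; try 'help'"
    command, rest = args[0], args[1:]
    if command == "help":
        return 0, help_text()
    if command == "status":
        if rest:
            return 1, "usage: status"
        return 0, "status: not implemented in this sketch"
    # Remaining case is flush: exactly one argument, fully validated.
    if len(rest) != 1 or not _SAFE_NAME.fullmatch(rest[0]):
        return 1, "usage: flush <queue>"
    return 0, f"would flush queue {rest[0]}"


def main(argv):
    """Entry point: print the result and return the exit status."""
    status, output = dispatch(argv)
    print(output)
    return status
```

A real script would end by calling sys.exit(main(sys.argv[1:])). Using fullmatch rather than a prefix match means arguments with trailing newlines or path separators are rejected outright, which keeps the backend safe if access is later widened.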
We use newsyslog for log rotation and use filter-syslog for syslog analysis instead of the standard Debian packages. This is for a variety of reasons, mostly related to staff familiarity with newsyslog and its improved support for saving logs into AFS. At some point, we may modify logrotate to do what we need, but it's not a high priority project.
Normally, all system logs will go to /var/log/messages rather than
using Debian's default log layout, so that everything can be found in the
same place. This log should be filtered with filter-syslog against a list
of known expected log entries, so that we're notified of anything unusual
in that log. This is set up automatically by Puppet; the only thing that
each service Puppet configuration or module has to do is install
additional filtering rules in /etc/filter-syslog if necessary. At
least two weeks of old logs should be kept on local disk, but this log
should not be saved into AFS. The analysis should be sent to the local
root account, which should be forwarded to the appropriate alerts address
for the service so that the people maintaining the service will see the
mail (this is set up by Puppet).
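The idea of filtering a log against a list of known expected entries can be sketched in a few lines. This is not filter-syslog's actual rule syntax or implementation; the function name and the use of Python regular expressions are assumptions made only to illustrate the approach of reporting whatever matches no known pattern.

```python
import re


def unexpected_lines(log_lines, expected_patterns):
    """Return the log lines that match none of the expected patterns."""
    compiled = [re.compile(pattern) for pattern in expected_patterns]
    return [
        line
        for line in log_lines
        if not any(rx.search(line) for rx in compiled)
    ]
```

Anything returned by such a filter is what would end up in the mail to the service's alerts address; an empty result means the log contained only expected entries.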
If the service logs through syslog and is verbose or provides logs that we
want to keep for metric information, those entries should be directed to a
different file with a modified /etc/syslog.conf. That file should
be rotated into AFS and in some cases may also need to be analyzed.
newsyslog is configured to run all of the configuration fragments in
/etc/newsyslog.daily, /etc/newsyslog.weekly, and
/etc/newsyslog.monthly on those intervals, and other configuration
can be dropped into the appropriate directory to rotate application and
web server logs.