control-archive release 1.7.0

(processing and archiving of Netnews control messages)

Written by Russ Allbery <eagle@eyrie.org>

Copyright 2002, 2003, 2004, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2016 Russ Allbery <eagle@eyrie.org>. Portions based on code copyright 2001 Marco d'Itri and copyright 1996 UUNET Technologies, Inc. This product includes software developed by UUNET Technologies, Inc. This software is distributed under a BSD-style license. Please see the file LICENSE in the distribution for more information.

BLURB

This software generates an INN control.ctl configuration file from hierarchy configuration fragments, verifies control messages using GnuPG where possible, processes new control messages to update a newsgroup list, archives new control messages, and exports the list of newsgroups in a format suitable for synchronizing the newsgroup list of a Netnews news server. It is the software that maintains the control message and newsgroup lists available from ftp.isc.org.

DESCRIPTION

This package contains two major components:

Manual changes to the canonical newsgroup list are also supported, in a way that generates the same log messages and uses the same locking structure so that they can co-exist with automated changes and be included in the same reports. Also included in this package are the documentation files included in the control message archive and newsgroup lists on ftp.isc.org.

REQUIREMENTS

All of the software in this package is written in Perl and requires Perl 5.6 or later. The Perl modules Compress::Zlib, Date::Parse, Net::NNTP, and Text::Template are also required. Compress::Zlib is included in Perl 5.10 or later and otherwise available from CPAN. Date::Parse is part of the TimeDate distribution in CPAN. Net::NNTP is included in Perl 5.8 or later and otherwise available as part of the libnet distribution in CPAN. Text::Template is available from CPAN. Most Linux distributions have packages for the extra Perl modules.
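As a quick check before installing, the following sketch reports which of the required Perl modules cannot be loaded. The function name missing_modules and the PERL override variable are illustrative only and are not part of this package.

```shell
# Print the name of each required Perl module that fails to load.
# PERL may be set to point at a different interpreter (hypothetical knob).
missing_modules() {
    for module in Compress::Zlib Date::Parse Net::NNTP Text::Template; do
        # "perl -MModule -e 1" exits non-zero if the module cannot be loaded.
        "${PERL:-perl}" -M"$module" -e 1 2>/dev/null || echo "$module"
    done
}
```

If missing_modules prints nothing, all four modules loaded. Any module it names needs to be installed from CPAN or from a distribution package.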

gzip (available from <http://www.gnu.org/software/gzip/>) and bzip2 (available from <http://www.bzip.org/>) are required. Both are generally available with current operating systems, possibly as supplemental packages.

process-control expects to be fed file names and message IDs of control messages on standard input and therefore needs to be run from a news server or some other source of control messages. A minimalist news server like tinyleaf is suitable for this (I wrote tinyleaf, available from the current INN Subversion repository, for this purpose).

VERSIONING

This package uses a three-part version number. The first number will be incremented for major changes, major new functionality, incompatible changes to the configuration format (more than just adding new keys), or similar disruptive changes. For lesser changes, the second number will be incremented for any change to the code or functioning of the software. A change to the third part of the version number indicates a release with changes only to the configuration, PGP keys, and documentation files.
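The versioning rules above can be sketched as a small shell function that compares two release numbers and reports what kind of change the scheme implies. The function name version_change and its output labels are made up for this illustration.

```shell
# Classify the change between two three-part version numbers.
# Prints "major", "code", "config", or "none".
version_change() {
    old=$1
    new=$2
    # First component: everything before the first dot.
    old_major=${old%%.*}
    new_major=${new%%.*}
    # Second component: strip the first component, then cut at the next dot.
    old_rest=${old#*.}
    new_rest=${new#*.}
    old_minor=${old_rest%%.*}
    new_minor=${new_rest%%.*}
    if [ "$old_major" != "$new_major" ]; then
        echo "major"     # disruptive or incompatible change
    elif [ "$old_minor" != "$new_minor" ]; then
        echo "code"      # change to the code or functioning of the software
    elif [ "$old" != "$new" ]; then
        echo "config"    # configuration, PGP keys, or documentation only
    else
        echo "none"
    fi
}
```

For example, version_change 1.7.0 1.7.1 reports a configuration-only release, so a site that has modified the scripts locally would only need to merge changes to the configuration, keys, and documentation.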

LAYOUT

The configuration data is in one file per hierarchy in the config/ directory. Each file has the format specified in FORMAT and is designed to be readable by INN's new configuration parser in case this can be further automated down the road. The config/special/ directory contains overrides, raw control.ctl fragments that should be used for particular hierarchies instead of automatically-generated entries (usually for special comments). Eventually, the format should be extended to handle as many of these cases as possible.

The keys/ directory contains the PGP public keys for every hierarchy that has one. The user IDs on these keys must match the signer expected by the configuration data for the corresponding hierarchy.

The forms/ directory contains the basic file structure for the three generated files.

The scripts/ directory contains all the software that generates the configuration and documentation files, processes control messages, updates the database, creates the newsgroup lists, and generates reports. Most scripts in that directory have POD documentation included at the end of the script, viewable by running perldoc on the script.

The templates/ directory contains sample templates for the control-summary script.

The docs/ directory contains the extra documentation files that are distributed from ftp.isc.org in the control message archive and newsgroup list directories.

INSTALLATION

This software is set up to run from /srv/control. To use a different location, edit the paths at the beginning of each script in the scripts/ directory. By default, copying all the files from the distribution into a /srv/control directory is almost all that's needed, and an install rule is provided to do this. To install the software, run:

    make install

You will need write access to /srv/control or permission to create it.

process-control and generate-files need a GnuPG keyring containing all of the honored hierarchy keys. To generate this keyring, run make install or:

    mkdir keyring
    gpg --homedir=keyring --allow-non-selfsigned-uid --import keys/*

from the top level of this distribution. process-control also expects a control.ctl file in /srv/control/control.ctl, which can be generated from the files included here (after creating the keyring as described above) by running make install or:

    scripts/generate-files

Both of these are done automatically as part of make install. process-control expects /srv/control/archive to exist and archives control messages there. It expects /srv/control/tmp to exist and uses it for temporary files for GnuPG control message verification.

To process incoming control messages, you need to run process-control on each message. process-control expects to receive, on standard input, lines consisting of a path to a file, a space, and a message ID. This input format is designed to work with the tinyleaf server that will come with INN 2.5 and later and is available from INN's Subversion repository, but it should also work as a channel feed from pre-storage-API versions of INN (1.x). It will not work without modification via a channel feed from a current version of INN, since it doesn't understand the storage API and doesn't know how to retrieve articles by tokens. This could be easily added; I just haven't needed it.
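To test process-control by hand without a news server, a line in the input format described above can be built as in the following sketch. The function name control_line, the spool file name, and the message ID in the usage note are hypothetical, and whether the message ID should include angle brackets is an assumption here.

```shell
# Print one input line for process-control:
# the path to the article file, a space, and the message ID.
control_line() {
    printf '%s %s\n' "$1" "$2"
}
```

A single stored control message could then be processed with something like:

    control_line /srv/control/spool/article '<example@news.example.org>' \
        | /srv/control/scripts/process-control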

If you're using tinyleaf, here is the setup process:

  1. Create a directory that tinyleaf will use to store incoming articles temporarily, the archive directory, and the logs directory and install the software:

        make install

  2. Run tinyleaf on some port, configuring it to use that directory and to run process-control. A typical tinyleaf command line would be:

        tinyleaf /srv/control/spool /srv/control/scripts/process-control

    I run tinyleaf using tcpserver (from the ucspi-tcp package), but inetd should work equally well.

  3. Set up a news feed to the system running tinyleaf that sends control messages of interest. You should be careful not to send cancel control messages or you'll get a ton of junk in your logs. The INN newsfeeds entry I use is:

        isc-control:control,control.*,!control.cancel:Tf,Wnm:

    combined with nntpsend to send the articles.

That should be all there is to it. Watch the logs directory to see what happens for incoming messages.

scripts/process-control just maintains a database file. To export that data in a format that's useful for other software, run scripts/export-control. This expects a /srv/control/export directory into which it stores active and newsgroups files, a copy of the control.ctl file, and all of the logs in a LOGS subdirectory. This export directory can then be made available on the web, copied to another system, or whatever else is appropriate. Generally, scripts/export-control should be run periodically from cron.

Reports can be generated using scripts/control-summary. This script needs configuration before running; see the top of the script and its included POD documentation. There is a sample template in the templates/ directory, and scripts/weekly-report shows a sample cron job for sending out a regular report.

BOOTSTRAPPING

This package is intended to provide all of the tools, configuration, and information required to duplicate the ftp.isc.org control message archive and newsgroup list service if you so desire. To set up a similar service based on that service, however, you will also want to bootstrap from the existing data. Here is the procedure for that:

  1. Be sure that you're starting from the latest software and set of configuration files. I will generally try to make a new release after committing a batch of changes, but I may not make a new release after every change. See the SOURCE REPOSITORY section below for information about the Git repository in which this package is maintained. You can always clone that repository to get the latest configuration (and then merge or cherry-pick changes from my repository into your repository as you desire).

  2. Download the current newsgroup list from:

        ftp://ftp.isc.org/pub/usenet/CONFIG/newsgroups.bz2

    and then bootstrap the database from it:

        bzip2 -dc newsgroups.bz2 | scripts/update-control bulkload

  3. If you want the log information so that your reports will include changes made in the ftp.isc.org archive before you created your own, copy the contents of ftp://ftp.isc.org/pub/usenet/CONFIG/LOGS/ into /srv/control/logs.

  4. If you want to start with the existing control message repository, download the contents of ftp://ftp.isc.org/pub/usenet/control/ into /srv/control/archive. You can do this using a recursive download tool that understands FTP, such as wget, but please use the options that add delays and don't hammer the server to death.
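For the archive download in step 4, a polite recursive fetch might look like the following sketch, which assumes a GNU Wget build with FTP support. The function name mirror_archive, the WGET override variable, the two-second wait, and the rate limit are illustrative choices and not recommendations from the archive maintainers.

```shell
# Mirror the control message archive into the directory given as $1,
# pausing between requests and capping bandwidth to go easy on the server.
# WGET may be set to use a different wget binary (hypothetical knob).
mirror_archive() {
    # --no-host-directories and --cut-dirs=3 drop the leading
    # ftp.isc.org/pub/usenet/control/ components from local paths.
    "${WGET:-wget}" --recursive --no-parent --no-host-directories --cut-dirs=3 \
        --wait=2 --random-wait --limit-rate=200k \
        --directory-prefix="$1" \
        ftp://ftp.isc.org/pub/usenet/control/
}
```

Running mirror_archive /srv/control/archive should place the per-hierarchy directories directly under /srv/control/archive. The download will take a long time with these settings, which is the point.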

After finishing those steps, you will have a copy of the ftp.isc.org archive and can start processing control messages, possibly with different configuration choices. You can generate the files that are found in ftp://ftp.isc.org/pub/usenet/CONFIG/ by running scripts/export-control as described above.

MAINTENANCE

To add a new hierarchy, add a configuration fragment in the config/ directory named after the hierarchy, following the format of the existing files, and run scripts/generate-files to create a new control.ctl file. See the documentation in scripts/generate-files for details about the supported configuration keys.

If the hierarchy uses PGP-signed control messages, also put the PGP key into the keys/ directory in a file named after the hierarchy. Then, run:

    gpg --homedir=keyring --import keys/<hierarchy>

to add the new key to the working keyring.

The first user ID on the key must match the signer expected by the configuration data for the corresponding hierarchy. If a hierarchy administrator sets that up wrong (usually by putting additional user IDs on the key), this can be corrected by importing the key into a keyring with GnuPG, using gpg --edit-key to remove the offending user ID, and exporting the key again with gpg --export --ascii.
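The cleanup described above can be sketched as a shell function that works in a scratch keyring so the working keyring is left untouched. The function name fix_key_uids, the GPG override variable, and the choice to write the result to a .new file rather than overwriting the key are assumptions of this example. Inside the gpg --edit-key session, select the unwanted user ID with "uid N", remove it with "deluid", and finish with "save".

```shell
# Re-export a hierarchy key after interactively removing extra user IDs.
# $1: hierarchy name (also the file name under keys/), $2: key ID.
# GPG may be set to use a different gpg binary (hypothetical knob).
fix_key_uids() {
    hierarchy=$1
    keyid=$2
    # Scratch GnuPG home directory, removed again at the end.
    scratch=$(mktemp -d) || return 1
    "${GPG:-gpg}" --homedir="$scratch" --import "keys/$hierarchy" &&
    "${GPG:-gpg}" --homedir="$scratch" --edit-key "$keyid" &&
    "${GPG:-gpg}" --homedir="$scratch" --export --ascii "$keyid" \
        > "keys/$hierarchy.new"
    status=$?
    rm -rf "$scratch"
    return $status
}
```

Run it from the top level of the distribution, review keys/<hierarchy>.new, and then move it over keys/<hierarchy> before importing it into the working keyring.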

When adding a new hierarchy, it's often useful to bootstrap the newsgroup list by importing the current checkgroups. To do this, obtain the checkgroups as a text file (containing only the groups without any news headers) and run:

    scripts/update-control checkgroups <hierarchy> < <checkgroups>

where <hierarchy> is the hierarchy the checkgroups is for and <checkgroups> is the path to the checkgroups file.
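If the checkgroups is only available as a saved news article, the headers have to be removed first. A minimal sketch, assuming the article has the usual layout of headers, one blank line, and then the group list (the function name strip_headers is made up for this example):

```shell
# Read a news article on standard input and print only its body by
# deleting everything from the first line through the first blank line.
strip_headers() {
    sed '1,/^$/d'
}
```

The result can be saved to a file and passed to scripts/update-control as shown above. Check the output before importing it, since an article with an unusual layout may leave stray lines behind.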

SUPPORT

The control-archive web page at:

    http://www.eyrie.org/~eagle/software/control-archive/

will always have the latest released version of this package, the current documentation, and pointers to any additional resources.

Configuration updates should be sent to usenet-config@isc.org. You can send bug reports to the same place, or to my personal email address at eagle@eyrie.org. You can also open an issue or pull request on GitHub at:

    https://github.com/rra/control-archive

Please be aware that I tend to be extremely busy and work projects often take priority. I'll save your mail and get to it as soon as I can, but it may take me a couple of months.

SOURCE REPOSITORY

control-archive is maintained using Git. You can access the current source and configuration by cloning the repository at:

    git://git.eyrie.org/usenet/control.git

or view the repository via the web at:

    http://git.eyrie.org/?p=usenet/control.git

You can also use GitHub via:

    https://github.com/rra/control-archive

The eyrie.org repository is the canonical one, maintained by the author, but using GitHub is probably more convenient for most purposes. Pull requests are gratefully reviewed and normally accepted.

LICENSE

The control-archive package as a whole is covered by the following copyright statement and license:

  Copyright 2002, 2003, 2004, 2007, 2008, 2009, 2010, 2011, 2012, 2013,
      2014, 2016 Russ Allbery <eagle@eyrie.org>

  Permission is hereby granted, free of charge, to any person obtaining
  a copy of this software and associated documentation files (the
  "Software"), to deal in the Software without restriction, including
  without limitation the rights to use, copy, modify, merge, publish,
  distribute, sublicense, and/or sell copies of the Software, and to
  permit persons to whom the Software is furnished to do so, subject to
  the following conditions:

  The above copyright notice and this permission notice shall be
  included in all copies or substantial portions of the Software.

  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
  EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
  MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
  IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
  CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
  TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
  SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

All individual files without an explicit exception below are released under this license. Some files may have additional copyright holders as noted in those files. There is detailed information about the licensing of each file in the LICENSE file in this distribution.

Some files in this distribution are individually released under different licenses, all of which are compatible with the above general package license but which may require preservation of additional notices. All required notices are preserved in the LICENSE file.
