< January 2012 | Russ Allbery > Eagle's Path | March 2012 > |
I wanted to give a final update of my experiment with adding symbols files to C++ library packages.
In the end, I reverted the changes and have gone back to not providing a symbols file, and instead just using shlibs. Some of this is because of the general state of the available tools, and some of it is due to the nature of the packages I was working on. This set of packages has frequent SONAME bumps from upstream, so backward compatibility across Debian stable versions can't happen anyway. They also don't use symbol export control, which means that the number of exported symbols is quite large and contains a lot of things leaked from internal objects.
But there are also tool issues. The biggest that I ran into is that symbols appear and disappear in the export list with different versions of the compiler, and the pkg-kde-tools utilities, while excellent, still interpret those changes as architecture-specific changes. So the symbols file accumulates supposedly arch-specific variations, which are actually about what version of the compiler happened to be installed on that buildd when the package was built. This also means that if a package that has had its symbols file updated for one architecture is built again on the same architecture but on another buildd with a different g++, it will often FTBFS since a different set of symbols will be present.
Obviously, this is too fragile to be maintainable.
Further debian-devel discussion pointed out that these symbol variations are probably all inline functions, where the compiler may or may not output a weak version of the symbol if it chooses not to inline it in some particular case. The right thing to do for these libraries (not all, since there are other uses of weak symbols) is to mark all weak symbols as optional and not care if they randomly appear and disappear.
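As a rough sketch of what "mark all weak symbols as optional" could look like, the following Python turns `nm -D --defined-only` style output into symbols-file lines and puts the `(optional)` tag on weak entries. The function name, the sample input, and the simple three-column parsing are assumptions for illustration only. This is not how pkg-kde-tools or dpkg-gensymbols work internally, and it ignores symbol versioning and the `c++` tag.

```python
def symbols_lines(nm_output, version):
    """Build symbols-file lines from nm-style rows of 'address type name'.

    Weak symbols (nm types W and V) get the (optional) tag so that their
    disappearance with another compiler version is not treated as an error.
    """
    lines = []
    for row in nm_output.splitlines():
        fields = row.split()
        if len(fields) != 3:
            continue  # undefined symbols have no address column; skip them
        _address, kind, name = fields
        if kind in ("W", "V"):
            lines.append(f" (optional){name}@Base {version}")
        elif kind.isupper():
            lines.append(f" {name}@Base {version}")
    return lines


# Hypothetical nm output, made up for this example.
sample = (
    "0000000000001139 T _ZN3Foo3barEv\n"
    "0000000000001150 W _ZN3Foo4sizeEv\n"
    "                 U malloc\n"
)
for line in symbols_lines(sample, "1.0"):
    print(line)
```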
There are also some 32-bit vs. 64-bit variations that are not captured by the subst mechanism, which resulted in the symbols file developing a list of architecture restrictions that amounted to "all 32-bit systems" or "all 64-bit systems." These explicit lists are also fragile, and will break with each new architecture introduced in Debian.
I filed three bugs with pkg-kde-tools with suggestions for improving the tools for other people going forward:
#657805: pkgkde-getbuildlogs gets truncated logs
#657806: representation of covariant return thunks
#658333: option to mark weak symbols as optional
but I think that the level of work and remaining fragility doesn't make sense for a lot of C++ libraries, at least right now without more direct support in dpkg-gensymbols and other tool improvements. I'm therefore also planning on changing the proposed Policy patch for symbols to make it more clear that shlibs is an acceptable alternative for C++, and to patch Lintian to suppress the info tag for no symbols file for C++ libraries.
A few people have been using the license-count tool that I wrote to analyze the breakdown of licenses in Debian and draw some conclusions. Unfortunately, some of those conclusions can't be drawn from the output of this specific tool (which doesn't make them wrong, just unproven). So here are a few cautions.
license-count was written for a specific purpose: to see how many references there are to a particular license in Debian so that we had some facts behind our discussion of whether to include it in common-licenses. It was not written to classify packages into particular categories, only to find out how many copies of a license we could save by adding it to common-licenses. This means that using it to determine how much software in Debian is covered by a particular license fails in the following ways:
license-count numbers cannot be added. Since the goal is to find every reference to a given license, each package is not classified as under one and only one license. If a package has material under the GPL, the LGPL, and the GFDL, it will add one to each of those counts. This means that you cannot determine, from its output, questions like "how many packages are covered under one of the GPL family of licenses?" You will double- or treble-count packages when you add the numbers together.
license-count counts any reference, not just "important" ones. If the license is referenced in debian/copyright, that counts, since the goal is to see how many packages might benefit from being able to reference it in common-licenses. This may not be the license that the package is under, just some small portion of the package.
For example, all of my packages that use Autotools add to the GPL count because I include all the source code licensing statements, including the statements for libtool and the Automake helper scripts, which say that the files can be released under either the GPL or the license of the including package. These packages are generally under a BSD-style license (or in some cases the Apache 2.0 license), not under the GPL, but since the reproduced license text has a reference to the GPL, license-count adds one to the GPL column for each of those binary packages. For most purposes for counting GPL-covered software, these are false positives. I don't know how many people are as complete in documenting license statements as I am, but just the binary packages that I maintain probably count for at least 50 false positives because of this alone.
license-count numbers include dual-licensing. Whether this matters for your analysis or not depends on what question you're asking, of course. If you're asking whether the package is covered under a copyleft license, including the thousands of packages that are dual-licensed under the Artistic license and the GPL may be okay, since the Artistic license is a sort of copyleft (although it allows you to avoid the copyleft requirement by renaming executables, providing the originals, and documenting your changes). But, for example, one of my packages is dual-licensed under a BSD-style license and the GPL (for reasons that aren't worth getting into). This package is not covered by copyleft under any reasonable definition of the term, but license-count will count it as a package using the GPL.
In order to draw interesting conclusions like "how much software in Debian is protected by copyleft," I'm afraid that you'd have to write a much more sophisticated tool than license-count.
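To make the double-counting problem concrete, here is a small Python sketch. The package names and license sets are invented, and this is not the license-count code itself; it only mimics the "one count per license referenced" behavior described above and compares the summed counts with the number of distinct packages.

```python
# Hypothetical data: package -> licenses referenced in debian/copyright.
references = {
    "pkg-a": {"GPL", "LGPL", "GFDL"},
    "pkg-b": {"GPL", "Artistic"},   # dual-licensed
    "pkg-c": {"BSD", "GPL"},        # BSD package quoting a GPL build helper
    "pkg-d": {"LGPL"},
}


def per_license_counts(references):
    """Count every reference, the way the per-license totals are built."""
    counts = {}
    for licenses in references.values():
        for name in licenses:
            counts[name] = counts.get(name, 0) + 1
    return counts


def packages_referencing_any(references, family):
    """Return the distinct packages that reference any license in family."""
    return {pkg for pkg, licenses in references.items() if licenses & family}


gpl_family = {"GPL", "LGPL", "GFDL"}
counts = per_license_counts(references)
summed = sum(counts.get(name, 0) for name in gpl_family)
distinct = len(packages_referencing_any(references, gpl_family))
print(summed, distinct)  # the sum (6) overstates the 4 distinct packages
```

Even the distinct count only says that a package mentions one of those licenses somewhere, not that the package as a whole is covered by it.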
Now is the point in the Debian release cycle where I usually try to go through my packages and bring them generally up to date. In case other people are doing the same thing, here are a few new facilities or techniques that I'm rolling out across my packages. (This is apart from the obvious stuff, like multiarch where appropriate and debhelper 9, and the older stuff, like using dh.)
dh-autoreconf is a new helper tool that runs autoreconf on the package
during the build and cleans up properly afterwards. I've started
switching all of my packages that use Autoconf and Automake and can
use autoreconf over to it. (Some upstreams have more complex scripts
that have to be run to regenerate the build system.) It plugs in
trivially as a dh add-on, and even adds support for --as-needed
(see below).
I'm doing this even for packages where I don't patch the build system, on the grounds that rebuilding everything from source, including the build system, is a good idea. It also means that I can patch the build system when I need to without having to add additional machinery at the time.
Linking with --as-needed. As I build new packages, I'm looking for anything that has warnings about generating unnecessary dependencies and adding --as-needed to the linker flags.
The combination of dh-autoreconf and the new dpkg-buildflags support makes this trivial to do for most packages. Just add something like:
export DEB_LDFLAGS_MAINT_APPEND = -Wl,--as-needed
to debian/rules. It's worth being aware that --as-needed can break some unusual special cases around shared library loading, but I've not run into any of those cases with any of the software that I package.
dpkg-buildflags comes essentially for free with debhelper 9, but it's worth noting that, as shown above, it's a really easy way to add additional flags. And if you have to pass in flags via some other mechanism, use dpkg-buildflags to get the default flags.
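For build machinery that can't pick the flags up from the environment, one option is to ask dpkg-buildflags directly. The following is only a sketch: `dpkg-buildflags --get` is the real command, but the `default_flags` helper and its injectable `run` parameter are hypothetical names used here so the function can be exercised without dpkg installed.

```python
import shlex
import subprocess


def default_flags(variable, run=subprocess.check_output):
    """Return the flags dpkg-buildflags reports for variable, as a list.

    variable is a flag name such as "CFLAGS" or "LDFLAGS". Any maintainer
    settings exported in the environment (for example
    DEB_LDFLAGS_MAINT_APPEND) are applied by dpkg-buildflags itself.
    """
    output = run(["dpkg-buildflags", "--get", variable], text=True)
    return shlex.split(output)


if __name__ == "__main__":
    # Requires dpkg-dev to be installed; prints whatever the system reports.
    print(default_flags("LDFLAGS"))
```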
Once you're up to debhelper 9 and are using dpkg-buildflags, adding hardening flags is easy. You get the default ones for free, and that's a pretty good start. (Install hardening-includes and use hardening-check to check the status of the binaries built by your package.) I always add at least hardening=+bindnow to DEB_BUILD_MAINT_OPTIONS (set with export in debian/rules), since the minor speed hit at startup doesn't matter for anything that I'm packaging. (It might for something like ls that runs all the time.)
I usually try to also add +pie, but be careful of that. Libtool will cope correctly with it and switch it back to PIC for shared libraries, but other shared library build processes may not. And it doesn't always work; for example, gnubg (GNU Backgammon) just immediately dies if built PIE, for reasons that I didn't track down.
If, like me, you maintain your packaging in Git without using a separate tool that exports a patch series, and therefore use the single-debian-patch option, there is new support for applying that option only to your build as the maintainer but not to any other build that people do of your source package. This is good, since it means that any NMU diffs will be kept separate from your maintainer diff because they'll get the version of the NMU package added.
To get this behavior, move debian/source/patch-header to debian/source/local-patch-header and debian/source/options to debian/source/local-options (assuming that's your only option; otherwise, you might need to split it). Then the patch header and options won't be included in the generated source package and hence won't apply to NMUs or other packaging changes based on the source package in the archive instead of on the packaging repository.
It's also worth mentioning that Ubuntu was responsible for breaking a lot of ground here. Due to bug reports and patches submitted from Ubuntu, several of my packages already had hardening build flags and --as-needed issues fixed before this round of packaging refresh, which made adding these features much easier than it would be otherwise.
I've just uploaded version 3.9.3.0 of Debian Policy. The really big change in this version is the 1.0 version of the copyright format specification, but that will get a separate announcement once the www team has had a chance to put it in place on the web site.
There are a bunch of other fixes, though, including some normative changes. Here's the upgrading checklist (duplicated from my announcement, but formatted a bit prettier):
New archive sections education, introspection, and metapackages added.
The Architecture field in *.dsc files may now contain the value any all for source packages building both architecture-independent and architecture-dependent packages.
If a dependency is restricted to particular architectures, the list of architectures must be non-empty.
/run is allowed as an exception to the FHS and replaces /var/run. /run/lock replaces /var/lock. The FHS requirements for the older directories apply to these directories as well. Backward compatibility links will be maintained and packages need not switch to referencing /run directly yet. Files in /run should be stored in a temporary file system.
New section spelling out the requirements for packages that use files in /run, /var/run, or /var/lock. This generalizes information previously only in 9.3.2.
Cron job file names must not contain "." or "+" or they will be ignored by cron. They should replace those characters with "_". If a package provides multiple cron job files in the same directory, they should each start with the package name (possibly modified as above), "-", and then some suitable suffix.
Packages using doc-base do not need to call install-docs anymore.
Packages that declare the same conffile may see left-over configuration files from each other even if they conflict.
The Policy rules around Motif libraries were just a special case of normal rules for non-free dependencies and were largely obsolete, so they have been removed.
debian/copyright is no longer required to list the Debian maintainers involved in the creation of the package (although note that the requirement to list copyright information is unchanged).
Version 1.0 of the "Machine-readable debian/copyright file" specification is included.
This separate document has been retired and its (short) contents merged into Policy section 9.7. There are no changes to the requirements.
Packages may declare an interest in the <perl-major-upgrade> trigger to be notified of major upgrades of perl.
ttf-japanese-{mincho,gothic} have been renamed to fonts-japanese-{mincho,gothic}.
Please review and update your packages as necessary the next time you make an upload.
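The cron job naming rule from the checklist is mechanical enough to sketch in a few lines of Python. The function names are made up for this example, and sanitizing the suffix as well as the package name is an assumption on my part (it follows from the file name as a whole not being allowed to contain "." or "+").

```python
def cron_safe(name):
    """Replace the characters that make cron ignore a file with '_'."""
    return name.replace(".", "_").replace("+", "_")


def cron_job_file_name(package, suffix=None):
    """Build a cron job file name for a package.

    With a single cron file, the sanitized package name is enough. With
    several files in the same directory, each gets the sanitized package
    name, "-", and a suffix.
    """
    base = cron_safe(package)
    if suffix is None:
        return base
    return f"{base}-{cron_safe(suffix)}"


print(cron_job_file_name("libstdc++6"))
print(cron_job_file_name("gtk2.0-common", "cleanup"))
```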
Four and a half years after Sam Hocevar started a draft proposal on the Debian Wiki, we've published revision 1.0 of the machine-readable format for debian/copyright files. The format (as DEP-5) has been in widespread use for some time, but there have been multiple versions of the format and multiple transitional versioned URLs. Everyone can now update to the same version of the document and use a stable format URL.
Below is the text of the announcement that I just sent to debian-devel-announce.
Version 1.0 of the machine-readable format for debian/copyright files (the culmination of the DEP-5 process) has now been published. The canonical URL (and also the URL to use in the Format field in such files) is:
http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
For those who have adopted various iterations of the DEP-5 format for your packages, please consider updating to this format the next time you revise your package.
Use of this format in Debian packages is optional.
Thanks to all of the people who have contributed to this work over its long evolution. Particular thanks go to Steve Langasek, Charles Plessy, and Lars Wirzenius, who kept this document moving through its extended DEP discussion process, and Sam Hocevar, who started this work in 2007.
The copyright format specification is now being maintained as part of the debian-policy package and will follow the same update process that's used for other parts of Policy. See:
for that process.
For this specification, we're trying something new for Debian technical policies. This document is versioned, and older versions of the specification will be maintained to not invalidate older references in copyright files that have not been moved to newer versions. This is similar to the way that specifications are handled by some other projects, particularly the freedesktop.org standards.
The plan, going forward, is that informative changes (changes that do not change the requirements or the contents of compliant copyright files) will be made following the same relaxed procedure as other informative changes to Policy, and new versions with only informative changes will be published with the same version number. So the existing 1.0 document may receive further improvements in wording, examples, or similar changes that don't modify its meaning.
Normative changes (changes that change requirements for compliant files, which includes addition of new standardized license short names) will follow the normative Policy change process, including formal seconds.
Once normative changes are accepted, the next release of this specification will receive a new version number (1.1, for example). The previous version, prior to the normative changes, will be frozen, and from that point forward will not be changed except as required to continue to publish it. This includes further informative changes. We will continue publishing these frozen older versions of the standard on www.debian.org and in the debian-policy package for some time: possibly forever, and at least as long as they're still referred to by packages distributed by the project.
The hope is that this will provide a good balance between specification stability, editing for clarity, and specification changes to meet changing requirements, while not invalidating the URLs in older files and while continuing to provide the information required to interpret older files.
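If you want to find which of your packages still point at one of the transitional DEP-5 URLs, a check along these lines might be a starting point. It is a minimal sketch, not a validator for the format: it only looks at the Format field in the first paragraph of a debian/copyright file, and the function names and sample file contents are invented for the example.

```python
FORMAT_URL = "http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/"


def format_field(copyright_text):
    """Return the Format value from the header paragraph, or None."""
    for line in copyright_text.splitlines():
        if not line.strip():
            break  # a blank line ends the header paragraph
        if line[0].isspace():
            continue  # continuation line of the previous field
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "format":
            return value.strip()
    return None


def uses_format_1_0(copyright_text):
    """True if the file declares the stable 1.0 format URL."""
    return format_field(copyright_text) == FORMAT_URL


# Hypothetical file contents for illustration.
current = f"Format: {FORMAT_URL}\nUpstream-Name: example\n\nFiles: *\n"
older = "Format: http://example.org/some-transitional-url\n\nFiles: *\n"
print(uses_format_1_0(current), uses_format_1_0(older))
```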
The only change in this release is to add a new bstrndup function to the C TAP library. Like the other b* memory allocation functions, this emulates the (semi-standard) strndup function but calls sysbail on memory allocation failure. It reimplements strndup to not assume that the system C library has it (since it's missing on Mac OS X).
You can get the latest version from the C TAP Harness distribution page.
The primary changes in this release are in the networking utility library and in the various provided TAP add-ons for testing software that uses Kerberos and remctl.
In the networking library, new network_read and network_write functions have been added to read and write from network sockets with an optional timeout. Various problems with the network utility test suite have also been fixed.
The Kerberos and remctl TAP add-ons have been significantly refactored and improved. The kerberos_setup function now returns a struct of configuration and supports being told what configuration is required. skip_all will be called automatically if required configuration is missing. The native Kerberos and kinit versions of the test suite setup have also been merged and the correct alternative is chosen at compile time. The remctl TAP add-on has been updated to match, now gets the path to remctld from a #define so that each caller doesn't have to provide it, uses the new test file functions to find its configuration, and supports running remctld under fakeroot.
This release also has a bug fix for the replacement strndup (it previously assumed the source string was nul-terminated) and some improvements to the xstrndup checked version.
You can get the latest release from the rra-c-util distribution page.
The primary change in this release is to add a new remctl_set_timeout function to the remctl client library and similar functions to the Perl, PHP, Python, and Ruby bindings. When set, the timeout affects all subsequent network operations, including connect.
The remctld server, thanks to Andrew Mortensen, now supports a new configuration option that sets the user to run the command as. When set, remctld will change to that user (including their groups) before running the command, freeing command authors from having to write the code to change users.
Thanks to the code for timeout handling, the timeout on the remctld server is now an hour timeout between messages from the client rather than an hour limit on the entire connection.
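The difference between a limit on the whole connection and a timeout between messages can be shown with a small Python analogue. This is not remctld's code or the remctl protocol: the `read_messages` helper is hypothetical, and each chunk returned by `recv` stands in for a protocol message. The point is only that the timer restarts after every message, so a client that keeps talking is never cut off.

```python
import socket


def read_messages(sock, idle_timeout):
    """Collect data until the peer closes or stays silent too long.

    The timeout applies to each recv call separately, so it measures the
    gap between messages rather than the age of the connection.
    """
    sock.settimeout(idle_timeout)
    messages = []
    while True:
        try:
            data = sock.recv(4096)
        except socket.timeout:
            return messages, "idle timeout"
        if not data:
            return messages, "closed"
        messages.append(data)


if __name__ == "__main__":
    client, server = socket.socketpair()
    client.sendall(b"command")
    # The client sends nothing further, so the read gives up after 0.05s.
    print(read_messages(server, 0.05))
    client.close()
    server.close()
```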
There are also some minor fixes to some of the internals of the PHP and Python bindings.
You can get the latest release from the remctl distribution page.