Summary of C++ symbols experience

This is a follow-up to my previous post about the way that Debian's library symbols analysis support interacts with C++ libraries, based on further debian-devel discussion. The primary outcome of that discussion was to point me at the pkg-kde-tools utilities for handling maintenance of these files, which solved a bunch of my problems.

As before, this was also posted on debian-devel, which is where I expect most of the discussion to be.

First of all, for those who haven't explored the pkg-kde-tools infrastructure for this, it looks like the effective process goes something like this:

  1. Do a local build of the package and then generate an initial symbols template for that architecture. So, for example, I did:

        fakeroot debian/rules build install
        pkgkde-gensymbols -plibxml-security-c16 -v1.6.1 -Osymbols.i386 \
        pkgkde-symbolshelper create -o debian/libxml-security-c16.symbols \
            -v 1.6.1 symbols.i386

    This generates an initial symbols file that will work for i386.

  2. Build and upload this version of the package. Now wait for all the buildds to fail (because they will, on probably nearly every other architecture than your local one).

  3. Download all of the build logs and feed them back into pkgkde-symbolshelper so that it can adjust your symbols template for all of the architecture variations. So, for example, I did:

        pkgkde-symbolshelper batchpatch -v 1.6.1 *_unstable_logs/*.build

    This will generate rather nice annotated symbols files with appropriate arch lists and using subst tags where appropriately (such as when an argument to a function is a size_t).

  4. Modify your package to use pkg-kde-tools during the build, because subst tags are a really nice way to handle a lot of C++ symbol patterns but #533916 against dpkg-dev is wontfix, so you need the pkg-kde-tools replacement for dpkg-gensymbols. I added a dependency on pkg-kde-tools and then added the (undocumented) pkgkde_symbolshelper add-on to my dh --with flag.

  5. Build and upload the package again. Now it will hopefully build on all architectures.

This is, as you might expect, a fair bit of work, but it does work, and what comes out the other end is a symbols file that works across the current buildd architectures and detects any mistakes by upstream that change the symbols exported.

However, it does have some problems, some obvious, and some more subtle. Here's a list of the issues that I see:

  1. It feels like this symbols file is still likely to be fragile. While pkg-kde-tools detects a lot of template instances and marks them optional, I was still seeing a lot of "leaked" symbols from other C++ libraries that my package build-depends on, and which I suspect may also disappear. There's also the problem of optimizers eliminating some things that are inlined; for example, that appears to be the variation that my package sees on arm. And for a substantial C++ library, the symbols file is HUGE. Over 12,000 lines for one of my libraries. It's kind of scary to think of how many fragility landmines are buried under there.

  2. By the nature of this process, it's very sensitive to the current supported architectures. I expect any package with a symbols template built this way stands a good chance of FTBFS on a new architecture. For example, my symbols file has a lot of !armel !armhf patterns, which don't appear to be anything unique to arm except that the C++ compiler behaves slightly differently there due to inlining, and those are the only two architectures in the current set with that pattern.

  3. The lack of dpkg-gensymbols support for subst is very unfortunate, since it's used by the pkg-kde-tools utilities and it's the Right Thing To Do for symbols that take, for example, size_t as an argument. I would strongly encourage the dpkg maintainers to accept the work in #533916 except maybe for the vt work, since the alternative is to generate regexes that are significantly worse at capturing mistakes (since they're going to accept either int or long, rather than only whatever one corresponds to size_t, for example), or to embed a lot of specific architecture restrictions listing, for instance, all 64-bit architectures and guarantee FTBFS for every new architecture.

  4. It's a little hard on the infrastructure, since adding a symbols file requires uploading a package that's guaranteed to fail to build almost everywhere just so that you can get all the build logs to feed into the machinery. I always feel guilty about doing that.

  5. It's still not clear that the benefit is worth the amount of effort, since I expect most C++ libraries to require frequent SONAME changes anyway, which means that the long-term binary compatibility angle of symbols is probably futile. Mostly, all of this is to give you a tool to do ABI checking, which will catch some mistakes but not all of them. And I'm guessing it's going to be substantially more work than maintaining symbols for a C library if one actually does it properly and checks every missing symbol and change in new builds.

I'm going to try this for the Shibboleth packages for a while since I want to see what it's like across a new upstream release, but I'm still very much not convinced that this is a useful use of a packager's time.

Posted: 2012-01-27 21:39 — Why no comments?

Last spun 2022-02-06 from thread modified 2018-07-15