afs-balance

(Balance AFS servers using AMPL)

SYNOPSIS

afs-balance [-hpsv] [-f fudge] server partition [server partition ...]

afs-balance -r [solution]

REQUIREMENTS

The Stanford::LSDB::AFSDB Perl module (or some editing of the script to substitute in some other way of connecting to the database), AMPL with the CPLEX solver to solve the linear programming problem, partinfo to obtain partition capacity information, and mvto to implement the solution.

DESCRIPTION

Drawing data about AFS volumes from the AFS MySQL database, afs-balance phrases an AFS server balancing problem as an AMPL problem so that it can be solved with linear programming tools.

afs-balance takes a list of server and partition pairs on the command line and, for each volume on those servers and partitions, gathers the size, quota, and accesses in the past day from the information loaded the previous night into the AFS database. It also determines the total capacity available across all of those server partitions and then defines target goals for balancing, specifying that each partition must contain volumes that add up to at least 97.5% of its fair share of the total accesses and total size. (Quota is not used for balancing.) This 97.5% fudge factor can be adjusted with the -f command-line argument.
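The target calculation described above can be illustrated with a short sketch. It assumes that a partition's "fair share" is proportional to its capacity, which the text suggests but does not spell out. The function name, partition labels, and numbers are hypothetical and are not taken from afs-balance itself.

```python
def balance_targets(capacities, total_size, total_accesses, fudge=0.975):
    """Return {partition: (size_target, access_target)}.

    Assumption: a partition's fair share of the totals is proportional
    to its capacity. The fudge factor scales both targets down.
    """
    total_capacity = sum(capacities.values())
    targets = {}
    for partition, capacity in capacities.items():
        share = capacity / total_capacity
        targets[partition] = (
            fudge * share * total_size,      # minimum total volume size
            fudge * share * total_accesses,  # minimum total accesses
        )
    return targets


# Hypothetical capacities, in arbitrary units.
capacities = {"afssvr2/a": 100, "afssvr5/a": 100, "afssvr6/a": 200}
targets = balance_targets(capacities, total_size=300, total_accesses=4000)
for partition, (size, accesses) in targets.items():
    print(partition, size, accesses)
```

Because every target is multiplied by the same fudge factor, the targets sum to less than the totals, which is the slack the solver needs to find an arrangement.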

AFS servers may be specified as a number, in which case afssvr will be implicitly prepended to the number to form the system name. The partition may be a single partition, a partition range expressed as something like a-k, or a single period (.), which indicates all partitions on that server.
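As a rough illustration of these argument rules, the sketch below expands a command-line argument list into server names and partition lists. It is not the script's own parsing code. The helper names are hypothetical, and it assumes single-letter partition names and represents "all partitions" as None.

```python
def expand_server(name):
    # A bare number gets the afssvr prefix; anything else is used as given.
    return "afssvr" + name if name.isdigit() else name


def expand_partitions(spec):
    if spec == ".":
        return None  # all partitions on that server
    if len(spec) == 3 and spec[1] == "-":
        # A range such as a-k (assumes single-letter partition names).
        return [chr(c) for c in range(ord(spec[0]), ord(spec[2]) + 1)]
    return [spec]


def parse_pairs(args):
    if len(args) % 2 != 0:
        raise ValueError("arguments must be server/partition pairs")
    servers, partitions = args[0::2], args[1::2]
    return [(expand_server(s), expand_partitions(p))
            for s, p in zip(servers, partitions)]


pairs = parse_pairs("2 . 5 a-c 6 a".split())
print(pairs)
```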

All of this data is written out into a file whose name starts with vols. and ends with .dat and contains an abbreviated list of the servers and partitions affected in the name. An AMPL control file, referencing one of the AMPL models used to specify the constraints, is then written out with the same file name but ending in .in. AMPL should then be run and given as input the .in file, saving its output to a .out file.
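The naming pattern can be inferred from the file names shown in the EXAMPLES section. The sketch below reproduces those names. It is a guess based only on those examples, not on the script, and the function name is hypothetical. How non-numeric server names are abbreviated is not covered.

```python
def balance_label(args):
    """Build the middle part of the vols.<label>.dat file name.

    Inferred rule: join each server number and its partition spec,
    dropping the spec when it is "." (all partitions), with "+".
    """
    servers, partitions = args[0::2], args[1::2]
    return "+".join(s + ("" if p == "." else p)
                    for s, p in zip(servers, partitions))


label = balance_label("2 . 5 a-c 6 a".split())
print("vols." + label + ".dat")
print("vols." + label + ".in")
```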

Finally, once AMPL generates a .out file, afs-balance should be run on the .out file with the -r flag, saving its output in a separate file. This mode will parse the AMPL output and generate a list of volumes that have to be moved, where they should be moved from, and where they should be moved to. That output can then be used as input to mvto -s -L.

See the EXAMPLES section for a detailed example.

By default, afs-balance balances servers so that each partition has a number of volumes proportionate to its share of the total space, satisfies the size balance, and satisfies the accesses balance. This can take a very long time when attempting to balance across many partitions. In this situation, you may want to instead do the balance in two parts using -s and then -p as described below.

If afs-balance is run with the -s option, rather than balancing at the level of partitions, it will balance at the level of servers. Restrictions on which partitions on each server are involved in the balance will still be honored, but the AMPL solution will only indicate which server the volume should be moved to rather than server and partition, and will only attempt to equalize accesses, volumes, and space usage between the servers. The resulting output file, after processing with -r, will tell mvto to put the volume on the least full partition on the destination server.

If afs-balance is run with the -p option, the balancing will be done at the partition level but accesses will be ignored. Balancing accesses across partitions on a single server isn't particularly useful, and this will allow the balance solution to be found far faster.

The combination of these two options can balance much larger cells. First, balance between some set of servers using -s and implement that solution. Then, wait for the AFS database to update its data and balance between partitions using -p on each of the individual servers involved. It will take longer and will mean more total volume moves, but will arrive at the same quality of solution as doing a full balance.

OPTIONS

-f fudge, --fudge=fudge

The ratio by which to reduce the size and access targets for each location in the balance. The default is 0.975. Note that accesses can vary widely and often some AFS volumes have significantly more accesses than others, so setting this value too high can cause AMPL to fail to find a solution.

-h, --help

Print out this documentation (which is done simply by feeding the script to perldoc -t).

-p, --partition

Balance without regard to accesses, which is the mode that one should use when balancing across the partitions of a single server. The data file is the same, but a different model will be used that doesn't include a constraint on accesses.

-r, --results

Rather than generate the AMPL input files, process an AMPL output file and produce a list of volumes, source locations, and destination locations reflecting the volume moves that AMPL considered necessary. This output will go to standard output (although generally you want to redirect it to a file) and will contain one line for each volume, formatted as:

    <volume> <server> <partition> <server> <partition>

where the first server/partition pair is where the volume is now, and the second pair is where it should be moved to.

This output is suitable as input to mvto -s -L.
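If you want to inspect or summarize a move list before handing it to mvto, the five-field format is easy to read programmatically. The following is a minimal sketch. The volume names are made up, and the exact spelling of the partition field (for example a versus /vicepa) is an assumption.

```python
from collections import Counter
from typing import NamedTuple


class Move(NamedTuple):
    volume: str
    src_server: str
    src_partition: str
    dst_server: str
    dst_partition: str


def parse_moves(text):
    """Parse lines of '<volume> <server> <partition> <server> <partition>'."""
    moves = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 5:
            raise ValueError("unexpected line: " + line)
        moves.append(Move(*fields))
    return moves


# Hypothetical sample of -r output.
sample = """\
user.jdoe afssvr2 a afssvr5 b
group.web afssvr5 c afssvr6 a
user.asmith afssvr2 b afssvr5 b
"""
moves = parse_moves(sample)
per_destination = Counter(m.dst_server for m in moves)
print(per_destination)
```

Counting moves per destination server is one quick way to check that a list looks reasonable before applying it.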

-s, --server

Balance by server, ignoring partitions. When these AMPL results are processed by afs-balance with the -r flag, the resulting destination information will specify the partition as a single period (.). (This means that if not all partitions of some servers are valid destinations for this balance, you will have to modify the output file before running mvto.)

-v, --version

Print the version of afs-balance and exit.

EXAMPLES

To do a full balance of all partitions on afssvr2, partitions /vicepa through /vicepc on afssvr5, and partition /vicepa on afssvr6, one would run the command:

    afs-balance 2 . 5 a-c 6 a

afs-balance will write out two files:

    vols.2+5a-c+6a.dat
    vols.2+5a-c+6a.in

To calculate the balancing solution, run:

    ampl < vols.2+5a-c+6a.in > vols.2+5a-c+6a.out

Once ampl finishes, review the .out file to make sure that AMPL found a valid solution. It starts by printing out data about the initial server state and then prints out status information about whether it found a solution.

If it did find a valid solution, run:

    afs-balance -r vols.2+5a-c+6a.out > vols.2+5a-c+6a.list

Review that list to make sure it looks reasonable and then apply the balancing solution with:

    mvto -s -L vols.2+5a-c+6a.list

If, instead, you wished to balance all partitions between afssvr1, afssvr2, afssvr3, and partitions /vicepa through /vicepk on afssvr4, this may be too large a problem for AMPL to solve in a reasonable amount of time. Instead, you can use the two-phase approach. Start by balancing between just the servers:

    afs-balance -s 1 . 2 . 3 . 4 a-k
    ampl < vols.1+2+3+4a-k.in > vols.1+2+3+4a-k.out

Make sure that AMPL got a solution and then run:

    afs-balance -r vols.1+2+3+4a-k.out > vols.1+2+3+4a-k.list

Normally, you could just apply this list with mvto, but note that only partitions /vicepa through /vicepk are participating in the balance on afssvr4 (the other partitions might be used for something else, like read-only replicas). afs-balance is not currently smart enough to figure that out, so you need to modify the output list to change any occurrence of afssvr4 . to afssvr4 a-k. Once you've done that, run:

    mvto -s -L vols.1+2+3+4a-k.list

to apply the server balance.
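The manual edit of the list described above (changing afssvr4 . to afssvr4 a-k) could also be scripted. The sketch below is one hedged way to do it. It rewrites only the destination fields, assumes the five-field line format produced by -r, and uses a hypothetical function name and made-up volume names.

```python
def restrict_destination(lines, server, partitions):
    """Replace a "." destination partition on one server with a range.

    Lines that are changed are re-joined with single spaces. All other
    lines are passed through untouched.
    """
    result = []
    for line in lines:
        fields = line.split()
        if len(fields) == 5 and fields[3] == server and fields[4] == ".":
            fields[4] = partitions
            line = " ".join(fields)
        result.append(line)
    return result


# Hypothetical contents of a list file from a -s balance.
moves = [
    "user.a afssvr1 b afssvr4 .",
    "user.b afssvr4 c afssvr2 .",
]
fixed = restrict_destination(moves, "afssvr4", "a-k")
print("\n".join(fixed))
```

Only the first line changes here, because the second line has afssvr4 as its source rather than its destination.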

Now, wait for a day for the nightly refresh of the AFS database to pick up the new volume locations, and then run individual server balances for each of the affected servers:

    afs-balance -p 1 .
    ampl < vols.1.in > vols.1.out
    afs-balance -r vols.1.out > vols.1.list
    mvto -s -L vols.1.list

(checking the AMPL output first before applying the results), repeating this for each of the affected servers.

FILES

/afs/ir/service/afs/data/balance.mod
/afs/ir/service/afs/data/balance-size.mod

The AMPL models used for balancing, which express the variables and constraints involved in the AMPL program. balance-size.mod is identical to balance.mod but doesn't include the constraint on accesses (it's used when afs-balance is invoked with the -p option).

NOTES

Currently, afs-balance always requests the CPLEX solver in its AMPL input file. This is the solver that we use, and it works well for solving this type of balancing problem. We have not experimented with using any other solver, but it's easy enough to change the script to specify a different one.

Sometimes, AMPL will be unable to come up with any solution and will bail out saying that the problem isn't feasible, meaning that there is no possible arrangement of the volumes that satisfies the constraints. Usually this is due to one volume with an extremely high access count in the past day, such that there is no combination of other volumes that can balance that one out and give partitions roughly equal total access counts. This can also be caused by a single particularly large volume.

When this happens, there are a few things that you can do:

  1. Wait for the next day and try the balance again. If the access spike is anomalous, this often works.

  2. Reduce the fudge parameter using the -f option. This tells AMPL to tolerate more deviation in the sizes and accesses of the locations across which it is balancing. AMPL will still equalize the number of volumes, but the solution won't be as nice in equal size and accesses. Note that this parameter affects both size and access count; there isn't a way to affect just one and not the other.

  3. Manually edit the resulting .dat file and reduce the threshold requirements for either size or accesses, whichever is causing problems (you can usually tell by scanning the volume information and looking for particular volumes with anomalously large values).

  4. Lie to AMPL by editing the resulting .dat file and reducing the access count or size of the offending volume. AMPL will then construct a solution based on the value that you told it. This is safe to do for accesses; be careful when doing this for size so that you don't overfill a partition. If you do this, you'll also need to reduce all the threshold values for that statistic accordingly; otherwise, because you've reduced the total access count (or total size), there won't be enough to spread around to meet the thresholds required for each partition.

Unfortunately, none of the ways of working around AMPL's failure to find a solution are automated. They all require a bit of manual fiddling and manual investigation.
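To see why a single access spike makes the problem infeasible, and why lowering the fudge parameter can help, consider the following back-of-the-envelope check. It is not part of afs-balance. It assumes each location must reach at least fudge times its share of total accesses, and it tests only a necessary condition: the hot volume lands in one location, so the remaining volumes must still cover the thresholds of all the other locations. Passing the check does not guarantee that AMPL will find a solution. The function name and numbers are hypothetical.

```python
def max_single_volume_accesses(total_accesses, shares, fudge=0.975):
    """Largest access count one volume can have before the problem
    is certainly infeasible.

    shares: each location's fraction of the total (should sum to 1).
    """
    thresholds = [fudge * share * total_accesses for share in shares]
    # Best case: the hot volume sits in the location with the largest
    # threshold, and everything else covers the remaining thresholds.
    return total_accesses - sum(thresholds) + max(thresholds)


accesses = [900, 30, 40, 30]  # hypothetical per-volume access counts
total = sum(accesses)
shares = [0.5, 0.5]           # two equally sized locations

strict_limit = max_single_volume_accesses(total, shares, fudge=0.975)
loose_limit = max_single_volume_accesses(total, shares, fudge=0.1)
print(max(accesses) <= strict_limit)  # spike exceeds the limit
print(max(accesses) <= loose_limit)   # a lower fudge leaves room
```

With the default fudge, the 900-access volume leaves only 100 accesses for the other location, far short of its threshold. Reducing fudge lowers every threshold, which is what the -f workaround relies on.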

BUGS

afs-balance currently cannot handle balancing read-only replicas properly. It can prepare the AMPL problem, although the way it determines what type of volume to balance is by checking to see if there are any read/write volumes on the affected partitions and balancing only read/write volumes if there are, which may not be what is desired. It cannot, however, produce a list for mvto from the result, and the list that it does produce will cause mvto to do the wrong thing.

Using data from a SQL database of AFS volume information is required. There is no way of doing a balance from vos listvol output, even though enough information would be available to do so. (You can always just load the vos listvol output into a database, but then you have to modify this script since the table names are hard-coded.)

As mentioned above, -r doesn't correctly handle the case where only some of a server's partitions are participating in a balance run with -s. In particular, it will output instructions that will cause mvto to pick the least loaded partition across the entire server, not limited to just the participating partitions. The output has to be modified before using mvto.

The partinfo requirement isn't strictly necessary and is only nice for the pretty output when preparing the balance. afs-balance should cope if partinfo isn't available.

AUTHORS

Written by Neil Crellin <neilc@wallaby.cc> and Russ Allbery <rra@stanford.edu>, based on an idea and an AMPL model by Neil Crellin.

COPYRIGHT AND LICENSE

Copyright 1998, 1999, 2005, 2009 Board of Trustees, Leland Stanford Jr. University.

This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

SEE ALSO

mvto(1), partinfo(1)

The AMPL and CPLEX implementation that we've always used for balancing is the one from <http://www-03.ibm.com/software/products/en/ibmilogcpleoptistud/>. This is commercial software that has to be purchased. It may be possible to use this program with a free version of AMPL and a free solver, but we have not investigated doing so.

mvto and partinfo are available as part of afs-admin-tools at <http://www.eyrie.org/~eagle/software/afs-admin-tools/>.

The current version of this program is available from its web page at <http://www.eyrie.org/~eagle/software/afs-balance/>.

Last spun 2014-08-10 from POD modified 2014-03-22