(Balance AFS servers using AMPL)
afs-balance [-hpsv] [-f fudge] server partition [server partition ...]
afs-balance -r [solution]
afs-balance requires the Stanford::LSDB::AFSDB Perl module (or some editing of the script to substitute some other way of connecting to the database), AMPL with the CPLEX solver to solve the linear programming problem, partinfo to obtain partition capacity information, and mvto to implement the solution.
Drawing data about AFS volumes from the AFS MySQL database, afs-balance phrases an AFS server balancing problem as an AMPL problem so that it can be solved with linear programming tools.
afs-balance takes a list of server and partition pairs on the command line and, for each volume on those servers and partitions, gathers the size, quota, and accesses in the past day from the information loaded the previous night into the AFS database. It also determines the total capacity available across all of those server partitions and then defines target goals for balancing, specifying that each partition must contain volumes that add up to at least 97.5% of its fair share of the total accesses and total size. (Quota is not used for balancing.) This 97.5% fudge factor can be adjusted with the -f command-line argument.
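The target calculation described above can be illustrated with a small sketch. It assumes that a partition's "fair share" is its capacity divided by the total capacity of all participating partitions; the function name, argument names, and sample numbers are hypothetical and are not taken from afs-balance itself.

```python
def balance_targets(capacities, total_size, total_accesses, fudge=0.975):
    """Return {location: (min_size, min_accesses)} for each partition.

    capacities maps a (server, partition) pair to its capacity.  Only the
    ratios between capacities matter, so any consistent unit works.
    Assumption: fair share is proportional to capacity.
    """
    total_capacity = sum(capacities.values())
    targets = {}
    for location, capacity in capacities.items():
        share = capacity / total_capacity
        # Each location must hold at least fudge * share of each total.
        targets[location] = (fudge * share * total_size,
                             fudge * share * total_accesses)
    return targets


# Hypothetical example: one partition three times the size of the other.
capacities = {("afssvr2", "a"): 100, ("afssvr5", "a"): 300}
targets = balance_targets(capacities, total_size=200, total_accesses=4000)
```

Lowering the fudge argument loosens both minimums at once, which matches the behavior of -f described above.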
AFS servers may be specified as a number, in which case afssvr will
be implicitly added to the beginning of the number to form the system
name. The partition may be a single partition, a partition range
expressed as something like a-k, or "." to indicate all
partitions on that server.
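As an illustration of these argument conventions, here is a minimal sketch of how server and partition pairs might be expanded. The function names are hypothetical, the real script is written in Perl and may behave differently, and this sketch only handles single-letter partition names.

```python
import string


def expand_server(name):
    """Prefix a bare number with "afssvr"; leave other names untouched."""
    return f"afssvr{name}" if name.isdigit() else name


def expand_partitions(spec):
    """Return a list of partition letters, or None when "." means all."""
    if spec == ".":
        return None
    if "-" in spec:
        # A range such as a-k expands to every letter from a through k.
        first, last = spec.split("-", 1)
        letters = string.ascii_lowercase
        return list(letters[letters.index(first):letters.index(last) + 1])
    return [spec]


def parse_pairs(args):
    """Group command-line arguments into (server, partitions) pairs."""
    if len(args) % 2 != 0:
        raise ValueError("servers and partitions must come in pairs")
    return [(expand_server(args[i]), expand_partitions(args[i + 1]))
            for i in range(0, len(args), 2)]


pairs = parse_pairs(["2", ".", "5", "a-c", "6", "a"])
```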
All of this data is written out into a file whose name starts with
vols. and ends with .dat and contains an abbreviated list of
the servers and partitions affected in the name. An AMPL control file,
referencing one of the AMPL models used to specify the constraints, is
then written out with the same file name but ending in .in. AMPL
should then be run and given as input the .in file, saving its
output to a .out file.
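The naming scheme can be sketched as follows. The rule used here (join each server with its partition specification using "+", and omit the partition when it is ".") is inferred only from the file names shown in the EXAMPLES section, so treat it as an assumption rather than the script's actual logic. The function name is hypothetical.

```python
def base_name(pairs):
    """Build the shared prefix of the .dat, .in, and .out file names.

    pairs is a list of (server, partition) strings exactly as given on
    the command line, for example ("5", "a-c").
    """
    parts = []
    for server, partition in pairs:
        # "." (all partitions) appears to add nothing to the name.
        parts.append(server if partition == "." else server + partition)
    return "vols." + "+".join(parts)


name = base_name([("2", "."), ("5", "a-c"), ("6", "a")])
data_file = name + ".dat"
control_file = name + ".in"
```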
Finally, once AMPL generates a .out file, afs-balance should
be run on the .out file with the -r flag, saving its output
in a separate file. This mode will parse the AMPL output and generate a
list of volumes that have to be moved, where they should be moved from,
and where they should be moved to. That output can then be used as input
to mvto -s -L.
See the EXAMPLES section for a detailed example.
By default, afs-balance balances servers so that each partition has a number of volumes proportionate to its share of the total space, satisfies the size balance, and satisfies the accesses balance. This can take a very long time when attempting to balance across many partitions. In this situation, you may want to instead do the balance in two parts using -s and then -p as described below.
If afs-balance is run with the -s option, rather than balancing at the level of partitions, it will balance at the level of servers. Restrictions on which partitions on each server are involved in the balance will still be honored, but the AMPL solution will only indicate what server the volume should be moved to rather than server and partition, and will only attempt to equalize accesses, volumes, and space usage between the servers. The resulting output file, after processing with -r, will tell mvto to put the volume on the least full partition on the destination server.
If afs-balance is run with the -p option, the balancing will be done at the partition level but accesses will be ignored. Balancing accesses across partitions on a single server isn't particularly useful, and this will allow the balance solution to be found far faster.
The combination of these two options can balance much larger cells. First, balance between some set of servers using -s and implement that solution. Then, wait for the AFS database to update its data and balance between partitions using -p on each of the individual servers involved. It will take longer and will mean more total volume moves, but will arrive at the same quality of solution as doing a full balance.
-f fudge
The ratio by which to reduce the size and access targets for each location in the balance. The default is 0.975. Note that accesses can vary widely and often some AFS volumes have significantly more accesses than others, so setting this value too high can cause AMPL to fail to find a solution.
-h
Print out this documentation (which is done simply by feeding the script
to perldoc -t).
-p
Balance without regard to accesses, which is the mode that one should use when balancing across the partitions of a single server. The data file is the same, but a different model will be used that doesn't include a constraint on accesses.
-r [solution]
Rather than generate the AMPL input files, process an AMPL output file and produce a list of volumes, source locations, and destination locations reflecting the volume moves that AMPL considered necessary. This output will go to standard output (although generally you want to redirect it to a file) and will contain one line for each volume, formatted as:
<volume> <server> <partition> <server> <partition>
where the first server/partition pair is where the volume is now, and the second pair is where it should be moved to.
This output is suitable as input to mvto -s -L.
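For reviewing a move list before handing it to mvto, a small parser along these lines may help. It assumes only the five whitespace-separated fields described above; the class name, function name, and sample volume names are hypothetical.

```python
from typing import NamedTuple


class Move(NamedTuple):
    volume: str
    from_server: str
    from_partition: str
    to_server: str
    to_partition: str


def parse_moves(text):
    """Parse move-list text into Move records, skipping blank lines."""
    moves = []
    for number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue
        if len(fields) != 5:
            raise ValueError(
                f"line {number}: expected 5 fields, got {len(fields)}")
        moves.append(Move(*fields))
    return moves


# Hypothetical sample in the documented format.
sample = "user.jdoe afssvr2 a afssvr5 c\n\nproj.web afssvr5 b afssvr6 a\n"
moves = parse_moves(sample)
incoming = [m.volume for m in moves if m.to_server == "afssvr6"]
```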
-s
Balance by server, ignoring partitions. When these AMPL results are
processed by afs-balance with the -r flag, the resulting
destination information will specify the partition as ".". (This
means that if not all partitions of some servers are valid destinations
for this balance, you will have to modify the output file before running
mvto.)
-v
Print the version of afs-balance and exit.
To do a full balance of all partitions on afssvr2, partitions /vicepa through /vicepc on afssvr5, and partition /vicepa on afssvr6, one would run the command:
afs-balance 2 . 5 a-c 6 a
afs-balance will write out two files:
vols.2+5a-c+6a.dat
vols.2+5a-c+6a.in
To calculate the balancing solution, run:
ampl < vols.2+5a-c+6a.in > vols.2+5a-c+6a.out
Once ampl finishes, review the .out file to make sure that
AMPL found a valid solution. It starts by printing out data about the
initial server state and then prints out status information about whether
it found a solution.
If it did find a valid solution, run:
afs-balance -r vols.2+5a-c+6a.out > vols.2+5a-c+6a.list
Review that list to make sure it looks reasonable and then apply the balancing solution with:
mvto -s -L vols.2+5a-c+6a.list
If, instead, you wished to balance all partitions between afssvr1, afssvr2, afssvr3, and partitions /vicepa through /vicepk on afssvr4, this may be too large a problem for AMPL to solve in a reasonable amount of time. Instead, you can use the two-phase approach. Start by balancing between just the servers:
afs-balance -s 1 . 2 . 3 . 4 a-k
ampl < vols.1+2+3+4a-k.in > vols.1+2+3+4a-k.out
Make sure that AMPL got a solution and then run:
afs-balance -r vols.1+2+3+4a-k.out > vols.1+2+3+4a-k.list
Normally, you could just apply this list with mvto, but note that
only partitions /vicepa through /vicepk are participating in the balance
on afssvr4 (the other partitions might be used for something else, like
read-only replicas). afs-balance is not currently smart enough to
figure that out, so you need to modify the output list to change any
occurrence of afssvr4 . to afssvr4 a-k. Once you've done
that, run:
mvto -s -L vols.1+2+3+4a-k.list
to apply the server balance.
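The manual edit of the list described above, made before running mvto, could be scripted roughly as follows. This is only a sketch: it assumes the five-field list format documented for -r, rewrites only the destination fields, and uses a hypothetical function name and hypothetical volume names.

```python
def restrict_destination(lines, server, partitions):
    """Replace a "." destination partition on one server with a range.

    Only the destination pair (fields four and five) is changed; the
    source location of each volume is left alone.
    """
    fixed = []
    for line in lines:
        fields = line.split()
        if len(fields) == 5 and fields[3] == server and fields[4] == ".":
            fields[4] = partitions
            line = " ".join(fields)
        fixed.append(line)
    return fixed


# Hypothetical list entries produced by a server-level balance.
original = [
    "user.jdoe afssvr1 a afssvr4 .",
    "proj.web afssvr4 b afssvr2 .",
]
fixed = restrict_destination(original, "afssvr4", "a-k")
```

Here only the first entry changes, because the second one moves a volume away from afssvr4 rather than onto it.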
Now, wait for a day for the nightly refresh of the AFS database to pick up the new volume locations, and then run individual server balances for each of the affected servers:
afs-balance -p 1 .
ampl < vols.1.in > vols.1.out
afs-balance -r vols.1.out > vols.1.list
mvto -s -L vols.1.list
(checking the AMPL output first before applying the results), repeating this for each of the affected servers.
balance.mod and balance-size.mod are the AMPL models used for balancing, which express the variables and constraints involved in the AMPL program. balance-size.mod is identical to balance.mod but doesn't include the constraint on accesses (it's used when afs-balance is invoked with the -p option).
Currently, afs-balance always requests the CPLEX solver in its AMPL input file. This is the solver that we use, and it works well for solving this type of balancing problem. We have not experimented with using any other solver, but it's easy enough to change the script to specify a different one.
Sometimes, AMPL will be unable to come up with any solution and will bail out saying that the problem isn't feasible, meaning that there is no possible arrangement of the volumes that satisfies the constraints. Usually this is due to one volume with an extremely high access count in the past day, such that there is no combination of other volumes that can balance that one out and give partitions roughly equal total access counts. This can also be caused by a single particularly large volume.
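One way to spot such a volume before running AMPL is a simple necessary-condition check, sketched below. It is derived only from the minimum-target rule described earlier (each location must hold at least the fudge factor times its fair share) and is not part of afs-balance; the real AMPL model may contain further constraints. If a volume sits on the location with the largest share, every other location must still reach its minimum from what is left, so any volume larger than total * (1 - fudge * (1 - largest share)) rules out a solution. Passing this check does not guarantee that a solution exists. All names and numbers are hypothetical.

```python
def volumes_forcing_infeasibility(values, shares, fudge=0.975):
    """Return the volumes that by themselves rule out any solution.

    values maps a volume name to its access count (or size); shares is
    the list of fair-share fractions of the locations, summing to 1.
    """
    total = sum(values.values())
    # Best case: the volume lands on the location with the largest
    # share, leaving the rest of the total to satisfy all the others.
    limit = total * (1 - fudge * (1 - max(shares)))
    return sorted(name for name, value in values.items() if value > limit)


# Hypothetical access counts across four equally sized partitions.
accesses = {"user.a": 100, "user.b": 100, "user.c": 100, "proj.hot": 700}
offenders = volumes_forcing_infeasibility(accesses, [0.25] * 4)
```

With four equal partitions and the default fudge factor, the limit works out to just under 27% of the total, so a single volume responsible for 70% of all accesses is flagged.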
When this happens, there are a few things that you can do:
Wait for the next day and try the balance again. If the access spike is anomalous, this often works.
Reduce the fudge parameter using the -f option. This tells AMPL to tolerate more deviation in the sizes and accesses of the locations across which it is balancing. AMPL will still equalize the number of volumes, but sizes and accesses won't be balanced as evenly. Note that this parameter affects both size and access count; there isn't a way to affect just one and not the other.
Manually edit the resulting .dat file and reduce the threshold
requirements for either size or accesses, whichever is causing problems
(you can usually tell by scanning the volume information and looking for
particular volumes with anomalously large values).
Lie to AMPL by editing the resulting .dat file and reducing the
access count or size of the offending volume. AMPL will then construct a
solution based on the value that you told it. This is safe to do for
accesses; be careful with doing this for size and don't overfill a
partition. If you do this, you'll also need to reduce all the threshold
values for that statistic accordingly. Otherwise, because you've
reduced the total access count (or total size), there won't be enough to
spread around to meet the thresholds required for each partition.
Unfortunately, none of the ways of working around AMPL's failure to find a solution are automated. They all require a bit of manual fiddling and manual investigation.
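For the last workaround, the arithmetic for adjusting the thresholds might look like the sketch below. It assumes that reducing the thresholds "accordingly" means scaling each one by the ratio of the new total to the old total. It works on plain numbers rather than on the .dat file, whose format is not described here, and the function name and figures are hypothetical.

```python
def rescale_thresholds(thresholds, old_total, new_total):
    """Scale every threshold by the same factor as the total changed."""
    factor = new_total / old_total
    return {location: value * factor
            for location, value in thresholds.items()}


# Hypothetical case: one volume's access count is edited from 1500 down
# to 500, which lowers the total accesses from 4000 to 3000.
thresholds = {("afssvr2", "a"): 975.0, ("afssvr5", "a"): 2925.0}
adjusted = rescale_thresholds(thresholds, old_total=4000, new_total=3000)
```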
afs-balance currently cannot handle balancing read-only replicas properly. It can prepare the AMPL problem, although the way it determines what type of volume to balance is by checking to see if there are any read/write volumes on the affected partitions and balancing only read/write volumes if there are, which may not be what is desired. It cannot, however, produce a list for mvto from the result, and the list that it does produce will cause mvto to do the wrong thing.
Using data from a SQL database of AFS volume information is required. There is no way of doing a balance from vos listvol output, even though enough information would be available to do so. (You can always just load the vos listvol output into a database, but then you have to modify this script since the table names are hard-coded.)
As mentioned above, -r doesn't correctly handle the case where only some of a server's partitions are participating in a balance run with -s. In particular, it will output instructions that will cause mvto to pick the least loaded partition across the entire server, not limited to just the participating partitions. The output has to be modified before using mvto.
The partinfo requirement isn't strictly necessary and is only nice for the pretty output when preparing the balance. afs-balance should cope if partinfo isn't available.
Written by Neil Crellin <neilc@wallaby.cc> and Russ Allbery <rra@stanford.edu>, based on an idea and an AMPL model by Neil Crellin.
Copyright 1998, 1999, 2005 Board of Trustees, Leland Stanford Jr. University.
This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.
mvto(1), partinfo(1)
The AMPL and CPLEX implementation that we've always used for balancing is the one from <http://www.ilog.com/>. This is commercial software that has to be purchased. It may be possible to use this program with a free version of AMPL and a free solver, but we have not investigated doing so.
mvto is available from <http://www.eyrie.org/~eagle/software/mvto/>. partinfo is available from <http://www.eyrie.org/~eagle/software/partinfo/>.
The current version of this program is available from its web page at <http://www.eyrie.org/~eagle/software/afs-balance/>.