Formatting Reports with Template Toolkit
Randal L. Schwartz
Recently, a Usenet posting (yes, I still read that)
discussed using Perl to determine the disk block usage for various users on
the system. Apparently, the poster had noticed a discrepancy between the
number of blocks reported by du and the total obtained by summing the result of Perl's -s operator for each
file. I replied that you can't use -s, because it reports merely the length of the file in bytes, and
thanks to sparse Unix files, that length doesn't necessarily reflect the number
of blocks used. Also, a large file consumes indirect blocks, which the filesystem uses to locate the file's data blocks.
Because of sparse files and indirect blocks, the
accurate way to determine the actual cost of a file is to call stat and use the blocks
value (element index 12). As I was thinking about this, I thought
it'd be interesting to profile my own disk usage organized according
to logarithmic buckets. Once I had that working, I started tinkering some
more (as I often do) and wanted a nice HTML table output, organized by
user, of the disk blocks and file count for a given directory hierarchy.
And the result is shown in Listing 1.
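To make the difference between a file's length and its allocated blocks concrete, here is a minimal sketch (not the code from the listing). The helper name length_and_blocks is hypothetical, and the sketch assumes a Unix-like system where stat fills in the blocks field; the size of one block is system-specific, often but not always 512 bytes.

```perl
use strict;
use warnings;

# Hypothetical helper: return (length in bytes, allocated blocks) for a path.
# lstat is used so that a symlink is measured itself rather than its target.
sub length_and_blocks {
    my ($path) = @_;
    my @st = lstat $path;
    return unless @st;      # lstat failed: return an empty list
    return @st[7, 12];      # element 7 is the size, element 12 is blocks
}

# Report both values for every path named on the command line.
for my $path (@ARGV) {
    my ($length, $blocks) = length_and_blocks($path)
        or do { warn "cannot lstat $path: $!\n"; next };
    printf "%-30s length=%d blocks=%d\n", $path, $length, $blocks;
}
```

Run against a sparse file, the length can be far larger than the space implied by the block count; run against a large ordinary file, the block count can imply slightly more space than the length, because of the indirect blocks.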
Line 5 of the listing provides a constant to help me
scale a given block count into the relevant bucket. If I take the natural
log of the block count and divide it by the natural log of 2, then truncate
that downward to an integer, I'll get buckets for 1 block, 2-3
blocks, 4-7 blocks, 8-15 blocks, and so on, in nice powers of two. Rather
than keep dividing by the natural log of 2, I'll use the reciprocal
and multiply, which in hindsight is a completely unnoticed
micro-optimization.
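The bucket arithmetic described above can be sketched as follows. This is an illustration rather than the listing's code: the constant name LOG2_RECIP and the function bucket_for are hypothetical, the two adjustment loops are an added guard against floating-point rounding at exact powers of two, and returning undef for a count below 1 is an assumption made here because log(0) is a fatal error in Perl.

```perl
use strict;
use warnings;

# Reciprocal of the natural log of 2, so that scaling is a multiply.
# (Hypothetical name; the listing's constant may be called something else.)
use constant LOG2_RECIP => 1 / log(2);

# Map a block count to its power-of-two bucket:
# 1 => 0, 2-3 => 1, 4-7 => 2, 8-15 => 3, and so on.
sub bucket_for {
    my ($blocks) = @_;
    return undef if $blocks < 1;    # assumption: no bucket for zero blocks

    my $bucket = int(log($blocks) * LOG2_RECIP);

    # Added guard: nudge the result if rounding put it one bucket off.
    $bucket++ while 2 ** ($bucket + 1) <= $blocks;
    $bucket-- while 2 ** $bucket > $blocks;

    return $bucket;
}

# Show the range of block counts that each of the first few buckets covers.
for my $bucket (0 .. 4) {
    printf "bucket %d: %d to %d blocks\n",
        $bucket, 2 ** $bucket, 2 ** ($bucket + 1) - 1;
}
```

Bucket n therefore holds files using between 2**n and 2**(n+1) - 1 blocks, which matches the ranges given in the text.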