BACK UP OFTEN, BUT REORGANIZE ONLY WHEN NEEDED!
Other questions and answers in this section talk about
this too, but let's make it very clear.
After a CI or a CA split occurs, free space has been
inserted in the vicinity of the insert. If there is any
tendency for inserts to be clustered, then we just created
free space where additional inserts are likely to occur,
and so there will be no need for an additional CI or CA
split in that area for a while.
If you take the batch window processing time to reorganize
a file with a clustered insert pattern, then during the next
on-line processing interval, there is a higher probability of
additional CI and CA splits after the reorganization than there
would have been had you not done the reorganization!
If, instead of watching how many CI splits and CA splits
have occurred since the file was last loaded, you were to
watch how many new CI and CA splits occurred each day, you'd
find that the number of new splits declines over time if there
is such a clustered insert pattern.
Thus, when evaluating the free space distribution and
reorganization interval for a file with significant inserts,
you should look not just at the current total number of CI
and CA splits that have occurred since the file was last loaded,
but at the number of new splits that happen each day.
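
As a rough illustration of watching new splits rather than the
running total, here is a minimal Python sketch. It assumes you
record the cumulative CI (or CA) split count for a file once a
day, for example from the split statistics in a LISTCAT listing.
The function names and the sample numbers are made up for the
example; they are not output from any real file.

```python
def new_splits_per_day(cumulative):
    """Turn cumulative split counts (one reading per day) into the
    number of new splits between consecutive readings."""
    return [later - earlier
            for earlier, later in zip(cumulative, cumulative[1:])]


def is_tapering(daily):
    """True if each day's new-split count is no higher than the
    count for the day before."""
    return all(b <= a for a, b in zip(daily, daily[1:]))


# Hypothetical daily readings of the cumulative CI split count,
# taken at the same time each day since the last load.
ci_splits_total = [0, 40, 65, 80, 88, 92]

daily = new_splits_per_day(ci_splits_total)
print(daily)               # [40, 25, 15, 8, 4]
print(is_tapering(daily))  # True
```

In this invented series the total keeps climbing, which looks
alarming on its own, while the daily figure falls from 40 to 4.
That falling daily figure is the pattern to look for before
deciding that a reorganization is worth the batch window.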
It is the new splits that cause on-line and batch processing
delays -- not those that occurred before.
It is the Swami's opinion, based on a lot of experience,
that most insert patterns are at least somewhat clustered, so
this advice is more applicable than not.
In the typical case, there is some mixture of inserts -- and
so with an appropriate amount of CI and CA free space defined,
after a reorganization there will be few if any splits across
most of the file, but the clustered inserts will cause splits
where the extra free space is needed.
With no CI free space, a CI split will be required as soon
as the first insert is made to any CI in the file.
With no CA free space, as soon as that single CI split occurs
within a CA, a CA split will be needed. Ideally, you will leave
enough free space in each CI during loading of the file so that
inserts to that CI will be able to be accommodated without a split.
It may be that most CIs will never receive an insert, and so it
is quite appropriate to specify no CI free space. The same
rationale can be applied to the CA free space, but since CA splits
cost so much more than other processing, reserving a few percent of
the CIs in a CA can be justified.
Remember that some whole number of CIs within a CA will be left
empty during load if any CA free space is specified. (VSAM will
round the percentage you specify up to include a whole number of
CIs in each CA.) If you follow my advice elsewhere and use larger
CI sizes to improve sequential performance, you may find that the
amount of space reserved in free CIs during load is larger than
is needed for your file. As CI sizes increase, the number of CIs
in each CA decreases, and the amount of space reserved as free
space changes with it.
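
To make the rounding effect concrete, here is a small Python
sketch. It assumes the round-up rule exactly as described above
(check the IBM documentation for the precise rule on your
release), a whole-number CA free space percentage, and a
hypothetical CA size of 720 KB chosen only so the arithmetic
comes out even. The function name is invented for the example.

```python
def free_cis_per_ca(ca_free_pct, cis_per_ca):
    """Whole CIs left empty in each CA at load time.

    Assumption: the requested percentage is rounded UP to a whole
    number of CIs, as the text above describes.
    """
    if ca_free_pct == 0:
        return 0
    # Integer ceiling of (ca_free_pct / 100) * cis_per_ca.
    return (ca_free_pct * cis_per_ca + 99) // 100


CA_BYTES = 720 * 1024  # hypothetical CA size, for illustration only

for ci_bytes in (4096, 8192, 16384, 24576):
    cis = CA_BYTES // ci_bytes
    free = free_cis_per_ca(5, cis)
    print(f"CI size {ci_bytes:>5}: {cis:>3} CIs per CA, "
          f"{free} free CIs, {free * ci_bytes:>5} bytes reserved "
          f"({100 * free / cis:.1f}% of the CA)")
```

With 5 percent requested, this works out to 9 free CIs of 4096
bytes (exactly 5 percent) but 2 free CIs of 24576 bytes (about
6.7 percent), so the larger CI size reserves noticeably more
space per CA than the percentage alone suggests.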