Capacity planning is a very similar but easier exercise, since you already have most of the information in hand. Using the information gathered while interviewing the data owners, you should be able to extrapolate an estimate of how fast the data will grow. Most database administrators I have talked to have a very good idea of the percentage of growth within their databases.
Make sure you ask these questions during your initial interviews with them so that this part of your capacity planning goes as smoothly as possible. Here are a few sample questions to ask data owners:
- How large is your data/database?
- What percentage of change happens to your data daily?
- Is there a particular point when the data changes more than usual?
- How much of that change is actual data growth?
- Can you anticipate an annual percentage of growth?
- What are your recovery expectations?
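Once a data owner gives you a current size and an anticipated annual percentage of growth, the extrapolation can be sketched in a few lines. This is a minimal sketch, not a sizing method from the text: it assumes growth compounds once per year, and the function name `project_size` and the 500 GB / 20 percent figures are made up for illustration.

```python
def project_size(current_gb: float, annual_growth_pct: float, years: int) -> list[float]:
    """Project the data size at the end of each year.

    Assumption: growth compounds annually at a constant rate.
    Real data may grow in bursts, so treat the result as a rough guide.
    """
    rate = annual_growth_pct / 100.0
    return [current_gb * (1 + rate) ** year for year in range(1, years + 1)]


# Hypothetical answers from an interview: 500 GB today, 20 percent growth per year.
sizes = project_size(current_gb=500.0, annual_growth_pct=20.0, years=3)
for year, size in enumerate(sizes, start=1):
    print(f"Year {year}: about {size:.0f} GB")
```

If the data owner can only give a range, running the projection for the low and high ends gives a bracket to plan within.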
The most important part of capacity planning is determining where your data plateaus under the backup schedules and retention periods you have chosen. It is the plateau that allows you to properly size your environment. Here I will present the formulae used to calculate the backup storage media required for the server, Mammoth, based on the maximum retention level of 28 days. The percentage of change comes either from the initial interview with the data owners, who may know the estimated percentage of change, or from a rough estimate. For the sake of example, I am going to use 10 percent as our rate of change. While this rate may seem high or low for your environment, it makes the examples much easier to visualize. If you are using VERITAS NetBackup, you may use its File System Analyzer tool, which may give you a more accurate view.
Let’s take some time to understand the backup models. The top portion is a graph with the days (1-33) along the x-axis, the amount of data backed up (D) along the positive y-axis, and the data changed (pD) along the negative y-axis. An arrow in the positive direction on the y-axis represents a backup that has been run, while an arrow in the negative direction represents changed data (pD, where p is the rate of change and D is the amount of data). Notice the Ff (full frequency) between #1 and #2; this represents the number of days between scheduled full backup jobs. Also note that none of the changed data (pD) is being backed up.
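To make the plateau idea concrete, the full-backup-only model described above can be simulated day by day. This is a rough sketch under stated assumptions, not the formulae for Mammoth: it assumes a full backup of size D on day 1 and every Ff days after that, that each image is kept for exactly the retention period and then expires, and that D itself does not grow. The values D = 100 GB and Ff = 7 days are hypothetical; only the 28-day retention and the 33-day horizon come from the text. The rate of change p does not appear because, in this model, changed data is not backed up between fulls.

```python
def retained_storage(data_gb: float, full_every_days: int,
                     retention_days: int, horizon_days: int) -> list[float]:
    """Return the backup storage held at the end of each day.

    Assumptions (illustrative only):
      - a full backup of `data_gb` runs on day 1 and every `full_every_days` after,
      - an image taken on day b is kept through day b + retention_days - 1,
      - the amount of data per full backup stays constant.
    """
    backup_days = range(1, horizon_days + 1, full_every_days)
    usage = []
    for day in range(1, horizon_days + 1):
        # Images that already exist and have not yet aged out of retention.
        live = [b for b in backup_days if b <= day and day - b < retention_days]
        usage.append(len(live) * data_gb)
    return usage


# Hypothetical inputs: D = 100 GB, Ff = 7 days; retention and horizon from the text.
usage = retained_storage(data_gb=100.0, full_every_days=7,
                         retention_days=28, horizon_days=33)
plateau = max(usage)
print(f"Storage plateaus at {plateau:.0f} GB")
```

Under these assumptions storage climbs by D with each full backup until the oldest image starts expiring, after which each new full replaces an expired one and the total stays flat. A different expiry convention (for example, keeping an image one day longer) would shift the plateau by one image, so check how your backup product counts retention.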










