Saturday, June 26, 2010

Avamar - Viewing Large Files

Avamar and viewing large file folders:

The Avamar GUI has a problem working with folders that contain more than 50,000 files. When doing a backup or restore, the Avamar GUI can only show up to 50,000 files. To verify this limitation, I wrote a Perl script to create 60,000 files. The Perl script is as follows:

for ($a = 0; $a < 60000; $a++) {
open(FILE, ">$a");
close FILE;
}

I ran this script in a test folder and tried to back up this folder. When I clicked on the folder to select files, the Avamar GUI became unresponsive. On the client side I noticed the avagent.exe process jump to 25% CPU usage, and its memory usage also increased. After 30 seconds I got a pop-up in the Avamar GUI saying that "The client was unable to complete this browse request within the allotted limit of 10 seconds". It gave me two options: increase the time limit, or view partial results. I clicked on "view partial results" and it only showed me the first 20,000 files.



I clicked on the top folder which selected all the files inside the folder, and then I started the backup. After the backup completed the log reported that it backed up 60,001 files (it also backed up the perl script).

Backup #140 timestamp 2010-06-16 14:32:09, 60,001 files, 3 folders, 10.53 MB (60,001 files, 274 bytes, 0.00% new)

When trying to restore, I ran into the same issue. If I browse for the file by date, and go to that folder I get a popup message saying "Backup list truncated. The console truncated the backup listing because the maximum number of entries was exceeded. The preference max_backup_nodes in mcclient.xml specifies the limit. If the data of interest is not listed, refresh the view and then reselect your backup. This will allow you to select other folders. Selecting the same folder or any other folder that exceeds the limit will cause the truncation to occur again."

I clicked OK, and it only showed me the first 50,000 files. I didn't count, but I am assuming it did. I followed EMC solution esg110422, which says to raise MAX_BACKUP_NODES above the listed value. The XML file mcclient.xml is located under c:\program files\var\mc\gui_data\prefs on the computer where the Avamar administrator GUI (console) is installed. I increased the value to 500,000 and rebooted my desktop according to the instructions. That did not make any difference, and I still saw only 50,000 file entries. I tried refreshing and reselecting, but that did not help; I still could not see files beyond the first 50,000 entries. This causes a huge issue because you have to restore the whole folder to get a single file back, something I did not want to do.

I contacted support and was told that 50,000 is the limit and cannot be changed. I can, however, use the command line to restore the files. I tried mccli from the Avamar utility node and I was able to restore the file by providing the full path and file name.

So, this is how things are right now. If you have a folder with more than 50,000 files, listing all the files in the GUI is not possible, which makes it impossible to restore from the GUI any file that is not in the list. You can, however, use the command line to do a restore. Both mccli.exe and avtar.exe can be used to do the restore.

Avamar System State Backup

NOTE: This applies to Avamar 5.0 with no service pack installed. Windows 2008 system state backups have changed in Avamar 5.0 Service Pack 2.

Avamar System State Backups:

There appears to be some confusion about Windows system state backups with Avamar. Most newcomers to Avamar are used to the backup software handling the system state backup natively, that is, without the need for scripting or storing the backups locally on the client. Avamar handles system state backups differently from other backup applications. For Windows 2000 and Windows 2003 clients, Avamar uses the NTBackup utility to create a system state backup which is stored locally on the client. Starting with Windows 2008, Avamar is capable of making a backup of the system state using the VSS plugin and storing it on the Avamar node(s) itself. However, this method is no longer supported by EMC, and EMC recommends scripting the backup using the Windows Backup utility and directing it to a shared folder.

The following are the system state backup requirements, observations and best practices for each Windows OS.

Windows 2000 and Windows 2003:

Windows 2000 is not officially supported by Avamar 5.0+, so the documentation may not cover it. However, its system state backup procedure is similar to that of Windows 2003. System state backups on Windows 2000 are around 300 - 500 MB in size, while on Windows 2003 they can range from 800 MB to 3 GB. If there is enough space on the local C: drive, then use the option in the dataset to create a system state backup. When a client backup is initiated, the Avamar agent calls the NTBackup program and creates a system state backup locally. The file is called systemstate.bkf. On Windows 2003, this procedure uses VSS, which is known to cause issues with SQL in certain situations; however, in my experience this process has been mostly problem free.

Inside the Windows File System dataset options there is an additional setting called "Backup the System Profile", which is disabled by default. I believe this is what causes the most confusion. This option only works with Windows 2003 servers and requires that the Avamar Backup System State agent (referred to as AvamarBackupSystemState-windows-x86-5.0.100-409.msi) be installed on the client. Installing this and enabling the system profile option creates a recovery profile for the client. This profile contains all the configuration information that would be necessary to perform a bare metal restore of the server. The profile is called a HomeBase profile and is often referred to as HomeBase Lite, since it only works with Windows 2003 and only offers limited capabilities: for example, it will not install the OS or the service pack, as is common with other bare metal restore applications. This option is not necessary to restore the client to similar hardware.

If a complete restore of the system is necessary, then install the operating system and the service pack, restore the system state backup file to another client and follow the procedure outlined in the documentation to restore the client. During my testing I only had the Windows system state and the server data, so I had to follow the directions from the version 4.1 Administrator Guide. After following the instructions, and a couple of reboots later, the system came back up.

If the client does not have enough space to create a system state backup locally on the C: drive, then the backup can be redirected to another local drive, or be scripted to send it to a remote share. To send the system state to a different local drive, include this parameter in the dataset definition that is applied to the client: systemstatefile=d:\avamarfolder. The redirect location can be any directory on the drive. However, if you do this, it is a good idea to document where the system state file resides so it can be located when it is needed.

If the backup needs to be scripted, then the Avamar dataset can be instructed to run a script before the backup begins. This option is found under advanced dataset options and is called pre-script. The script needs to be placed inside c:\program files\avs\etc\scripts and can only be a .bat, .vbs or .js file. Be sure to uncheck the option underneath which says "Abort backup if script fails". If the script fails to run, there is no sense in not backing up the data. When manually creating a system state backup, the system state and either c:\windows\system32 (for Windows 2003) or c:\winnt\system32 (for Windows 2000) need to be specified. The scripts I use are as follows:

Windows 2000 Script:

@echo off
ntbackup backup "c:\WINNT\system32" /m normal /f "\\servername\share\%COMPUTERNAME%.bkf"
ntbackup backup systemstate /M normal /f "\\servername\share\%COMPUTERNAME%.bkf" /a

Windows 2003 Script:

@echo off
ntbackup backup "c:\WINDOWS\system32" /m normal /f "\\servername\share\%COMPUTERNAME%.bkf"
ntbackup backup systemstate /M normal /f "\\servername\share\%COMPUTERNAME%.bkf" /a

These scripts require that a share already exists, and it should have share permissions set to allow Everyone to write. This is because the Avamar agent runs as the System user, and this is the only option I have found that allows the System user to write to a remote share. I haven't tried running the Avamar client agent as a specific user, so I don't know if that would allow more specific share-level permissions. I know this is a security hole, but I don't have a workaround for it.

One important thing to do when redirecting system state backups to a remote share is to create an avtar.cmd file in c:\program files\avs\var and put the following parameter in it: --backupsystem=false. This is important because if a different dataset is selected for a client by mistake, and it does not specify that the system state should be directed to a remote share, the system state backup will be created locally. This parameter blocks the system state backup from being made by the Avamar agent. I do this because I only enable redirecting system state backups to a remote share for clients which have a very small amount of free space available locally, and if the drives fill up by accident, the client could crash.

When the client backup runs, it calls the system state backup script which creates the backup to a remote share. This remote share can then be backed up by installing the Avamar agent on that client and backing up that client last.

One thing to note about the Windows dataset is that it includes a SystemState/ option under Source Data. This makes some people think that the system state will be backed up for Windows 2000 and 2003 servers. This is not true; the option is there to back up the system state for Windows 2008 servers, although even then it is not used.

Windows 2008:

Windows 2008 system state backups cannot be made with Avamar natively. EMC has a document on Powerlink called "Windows Server 2008 Offline System Recovery Using Windows Server Backup with Avamar" which describes how to configure system state backups for Windows 2008. Do not use the VSS plugin to back up the system state, even though it appears to be the obvious choice, or the client logs might suggest it is if you enabled the "Backup the System State" option. If you enable it, you will see that a successful system state backup was made, but this backup cannot be properly restored. If you try to restore from the VSS plugin system state backup, it will appear to restore data, but it will never complete; it will just get stuck at 99%. EMC says that this is a Microsoft issue due to recent changes they have made to Windows 2008. Thus, the process outlined in the document mentioned earlier has to be followed in order to make a restorable system state backup.

To enable the system state backup, the Windows Backup utility has to be installed on Windows 2008. This utility is not installed by default, so go to Manage, Features and enable it. Once it is installed, system state backups can be made. I use a script to start the backup and redirect it to a remote share. The script is:

@echo off
wbadmin start backup -backupTarget:\\remoteserver\share$ -allCritical -quiet

Then, go to the dataset being used with this server and specify the script under pre-script option.

Windows 2008 system state backups take much longer than Windows 2003 backups and are at least 10 GB in size because they back up more data. Whatever server the system state backups are going to should be backed up last; otherwise the system state backups stored in Avamar will be at least a day old.

The script provided in the documentation has a line to delete the system state backup after it has been backed up, but I don't do that.

Avamar Retention

Retention.

There are two types of retention in Avamar, basic and advanced. A basic retention policy can be specified in three ways:

Retention Period: Allows you to define how long a backup should be maintained. Length can be defined in days, weeks, months or years. The retention period is calculated from the start time. So if a job started on 3/31/2010 11:00 PM but ends on 4/1/2010 5:00 AM, the retention period will use 3/31/2010 to calculate when to expire the job.

End Date: Expire jobs on this particular date. This is not a moving backup window, and all jobs that have this retention policy will expire on the defined date. This is good for one time backup jobs where a system may need to be backed up, but after a certain date its backups are no longer necessary.

No end date: Backups never expire.
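The retention period rule above can be illustrated with a small Python sketch. This is only my illustration of the arithmetic, not Avamar code: the function name and the 30-day period are assumptions, and I have not checked whether Avamar expires a job at the exact time of day or at some point during that date.

```python
from datetime import datetime, timedelta

def expiry_from_start(start, retention_days):
    """Return the expiry time, counted from the backup's start time."""
    return start + timedelta(days=retention_days)

# The job from the example above: starts 3/31/2010 11:00 PM, ends 4/1/2010 5:00 AM.
start = datetime(2010, 3, 31, 23, 0)
end = datetime(2010, 4, 1, 5, 0)

# With a hypothetical 30-day retention period, the clock starts at `start`, not `end`.
expires = expiry_from_start(start, 30)
print(expires.date())
```

Counting from the end time instead would push the expiry one day later, which is why it matters which timestamp is used.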

The second type of retention in Avamar, advanced, allows you to define how long to keep backups based on how they are tagged. Backup jobs can be tagged as daily, weekly, monthly and yearly. Every backup job is a daily job and is marked with a "D". If a backup was made on a Sunday, it is also tagged with a "W" to signify it is a weekly. The very first backup job of the month is marked with an "M", which stands for monthly. The very first backup job of the year is marked with a "Y" for yearly. Tags can be combined for backup jobs to create layers of retention. The first backup job of any system is tagged as "DWMY". Jobs made on a Sunday are tagged "DW", while the first backup of the month is tagged "DM", or "DWM" if it falls on a Sunday.
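Here is a rough Python sketch of the tagging rules as I have described them. It is my own model of the behaviour, not anything taken from Avamar; the function name and its flags are made up for illustration.

```python
from datetime import date

def scheduled_tags(day, first_of_month=False, first_of_year=False, first_ever=False):
    """Return the tag string for a scheduled backup taken on `day`."""
    if first_ever:
        # The first backup of any system carries all four tags.
        return "DWMY"
    tags = "D"                  # every scheduled backup is a daily
    if day.weekday() == 6:      # Monday is 0, so 6 is Sunday
        tags += "W"
    if first_of_month:
        tags += "M"
    if first_of_year:
        tags += "Y"
    return tags

print(scheduled_tags(date(2010, 6, 13)))                       # a Sunday
print(scheduled_tags(date(2010, 6, 1), first_of_month=True))   # a Tuesday
print(scheduled_tags(date(2010, 8, 1), first_of_month=True))   # a Sunday
```

The three calls print DW, DM and DWM, matching the combinations listed above.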

Retention periods for each tag can be defined in days, weeks, months and years. The job expires when it is older than the time period defined in the retention policy. For example, if advanced retention policy is set to D: 20 days, Weekly: 40 days, Monthly: 100 days and Yearly: 365 days, and a job is tagged as DWMY, the D tag drops off after 20 days, W tag after 40 days, and Monthly tag after 100 days. If you look at the job after 100 days, it will have only one tag, Y. After 365 days, the job will expire.
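The way tags drop off over time can be sketched in Python as well. Again, this is only a model of the example above: the policy values come from that example, the function name is hypothetical, and I am assuming a tag lasts until the job is older than its period.

```python
from datetime import date, timedelta

# Retention per tag in days, taken from the example above.
POLICY = {"D": 20, "W": 40, "M": 100, "Y": 365}

def remaining_tags(tags, backup_day, today, policy=POLICY):
    """Return the tags still in force; an empty string means the job has expired."""
    age = (today - backup_day).days
    return "".join(t for t in tags if age <= policy[t])

taken = date(2010, 1, 3)
for days in (10, 30, 101, 400):
    left = remaining_tags("DWMY", taken, taken + timedelta(days=days))
    print(days, left or "expired")
```

For a job tagged DWMY this prints DWMY at 10 days, WMY at 30 days, Y at 101 days, and expired at 400 days.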

According to the best practices guide for Avamar 5.0, weekly backups are equal to three daily backups, and a monthly backup is equal to six daily backups. This helps conserve space by reducing the amount of data that is kept on the system. But it also reduces the number of days you can go back to recover data from.

One important thing to note about advanced retention is that it does not apply to on-demand and client-initiated backups. On-demand jobs have an option to specify basic retention just before initiating the job. Client-initiated backups use the End User On Demand Retention policy. Both kinds of job get tagged with an "N", which stands for not tagged.

These tags can be changed by going to a job under Backup Management and selecting what tags you want to apply to it. A job marked weekly can be changed to daily, monthly, yearly, or a combination of all four tags. When jobs run as scheduled jobs, they are automatically tagged. If only basic retention is enabled, the jobs are still tagged, but only the expiration date is used to expire the job.

The best practice for retention is to use advanced retention since it saves space. Another best practice is to set the minimum retention to 14 days for all jobs. This is because retention can only be specified in time periods. There is no setting in Avamar to keep the very last backup job, or to delete a job only once a new backup becomes available. If there is a problem with backing up a client and retention is set to 7 days, it is likely that the failure will go unnoticed until all backups have been deleted. Setting the minimum retention to 14 days buys some time for the admin to check whether a job failed and, if so, why.

Avamar Restoring ACLs only (File Permissions)

Restoring only the ACL for a Windows host is a bit tricky with Avamar. There is no explicit option in the GUI that restores only the ACL. If you are going through the GUI, then you can only restore the files and the Access Control List together. I searched through the Administrator guide and was unable to find anything related to restoring the ACL only.

I looked through the avtar.exe command line and found a parameter that can be used to specify that only the ACL be restored. The parameter is --restore-acls-only=true, which is specified in the avtar.cmd file. The avtar.cmd file is located in c:\program files\avs\var\ if the default installation location was selected during install. However, when I tried to do a restore of several files and folders I saw these errors in the job log:

WARN: <0000> ntsecurity error:Unable to reset security on pre-existing directory "%s" during restore "C:\Documents and Settings\srvsandtm\Desktop", LastError=87 (code 87: The parameter is incorrect)

I looked through the avtar.exe command line again and found another related parameter called --existing-dir-aclrestore=true. After much experimentation I found out that this parameter restores the files inside the folders, and the security of the folder itself. If the files inside had their security modified, but they exist at the time of the restore, then only the ACL of the folder is restored.

I still got the same error stated above, but it did not have any effect on restoring the ACL.

So in summary, if you want to restore folder ACL and file ACL (security), then use --restore-acls-only=true. Only those folders and files that exist will have their ACL restored. If a file or folder does not exist, then it is not restored. If you want only the folder ACL restored but don't want the file ACL touched, then use --existing-dir-aclrestore=true. During a regular restore, that is, one with no parameters, if a folder exists then its ACL is not restored.

DPN Report

The DPN summary report is a good way to see various statistics related to backups. It is run by going to Tools --> Manage Report and selecting Activities - DPN Summary. To get a daily report, I select the period between backup start time and end time. By default it selects the last 24 hours, which I have found to be a good setting.

The DPN summary report can be exported as a CSV file and opened with Excel. However, there is a lot of data in the report and it is up to the administrator to determine what information he or she needs out of it.

Here is what I do with a DPN Summary report:

1) Remove all columns except the following: Host, Seconds, NumFiles, ModSent, PcntCommon. I don't find much useful information in any of the other columns.
2) Add two new columns: Minutes and MBytes.
3) Minutes is calculated by the following formula: Seconds/60. I find that expressing the backup duration in minutes is the best way to get a good picture of how long backups took. With Avamar, unlike traditional backup software, successful backups can range from 2 minutes to several hours. Expressing the duration in hours makes it difficult for me to visualize 0.03 of an hour.
4) MBytes is calculated by the following formula: ModSent/(1024^2). ModSent is the amount of backup data, after deduplication, that was sent over the network for a particular host. Avamar is very efficient at deduplication, so backups can range from 1 MB to several GB. Expressing it in GB makes it hard for me to visualize 0.03 GB of data. I find it easier to work with whole numbers.

Using this method I am able to sort the data and determine which servers took the longest to backup and which servers backed up the most data. I can also find out which server has the most unique data. It can also be used to calculate the right settings for f_cache and p_cache by seeing how long a backup takes and how many files (NumFiles) are there on the server.
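If you would rather not do this by hand in Excel, the same steps can be sketched in Python. This is a minimal sketch that assumes the exported CSV has a header row with the column names listed above; the sample rows and the Extra column are invented purely for illustration.

```python
import csv
import io

KEEP = ["Host", "Seconds", "NumFiles", "ModSent", "PcntCommon"]

def summarize(csv_text):
    """Keep the useful columns, add Minutes and MBytes, sort longest first."""
    rows = []
    for rec in csv.DictReader(io.StringIO(csv_text)):
        row = {name: rec[name] for name in KEEP}
        row["Minutes"] = round(float(rec["Seconds"]) / 60, 1)
        row["MBytes"] = round(float(rec["ModSent"]) / 1024 ** 2, 1)
        rows.append(row)
    rows.sort(key=lambda r: r["Minutes"], reverse=True)
    return rows

# Made-up sample data standing in for an exported DPN Summary report.
sample = """Host,Seconds,NumFiles,ModSent,PcntCommon,Extra
server-a,120,1500,1048576,99.5,x
server-b,5400,250000,3221225472,97.1,y
"""

for row in summarize(sample):
    print(row["Host"], row["Minutes"], row["MBytes"])
```

Changing the sort key to "MBytes" gives the list of servers that sent the most data instead.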

It's a good practice to run the DPN summary report on a daily basis to check whether any clients are overwhelming the Avamar system with too much data, or taking too long.

Assessing Storage Performance Just Got More Complicated!

Here is what a SAN used to look like: Host -> Fibre Switch -> Storage. Three basic layers. A bottleneck could exist at the host level, or it could be at the switch, or it could be the lack of sufficient disk drives to support the required throughput. Performance analysis, while not easy, had a method to it, and a storage administrator could go down the logical path of determining where a bottleneck existed. However, with recent enhancements and the introduction of new virtualization layers, analysing performance has become much more complicated.

Take the host layer for example: a SQL 2008 server running inside a VM, hosted on a VMware vSphere server. There is the SQL server one would have to analyse, then the operating system itself, and then the vSphere server which connects the VM to the SAN. A bottleneck could exist at any of the three sublayers, although in this case the only new addition is the virtual machine sublayer.

The switch layer has seen a lot of changes, although not much in the way of virtualization. Fibre saw speeds going from 2 Gb/s to 4 Gb/s and then 8 Gb/s. Then there was the introduction of iSCSI, which went from 1 Gb/s to 10 Gb/s, although achieving the full potential of 10 Gb/s has been questionable. And recently we saw Fibre Channel over Ethernet (FCoE), which takes advantage of the 10 Gb/s network adapters and switches. The switch layer also transformed into the encryption layer, as switches started encrypting data-in-flight and data-at-rest. In a way, encryption switches are acting as a storage virtualization layer between the host and the storage.

Then there was the introduction of a brand new storage layer called block storage virtualization. Offerings such as SAN Volume Controller (SVC) from IBM and VPLEX from EMC sit between the switches and the storage array, and virtualize the storage layer. They can act as a storage cache, hide the complexities of storage from the host, and replicate data between storage arrays.

The final layer, the storage layer, is going through some major changes also. Storage arrays now support multiple protocols including Fibre, iSCSI, FCoE, CIFS and NFS, and have multiple disk options, from the slow but large 3 TB SATA to the super fast but expensive 5000 IOPS flash disks. With an emphasis on storage tiering and using the right disks for the right performance level, features like block level and sub block level tiering have been added. Sub block tiering is especially interesting since it allows certain parts of the data in a LUN to reside on a different type of disk from the rest of the data. And then there is LUN level compression in storage arrays and deduplication in NAS filers.

There is another layer, though I don't believe it is that prevalent yet: the external optimizers which sit between the host and storage. They access the data, perform deduplication on it and write it back to the storage. The Ocarina Optimizer is a good example of this. There is also the file virtualization layer, such as the F5 Data Manager and Rainfinity, which can combine several NAS filers under one logical access point.

So, storage performance analysis is no longer a matter of looking at three layers. Now, it involves looking at all the sublayers involved. The bottleneck may be the storage array or the host, but finding which sublayer inside it is the cause has become more complicated.

Using avtar.exe to restore files

Avtar.exe is an executable which gets installed on a client when the Avamar agent is installed. Unless a non-default path was chosen during the install, it can be found in c:\program files\avs\bin. Avtar.exe can be used to list backups, list backup content, create backups, delete backups and restore files. In this entry I will focus on listing and restoring files using Avtar.exe.

Basic command line:

avtar.exe --id= --password= --path=\Domain\fully qualified DNS name of client\


Listing backups for a client using Avtar.exe

avtar.exe --backups --id= --password= --path=\test\testserver.testdomain.com

This lists all the backups in Avamar for this client. The list looks similar to this:

Date       Time     Seq   Label                            Size       Plugin   Working directory        Targets
---------- -------- ----- -------------------------------- ---------- -------- ------------------------ --------
2010-06-16 14:32:09 140   MOD-127671661                    18783K     Windows  C:\Program Files\avs\var R:/test/
2010-06-15 22:03:50 139   MON-PM-Test-testserver-127665720 32478034K  Windows  C:\Program Files\avs\var
2010-06-14 22:18:19 138   MON-PM-Test-testserver-127657080 32477794K  Windows  C:\Program Files\avs\var
2010-06-13 22:32:53 137   MON-PM-Test-testserver-127648440 32479575K  Windows  C:\Program Files\avs\var

The important information you will need from this is the Label. It is provided as an option when doing a restore to tell Avamar which backup to use.
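If you save this listing to a text file, the labels can be pulled out with a few lines of Python. This is only a sketch based on the sample output above: it assumes the columns are separated by whitespace and that labels never contain spaces, neither of which I have verified against other avtar.exe versions.

```python
import re

def parse_backups(listing):
    """Extract date, time, sequence number and label from a backup listing."""
    backups = []
    for line in listing.splitlines():
        fields = line.split()
        # Data rows start with a date such as 2010-06-16; skip the header and rule.
        if len(fields) < 4 or not re.fullmatch(r"\d{4}-\d{2}-\d{2}", fields[0]):
            continue
        backups.append({"date": fields[0], "time": fields[1],
                        "seq": int(fields[2]), "label": fields[3]})
    return backups

listing = r"""Date Time Seq Label Size Plugin Working directory Targets
---------- -------- ----- ----------------- ---------- -------- --------------------- -------------------
2010-06-16 14:32:09 140 MOD-127671661 18783K Windows C:\Program Files\avs\var R:/test/
2010-06-15 22:03:50 139 MON-PM-Test-testserver-127665720 32478034K Windows C:\Program Files\avs\var
"""

for b in parse_backups(listing):
    print(b["seq"], b["label"])
```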

Listing files for a client using Avtar.exe

avtar.exe --list --id= --password= --path=\test\testserver.testdomain.com --label="MON-PM-Test-testserver-127665720"

All files that belong to the backup labeled "MON-PM-Test-testserver-127665720" will be listed. There is no limit of 50,000 files as in the GUI. The list can be very long, and I haven't found an option in avtar.exe to list only files that match a pattern. Piping the output through grep, or qgrep on Windows, may be appropriate here.
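As an alternative to grep, a saved listing can be filtered with a short Python script. This sketch assumes the listing has one entry per line; the sample lines are invented, since the exact layout of the --list output may differ. The pattern is matched against the whole line, so add a leading or trailing * as needed.

```python
import fnmatch

def filter_listing(lines, pattern):
    """Return the listing lines that match a shell-style pattern, ignoring case."""
    wanted = pattern.lower()
    return [ln.strip() for ln in lines
            if fnmatch.fnmatchcase(ln.strip().lower(), wanted)]

# Hypothetical listing lines, for illustration only.
sample = ["c:/test/temp.txt\n", "c:/test/report.doc\n", "c:/test/sub/Notes.TXT\n"]

for match in filter_listing(sample, "*.txt"):
    print(match)
```

To use it on a real listing, redirect the output of the list command to a file and pass the open file object as `lines`.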

Restoring ALL files for a client using Avtar.exe

avtar.exe --extract --id= --password= --path=\test\testserver.testdomain.com --label="MON-PM-Test-testserver-127665720"

This will restore all files for the client, but it will not overwrite files.

Restoring only certain files for a client using Avtar.exe

avtar.exe --extract --id= --password= --path=\test\testserver.testdomain.com --label="MON-PM-Test-testserver-127665720" c:\test\temp.txt

This will restore the file c:\test\temp.txt to the original location. If the file needs to be overwritten, then use --overwrite=always before the file name. Multiple file names can be separated by a space. Multiple files cannot be selected using an *.
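Since wildcards are not accepted, a long list of files has to be spelled out on the command line. A small Python helper can assemble the command for you. This is a sketch that only uses the options shown above; the function name and the user name and password values are placeholders, and it builds the command without running it.

```python
def build_extract_command(user, password, path, label, files, overwrite=False):
    """Build an avtar.exe restore command as an argument list (does not run it)."""
    cmd = ["avtar.exe", "--extract",
           f"--id={user}", f"--password={password}",
           f"--path={path}", f"--label={label}"]
    if overwrite:
        # The overwrite option goes before the file names.
        cmd.append("--overwrite=always")
    cmd.extend(files)
    return cmd

cmd = build_extract_command(
    "someuser", "secret",                      # placeholder credentials
    r"\test\testserver.testdomain.com",
    "MON-PM-Test-testserver-127665720",
    [r"c:\test\temp.txt", r"c:\test\temp2.txt"],
    overwrite=True,
)
print(" ".join(cmd))
```

The resulting list can be handed to subprocess.run on the client once you are happy with what it prints.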