Thursday, September 30, 2010

How Avamar has changed my backups

Avamar is nothing short of a revolutionary backup product. While deduplication concepts and algorithms have existed in theory, or in bits and pieces in other backup and replication products, I believe it is the first to combine client-side and target-side deduplication in one hardware and software product.

There are several discussions of the pros and cons of Avamar on message boards and email lists, comparing Avamar with more established backup products such as NetBackup, Tivoli and NetWorker (another EMC product). I want to add to that discussion and talk about how Avamar has changed my backups. Coming from Commvault and Backup Exec, Avamar was a big change, and a change in a positive direction.

First the pros:

1) I now measure my daily backups in MBs and GBs. 70% of my servers only back up 200 - 400 MB per day. Another 20% back up 400 MB - 1 GB per day, and the remaining 10% back up anywhere from 1 GB to 25 GB per day. If I see a server backing up more than a few GB of data, I get worried that I didn't set up exclusions correctly.

2) Backups are really fast, with the majority of them finishing within 30 minutes. I know this will not apply to all servers and all kinds of data, but it is working out well for my servers. Compare this to Commvault, where it took 5 - 10 minutes just to scan the files to back up.

3) No more complicated setups of incremental, differential and full backups. Every backup is full. Avamar combines a daily incremental backup with the previous backups to create a virtual full backup. Unlike a traditional backup product, if you delete previous backups, the subsequent backups are not affected. So there is no golden or master backup to worry about. But just as a disclaimer, SQL backups do have an option for incremental backup, so this doesn't apply to them.

4) Restores are easy and if you are not restoring several hundred GBs or TBs of data, then they are quick. Since no tape is involved, I feel confident about restoring the data. When I was using tape I was always scared that the tape might stop during the restore and my data could become unrecoverable. Probably just an irrational fear.

5) Replicating backups to another Avamar grid is straightforward, and restoring from the replicated data is really easy. There are no auxiliary or staged clones or snapshot backups to worry about.

6) There are no media servers or backup nodes. In Commvault and Backup Exec there is a concept of a control backup server, and then several media servers which connect the clients to the tape or backup-to-disk appliances. In Avamar there is one central grid and all clients connect to it for backup.

7) Significantly lower network utilization. After the first backup, very little data is sent to Avamar, which means less network usage. Before, I couldn't even count how many slow application response time issues were blamed on "...the backups are running, which slows down the network".

8) No more huge spikes in SAN throughput. With Commvault, a graph of the read throughput during the backup window looked like a plateau, a giant plateau. With Avamar, it looks more like a steep mountain which tapers off quickly.
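
Points 3 and 7 both follow from how a deduplicating store keeps data. Here is a toy sketch of the idea in Python, not Avamar's actual implementation: the `ToyGrid` class, the fixed 4-byte chunk size and the use of SHA-256 are all assumptions made for illustration. Each backup is stored as a complete list of chunk IDs (a "virtual full"), only chunks the grid has never seen are sent, and deleting an older backup leaves later ones restorable.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; unrealistically small so the example is easy to follow


class ToyGrid:
    """Hypothetical stand-in for a deduplicating backup target."""

    def __init__(self):
        self.chunks = {}   # chunk id -> chunk bytes
        self.backups = {}  # backup name -> ordered list of chunk ids

    def backup(self, name, data) -> int:
        """Store a backup and return how many bytes had to be sent."""
        sent = 0
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            piece = data[i:i + CHUNK_SIZE]
            cid = hashlib.sha256(piece).hexdigest()
            if cid not in self.chunks:  # only unseen chunks cross the network
                self.chunks[cid] = piece
                sent += len(piece)
            recipe.append(cid)
        self.backups[name] = recipe  # every backup is a full list of chunks
        return sent

    def restore(self, name) -> bytes:
        """Rebuild the backed-up data from its chunk list."""
        return b"".join(self.chunks[cid] for cid in self.backups[name])

    def delete(self, name) -> None:
        """Drop a backup, then discard chunks no remaining backup references."""
        del self.backups[name]
        live = {cid for recipe in self.backups.values() for cid in recipe}
        self.chunks = {cid: c for cid, c in self.chunks.items() if cid in live}


grid = ToyGrid()
print(grid.backup("mon", b"AAAABBBBCCCCDDDD"))  # 16: first backup sends everything
print(grid.backup("tue", b"AAAABBBBXXXXDDDD"))  # 4: only the changed chunk is sent
grid.delete("mon")                              # no "master" backup to protect
print(grid.restore("tue"))                      # b'AAAABBBBXXXXDDDD'
```

In this model there is nothing special about the first backup: Tuesday's chunk list is complete on its own, so removing Monday only discards the one chunk that Tuesday does not use.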


Now the cons:

1) Horrible admin console. Each action opens a new window. For example, if I want to look at the backup activity, that opens a new window. If I want to look at previous backups, that is another window. If I am working on an issue and looking around, I can easily have 6 - 7 windows open. But that's not all: each Avamar grid is managed separately. So if I am managing both Avamar grids from the GUI, I can have 14 - 16 windows open at any time.

2) Blackout window. Avamar needs some time to go through old backups and delete them. The deletion of old backups is called garbage collection and can take 1 - 3 hours. No backups can run during this time, and any backups still running are killed. This period is referred to as the blackout window. In effect it sets a hard stop for all backups and forces a backup window. Garbage collection is scheduled to start at 6:00 PM (the default, which can be changed), and if a client is still backing up data at that time, the backup is killed and the data is not available for restore. So if a client has a lot of changed data and is taking a long time to back up, you could end up with no restorable backup for that day.

3) Limited SQL restore options. No incremental restores or log only restores are available.

4) CPU-intensive backups. While a backup is running, the client will experience high CPU usage, sometimes in excess of 70%.

5) Confusing storage limits. Avamar can only be 95% full before becoming read-only, at which point old backups have to be deleted to create space. With Commvault I always had the option of destaging the backups to tape and freeing up space on my primary backup target. Also, purchasing additional Avamar storage (nodes) is EXPENSIVE!

6) No way for the admin to determine how much space will be freed up by deleting backups. I have asked EMC and they don't know either. This probably ties into the fundamentals of how Avamar works, but it is frustrating when you are close to being full and don't know which backups to delete to free up the most space. The DPN report helps somewhat, but it is still confusing.

7) Difficult to isolate from the network to test restores. In a traditional backup application, the admin can set up a separate network, do a disaster recovery restore of the backup server and test restores of critical servers like Active Directory and Exchange. With Avamar, the backups first have to be replicated to a second grid, and only then can the restores be tested.
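
On con 6, here is one plausible reason freed space is so hard to predict, shown as a toy Python model rather than anything taken from Avamar itself. The catalog layout and the `reclaimable_bytes` function are hypothetical. The assumption is that backups share deduplicated chunks, so deleting a backup only frees the chunks that no surviving backup still references.

```python
def reclaimable_bytes(backups, victims):
    """Bytes freed by deleting the backups named in `victims`.

    `backups` maps backup name -> {chunk id: chunk size in bytes}.
    A chunk is freed only if no backup outside `victims` references it.
    """
    victims = set(victims)

    # Chunks still needed by backups that will survive the deletion.
    kept = set()
    for name, chunks in backups.items():
        if name not in victims:
            kept.update(chunks)

    # Collect each freed chunk once, even if several victims share it.
    freed = {}
    for name in victims:
        for cid, size in backups[name].items():
            if cid not in kept:
                freed[cid] = size
    return sum(freed.values())


# Hypothetical catalog: three daily backups of 300 bytes each (logical size).
catalog = {
    "mon": {"a": 100, "b": 100, "c": 100},
    "tue": {"a": 100, "b": 100, "d": 100},
    "wed": {"a": 100, "d": 100, "e": 100},
}

print(reclaimable_bytes(catalog, ["mon"]))         # 100: only chunk "c" is unique
print(reclaimable_bytes(catalog, ["tue"]))         # 0: every chunk is shared
print(reclaimable_bytes(catalog, ["mon", "tue"]))  # 200: "b" and "c" are freed
```

Even in this tiny model, the space freed has little to do with the size of the backup being deleted, and deleting two backups together frees more than the sum of deleting each alone. On a real grid with many clients and long retention, that would make a simple "space freed per backup" number hard to produce.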