This paper addresses the increased performance needs of a disaster
recovery plan, and the common barriers to achieving success. It
also addresses the performance gains that can be achieved by
combining an F5 WANJet application acceleration solution with
Double-Take® replication solutions from Double-Take®
Software.
Factors That Affect Disaster Recovery Success
Disaster recovery (DR) plans are becoming a
key part of a company’s overall IT planning process to ensure
continuous availability of the company’s critical infrastructure
at all times. A major component of these plans involves protecting
business-critical data through backups and data replication. Such
replication and backup processes may occur between data centers,
branch and home offices, or primary and backup sites.
A successful business continuity/DR plan has two key components at
its core: a solid replication product to manage replication
processes, and an effective and efficient Wide Area Network (WAN)
that enables those processes to be accomplished successfully.
Two of the critical metrics used in measuring the success of a
disaster recovery plan are recovery point objectives (RPO) and
recovery time objectives (RTO). RPO measures the amount of data
that can be lost in a disaster, and RTO measures the time required
to restore normal operations. IT managers must balance the
lowest possible RTO and RPO against factors such as:
• Increasing data storage requirements from increased usage and
regulatory archival requirements
• Limited bandwidth between primary and backup locations
• The expense of adding bandwidth between the DR
locations
• Variable factors that can affect the performance of the DR
solution over the WAN (e.g., WAN latency and packet loss)
One of the most common barriers to the effective deployment of any
high-performance data replication solution is the performance of
the solution over the WAN between the DR sites. Storage teams, when
sizing the bandwidth requirements, often find that their initial
sizing estimates are insufficient to meet the performance
requirements of a DR solution. In practice, true WAN performance is
rarely given much thought until the organization ramps up its
production replication system and realizes that the available WAN
bandwidth does not provide the expected throughput. Suddenly, the
RPOs and RTOs it expected to meet are no longer realistic.
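As a rough illustration of the sizing problem described above, the following Python sketch estimates the nominal link speed needed to keep up with a given daily volume of changed data. The function name, the 70% usable-capacity allowance, and the sample workload are assumptions made for illustration only; none of them are figures from this paper.

```python
def required_bandwidth_mbps(changed_gb_per_day: float,
                            replication_window_hours: float = 24.0,
                            usable_fraction: float = 0.7) -> float:
    """Return the nominal link speed (Mbit/s) needed to ship the daily change set.

    usable_fraction is an assumed allowance for protocol overhead, latency,
    loss and competing traffic; it is not a measured value.
    """
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    if replication_window_hours <= 0:
        raise ValueError("replication_window_hours must be positive")
    bits = changed_gb_per_day * 1e9 * 8          # decimal gigabytes to bits
    seconds = replication_window_hours * 3600
    return bits / seconds / usable_fraction / 1e6


if __name__ == "__main__":
    # Hypothetical workload: 20 GB of changes per day, replicated around the clock.
    print(f"{required_bandwidth_mbps(20):.2f} Mbit/s")
```

Under these assumed inputs the estimate already exceeds the nominal 1.544 Mbit/s of a single T1, which is the kind of gap that tends to surface only once production replication begins.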
WANs have several inherent characteristics that are the source of
missed expectations within replication scenarios:
• Latency (caused by propagation delay, which is bounded by the
speed of light over distance, and by the number of network hops
between the DR sites)
• Packet loss (caused by signal degradation over the network
medium, oversaturated network links, corrupted packets rejected
in transit, or faulty networking hardware)
• Network congestion (excess data on the network slows overall
transmission speeds)
• Actual bandwidth does not match expected bandwidth (often due
to a combination of the factors listed above)
• Expense (large pipes can incur significant monthly leasing
charges)
F5 Networks, Inc. - 2 - © Feb-07
Unfortunately, these factors often cripple what was originally a
good backup/DR plan. Moreover, when the DR application shares the
WAN links with other application traffic, file transfers, and
possibly even other migration or recovery activities, RPOs and RTOs
that were met previously can become completely unattainable. This
can be due to a variety of factors, including congestion caused by
the added traffic from the other applications. In addition, high
latency, perhaps due to the distance between the DR sites, can
prevent the storage team from achieving its RPOs and RTOs
irrespective of how much bandwidth is used.
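The point that latency can limit replication regardless of bandwidth follows from how TCP works: a single connection can have at most one window of unacknowledged data in flight per round trip. The sketch below computes that ceiling. The 64 KiB window, the 100 ms round-trip time, and the sample link speeds are assumed values for illustration, not measurements from this paper.

```python
def max_tcp_throughput_mbps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on one TCP connection's throughput: one window per round trip."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_bytes * 8 / rtt_seconds / 1e6


def effective_throughput_mbps(link_mbps: float, window_bytes: int,
                              rtt_seconds: float) -> float:
    """The connection is capped by the lower of the link speed and the window/RTT bound."""
    return min(link_mbps, max_tcp_throughput_mbps(window_bytes, rtt_seconds))


if __name__ == "__main__":
    window = 64 * 1024      # assumed 64 KiB window (no window scaling)
    rtt = 0.100             # assumed 100 ms round-trip time
    for link in (1.544, 10.0, 45.0, 155.0):   # sample link speeds in Mbit/s
        capped = effective_throughput_mbps(link, window, rtt)
        print(f"{link:>7.3f} Mbit/s link -> {capped:.2f} Mbit/s per connection")
```

With these assumed values, the per-connection ceiling stays the same for every link faster than roughly 5 Mbit/s, so leasing a larger pipe alone would not speed up a single replication stream.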
The most common solution attempted by storage teams is to replicate
only the most critical data, thereby reducing the amount of data
replicated. The other option frequently exercised by the DR team is
to increase the amount of bandwidth leased. Neither option is
attractive, because neither solves the core issue, which is the
performance of the application over the WAN.
Solution
WANJet and Double-Take
The solution from Double-Take Software is a disaster recovery
application based on asynchronous real-time replication and
automatic failover to provide cost-effective business continuity
for Microsoft® Exchange®, Microsoft SQL Server, Oracle®, virtual
systems, file servers, and many other applications. Double-Take
provides continuous data protection by sending an up-to-the-minute
copy of the data as it is being changed on the origin server to the
target replication server. Double-Take does an excellent job of
ensuring the transactional integrity of the replicated data, but like
all replication solutions, its performance is subject to adverse
WAN conditions.
F5’s WANJet® appliance uses compression and acceleration
technologies to dramatically improve the speed of application
traffic over WANs. WANJet accelerates a wide variety of application
traffic types including data replication, file transfer, e-mail,
client-server applications, and many others. WANJet also has some
unique features that enable bandwidth to be efficiently allocated
across different applications, thereby ensuring that the most
critical traffic receives priority access to valuable
bandwidth.
The advantages of using WANJet to accelerate the Double-Take
replication processes are:
(1) The combination helps meet RPOs and RTOs without upgrading
bandwidth or replication infrastructure by:
• Accelerating the replication processes irrespective of the WAN
conditions
• Enabling the network to adapt dynamically to network congestion
levels
• Guaranteeing bandwidth for critical replication traffic
• Providing more control of the WAN resources allocated to storage
or DR needs
(2) The combination reduces the cost of meeting the RPOs and RTOs by:
• Using less bandwidth to replicate the same amount of data or
more
• Reducing the tangible and intangible costs associated with
troubleshooting
(3) The combination secures the replication traffic by:
• Encrypting the replication traffic using SSL encryption
WANJet Acceleration of a Typical Double-Take Replication Scenario
The WANJet and Double-Take solution has been thoroughly tested for
compatibility and performance. While each customer deployment is
unique, the following data shows the likely performance gains for
most customers.
Replication Scenario | Performance Increase from WANJet
Mirroring of Microsoft Exchange® Database (4.5 GB in size) | 11-times Faster Mirroring (aka initial bulk transfer)
Replication of Microsoft Exchange® Database | 6-times Faster Replication
Replication of typical departmental File Server data | 8-times Faster
WANJet Impact on Replication

Concept: Avoid transmitting uncompressed data
How It Is Accomplished: Dynamic Compression. Transparent Data
Reduction Level 1, which is a dynamic compression routine that
adjusts compression ratios based on WAN conditions.

Concept: Avoid transmitting redundant data
How It Is Accomplished: Byte-Level Pattern Matching. Transparent
Data Reduction Level 2, which is a high-speed byte-level pattern
matching and removal algorithm. This is very different from
file-level caching, and much more efficient for replication
scenarios that are not dealing with files per se.

Concept: Avoid degrading critical replication traffic when sharing
bandwidth with less important traffic
How It Is Accomplished: Bandwidth Guarantees and Prioritization.
Application QoS, which ensures that the Double-Take replication
protocol gets the bandwidth it needs and is protected from other
data on the WAN.

Concept: Ensure that the Double-Take replication protocol is
accelerated irrespective of the WAN conditions
How It Is Accomplished: TCP Optimization. WANJet Optimization
Policy, which allows you to specify the ports related to the
Double-Take application.

Concept: Ensure that important information gets encrypted for
protection during transmission
How It Is Accomplished: SSL Encryption
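To make the idea of avoiding redundant data concrete, the sketch below shows a generic redundancy-elimination scheme in Python. It is not WANJet's Transparent Data Reduction algorithm, whose internals this paper does not describe. It uses fixed-size chunks and SHA-256 digests as a simpler stand-in for byte-level pattern matching; the function names, the 64-byte chunk size, and the token format are all assumptions for illustration.

```python
import hashlib


def encode(data: bytes, seen: dict, chunk_size: int = 64) -> list:
    """Split data into fixed-size chunks; send each distinct chunk only once.

    Returns a list of ("raw", bytes) or ("ref", digest) tokens. `seen` plays the
    role of the sender-side dictionary and is updated in place.
    """
    tokens = []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).digest()
        if digest in seen:
            tokens.append(("ref", digest))      # receiver already holds this chunk
        else:
            seen[digest] = chunk
            tokens.append(("raw", chunk))
    return tokens


def decode(tokens: list, store: dict) -> bytes:
    """Rebuild the original bytes on the receiving side."""
    out = bytearray()
    for kind, payload in tokens:
        if kind == "raw":
            store[hashlib.sha256(payload).digest()] = payload
            out += payload
        else:
            out += store[payload]               # look up a previously received chunk
    return bytes(out)


def wire_size(tokens: list) -> int:
    """Bytes that would cross the WAN: raw chunks in full, references as 32-byte digests."""
    return sum(len(payload) for _, payload in tokens)


if __name__ == "__main__":
    block = bytes(range(256)) * 4               # 1 KiB of sample data
    sender, receiver = {}, {}
    first = encode(block, sender)
    second = encode(block, sender)              # the same data sent a second time
    assert decode(first, receiver) == block
    assert decode(second, receiver) == block
    print(len(block), wire_size(first), wire_size(second))
```

Because the dictionary is keyed on content and not on file names, the same savings apply to database pages or replication streams that never appear as whole files, which is the contrast with file-level caching drawn above. A true byte-level matcher would also find repeats at arbitrary offsets, which fixed-size chunking does not.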
Factors that affect the acceleration performance that WANJet
provides:
• Amount of redundant data traversing the WAN
• Compressibility of the data (e.g., text is easily
compressible, images typically are not)
• Traffic mix over the WAN links (competing traffic requires WANJet
to begin enforcing bandwidth guarantees, which can significantly
improve performance of the important traffic at the expense of the
less important traffic)
• Traffic volume and link utilization (congestion on the WAN
links is also affected by changes in traffic volume over the
course of a day; bandwidth allocation can keep a replication
process from grinding to a halt at peak load times)
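The effect of compressibility can be checked with a few lines of Python. The sketch below uses the standard zlib module purely as a stand-in compressor; it says nothing about the compression routine WANJet itself uses. The sample text and the use of random bytes to mimic already-compressed media are assumptions for illustration.

```python
import os
import zlib


def compression_ratio(data: bytes, level: int = 6) -> float:
    """Compressed size divided by original size (lower means more compressible)."""
    if not data:
        raise ValueError("data must not be empty")
    return len(zlib.compress(data, level)) / len(data)


if __name__ == "__main__":
    text = b"The quarterly report was replicated to the backup site. " * 500
    random_like = os.urandom(len(text))   # stands in for already-compressed media
    print(f"repetitive text: {compression_ratio(text):.3f}")
    print(f"random bytes:    {compression_ratio(random_like):.3f}")
```

Repetitive text shrinks to a small fraction of its size, while random bytes do not shrink at all, so the gain any compressing accelerator can deliver depends heavily on what is being replicated.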
Overview of Test Scenario
Double-Take Software and F5 Networks partnered to perform formal
testing of the Double-Take solution in conjunction with the F5
WANJet WAN acceleration appliance. Figure 1 shows the replication
test scenario used.
Figure 1: Replication Test Scenario
The test sites were connected by a T1 WAN link spanning about 1000
miles, with a latency of 50 milliseconds. Double-Take was coupled
with F5’s WANJet 500 acceleration appliance and put through a
series of tests using Microsoft Exchange and generic file server
data. First, mirror and replication operations were performed with
Double-Take against an Exchange database on the target server at
the opposite end of the WAN link. This established a common
starting point for both databases. Subsequently, a second test
using replication was performed. Replication is the Double-Take
operation that mirrors all activity on one Exchange database to the
other across the WAN link. DR administrators generally use this
procedure to maintain dual copies of Exchange for failover and
backup in case of disaster. A final test mirrored typical file
server data, which included a mix of document, presentation, media,
graphics, and binary files, from one server to the other across the
WAN link.
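For a sense of scale, the following back-of-the-envelope sketch estimates how long the 4.5 GB Exchange mirror would take over a T1 at its nominal line rate, and what the reported 11-times factor would imply. The paper does not report absolute transfer times, so this assumes an unaccelerated baseline running at the full 1.544 Mbit/s with no protocol overhead, which is optimistic, and it treats 1 GB as 10^9 bytes.

```python
T1_BITS_PER_SECOND = 1.544e6   # nominal T1 line rate


def transfer_hours(size_gb: float, bits_per_second: float = T1_BITS_PER_SECOND) -> float:
    """Idealized transfer time in hours, ignoring protocol overhead, latency and loss."""
    if bits_per_second <= 0:
        raise ValueError("bits_per_second must be positive")
    return size_gb * 1e9 * 8 / bits_per_second / 3600


if __name__ == "__main__":
    baseline = transfer_hours(4.5)    # 4.5 GB Exchange database from the results table
    accelerated = baseline / 11       # applying the reported 11-times factor
    print(f"line-rate baseline: {baseline:.2f} h, "
          f"with 11x speedup: {accelerated * 60:.0f} min")
```

Under these assumptions, the mirror drops from roughly six and a half hours to a little over half an hour. Real baselines are usually slower than line rate, so actual times would differ.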
Conclusion
The combination of F5 Networks’ WANJet acceleration appliances
with Double-Take Software’s data protection and recovery
solutions offers significant performance gains to the storage team
managing replication. The combination delivers cost savings and
improved RPOs and RTOs for the customer’s business. The end
result is reduced risk and lower costs.