BACKUP IN A NETWORK ENVIRONMENT:

A WINDOW OF VULNERABILITY AND OPPORTUNITY

Dr. Peter Liu
Syncsort Incorporated


As the amount of data held on networks has increased, so has the need for effective backup strategies. The earliest network backups consisted of simply copying data to floppy disk or tape cassette. These backups were isolated and often haphazard. Later, network backups became more formalized. Sophisticated strategies from mainframe backups were adapted for network backups. Control was centralized, as were the storage devices that held the backup images. Unfortunately, the amount of data held on networks continued to grow, leading to increased network traffic and bandwidth problems during backup processing. In addition, sensitive data held on the network is vulnerable to tampering as the data passes over the network to the centralized storage devices. The proposed solution to these problems is a return to distributed storage devices but with centralized control. This strategy would improve both performance and security and lead to reliable, efficient, and scalable network backups.

INTRODUCTION

According to current estimates, data stored on networks is growing at the rate of more than 40% per year [COOP94]. Such explosive growth brings other concerns to the forefront, and one of the most important of these is protecting increasingly large amounts of data from unwanted intrusion, from natural disaster, and from computer system and human error. In addition, increased amounts of data stored on a network imply increased amounts of data traveling on the network, creating bandwidth problems that are extremely expensive to remedy.

In this paper, we will review the various solutions offered to the backup dilemma in a computer environment. Backup techniques and procedures are well-established in the traditional mainframe environment. However, like the network itself, backup techniques in a client-server environment are still evolving.

Network backup strategies have progressed from totally isolated islands of storage and responsibility to environments where storage and control are completely centralized. This evolution was necessitated by the increased demands being made on networks and their increased importance. Based on our network experience and a careful analysis of backup strategies, it is our belief that it is time for network backup software to be transformed again by combining the best features of two strategies used in the past. The new backup strategy would entail distributing storage devices while retaining centralized control. Such a strategy can improve security for critical data now resident in network environments, and alleviate the bandwidth problems created as larger and larger amounts of data travel over the network.

We have called this new type of backup strategy a "distributed backup" to distinguish it from the two types of backup solutions currently available: isolated and centralized storage on a network. We will discuss each of these strategies, and then describe the advantages of the distributed system: improved performance and enhanced security for critical data.

 

BACKUP DEFINED

In a backup, data from a computer system is copied from disk to another medium, normally a less expensive one like tape, for safekeeping. Such backups are crucial for protecting data against unpredictable losses from media or system failures and from human error.

By examining an established mainframe backup product, we can observe the basic functions that commercial-strength backup software should include [GEPN95]. These functions can then be compared with the requirements for backup and recovery described recently for open [FARN94] and distributed [LASH94] systems. The comparison reveals that two basic backup decisions need to be made in any environment:

§ Where will the backup images be stored?

§ How will the backup process be controlled?

To answer the first question, we need to look at three types of organization that have been used in the past:

§ Independent

§ Isolated

§ Centralized

The independent strategy is used only on mainframes. Isolated and centralized are two types of network backup strategies.

THE OLD GLASS HOUSE: INDEPENDENT MAINFRAME BACKUP

In a mainframe environment, the basic decisions and organization are relatively straightforward, at least as far as the backup is concerned.

 

Figure 1 shows a typical mainframe configuration from the backup’s point of view. All control resides within the mainframe, making it an independent entity. Terminals through which an administrator defines and monitors backup processing are connected directly to the mainframe as are the tape drives or other storage devices to which the backup software writes the backup image.

THE NEW OPEN ENVIRONMENT: NETWORK BACKUP

In a client-server network environment, the basic backup decisions and organization have evolved from totally isolated to strict centralized control of both processing and storage. This evolution was necessary because of the growing importance and complexity of networks.

Figure 2 shows a radically simplified view of a client-server environment. There are three basic components in this environment: client machines (marked with a C), server machines (marked with an S), and a network that connects all of the machines and organizes them into a working system.

The network was originally developed as a means for sharing data and expensive hardware resources such as printers, and for communicating with remote sites. Administrative control was isolated and restricted to individual PCs. Gradually workstations evolved, and their functions became divided into "client" and "server" machines, which have more complex and varied workloads. As the data used on these isolated machines grew larger and more important, simple storage devices were used to back data up. Normally, these were floppy disks or standalone tape cassette devices. Figure 3 provides an illustration of the isolated network backup strategy. Storage devices are marked with an SD.

The problem with the isolated strategy was, obviously, its lack of administrative control. As is normal with human nature, some users were very careful about backing up their data while others totally neglected the responsibility. Hence, when hardware or software failed, as it inevitably does, lost data often could not be restored completely, and so had to be recreated, a costly and very time-consuming procedure--if the data could be recreated at all. As the importance of the work done on networks grew along with the importance of the data stored there, backup procedures gradually became too critical to entrust to individual users.

CENTRALIZED BACKUP STRATEGY

It soon became apparent that backup software, like other software used on the network, had to be designed to be "mainframe-strength" while working in a network environment. Many of the standard features in mainframe backup software could easily be transferred to software designed for a client-server system where centralized control of backup processing was the aim. These features include:

§ Centrally controlled scheduling

§ Centralized device control

§ Storage media pooling

These sophisticated mainframe-derived backup functions made network backups far easier to implement and much more secure.

Unfortunately, there was one important design limitation. The storage devices that were available when the backup software was being written did not have the large capacities and unattended backup capabilities that are available today. For this reason, not only were control functions centralized but also the placement of backup storage devices. See Figure 4 for an illustration of this type of backup strategy.

In order to make its more sophisticated view of backup processing workable, a centralized storage backup had to create a new image of the network environment for itself. For the purposes of the backup only, one machine is always designated as the administrative "server" machine, and it is to this machine that all storage devices are attached. All other machines are considered "clients" or "agents" by the backup software.

This strategy seemed the most logical design at the time it was implemented because most storage devices were tape-based with a relatively small capacity, which would require an operator to mount, unload, and store tapes. Distributing the devices across a network could mean distributing them in different rooms, different floors, or even different buildings, causing considerable logistical problems for the sole operator.

One other detail in Figure 4 should be noted. Seven storage devices are pictured in the illustration because seven is the capacity of a SCSI controller, which is the standard connection in use today. However, because high capacity storage devices are readily available at a reasonable cost now, a large number of devices or more than one SCSI controller are seldom connected to one "server" machine, since each additional device causes a further degradation in I/O performance.

PITFALLS OF CENTRALIZED STORAGE BACKUPS

 

Although the centralized storage strategy that was developed worked well while networks were in an intermediate stage of evolution, limitations soon became apparent as the amount of data stored on networks continued to grow. The two major problems that have become obvious are poor performance and weak security.

Because all the storage devices are attached to one machine, most of the data to be backed up must travel across the network (that is, all data except the data held on the machine to which the storage devices are directly attached). When large amounts of data are involved, severe traffic problems can result on the network. Bandwidth limitations become a severe handicap, which data compression can only partially relieve [GOOD94].

In addition, since most of the data must travel across the network, it is vulnerable to interception by unauthorized persons unless it is encrypted. Encryption is a very expensive and resource-intensive process, and it can increase elapsed processing time by as much as 20%.

Traffic problems, bandwidth limitations, and security issues may not surface immediately on a network since many client-server migrations begin as pilot projects where relatively small amounts of data are stored for test purposes. However, as more and more applications are migrated, sensitive data in some or all of the following forms may reside on the network [HIGG94]:

§ Employee records

§ Financial data and budgets

§ Executive correspondence

§ Production data

§ Product and marketing plans

§ R&D data

§ Customer lists

All of these types of data are easily secured on glass-house mainframe systems with password security, and such data can also be secured in similar ways in a client-server environment. It is only during a backup operation that data, which is normally secured on its home machine, becomes vulnerable to anyone who is capable of intercepting it while it is traveling on the network to the storage devices connected to the backup "server". Since backups normally take place after normal business hours, intruders are more likely to attempt access because of the privacy they can enjoy for their clandestine activities.

Because securing data entails resource-intensive encryption, all current centralized storage backup software products only encrypt data on the storage device itself and not before the data begins to travel unsecured on the network.

A PROPOSED DISTRIBUTED BACKUP SOLUTION

Two major problems inherent in centralized storage backup schemes have been discussed:

§ Poor performance arises because most of the data to be backed up must travel across the network to reach the backup server and the storage devices connected to it, which significantly increases network traffic.

§ Weak security is caused by allowing unencrypted data to travel across the network to the centralized storage devices.

What can be done?

 

Since small capacity is no longer a limitation and the availability of carousels and jukeboxes allows for unattended backup, it seems logical to combine the strengths of the isolated backup strategy and the centralized storage strategy into a solution that we propose to call a "distributed backup". Such a strategy can be implemented by adding flexibility to the backup software without eliminating any of the sophisticated features, such as centralized scheduling and device control, which were derived from the mainframe and which are part of the centralized storage strategy.

 

Figure 5 illustrates the proposed strategy, and it is called a "distributed backup" because it literally allows storage devices to be distributed throughout a network, while retaining the advantages of centralized control. The backup’s view is again that any machine that is connected to a storage device (SD) is considered a "server" (S) and all other machines are backup clients (C) regardless of whether they are considered clients or servers in the overall network scheme.
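The backup's view of the network can be stated as a single rule, sketched below in Python. The function name and the machine names are hypothetical and serve only as illustration; the rule itself (a machine is a backup server exactly when a storage device is attached to it) is the one described above.

```python
def backup_roles(machines, device_hosts):
    """Label each machine 'S' (backup server) or 'C' (backup client).

    A machine counts as a backup server exactly when a storage device is
    attached to it, whatever its role in the overall network scheme.
    """
    hosts = set(device_hosts)
    return {name: ("S" if name in hosts else "C") for name in machines}


# Hypothetical network: one database server and three workstations.
machines = ["db1", "ws1", "ws2", "ws3"]

# Centralized storage: every storage device is attached to one machine.
centralized = backup_roles(machines, ["ws1"])

# Distributed backup: devices are attached to both db1 and ws1.
distributed = backup_roles(machines, ["db1", "ws1"])

print(centralized)
print(distributed)
```

Under the distributed layout the database server, normally a "server" in the network sense, also becomes a server in the backup's sense simply because a device is attached to it.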

 

The following benefits will result when a distributed backup strategy is used:

§ Any number of storage devices may be connected to the network for backup with far less degradation in performance because multiple backup servers are employed. In a centralized storage backup, storage devices may be connected to only one server. Although as many as seven devices may be connected to a single SCSI controller, and multiple controllers can be used, connecting a large number of devices to one server causes significant performance degradation.

§ Storage devices may be connected to the network at any machine as long as that machine has a backup module residing on it. This allows storage devices to be located where they are needed most or where they fit best, based on their capacity and/or whether or not they can be used for unattended backup.

§ Any machine that holds a large amount of data on its disk drive may be directly connected to a storage device dedicated to its use. Likewise, any machine that holds extremely sensitive data may be directly connected to a dedicated storage device, regardless of the location of the other storage devices used for backup on the network.

§ If several machines are directly connected to storage devices on a network, network traffic will be reduced considerably because data can travel directly to these devices rather than traveling over the network to reach a storage device. Bandwidth problems will be relieved. Data is also more secure when it can be backed up directly because it need not travel on the network where it might be intercepted.

§ Backups complete more quickly when data is backed up to several distributed storage devices simultaneously. In addition, the time that data would normally take to travel to a storage device over a network is saved. In a typical network environment, data can be backed up directly to a storage device at the rate of 700 kilobytes per second. When the same data is backed up to a device elsewhere on the network, backup throughput can fall to 200 kilobytes per second.

§ The time needed to encrypt sensitive data is eliminated when data is backed up to a device directly. Otherwise, such data would have to be encrypted before it enters the network and decrypted after it exits to ensure its integrity while it is on the network.
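The effect of the throughput figures quoted above on elapsed backup time can be estimated with a short calculation. The following Python sketch uses the 700 and 200 kilobytes per second rates from the text and the 20% encryption penalty mentioned earlier as an upper bound. The 2 GB disk size, the 1024 KB per MB conversion, and the treatment of encryption as a simple multiplier on elapsed time are our assumptions for illustration.

```python
DIRECT_KB_PER_S = 700        # directly attached device (rate from the text)
NETWORK_KB_PER_S = 200       # device elsewhere on the network (rate from the text)
ENCRYPTION_OVERHEAD = 0.20   # "as much as 20%"; applied as a multiplier (assumption)


def backup_hours(size_mb, kb_per_s, encrypted=False):
    """Estimate elapsed hours to back up size_mb megabytes at kb_per_s."""
    seconds = size_mb * 1024 / kb_per_s
    if encrypted:
        seconds *= 1 + ENCRYPTION_OVERHEAD
    return seconds / 3600


size_mb = 2048  # hypothetical 2 GB disk

direct = backup_hours(size_mb, DIRECT_KB_PER_S)
remote = backup_hours(size_mb, NETWORK_KB_PER_S)
remote_encrypted = backup_hours(size_mb, NETWORK_KB_PER_S, encrypted=True)

print(f"direct:            {direct:.2f} h")
print(f"over network:      {remote:.2f} h")
print(f"network+encrypted: {remote_encrypted:.2f} h")
```

Whatever disk size is chosen, the quoted rates imply that the network backup takes 3.5 times as long as the direct one (700 / 200), before any encryption penalty is added.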

SCENARIOS

In order to show the advantages of a distributed backup strategy in more concrete terms, two scenarios will be discussed. Both the scenarios and the illustrations for them will be extremely simplified and should be seen as examples of what can be done within part of a more complex network or over networks that are linked together.

SCENARIO 1

In Scenario 1, we are hypothesizing a network or part of a network that has one large database server and seven clients. These client machines use a variety of application tools to analyze and report on data extracted from the database. 50% of all the data on the network is held on the database server, and four storage devices are available for backup: three small capacity tape drives and one large optical jukebox.

PROPOSED SOLUTION FOR SCENARIO 1

The proposed solution for Scenario 1 is shown in Figure 7. Since a distributed backup strategy allows storage devices to be attached to any client or server machine on the network, the ideal scheme would be to attach the large capacity optical jukebox directly to the database server machine.

 

Because 50% of the data on the network is held on the database server and the data from the database would be directly backed up to the optical jukebox, 50% of the data is backed up totally off the network. This would, of course, eliminate 50% of the network traffic during backup, and would increase the speed of the database backup processing considerably.

If the size of the database grew beyond the effective capacity of the optical disk for backups, another device could be attached. Up to seven devices can currently be attached with a standard SCSI controller.

The other three small capacity tape devices should ideally be attached to one of the client machines, and that machine would then be considered a second backup server. These devices should probably be grouped on one machine to facilitate the mounting of tapes. If the tape drives are scattered over an entire floor or building, the physical mounting of the tape volumes would be difficult. If larger capacity devices are purchased, they could be distributed and run unattended.

Compression could be used to further improve backup performance by reducing the amount of data that must travel over the network to a storage device.

SCENARIO 2

In Scenario 2, we hypothesize a network with four server and four client machines. The servers contain 80% of the data on the network, roughly 20% for each server. The four client machines use data from the servers, but generally perform standalone functions such as word processing.

Four storage devices are available. All are medium size tape jukeboxes with roughly similar capacity. All four are also automated devices, not requiring physical tape mounts if set up properly.

PROPOSED SOLUTION FOR SCENARIO 2

The proposed solution with a distributed backup strategy for Scenario 2 is shown in Figure 9.

With distributed backup software, one storage device could be attached to each server. This would eliminate 80% of the network traffic normally generated by a backup process and would speed the backup of the servers, which could run in parallel. The other 20% of the data, which resides on the client machines, could be backed up to any one server, or distributed among them. If set up properly, the backup should run completely unattended and need only be monitored for hardware errors and for jobs that otherwise fail to complete.

Again, compression could be used to further reduce network traffic.
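Both scenarios can be checked with the same small calculation: data held on a machine with its own storage device stays off the network, and everything else must cross it. The Python sketch below is illustrative only. The paper gives the server shares (50% in Scenario 1, 20% per server in Scenario 2) but not the split among client machines or the location of the devices in the centralized layout, so those details and all machine names are assumptions.

```python
def percent_over_network(data_percent, device_hosts):
    """Percent of all backup data that must cross the network.

    Data on a machine with an attached storage device is backed up
    locally; data on every other machine travels to a backup server.
    """
    hosts = set(device_hosts)
    return sum(pct for machine, pct in data_percent.items()
               if machine not in hosts)


# Scenario 1: the database server holds 50% of the data. The split of the
# remaining 50% across the seven clients is assumed.
scenario1 = {"db": 50, "c1": 8, "c2": 7, "c3": 7, "c4": 7,
             "c5": 7, "c6": 7, "c7": 7}
one_central = percent_over_network(scenario1, ["c1"])            # all devices on c1
one_distributed = percent_over_network(scenario1, ["db", "c1"])  # jukebox on db

# Scenario 2: four servers hold 20% each. An even 5% per client is assumed.
scenario2 = {"srv1": 20, "srv2": 20, "srv3": 20, "srv4": 20,
             "cl1": 5, "cl2": 5, "cl3": 5, "cl4": 5}
two_central = percent_over_network(scenario2, ["srv1"])
two_distributed = percent_over_network(
    scenario2, ["srv1", "srv2", "srv3", "srv4"])

print("Scenario 1:", one_central, "->", one_distributed)
print("Scenario 2:", two_central, "->", two_distributed)
```

The size of the saving depends on where the devices sat in the centralized layout. In the Scenario 2 sketch, the single centralized server already backs up its own 20% locally, so distributing the devices reduces network traffic from 80% of the data to 20% of the data.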

CONTROLLING THE BACKUP

We will now consider the second question posed at the beginning of this paper:

§ How will the backup process be controlled?

To do this, we will again look at the ways in which three types of backups handle these issues:

§ Isolated network backup (original ad hoc strategy)

§ Centralized storage backup (current widespread strategy)

§ Distributed backup (proposed strategy to remedy current problems)

CENTRALIZED CONTROL

One of the most important aspects of a backup strategy, regardless of its environment, is control. Under the umbrella of control come such critical backup functions as scheduling, catalog maintenance, and media tracking. These functions ensure that backup images are reliable and can be restored when needed.

In a traditional mainframe environment, the system administrator generally controls all aspects of the backup procedure, and, with some consultation with management and users, can plan backup schedules with relative ease. Security measures are generally firmly in place, and the backup works within the security parameters already set up on the system. Careful monitoring for media errors and for the successful completion of backup jobs is generally all that’s needed after a backup schedule has been defined.

As has already been discussed, early backups on a network were usually ad hoc and spotty because machines in early network environments were isolated and no software existed to provide overall administrative control. The LAN revolution began when legions of individual PCs were linked for the convenience of exchanging information or sharing expensive hardware resources. Companies often had a series of small LANs, which did not communicate with each other, and administrative control (and backup) was generally left to managers of these groups or in the hands of individual users.

As LANs grew and were linked, administrative control became centralized, and backups became the province of the network administrator. Backup procedures were formalized, and tools, namely centralized storage backup software, met the demand for centralized control and other advanced features facilitated by such control.

Centralized administrative control is still critical, even in a distributed backup. Control of functions like scheduling and catalog maintenance, along with monitoring and error handling, is more efficient and more likely to be performed regularly if one person is in charge and responsible. In addition, storage devices were once simple and very cheap. Today’s devices are so varied and feature-rich that evaluation by one administrator is the most cost-effective purchasing strategy. Devices can then be chosen by someone qualified to judge their ability to fit into a company’s overall backup strategy.

SECURITY AND THE CATALOG

Backups would be futile if the information needed to restore the backed up data were not recorded and readily available. The repository for this information is generally called a catalog in a traditional mainframe environment and may be called a backup database or index on a client-server network. The following kinds of information must be held:

§ Information about what has been backed up

§ Information about when the backup was done and how long it is to be retained

§ Information about where the backup image is held

In a large shop, where multigigabytes of data are backed up, this catalog can be huge. Having an automated process in place is critical. This process recycles storage media when a backup image expires and deletes outdated catalog entries.
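The three kinds of catalog information and the automated expiration process can be pictured with a small data structure. The Python sketch below is a simplified illustration and does not describe the catalog format of any particular product. The field names, file paths, and volume labels are hypothetical, and the rule that a volume is recycled only when every image on it has expired is our assumption.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class CatalogEntry:
    path: str            # what has been backed up
    backed_up_on: date   # when the backup was done
    retain_days: int     # how long the image is to be retained
    volume: str          # where the backup image is held

    def expires_on(self):
        return self.backed_up_on + timedelta(days=self.retain_days)


def expire(catalog, today):
    """Drop outdated entries and report volumes that can be recycled.

    A volume is reusable only when no live entry still refers to it.
    """
    live = [e for e in catalog if e.expires_on() > today]
    live_volumes = {e.volume for e in live}
    reusable = {e.volume for e in catalog} - live_volumes
    return live, reusable


catalog = [
    CatalogEntry("/payroll/jan.dat", date(1995, 1, 31), 30, "TAPE01"),
    CatalogEntry("/payroll/feb.dat", date(1995, 2, 28), 30, "TAPE02"),
    CatalogEntry("/plans/draft.doc", date(1995, 2, 1), 30, "TAPE02"),
]

live, reusable = expire(catalog, today=date(1995, 3, 15))
print([e.path for e in live])
print(sorted(reusable))
```

In this example TAPE02 is not recycled even though one of its images has expired, because it still holds an image that must be retained.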

However, there is a vulnerability during a centrally-controlled network backup, which does not exist in mainframe environments [STAL90] or when storage devices are directly attached to isolated individual machines on a simple LAN. Whenever a file is processed during a centralized network backup, information about that backup must be recorded and maintained, and this is done by sending messages to the central backup server and adding the information to the catalog. The messages must travel across the network to reach the catalog.

Messages passed during a traditional mainframe backup are relatively secure because they are difficult to intercept. But, in a network environment, where there are many points of entry, messages sent by the backup to its central catalog can be intercepted. These messages can then be used to gain access to files of sensitive data that have been backed up (even if the backup has encrypted them) or to sabotage the backup itself by adding catalog records containing false information.

If the catalog is centralized, it is generally wise to use backup software that allows the optional encryption of backup messages along with the optional encryption of files. Having a variety of encryption options available allows each network administrator to balance the costs of encryption with a business’s need for security.
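Encryption protects the contents of backup messages. A related and cheaper measure, shown here as an illustration rather than as the encryption recommended above, is to attach a keyed message authentication code (HMAC) to each catalog message so that the catalog can reject records it did not originate. This addresses only the threat of false catalog records; it does not hide the message contents. The Python sketch below uses only the standard library, and the key, record fields, and function names are hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical secret shared by the backup agents and the catalog server.
SHARED_KEY = b"hypothetical-key-known-to-agents-and-server"


def sign(record, key=SHARED_KEY):
    """Serialize a catalog record and compute its HMAC-SHA256 tag."""
    body = json.dumps(record, sort_keys=True).encode("utf-8")
    tag = hmac.new(key, body, hashlib.sha256).hexdigest()
    return body, tag


def accept(body, tag, key=SHARED_KEY):
    """Return the record if the tag is valid, otherwise None."""
    expected = hmac.new(key, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, tag):
        return None
    return json.loads(body)


record = {"file": "/payroll/feb.dat", "volume": "TAPE02"}
body, tag = sign(record)

genuine = accept(body, tag)
# A message altered in transit no longer matches its tag.
forged = accept(body.replace(b"TAPE02", b"TAPE99"), tag)

print(genuine)
print(forged)
```

A site that also needs confidentiality for backup messages would still require the optional encryption discussed above, with its attendant processing cost.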

CONCLUSION

Finding the best backup strategy for a particular site is a matter of striking a balance between the perfect solution to backup needs and what is possible given the constraints of time, resources, and budget. Backup software must always have two characteristics--reliability and efficiency.

If individual files from a backup or the entire backup image cannot be restored when needed, the backup is useless. Likewise, unless a backup completes quickly and is efficient in its use of resources, a full schedule of backups, which would ensure that a site is ready for any type of emergency, cannot be performed. Reliability and high performance are equally important in any type of backup software.

However, because of the special problems inherent in backing up data in a client-server environment, backup software must also have flexibility. The ideal software for this task should be able to process data across networks and allow storage devices to be attached where needed instead of at one central site. This is important for security and crucial for reducing network traffic and preventing backup processing from overwhelming a network.

Encryption of messages and selective encryption of files are also particularly important in a distributed environment. A client-server system is far more vulnerable than a mainframe environment during backup because of the many points of entry on the network. For this reason, encryption of backup messages may be critical at some sites to prevent interception and interference from unauthorized persons. Encryption of selected files is also important for securing sensitive data.

Like other types of software designed for use on a network, backup software is gradually evolving to fulfill the special performance and security needs faced on a client-server system. By distributing resources and finding innovative ways to control them, designers can create backups as truly flexible and distributed as the networks on which they run--and those backups can be as reliable and efficient as they were in the glass house.

REFERENCES

[COOP94] Edward B. Cooper, "Taming the Beast: Backup Challenges Administrators," LAN Times, August 8, 1994, p. 54.

[FARN94] Mark Farnham, "Backup and Recovery Requirements," The MOSES Whitepapers, UniForum, 1994, pp. 23-34.

[GEPN95] Herb Gepner, "Specifications" for Syncsort/BACKUP in Datapro Computer Systems Series: Software (1995), No. 3850 Change Management, pp. 5-7.

[GOOD94] Stuart R. Goodgold, "Distributed Backup and Restore Performance," CMG94 Proceedings, 2, 1009.

[HIGG94] Kelly Jackson Higgins, "Security is a Dog from Hell," reprinted as "Securing Distributed Open Systems" in Datapro Computer Systems Series: Overviews (1995), No. 9004 Systems Security. Originally published in UniForum Monthly, Vol. XIV, No. 6, June 1994, pp. 20-23.

[LASH94] David A. Lash, "Distributed Systems Data Management Needs and Experience," CMG94 Proceedings, I, 521-527.

[STAL90] William Stallings, Local Networks, Third Edition, New York: Macmillan, 1990, pp. 447-467.