Data storage systems - DAS, NAS, SAN: a step towards storage networks

What is a storage system?

Information is the driving force of modern business and is currently considered the most valuable strategic asset of any enterprise. The volume of information is growing exponentially along with the growth of global networks and the development of e-commerce. Success in the information war requires an effective strategy for storing, protecting, sharing and managing the most important digital asset - data - both today and in the near future.

Storage resource management has become one of the most pressing strategic challenges facing information technology departments. Due to the development of the Internet and fundamental changes in business processes, information is accumulating at an unprecedented rate. In addition to the urgent problem of storing an ever-growing volume of information, the problem of ensuring reliable data storage and constant access to information is no less acute. For many companies, the "24 hours a day, 7 days a week, 365 days a year" formula for data access has become the norm.

In the case of a separate PC, a data storage system (DSS) can be understood as a separate internal HDD or a disk system. In the case of corporate storage, there are traditionally three technologies for organizing data storage: Direct Attached Storage (DAS), Network Attached Storage (NAS) and Storage Area Network (SAN).

Direct Attached Storage (DAS)

DAS technology implies a direct connection of drives to a server or to a PC. In this case, the drives (hard drives, tape drives) can be either internal or external. The simplest case of a DAS system is one disk inside a server or PC. In addition, an internal RAID array of disks organized with a RAID controller can also be regarded as a DAS system.

It is worth noting that, despite the formal possibility of using the term DAS system in relation to a single disk or to an internal array of disks, a DAS system is usually understood as an external rack or cage with disks, which can be considered as an autonomous storage system (Fig. 1). In addition to independent power supply, such autonomous DAS systems have a specialized controller (processor) for managing the storage array. For example, a RAID controller with the ability to organize RAID arrays of various levels can act as such a controller.

Fig. 1. An example of a DAS storage system

It should be noted that autonomous DAS systems can have several external I/O channels, which makes it possible to connect several computers to the DAS system at the same time.

SCSI (Small Computer Systems Interface), SATA, PATA and Fiber Channel interfaces can be used to connect drives (internal or external) in DAS technology. While SCSI, SATA and PATA are used primarily for connecting internal drives, Fiber Channel is used exclusively for connecting external drives and standalone storage systems. The advantage of the Fiber Channel interface in this case is that it has no strict length limitation and can be used when the server or PC connected to the DAS system is at a considerable distance from it. SCSI and SATA interfaces can also be used to connect external storage systems (in this case, the SATA interface is called eSATA); however, these interfaces impose a strict limit on the maximum length of the cable connecting the DAS system and the server.

The main advantages of DAS systems include their low cost (in comparison with other storage solutions), ease of deployment and administration, and high speed of data exchange between the storage system and the server. It is thanks to these qualities that they have gained great popularity in the segment of small offices and small corporate networks. At the same time, DAS systems have their drawbacks, which include poor manageability and suboptimal utilization of resources, since each DAS system requires a dedicated server.

Currently, DAS systems occupy a leading position, but the share of sales of these systems is constantly decreasing. DAS systems are gradually being displaced either by NAS systems or by universal solutions that can be used as DAS, NAS and even SAN systems.

DAS systems should be used when you need to increase the disk capacity of a single server and take the drives outside its chassis. DAS systems can also be recommended for workstations that process large amounts of data (for example, nonlinear video editing stations).

Network Attached Storage (NAS)

NAS systems are network-attached storage systems that connect directly to the network, just like a network print server, router or any other network device (Fig. 2). In fact, NAS systems are an evolution of file servers: the difference between a traditional file server and a NAS device is about the same as between a hardware network router and a software router running on a dedicated server.

Fig. 2. An example of a NAS storage system

To understand the difference between a traditional file server and a NAS device, recall that a traditional file server is a dedicated computer (server) that stores information available to network users. To store information, hard disks installed in the server can be used (as a rule, they are installed in special cages), or DAS devices can be connected to the server. The file server is administered using the server operating system. This approach to organizing data storage is currently the most popular in the segment of small local area networks, but it has one significant drawback: a general-purpose server (especially in combination with a server operating system) is by no means a cheap solution, and most of the functionality inherent in such a server is simply not used by a file server. The idea is therefore to create an optimized file server with an optimized operating system and a balanced configuration. It is this concept that the NAS device embodies. In this sense, NAS devices can be thought of as "thin" file servers, or, as they are otherwise called, filers.

In addition to an optimized OS, free of all functions not related to file system maintenance and data I/O, NAS systems have a speed-optimized file system. NAS systems are designed in such a way that all of their computing power is focused solely on serving and storing files. The operating system itself is located in flash memory and is preinstalled by the manufacturer. Naturally, when a new version of the OS is released, the user can reflash the system independently. Connecting NAS devices to the network and configuring them is a fairly simple task that can be handled by any power user, let alone a system administrator.

Thus, compared to traditional file servers, NAS devices offer higher performance at a lower cost. Currently, almost all NAS devices are designed for use in Ethernet networks (Fast Ethernet, Gigabit Ethernet) based on TCP/IP protocols. NAS devices are accessed using special file access protocols. The most common file access protocols are CIFS, NFS and DAFS.

CIFS (Common Internet File System) is a protocol that provides access to files and services on remote computers (including over the Internet) and uses a client-server interaction model. The client sends the server a request for access to files; the server fulfills the request and returns the result. CIFS is traditionally used on Windows LANs for file access and uses TCP/IP to transport data. CIFS provides functionality similar to FTP (File Transfer Protocol), but gives clients improved control over files. It also allows file access to be shared between clients, using locking and automatic restoration of the connection to the server in the event of a network failure.

NFS (Network File System) is traditionally used on UNIX platforms and is a combination of a distributed file system and a network protocol. NFS also uses a client-server communication model. The NFS protocol provides access to files on a remote host (server) as if they were on the user's own computer. NFS uses TCP/IP to transport data. For NFS to work over the Internet, the WebNFS protocol was developed.
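
To make the file-access discussion concrete, here is a minimal sketch of how a Linux client might mount a CIFS share and an NFS export. The server name, share paths and mount points are hypothetical; the sketch assumes the standard CIFS and NFS mount utilities are installed and that it runs with sufficient privileges.

```python
import subprocess

# Hypothetical NAS host and exports, for illustration only.
CIFS_SHARE = "//nas01/projects"        # SMB/CIFS share (Windows-style access)
NFS_EXPORT = "nas01:/export/projects"  # NFS export (UNIX-style access)

def mount_cifs(share: str, mountpoint: str, user: str) -> None:
    """Mount a CIFS/SMB share over TCP/IP."""
    subprocess.run(
        ["mount", "-t", "cifs", share, mountpoint, "-o", f"username={user}"],
        check=True,
    )

def mount_nfs(export: str, mountpoint: str) -> None:
    """Mount an NFS export over TCP/IP."""
    subprocess.run(["mount", "-t", "nfs", export, mountpoint], check=True)

if __name__ == "__main__":
    mount_cifs(CIFS_SHARE, "/mnt/projects-smb", user="alice")
    mount_nfs(NFS_EXPORT, "/mnt/projects-nfs")
```

In both cases, after mounting, the client sees ordinary directories; the protocol details stay hidden behind the file system layer.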

DAFS (Direct Access File System) is a standard file access protocol based on NFS. It allows applications to transfer data directly to transport resources, bypassing the operating system and its buffers. DAFS provides high file I/O speeds and reduces CPU utilization by dramatically decreasing the number of operations and interrupts typically required when processing network protocols.

DAFS was designed with clustered and server environments in mind, for databases and a variety of end-to-end Internet applications. It provides the lowest latency when accessing file shares and data, and supports intelligent system and data recovery mechanisms, which makes it attractive for use in NAS systems.

Summarizing the above, NAS systems can be recommended for multi-platform networks where network access to files is required and where ease of installation and administration of the storage system are important factors. A good example is the use of a NAS as a file server in a small company office.

Storage Area Network (SAN)

Actually, a SAN is no longer a separate device but a complete solution: a specialized network infrastructure for storing data. SANs are integrated as separate dedicated subnets into a local area network (LAN) or wide area network (WAN).

Basically, SANs link one or more servers (SAN servers) to one or more storage devices. A SAN allows any SAN server to access any storage device without loading other servers or the local area network. In addition, data can be exchanged between storage devices without the participation of servers. In effect, SANs allow a very large number of users to store information in one place (with fast centralized access) and to share it. RAID arrays, various libraries (tape, magneto-optical, etc.), as well as JBOD systems (disk arrays not combined into a RAID) can be used as storage devices.

Storage area networks have been intensively developed and deployed only since 1999.

Just as local area networks can, in principle, be built on different technologies and standards, different technologies can also be used to build SANs. But just as Ethernet (Fast Ethernet, Gigabit Ethernet) has become the de facto standard for local area networks, the Fiber Channel (FC) standard dominates in storage networks. Indeed, it was the development of the Fiber Channel standard that led to the very concept of the SAN. At the same time, it should be noted that the iSCSI standard, on the basis of which SANs can also be built, is gaining more and more popularity.

Along with its speed parameters, one of the most important advantages of Fiber Channel is its long-distance capability and topology flexibility. The concept of storage network topology is based on the same principles as traditional local area networks built on switches and routers, which greatly simplifies the construction of multi-node system configurations.

It is worth noting that for data transmission in the Fiber Channel standard, both fiber-optic and copper cables are used. When organizing access to geographically remote nodes at a distance of up to 10 km, standard equipment and single-mode fiber are used for signal transmission. If the nodes are separated by a greater distance (tens or even hundreds of kilometers), special amplifiers are used.

SAN topology

A typical Fiber Channel SAN is shown in Fig. 3. Its infrastructure consists of storage devices with a Fiber Channel interface, SAN servers (servers connected both to the local network via Ethernet and to the SAN via Fiber Channel), and a switching fabric (Fiber Channel Fabric), which is built on Fiber Channel switches and hubs and is optimized for transferring large blocks of data. Network users access the storage system through the SAN servers. It is important that traffic inside the SAN is separated from the IP traffic of the local network, which, of course, reduces the load on the local network.

Fig. 3. Typical SAN network layout

Benefits of SANs

The main advantages of SAN technology include high performance, high data availability, excellent scalability and manageability, and the ability to consolidate and virtualize data.

Fiber Channel fabrics with non-blocking architecture allow multiple SAN servers to access storage devices concurrently.

In a SAN architecture, data can be easily moved from one storage device to another to optimize data placement. This is especially important when multiple SAN servers require concurrent access to the same storage devices. Note that this kind of data consolidation is not possible with other technologies, for example with DAS devices, that is, storage devices directly connected to servers.

Another opportunity provided by the SAN architecture is storage virtualization. The idea behind virtualization is to give SAN servers access not to individual storage devices but to pools of resources; that is, servers should "see" not storage devices but virtual resources. For a practical implementation of virtualization, a special virtualization device can be placed between the SAN servers and the disk devices: storage devices are connected to one side of it and SAN servers to the other. In addition, many modern FC switches and HBAs provide virtualization capabilities.
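
As a rough illustration of this idea, the toy model below maps a virtual volume onto extents of physical arrays, so a server addresses only logical blocks of "vol0" and never sees the underlying devices. All device names and sizes are invented for the example; this is a conceptual sketch, not how any particular virtualization appliance works.

```python
from dataclasses import dataclass

@dataclass
class Extent:
    device: str       # physical array/LUN identifier (hypothetical)
    start_block: int  # first physical block of this extent
    length: int       # number of blocks in the extent

class VirtualVolume:
    def __init__(self, name: str, extents: list[Extent]):
        self.name = name
        self.extents = extents

    def resolve(self, logical_block: int) -> tuple[str, int]:
        """Map a logical block number to a (physical device, block) pair."""
        offset = logical_block
        for ext in self.extents:
            if offset < ext.length:
                return ext.device, ext.start_block + offset
            offset -= ext.length
        raise ValueError("logical block beyond volume end")

vol = VirtualVolume("vol0", [Extent("array-A", 0, 1000),
                             Extent("array-B", 5000, 1000)])
print(vol.resolve(1500))  # -> ('array-B', 5500): the server never sees this
```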

The next capability provided by SANs is remote data mirroring. The principle of data mirroring is to duplicate information on several media, which increases the reliability of storage. The simplest case of data mirroring is combining two disks into a RAID 1 array, where the same information is written simultaneously to both disks. The disadvantage of this method is that both drives are located locally (as a rule, in the same cage or rack). SANs overcome this drawback: they make it possible to mirror not just individual storage devices but entire storage systems, which may be hundreds of kilometers apart.
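
A minimal sketch of the mirroring principle follows, assuming two hypothetical member devices and 4 KB blocks. In practice this logic lives in a RAID controller or the operating system (e.g. Linux md), not in application code.

```python
import os

BLOCK_SIZE = 4096

def mirrored_write(paths: list[str], block_no: int, data: bytes) -> None:
    """RAID 1 idea: write the same block to every mirror member."""
    assert len(data) == BLOCK_SIZE
    for path in paths:
        fd = os.open(path, os.O_WRONLY)
        try:
            os.pwrite(fd, data, block_no * BLOCK_SIZE)
        finally:
            os.close(fd)

def mirrored_read(paths: list[str], block_no: int) -> bytes:
    """Read from the first member that responds; the rest are redundant copies."""
    for path in paths:
        try:
            fd = os.open(path, os.O_RDONLY)
            try:
                return os.pread(fd, BLOCK_SIZE, block_no * BLOCK_SIZE)
            finally:
                os.close(fd)
        except OSError:
            continue  # fall through to the surviving mirror
    raise IOError("all mirror members failed")

# Hypothetical mirror members; in a SAN these could be hundreds of km apart.
MIRROR = ["/tmp/mirror-a.img", "/tmp/mirror-b.img"]
```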

Another advantage of SANs is the ease of organizing data backup. Traditional backup technology, used on most local networks, requires a dedicated backup server and, most importantly, dedicated network bandwidth. In fact, during a backup operation the server itself becomes inaccessible to users of the local network. This is why, as a rule, backups are performed at night.

The SAN architecture allows a fundamentally different approach to backup: the backup server is part of the SAN and connects directly to the switching fabric, so backup traffic is isolated from LAN traffic.

Equipment used to create SAN networks

As noted, a SAN deployment requires storage devices, SAN servers and switching fabric hardware. Switching fabrics include physical-layer devices (cables, connectors), interconnect devices for connecting SAN nodes to each other, and translation devices that convert the Fiber Channel (FC) protocol to other protocols, for example SCSI, FCP, FICON, Ethernet, ATM or SONET.

Cables

As noted, Fiber Channel allows both fiber-optic and copper cables to be used for connecting SAN devices, and different cable types can be combined within one SAN. Copper cable is used for short distances (up to 30 m), while fiber-optic cable is used both for short distances and for distances up to 10 km or more. Both multimode and single-mode fiber-optic cables are used: multimode for distances up to 2 km and single-mode for longer distances.

The coexistence of different cable types within the same SAN is provided by special converters: GBIC (Gigabit Interface Converter) and MIA (Media Interface Adapter).

The Fiber Channel standard defines several possible transmission rates (see table). Currently the most common are FC devices of the 1, 2 and 4 GFC standards. Backward compatibility of higher-speed devices with lower-speed ones is provided, so a 4 GFC device automatically supports the connection of 1 and 2 GFC devices.
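
The backward-compatibility rule can be expressed as a one-line negotiation: the link runs at the highest speed supported by both ports. The sketch below is a simplification of the example in the text, with speeds given in Gbit/s; the supported-speed sets are illustrative.

```python
def negotiated_speed(port_a: set[int], port_b: set[int]) -> int:
    """Pick the highest transmission rate both ports support."""
    common = port_a & port_b
    if not common:
        raise ValueError("no common supported speed")
    return max(common)

hba_4gfc = {1, 2, 4}   # a 4 GFC adapter also supports 1 and 2 GFC
old_disk_1gfc = {1}    # an older 1 GFC device
print(negotiated_speed(hba_4gfc, old_disk_1gfc))  # -> 1: the link falls back
```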

Interconnect devices

Fiber Channel supports a variety of device networking topologies, such as point-to-point, arbitrated loop (FC-AL) and switched fabric.

A point-to-point topology can be used to connect a server to dedicated storage. In this case, the storage is not shared with other SAN servers. In fact, this topology is a variant of the DAS system.

At a minimum, a point-to-point topology requires a server equipped with a Fiber Channel adapter and a Fiber Channel storage device.

An arbitrated loop (FC-AL) topology is a device wiring scheme in which data is transmitted around a logical closed loop. In an FC-AL topology, the connectivity devices can be Fiber Channel hubs or switches. With hubs, the bandwidth is shared among all nodes in the loop, while a switch gives each node the full protocol bandwidth.

Fig. 4 shows an example of a Fiber Channel arbitrated loop.

Fig. 4. Example of a Fiber Channel arbitrated loop

The configuration is similar to the physical star and logical ring used in Token Ring LANs. Also, as in Token Ring networks, data travels around the loop in one direction, but, unlike Token Ring, a device can request the right to transmit data rather than wait for a free token. Fiber Channel arbitrated loops can address up to 127 ports; however, as practice shows, typical FC-AL loops contain up to 12 nodes, and after 50 nodes are connected, performance drops dramatically.
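
The arbitration idea can be sketched as follows: every node wishing to transmit raises a request, and the node with the highest priority (modeled here as the lowest loop address) wins the medium for the next transfer. This is a conceptual model only, not the real FC-AL arbitration state machine.

```python
def arbitrate(requesting_nodes: list[int]) -> int:
    """Return the loop address that wins arbitration this round."""
    if not requesting_nodes:
        raise ValueError("no node is requesting the loop")
    return min(requesting_nodes)  # lower loop address = higher priority

# Three rounds of arbitration; addresses are invented for illustration.
rounds = [[5, 23, 101], [23, 101], [101]]
for nodes in rounds:
    winner = arbitrate(nodes)
    print(f"nodes {nodes} -> node {winner} transmits")
```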

The Fiber Channel switched fabric topology is implemented using Fiber Channel switches. In this topology, each device has a logical connection to every other device. In fact, Fiber Channel fabric switches perform the same functions as traditional Ethernet switches. Recall that, unlike a hub, a switch is a high-speed device that provides one-to-one connections and handles multiple concurrent connections. Any host connected to a Fiber Channel switch receives the full protocol bandwidth.

In most cases, large SANs are built using a mixed topology. At the lower level, FC-AL rings are used, connected to low-performance switches, which, in turn, are connected to high-speed switches that provide the highest possible bandwidth. Several switches can be connected to each other.

Translation devices

Translation devices are intermediate devices that convert the Fiber Channel protocol to higher-level protocols. These devices are designed to connect a Fiber Channel network to an external WAN or local network, and also to connect various devices and servers to a Fiber Channel network. They include bridges, Fiber Channel host bus adapters (HBAs), routers, gateways and network adapters. The classification of translation devices is shown in Fig. 5.

Fig. 5. Classification of translation devices

The most common translation devices are HBAs with a PCI interface, used to connect servers to a Fiber Channel network. Network adapters make it possible to connect local Ethernet networks to Fiber Channel networks. Bridges are used to connect SCSI storage devices to a Fiber Channel network. It should be noted that recently almost all storage devices intended for use in a SAN have a built-in Fiber Channel interface and do not require bridges.

Storage devices

Both hard disks and tape drives can be used as storage devices in SANs. As for configurations of hard drives used as SAN storage, these can be either JBOD arrays or RAID arrays. Traditionally, storage devices for SANs come in the form of external racks or cages equipped with a dedicated RAID controller. Unlike NAS or DAS devices, SAN devices are equipped with a Fiber Channel interface, while the disks themselves can have either SCSI or SATA interfaces.

In addition to hard disk storage devices, tape drives and libraries are widely used in SANs.

SAN servers

SAN servers differ from conventional application servers in only one respect: in addition to an Ethernet network adapter for interaction with the local network, they are equipped with an HBA, which allows them to be connected to Fiber Channel-based SANs.

Intel storage systems

Next, we'll look at a few specific examples of Intel storage devices. Strictly speaking, Intel does not release complete solutions; it develops and produces platforms and individual components for building storage systems. On the basis of these platforms, many companies (including a number of Russian companies) produce complete solutions and sell them under their own brands.

Intel Entry Storage System SS4000-E

The Intel Entry Storage System SS4000-E is a NAS device designed for use in small and medium-sized offices and multi-platform LANs. With the Intel Entry Storage System SS4000-E, Windows, Linux and Macintosh clients can access shared data. In addition, the Intel Entry Storage System SS4000-E can act as both a DHCP server and a DHCP client.

The Intel Entry Storage System SS4000-E is a compact external rack that supports up to four SATA drives (Fig. 6). Thus, the maximum system capacity can reach 2 TB using 500 GB drives.

Fig. 6. Intel Entry Storage System SS4000-E storage system

The Intel Entry Storage System SS4000-E uses a SATA RAID controller supporting RAID levels 1, 5 and 10. Since this system is a NAS device, that is, in fact a "thin" file server, it must have a specialized processor, memory and a preinstalled operating system. The Intel Entry Storage System SS4000-E uses an Intel 80219 processor clocked at 400 MHz. In addition, the system is equipped with 256 MB of DDR memory and 32 MB of flash memory for storing the operating system. The operating system is Linux Kernel 2.6.

For connection to the local network, the system provides a dual-channel Gigabit Ethernet controller. In addition, there are two USB ports.

The Intel Entry Storage System SS4000-E supports CIFS / SMB, NFS, and FTP, and is configured using a web interface.

When Windows clients are used (Windows 2000/2003/XP are supported), backup and data recovery can additionally be implemented.

Intel Storage System SSR212CC

The Intel Storage System SSR212CC is a versatile storage platform for DAS, NAS and SAN storage. This system is housed in a 2U chassis and is designed to be mounted in a standard 19-inch rack (Fig. 7). The Intel Storage System SSR212CC supports up to 12 hot-swappable SATA or SATA II drives, for up to 6 TB of storage capacity with 500 GB drives.

Fig. 7. Intel Storage System SSR212CC

In fact, the Intel Storage System SSR212CC is a full-fledged high-performance server running Red Hat Enterprise Linux 4.0, Microsoft Windows Storage Server 2003, Microsoft Windows Server 2003 Enterprise Edition or Microsoft Windows Server 2003 Standard Edition.

The server is based on an Intel Xeon processor at 2.8 GHz (800 MHz FSB, 1 MB L2 cache). The system supports up to 12 GB of DDR2-400 SDRAM with ECC (six DIMM slots are provided for memory modules).

The Intel Storage System SSR212CC is equipped with two Intel RAID Controller SRCS28X cards supporting RAID levels 0, 1, 10, 5 and 50. In addition, the Intel Storage System SSR212CC has a dual-channel Gigabit Ethernet controller.

Intel Storage System SSR212MA

The Intel Storage System SSR212MA is an iSCSI-based IP SAN storage platform.

The system is housed in a 2U chassis and is designed to be mounted in a standard 19-inch rack. The Intel Storage System SSR212MA supports up to 12 hot-swappable SATA drives, for up to 6 TB of storage capacity with 500 GB drives.

The hardware configuration of the Intel Storage System SSR212MA does not differ from the Intel Storage System SSR212CC.

Direct Attached Storage (DAS)

Direct attached storage (DAS) systems implement the most well-known connection type. With DAS, the server has a dedicated connection to the storage system and is almost always the sole user of the device. The server receives block access to the storage system, that is, it addresses data blocks directly.

Storage systems of this type are fairly simple and usually inexpensive. The disadvantage of the direct connection method is the short distance between the server and the storage device. A typical DAS interface is SAS.
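
Block access means the server addresses the device by block number, with no file system layer in between. The sketch below reads one sector from a raw disk device on Linux; the device path is hypothetical and such a read requires root privileges.

```python
import os

DEVICE = "/dev/sdb"  # a hypothetical directly attached disk
BLOCK_SIZE = 512     # classic sector size

def read_block(device: str, lba: int) -> bytes:
    """Read one block from the device at the given logical block address."""
    fd = os.open(device, os.O_RDONLY)
    try:
        return os.pread(fd, BLOCK_SIZE, lba * BLOCK_SIZE)
    finally:
        os.close(fd)

first_sector = read_block(DEVICE, 0)  # e.g. the partition table lives here
```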

Network Attached Storage (NAS)

Network-attached storage (NAS) systems, also known as file servers, provide their resources to clients over the network in the form of shared files or directory mount points. Clients use network file access protocols such as SMB (formerly known as CIFS) or NFS. The file server, in turn, uses block access protocols to its internal storage to process client file requests. Because a NAS operates over a network, the storage can be located far from the clients. Many network-attached storage systems also provide additional functions such as snapshots, deduplication, data compression and others.
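
As a small illustration of one such add-on function, the toy deduplicating store below keeps each unique 4 KB block once and lets files reference blocks by content hash. It is a conceptual sketch only, not how any particular NAS implements deduplication.

```python
import hashlib

class DedupStore:
    def __init__(self):
        self.blocks: dict[str, bytes] = {}     # content hash -> block data
        self.files: dict[str, list[str]] = {}  # file name -> list of hashes

    def write_file(self, name: str, data: bytes, block_size: int = 4096):
        hashes = []
        for i in range(0, len(data), block_size):
            block = data[i:i + block_size]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)  # store each unique block once
            hashes.append(digest)
        self.files[name] = hashes

    def read_file(self, name: str) -> bytes:
        return b"".join(self.blocks[h] for h in self.files[name])

store = DedupStore()
store.write_file("a.bin", b"\x00" * 8192)  # two identical zero blocks
store.write_file("b.bin", b"\x00" * 4096)  # the same block yet again
print(len(store.blocks))  # -> 1: only one unique block is actually stored
```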

Storage Area Network (SAN)

A storage area network (SAN) provides clients with block-level access to data over a network (such as Fiber Channel or Ethernet). Devices on a SAN are not owned by a single server but can be used by all clients of the storage network. Disk space can be divided into logical volumes that are allocated to individual host servers. These volumes are independent of the SAN components and their placement. Clients access the data store using block access, just as with a DAS connection, but since the SAN uses a network, the storage devices can be located far from the clients.

Currently, SAN architectures use the Small Computer System Interface (SCSI) protocol to transmit and receive data. Fiber Channel (FC) SANs encapsulate the SCSI protocol in Fiber Channel frames. SANs using iSCSI (Internet SCSI) use TCP/IP packets as the transport for SCSI. Fiber Channel over Ethernet (FCoE) encapsulates the Fiber Channel protocol in Ethernet packets using the relatively new DCB (Data Center Bridging) technology, which brings a set of enhancements to traditional Ethernet and can currently be deployed on 10GbE infrastructure. Because each of these technologies lets applications access data storage using the same SCSI protocol, it is possible to use them all in one company or to migrate from one technology to another. Applications running on a server cannot distinguish FC, FCoE and iSCSI SANs from each other, or even from DAS.
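
The layering can be pictured as wrapping the same SCSI command in different envelopes. The field names and structures below are purely illustrative, not the real wire formats; they exist only to show that the payload stays the same while the transport changes.

```python
from dataclasses import dataclass

@dataclass
class ScsiCommand:
    opcode: int  # e.g. 0x28 = READ(10)
    lba: int     # logical block address
    blocks: int  # transfer length

def as_fc_frame(cmd: ScsiCommand) -> dict:
    """FC SAN: SCSI carried in Fiber Channel frames (FCP)."""
    return {"transport": "FC", "payload": cmd}

def as_iscsi_pdu(cmd: ScsiCommand) -> dict:
    """iSCSI SAN: SCSI carried in TCP/IP packets."""
    return {"transport": "TCP/IP", "pdu": "iSCSI", "payload": cmd}

def as_fcoe_frame(cmd: ScsiCommand) -> dict:
    """FCoE: the FC frame itself carried in an Ethernet (DCB) frame."""
    return {"transport": "Ethernet+DCB", "inner": as_fc_frame(cmd)}

read = ScsiCommand(opcode=0x28, lba=2048, blocks=8)
for wrap in (as_fc_frame, as_iscsi_pdu, as_fcoe_frame):
    print(wrap(read))  # the same SCSI command, three different transports
```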

There is much discussion about whether to choose FC or iSCSI when building a SAN. Some companies focus on the low cost of initial iSCSI SAN deployment, while others choose the high reliability and availability of Fiber Channel SANs. Although low-end iSCSI solutions are less expensive than Fiber Channel, as iSCSI SAN performance and reliability requirements increase, the cost advantage disappears. At the same time, some FC implementations are easier to use than most iSCSI solutions. Therefore, the choice of a particular technology depends on business requirements, existing infrastructure, expertise and budget.

Most large organizations that use SANs choose Fiber Channel. These companies typically require proven technology, need high throughput, and have the budget to buy the most reliable and productive equipment. They also have staff to manage the SAN. Some of these companies plan to continue investing in Fiber Channel infrastructure, while others are investing in iSCSI solutions, especially 10GbE, for their virtualized servers.

Smaller companies are more likely to choose iSCSI because of its low entry cost, while retaining the ability to scale their SANs further. Inexpensive solutions usually use 1GbE technology; 10GbE solutions are significantly more expensive and are generally not considered entry-level SANs.

Unified storage

Unified storage combines NAS and SAN technologies in a single integrated solution. Such versatile storage systems allow both block and file access to shared resources and are easier to manage thanks to centralized management software.

In the simplest case, a SAN consists of storage systems, switches and servers connected by optical communication channels. In addition to disk storage systems proper, disk libraries, tape libraries (streamers), devices for storing data on optical disks (CD/DVD and others), etc. can be connected to the SAN.

An example of a highly reliable infrastructure is one in which servers are connected simultaneously to the local network (left) and to the storage area network (right). This scheme provides access to data located on the storage system in the event of a failure of any processor module, switch or access path.

Using a SAN allows you to provide:

  • centralized management of resources of servers and data storage systems;
  • connection of new disk arrays and servers without interrupting the operation of the entire storage system;
  • using previously purchased equipment in conjunction with new storage devices;
  • prompt and reliable access to data storage devices located at a great distance from servers, without significant performance loss;
  • acceleration of backup and data recovery processes (BURA).

History

The development of network technologies has led to the emergence of two network solutions for storage systems - Storage Area Networks (SANs) for block-level data exchange supported by client file systems, and Network Attached Storage (NAS) file-level storage servers. To distinguish traditional storage systems from network storage systems, another retronym was proposed - Direct Attached Storage (DAS).

The successive appearance of DAS, SAN and NAS on the market reflects the evolving chain of links between the applications that use data and the bytes on the media containing that data. Once upon a time, application programs themselves read and wrote blocks; then drivers appeared as part of the operating system. In modern DAS, SAN and NAS, the chain consists of three links: the first is the creation of RAID arrays; the second is the processing of metadata that allows binary data to be interpreted as files and records; the third is the services for providing data to the application. The approaches differ in where and how these links are implemented. In the case of DAS, the storage system is "bare": it only provides the ability to store and access data, and everything else is done on the server side, starting with the interfaces and the driver. With the advent of the SAN, RAID provisioning moves to the storage side; everything else remains as in the case of DAS. NAS differs in that metadata processing is also transferred to the storage system to provide file access; the client only needs to support the data services.

The SAN became possible after the Fiber Channel (FC) protocol was developed in 1988 and approved by ANSI as a standard in 1994. The term Storage Area Network dates back to 1999. Over time, FC gave ground to Ethernet, and iSCSI-connected IP SANs became common.

The idea of a networked NAS storage server belongs to Brian Randell of Newcastle University and was implemented in machines on a UNIX server in 1983. The idea was so successful that it was taken up by many companies, including Novell, IBM and Sun, but the eventual market leaders were NetApp and EMC.

In 1995, Garth Gibson developed the principles of NAS further and created Object Storage (OBS). He began by dividing all disk operations into two groups: one containing the more frequent operations, such as reads and writes, and the other the rarer ones, such as operations with names. Then he proposed another container in addition to blocks and files, which he called an object.

OBS introduces a new type of interface, called the object interface. Client data services interact with metadata via the object API. OBS not only stores data but also supports RAID, stores metadata related to objects and supports the object interface. DAS, SAN, NAS and OBS coexist in time, but each type of access is better suited to a specific type of data and applications.
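
A minimal sketch of what an object interface looks like to a client: objects are addressed by an ID and carry their own metadata, in contrast to numbered blocks or file pathnames. The API below is hypothetical, meant only to contrast with block and file access.

```python
class ObjectStore:
    def __init__(self):
        self._objects: dict[str, tuple[bytes, dict]] = {}

    def put(self, object_id: str, data: bytes, metadata: dict) -> None:
        """Store data together with its metadata in one operation."""
        self._objects[object_id] = (data, metadata)

    def get(self, object_id: str) -> bytes:
        return self._objects[object_id][0]

    def get_metadata(self, object_id: str) -> dict:
        return self._objects[object_id][1]

store = ObjectStore()
store.put("obj-42", b"payload", {"owner": "app1", "type": "record"})
print(store.get_metadata("obj-42"))  # metadata travels with the object
```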

SAN architecture

Network topology

A SAN is a high-speed data network designed to connect servers to storage devices. The variety of SAN topologies (point-to-point, arbitrated loop and switching) replaces traditional server-to-storage bus connections and offers greater flexibility, performance and reliability. At the heart of the SAN concept is the ability to connect any server to any Fiber Channel storage device. The principle of node interaction in a SAN with point-to-point or switched topologies is shown in the figures. In a SAN with an arbitrated loop topology, data transfer occurs sequentially from node to node; to start a transmission, the transmitting device arbitrates for the right to use the transmission medium (hence the name of the topology - Arbitrated Loop).

The SAN transport is based on the Fiber Channel protocol, which uses both copper and fiber-optic device connections.

SAN components

SAN components are categorized as follows:

  • Storage resources;
  • SAN infrastructure devices;
  • Host bus adapters (HBAs).

Storage resources

Storage resources include disk arrays, tape drives and Fiber Channel libraries. Storage resources realize many of their capabilities only when included in a SAN. Thus, high-end disk arrays can replicate data between arrays over Fiber Channel networks, and tape libraries can transfer data to tape directly from disk arrays with a Fiber Channel interface, bypassing the network and servers (serverless backup). The most popular disk arrays on the market have been those from EMC, Hitachi, IBM and Compaq (the StorageWorks family, inherited by Compaq from Digital); among tape library manufacturers, StorageTek, Quantum/ATL and IBM should be mentioned.

SAN Infrastructure Devices

The devices that implement the SAN infrastructure are Fiber Channel switches (FC switches), Fiber Channel hubs and Fiber Channel-SCSI routers. Hubs are used to aggregate devices operating in Fiber Channel Arbitrated Loop (FC-AL) mode. The use of hubs allows devices to be connected to and disconnected from a loop without stopping the system, since the hub automatically closes the loop if a device is disconnected and automatically opens it if a new device is connected. Each loop change is accompanied by a complex, multi-stage initialization process, and until it is completed, data exchange in the loop is not possible.

All modern SANs are built on switches, which allow a full network connection. Switches not only connect Fiber Channel devices but also delimit access between devices, for which so-called zones are created on the switches. Devices placed in different zones cannot communicate with each other. The number of ports in a SAN can be increased by connecting switches to each other. A group of interconnected switches is called a Fiber Channel Fabric, or simply a fabric. The links between switches are called Interswitch Links, or ISLs for short.
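
Zoning can be modeled as set membership: two ports may communicate only if some zone contains both of them. The WWNs and zone names below are invented for illustration; this is a conceptual sketch, not a switch configuration.

```python
# Each zone is a set of port WWNs configured on the fabric switches.
ZONES = {
    "zone_db":  {"wwn:server-db", "wwn:array-1"},
    "zone_app": {"wwn:server-app", "wwn:array-1", "wwn:tape-lib"},
}

def can_communicate(wwn_a: str, wwn_b: str) -> bool:
    """Two devices may talk only if they share at least one zone."""
    return any(wwn_a in members and wwn_b in members
               for members in ZONES.values())

print(can_communicate("wwn:server-db", "wwn:array-1"))   # True
print(can_communicate("wwn:server-db", "wwn:tape-lib"))  # False: no shared zone
```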

Software

Software allows redundancy of server access paths to disk arrays and dynamic load balancing between paths. For most disk arrays there is a simple way to determine that ports accessible through different controllers belong to the same disk. Specialized software maintains a table of access paths to devices, disconnects a path in the event of a failure, dynamically connects new paths and balances the load between them. Typically, disk array manufacturers offer specialized software of this type for their arrays. VERITAS Software produces VERITAS Volume Manager, designed to organize logical disk volumes from physical disks and to provide access path redundancy and load balancing for most known disk arrays.
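
The behavior described above can be sketched as a small path manager that balances I/O round-robin across healthy paths and drops a path when it fails. Real multipath software, such as the volume managers just mentioned, is of course far more involved; the path names here are hypothetical.

```python
class Multipath:
    def __init__(self, paths: list[str]):
        self.healthy = list(paths)  # paths currently usable for I/O
        self._next = 0              # round-robin cursor

    def pick_path(self) -> str:
        """Round-robin load balancing across the healthy paths."""
        if not self.healthy:
            raise IOError("no access path to the array")
        path = self.healthy[self._next % len(self.healthy)]
        self._next += 1
        return path

    def fail_path(self, path: str) -> None:
        """Called when an I/O error is detected on a path."""
        if path in self.healthy:
            self.healthy.remove(path)

mp = Multipath(["hba0:ctrlA", "hba0:ctrlB", "hba1:ctrlA", "hba1:ctrlB"])
mp.fail_path("hba0:ctrlA")                  # e.g. a cable or controller failure
print([mp.pick_path() for _ in range(3)])   # I/O continues on remaining paths
```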

Protocols used

SANs use low-level protocols:

  • Fiber Channel Protocol (FCP), SCSI transport over Fiber Channel; the most commonly used protocol at the moment. Available in 1 Gbit/s, 2 Gbit/s, 4 Gbit/s, 8 Gbit/s and 10 Gbit/s variants.
  • iSCSI, SCSI transport over TCP/IP.
  • FCoE, FCP/SCSI transport over pure Ethernet.
  • FCIP and iFCP, FCP/SCSI encapsulation and transmission in IP packets.
  • HyperSCSI, SCSI transport over Ethernet.
  • FICON, transport over Fiber Channel (used by mainframes only).
  • ATA over Ethernet, ATA transport over Ethernet.
  • SCSI and/or TCP/IP transport over InfiniBand (IB).

Advantages

  • High reliability of access to data located on external storage systems. Independence of the SAN topology from the storage systems and servers used.
  • Centralized data storage (reliability, security).
  • Convenient centralized management of switching and data.
  • Offloading heavy I/O traffic to a separate network, unloading the LAN.
  • High performance and low latency.
  • Scalability and flexibility of the logical SAN structure.
  • The geographic size of the SAN, in contrast to classic DAS, is practically unlimited.
  • The ability to quickly distribute resources between servers.
  • The ability to build fault-tolerant cluster solutions at no additional cost on the basis of the existing SAN.
  • A simple backup scheme: all data is in one place.
  • Availability of additional features and services (snapshots, remote replication).
  • A high degree of SAN security.

Sharing storage systems typically simplifies administration and adds a fair amount of flexibility, because cables and disk arrays do not need to be physically moved and recabled from one server to another.

Another benefit is the ability to boot servers directly from the storage network. With this configuration, a faulty server can be replaced quickly and easily.