See what "DRAM" is in other dictionaries. How to find out the frequency of RAM RAM frequency 665.1 MHz how to change

Operating principle

How DRAM Read Works for a Simple 4x4 Array

How DRAM Writing Works for a Simple 4x4 Array

Physically, DRAM is a collection of memory cells that are made up of capacitors and transistors located inside semiconductor memory chips.

Initially, memory chips were produced in DIP packages (for example, the K565RUxx series); later they moved to more technologically advanced packages for use in modules.

Many SIMMs and the overwhelming majority of DIMMs carried an SPD (Serial Presence Detect) chip - a small EEPROM storing the module's parameters (capacity, type, operating voltage, number of banks, access time, etc.). These parameters could be read programmatically by the hardware the module was installed in (and used for automatic configuration), as well as by users and manufacturers.
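As an illustration, on Linux systems the SPD EEPROM is often exposed as a file by the kernel's EEPROM driver, and its first bytes identify the memory type. The sketch below is a hypothetical reader: the sysfs path and the type codes are assumptions based on commonly documented JEDEC SPD layouts, not guaranteed for any particular system.

```python
# Hypothetical SPD reader. The sysfs path below is an assumption; the
# actual device address depends on the system and the loaded driver.
SPD_PATH = "/sys/bus/i2c/devices/0-0050/eeprom"

# In JEDEC SPD layouts, byte 2 is the memory-type key byte. The codes
# here are commonly documented values, listed as an assumption.
MEMORY_TYPES = {
    0x04: "SDR SDRAM",
    0x07: "DDR SDRAM",
    0x08: "DDR2 SDRAM",
    0x0B: "DDR3 SDRAM",
    0x0C: "DDR4 SDRAM",
}

def read_spd(path: str = SPD_PATH) -> bytes:
    """Read the raw SPD contents and print the detected memory type."""
    with open(path, "rb") as f:
        data = f.read(256)
    mem_type = MEMORY_TYPES.get(data[2], f"unknown (0x{data[2]:02X})")
    print(f"Memory type: {mem_type}")
    return data

if __name__ == "__main__":
    read_spd()
```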

SIPP modules

SIPP (Single In-line Pin Package) modules are rectangular boards with contacts in the form of a row of small pins. This design is practically no longer used, as it was later supplanted by SIMM-type modules.

SIMM modules

SIMMs (Single In-line Memory Modules) are long rectangular boards with a row of contact pads along one side. A module is inserted into its socket at an angle and pressed until it snaps upright into the latches that hold it in place. Modules were produced in capacities of 4, 8, 16, 32, 64, and 128 MB.

The most common are 30- and 72-pin SIMMs.

DIMMs

DIMMs (Dual In-line Memory Modules) are long rectangular boards with rows of contact pads along both sides, installed vertically into the connector and fixed with latches at both ends. Memory chips on them can be placed either on one or on both sides of the board.

SDRAM memory modules are most common in the form of 168-pin DIMMs, DDR SDRAM memory modules are in the form of 184-pin modules, and DDR2, DDR3, and FB-DIMM SDRAM modules are in the form of 240-pin modules.

SO-DIMMs

For portable and compact devices (Mini-ITX motherboards, laptops, tablets, etc.), as well as printers, network and telecommunications equipment, structurally reduced DRAM modules (both SDRAM and DDR SDRAM) are used - SO-DIMMs (Small Outline DIMM), counterparts of DIMMs in a compact form factor that saves space.

SO-DIMMs are available in 72-, 100-, 144-, 200- and 204-pin designs.

RIMM modules

RIMM (Rambus In-line Memory Module) modules are less common and carry RDRAM memory. They come in 168- and 184-pin varieties, and such modules must be installed on the motherboard only in pairs; otherwise, special dummy modules are installed in the empty slots (this is due to the design features of such modules). There are also 242-pin PC1066 RDRAM RIMM 4200 modules, which are not compatible with 184-pin connectors, and a smaller version of RIMM - SO-RIMM - used in portable devices.

Chip manufacturers and module assemblers

The top five largest DRAM manufacturers in the first quarter of 2008 were:

Other options identical in purpose: Memory Frequency, DRAM Clock By, MEM Clock Setting, Memory Clock (MHz), New MEM Speed (DDR), System Memory Frequency.

The DRAM Frequency option belongs to the most commonly used BIOS options related to configuring the computer's random access memory. It allows the user to set one of the most important RAM parameters: the operating frequency of the memory chips.

RAM is one of the most important components of a personal computer. Its purpose is to store the data used by the operating system and application programs during the current session. At the hardware level, RAM takes the form of modules carrying the chips that contain the actual storage cells. These modules fit into dedicated expansion slots on the motherboard.

Generally, computer RAM is categorized as dynamic memory. Dynamic memory (DRAM) differs from static memory (SRAM) in lower performance but also lower price. Another feature of dynamic memory is that the data in its chips must be dynamically refreshed.

Currently, RAM modules are manufactured using DDR (Double Data Rate) SDRAM technology. DDR modules use a synchronous, that is, clock-driven, mode of operation and have twice the bandwidth of conventional synchronous memory modules (SDRAM).

The operating frequency of dynamic RAM can be considered one of its most important parameters, since it largely determines performance. Usually, the memory frequency refers to the frequency of the memory bus on the motherboard.

It is necessary to distinguish the real frequency of the memory bus - the number of pulses generated by the clock generator - from the effective frequency. The effective frequency reflects the actual rate of transfers performed during memory operation, and for modern RAM module types, such as DDR2 and DDR3, it can be several times the real frequency.

DDR-type RAM modules, as a rule, operate at effective frequencies of 200, 266, 333, and 400 MHz. DDR2 modules usually have twice the effective frequency of DDR - 400, 533, 667, 800, 1066 MHz - and, therefore, double the performance. DDR3 memory, in turn, has twice the effective frequency of DDR2 - 800, 1066, 1333, 1600, 1800, 2000, 2133, 2200, 2400 MHz.
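The pattern behind these numbers is simple doubling per generation; a minimal sketch of the arithmetic (the multipliers relative to the cell-array clock follow the DDR descriptions later in this article):

```python
# Effective (marketing) frequency relative to the memory cell array:
# DDR transfers 2 words per array clock, DDR2 4, DDR3 8.
ARRAY_MULTIPLIER = {"DDR": 2, "DDR2": 4, "DDR3": 8}

def effective_mhz(array_clock_mhz: int, generation: str) -> int:
    """Effective data rate in MHz for a given cell-array clock."""
    return array_clock_mhz * ARRAY_MULTIPLIER[generation]

for gen in ("DDR", "DDR2", "DDR3"):
    print(gen, effective_mhz(200, gen))  # DDR 400, DDR2 800, DDR3 1600
```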

To set the frequency of operation of RAM modules in many BIOSes, there is a DRAM Frequency function, as well as options similar to it.

These options are usually found only on motherboards whose memory controllers allow the RAM to operate asynchronously, that is, at frequencies independent of the system bus frequency. Since the memory controller in modern motherboards is usually built into the chipset, chipsets with such controllers are called asynchronous. Motherboards with asynchronous chipsets give the user ample opportunities for overclocking RAM.

The DRAM Frequency option can take a variety of values. The Auto value means that the RAM speed is detected automatically by the BIOS. The By SPD value means that the operating frequency is taken from a special chip built into each memory module - the SPD (Serial Presence Detect) chip.

This option also often allows you to select exact RAM frequency values from a set supported by the motherboard. These values are always given in megahertz.

In some BIOSes, there may be options such as 1:1, Linked, or Host Clk. These options set the frequency of the memory modules equal to the system bus frequency.

Which option should you choose?

For most users, it is best to set it to Auto so that the BIOS finds the optimal value automatically. However, it sometimes happens that the BIOS sets a frequency lower than the nominal one for the installed RAM. To fix this, you can set the option to By SPD or manually select the desired frequency.

Manually setting the memory frequency is also often used when overclocking a computer. As is well known, increasing the RAM frequency in most cases improves the performance of the computer, although not to the same extent as increasing the processor speed. Typically, the performance gain from overclocking RAM is between 4 and 12%. Besides targeted overclocking of a specific PC component, some BIOSes also offer options for adjusting complex, system-wide overclocking.

To overclock the memory, the user can specify the required frequency in this option and then test its operation with dedicated test programs. If the RAM works without errors, the chosen value can be kept permanently.

However, negative consequences come not only from setting the RAM frequency too high. In some cases, values that are too low and fall outside the specifications of the RAM modules can also lead to errors, including errors during computer startup.

In the previous section, we saw that DRAM chips multiplex addresses to conserve resources. We have also seen that accessing DRAM cells takes time, as the capacitors in these cells do not discharge instantaneously to produce a stable signal, and that DRAM cells need to be refreshed. Now it is time to put this all together and see how these factors determine the details of DRAM access.

We will focus on modern technology and will not discuss asynchronous DRAM and its variants, as they are no longer relevant; readers interested in that topic are referred to the literature. We will also not talk about Rambus DRAM (RDRAM), even though this technology is not obsolete - it is simply not widely used for system memory. We will concentrate exclusively on Synchronous DRAM (SDRAM) and its successor, Double Data Rate DRAM (DDR).

Synchronous DRAM, as the name suggests, works relative to a clock source. The memory controller provides a clock whose frequency determines the speed of the front side bus (FSB) - the memory controller interface used by the DRAM chips. At the time of this writing, frequencies of 800MHz, 1066MHz, and even 1333MHz are in use, with 1600MHz announced for the next generation. This does not mean the bus clock is really that high. Instead, data is transferred two or four times per clock cycle. Larger numbers sell better, so manufacturers advertise a quad-pumped 200MHz bus as an "effective" 800MHz bus.

Today, one transfer chunk for SDRAM is 64 bits - 8 bytes. The FSB transfer rate is therefore 8 bytes multiplied by the effective bus frequency (6.4GB/s for a quad-pumped 200MHz bus). That sounds like a lot, but it is the peak speed, a maximum that cannot be surpassed. As we will see, the protocol for talking to the RAM modules includes many periods of time when no data is transmitted. It is exactly these idle periods that we must understand and minimize to achieve the best performance.
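The peak figure is straightforward arithmetic; a quick sketch:

```python
# Peak FSB transfer rate: 8 bytes (64 bits) per transfer times the
# effective bus frequency (base clock x pump factor).
def peak_gb_s(bus_mhz: float, pumps: int, bytes_per_transfer: int = 8) -> float:
    return bus_mhz * 1e6 * pumps * bytes_per_transfer / 1e9

print(peak_gb_s(200, 4))  # 6.4 GB/s for a quad-pumped 200MHz bus
```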

2.2.1 Read access protocol

Figure 2.8: Timing diagrams of the SDRAM read access protocol

Figure 2.8 shows the activity on some of the connectors of a DRAM module, divided into three differently colored phases. As usual, time flows from left to right. Many details have been omitted; here we only consider the bus clock, the RAS and CAS signals, and the address and data buses. A read cycle begins with the memory controller making the row address available on the address bus and lowering the RAS signal. All signals are read on the rising edge of the clock (CLK), so it does not matter if the signals are not perfectly rectangular, as long as they are stable when they are sampled. Setting the row address causes the RAM chip to start latching the addressed row.

The CAS signal can be sent after t RCD (RAS-to-CAS Delay) clock cycles. The column address is then transmitted on the address bus and the CAS signal is lowered. Here we see how the two parts of the address (practically halves) can be transmitted over the same address bus.

With that, addressing is complete and the data can be transmitted. The RAM chip needs some time to prepare for this. This delay is commonly called the CAS Latency (CL). In Figure 2.8 it is 2. It can be higher or lower, depending on the quality of the memory controller, motherboard, and DRAM module. It can even be a half value: with CL=2.5, the first data would arrive on the first falling edge of the clock in the blue region.

With all this preparation, it would be wasteful to transmit only a single word of data. This is why DRAM modules allow the memory controller to specify how much data is to be transferred. Usually the choice is between 2, 4, or 8 words. This allows entire cache lines to be filled without a new RAS/CAS sequence. The memory controller can also send a new CAS signal without resetting the row selection. This way, consecutive memory addresses can be read from or written to much faster, because the RAS signal does not have to be sent and the row does not have to be deactivated (see below). Whether to keep the row "open" is a decision the memory controller has to make; leaving it open all the time can have negative consequences in real-world applications. When a new CAS signal can be sent is determined by the Command Rate of the RAM module (usually written as Tx, where x is a value such as 1 or 2; it will be 1 for high-performance DRAM modules that can accept a new command every cycle).

In this example, the SDRAM outputs one word per cycle. That is what the first generation could do. DDR can transmit two words per cycle. This shortens the transfer time but does not change the latency. In principle, DDR2 works the same way, although in practice it looks different. There is no need to go into the details here; it suffices to note that DDR2 can be made faster, cheaper, more reliable, and more energy-efficient.

2.2.2 Pre-charge and activation

Figure 2.8 does not cover the full cycle; it shows only a part of the complete DRAM access cycle. Before a new RAS signal can be sent, the currently selected row must be deactivated and the new row must be precharged. We concentrate here on the case where this is done with an explicit command. There are improvements to the protocol that in some situations allow this extra step to be avoided, but the delay introduced by precharging still affects the operation.

Figure 2.9: Pre-charge and SDRAM activation

Figure 2.9 shows the activity starting from one CAS signal and ending with the CAS signal for another row. The data requested with the first CAS signal appears, as before, after CL cycles. In this example, two words are requested, which on an SDRAM takes two cycles to transmit. Alternatively, imagine four words on a DDR chip.

Even on DRAM modules with a command rate of 1, the precharge command cannot be issued right away; it is necessary to wait as long as data is being transferred. In this case, that is two cycles. This happens to be the same as CL, but that is just a coincidence. The precharge signal has no dedicated line; instead, some implementations issue it by lowering the Write Enable (WE) and RAS lines simultaneously - a combination that has no useful meaning by itself.

Once the precharge command is issued, it takes t RP (Row Precharge time) cycles until the next row can be selected. In Figure 2.9, much of this time (shown in purple) overlaps with the data transmission (light blue). This is good! But t RP is longer than the transfer time, so the next RAS signal is stalled for one cycle.

If we were to continue the timeline in the diagram, we would find that the next data transfer starts 5 cycles after the previous one ends. This means the data bus is in use only two out of every seven cycles. Multiply this by the FSB speed, and the theoretical 6.4GB/s for an 800MHz bus becomes a mere 1.8GB/s. That is bad and must be avoided. The techniques described in Chapter 6 help to raise this number, but the programmer usually has to do their share.
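In other words, sustained bandwidth is the peak rate scaled by the bus duty cycle; a one-line check of the article's numbers:

```python
# Data is on the bus only 2 of every 7 cycles: sustained bandwidth is
# the peak rate scaled by that duty cycle.
peak = 6.4           # GB/s: quad-pumped 200MHz bus, 8 bytes per transfer
busy, total = 2, 7   # busy cycles vs. full access cycle
print(peak * busy / total)  # ~1.83 GB/s, matching the text
```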

There is one more timing value for SDRAM modules that we have not discussed. In Figure 2.9, the precharge command was only limited by the data transfer time. Another constraint is that an SDRAM module needs time after a RAS signal before it may precharge another row (this time is denoted t RAS). This number is usually quite high, two or three times the t RP value. It is a problem if, after a RAS signal, only one CAS signal follows and the data transfer ends after a few cycles. Assume that in Figure 2.9 the first CAS signal is preceded directly by a RAS signal and that t RAS is 8 cycles. Then the precharge command would have to be delayed by one additional cycle, since the sum of t RCD, CL, and t RP (which is longer than the data transfer time) is only 7 cycles.

DDR modules are often described using a special notation: w-x-y-z-T. For example: 2-3-2-8-T1. It means:

w = 2: CAS Latency (CL)
x = 3: RAS-to-CAS delay (t RCD)
y = 2: RAS Precharge (t RP)
z = 8: Active to Precharge delay (t RAS)
T = T1: Command Rate

There are many other timing constants that affect how commands are issued and executed. In practice, however, these five constants are enough to determine the performance of a module.
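As a small illustration, the notation can be decoded mechanically; the sketch below (helper names are illustrative) parses such a timing string and reproduces the cycle accounting from the paragraphs above:

```python
def parse_timings(spec: str) -> dict:
    """Parse a 'w-x-y-z-Tn' DDR timing string, e.g. '2-3-2-8-T1'."""
    w, x, y, z, t = spec.split("-")
    return {
        "CL": int(w),                  # CAS latency
        "tRCD": int(x),                # RAS-to-CAS delay
        "tRP": int(y),                 # row precharge time
        "tRAS": int(z),                # active-to-precharge delay
        "cmd_rate": int(t.lstrip("T")),
    }

t = parse_timings("2-3-2-8-T1")

# Cycles from the RAS signal until the first data word is on the bus.
print(t["tRCD"] + t["CL"])  # 5

# The tRAS constraint from Section 2.2.2: with tRCD + CL + tRP summing
# to 7 cycles and tRAS = 8, the precharge must wait one extra cycle.
print(t["tRAS"] - (t["tRCD"] + t["CL"] + t["tRP"]))  # 1
```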

It is sometimes useful to know this information about the computer being used in order to correctly interpret certain measurements. And it is definitely useful to know these details when buying a computer, as they, together with the FSB and SDRAM speeds, are some of the most important factors in determining the performance of a computer.

The adventurous reader could also try to tweak the system. Sometimes the BIOS allows changing some or all of these values. SDRAM modules have programmable registers where these values can be set. Usually the BIOS picks the best default values. If the RAM module is of high quality, it may be possible to reduce one of the delays without affecting the stability of the computer. Numerous overclocking websites on the Internet provide ample documentation for doing this. Do it at your own risk, and do not say you have not been warned.

2.2.3 Refresh

An often-overlooked topic when considering DRAM access is refresh. As explained in Section 2.1.2, DRAM cells must be refreshed constantly, and this does not happen without impact on the rest of the system. While a row is being refreshed (the unit of refresh is a row, even if some other literature states otherwise), it cannot be accessed. Research shows that, "surprisingly, DRAM refresh can have a dramatic effect on performance."

According to the JEDEC (Joint Electron Device Engineering Council) specification, each DRAM cell must be refreshed every 64ms. If a DRAM array has 8192 rows, this means the memory controller has to issue a refresh command on average every 7.8125µs (refresh commands can be queued, so in practice the maximum interval between two requests can be higher). It is the memory controller's responsibility to schedule the refresh commands. The DRAM module keeps track of the address of the last refreshed row and automatically increments the address counter for each new request.
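The interval is simple division; a quick check of the numbers:

```python
# One refresh command per row, all rows covered within the 64ms window.
refresh_window_ms = 64.0
rows = 8192
print(refresh_window_ms * 1000 / rows, "us")  # 7.8125 us per command
```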

There is little the programmer can do about the refresh and the points in time when the commands are issued. But it is important to keep this part of the DRAM life cycle in mind when interpreting measurements. If a critical word has to be retrieved from a row that is currently being refreshed, the processor could be stalled for quite a long time. How long each refresh takes depends on the DRAM module.

2.2.4 Types of memory

It is worth spending some time describing the memory types in use and their immediate successors. We start with SDR (Single Data Rate) SDRAM, since it is the basis of DDR (Double Data Rate) SDRAM. SDR was very simple: the memory cells and the data transfer rate operated at the same frequency.

Figure 2.10: SDR SDRAM Operations

In Figure 2.10, the DRAM cell array can output memory content at the same rate at which it can be transported over the memory bus. If the DRAM cell array can operate at 100MHz, the data transfer rate of the bus is 100Mb/s. The frequency f is the same for all components. Increasing the throughput of a DRAM chip this way is expensive, since power consumption rises with the frequency, and given the huge number of array cells this becomes prohibitively expensive (Power Consumption = Dynamic Capacitance × Voltage² × Frequency). It is actually even worse, since raising the frequency also usually requires raising the voltage to maintain the stability of the system. DDR SDRAM (retroactively called DDR1) manages to increase the throughput without increasing any of the frequencies involved.

Figure 2.11: DDR1 SDRAM Operations

The difference between SDR and DDR1, as can be seen in Figure 2.11 and guessed from the name, is that twice the amount of data is transported per cycle: the DDR1 chip transports data on both the rising and the falling edge of the signal. This is sometimes called a "double-pumped" bus. To make this possible without increasing the frequency of the cell array, a buffer is introduced. This buffer holds two bits per data line, which in turn requires that the cell array in Figure 2.7 have a two-line data bus. Implementing this is trivial: one uses the same column address for two DRAM cells and accesses them in parallel. The changes to the cell array are minimal.

SDR DRAMs were named simply by their frequency (e.g., PC100 for 100MHz SDR). To make DDR1 DRAM sound better, the marketers had to change the naming scheme, since the frequency did not change. They adopted a name that contains the transfer rate the DDR module supports (it has a 64-bit bus):

100MHz × 64 bit × 2 = 1,600MB/s

Hence, a 100MHz DDR module is called PC1600. With 1600 > 100, all marketing requirements are fulfilled - it sounds much better, although the real improvement is only a factor of two. (The factor of two is real enough, but the inflated numbers do not have to be liked.)
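The naming arithmetic extends to the later generations with a larger multiplier; a sketch (note that the official names round some of the computed rates, e.g. PC2-4200 for a computed 4,256MB/s):

```python
# Module name = prefix + peak transfer rate in MB/s, where the rate is
# the cell-array clock x words per array clock x 8 bytes per word.
GEN = {"DDR": ("PC", 2), "DDR2": ("PC2-", 4), "DDR3": ("PC3-", 8)}

def module_name(array_mhz: int, generation: str) -> str:
    prefix, mult = GEN[generation]
    return f"{prefix}{array_mhz * mult * 8}"

print(module_name(100, "DDR"))   # PC1600
print(module_name(200, "DDR2"))  # PC2-6400
print(module_name(200, "DDR3"))  # PC3-12800
```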

Figure 2.12: DDR2 SDRAM Operations

To get even more out of the technology, DDR2 adds a few more innovations. The most obvious change, visible in Figure 2.12, is the doubling of the bus frequency. Doubling the frequency means doubling the bandwidth. Since doubling the frequency is not economical for the cell array, the I/O buffer now has to fetch four bits per clock cycle, which it then sends over the bus. This means the change for a DDR2 module consists of making only the I/O buffer of the DIMM run at a higher speed. This is certainly possible and does not require measurably more power - it is just one tiny component, not the whole module. The name the marketers came up with for DDR2 is similar to that for DDR1, except that in computing the value the factor of two is replaced by four (we now have a "quad-pumped" bus). Figure 2.13 shows the names of the modules in use today.

Array     Bus       Data        Name        Name
freq.     freq.     rate        (rate)      (FSB)
133MHz    266MHz    4,256MB/s   PC2-4200    DDR2-533
166MHz    333MHz    5,312MB/s   PC2-5300    DDR2-667
200MHz    400MHz    6,400MB/s   PC2-6400    DDR2-800
250MHz    500MHz    8,000MB/s   PC2-8000    DDR2-1000
266MHz    533MHz    8,512MB/s   PC2-8500    DDR2-1066

Figure 2.13: Names of DDR2 modules

There is one more twist to the naming. The FSB speed used by CPU, motherboard, and DRAM module is specified using the "effective" frequency: it is multiplied by two because data is transferred on both the rising and the falling edge of the clock, thereby inflating the number. So a 133MHz module with a 266MHz bus has an FSB "frequency" of 533MHz.

The specification for DDR3 (the real one, not the GDDR3 used in graphics cards) calls for further changes along the lines of the transition to DDR2. The voltage is reduced from 1.8V for DDR2 to 1.5V for DDR3. Since power consumption is proportional to the square of the voltage, this alone yields a 30% improvement. Add to that shrinking die sizes and other electrical advances, and DDR3 can, at the same frequency, get by with half the power. Alternatively, at a higher frequency, it can get by with the same power envelope, or the capacity can be doubled for the same heat emission.
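The 30% figure follows directly from the voltage ratio; a one-line check:

```python
# Power scales with voltage squared (capacitance and frequency fixed),
# so dropping the supply from 1.8V (DDR2) to 1.5V (DDR3):
print(1 - (1.5 / 1.8) ** 2)  # ~0.31, i.e. roughly a 30% saving
```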

The cell array of DDR3 modules runs at a quarter of the speed of the external bus, which requires an 8-bit I/O buffer, up from 4 bits for DDR2. Figure 2.14 shows the schematics.

Figure 2.14: DDR3 SDRAM Operations

Initially, DDR3 will probably have slightly higher CAS latencies than DDR2, simply because DDR2 is the more mature technology. This would make DDR3 useful only at frequencies higher than those achievable with DDR2, and even then mostly when bandwidth is more important than latency. There is already talk of 1.3V modules that would achieve the same CAS latency as DDR2. In any case, the possibility of achieving higher speeds through faster buses will outweigh the increased latency.

One possible problem with DDR3 is that at transfer rates of 1,600Mb/s and higher, the number of modules per channel may be reduced to just one. In early versions this limitation applied to all frequencies, so one can hope that it will eventually be lifted for all frequencies. Otherwise the capacity of systems will be severely limited.

Figure 2.15 shows the expected names of the DDR3 modules. JEDEC has so far agreed on the first four types. Given that Intel's 45nm processors have an FSB speed of 1,600Mb/s, 1,866Mb/s is needed for the overclocking market. We will most likely see this toward the end of the DDR3 life cycle.

Array     Bus       Data         Name         Name
freq.     freq.     rate         (rate)       (FSB)
100MHz    400MHz    6,400MB/s    PC3-6400     DDR3-800
133MHz    533MHz    8,512MB/s    PC3-8500     DDR3-1066
166MHz    667MHz    10,667MB/s   PC3-10667    DDR3-1333
200MHz    800MHz    12,800MB/s   PC3-12800    DDR3-1600
233MHz    933MHz    14,933MB/s   PC3-14900    DDR3-1866

Figure 2.15: Names of DDR3 modules

All DDR memory has one problem: the increasing bus frequencies make it hard to build parallel data buses. A DDR2 module has 240 pins, and all connections to the data and address pins must be routed so that they have approximately the same length. An even bigger problem is that when more than one DDR module is daisy-chained on the same bus, the signals become more and more distorted with each additional module. The DDR2 specification allows only two modules per bus (channel), and DDR3 only one module at high frequencies. With 240 pins per channel, a single Northbridge cannot reasonably drive more than two channels. The alternative is to use external memory controllers (see Figure 2.2), but that is expensive.

All this means that the motherboards of mainstream computers can hold at most four DDR2 or DDR3 modules, which severely limits the amount of memory a system can have. Even old 32-bit IA-32 processors could handle up to 64GB of RAM, and memory demand is growing even for home systems, so something has to be done.

One solution is to add a memory controller to each processor, as shown at the beginning of this chapter. AMD does this with its Opteron line, and Intel will do so with its CSI technology. This helps as long as all the memory a processor needs can be attached to that processor. When that is not the case, this approach leads to a NUMA architecture with its negative effects. For some situations an entirely different solution is needed.

Intel's answer for big server machines, at least for the coming years, is called Fully Buffered DRAM (FB-DRAM). FB-DRAM modules use the same components as today's DDR2 modules, which makes them relatively cheap to produce. The difference lies in the connection to the memory controller. Instead of a parallel data bus, FB-DRAM uses a serial bus (as Rambus DRAM did before, and as SATA, the successor of PATA, and PCI Express, the successor of PCI/AGP, do). The serial bus can be driven at a much higher frequency, more than making up for the negative effects of serialization and even increasing the throughput. The main effects of using a serial bus are:

  1. more modules can be used on one channel,
  2. more channels can be used on one Northbridge / memory controller,
  3. the serial bus is full duplex (two lines).

The FB-DRAM module has only 69 pins, instead of 240 for DDR2. It is much easier to use multiple FB-DRAM modules together because the electrical effects of such a bus are easier to control. The FB-DRAM specification allows up to 8 modules per channel.

Given the pin requirements of a dual-channel Northbridge, it is now possible to drive six FB-DRAM channels with fewer pins: 2×240 versus 6×69. The routing on the board for each channel is also much simpler, which can help keep motherboard prices down.

Fully duplexed parallel buses are prohibitively expensive for traditional DRAM modules: duplicating all those lines costs too much. With serial lines (even differential ones, as FB-DRAM requires) this is not the case, so the serial bus is designed to be fully duplexed, which in some situations means the bandwidth is doubled by this alone. But this is not the only place where parallelism is used to increase bandwidth: since an FB-DRAM controller can drive up to six channels at once, FB-DRAM can raise the throughput even of systems with little RAM. Where a DDR2 system with four modules has two channels, the same capacity can be handled through four channels on an ordinary FB-DRAM controller. The actual bandwidth of the serial bus depends on which DDR2 (or DDR3) chips are used on the FB-DRAM modules.

We can summarize the advantages like this (comparing one DDR2 channel with one FB-DRAM channel, using only the figures stated above):

                       DDR2    FB-DRAM
Pins                   240     69
Channels (Northbridge) 2       6
DIMMs per channel      2       8

FB-DRAM has its downsides when several DIMMs sit on one channel: the signal is delayed, if only slightly, at each DIMM in the chain, which means an increase in latency. But for the same amount of memory at the same frequency, FB-DRAM can always be faster than DDR2 or DDR3, since only one DIMM per channel is then needed; and for systems with a lot of memory, DDR simply has no mainstream solution at all.

2.2.5 Conclusions

This section should have shown that accessing DRAM is not an arbitrarily fast process, at least not compared with the speed of the processor and the speed with which it accesses registers and cache. It is important to keep the difference between CPU and memory frequencies in mind. An Intel Core 2 processor running at 2.933GHz with a 1.066GHz front side bus has a clock ratio of 11:1 (the 1.066GHz bus is quad-pumped, so its actual clock is one quarter of that). A stall of one cycle on the memory bus means a stall of 11 cycles for the processor. For most machines the actual DRAM is slower, which increases the delay further. Keep these numbers in mind as we discuss stall times in the following chapters.
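A quick check of the ratio (pure arithmetic on the figures in the text):

```python
# The 1.066GHz FSB is quad-pumped, so its real clock is a quarter of
# the advertised figure; the stall ratio follows directly.
cpu_ghz, fsb_effective_ghz = 2.933, 1.066
bus_clock_ghz = fsb_effective_ghz / 4
print(round(cpu_ghz / bus_clock_ghz))  # 11 CPU cycles per bus cycle
```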

The timing diagrams for the read commands have shown that DRAM modules are capable of high, sustained data rates. Entire DRAM rows can be transported without a single stall. The data bus can be kept occupied 100% of the time. For DDR modules this means two 64-bit words transferred per cycle; with DDR2-800 modules and two channels this amounts to 12.8GB/s.

But DRAM access is not always sequential, unless it is deliberately arranged to be. Non-contiguous memory locations are used, which means precharging and new RAS signals are inevitable. That is when things slow down and the DRAM modules need help. The earlier the precharging happens and the RAS signal is sent, the smaller the penalty when the new row is actually used.

Hardware and software prefetching is used to reduce stalls by creating more overlap in time between operations (see Section 6.3). It also helps to shift memory operations earlier in time so that there is less contention right before the data is actually needed. A frequent problem is that data produced in one round has to be stored while the data needed for the next round has to be read. By shifting the reads earlier in time, we ensure that the read and write operations do not have to be issued at essentially the same moment.