December 17, 2014

SRAM Memories

A Static Random Access Memory, best known as SRAM, is a type of memory that keeps its data stored for as long as it is powered, which makes it a volatile memory. Unlike DRAM, SRAM doesn't need to be periodically refreshed, which allows for higher access speeds. However, it is usually more expensive, consumes more power, takes up more space and dissipates more heat than the DRAM alternative.
Here I'll explain how I designed a small 32 bit SRAM memory (8 rows of 4 bits) for academic purposes. The goal was to design the memory cells and the circuits necessary for their operation, and then to implement the respective layout. The complete circuit was needed in order to simulate the read and write cycles and analyse the results.

An SRAM is composed of rows and columns forming a matrix of memory cells. The rows are known as Word Lines (WL) and control the access to the cells of that row, defining when it's possible to read or change the stored data. The columns are known as Bit Lines (BL), Digit Lines or Data Lines, and are responsible for carrying the data to and from the cells, in a write or read operation, respectively. Bit Lines generally exist in complementary pairs (BL and ~BL). Finally, there are the memory cells, small circuits which are each able to store 1 bit.

This image represents the basic structure of an SRAM memory:
Each memory cell is accessible through a unique memory address, which indicates where the cell is located. The memory address is composed of M+N bits, where M is the number of bits of the row address and N the number of bits of the column address. To select a memory cell, these addresses must be sent to the row and column decoders. However, the decoders will not be the focus of this post, since each row and column will be accessed individually, for testing purposes.
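To make the M+N addressing concrete, here is a minimal Python sketch for the 8x4 case (M = 3 row bits, N = 2 column bits). It assumes the row bits are the most significant ones, which the post does not specify, and the function names `split_address` and `one_hot` are purely illustrative.

```python
def split_address(address, row_bits=3, col_bits=2):
    """Split a flat cell address into (row, column) indices.

    Assumption: the upper M bits select the row, the lower N bits the column.
    """
    if not 0 <= address < (1 << (row_bits + col_bits)):
        raise ValueError("address out of range")
    row = address >> col_bits               # upper M bits -> Word Line index
    col = address & ((1 << col_bits) - 1)   # lower N bits -> column index
    return row, col


def one_hot(index, width):
    """Idealised decoder output: exactly one line is active."""
    return [1 if i == index else 0 for i in range(width)]


row, col = split_address(0b10111)
print(row, col)          # 5 3
print(one_hot(row, 8))   # Word Line 5 active
print(one_hot(col, 4))   # column 3 selected
```

A real decoder is of course a piece of combinational logic, not a list comprehension; the sketch only shows how one address maps to one Word Line and one column.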

Let's analyse the cell itself. The cell is able to store 1 bit, and its internal structure is composed of 6 transistors. Four of these (Q1, Q2, Q3 and Q4) form 2 cross-coupled inverters, which are responsible for keeping the data stored as long as the memory is powered. The other 2 transistors, Q5 and Q6, serve as gateways for bidirectional access between the stored data and the Bit Lines. These access transistors are controlled by the Word Line.
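The way two cross-coupled inverters hold a bit can be illustrated with a small numerical sketch. This is not a transistor-level model: the inverter is an idealised smooth curve with its switching point at VDD/2, and VDD = 1.8 V, the gain and the step size are all assumed values chosen only for illustration.

```python
import math

VDD = 1.8  # assumed supply voltage, not taken from the post


def inverter(v_in, gain=20.0):
    """Idealised smooth inverter curve switching around VDD/2 (assumed shape)."""
    return VDD / (1.0 + math.exp(gain * (v_in - VDD / 2) / VDD))


def settle(q, qb, steps=200, alpha=0.2):
    """Let the two storage nodes relax towards each other's inverter output."""
    for _ in range(steps):
        q_next = q + alpha * (inverter(qb) - q)
        qb_next = qb + alpha * (inverter(q) - qb)
        q, qb = q_next, qb_next
    return q, qb


# A small initial imbalance is amplified by the positive feedback
# until the nodes sit at opposite rails.
print(settle(1.0, 0.8))   # Q ends high, ~Q ends low
print(settle(0.8, 1.0))   # mirrored case: Q ends low, ~Q ends high
```

Whichever node starts slightly higher wins, and the pair then stays in that state, which is the behaviour the cell relies on to remember its bit.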

The following diagram represents a typical 6T memory cell:
Now let's move to another important part of the SRAM. During the memory operation, the voltage difference between the Bit Lines may not always be as large as it should be, which makes the circuit unreliable and can lead to errors in the read and write cycles. To avoid such situations, we use the Sense Amplifier:
A Sense Amplifier is a circuit formed by 2 cross-coupled inverters between the Bit Lines. However, these inverters are only connected to VDD and VSS when it's necessary to sense the data from the lines. They are controlled by the "Sense" input (φS), through the transistors Q5 and Q6. These transistors must switch at the same time, so the PMOS transistor needs an inverted φS input. With this circuit, even a very small voltage difference between the Bit Lines is enough to define the actual logic level on the lines.

When φS is equal to '1', the higher voltage line will define the state of the inverters, and consequently assign the correct logic levels to the respective Bit Lines. When φS is equal to '0', the amplifier is turned off. There will be a Sense Amplifier for each of the columns in the memory.
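At a purely behavioural level, the amplifier can be sketched as a function that pushes the two Bit Lines to opposite rails when φS is high. The function name and the VDD = 1.8 V default are assumptions made for this example only.

```python
def sense_amplifier(bl, blb, phi_s, vdd=1.8):
    """Behavioural model: return (BL, ~BL) after the amplifier has acted.

    With phi_s low the amplifier is disconnected and the lines are unchanged.
    With phi_s high the line with the higher voltage is driven to VDD and
    the other one to 0 V.
    """
    if not phi_s or bl == blb:
        # Off, or no difference to amplify (a real circuit would resolve
        # the tie through noise; that is not modelled here).
        return bl, blb
    return (vdd, 0.0) if bl > blb else (0.0, vdd)


print(sense_amplifier(0.95, 0.85, phi_s=1))  # (1.8, 0.0)
print(sense_amplifier(0.95, 0.85, phi_s=0))  # (0.95, 0.85)
```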

Each column is also equipped with a Pre-charge and Equalizing circuit:
Before every read/write operation, the Bit Lines must be pre-charged and equalized. This is done to ensure that small voltage differences can be easily detected by the Sense Amplifier. The transistors Q8 and Q9 are responsible for charging the Bit Lines to VDD/2, while Q7 equalizes the voltage between them. The Pre-charge circuit is activated through the precharge input (φP).
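The equalizing step on its own is just charge sharing: when Q7 shorts the two lines together, the total charge is conserved and both lines settle at the capacitance-weighted average. The sketch below assumes ideal capacitors and a hypothetical helper name, and then applies the pre-charge target of VDD/2 on top.

```python
def equalize(v_bl, v_blb, c_bl=1.0, c_blb=1.0):
    """Voltage on both lines after shorting them together (charge conservation)."""
    return (c_bl * v_bl + c_blb * v_blb) / (c_bl + c_blb)


def precharge_and_equalize(v_bl, v_blb, phi_p, vdd=1.8):
    """Behavioural model of the column circuit: both lines go to VDD/2 when phi_p is high."""
    if not phi_p:
        return v_bl, v_blb
    return vdd / 2, vdd / 2


# Lines left at opposite rails by the previous operation, equal capacitances:
print(equalize(1.8, 0.0))                  # already close to VDD/2
print(precharge_and_equalize(1.8, 0.0, 1))
```

With matched Bit Line capacitances and lines left at opposite rails, equalizing alone already lands at VDD/2, so Q8 and Q9 only need to correct the remaining error.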

Combining all the previous circuits together, we get the column internal structure:
Now let's analyse how the SRAM works. For this, I'll consider that the memory cell has the logic level '1' stored.

First, the reading operation:
Before the reading can be performed, the Bit Lines are pre-charged to VDD/2 through the Pre-charge and Equalizing circuit. Inside the cell we have Q = VDD and ~Q = 0. When the word line is selected, the gateway transistors Q5 and Q6 start conducting and establish the connections between Q and BL, and between ~Q and ~BL. A current will flow from VDD through Q4 and Q6 to the Bit Line, charging up its parasitic capacitance CB and increasing the voltage on the line. On the opposite side, a current will flow from ~BL through Q5 and Q1 to VSS, which will discharge the parasitic capacitance C~B, decreasing the voltage. The difference between BL and ~BL will allow the Sense Amplifier to detect the logic level stored in the cell and perform a correct reading.
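For a rough feel of how long the Bit Lines need to develop a usable difference, one can treat the cell as a constant current source charging or discharging the parasitic capacitance, so that t = C·ΔV / I. This is a first-order approximation, and the numbers used below (50 fF, 50 µA, 100 mV) are illustrative assumptions, not values extracted from the layout.

```python
def time_to_develop(delta_v, c_bitline, i_cell):
    """Time in seconds for a constant current to move a capacitor by delta_v volts."""
    return c_bitline * delta_v / i_cell


# Assumed example values, not measured ones:
t = time_to_develop(delta_v=0.1, c_bitline=50e-15, i_cell=50e-6)
print(f"{t * 1e9:.2f} ns")
```

The same relation explains why long Bit Lines, with more cells attached and therefore more capacitance, read more slowly.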

Now the writing operation, considering we want to write '0' and that '1' is stored:
Once again, the lines are pre-charged to VDD/2 before the operation. After this, they're loaded with the data we intend to write and the Sense Amplifier is activated, which will drive BL = 0 and ~BL = VDD. When the word line is selected (WL = VDD), the gateway transistors Q5 and Q6 turn on. A current will flow from Q to BL, discharging the parasitic capacitance CQ and decreasing the voltage in Q. Similarly, on the other side a current will flow from ~BL to ~Q, which will charge the parasitic capacitance C~Q, increasing the voltage in ~Q. When Q = ~Q = VDD/2, the state of the inverters will start to change, the positive feedback will take control of the state of the cell and the previous figure will no longer apply. The cell will memorize the new logic level '0'.

Now that the diagrams and the operation of the cell are understood, we can move on to the semiconductor layout. In the following layouts I've used the rules of CMOS 0.18 μm technology.

Here is the layout of the SRAM cell itself:
In the SRAM design, the size of the transistors is of the utmost importance for correct operation. Starting with the inverters, I'm using a width of W = 0.6 μm for the NMOS transistors, and since NMOS transistors are approximately twice as conductive as PMOS transistors, I'm using W = 1.2 μm for the PMOS transistors, in order to balance their conductivity. On the other hand, the gateway transistors must be 2 to 3 times larger than the NMOS transistors of the inverters, so that they are conductive enough to change the logic level stored by these transistors. However, they shouldn't be much larger than that, so that they don't occupy too much space. I opted for a width of 1.2 μm for these gateway transistors.

The cell layout has the following size:
Width: 5.8μm (58 lambda)
Height: 8.0μm (80 lambda)
Surface: 46.4μm2 (0.0 mm2)

Now the Sense Amplifier layout:
For the Sense Amplifier I used the same transistor sizes as before, W = 0.6 μm for NMOS and W = 1.2 μm for PMOS, with the exception of one NMOS transistor that connects to VSS, simply to make use of the space that was available. Besides all the transistors that were shown before on the Sense Amplifier diagram, here there are 2 extra transistors, which compose an inverter for the φS input that will activate the PMOS transistor connected to VDD.

These are the Sense Amplifier specifications I've obtained:
Width: 7.8 μm (78 lambda)
Height: 7.8 μm (78 lambda)
Surface: 60.8 μm2 (0.0 mm2)

Finally the Pre-charge and Equalizing circuit layout:
Here I also used the same transistor sizes, W = 0.6 μm for NMOS transistors.

These are the Pre-charge and Equalizing transistors specifications:
Width: 3.0 μm (30 lambda)
Height: 7.6 μm (76 lambda)
Surface: 22.8 μm2 (0.0 mm2)

Now we can unite all the previous elements: the cell, the sense amplifier and the pre-charge and equalizing circuit, and put together an SRAM memory of 1x1 bit:
Besides those previous elements, 2 more transistors were added to select between the two operations, connecting the Bit Lines with the Data In or Data Out terminals. This is necessary for simulation purposes since the line and column decoders, as well as the I/O buffers, haven't been designed. As such, 2 new inputs have also been created, the Write Enable input (WE) and the Read Enable input (RE), which, when high, turn on their respective transistor, selecting the line for input or output of data.

These are the properties of this layout:
Width: 24.9μm (249 lambda)
Height: 8.6μm (86 lambda)
Surface: 214.1μm2 (0.0 mm2)

*NOTE: On this layout the column is represented horizontally and the lines vertically, only for practicality reasons during its design.

With the 1x1 bit SRAM finished it's time to move on to the simulation. For this, let's consider we want to save the logic level '1' in the memory, and then perform a reading operation. After this, I'll store the value '0' and read the memory once again.

These are the necessary steps to write to the memory cell:
  1. Turn off the Sense Amplifier (φS = 0) and activate the Pre-charge and Equalizing circuit momentarily (φP = 1).
  2. Shut down the Pre-charge circuit (φP = 0), prepare the data to be written (Data In = 1) and give the write instruction (WE = 1).
  3. Activate the Sense Amplifier (φS = 1) to amplify the logic levels on the Bit Lines.
  4. Activate the Word Line (WL = 1) to turn on the gateway transistors, connecting the Bit Lines with the cell, allowing its value to be changed.
On the other hand, to perform a reading operation, these are the necessary steps:
  1. Turn off the Sense Amplifier (φS = 0) and activate the Pre-charge and Equalizing circuit momentarily (φP = 1).
  2. Shut down the Pre-charge circuit (φP = 0).
  3. Activate the Word Line (WL = 1) to turn on the gateway transistors, connecting the Bit Lines with the cell. The stored data will propagate to the Bit Lines. 
  4. Activate the Sense Amplifier (φS = 1) to amplify the logic levels on the Bit Lines and give the read instruction (RE = 1).
With this in mind, we must configure the input signals like this:
The timings used on the previous diagram were obtained through multiple iterations of trial and error, so that they were as short as possible but without interfering with the correct operation of the memory.
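The two step sequences translate directly into piecewise-linear (PWL) stimulus for a SPICE-style simulator. The sketch below generates such sources for the first write and read of '1'. The switch-on times follow the simulation description in this post (1, 2, 3, 6 and 7 ns, read signals off at 8 ns); the 4 ns switch-off time of the write signals, the 0.05 ns edge time, VDD = 1.8 V and the node names are assumptions of this example, not values from the original netlist.

```python
def pulses_to_pwl(pulses, v_high=1.8, t_edge=0.05):
    """Convert (t_on, t_off) pulses in ns into a list of (time_ns, volts) points.

    Assumes every pulse starts after t = 0 and that pulses do not overlap.
    """
    points = [(0.0, 0.0)]
    for t_on, t_off in sorted(pulses):
        points += [(t_on, 0.0), (t_on + t_edge, v_high),
                   (t_off, v_high), (t_off + t_edge, 0.0)]
    return points


def format_pwl(name, node, points):
    """Render one independent voltage source line in standard SPICE PWL syntax."""
    body = " ".join(f"{t:g}n {v:g}" for t, v in points)
    return f"V{name} {node} 0 PWL({body})"


# Write '1' (1-4 ns), then read it back (6-8 ns). Node names are hypothetical.
schedule = {
    "PHIP": ("phip", [(1, 2), (6, 7)]),   # pre-charge pulses
    "DIN":  ("din",  [(2, 4)]),           # data to be written
    "WE":   ("we",   [(2, 4)]),           # write instruction
    "PHIS": ("phis", [(2, 4), (7, 8)]),   # Sense Amplifier
    "WL":   ("wl",   [(3, 4), (7, 8)]),   # Word Line
    "RE":   ("re",   [(7, 8)]),           # read instruction
}

for name, (node, pulses) in schedule.items():
    print(format_pwl(name, node, pulses_to_pwl(pulses)))
```

Keeping the schedule as a table of pulses makes it easy to shorten or stretch individual phases during the trial-and-error tuning described above.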

These are the results of the first 1 bit SRAM simulation:
A netlist was also created to be able to simulate the circuit on Winspice.

The following diagram is the result of that Winspice simulation:
*NOTE: Each signal was offset by a multiple of 3 V, only to keep the signals separated and make the diagram easier to read.

As we can observe in both simulations, the first action is the activation of the Pre-charge circuit at t = 1 ns, which immediately changes the voltage on both Bit Lines, stabilizing them around VDD/2. Then, at t = 2 ns, the logic level '1' is loaded onto the Data In input, while at the same time the write instruction is given and the Sense Amplifier is turned on. This affects the Bit Lines: the voltage on BL rises towards VDD and the voltage on ~BL falls towards VSS. Shortly after, at t = 3 ns, the word line is activated, which makes the voltage in Q rise up to VDD, changing the logic level of the memory cell to '1', while the opposite occurs in ~Q. After this, the word line is deactivated, together with the Sense Amplifier and the write instruction. At this point, it no longer matters what signal is placed on the Data In input because, as we can see, the memory cell has already memorized the value '1' in Q. The Bit Line also keeps its value, but only because of the parasitic capacitance of the Bit Lines.

The reading operation of the cell begins at 6 ns: the Pre-charge circuit is activated again, stabilizing the Bit Lines around VDD/2. At 7 ns the word line is turned on, allowing the values stored in the cell to propagate onto the Bit Lines, overcoming the residual voltages left on them, while the Sense Amplifier is also activated and the read instruction is given, which makes the values reach the Data Out terminal. At t = 8 ns the word line is turned off, together with the Sense Amplifier and the read instruction, while the stored value remains on the Bit Line and the Data Out terminal, again because of their parasitic capacitances.

It should be noted that, although the value read should be VDD, the Data Out terminal presents a slightly lower value, which doesn't happen on the Bit Line. This occurs because the read/write selector was made only with NMOS transistors, so there is no symmetry in this part of the circuit: an NMOS pass transistor stops conducting once its output rises to within one threshold voltage of its gate, and the Body Effect raises that threshold further. So instead of VDD, the output will be VDD-VT.
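The size of that drop can be estimated with the usual body-effect expression, VT = VT0 + γ(√(2φF + VSB) − √(2φF)), where VSB is the output voltage itself because the pass transistor's source sits at Data Out while its body is at VSS. Since VT depends on the output and the output depends on VT, the sketch below solves it by simple fixed-point iteration. All parameter values (VDD = 1.8 V, VT0 = 0.5 V, γ = 0.4 V^0.5, 2φF = 0.6 V) are generic assumptions, not the parameters of the process used here.

```python
import math


def nmos_pass_high_level(vdd=1.8, vt0=0.5, gamma=0.4, two_phi_f=0.6, iterations=50):
    """Highest voltage an NMOS pass transistor with its gate at VDD can deliver.

    Solves v_out = vdd - VT(v_out) by fixed-point iteration, with
    VT = vt0 + gamma * (sqrt(two_phi_f + v_out) - sqrt(two_phi_f)).
    """
    v_out = vdd - vt0  # starting guess: ignore the body effect
    for _ in range(iterations):
        vt = vt0 + gamma * (math.sqrt(two_phi_f + v_out) - math.sqrt(two_phi_f))
        v_out = vdd - vt
    return v_out


v = nmos_pass_high_level()
print(f"Data Out high level: {v:.3f} V (VDD - VT0 would be {1.8 - 0.5:.3f} V)")
```

With these assumed numbers the output settles at about 1.09 V, noticeably below VDD − VT0, which is why a complementary (transmission gate) selector or an output buffer is the usual remedy.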

From the 10 ns mark the process repeats itself, this time storing the logic level '0' in the memory cell.

Now let's finally focus on the real objective of this work, the 32 bit SRAM memory:
Since the goal was to design an 8x4 SRAM, I added 7 new cells to the previous layout and obtained a 1x8 bit memory. Then I copied the entire column 3 times, getting an 8x4 bit SRAM. Each of the columns has its own Sense Amplifier, Pre-charge and Equalizing circuit and read/write selector transistors. The columns were vertically mirrored so that they could share the VDD and VSS lines, reducing the total surface of the design. There are also 8 word lines now (visible on the bottom of the previous image) as well as 4 Bit Lines (visible on the right).

These are the properties of the layout:
Width: 62.7 μm (627 lambda)
Height: 29.4μm (294 lambda)
Surface: 1843.4μm2 (0.0 mm2)
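The layout figures quoted in this post can be cross-checked with a few lines of arithmetic. The sketch below recomputes each surface from its width and height (all values copied from the lists above) and derives two numbers that are not stated in the post but follow directly from it: the area per bit and the share of the 8x4 layout taken up by the 32 cells themselves.

```python
# (width_um, height_um) as quoted in the post
layouts_um = {
    "cell": (5.8, 8.0),
    "sense_amplifier": (7.8, 7.8),
    "precharge_equalize": (3.0, 7.6),
    "sram_1x1": (24.9, 8.6),
    "sram_8x4": (62.7, 29.4),
}


def area(name):
    """Surface in square micrometres."""
    width, height = layouts_um[name]
    return width * height


for name in layouts_um:
    print(f"{name:20s} {area(name):8.1f} um^2")

bits = 8 * 4
area_per_bit = area("sram_8x4") / bits
cell_share = bits * area("cell") / area("sram_8x4")
print(f"area per bit: {area_per_bit:.1f} um^2")
print(f"share of the layout used by cells: {cell_share:.0%}")
```

The cells account for roughly 81% of the full layout, with the rest going to the per-column circuits and routing; in a larger array that overhead would be spread over many more rows.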

Moving on to the 32 bit SRAM simulation, let's consider we want to write the data in the respective positions, indicated by the previous image (in yellow). In summary I'll attempt to write:
  • '1' - On column 2, row 1 - Q[2,1]
  • '0' - On column 1, row 2 - Q[1,2]
  • '0' - On column 2, row 4 - Q[2,4]
  • '1' - On column 1, row 6 - Q[1,6]
After the write operation, I'll read the memory cells in the same order. To do that I configured the signals like this:
The writing operation will occur from 0 to 20 ns, in which the memory will store the data sequence '1001' and then it will proceed to the reading operation from 20 ns to 36 ns. This time there will be more input signals for the simulation: 4 word lines (WL[1], WL[2], WL[4] and WL[6]) and 2 Data Inputs, one for each column (DIN[1] and DIN[2]).

This is the result of the 32 bit SRAM simulation:

The correct values are stored in the desired positions. The first '1' is stored in Q[2,1] at 3 ns, then '0' at Q[1,2] around 8 ns, although it is unnoticeable on the diagram, since the previous value was also zero. The same happens for the '0' in Q[2,4] at 13 ns. Then the final '1' is stored in Q[1,6] at 18 ns.

For the reading operation, we can observe at 22 ns the value of Q[2,1] being transferred to DOUT[2], and then Q[1,2] to DOUT[1] at 26 ns. The latter is again difficult to see because the output already held the same value, and the same happens for Q[2,4] and DOUT[2] at 30 ns. Finally, the last reading takes place at 34 ns, when the value '1' is copied from Q[1,6] to DOUT[1].

By analysing this diagram we can conclude that the memory is working properly.
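A purely logical reference model is a handy companion to the analog simulation, because it states what the waveforms are expected to show. The sketch below is such a model for the 8x4 array, using the same 1-based Q[column, row] indexing as this post; the class name and its methods are hypothetical and ignore all timing and voltage levels.

```python
class SramArrayModel:
    """Logical reference model of the 8x4 array, indexed as Q[column, row] (1-based)."""

    def __init__(self, rows=8, cols=4):
        self.rows, self.cols = rows, cols
        self.cells = [[0] * cols for _ in range(rows)]  # all cells start at '0'

    def _check(self, col, row):
        if not (1 <= col <= self.cols and 1 <= row <= self.rows):
            raise IndexError("cell outside the array")

    def write(self, col, row, bit):
        self._check(col, row)
        self.cells[row - 1][col - 1] = 1 if bit else 0

    def read(self, col, row):
        self._check(col, row)
        return self.cells[row - 1][col - 1]


sram = SramArrayModel()
sequence = [(2, 1, 1), (1, 2, 0), (2, 4, 0), (1, 6, 1)]  # (column, row, bit)
for col, row, bit in sequence:
    sram.write(col, row, bit)

readback = [sram.read(col, row) for col, row, _ in sequence]
print(readback)  # [1, 0, 0, 1]
```

The read-back sequence matches the '1001' pattern seen on the DOUT signals in the simulation.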

And that's it. Overall this has been quite an interesting project, since it has allowed me to understand how digital memories work, how they are designed, what basic elements and circuits are necessary, and how these elements can be put together to achieve a functional memory.

I've also made the SRAM 1x1 bit Winspice simulation file (.cir) available here.

6 comments:

  1. Excellent explanation. Very complete. Good job.

  2. Can you explain the 2-transistor part you used for connecting the bit lines with the data in or data out terminals?

    Replies
    1. That's just for simulation purposes. Usually a finished memory design has line and column decoders and also I/O buffers. Since those weren't designed here, I needed another way to connect the Bit Lines to the data in or data out lines, according to the respective simulation procedure (write or read cycle). So basically it's simply a switch, controlled by the read/write inputs.

  3. How did you use read/write and DataIn/DataOut in your circuitry? I read your previous comment where you mentioned them as a switch, but can you please show us that part as well? I am trying to design an SRAM but got stuck on that part, so I am referring to your work as a starting point. You can reply to me by email:
    wqer1988@gmail.com

    Best regards
    Waseem Qader

    Replies
    1. Well, it's basically just a matter of following the procedures I described. As I explained above, there are 4 steps for the write cycle and 4 steps for the read cycle. If you're asking about how to actually do it yourself, I guess it will depend on the program you're using. Usually you can assign properties to each terminal/label, such as a clock, a pulse or a PWL, which allows you to create a table and define your own function in time and voltage.
