Basic MIPS implementation
Basic MIPS Instructions
MIPS, which was an acronym for Microprocessor without Interlocking Pipe Stages, was a very successful microprocessor in the 1990s. Over time it was unable to compete with the resources thrown at the Intel family in the PC business or with the success of the ARM chips in the mobile phone devices, but it has carved out a solid niche in the embedded market – routers, cable modems, set-top boxes, and gaming consoles such as the PlayStation II. Its main advantages are its relatively simple (and regular) instruction set, and its low cost and low power requirements. The regular (though complete) instruction set is the main reason MIPS processors are used in Computer Architecture courses. It helps that the definitive work on the topic was written by the original proponent of the architecture at Stanford University in the late 1980s and early 1990s.
MIPS was revolutionary for its time. It was probably the chip that settled the longstanding competition between RISC chips and the then-dominant CISC chips. All new microprocessor chips made today are RISC. The only surviving CISC chip in use is the Intel family, and it could be argued that it is only Intel’s Herculean effort and resources (and their large installed base) that has kept this technology in the forefront.
The MIPS instruction set is easier to program, use, and implement for several reasons
- every instruction is a standard length – for our family, 32 bits.
- there is a large number (32) of uniform registers. For 31 of these registers their use is restricted only by convention. (But in reality only 26 registers are generally available)
- there are very few different instruction encodings
- there are no condition codes
- it can function in either big- or little-endian mode
- the procedure calling convention is lightweight, enabling simple leaf procedures to be called with very little overhead
- the assembly language is easily optimized and its regularity (and large number of registers) makes “smart” compilers easier to write. (In fact, early MIPS chips relied on the compiler to generate code in a special way to avoid the necessity of pipeline interlocks, hence its acronym)
MIPS future was in question several times over its history. As recently as ten years ago it seemed destined for limited use in embedded devices. Since then, however, a huge investment has been made in enhancing and licensing the technology from various sources, culminating in the purchase of the company itself by a Chinese consortium a few years ago. This was preceded by many years of independent development of MIPS-based processors (named Gobson and Loongson. It was at that time on the short list of processors being considered to be chosen for a national Chinese architecture. It is curious that the company was purchased (thus alleviating any outstanding patent issues) soon afterwards. The winner of the national architecture was never announced in the media. You can read a lot of details about the status of the various Chinese MIPS processors on wikipedia here.
All MIPS instructions are 32-bits. To ease the writing of assembly instructions given this restriction, the MIPS assembler allows pseudoinstructions, which appear to be simple instructions, but, instead generate a sequence of up to three actual MIPS instructions. To facilitate the use of pseudoinstructions, register 1 ($1 or $at) is reserved as the assembler temporary.
All operations are performed in registers on 32-bit quantities. The only instructions that access memory are loads and stores, and these are the only instructions that indicate data size. Loads and stores must be properly aligned. Thus a load of a four-byte quantity must be done from an address that is a multiple of four. Loads of small sizes (halfword or byte) have signed and unsigned versions.
As indicated, MIPS has a convention for register usage. Registers are divided into register classes:
- special-use. These registers are reserved for a special purpose. This includes $0, $at, $v0/$v1 (function result), the stack, frame, and global pointer registers, the return address register, and the two kernel registers (10 total). Except for the kernel registers, the stack pointer, and $0, you may use these registers for other uses if you obey the rules.
- $a0-$a3 are argument registers. We will discuss them later.
- $s0-$s7 are “saved” registers. These are usually used to “hold” a value for a long period of time. Again, later.
- $t0-$t9 are “temporary” registers. Use these registers for calculations.
Basic instruction encoding.
MIPS instructions are divided into fields, just like our Simple Machine. The fields indicate the operation, and the operands. Let’s take this a step at a time
- the operation, strangely, is split into two fields – the opcode field and, in some instructions, the function field, each 6 bits wide. The opcode field comprises the most-significant bits of the instruction. The function field comprises the least-significant part. Basically this design “moves” the indication of the operation between the two fields based on which instruction encoding is used, as we will see.
- every instruction has space for two registers, each encoded in 5 bits.
- some instructions have space for a third register, a shift field, and the optional function field. Other instructions use this space for a 16-bit constant.
Let’s see how this works:
Arithmetic instructions are encoded as R-type instructions. They have the following parts
- an opcode field that basically says the opcode is in the function field instead.
- two source registers (rs and rt)
- a destination register (rd)
- an optional 5 bit shift amount (for shift instructions)
They are used, of course, for arithmetic operations such as add, shift, and, and or. (Note that the use of the registers changes in the shift instructions. See below.)
add rd,rs,rt # rd = rs + rt (signed addition. overflow is possible)
addu rd,rs,rt # rd = rs + rt (unsigned addition)
and rd,rs,rt # rd = rs AND rt
sll rd,rt, N # rd = rt << N (0<N<31)
If the integers a, b, and c are in registers $t0, $t1 and $t2 respectively, write code for the following:
c = a-b+c
First, lets rewrite this as
c += a-b
One way of coding this would be
add $t2,$t2,$t0 # c += a
sub $t2,$t2,$t1 # c -= b
An alternate way would use a temporary register
sub $t3, $t0, $t1 # t3 = (a-b)
add $t2, $t2, $t3 # c += t3
add vs addu
Bitwise addition proceeds the same whether the operands are signed or unsigned. Thus, there does not seem to be any reason for add and addu instructions.
Instead of indicating any difference in how the basic instruction operates, the difference between add and addu has to do with overflow.
If you recall, overflow is indicated by comparing the signs of the operands to that of the result. The only question when using addition is do you want to know if there was an overflow?
The add instruction should be used when you want to know about overflow. The addu instruction should be used when you dont want to know about overflow. Generally, when dealing with signed numbers, you want to use add, and when dealing with unsigned numbers, you want to use addu.
Note that both addi and addiu (where one operand is a constant) use sign-extension when the constant is converted to 32-bits.
Using constants in arithmetic instructions
Besides the shift operators, some arithmetic instructions have forms that allow the second source operand to be a small constant. These include add, as well as the bitwise operators and, or and xor. This form uses the lower 16-bits of the instruction to hold a 16-bit constant. For example
addi rt,rs,imm e.g. addi $t0,$t2,1024
In the case of add, the constant is signed. In the case of the bitwise operators, the constant is unsigned.
You should note for later that the destination register in this instruction is rt(!)
This form of the instruction uses a different encoding, called I-type. We will discuss the encodings in a later section. The I-type encoding is important, however, as this is the encoding used for load and store operations.
As mentioned previously, all operations in MIPS code is done in registers. Thus, before an operation can be performed, data must be loaded into registers. After the operation is complete, the data must be stored from the register(s) to its permanent location in memory.
Although all operations are done on the entire register (i.e., 32-bits), load and store operations must deal with the three different sizes available – bytes, halfwords (shorts) and words (ints). This is indicated by the size letter in the load or store operation:
lw – load 32-bits
lb – load 8-bits
Loading and storing 32-bits is simple. For the store instruction, storing smaller quantities (bytes or halfwords) simply ignores the most-significant bits of the register. When loading smaller quantities, we must do something about the other bits in the register.
If the datum being loaded is a signed quantity, the small datum is sign-extended to fill the entire register. This is the default of the load instructions, thus, lb is inherently signed.
If the datum being loaded is an unsigned quantity, the most-significant bits of the register should be zeroed. These load instructions have the additional character u for unsigned. Thus, lbu is load byte unsigned. It is inherently unsigned.
If you load the upper (most-significant) part of the instruction (with lui, as we shall see) the lower 16 bits are zeroed.
The load and store instructions have a register (the destination for load instructions, and the source for store instructions) followed by a memory specification. Without getting hung up on the memory specifier right now, let’s look at how loading a datum affects the register. If the datum being loaded has the value -2, here is what the register looks like after the load instruction:
The Memory Operand
There is a single way to address memory in instructions, by using the addressing mode base + displacement. This means a register holds a memory address, and the effective memory address (the one loaded from or stored to) is that address plus a constant. The constant, which is a 16-bit signed constant, is specified in the instruction.
For example, if you want to load a 32-bit word from an address contained in register $t0, placing the word in register $t1, the instruction would be
Thus, 0 is added to the contents of $t0 to get the address of the 32-bit quantity to load. If $t0 is the base of an array, and the element of the array you want has a constant index like 1, you could access the element (array) by adjusting the constant. But there is a trick
the constant must be multiplied by the size of the array element!
Thus, if the array is an integer array, and you want element #1 (array), the instruction would be
If the array is a character array and you want element #1, the instruction is
Suppose you are operating on an array and you need to access array[i], where the array is an integer array, the address of the start of the array (the base of the array) is in $t0, and the value of i is in $t2. Here is what you do:
sll $t3,$t2,2 # multiply $t2 by 4 (the size of an element)
Note that storing to memory takes the same format, and the memory address is still the right-hand operand. Thus, the equivalent store operation to that above is
Loads and stores must be aligned. This means that you can only load an N-byte quantity (N=1,2,4,…) from an address that is a multiple of N bytes. Thus, while it is legal to load a byte from any memory address, it is only legal to load a word from an address that is a multiple of 4.
Loading an initial address
Loading the base address of the array into the $t0 register in the example above is a bit tricky. Remember, an instruction only has room for a 16-bit constant, and addresses are 32-bits! This means we must initialize $t0 in two steps:
- the upper 16 bits of the register are set with the lui instruction (load upper immediate)
- the lower 16 bits of the register are set using an ori instruction OR an add instruction.
If the desired address is 0x01004008:
lui $t0, 0x100 # $t0 now has 0x01000000
addi $t0,$t0,0x4008 # $t0 now has 0x01004008
Note that the add instruction has a 16-bit signed constant. If the address wanted was 0x01008008, we could not use an add instruction! This is what would happen:
lui $t0, 0x100 # $t0 now has 0x01000000
addi $t0,$t0,0x8008 # $t0 now has 0x00ff8008
There are two solutions to this. First, we could simply add one to the constant used in the lui instruction
lui $t0, 0x101 # $t0 now has 0x01010000
addi $t0,$t0,0x8008 # $t0 now has 0x01008008
an alternative solution is to use an ori instruction instead of an add. (The constant inandi andori is unsigned)
lui $t0, 0x100 # $t0 now has 0x01000000
ori $t0,$t0,0x8008 # $t0 now has 0x01008008
This operation (loading a 32-bit address into a register) is so common that it has pseudoinstruction in MIPS assembler: la (for load address). This means the instruction la can be used in MIPS but it is translated to the appropriate lui/ori combination. In our case this would be
la $t0, 0x1008008
The global pointer
As you see, loads from 32-bit addresses (“static” data) are cumbersome. To streamline such loads, MIPS programs are laid out in memory so that “static data” (as opposed to data on the stack) is in blocks of 64kB. A register is then initialized with an address that is in the middle of the current block of data. Later, that register can be used as the base register, and loads and stores can reference memory using a signed offset from it. If there is more than one 64kB block of data for a program, the global pointer can be initialized each time a function is entered. If a function’s data is larger than 64kB, the global pointer will have to be manipulated more often.
In our programs, data will be very small. We could simply use a global pointer that points to the beginning of our memory area. That way offsets are always positive.
The MIPS assembler(s) (including MARS) relax the strict MIPS assembly language specification by letting us be sloppy about the exact operand in some cases, as well as adding complex instructions that must be implemented as a sequence of instructions. Let me give you a few examples:
For convenience, several uses of add appear as pseudoinstructions. For example
addi $t0,$t1,-16 can be coded as the pseudoinstruction subi $t0,$t1,16
add $t0,$zero,$t1 can be coded as the pseudoinstruction move $t0,$t1
addi $t0,$zero,1024 can be coded as the pseudoinstruction li $t0,1024
In most cases, an instruction that allows a 16-bit immediate operand has a corresponding pseudoinstruction for the form with a 32-bit immediate operand. Thus
lw $t0,0x1000($s0) is a real instruction but
lw $t0,0x10000($s0) is a pseudoinstruction
Last, the assembler is forgiving about the exact instruction ending. Thus it will allow the instruction
when the instruction is actually addi.
A complete list of pseudoinstructions implemented in MARS is available on its help page.