$24
Figure 1 shows a digital system that contains a number of 16-bit registers, a multiplexer, an adder/subtracter, and a control unit (finite state machine). Information is input to this system via the 16-bit DIN input, which is loaded into the IR register. Data can be transferred through the 16-bit wide multiplexer from one register in the system to another, such as from register IR into one of the general purpose registers r0; : : : ; r7. The multiplexer’s output is called Buswires in the figure because the term bus is often used for wiring that allows data to be trans-ferred from one location in a system to another. The FSM controls the Select lines of the multiplexer, which allows any of its inputs to be transferred to any register that is connected to the bus wires.
The system can perform different operations in each clock cycle, as governed by the FSM. It determines when particular data is placed onto the bus wires and controls which of the registers is to be loaded with this data. For example, if the FSM selects r0 as the output of the bus multiplexer and also asserts Ain, then the contents of register r0 will be loaded on the next active clock edge into register A.
Addition or subtraction of signed numbers is performed by using the multiplexer to first place one 16-bit number onto the bus wires, and then loading this number into register A. Once this is done, a second 16-bit number is placed onto the bus, the adder/subtracter performs the required operation, and the result is loaded into register G. The data in G can then be transferred via the multiplexer to one of the other registers, as required.
16
16
16
A in
16
16
r0in
r7in
r0
r7
A
Clock
AddSub
Adder/Subtracter
16
Gin
Multiplexer
G
16
Select
Buswires
4
IR in
DIN
16
IR
Run
Control unit FSM
Resetn
Done
Figure 1: A digital system.
1
A system like the one in Figure 1 is often called a processor. It executes operations specified in the form of instructions. Table 1 lists the instructions that this processor supports. The left column shows the name of an instruction and its operands. The meaning of the syntax rX Op2 is that the second operand, Op2, is loaded into register rX. The operand Op2 can be either a register, rY, or immediate data, #D.
Instruction
Function performed
mv
rX, Op2
rX
Op2
mvt
rX, #D
rX15
8
D15 8
add
rX, Op2
rX
rX + Op2
sub
rX, Op2
rX
rX
Op2
Table 1: Instructions performed in the processor.
Instructions are loaded from the external input DIN, and stored into the IR register, using the connection indicated in Figure 1. Each instruction is encoded using a 16-bit format. If Op2 specifies a register, then the instruction encoding is III0XXX000000YYY, where III specifies the instruction, XXX gives the rX register, and YYY gives the rY register. If Op2 specifies immediate data #D, then the encoding is III1XXXDDDDDDDDD, where the 9-bit field DDDDDDDDD represents the constant data. Although only two bits are needed to encode our four instructions, we are using three bits because other instructions will be added to the processor later. Assume that III = 000 for the mv instruction, 001 for mvt, 010 for add, and 011 for sub.
The mv instruction (move) copies the contents of one register into another, using the syntax mv rX,rY. It can also be used to initialize a register with immediate data, as in mv rX,#D. Since the data D is represented inside the encoded instruction using only nine bits, the processor has to zero-extend the data, as in 0000000D8 0, before loading it into register rX. The mvt instruction (move top) is used to initialize the most-significant byte of a register. For mvt, only eight bits of the D field in the instruction are used, so that mvt rX,#D loads the value D15 800000000 into rX. As an example, to load register r0 with the value 0xFF00, you would use the instruction mvt r0,#0xFF00. The instruction add rX,rY produces the sum rX + rY and loads the result into rX. The instruction add rX,#D produces the sum rX + D, where D is zero-extended to 16 bits, and saves the result in rX. Similarly, the sub instruction generates either rX rY, or rX #D and loads the result into rX.
Some instructions, such as an add or sub, take a few clock cycles to complete, because multiple transfers have to be performed across the bus. The finite state machine in the processor “steps through” such instructions, asserting the control signals needed in successive clock cycles until the instruction has completed. The processor starts executing the instruction on the DIN input when the Run signal is asserted and the processor asserts the Done output when the instruction is finished. Table 2 indicates the control signals from Figure 1 that have to be asserted in each time step to implement the instructions in Table 1. The only control signal asserted in time step T0, for all instructions, is IRin. The meaning of Select = rY or IR in the table is that the multiplexer selects either register rY or the immediate data in IR, depending on the value of Op2. For the mv instruction, when IR is selected the multiplexer outputs 0000000DDDDDDDDD, and for mvt the multiplexer outputs DDDDDDDD00000000. Only signals from Figure 1 that have to be asserted in each time step are listed in Table 1; all other signals are not asserted. The meaning of AddSub in step T2 of the sub instruction is that this signal is set to 1, and this setting causes the adder/subtracter unit to perform subtraction using 2’s-complement arithmetic.
The processor in Figure 1 can perform various tasks by using a sequence of instructions. For example, the sequence below loads the number 28 into register r0 and then calculates, in register r1, the 2’s complement value 28.
mv
r0, #28
//
original number = 28
mvt
r1, #0xFF00
add
r1, #0x00FF
//
r1
=
0xFFFF
sub
r1,
r0
//
r1
= 1’s-complement of r0
add
r1,
#1
//
r1
=
2’s-complement of r0
= -28
2
T0
T1
T2
T3
mv
IRin
Select = rY or IR,
rXin, Done
mvt
IRin
Select = IR,
rXin, Done
add
IRin
Select = rX,
Select = rY or IR,
Select = G, rXin,
Ain
Gin
Done
sub
IRin
Select = rX,
Select = rY or IR,
Select = G, rXin,
Ain
AddSub, Gin
Done
Table 2: Control signals asserted in each instruction/time step.
Part I
Implement the processor shown in Figure 1 using Verilog code, as follows:
Make a new folder for this part of the exercise. Part of the Verilog code for the processor is shown in parts a to c of Figure 2, and a more complete version of the code is provided with this exercise, in a file named proc.v. You can modify this code to suit your own coding style if desired—the provided code is just a suggested solution. Fill in the missing parts of the Verilog code to complete the design of the processor.
module proc(DIN, Resetn, Clock, Run, Done); input [15:0] DIN;
input Resetn, Clock, Run;
output Done;
parameter T0 = 2’b00, T1 = 2’b01, T2 = 2’b10, T3 = 2’b11;
: : declare variables assign III = IR[15:13]; assign IMM = IR[12]; assign rX = IR[11:9]; assign rY = IR[2:0];
dec3to8 decX (IR[4:6], 1’b1, Xreg);
// Control FSM state table
always @(Tstep_Q, Run, Done)
case (Tstep_Q)
T0: // data is loaded into IR in this time step if (~Run) Tstep_D = T0;
else Tstep_D = T1;
T1: :::
: : :
endcase
Figure 2: Skeleton Verilog code for the processor. (Part a)
3
// mv rX, rY // mv rX, #D
parameter mv = 3’b000, mvt = 3’b001, add = 3’b010, sub = 3’b011;
// selectors for the BusWires multiplexer
parameter Sel_R0 = 4’b0000, Sel_R1 = 4’b0001, : : :, Sel_R7 = 4’b0111, Sel_G = 4’b1000, Sel_D = 4’b1001, Sel_D8 = 4’b1010;
control FSM outputs always @(*) begin
Done = 1’b0; Ain = 1’b0; : : : // default values for variables case (Tstep_Q)
T0: // store DIN into IR
IRin = 1’b1;
T1: // define signals in time step T1
case (III)
mv: begin
if (!IMM) Sel = rY; else Sel = Sel_D; Rin = Xreg; Done = 1’b1;
end
mvt: // mvt rX, #D
: : :
endcase
T2: // define signals in time step T2
case (III)
: : :
endcase
T3: // define signals in time step T3
case (III)
: : :
endcase
default: ;
endcase
end
// Control FSM flip -flops
always @(posedge Clock, negedge Resetn)
if (!Resetn)
: : :
regn reg_0 (BusWires, Rin[0], Clock, R0); regn reg_1 (BusWires, Rin[0], Clock, R1);
: : :
regn reg_7 (BusWires, Rin[0], Clock, R7);
: : instantiate other registers and the adder/subtracter unit Figure 2: Skeleton Verilog code for the processor. (Part b)
4
define the internal processor bus always @(*)
case (Sel)
Sel_R0: BusWires = R0;
Sel_R1: BusWires = R1;
: : :
Sel_G: BusWires = G;
Sel_D: BusWires = : : :; // used for mv, add, ..., with #D
Sel_D8: BusWires = : : :; // used for mvt
default: BusWires = 16’bxxxxxxxxxxxxxxxx; endcase
endmodule
module dec3to8(W, Y);
input [2:0] W;
output [0:7] Y;
reg [0:7] Y;
always @(*)
case (W)
3’b000: Y = 8’b10000000;
3’b001: Y = 8’b01000000;
3’b010: Y = 8’b00100000;
3’b011: Y = 8’b00010000;
3’b100: Y = 8’b00001000;
3’b101: Y = 8’b00000100;
3’b110: Y = 8’b00000010;
3’b111: Y = 8’b00000001;
endcase
endmodule
Figure 2: Skeleton Verilog code for the processor. (Part c)
Set up the required subfolder and files so that your Verilog code can be compiled and simulated using the ModelSim Simulator to verify that your processor works properly. An example result produced by using ModelSim for a correctly-designed circuit is given in Figure 3. It shows the value 0x101C being loaded into IR from DIN at time 30 ns. This pattern represents the instruction mv r0,#28, where the immediate value D = 28 (0x1C) is loaded into r0 on the clock edge at 50 ns. The simulation results then show the instruction mvt r1,#0xFF00 at 70 ns, add r0,#0xFF at 110 ns, and sub r1,r0 at 190 ns.
You should perform a thorough simulation of your processor with the ModelSim simulator. A sample Verilog testbench file, testbench.v, execution script, testbench.tcl, and waveform file, wave.do are provided along with this exercise.
5
Figure 3: Simulation results for the processor.
Part II
In this part we will implement the circuit depicted in Figure 4, in which a memory module and counter are connected to the processor. The counter is used to read the contents of successive locations in the memory, and this data is provided to the processor as a stream of instructions. To simplify the design and testing of this circuit we have used separate clock signals, PClock and MClock, for the processor and memory. Do the following:
A Quartus project file is provided along with this part of the exercise. Use the Quartus software to open this project, which is called part2.qpf.
A sample top-level Verilog file that instantiates the processor, memory module, and counter is shown in Figure 5. This code is provided in a file named part2.v; it is the top-level file for the Quartus project part2.qpf. The code instantiates a memory module called inst_mem. You have to create a Verilog file that represents this memory module by using the Quartus software, as described below.
Memory
Processor
Counter
n
data
16
DIN
addr
Done
Done
Resetn
Run
MClock
PClock
Resetn
Run
Figure 4: Connecting the processor to a memory module and counter.
6
module part2 (KEY, SW, LEDR);
input [1:0] KEY;
input [9:0] SW;
output [9:0] LEDR;
wire Done, Resetn, PClock, MClock, Run;
wire [15:0] DIN;
wire [4:0] pc;
assign Resetn = SW[0];
assign MClock = KEY[0];
assign PClock = KEY[1];
assign Run = SW[9];
proc U1 (DIN, Resetn, PClock, Run, Done); assign LEDR[9] = Done;
inst_mem U2 (pc, MClock, DIN);
count5 U3 (Resetn, MClock, pc);
endmodule
module count5 (Resetn, Clock, Q);
input Resetn, Clock;
output reg [4:0] Q;
always @ (posedge Clock, negedge Resetn)
if (Resetn == 0)
Q <= 5’b00000;
else
Q <= Q + 1’b1;
endmodule
Figure 5: Verilog code for the top-level module.
A diagram of the memory module that you need to create is depicted in Figure 6. Since this memory module has only a read port, and no write port, it is called a synchronous read-only memory (synchronous ROM). Note that the memory module includes a register for synchronously loading addresses. This register is required due to the design of the memory resources in the Intel FPGA chip.
Use the Quartus IP Catalog tool to create the memory module, by clicking on Tools IP Catalog in the Quartus software. In the IP Catalog window choose the ROM: 1-PORT module, which is found under the Basic Functions On Chip Memory category. Select Verilog HDL as the type of output file to create, and give the file the name inst_mem.v.
Follow through the provided dialogue to create a memory that has one 16-bit wide read data port and is 32 words deep. Figures 7 and 8 show the relevant pages and how to properly configure the memory.
7
Address
5
5
Clock
32 x 16 ROM
16
DataOut
Figure 6: The 32 x 16 ROM with address register.
Figure 7: Specifying memory size.
To place processor instructions into the memory, you need to specify initial values that should be stored in the memory when your circuit is programmed into the FPGA chip. This can be done by initializing the memory using the contents of a memory initialization file (MIF). The appropriate screen is illustrated in Figure 9. We have specified a file named inst_mem.mif, which then has to be created in the folder that contains the Quartus project. Clicking Next two more times will advance to the Summary screen, which lists the names of files that will be created for the memory IP. You should select only the Verilog file inst_mem.v. Make sure that none of the other types of files are selected, and then click Finish.
An example of a memory initialization file is given in Figure 10. Note that comments (% : : : %) are included in this file as a way of documenting the meaning of the provided instructions. Set the contents of your MIF file such that it provides enough processor instructions to test your circuit.
The code in Figure 5, and the Quartus project, includes the necessary port names and pin location assign-ments to implement the circuit on a DE-series board. The switch SW9 drives the processor’s Run input, SW0 is connected to Resetn, KEY0 to MClock, and KEY1 to PClock. The Run signal is displayed on LEDR0 and Done is connected to LEDR9.
8
Figure 8: Specifying which memory ports are registered.
Figure 9: Specifying a memory initialization file (MIF).
9
Use the ModelSim Simulator to test your Verilog code. Ensure that instructions are read properly out of the ROM and executed by the processor. An example of simulation results produced using ModelSim with the MIF file from Figure 10 is shown in Figure 11. The corresponding ModelSim setup files are provided along with this exercise.
Once your simulations show a properly-working circuit, you may wish to download it into a DE-series board. The functionality of the circuit on the board can be tested by toggling the switches and observing the LEDs. Since the circuit’s clock inputs are controlled by pushbutton switches, it is possible to step through the execution of instructions and observe the behavior of the circuit.
DEPTH = 32;
WIDTH = 16;
ADDRESS_RADIX = HEX;
DATA_RADIX = BIN;
CONTENT
BEGIN
00 : 0001000000011100;
% mv r0, #28
%
01 : 0011001011111111;
% mvt r1, #0xFF00
%
02 : 0101001011111111;
% add r1, #0xFF
%
03 : 0110001000000000;
% sub r1, r0
%
04 : 0101001000000001;
% add r1, #1
%
05 : 0000000000000000;
: : (some lines not shown) 1F : 0000000000000000; END;
Figure 10: An example memory initialization file (MIF).
Figure 11: An example simulation output using the MIF in Figure 10.
Enhanced Processor
It is possible to enhance the capability of the processor so that the counter in Figure 4 is no longer needed, and so that the processor has the ability to perform read and write operations using memory or other devices. These enhancements involve adding new instructions to the processor, as well as other capabilities—they are discussed in the next lab exercise.
10