Thursday, October 30, 2008

Lecture 9


 

Lecture 9


 

  • Character Strings
    • Storage techniques
      • 1.1
        • Counting – not used
        • Count number of characters in a string, one by one until string is fully read
      • 1.2
        • Sentinel - used
          • Adds a null character '\0' to the end of the string
    • DATA Segment
      • A way to put data into main memory
      • Open with assembler directive '.data' pus string into main memory
      • Msg: .ascii "the sum is: "
        • The directive ascii puts the string into memory without null termination
      • CRLF: .asciiz "\n"
        • The directive asciiz puts string into memory with null termination
      • System services strings
        • Used for input / output
        • Very primitive
        • Unique to system (not the same on all platforms)
        • Spim on unix
        • See text appendix A 48 and the lesson nmber two
        • Invoked using 'syscall'
      • Memory access by bytes
        • Byte load and save
          • Load byte
            • Lb ch, 0(aa)
              • Aa is he base
              • 0 is the offset
              • Ch is the absolute address of register to load to
              • Lb is 'load byte' to register from main memory
              • Actual address = base + offset
          • String copy
            • Sub $sp, $sp, 4
            • Sw $s, 0($sp)
            • Save the saved register on stack to start
          • Void strcopy (char x[], char y[])
          • {int i = 0;
          • While((x[i] = y[i]) !=0_
          • i = i+1

Lecture 11


 

Hardware Software interface

Hardware            software


 

                

Hardware Software Intersection

  • Design interface and aspects
  • Execution time = # of instructions * execution time per instruction
  • CCycles = # of instructions * CCycles Per Instruction
  • Average Clock Cycle Per Instruction
    • A = 2
    • B = 4
    • C= 1
  • Solution =    
  • This solution depends on them happening just as often, this is naive
  • Mix of instructions

10

A

6

B

4

C

20

 


 


  • Clock Cycles per Instruction multicycle machine
  • Instruction types A,B,C
  • Example
    • Suppose we have the following:
    • Where CPI is –clocks per instruction

Mc

ISA

Clock Cycles

CPI

A

A

250 picoseconds

2.0

B

A

500 picoseconds

1.2

  • Which machine is faster by how much??
  • CPU CC A = 2x
  • CPU CC B = 1.2x
  • Cpu time A = CPU clock cycles A * Clock cycle time for A
    • =x * 2.0 * 250ps = 500xps
  • Cpu time B = x * 1.2 * 500ps = 600ps
  • b/a = 600/500 = 1.2
  • a = 1.2x faster


 

Processor Datapath and Control

  • in a single cycle machine all instructions are carried out in a single cycle
  • Fetch Execute Cycle

_LOOP

||    Fetch Instruction

||    Execute Instruction

||_    End Loop


 

FETCH / EXECUTE CYCLE


 

  • Functioning of a Computer at the Lowest Level Virtual Machine
    • No compiling
    • Like interpreter
    • Sequential
  • PC keeps track of where we are in the program
    • PC = Program Counter

Major Issues

  • Single clock cycle per instruction
  • Data path resource used at most once per instruction
  • Need to replicate some data path resources if need more than once: memory, ALU/adders
  • Timing, testing of single cycle
  • Single cycle datapath

Single Cycle

  • All instructions in one cycle
  • Cycle needs to be equal to the time for longest instruction

How to Design a Processor

  • Analyze instruction set =>datapath requirements
    • The meaning of each instruction is given by the register transfers
    • Datapath must include storage element for ISA registers
      • Possibly more
    • Datapath must support each register transfer
  • Select set of datapath components and establish clocking methodology
  • Assemble datapath meeting the requirements
  • Analyze implementation of each instruction to determine setting of control points that effects the register transfer
  • Assemble the control logic

Wednesday, October 15, 2008

Lecture 10


 

Lecture 10


 

  • Two dimensional array pointers
    • Aa is the address of C(0,0)
       

      0

      1

      2

      3

      4

      0

      Aa

          

      1

         

      c

       

      2

           
    • From 2-3 array index to vector index of imbedded 2-d array
  • C = (i,j) à y = n*i + j
  • From imbedded index to array indicies
  • I=floor[y/n]
  • J=rem[y/n]=y-floor[y/n]*n

Machine Language


 

  • Instruction sequence of zeros and ones
  • 32 bits
  • Keep simple fixed size
  • Simplicity of instruction implies regularity implies design simple
    • Instructions in MIPS are symbolic and must be mapped into binary numbers
    • L1 machine language and L2 MIPS assembly language are not too different
  • Machine Language to Assembler
    • MIPS instruction is translated to 32 bit mc instruction
    • Translator follows rules for writing these statements as machine code
  • Machine instruction basic format
    • Simplify and tradeoffs
      • Common fields
      • Common format
      • Fields have names
      • Instruction size fixed
  • Field use depends on instruction type
  • Type R arithmetic

Bits

6

5

5

5

5

6

 

High order

Op

Rs

Rt

Rd

Shamt

Funct

Low order

  • Op is operation
  • Rs and rt are source registers
    • First and second operands
  • Rd is the 'register for destination

Bits

6

5

5

16

 

High order

Op

Rs

Rt

Address or constant

Low order


 

Example: add $t0 $s1 $s2


 

Add is the ops code, t0 is the destination s1, s2 are the operands

Translated


 

0

7

18

8

0

32

Op

Rs

Rt

Rd

Shamt

funct


 

^machine language in decimal


 

6

5

5

5

5

6

000000

10001

10010

01000

00000

100000

Op

Rs

Rt

Rd

Shamt

funct

^ machine language in binary / machine language code


 

  • Using the MIPS instruction sheet
  • Add(u)     reg0    reg1    reg2    type0x        order

    R20(21)        1,2,0

  • This tells us the MIPS instruction format
  • The purple tells us the machine format it changes to

    R 20 (21)    àr Type instruction

            àop code always 0 & funct code given as 20


     

    1,2,0    àorder of MIPS registers in machine format

            àreg1 àrs reg2àrt, reg0 àrd


     

    Machine format R type

    op rs rt rd shamt funct


 

CONSTANTS OR IMMEDIATE OPERANDS

  • lw $t0, address Constantx($0)

    add $sp, $sp, $t0

    memory access: memory address of constant x

    t0 ßconstant x

  • instruction with constant in it à no memory access

    addi $sp, $sp, 4    spß sp + 4

    addi $sp, $sp, -8    spßsp – 8


     

Op

Rs

Rt

Immediate constant


 

Constant can be negative: range [-32768, 32767]


 

  • what if the constant is larger than 16 bits?
  • Set upper (higher order) 16 bits to 255
    • #upper t0, ß 255
    • Use 'lui'
    • Example: lui $s0, 255

31 16

15 0

255

0


 

Example 2:


 

Lui $s0, 61


 

31 16

15 0

61

0


 

Addi $s0, $s0, 2304

31 16

15 0

61

2304


 

Becomes

0000|0000|0011|1101

0000|1001|0000|0000


 


 

JUMP INSTRUCTION


 

  • J label à j exit
  • J address à j 1000
  • Jump to actual address in words

J TYPE FORMAT


 

Bits

6

31 26

26

25 0

 

High order

Op

address

Low order


 

Example:


 

J 1000


 

High order

2

1000

Low order


 

  • The jump address is in words this increases the addressing space by four times

Addressing Space


 

  • Note that only 26 bits in jump thus can address 226 = 67108864 different words or 228 = different bytes
  • But we could have addresses up to 32 bits or 16 times as many if our address were in a 32 bit word
  • This means all instructions of one program should be in one block of memory
  • Means a program can be up to 6.7 million statements (assembly or machines language) without doing anything extra

Blocks from a memory with 32 bit addresses

0000

0001

0010

0011

0100

0101

0110

0111

1000

1001

1010

1011

1100

1101

11110

1111


 

CONDITIONAL BRANCH


 

bne    a,b,label

compare two operands (a and b) and go to label if the condition holds

if a C b then go to label


 

instruction format

Bits

6

5

5

16

 

High order

5

16

17

exit

Low order


 

PC RELATIVE ADDRESSING


 


 

  • Is the word in the CPU for keeping track of the place of the next instruction location in main memory
  • PC Register R + branch address
  • 232 cells in program all cells are nonnegative integers thus can use whole word no sign required
  • 232 = 4, 294, 867, 296 ~ 4 billion bytes = 1 billion words
  • 4 gigabytes = 1 giga words
  • Would register R be?
    • What should register R be?
    • Note: most conditional branching
      • Loops
      • If conditionals
      • Therefore close by close to where we are now in the program
  • Therefore R = current PC


     

Summary of addressing modes


 

Register addressing

Operand is a register

Base or displacement

Operand is at the memory location whose address is sum of a register and a constant

Immediate addressing

Operand is a constant within the instruction

Pc relative addressing

Address is the sum of the PC plus the constant in the instruction

Psuedodirect addressing

Jump address is the 26 bits of the instruction concatenated with the upper bits of the PC


 


 

Monday, September 22, 2008

Lecture 6


 

  • IDEA for switch
    • Need to jump to right place in program or right label or right address
    • The label or address is dependent on what value of k
  • Unconditional Jump
    • Based on value of register = address to jump to is in the register
      • jr
        • jr , jump register
  • Modularity
  • Why use subprograms?
    • Easy to understand
    • Better organization
    • Reusable
    • Reduce complex tasks to simpler sets of task
    • Reliable, less errors
  • All programs are implemented / run sequentially
  • jal
    • jump and link procedure
      • jumps to a procedure address
      • saves the return address in $ra
  • jr
    • jump register
      • jump based on what is in $ra
  • $a0, $a1, $a2, $a3
    • Arguments
    • Caller calls the subprogram with arguments from $a*
    • The callee places subprograms result in $v0 and $v1
  • $ra
    • Return address
  • $sp
    • Stack pointer

The Stack


 

  • Procedure has only a total of 6 memory locations for data $a1-$a4 & $v0-$v1
  • A very specific kind of memory
    • It's in main memory but
    • Has a particular structure
    • It is a subset of main memory and it's run in a particular way
    • It has a top and a bottom but all we can do is push stuff on it and take stuff off the top of it

Wednesday, September 17, 2008

Lecture 5

Lecture 5


 

  • Syscall
    • Used for input / output
    • Very primitive
    • Unique to system
    • Spim on unix
    • See text appendix A48 and the lesson number one
    • Invoked using syscall
    • $v0 will contain the service number (there are 10 services)
    • $a0..$a3 arguments
    • Print integer service number n = 1
  • Multiply In MIPS
    • mult rs, rt
  • Division in MIPS
    • div rs, rt
      • rs/rt
  • Structured loops
    • While (save[i] == k ) i = i+j
      • Loop in MIPS
  1. Get address of save[i]
  2. Load save[i]
  3. Go to exit if save != k
  4. I ß i+ j increment i
  5. Go to loop
  • Loop    

Add $t1, $s3, $s3

Add $t1, $t1, $t1

Lw $t0, 0($t1)

Bne $t0, $s5, exit

Add $s3, $s3, $s4

J loop

Exit:


 


 


 


 


 


 

  • Branch less than
  • Set on less than
    • Slt a, b, c
    • Set a to 1 when b < c
    • This was on the TEST we wrote TWO DAYS AGO. This guy needs a new job.

Monday, September 15, 2008

Lecture 4

Lecture 4


 

  • Main memory is one long single memory array of cells
    • The cells are bytes
    • These are collapsed into words
      • Instructions are stored in words
      • Words take 4 bytes of main memory
  • Memory access
    • Data from memory to register
      • lw a, r(b)
        • Where 'b' is the base address
        • 'r' is the relative address
          • Relative address MUST be a constant, cannot be a variable
        • 'a' is the absolute (actual) address of register
    • Data to memory from register
      • sw a, r(b)
        • 'b' is base address
        • 'r' is relative address
          • Relative address MUST be a constant, cannot be a variable
        • 'a' is absolute address from register STORE WORD to register from main memory
  • Compile Assignment and using main memory
    • Change c program into MIPS
      • g = h + A[8]
    • MIPS program ( in Arial font)
      • lw $t0, 32($s3)
        • A[8]'s relative address is 32 ( 4*8 )
      • add $s1, $s2, $t0
    • multiplying with mips
      • using previous example how do i get 4*i (without just knowing it)?
        • add $t1, $s4, $s4
          • find four times i
        • add $t1, $t1, $t1
          • find absolute address
        • add $t1, $s3, $t1
          • t1 ß base + relative addresses
            • $s3 is the base address
        • Lw $t0, 0($t1)
          • Load
        • Add $s1, $s2, $t0
          • Add h + A[i]
    • Beq reg1, reg2, label
      • Operands (register1, register2, label)
      • If (r1=r2) then go to L1
    • Bne reg1, reg2, label
      • Operands; (register, register2, label)
      • If r1 != r2 then go to L2

---


 


 


 


 

If ( i == j ) then

    F = g + h

    Else

    F = g – h


 

If ( i != j ) then go to else

    F = g + h;

    Go to exit;

Else

    F = g – h;

Exit

MIPS

bne $s3, $s4, Else

add $s0, $s1, $s2

j Exit #jump to exit

Else: sub $s0, $s1, $s2

Exit:

Loops

Example:

Loop: g=g + A[i]

I=i+j;

If (i != h) go to loop;


 

Loop

  1. Get address of A[i]
  2. Load A[i]
  3. G ßg+A[i]
  4. I ßi+j
  5. If (i != h) go to loop

Steps 1-3 are the body, 4 is the modift, 5 is the stopping rule


 

MIPS

Loop: add $t1, $s3, $s3

Add $t1, $t1, $t1

Add $t1, $s5, $t1    #where $s5 is the base address of the array

Lw $t0, 0($t1)

Add $s1, $s1, $t0

Add $s3, $s3, $s4

Bne $s3, $s2, loop

Exit:


 


 

Thursday, September 11, 2008

Lecture 3


 

  • Compile Two C Assignments
    • C is one level above assembler
    • Change c program segment into MIPS assembly language:

      • a = b + c;
      • d = a – e;
    • Solution
      • Add a,b,c
      • Sub d,a,e

Data Path Memory: MIPS Registers


 

  • Register = STORAGE LOCATION (ONE WORD LONG)
  • 1 word long = 32 bits
  • Numbered in reverse direction
  • 32 registers
    • named (each register has a unique specific name)
    • each register has 32 bits
      • why? We can't have too much memory because this memory is very expensive, and we don't want too few because we want to do interesting things. (broad answer)
    • each register has a specific name
    • there are temporary registers and save registers
      • 7 S registers
        • Addressed like this: $t0
      • 10 T registers
        • Addressed like this: $s0

Main Memory as an Array of Words


 

  • A 'word' is 4 bytes
    • A byte is 8 bits
  • Big Endian is a 32 bit word
  • The array structure G is imbedded in word memory array at Byte 16(base address)
  • Word address of G[3] is base address + offset
  • Base address = 16
  • Offset = relative word * bytes/word = 3*4=12

Address of G[3] = 16+3*4 = 28

Monday, September 8, 2008

Lecture 2

Lecture 2


 

  • Calculator example
    • Hardware
      • Was the calculator
    • Software
      • The user
  • Virtual machine
    • Virtual concept came to being in the 60's when machines started to get more complex
  • Translator
    • Is written in L1
      • The lower level language
    • And translates L2 into L1 so that it is understandable by the machine
    • The difference between L2 and L1 should not be that big
    • L2 should be easier for the users to write in
    • L1 is typically called 'machine language'
  • Layers
    • Packages ( level 6 ) -> problem oriented languages -> assembly language -> OS -> conventional machine -> micro programming -> digital logic (level 0)
  • Levels of representation
    • High level language
      • C programming language
    • Assembly language
      • MIPS
    • Machine Language
      • MIPS
      • 0's and 1's
    • Machine interpretation
      • Hardware architecture description
    • Architecture implementation
      • Logic circuit description

Layered Computer Design


 

  • Series of layers built on the predecessor
    • Allows
      • Independence of design
        • Reduced complexity
        • Easier to understand
        • Easier to design and analyze
    • Hierarchical nature of computer system
    • Essential to design and description
    • Only need to deal with one level at a time
    • Each level has structure and function

Understanding architecture and organization

  • Architecture
    • Set of
      • Data types
      • Operations
      • Features at each level
    • What is visible to the user
      • As memory available
      • We don't care about how the memory is constructed
        • What we care about is that it's available
    • Implementation aspects
      • Such as what kind of chip technology are not part of the architecture
  • Computer architecture
    • The study of how to design parts of a computer that are visible to the programmers
  • Definition for Tanenbaum
    • Computer architecture IS computer organization
  • Definition for Stalling
    • Computer architecture
      • Attributes of a system visible to the programmer
        • Instruction set
        • Data types
        • I/O mechanisms
        • Memory addressing techniques
    • Computer organization
      • Operational units and interconnections that realize that architecture
      • Transparent to the user
        • Control signals
        • Interfaces
        • Memory technology
    • Families of computers
      • Architecture remains the same organization differs

Instruction Set Architecture


 

  • A very important abstraction
    • Interface between hardware and low-level software
    • ISA
      • Acronym: Instruction set architecture
    • Standardizes instructions, machine language bit patterns, etc.
    • Advantage: different implementations of the same architecture
    • Disadvantage: sometimes prevents using new innovations


     

  • Microsoft will do anything to get features and go around structural issues that might prevent new innovation. However there is a price to be paid: it prevents using new innovations and software breaks down regularly
  • The Instruction Set
    • Attributes of a computing system as seen by the programmer, the conceptual structure and functional behaviour, as distinct from the organization of the data flows and controls the logic design and the physical implementation
  • Computer Function
    • Data processing
      • Something that actually changes and transforms the data
    • Data storage
      • Places to store before after and during the data processing
    • Data movement
      • The movement of the data
    • Control
      • Control mechanism found in the center of all these functions
  • Functioning at the lowest level virtual machine
    • No compiling
    • Like interpreter
    • Sequential
  • Main memory
    • Works as a loop    
      • Loop
        • Fetch instruction
        • Execute instruction
      • End loop
  • Datapath
    • Are cyclic
    • Fetch
      • From main memory
      • Through the `datapath`
      • From a set of instructions
    • Sends to registers
      • Which goes to the register memory
      • And is passed off the ALU
        • ALU –acronym: arithmetic logic unit

MIPS language


 

  • Assembly language
    • Higher level than machine language
    • (one step higher than machine language)
  • RISC
    • Acronym: reduced instruction set computer
    • A minimum set of machine or assembly language instruction
    • Minimum set of isa instructions
    • Make each run very fast
    • Concentrate effort
  • Instruction representation
    • Add 4 variables
    • Solution
      • Add a,b,c
        • # a ßb+c
      • Add a,a,d
        • #aßa+d
      • Add a,a,e
        • #aßa+e

s

Wednesday, September 3, 2008

Lecture 1

    • Email protocol
      • From your York computer science student account
      • Subject line cs2021A......
      • Identify yourself in the email by name and student number
    • Office hours
      • CSE3012
        • Monday and Wednesday 1-2pm
    • Phone
      • 416 736 2100
    • Course webpage
    • Grade Weighting
      • 24% - LABS
      • 26% - MIDTERM
      • 50% - FINAL EXAM


 


 

Lecture 1

  • What is a computer?
    • Components
      • Input (mouse, keyboard)
      • Output (display, printer)
      • Memory (disc drives, DRAM, SRAM, CD)
      • Processor
      • Network
  • Technology
    • Processor
      • Logic capacity
        • About 30% per year
      • Clock rate
        • About 20% per year
    • Memory
      • DRAM capacity
        • About 60% per year
      • Memory speed
        • About 10% per year
      • Cost per bit
        • Decreases about 25% per year
    • Disk
      • Capacity
        • ~60% per year
  • Rapidly changing field
    • In order
      • Vacuum tube
      • Transistor
      • IC
      • VLSI
    • Doubling every 1.5 years
      • Memory capacity
        • Ram has grown by 100,000x in 44 years
        • Memory capacity has grown exponentially as a trend
        • Random note: 1GB = 1,073,741,824 bytes
      • Processor speed
        • This is due to advances in technology and organization
        • Processor performance has grown exponentially as a trend
        • Moore's law
          • 2x transistors/chip every 1.5 years
  • Machine language
    • Electronic circuits recognize and execute
      • A limited set of instructions
      • Keep limited and simple to reduce cost and complexity
      • Hard for humans to understand
      • Is in raw digital form
        • Binary digits (0 , 1)
    • Book of instructions
      • 001100 – Store
      • 001110 – Add
      • 111000 – Subtract
      • 101000 - Load
      • This is just an example for a theoretical machine
  • Design Forms
    • Build what we can
      • Implies language
      • Therefore: Machine implies Language


         

    • Decide what we want to do by what we build
      • Implies Machine
      • Therefore: Language implies Machine
      • All programs translate into this set first
      • Designers must decide on the instruction set
    • Translation comes in two forms:
      • Compilation
        • Replace each statement in L2 (L2 is the language built on top of L1 to make it more understandable by humans)
        • At end have program in L1
        • Execute this level L1 set of statements
        • Program in L2 -> compiler -> program in L1
      • Interpretation
        • write program in L1 language that will take as input a program in L2
        • each statement in L2 is translated to L1 statements and executed immediately

Program in L2 -> statement in L2 -> interpreter -> statement in L1 -> go to next statement in L2