Assembly Language CS2253 Owen Kaser, UNBSJ
Assembly Language ● Some insane machine-code programming ● Assembly language as an alternative ● Assembler directives ● Mnemonics for instructions
Machine-Code Programming (or, Why Assemblers Keep Us Sane) ● Compute 10+9+8+7+6+5+4+3+2+1 – Put the constant 0 into R1 – Put the constant 10 into R2 – Add R1 and R2, put the result into R1 – Subtract the constant 1 from R2 and set the status flags – If the Z flag is not set, reset the PC to contain the address of the 3rd instruction above. ● Let's try to make some machine code.
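Before encoding anything, it may help to check that the five steps really compute the sum. The following is a minimal Python sketch of the steps, not ARM code; the function name `sum_down_from` and the variables `r1`, `r2` and `z` are illustrative stand-ins for the registers and the Z flag, and it assumes the starting count is at least 1.

```python
def sum_down_from(n):
    """Mimic the five-step register program: r1 accumulates, r2 counts down.

    Assumes n >= 1 (the slide uses n = 10).
    """
    r1 = 0                 # step 1: put the constant 0 into R1
    r2 = n                 # step 2: put the constant n into R2
    while True:
        r1 = r1 + r2       # step 3: add R1 and R2, result into R1
        r2 = r2 - 1        # step 4: subtract 1 from R2 ...
        z = (r2 == 0)      # ... and set the Z flag if the result is zero
        if z:              # step 5: go back to step 3 only while Z is clear
            break
    return r1


print(sum_down_from(10))   # 10+9+...+1
```

Note that the loop body always runs at least once, just as the machine-code version does: the test comes after the subtraction, not before the addition.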
Put 0 into R1 ● There's a Move instruction, or you could subtract a register from itself, or EOR a register with itself, or... let's use Move. ● Book Fig 1.12 ● cond = 1110 means unconditional ● S=0 means don't affect status flags ● I=1 means constant; opcode = 1101 for Move ● Rn = ???? (unused by Move), say 0000; Rd = 0001 for R1 ● bits 8-11: 0000 (rotate right by 2*0) ● bits 0-7: 0x00 (the constant 0) ● So machine code is 1110 00 1 1101 0 0000 0001 0000 00000000 = 0xE3A01000
Put 10 into R2 ● cond = 1110 means unconditional ● S=0 means don't affect status flags ● I=1 means constant; opcode = 1101 for Move ● Rn = ???? (unused by Move), say 0000; Rd = 0010 for R2 ● bits 8-11: 0000 (rotate right by 2*0) ● bits 0-7: 0x0A (the constant 10) ● So machine code is 1110 00 1 1101 0 0000 0010 0000 00001010 = 0xE3A0200A
● Add R1 and R2, put result into R1 ● Same basic machine code format as Move ● cond = 1110 for “always” ; I=0 (not constant) ● opcode = 0100 for ADD; S=0 (no flag update) ● Rn = R1, Rd = R1 ● shifter_operand = 0x002 for R2 unmolested ● Having fun yet?? ● 1110 00 0 0100 0 0001 0001 0000 0000 0010 = 0xE0811002
● Subtract 1 from R2, result into R2 ● Same basic machine code format as Move ● cond = 1110 for “always” ; I=1 (constant) ● opcode = 0010 for Subtract; S=1 (yes flag update) ● Rn = R2, Rd = R2 ● shifter_operand = 0x001 for 1 rotated right 0 positions ● 1110 00 1 0010 1 0010 0010 0000 0000 0001 = 0xE2522001
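The hand assembly of these four instructions can be double-checked mechanically. Here is a minimal Python sketch that packs the bit fields named on the slides into a 32-bit word; the function name `encode_data_processing` and the constant names are made up for illustration, and the sketch covers only the data-processing format used here.

```python
def encode_data_processing(cond, i, opcode, s, rn, rd, shifter_operand):
    """Pack the fields of an ARM data-processing instruction into 32 bits."""
    return (
        (cond << 28)          # bits 28-31: condition
        | (0b00 << 26)        # bits 26-27: 00 for this instruction format
        | (i << 25)           # bit 25: 1 means the operand is a constant
        | (opcode << 21)      # bits 21-24: which operation
        | (s << 20)           # bit 20: 1 means update the status flags
        | (rn << 16)          # bits 16-19: first operand register
        | (rd << 12)          # bits 12-15: destination register
        | shifter_operand     # bits 0-11: constant or second register
    )


ALWAYS = 0b1110                          # cond value for "unconditional"
MOV, ADD, SUB = 0b1101, 0b0100, 0b0010   # opcodes from the slides

program = [
    encode_data_processing(ALWAYS, 1, MOV, 0, 0, 1, 0x000),  # put 0 into R1
    encode_data_processing(ALWAYS, 1, MOV, 0, 0, 2, 0x00A),  # put 10 into R2
    encode_data_processing(ALWAYS, 0, ADD, 0, 1, 1, 0x002),  # R1 = R1 + R2
    encode_data_processing(ALWAYS, 1, SUB, 1, 2, 2, 0x001),  # R2 = R2 - 1, set flags
]

for word in program:
    print(f"0x{word:08X}")
```

This is, in miniature, the bookkeeping an assembler takes off your hands.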
Maybe Rinse and Repeat ● If the Z flag is not set, we want to go back 2 instructions before this one. ● Book Fig 3.2 ● cond = 0001 means “when Z flag is not set” ● L=0 means “don't Link” (Link changes R14) ● signed offset should be -4. The PC is already 2 instructions ahead of this one, and we want to go back 2 more than that. ● 0001 101 0 111111111111111111111100 = 0x1AFFFFFC ● Are you REALLY having fun yet??
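The offset arithmetic is easy to get wrong by hand, so here is a minimal Python sketch of it. The function name `encode_branch` is hypothetical, and the addresses 16 and 8 simply assume the five-instruction program starts at address 0 with 4 bytes per instruction.

```python
def encode_branch(cond, link, branch_addr, target_addr):
    """Encode an ARM branch; both addresses are assumed to be multiples of 4."""
    # When the branch executes, the PC already reads as branch_addr + 8
    # (2 instructions ahead), so the offset is measured from there.
    byte_offset = target_addr - (branch_addr + 8)
    word_offset = byte_offset // 4            # offsets are counted in instructions
    offset24 = word_offset & 0xFFFFFF         # 24-bit two's complement field
    return (cond << 28) | (0b101 << 25) | (link << 24) | offset24


NE = 0b0001   # "when Z flag is not set"

# The branch is the 5th instruction (address 16); its target is the 3rd (address 8).
word = encode_branch(NE, 0, branch_addr=16, target_addr=8)
print(f"0x{word:08X}")
```

Only the distance between the two addresses matters, so the same encoding results wherever the program is loaded.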
How'd you know the cond codes?
How'd You Know the Shifter Magic?
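For a constant operand, the 12-bit shifter field holds an 8-bit value (bits 0-7) and a 4-bit rotate count (bits 8-11); the 8-bit value is rotated right by twice the count. The Python sketch below searches for such an encoding; the function name `encode_immediate` is made up for illustration, and a real assembler may choose among several valid encodings differently.

```python
def encode_immediate(value):
    """Return a 12-bit shifter operand (rot << 8 | imm8) for a 32-bit constant,
    or None if the constant cannot be written as an 8-bit value rotated right
    by an even amount."""
    value &= 0xFFFFFFFF
    for rot in range(16):
        shift = 2 * rot
        # Undo "rotate right by 2*rot" by rotating the constant left by 2*rot.
        imm8 = ((value << shift) | (value >> (32 - shift))) & 0xFFFFFFFF
        if imm8 <= 0xFF:
            return (rot << 8) | imm8
    return None


print(hex(encode_immediate(10)))           # rot = 0, imm8 = 0x0A
print(hex(encode_immediate(0xFF000000)))   # needs a nonzero rotate count
print(encode_immediate(0x101))             # two set bits too far apart for 8 bits
```

Constants that come back as None here cannot be put directly into a single Move instruction in this format.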
An Assembler ● Rather than making you assemble together all the various bit fields that make up a machine instruction, let's make a program do that. ● You are responsible for breaking the problem down into individual instructions, which will be given human friendly names (mnemonics). ● You give these instruction names to the assembler, along with various other directives (aka pseudo-ops) that control how the assembler does its job. ● It is responsible for producing the binary machine code. ● It also produces symbol table information needed by a subsequent linker program, if you write a multi-module program.
Assembly Language ● You communicate with the assembler via assembly language (mix of mnemonics, directives, etc.) ● Assembly language is line-oriented. ● A line consists of – an optional label in column 1 – an optional instruction or directive (and any arguments) – an optional comment (after a ; ) ● Example: here b here ; create infinite loop. ● “here” is a label that marks a place ● b is a branch instruction, forces the PC to a new location (here).
The Bad News ● Anyone who creates an assembler gets to define their own assembly language (ignoring manufacturer's suggestions). Dialects? ● Textbook shows code for Keil and Code Composer Studio. But we use Crossware's assembler, which is yet another dialect and it's hard to find documentation on it. ● Textbook talks about “Old ARM format” and “UAL format”. Crossware is a mixture (more old).
Our Program in Assembly
mymain  mov r1,#0          ; nice comment, eh?
                           ← mymain is the label; mov is the instruction; # precedes the constant
        mov r2,#10         ; put 10 into r2 (bad comment)
myloop  add r1, R1, r2     ← case insensitive for reg names
        subs r2, r2, #1    ← final s means to affect flags
        bne myloop         ← condition is "ne" (z flag false)
sticky  b sticky           ← so we don't fall out of pgm
        end                ← directive to assembler: you're done
; don't use "end"; it seems to be buggy in Crossware
Register Names ● r0 to r15 (alias R0 to R15) ● SP or sp, aliases for R13 ● LR or lr, aliases for R14 ● PC or pc, aliases for R15 ● cpsr or CPSR (the status registers etc) ● spsr or SPSR, apsr or APSR (later) ● not s0-s3 or a1-a4 (unlike book page 63)
Popular Assembler Directives ● Textbook Section 4.4 describes the set of directives supported by the Keil assembler and the TI assembler. ● Our Crossware assembler is different from both (but closer to Keil). ● Let's look at directives to – set aside memory space for variables/arrays – define a block of code or data – give a symbolic name to a value
Directive to Set Aside Memory ● The SPACE directive tells the assembler to set aside a specified number of bytes of memory. These locations will be initialized to 0. ● It usually has a label, since you need a name to refer to the allocated memory. ● Example – myarray SPACE 100 – myarr2 SPACE 100*4 ← constant expressions are ok ● Later, instructions can load and store things into the chunks of memory by referring to the names used. ● If myarray starts at address 1234, myarr2 starts at 1234+100 = 1334
Use of SPACE ● An assembly language programmer uses SPACE for the same reasons that a Java programmer uses an array.
Directives for Memory Variables ● Use DCB to declare an initialized byte variable. ● DCW for initialized halfword, DCD for word. ● Example
myvar1  DCB 50          ← decimal constant
myvar2  DCB 'x'         ← ASCII code of 'x'
myvar3  DCB 0x55 + 3    ← constant expression
● If myvar1 ends up being at address 1234, then myvar2 will be at 1235 and myvar3 at 1236
Alignment ● DCW assumes you want the memory variable to start at a multiple of 2 (“halfword aligned”) ● DCD assumes you want alignment to a multiple of 4. ● To achieve this, assembler will insert padding. ● If you really want to set aside a word without padding, use DCDU. The “U” is for unaligned. ● There's also DCWU.
Alignment Example
With DCW / DCD (aligned)                   With DCWU / DCDU (unaligned)
v1 DCB 10                                  v1 DCB 10
v2 DCW 20                                  v2 DCWU 20
v3 DCB 30                                  v3 DCB 30
v4 DCD 40                                  v4 DCDU 40
If v1 is at address 3000, then             If v1 is at 3000, then
v2 starts at 3002 (1 byte of padding)      v2 starts at 3001
v3 is at 3004                              v3 is at 3003
v4 starts at 3008 (3 bytes padding)        v4 starts at 3004 (aligned by luck)
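The addresses in this example follow from one rule: before placing an aligned item, round the current address up to a multiple of its alignment. Here is a minimal Python sketch of that rule; the function name `layout` and the two tables are illustrative and only model the padding behaviour described on the slides, not the Crossware assembler itself.

```python
SIZES = {"DCB": 1, "DCW": 2, "DCD": 4, "DCWU": 2, "DCDU": 4}      # bytes occupied
ALIGNMENT = {"DCB": 1, "DCW": 2, "DCD": 4, "DCWU": 1, "DCDU": 1}  # required multiple


def layout(start, declarations):
    """Return {label: address} for a list of (label, directive) pairs."""
    addr = start
    addresses = {}
    for label, directive in declarations:
        addr += (-addr) % ALIGNMENT[directive]   # padding bytes, possibly 0
        addresses[label] = addr
        addr += SIZES[directive]
    return addresses


aligned = layout(3000, [("v1", "DCB"), ("v2", "DCW"), ("v3", "DCB"), ("v4", "DCD")])
unaligned = layout(3000, [("v1", "DCB"), ("v2", "DCWU"), ("v3", "DCB"), ("v4", "DCDU")])
print(aligned)
print(unaligned)
```

The expression `(-addr) % n` is the number of bytes needed to reach the next multiple of `n`, and is 0 when `addr` is already aligned.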
More Alignment Control ● Keil assembler has an ALIGN directive that can force alignment to the next word boundary (inserting 0-3 bytes of padding). ● In Crossware, the directive takes a numeric argument. So ALIGN 4 (or ALIGN 8)
DCB with Several Values ● You can use DCB with several comma-separated values ● Several consecutive memory locations are set aside. A label names the first of them. ● Example: foo DCB 1,2,3,4 ● We can access the location initialized to 3 as “foo+2” ● A quoted string is equivalent to a comma separated list of ASCII values. DCB “XY” is same as DCB 'X','Y' or DCB 88,89 ● DCW and DCD can also take a comma-separated list. ● Common use: make a small initialized table.
DCB: Signed or Unsigned? ● DCB's argument must be in the range -128 to +255. ● -ve values are 2's complement ● +ve values are treated as unsigned ● So DCB -1, 255 is same as DCB 255, 255 ● Similarly DCW's arguments in range -32768 to +65535. ● DCD from -2^31 to +2^32 - 1
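The DCB rules from the last two slides (comma-separated values, strings as lists of ASCII codes, and the -128 to +255 range) can be sketched in a few lines of Python. The function name `dcb_bytes` is hypothetical; it only models the rules as stated here, not the assembler's actual implementation.

```python
def dcb_bytes(*values):
    """Return the bytes that a DCB with these arguments would set aside."""
    out = bytearray()
    for v in values:
        if isinstance(v, str):
            out.extend(v.encode("ascii"))   # a string is a list of ASCII codes
        elif -128 <= v <= 255:
            out.append(v & 0xFF)            # negative values become 2's complement
        else:
            raise ValueError(f"{v} does not fit in a byte")
    return bytes(out)


print(list(dcb_bytes(1, 2, 3, 4)))   # the location at offset 2 ("foo+2") holds 3
print(list(dcb_bytes("XY")))         # same bytes as DCB 'X','Y'
print(list(dcb_bytes(-1, 255)))      # both arguments give the same byte
```

Masking with 0xFF is what makes -1 and 255 indistinguishable once they are in memory; whether a byte is signed or unsigned is decided by the instructions that later read it.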
AREA directive ● In general, an assembly language program can have several blocks of data and several blocks of code. And it can be written in several different source-code files. ● The AREA directive marks the beginning of a new block. You give it a new name and specify its type. – e.g., AREA fred,code – You can go back to a previous area by using an old name ● A tool called a linker runs after the assembler to put your various sections (and any library routines you need) into a single program. ● Much more on linkers later in the course
AREA Example
        AREA mycode,code
foo     add R1, R2, R3
        add R4, R5, #10
        AREA mydata, data
var1    dcb "cs2253"
        AREA mycode        ← continues mycode where it left off
        add R6, R7, R8
This feature lets us show our data declarations near the code that uses them (maybe good software engineering), even if the different sections end up being far apart in memory. Memory picture on board...
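One way to picture "continues mycode where it left off" is that the assembler keeps a separate pile of lines per area name and appends to whichever pile is current. The Python sketch below does just that for the example above; the function name `group_by_area` is made up, the parsing is deliberately naive, and it assumes the source begins with an AREA line.

```python
def group_by_area(lines):
    """Collect source lines under the area that was current when they appeared."""
    areas = {}
    current = None
    for line in lines:
        words = line.replace(",", " ").split()
        if not words:
            continue                          # skip blank lines
        if words[0].upper() == "AREA":
            current = words[1]                # switch to (or create) this area
            areas.setdefault(current, [])
        else:
            areas[current].append(line.strip())
    return areas


source = [
    "        AREA mycode,code",
    "foo     add R1, R2, R3",
    "        add R4, R5, #10",
    "        AREA mydata, data",
    'var1    dcb "cs2253"',
    "        AREA mycode",
    "        add R6, R7, R8",
]

for name, body in group_by_area(source).items():
    print(name, body)
```

All three add instructions end up together under mycode, even though the data declaration sits between them in the source text.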
Code in Data, Data in Code ● Q: Is this allowed; if so, what does it do?
          AREA mycode, CODE
starthere add R1, R2, R3
          DCD 0x1234567        ; this line is fishy
          add R2, R3, R4
          AREA mydata, DATA
var1      DCD 1234
var2      add R2, R3, R4       ; this line is also fishy
var3      DCB "hello world",0