Portable Code

Writing non portable code in the x86 assembly programming language

Print That Register

Posted by portablecode on February 25, 2012
Posted in: coding example. Tagged: asm, assembler, assembly, print register, programming, register, x86 asm, x86 assembler.

When learning assembly programming, I found it extremely beneficial to be able to print registers. This is actually a very good program to write yourself while learning asm, but I will post it here anyway. Printing a string is very different from printing a binary register.  If a register holds the value 1AB3529F and I write those four bytes out as they are, they show up as “..R.”, where each “.” stands for a non-displayable character. Only the byte 52 hex maps to a printable character, “R”.
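
To make the “..R.” example concrete, here is a minimal sketch in C rather than assembly. The helper name raw_bytes_as_text is made up for illustration and is not part of the original program. It takes the bytes most significant first, the same order in which the hex value is written; bytes stored to memory on x86 would appear in the reverse order.

```c
#include <stdint.h>

/* Hypothetical helper (not from the post): render the four bytes of a
   32-bit value as characters, most significant byte first, using '.'
   for anything outside printable ASCII (0x20..0x7E).
   out must have room for 5 bytes (4 characters plus the NUL). */
static void raw_bytes_as_text(uint32_t value, char out[5])
{
    for (int i = 0; i < 4; i++) {
        /* shift the wanted byte down to the low 8 bits */
        unsigned char b = (unsigned char)(value >> (24 - 8 * i));
        out[i] = (b >= 0x20 && b <= 0x7E) ? (char)b : '.';
    }
    out[4] = '\0';
}
```

Calling it with 0x1AB3529F fills the buffer with “..R.”, which is why a routine that converts the value to digits is needed.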

We are going to write a very short routine that I will call printr (for print register).  We are also going to write an even shorter driver program that will call printr and have it print our register that we pass to it.  Let us start with the driver.

; Executable name - NONE
; Version         - 1.0
; Created Date    - 20111108
; Author          - Jason Torola
; Description     - A driver program to test our print routine
; Regs used       - the value to be printed is passed to printr
;                   eax  - number to be printed
;

extern printr

section .data
section .bss
section .text

  global _start

_start:
  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
  ;  begin logic                   ;
  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

  mov    eax,01f039453h    ; Get number to print
  call   printr            ; Call our print routine

  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
  ; end logic                      ;
  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
Exit:
  mov    eax,1             ; sys_exit
  mov    ebx,0             ; Set RC = 0
  int    80h               ; and return

As you can see, this is a very basic driver program.  All it does is call our routine and exit. Call is a special instruction: it pushes the address of the instruction that follows it onto the stack (in our case, `mov  eax,1`) and then transfers execution to the address represented by the label printr.  Ret is the instruction that is paired with Call.  Ret pops the address that Call pushed onto the stack and transfers execution to that address.  It is very simple.

Somebody very wise told me (and still tells me actually), “It isn’t magic, it’s all just bits and bytes.”  He is very correct. It may look like it is doing some sort of voodoo magic to get its work done, but when you look under the covers, it’s all very simple.

Let us look at our very simple print routine.

; Executable name - NONE
; Version         - 1.0
; Created Date    - 20111108
; Author          - Jason Torola
; Description     - A print routine for Linux
; Regs used       - the value to be printed is passed in
;                   with eax
;
section .data
section .bss
  output     resb 10         ; Output buffer that will hold our number
  carriage   resb 1          ; Newline character
section .text
  global printr
printr:
  pushad                     ; Save all regs
  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
  ;  begin logic                    ;
  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
  mov    edi,eax             ; save number to be printed

  mov    ecx,10              ; base 10 output
  lea    ebx,[output+9]      ; set output field pointer to last byte
  mov    [carriage],byte 10  ; move newline character to output field

;   Divide the number you want to print by 10. The remainder is the
;   last digit; divide the quotient again to get the next digit, and
;   so on. E.G. if your number is 936
;      936/10 = 93 rem 6
;      93/10  = 9  rem 3
;      9/10   = 0  rem 9
;
pt_loop01:
  xor    edx,edx             ; clear for divide
  div    ecx                 ; Divide edx:eax by 10

;   In order to convert the decimal remainder to character,
;   you simply add character 0 to the number. For example
;   if your decimal number is 6 you take
;   6 + '0' = 6 + 48 = 54 = '6'
  add    edx,'0'             ; convert digit to char
  mov    [ebx],byte dl       ; save character to output field
  dec    ebx                 ; decrement output pointer

;   Done printing? We are done when our number is zero.
  test   eax,eax             ; Is our number zero?
  jnz    pt_loop01           ;  no, divide again

pt_done:
  inc    ebx                 ; get true start of number

;   Print value
  mov    eax,4               ; sys_write
  mov    ecx,ebx             ; get output buffer location
  lea    edx,[output+11]     ; pointer just past the newline byte (carriage)
  sub    edx,ebx             ; subtract both pointers to get length
  mov    ebx,1               ; STDOUT
  int    80h
  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
  ; end logic                       ;
  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
Exit:
;   Restore regs that were used
  popad
  ret                        ; return to caller

Notice that printr does not do a sys_exit. A sys_exit would end the whole program and return control to the operating system, which is not what I want here. The return address is still on the stack, and the caller could still have more work to do, so the routine ends with ret instead.

If you look closely, you will also notice that this routine uses a “do while” loop.  This is important because with a “while” loop we would not be able to print a register that is zero: the loop body would never run, no digit would be stored, and only the newline character would be printed.
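
For comparison, the conversion loop can be sketched in C. This is only an illustration of the same idea, not a translation the author published, and the function name u32_to_decimal is hypothetical. Like printr, it fills the buffer from the right and uses a do-while so that zero still produces one digit.

```c
#include <stdint.h>

/* Hypothetical C counterpart of printr's conversion loop.
   Fills buf from the right, as the assembly does, and returns a
   pointer to the first digit. buf must hold at least 11 bytes
   (up to 10 digits for a 32-bit value, plus the NUL). */
static char *u32_to_decimal(uint32_t n, char buf[11])
{
    char *p = buf + 10;
    *p = '\0';
    do {                              /* body always runs once, so 0 gives "0" */
        *--p = (char)('0' + n % 10);  /* remainder is the next digit */
        n /= 10;                      /* quotient goes around again */
    } while (n != 0);
    return p;
}
```

With a plain while loop the body would be skipped for n equal to zero and the function would return an empty string, which is the problem described above.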

Let’s see how it turns out

jason@fitz:~/programming/programs/printr$ make
nasm -w+all -f elf -l printr.lst printr.asm
nasm -w+all -f elf -l driver.lst driver.asm
ld -m elf_i386 -o run printr.o driver.o
jason@fitz:~/programming/programs/printr$ ./run
520328275
jason@fitz:~/programming/programs/printr$

As we can see, 1F039453 hex is indeed 520328275 decimal.

There is another way to print numbers which involves a very useful algorithm that will allow somebody to create their own bignum class. Using this method I was able to print numbers with as many digits as my computer had memory to hold.  This was all in assembly.  Remember, it’s all just bits and bytes.

Jason Torola

Actual Work

Posted by portablecode on February 23, 2012
Posted in: coding example. Tagged: assembly, programming, x86 asm, x86 assembly.

Well now, we have seen a basic hello, world program and how it works.  I’m not going to get into what instructions are what in this blog (unless of course I find one worth talking about). This post is to show you what an assembly program can do.

This is problem 1 from Project Euler (http://projecteuler.net/problems). It says “Add all the natural numbers below one thousand that are multiples of 3 or 5.”

A few things to note here.

  1. We want to find numbers divisible by 3 or 5. If it is divisible by 3 and 5, we only count it once
  2. I use my printj asm routine to print registers edx:eax.  I wrote this routine myself and will post it soon.
  3. This code is simply to show how easy it is to do work in assembly. A lot of people can’t imagine how work is done in assembly. They need to visualize it.
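
As a point of comparison, here is the same logic as a rough C sketch. The function name sum_multiples_3_or_5 is invented for this example. It counts down from the limit the way the assembly does, and the else-if mirrors the jump that skips the check for 5 once a number has already been added for 3, so a number like 15 is only counted once.

```c
#include <stdint.h>

/* Hypothetical C counterpart of the assembly loop: walk down from
   limit - 1 to 1 and add every number divisible by 3 or 5. */
static uint32_t sum_multiples_3_or_5(uint32_t limit)
{
    uint32_t total = 0;
    if (limit < 2) {
        return 0;                     /* nothing below 1 to add */
    }
    for (uint32_t n = limit - 1; n != 0; n--) {
        if (n % 3 == 0) {
            total += n;               /* divisible by 3: add, skip the 5 check */
        } else if (n % 5 == 0) {
            total += n;               /* divisible by 5 only */
        }
    }
    return total;
}
```

Calling sum_multiples_3_or_5(1000) covers the numbers 999 down to 1, the same range the assembly program walks.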

; Executable name - NONE
; Version         - 1.0
; Created Date    - 20110722
; Author          - Jason Torola
; Description     - nasm program for Linux that adds all the
;                   natural numbers below 1,000 that are
;                   divisible by 3 or 5
;
extern printj

section .data
section .bss
section .text
    global _start
_start:
    nop                      ; Keep debugger happy
;   Save regs that will be changed
    pushad
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;  begin logic                        ;
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    mov    ebx,999           ; numbers below 1000
    xor    ecx,ecx           ; zero total

    mov    esi,5             ; Divisor 1 to check divisibility
    mov    edi,3             ; Divisor 2 to check divisibility

; Go through each number below 1,000, starting at 999,
; and add all that are divisible by 3 or 5
loop01:
    xor    edx,edx           ; clear high reg for divide
    mov    eax,ebx           ; get value into reg for divide
    div    edi               ; and divide

;   Check if natural number is divisible by 3
    test   edx,edx           ; number divisible by 3?
    jnz    check5            ;  no, check 5
    add    ecx,ebx           ;  yes, add to total
    jmp    next              ; check next number

;   Check if natural number is divisible by 5
check5:
    mov    eax,ebx           ; get value into reg for divide
    xor    edx,edx           ; clear high reg for divide
    div    esi               ; and divide

    test   edx,edx           ; number divisible by 5?
    jnz    next              ;  no, get next number
    add    ecx,ebx           ;  yes, add to total

;   Check next natural number
next:
    dec    ebx               ; decrement natural number
    jnz    loop01            ; loop if not zero

    ; Print total
    xor    edx,edx           ; clear high order reg
    mov    eax,ecx           ; get total into reg to print
    call   printj            ; call print

    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ; end logic                           ;
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
Exit:
    nop
;   Restore regs that were used
    popad
    mov    eax,1             ; sys_exit
    mov    ebx,0             ; Set RC = 0
    int    80h               ; and return

A few more things to note.  Notice how virtually every line is commented.  I even have comments before code blocks that do a certain thing.  If you break up your code in this way, it will become much easier to read (as well as maintain).  More importantly it will become much more fun to code (since that is the whole point of coding, right?).

When I run the program I get the following which is the correct answer.

jason@fitz:~/programming/euler/prob001$ make
nasm -w+all -f elf -l prob001.lst prob001.asm
ld -m elf_i386 printj.o prob001.o -o run
jason@fitz:~/programming/euler/prob001$ ./run
233168
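
The result can be cross-checked without a loop. The sketch below is a different technique from the one in the program above: it adds the multiples of 3 and of 5 using the formula for 1 + 2 + ... + m, then subtracts the multiples of 15, which would otherwise be counted twice. The function names are hypothetical, and the arithmetic is only meant for small limits like this one, where nothing overflows 32 bits.

```c
#include <stdint.h>

/* Sum of all positive multiples of k strictly below limit:
   k * (1 + 2 + ... + m), where m = (limit - 1) / k.
   Assumes limit >= 1 and k >= 1. */
static uint32_t sum_multiples_below(uint32_t k, uint32_t limit)
{
    uint32_t m = (limit - 1) / k;
    return k * (m * (m + 1) / 2);
}

/* Hypothetical cross-check for Project Euler problem 1. */
static uint32_t euler1_closed_form(uint32_t limit)
{
    return sum_multiples_below(3, limit)
         + sum_multiples_below(5, limit)
         - sum_multiples_below(15, limit);   /* counted twice above */
}
```

For a limit of 1000 the three terms are 166833, 99500 and 33165, and 166833 + 99500 - 33165 = 233168, which agrees with the program's output.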

Jason Torola

Why Learn Assembly Programming

Posted by portablecode on February 17, 2012
Posted in: Uncategorized. Tagged: asm, assembler, assembly, assembly programming, Jason Torola, x86, x86 asm, x86 assembler, x86 programming.

This is a question I do not like.  Actually I do not like most questions that begin with “Why”.  The reason is that this question, like most that begin with “Why”, is usually asked with disdain.  “Why would you want to do something like that?”  “Why do you want to build a Linux kernel?” “Why do you care how many cycles an instruction takes?”

These are not nice questions.  They are not asked out of curiosity.  They are meant to say that what you are doing or asking about is beneath the person asking.

However, if this is a question that is just a question, then it is a great question.  It is a question that hasn’t been answered for me until just recently.  I read an article, I believe it was written by a man named Randall Hyde.  Randall essentially said the following. I am unable to find the article so I will summarize.

“It used to be that when you wanted a faster, more efficient program, you wrote it in assembly language.  With the advent of better and more efficient compilers, this was no longer the case.  It got to the point where compilers were better at doing things than most programmers were.  These old assembly programmers said it was no longer useful to program in assembly.”

“The thing is, these guys already knew assembly.  They already reaped the benefits from learning the language.  When a new generation came in and listened, they never got the knowledge that learning assembly gives.”

By learning assembly I gained a much deeper knowledge of data structures, memory, and architecture.  Using this knowledge, I came to understand much better how to write good data structures in higher level languages.  My memory management in those languages got much better, and I became much more careful about the code I was writing.

Code was not a mystery anymore.  I knew what it was underneath.  I knew what C and C++ statements turned into.  I knew what happened when you called a routine, and I knew how that routine was called.  I learned a lot about the stack and how it worked. I am a better programmer because of Assembly.

When I find the article I will post it.

Jason Torola

hello, world

Posted by portablecode on November 30, 2011
Posted in: coding example. Tagged: asm, assembler, assembly, assembly programming, Jason Torola, x86, x86 asm, x86 assembler, x86 programming.

One of the first things a programmer looks for when starting a new language, or when they have written their very first computer program is visual results. We want something we can see, we want to know our programs are doing something.  That is why ‘hello, world’ programs exist.

x86 assembler is no different.  The first thing I did was to attempt to print characters to my terminal (‘hello, world’ in fact).  Let’s look at what I wrote.

; Written using nasm
; assembled and linked using the following commands
; nasm -f elf hello.asm
; ld -m elf_i386 -o run hello.o

section .data
    string  db  'hello, world',10
    strlen  equ $-string
section .bss

section .text

    global _start

_start:
    nop
;   Save regs that will be changed
    pushad
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;  begin logic                                  ;
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

    mov    eax,4             ; sys_write
    mov    ebx,1             ; STDOUT
    mov    ecx,string        ; output buffer
    mov    edx,strlen        ; buffer length
    int    080h

    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ; end logic                                     ;
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
Exit:
    nop
;   Restore regs that were used
    popad
    mov    eax,1             ; sys_exit
    mov    ebx,0             ; Set RC = 0
    int    80h               ; and return

One of the first things you should notice is that there are three “sections”: data, bss and text.  Without going into too much detail, the three sections are as follows.

The data section holds your initialized data: values that are defined in your program and are actually stored in your executable. So the more you put in this section, the larger your executable will be.

The bss section holds your uninitialized data.  Only the amount of space is recorded in the executable; the memory itself is provided when the program is loaded, so it does not make the file bigger.  This is useful when you don’t have the data yet but will once your program runs, which makes it a good place for buffers.

The text section is where your instructions or “code” goes.  This is where your program does its work.
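
The same three kinds of storage show up in a C program, which may help if you come from that side. The sketch below is an illustration only: the names greeting, scratch and count_nonzero are made up, and where each object lands is up to the compiler and linker, although initialized data, zero-filled storage and code usually go to .data, .bss and .text respectively.

```c
#include <stddef.h>

/* Initialized data: these bytes are stored in the executable,
   much like `string db 'hello, world',10` in section .data. */
static char greeting[] = "hello, world\n";

/* Uninitialized static storage: typically placed in .bss. C
   guarantees it starts out as all zero bytes. */
static char scratch[4096];

/* Code: functions are what ends up in the text section. */
static size_t count_nonzero(const char *p, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (p[i] != 0) {
            count++;
        }
    }
    return count;
}
```

Here count_nonzero(scratch, sizeof scratch) is 0 before anything has been written to the buffer, while greeting already carries its 13 characters at program start.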

So, how does this program work exactly?  It uses something called a system call (syscall), which this program requests with a software interrupt.  Linux (like all operating systems, for that matter) has jobs that need special access.  Your program doesn’t have that access, so it asks the kernel to do them on its behalf.  The request is made by loading registers and then raising interrupt 80h.  For sys_write, four registers are loaded: the call number plus three arguments.

    mov    eax,4             ; sys_write
    mov    ebx,1             ; STDOUT
    mov    ecx,string        ; output buffer
    mov    edx,strlen        ; buffer length
    int    080h

  1. When EAX is 4, the kernel knows to do a sys_write, or write output.
  2. When EBX is 1, the kernel knows this is STDOUT or standard output.
  3. ECX holds the address of the output string.
  4. EDX holds the string length in bytes.
  5. We then raise the interrupt with the int 080h instruction, which hands control to the kernel.
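
In C the same request is normally made through the write function from unistd.h, which sets up the system call for you. The sketch below assumes a POSIX system; the helper name say_hello is hypothetical. The three arguments line up with the values the assembly loads into EBX, ECX and EDX.

```c
#include <unistd.h>

/* Hypothetical helper mirroring the register setup in the post. */
static ssize_t say_hello(void)
{
    static const char msg[] = "hello, world\n";
    /* file descriptor 1 = STDOUT   (EBX in the assembly)
       address of the buffer        (ECX)
       number of bytes to write     (EDX), not counting the NUL */
    return write(STDOUT_FILENO, msg, sizeof msg - 1);
}
```

The return value is the number of bytes written, or -1 on error. The assembly version gets the kernel's result back in EAX, although the program above does not check it.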

We have another interrupt at the end of the program.

    mov    eax,1             ; sys_exit
    mov    ebx,0             ; Set RC = 0
    int    80h               ; and return

This is similar to the last interrupt, only it loads just two registers.

  1. When EAX is 1, the kernel does an exit call (sys_exit).
  2. EBX is the return code.  This value is handed back to whoever started the program, such as the shell.
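
To see the return code arrive somewhere, here is a small sketch in C for a POSIX system. The helper name child_exit_status is made up for this example. It starts a child process that exits immediately with a chosen code and has the parent read that code back.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code`, then
   return the status the parent observes, or -1 on failure. */
static int child_exit_status(int code)
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;                  /* fork failed */
    }
    if (pid == 0) {
        _exit(code);                /* child: the sys_exit step */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) {
        return -1;                  /* wait failed or child did not exit normally */
    }
    return WEXITSTATUS(status);     /* low 8 bits of the child's exit code */
}
```

From a shell the same value is visible in $? right after the program finishes.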

And when this program is run, it looks like every other hello, world program ever written.

jason@fitz:~/programming/programs/helloworld$ ./run
hello, world

Jason Torola
