The Art of Assembly Language, 2nd Edition

Randall Hyde

Mentioned 3

Presents an introduction to High Level Assembler, covering such topics as editing, compiling, and running HLA programs; declaring and using constants; translating arithmetic expressions; and converting high-level control structures.


Mentioned in questions and answers.

I've been programming for about 11 years by now, and used a lot of different programming languages ranging from Python to C.

However, what I'm ashamed of is that I'm still missing a lot of the basic lower-level knowledge on which all of this is built:

  • How exactly are the stack and heap of an executable built up, and how do they work

  • How does a CPU work

  • What is a clock cycle

  • What is a data bus

  • How do the northbridge and southbridge on my motherboard work

  • Low level binary logic / calculations

Those are just examples. What I'm looking for is a good introduction to these topics, as I feel this is simply required knowledge to become a good programmer.

I'm sure there are online resources for this type of thing, but this is also pretty nicely covered in a Computer Architecture course like this one. I also rather liked the book for that course.

However, it didn't really cover enough of the practical x86 side of things for my liking (we designed a MIPS processor and wrote assembly code for it and eventually a C compiler for it).

To fill in the gaps between our contrived example and my actual machine, I suggest the Windows Internals book, and possibly taking an OSR course.

If you're more on the Linux side, there are similar courses and books.

Two suggestions.

Some books:

Windows Internals (though not all of the information applies to other OSes, obviously)

Write Great Code: Volume 1 (and perhaps subsequent volumes)

The Art of Assembly Language (ties in with 2nd suggestion)

Learn assembly language:

Assembly language is very low-level. In fact, it's just a human-readable form of machine code (the ones and zeros that CPUs understand). To understand assembly language, you must understand the low-level workings of the machine, because very little (if anything) is managed for you automatically, unlike in higher-level languages such as C# and Java.

In my opinion, the best way to learn this is by having fun. Learning compilers, system design, and architecture is a lot more fun when you work with microprocessor interfacing, so my suggestion is to get hands-on with an Atmel AVR kit or a Texas Instruments MSP430 kit. Another starting point is to write a small simulator in any language you like and simulate the SRC (Simple RISC Computer) from this material, which is from this book.

This is the project I made in class using an MSP430; again, it was a lot of fun.

Everything I've seen on *nix has been a set of abstractions off hardware, but I'm curious as to how the hardware works.
I've programmed in assembly, but that's still only a set of abstractions.

How does a processor understand assembly opcodes (as bytecode)?
How do device drivers work (with an explanation at a lower level (of abstraction))?

"The Art of Assembly" is a good, yet kind of outdated book with explanations on pretty much everything hardware and low-level. You should give it a read.

It's available legally online and in print form.

The book, online

On Amazon

EDIT: Commenter Samoz mentions a new edition, so now it's probably up to date!

I was reading The Art of Assembly Language (Randall Hyde, link to Amazon) and I tried out a console application from that book. It is a program that creates a new console for itself using Win32 API functions. The program contains a procedure called LENSTR, which stores the length of a string in the EBX register. The code for this procedure is as follows:

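LENSTR PROC
    ENTER 0, 0                      ; Set up a stack frame with no locals
    PUSH  EAX                       ; Preserve EAX (AL is used below)
;----------------------
    CLD                             ; Scan forward through memory
    MOV   EDI, DWORD PTR [EBP+08H]  ; EDI = pointer to the string (the argument)
    MOV   EBX, EDI                  ; Remember the start of the string
    MOV   ECX, 100                  ; Limit the string length
    XOR   AL, AL                    ; AL = 0, the byte to search for
    REPNE SCASB                     ; Find the 0 character
    SUB   EDI, EBX                  ; String length including 0
    MOV   EBX, EDI
    DEC   EBX                       ; EBX = length without the terminating 0
;----------------------
    POP   EAX                       ; Restore EAX
    LEAVE                           ; Tear down the stack frame
    RET   4                         ; Return and pop the 4-byte argument
LENSTR ENDP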

Could you explain the use of the ENTER and LEAVE instructions here?

ENTER sets up the stack frame (activation record) for the function, and LEAVE tears it down again. ENTER is normally equivalent to something like this (in HLA syntax):

push( ebp );         // Save a copy of the old EBP value
mov( esp, ebp );     // Get ptr to base of activation record into EBP
sub( NumVars, esp ); // Allocate storage for local variables.

Then, when the stack frame is to be destroyed again, you do something along the following lines. LEAVE performs the first two instructions; the return itself is a separate instruction:

mov( ebp, esp );     // Deallocate locals and clean up stack.
pop( ebp );          // Restore pointer to caller's activation record.
ret();               // Return to the caller.

A better explanation of it, using HLA, is here. It is also well explained in the book you're reading; I have that book too, and I've read the section that covers it.