Intro to Intel Assembly Language: Part 2
03-14-2010, 03:29 AM
(This post was last modified: 03-14-2010 03:32 AM by se7en.)
Post: #1
First read this: Intro to Intel Assembly Language: Part 1

With my last tutorial I showed you how to move bytes around in the registers with the mov, movzx, and movsx instructions. That's great, but it doesn't do much, especially since you can't even touch memory. In this tutorial I'm going to teach you the basic addressing modes, how to use them, and when you can't.

Warning: I'm going by the IA32 architecture here. Most of this will apply to x86-64 (Intel 64) as well, but not everything. (That is not the same thing as IA-64, which is Itanium and a completely different instruction set.) When I remember I'll point out the differences, but we won't hit the first for another tutorial or so. Don't even try this on a 16-bit processor, it'll explode.

Another thing I forgot to mention: I'm using Intel syntax for this. AT&T syntax reverses the order of the operands and obfuscates addressing for the programmer who uses it, in order to make life easier for the clown who writes the compiler for it.

Memory Models

The modern Intel processor can be programmed to view memory in several different ways - two and a half, actually. Each of these memory models dictates an algorithm for how the processor accesses memory. The two and a half models are:

Segmented (Real Mode)

Designed to behave like the 16-bit 8086. (Virtual 8086 mode, which lets a protected-mode operating system run old 8086 programs, calculates addresses the same way.) All Intel processors start in real mode when they power up; it's the operating system's job to switch into protected mode later. (Hold on...I'll explain in a sec.) Without getting into too much detail, real mode is quite bare as far as modern processor features go - no memory protection, paging, multitasking, not even instruction privilege levels. There's no distinction between user code and operating system code here.

What does this have to do with addressing? Well, the processor behaves like an 8086 in one key way: the way addresses are calculated. The 8086 view of memory is segmented, i.e. memory is divided into blocks of up to 64K each. (I say up to because the programmer could mess with it to make it smaller.)
Because the 8086 was a 16-bit processor, this meant that a memory pointer would consist of a 16-bit segment and a 16-bit offset. Oops, the designers thought. That'd give us access to 4GB of memory! (Remember, this was a few years before Bill Gates allegedly made his infamous 640K comment.) So what did they decide to do? Scrunch the 32 bits into 20, which is enough to address 1MB. How? Like this:

ADDRESS = (SEGMENT << 4) + OFFSET

Immediately we have problems here...this means that 0000:1000, 0001:0FF0, 0002:0FE0, ... 0100:0000 all point to the exact same byte, physical address 0x01000. In fact, a single byte can have up to 4096 aliases! No wonder that didn't last too long as the main addressing scheme.

Flat Model

In this mode, memory is treated like one honking big array of bytes. No segments, no shifting, no duplicate addresses - just pure and simple. It requires some more work on the part of the operating system and the processor designers to get this to work on a multitasked system, though, because if unmodified that would mean that application A could easily overwrite application B, maliciously or accidentally.

Paged Model (Protected Mode)

(Told you I'd explain.) Protected mode is halfway between the flat model and the segmented model. Memory is segmented, but the segment (still 16 bits even today) is now a selector that merely points into a table. The table entry contains a base address in memory, as well as protection bits describing what is contained in the segment (code or data), access rights (read, write, execute) and the privileges needed to access the segment. Operating system memory protection builds on this, together with paging, a second translation step that sits on top of the segments. (DEP, for instance, is enforced per page rather than per segment.) For example:

49C0:8598E31F --> the processor looks up the table entry selected by 49C0 and finds:

PRIVILEGE: 3 (this is a user-level segment)
ACCESS: RW
TYPE: DATA
BASE: 0x00002138
LIMIT: 0xFFFFFFFF (highest valid offset)

The offset 0x8598E31F is within the limit, so the access goes ahead. The resulting address is 0x00002138 + 0x8598E31F = 0x85990457; with paging turned off, that is exactly what gets sent to the bus.

Notice that the shifting is gone: the base in the table is already a full 32-bit address, so it is simply added to the offset. Since we're not restricted to 20 bits anymore, addresses don't wrap around at 1MB like they used to, and we can reach the full 4GB. With paging and PAE turned on, the processor can in theory use up to 64GB of physical memory, if the address bus is wide enough to allow it. Even then, some operating systems won't be able to handle the extra memory, and no single program can see more than 4GB at once, because pointers are still 32 bits.

Accessing Memory

The IA32 architecture supports about eight distinct addressing modes. As you've probably guessed, these are different ways of accessing memory. You can choose to use one or another depending on what best suits your application.

Immediate Addressing

16-bit? Yes (16-bit addresses only) | 32-bit? Yes | 64-bit? Yes

This is by far the simplest - moving data to and from a hard-coded address. (Most references call this direct or absolute addressing; "immediate" normally describes a constant operand, like the 1337 used below.) Hard-coded addresses are typically found in BIOS interrupt routines and firmware, where code and data can be relied upon to be where they need to be. For this kind of access, the default segment is specified by the DS register unless explicitly specified otherwise. This is important to remember in real and protected mode, as it can mean the difference between your application working or overwriting something it shouldn't and crashing.

Code:
; *((uint32_t *)0xDEADBEEF) = eax;
mov [0xDEADBEEF], eax

Did you notice something strange with the third example, the one that stores 1337? What's with the WORD PTR stuff? This indicates to the assembler that we intend to represent 1337 in 16 bits, as opposed to 32, 64, or 80 bits. (Yes, 80. I'll get to that much later.) I didn't have to do this in either of the first two examples because the assembler knows the sizes of the registers; since the mov instruction requires operands to be the same size, it can figure everything out. But 1337 could be 0x0539, 0x00000539, or 0x0000000000000539, for all it knows. Hence we tell it that we want 16 bits. We just as easily could've put:

Code:
mov DWORD PTR [0x01234567], 1337

This would force 1337 to be represented as a 32-bit integer.
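To make the size ambiguity concrete, here is a small Python sketch of the bytes a store of 1337 would put in memory at 16, 32 and 64 bits. The SIZE_DIRECTIVES table and the encode helper are names made up for this illustration. The only real API used is struct.pack from the standard library, with little-endian formats because x86 stores integers least-significant byte first.

```python
import struct

VALUE = 1337  # the constant from the tutorial; 0x539 in hex

# Hypothetical lookup table for this sketch: size directive -> struct format.
# "<" selects little-endian byte order, which is what x86 uses in memory.
SIZE_DIRECTIVES = {
    "WORD PTR": "<H",   # 16 bits
    "DWORD PTR": "<I",  # 32 bits
    "QWORD PTR": "<Q",  # 64 bits
}


def encode(directive: str, value: int) -> bytes:
    """Return the bytes a store of `value` with this size directive would write."""
    return struct.pack(SIZE_DIRECTIVES[directive], value)


for name in SIZE_DIRECTIVES:
    data = encode(name, VALUE)
    print(f"{name:9} -> {len(data)} bytes: {data.hex()}")
```

The same number turns into two, four or eight bytes depending on the directive, which is why the assembler refuses to guess when neither operand is a register.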
The size directives are:

BYTE PTR (8 bits)
WORD PTR (16 bits)
DWORD PTR (32 bits)
QWORD PTR (64 bits)
TBYTE PTR (80 bits - only used in floating-point code.)

*Belated side note: Intel assembly language mnemonics, register names and directives are case-insensitive, i.e. mov, Mov, and MOV are all the same thing. (Labels and other names you define yourself may be case-sensitive, depending on the assembler.)

Register-Indirect Mode

16-bit? Somewhat (Fewer registers than on IA32/x86-64) | 32-bit? Yes | 64-bit? Yes

Register-indirect mode is like using a pointer variable in C/C++; the register serves as the offset, and the corresponding default segment register is the segment, unless overridden. The default segments are:

EAX, EBX, ECX, EDX, ESI, EDI --> default to DS
EBP, ESP --> default to SS

Note that on 16-bit processors, you are limited to only BX, BP, SI and DI. The default segment registers are still the same.

Code:
; eax = the dword at ss:ebx (in real mode, that's address (ss << 4) + ebx)
mov eax, ss:[ebx]

Wait...can we copy data straight from one memory location to another? Unfortunately, no. Initially it was because of architecture limitations, and later because of the way the instructions are encoded: a mov has room for only one memory operand. You have to copy from memory into a register, and then from that register back out to memory.

Code:
mov eax, [esi]    ; memory -> register
mov [edi], eax    ; register -> memory (edi as the destination pointer is just for illustration)

I think this tutorial is long enough already (plus I'm sleepy). Next time I'll show you some more addressing modes and some new instructions to use. Right now you'll just have to content yourself with the fact that you can move bytes around in memory now.

Written By - dargueta
03-14-2010, 04:01 AM
Post: #2
RE: Intro to Intel Assembly Language: Part 2
This second part is harder to understand than the first!

There's a fine line between genius and insanity. I have erased this line. - Oscar Levant
There's a fine line between an administrator and black hat hacker. I have erased this line. - Dr DEBCOL
03-14-2010, 04:05 AM
Post: #3
RE: Intro to Intel Assembly Language: Part 2
03-14-2010, 04:10 AM
Post: #4
RE: Intro to Intel Assembly Language: Part 2
(03-14-2010 04:05 AM)se7en Wrote:
    (03-14-2010 04:01 AM)drdebcol Wrote: This second part is harder to understand than first !
    Yeah some things (even in computing) can be abstract.
03-14-2010, 04:12 AM
Post: #5
RE: Intro to Intel Assembly Language: Part 2
(03-14-2010 04:10 AM)drdebcol Wrote:
    (03-14-2010 04:05 AM)se7en Wrote:
        (03-14-2010 04:01 AM)drdebcol Wrote: This second part is harder to understand than first !
        Yeah some things (even in computing) can be abstract.

yes you are right!
08-12-2011, 02:05 AM
Post: #6
RE: Intro to Intel Assembly Language: Part 2
NOTE: This tutorial was copied without authorization from CodeCall.net:
http://forum.codecall.net/assembly-tutor...t-2-a.html
If you have questions, you are requested to go there, as this is my primary account and I can answer all of them there.
Thanks,
dargueta





