
CS:APP--Chapter03 : machine-level representation of program - part 1 basic(2)

Date: 2022-12-14 15:12:57

Tags: CS:APP


3.4 accessing information

A programmer can change only two kinds of state, registers and memory, and only through instructions and their operands.

3.4.1 register

keypoint:

  1. 16 general registers
  2. 8 traditional, 8 new
  3. different naming convention

x86-64 has 16 general-purpose registers, each 64 bits wide:
(1): %rax through %rsp: these descend from the eight 16-bit registers of the 8086;
(2): %r8 through %r15: eight new registers added in x86-64.

The 8086 provided eight 16-bit registers named %ax through %sp. IA32 extended them to 32 bits with names starting with %e, like %eax, and x86-64 extended them to 64 bits with names starting with %r, like %rax. The eight new registers can be accessed at the same four widths, but their naming convention is different: %r8-%r8d-%r8w-%r8b for %r8 through %r15, compared with %rax-%eax-%ax-%al for %rax through %rsp.

3.4.1 register characteristics

  1. different registers serve different roles in a typical program.
  2. for historical reasons, the low-order 8, 16, and 32 bits of each 64-bit register can be accessed as separate registers.

two important points are highlighted here:

1:
After an instruction writes \(w\) bytes to a register,
a): w = 1 or 2, the remaining high-order 8-w bytes are left unchanged.
b): w = 4, the high-order 4 bytes are set to zero.
->In conclusion, a mov instruction changes only the register bytes or memory locations named by its destination operand, except in case b).
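
The two cases above can be sketched in C. This is only an emulation for illustration: the function name `write_low_bytes` and the idea of modelling a register as a `uint64_t` are assumptions made here, not something real code uses to touch registers.

```c
#include <assert.h>
#include <stdint.h>

/* Model of how a 64-bit register changes when an instruction writes
   its low w bytes (w = 1, 2, 4, or 8). Hypothetical helper. */
uint64_t write_low_bytes(uint64_t reg, uint64_t value, int w)
{
    assert(w == 1 || w == 2 || w == 4 || w == 8);
    switch (w) {
    case 1:  /* like movb: the upper 7 bytes are left unchanged */
        return (reg & ~UINT64_C(0xFF)) | (value & 0xFF);
    case 2:  /* like movw: the upper 6 bytes are left unchanged */
        return (reg & ~UINT64_C(0xFFFF)) | (value & 0xFFFF);
    case 4:  /* like movl: the upper 4 bytes are set to zero */
        return value & UINT64_C(0xFFFFFFFF);
    default: /* like movq: the whole register is replaced */
        return value;
    }
}
```

Starting from a register holding 0x1122334455667788, a 1-byte write of 0xAA gives 0x11223344556677AA, while a 4-byte write of 0xDDCCBBAA gives 0x00000000DDCCBBAA.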

2:
%rsp -> the stack pointer: it holds the address of the top element (the end position) of the run-time stack
(review: the program counter, pc, is %rip)

3.4.2 operand specifier

An operand specifier tells an instruction where to find a source value or where to place a result. x86-64 supports the operand forms shown below, which can be classified into three types: (1) immediate (2) register (3) memory.

form: assembly notation => the value it denotes

1. immediate number

$Imm => Imm (the constant value itself)

2. register

\(r_a\) => R[\(r_a\)] (the value in the particular register)

3. memory

operand description
Imm \(M[Imm]\)
(\(r_a\)) \(M[R[r_a]]\)
Imm(\(r_a\)) \(M[R[r_a] + Imm]\)
(\(r_a\),\(r_b\)) \(M[R[r_a] + R[r_b]]\)
Imm(\(r_a\),\(r_b\)) \(M[R[r_a] + R[r_b] + Imm]\)
(\(r_a\),\(r_b\),s[^1]) \(M[R[r_a] + R[r_b]\times s]\)
(,\(r_b\),s) \(M[R[r_b]\times s]\)
Imm(,\(r_b\),s) \(M[R[r_b]\times s + Imm]\)
Imm(\(r_a\),\(r_b\),s)[^2] \(M[R[r_a] + R[r_b]\times s + Imm]\) : [s must be 1, 2, 4, or 8, and \(r_a,r_b\) must be 64-bit registers.]
[^1]: scale factor -> the value of the index register is multiplied by s.
[^2]: \(r_a:\) base register, \(r_b:\) index register
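
The most general form, Imm(\(r_a\),\(r_b\),s), can be written as a small C function. This is a sketch: `effective_address` is a hypothetical name, and the register contents are simply passed in as integers. The simpler forms are special cases with some parts set to zero or s set to 1.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address for Imm(ra, rb, s): R[ra] + R[rb] * s + Imm,
   computed with 64-bit wraparound. Hypothetical helper. */
uint64_t effective_address(int64_t imm, uint64_t base, uint64_t index, unsigned s)
{
    assert(s == 1 || s == 2 || s == 4 || s == 8);  /* the only legal scale factors */
    return base + index * s + (uint64_t)imm;
}
```

For example, with a base register holding 0x100 and an index register holding 0x3, the operand 9(base,index) names address 0x10C, and (base,index,4) also names 0x10C.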

3.4.3 data movement instructions (focus on the members of the mov class)

The mov class is the most frequently used group of instructions. Each member copies data from a source to a destination, and its format is shown here:

mov s,d copy from source s to destination d ;[both operands have the same size]

instructions description
movb s,d move byte from s to d
movw s,d move word from s to d
movl s,d move double word from s to d
movq s,d move quad word from s to d: an immediate source must fit in a 32-bit two's-complement number, which is then sign-extended to 64 bits.
movabsq s,d move absolute quad word from s to d: the source is an arbitrary 64-bit immediate and the destination must be a register.

*key point: x86-64 imposes the restriction that a move instruction cannot have both operands refer to memory locations. Copying a value from one memory location to another takes two instructions: one to load the source into a register and one to store that register to the destination.
[personal assumption: there is no direct data path between two locations in main memory.]

The mov instructions above copy between operands of the same size. But what if the destination is larger than the source, as when casting a char variable to an int? Then the value must be zero-extended or sign-extended. [The source can be a register or memory, whereas the destination must be a register.]
=>
case 1: zero extension
case 2: sign extension

1. mov + z + suffix instruction : zero extension

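A movz instruction copies the source into the low-order bytes of the destination register and fills the remaining high-order bytes with zeros.

movz[1][2] s,d : \(R[d]=\) zero_extension( \(R[s]\) or \(M[s]\) ), where [1] is the size suffix of the source and [2] is the size suffix of the destination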

instructions description
movzbw s,d move zero-extended byte to word
movzbl s,d move zero-extended byte to double words
movzbq s,d move zero-extended byte to quad words
movzwl s,d move zero-extended word to double words
movzwq s,d move zero-extended word to quad words

Explanation:

  1. why is there no instruction that zero-extends a 4-byte source to an 8-byte register?
    As we said before, any instruction that writes a 4-byte value to a register sets the high-order 4 bytes to zero, so movl already does this job and there is no need for a duplicate instruction. Writes of 1 or 2 bytes, by contrast, leave the remaining bytes unchanged.

2. mov + s + suffix instruction : sign extension

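movs is similar to movz, except that the remaining high-order bytes are filled with copies of the most significant bit of the source (also called the sign bit).

movs[1][2] s,d : \(R[d]=\) sign_extension( \(R[s]\) or \(M[s]\) ), where [1] is the size suffix of the source and [2] is the size suffix of the destination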

instructions description
movsbw s,d move sign-extended byte to word
movsbl s,d move sign-extended byte to double words
movsbq s,d move sign-extended byte to quad words
movswl s,d move sign-extended word to double words
movswq s,d move sign-extended word to quad words
movslq s,d move sign-extended double words to quad words
** cltq (NO OPERAND!): sign-extends %eax to %rax, the same effect as movslq %eax,%rax
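
In C, these extensions correspond to integer conversions from a narrower type to a wider one. The sketch below uses hypothetical function names. A compiler for x86-64 would typically implement each conversion with the instruction named in the comment, but the exact instruction chosen is up to the compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned source: zero extension (the job of movzbl). */
uint32_t zero_extend_byte(uint8_t b) { return (uint32_t)b; }

/* Signed source: sign extension (the job of movsbl). */
int32_t sign_extend_byte(int8_t b) { return (int32_t)b; }

/* 32 to 64 bits, signed (the job of movslq, or cltq on %eax). */
int64_t sign_extend_long(int32_t x) { return (int64_t)x; }

void extension_examples(void)
{
    /* The byte pattern 0xFF is 255 when unsigned and -1 when signed. */
    assert(zero_extend_byte(0xFF) == 0x000000FFu);
    assert((uint32_t)sign_extend_byte((int8_t)-1) == 0xFFFFFFFFu);
    /* The sign bit of the 32-bit value fills the upper 32 bits. */
    assert((uint64_t)sign_extend_long(-2) == UINT64_C(0xFFFFFFFFFFFFFFFE));
}
```

The same byte pattern therefore produces different wide values depending on whether the source type is signed, which is why both movz and movs exist.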

3.4.4 one example of data movements

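keypoint: reference and dereference

reference : & yields the address of an object
dereference : * accesses the object at an address; in machine code the pointer is held in a register such as %rax and the object is reached with a memory operand such as (%rax)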
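
A small function in the spirit of the textbook's data movement example shows both operators at work. The helper `exchange_example` is added here only to demonstrate a call, and the comments describe what the C code does rather than a guaranteed instruction sequence.

```c
#include <assert.h>

/* Store y at the location xp points to and return the old value. */
long exchange(long *xp, long y)
{
    long x = *xp;  /* dereference on the right: read memory at the address in xp */
    *xp = y;       /* dereference on the left: write memory at the address in xp */
    return x;
}

void exchange_example(void)
{
    long a = 4;
    long b = exchange(&a, 3);  /* &a passes the address of a, not its value */
    assert(a == 3 && b == 4);
}
```

A dereference on the right-hand side of an assignment is a read from memory, while the same expression on the left-hand side is a write to memory.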

3.4.5 the meaning of the left-hand and right-hand sides of one assignment

3.4.6 pushing and popping stack data

A stack is a data structure where values can be added or removed, but only according to a "last-in, first-out" discipline.

The stack grows downward in the memory such that the top element of the stack has the lowest address of all stack elements.

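1. top,pushq,popq

The stack can be implemented by an array in which elements are always added and removed at the same end, called the top of the stack. pushq S decrements %rsp by 8 and then stores S at the new top; popq D reads the value at the top into D and then increments %rsp by 8. Because the stack grows toward lower addresses, diagrams usually draw it upside down, with the top at the bottom.

2. Another way to access the stack

Although the stack is a special data structure, it lives in the same main memory as all other data, so it can be accessed by the standard memory addressing methods.

In other words, the program stack can be read the way we usually access an array; for example, 8(%rsp) refers to the second quad word from the top.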
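
A toy model in C makes the push and pop order concrete. Everything here is an assumption made for illustration: memory is an array of 8-byte words, `rsp` is an index instead of an address, and the names `toy_stack`, `toy_pushq`, `toy_popq`, and `toy_peek` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

/* A lower index stands for a lower address. */
typedef struct {
    uint64_t mem[STACK_WORDS];
    int rsp;  /* index of the top element; STACK_WORDS means empty */
} toy_stack;

void toy_init(toy_stack *s) { s->rsp = STACK_WORDS; }

/* Like pushq: decrement the stack pointer first, then store. */
void toy_pushq(toy_stack *s, uint64_t v)
{
    assert(s->rsp > 0);
    s->rsp -= 1;
    s->mem[s->rsp] = v;
}

/* Like popq: load from the top first, then increment the stack pointer. */
uint64_t toy_popq(toy_stack *s)
{
    assert(s->rsp < STACK_WORDS);
    uint64_t v = s->mem[s->rsp];
    s->rsp += 1;
    return v;
}

/* Ordinary indexing also works: read the word k slots above the top,
   in the manner of the operand 8*k(%rsp). */
uint64_t toy_peek(const toy_stack *s, int k)
{
    assert(k >= 0 && s->rsp + k < STACK_WORDS);
    return s->mem[s->rsp + k];
}
```

After pushing 1 and then 2, `toy_peek(&s, 0)` returns 2 and `toy_peek(&s, 1)` returns 1, without popping anything.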

3.5 arithmetic and logic operation

The operations are divided into four categories:

1)load effective address 2)unary 3)binary 4)shift

where a unary instruction has only one operand and a binary instruction has two.

3.5.1 load effective address


instruction : leaq

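Why is the suffix q? => an effective address in x86-64 is 64 bits long.

leaq has the form of an instruction that reads from memory, but it does not access memory at all: it copies the computed address itself to the destination register. For that reason, in addition to loading effective addresses, leaq can perform simple arithmetic that combines addition and a limited form of multiplication, based on the standard addressing forms, for example: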

leaq 3(%rax,%rax,4),%rax
=>
%rax = 4*%rax+%rax + 3 = 5*%rax+3

It resembles the & operator in C: it produces an address without reading the memory at that address.
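
Both uses can be shown in C. The function names below are hypothetical, and whether a compiler actually emits leaq for them depends on the compiler and its options; the point is only that both are pure address-style arithmetic with no memory read.

```c
#include <assert.h>

/* C counterpart of: leaq 3(%rax,%rax,4), %rax  with x held in %rax. */
long scale5_plus3(long x)
{
    return x + 4 * x + 3;
}

/* &a[i] computes a + 8*i for 8-byte longs and does not read a[i]. */
long *element_address(long *a, long i)
{
    return &a[i];
}

void leaq_examples(void)
{
    long a[4] = {10, 20, 30, 40};
    assert(scale5_plus3(2) == 13);           /* 5*2 + 3 */
    assert(element_address(a, 3) == a + 3);  /* an address, not a value */
    assert(*element_address(a, 3) == 40);    /* the read happens only here */
}
```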

3.5.2 unary and binary operations

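The main difference between them is the number of operands: a unary instruction has one operand, which serves as both source and destination (for example incq or negq), whereas a binary instruction has two, with the second operand used as both a source and the destination (for example addq or subq).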

3.5.3 shift operation

Shifts are often used to multiply or divide by powers of 2, and they are divided into logical and arithmetic operations. A left shift is the same in both cases (sal and shl have the same effect and fill from the right with zeros); the difference appears only in right shifts.

(unsigned: logical right shift, shr -> the vacated high-order bits are filled with zeros)
(signed: for a non-negative number the arithmetic shift behaves the same as the logical one, but for a negative number only the arithmetic right shift, sar, works -> the vacated high-order bits are filled with copies of the sign bit)

One thing needs to be mentioned here: the shift amount is usually given as an immediate value, but what if it comes from a register? In that case it must be held in the single-byte register %cl.

The number of bits of %cl that are actually used depends on the operand size given by the suffix: for an operand of w bits, only the low-order m bits of %cl are used, where \(2^m = w\). For example, if the suffix is w, the operand is 16 bits long, so 4 bits are used and the shift amount ranges from 0 to 15. This keeps the shift amount within the width of the operand.
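
The masking of the shift amount and the two kinds of right shift can be emulated in C. This sketch assumes the count lives in a byte (standing in for the single-byte register %cl that x86-64 uses for variable shifts), and the function names are hypothetical. The arithmetic shift is built from unsigned operations so that the fill bits are explicit.

```c
#include <assert.h>
#include <stdint.h>

/* Effective shift count for a w-bit operand (w = 8, 16, 32, or 64):
   only the low-order m bits of the count are used, where 2^m = w. */
unsigned effective_shift(uint8_t cl, unsigned w)
{
    assert(w == 8 || w == 16 || w == 32 || w == 64);
    return cl & (w - 1);
}

/* Logical right shift (like shrq): vacated bits are filled with zeros. */
uint64_t shr64(uint64_t x, uint8_t cl)
{
    return x >> effective_shift(cl, 64);
}

/* Arithmetic right shift (like sarq): vacated bits copy the sign bit. */
int64_t sar64(int64_t x, uint8_t cl)
{
    unsigned k = effective_shift(cl, 64);
    uint64_t shifted = (uint64_t)x >> k;
    if (x < 0 && k > 0)
        shifted |= ~UINT64_C(0) << (64 - k);  /* fill the top k bits with ones */
    /* Convert the bit pattern back to a signed value without overflow. */
    return x < 0 ? -(int64_t)(~shifted) - 1 : (int64_t)shifted;
}
```

With a count byte of 0xFF, the effective shift is 7 for an 8-bit operand, 15 for 16 bits, 31 for 32 bits, and 63 for 64 bits. Shifting -8 right by one gives -4 arithmetically but a large positive number logically.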

3.5.4 special arithmetic operations

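Unlike addition and subtraction, multiplying two quad words (64 bits each) can produce a result that needs up to 128 bits (16 bytes), a size called an oct word. The one-operand multiply instructions therefore place the full product in the register pair %rdx:%rax (high half in %rdx, low half in %rax), and the divide instructions take their 128-bit dividend from the same pair, leaving the quotient in %rax and the remainder in %rdx.

Instructions: mulq and imulq (multiply), divq and idivq (divide)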
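
To see why 64 bits are not enough for the product, the full unsigned result can be computed in portable C from 32-bit pieces. This is a sketch under stated assumptions: the `u128` struct and `mul_64x64` are hypothetical names, and the two fields only mirror the %rdx:%rax pair that the one-operand multiply instruction uses. Real code would let the hardware instruction do this in one step.

```c
#include <assert.h>
#include <stdint.h>

/* A 128-bit value as two 64-bit halves. */
typedef struct {
    uint64_t hi;  /* the half that would be in %rdx */
    uint64_t lo;  /* the half that would be in %rax */
} u128;

/* Full unsigned 64 x 64 -> 128-bit product. */
u128 mul_64x64(uint64_t x, uint64_t y)
{
    uint64_t x_lo = x & 0xFFFFFFFFu, x_hi = x >> 32;
    uint64_t y_lo = y & 0xFFFFFFFFu, y_hi = y >> 32;

    uint64_t ll = x_lo * y_lo;
    uint64_t lh = x_lo * y_hi;
    uint64_t hl = x_hi * y_lo;
    uint64_t hh = x_hi * y_hi;

    /* Middle column: carry out of ll plus the low halves of lh and hl. */
    uint64_t mid = (ll >> 32) + (lh & 0xFFFFFFFFu) + (hl & 0xFFFFFFFFu);

    u128 r;
    r.lo = (mid << 32) | (ll & 0xFFFFFFFFu);
    r.hi = hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
    return r;
}

void mul_examples(void)
{
    /* 2^32 * 2^32 = 2^64: the low half is 0 and the high half is 1. */
    u128 p = mul_64x64(UINT64_C(1) << 32, UINT64_C(1) << 32);
    assert(p.hi == 1 && p.lo == 0);
}
```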

From: https://www.cnblogs.com/UQ-44636346/p/16982196.html
