首页 > 其他分享 >CS:APP--Chapter02 : representing and manipulating information

CS:APP--Chapter02 : representing and manipulating information

时间:2022-12-31 19:57:00浏览次数:58  
标签:information -- APP number unsigned vec bit data two

CS:APP--Chapter02 : representing and manipulating information

标签(空格分隔): CS:APP

目录


Most computers now store and process information with binary/two-valued singals/data.

For the representation of information/numbers,we consider Three most important representations of numbers:

  1. unsigned
  2. two's-complement
  3. float-point

succeeding by the details of arithmetic operation in which we indeed understand why our codes can generate some peculiar results and learn to optimize our codes.

1.information storage

First of all,byte,the smallest addressable unit of memory,consists of 8 bits,and main memory(later virtually described by virtual address space) are treated as an array of bytes (like a linear list).

The softwares running on a computer can be treated as one program,so all the information we need to consider are the program objects among these softwares.

->It simply treats each program object as a block of bytes and program as a sequence of bytes.

Then

->Each byte in main memory is labelled by a unique number,known as its address.

keyword:[1.virtual address space 2.byte:8bits 3.bit->addressed along virtual address space]

1.1 hexadecimal notation

Even if computers treat data as a run of binary numbers, it is still so tedious to understand these 0 and 1 bunches by humankind.
Hexadecimal can be handily converted into binary, vice-versa.

There are some special rules for mapping hexadecimal to decimal and binary respectively.
(For 0-9, hexadecimal is same as decimal does in terms of the representation and mapping to binaray.)

hexadecimal decimal binary
A 10 1010
B 11 1011
C 12 1100
D 13 1101
E 14 1110
F 15 1111

Conversely, converting a string of binary to hexadecimal is to divide it into groups of 4 bits each. Then if the length of the binary is not a multiple of 4 resulting in the absence of the beginning elements of the leftmost group, just padding the number with leading zeros. (zero extension:perserve itself value.)

1.1.1 hexadecimal representation and addition

0x1111
0x1234

Technique: preceding the hexadecimal by 0x

Addition:
carry it as same as a decimal does! add 1 to the next high-order bit.

keywords:[1.hexadecimal 2.hexadecimal representation 3,conversion between H and B]

1.2 data sizes

1.2.1 word and the range of virtual address memory

As for the most important parameter in a computer,word size defines how far our computer can reach our computer. That mean, if one computer defines a word as w bits, its virtual address automatically ranges from 0 to \(2^w-1\) with the capacity of \(2^w\) bytes.

As a result of different word sizes, programs always are labelled by 32-bits or 64-bits.64 bits computer can execute both programs but 32-bits computer is compatible with only 32-bit program.

1.2.2 multiple data sizes of number

computer and compilers supports various data size to effectively represent integer and float.

often data size depends on how the program is compiled.So let's shed light on the data size defined in C languagae.

(C only define the minimum size of each data type.)



Different size on different computer may result in bug because of compatibility of data type.
In oorder to standardize data size,C also provides fix-sized data type denoted as int32_t,int64_t and so on.

1.2.3 addressing and byte ordering(寻址和字节排序)

Like the same two questions when we is confronted to how to restore data we fetch and is written into memory:

  1. where is these data? ----addressing
  2. how we will order the bytes ?(my way:how to find them correctly?) -- byte ordering
a.addressing

Generally speaking,a multi-bytes object occupies a block of continuous units/bytes in virtual address space,with the address of the objects given by the smallest address of the unit/bytes used.

b.byte ordering

There are two conventions named as big endian && little endian.
Given a bit-vector of length \(w,[x_{w-1},x_{w-2},...,x_0]\)

1.little endian:restore this number in memory ordered from the lest significant byte to the least.
2.big endian:restore this number in memory ordered from the most significant bytes to the least.

One problem:
byte ordering becomes an issue:

  1. data communicated over the network between the little-endian and big-endian devices.
  2. ?
  3. ?

1.3 logic algebra

It's hard to imagine how the binary can perform the function such addition,division,multiplication,subtraction and so on.

->A rich body of mathematical knowledge has evolved around the study of the value 1 and 0.

There are four basic operations and truth tables in logic algebra:

1.\(\neg\)

\(\neg\) 0 1
- 1 0

2.\(\&\)

\(\&\) 0 1
0 0 0
1 0 1

3.\(|\)
|\(|\)|0|1|
|--|--|--|
|0|0|1|
|1|1|1|

4.^

^ 0 1
0 0 1
1 1 0

1.4 bit-level operations in C

| & ~

Apart from logic operation bit by bit,mask operation is always used in these two operation:

1.preserve some pattern

perserve the three high-order bits
&:1110000...0000
|:0001111...1111

2.extract some pattern

1.5 logic operation in C

&& || !

1.6 shift operation in C

>>
<<

2. integer representation

Now we move on to the representation of integers which is divided into two types:

  1. unsigned integer
  2. sign(two's-complement) integer

2.1 unsigned integer

assume \(\vec{x}\) is a vector of bits beginning from right with the index of 0 to the left with the index w-1.,shown as \([x_{w-1},x_{w-2},...,x_1,x_0]\).

\(B2U_w(\vec{x})=\sum^{i}_{i=0}x_i2^i\)

\(U_{max}=\frac{1(1-2^n)}{1-2}=2^n-1\)

We will see later what I have recorded here can be closely related to the its mathematical properties and its real-lift implementation in assembly language.

2.2 signed integer

For non-negative number encoded by unsigned way,then Two's-complement way can lead to the encoding of zero,negative and positive number within a more narrow range in comparsion with unsigned encoding with vecttor x of the same length w.

\(B2T_w(\vec{x})=(-1)\cdot x_{w-1} +\sum^{w-2}_{i=0}x_i2^i\)
\(TMax_w=2^{w-1}-1\)
\(TMin_w=-2^{w-1}\)

The range of positve number extends one more number than the range of negative number in one two's complement.

\(TMax_w+TMin_w=1\)

2.3 the conversion between signed and unsigned number

The problem can be separated into three parts:
1): both are positive number
2): two's complement is for a negative number
3): positve or negative zero

case 1: there is no any difference between each bit representation so \(B2T_w(\vec{x})=B2U_w(\vec{x})\),case 1 is only valid when number is in the range from 0 to \(TMax_w\)

case 2:
2.1 Unsign->signed
a unsigned number is greater than \(TMax_w\) then conveted into signed number,For a bit vector \(\vec{x}\)

\(U2T_w(\vec{x})=B2T_w(U2B(\vec{x}))=x-2^w\)

2.2 signed->unsigned
\(T2U_w(\vec{x})=x+2^w\)

case 3:
positive and negative zero ~~~~ not understand yet!

All of these three cases can be conclude as one formula:

\[U2T_W(\vec{x})= x - x_{w-1}\cdot 2^w $$\]

T2U_w(\vec{x})=x+x_{w-1}\cdot 2^w

\[ ### 2.4 expanding and shrinking the bit-level representation of a integer #### 2.4.1 expanding the bit represenation its mathematical properties tells us that ,whatever the bit representation varies ,the number itself would perserve always. "Expanding" here means making more high-order bits available in the bit represenation.for example,[10]->[0010].So one question comes here soon,what criteria can we depend on to make sure the expanding bits? ##### two genrics of expansion 1. zero expension : stretch to left more with all bit zilch 0. 2. sign expansion : stretch to left more with all sign bit(the significant bit) the later section of assembly directives will uncover the practical implmentation on the hardware. we already mentioned the match between its mathematical properties and its encoding,let make the detail illustate more for us ! #### expansion of an unsigned number by zero extension we suppose here a vector x with the expension by zeo extension: $B2U_{w+i}(\vec{x})=B2U_w(\vec{x})$ #### expansion of a signed number by sign extension $B2T_{w+i}(\vec{x})=(-1)\cdot x_{w+i-1}+\sum^{w+i-2}_{i=0}=B2T_w(\vec{x})$ sign expansion can fulll maintain and perserve the original value itself. ### 2.5 truncate integer Shoterening w bits down to w-i bits raises the question of what we should do when truncating a number? modular operation makes it by dividing by the n-th power to 2. ### 2.6 intuitive behavior - implicitly cast signed number to unsigned number implicitly cast signed number to unsigned number when c handles an expression with either operand is an unsigned number. ## 3. integer arithmetic The nature of finite computer resources may result in some peculiar outcomes like positive addition producing a negative number. Here we mostly get used to starting with the simplest part : unsigned arithmetic then moving on to the more difficult parts of signed parts. **Notes:** Overflow is a significant problem that the capacity of the data type cannot hold the number to be restored here. ### 3.1 unsigned addition principle : $x+^u_w y = \left\{ \begin{aligned} x+y,x+y\leq UMax_w\\ x+y-2^w,x+y>UMax_w\\ \end{aligned} \right.$ Question: no idea on the underlying principle of integer addition on computer. ### 3.2 unsigned negation $$-^u_wx= \left\{ \begin{aligned} x,x=0\\ 2^w-x,x>0 \end{aligned} \right.\]

3.3 two's complement addition

\[x+^{u}_{w}y=\left\{ \begin{aligned} x+y-2^{w},x+y>Tmax_w\\ x+y,0\leq x+y \leq TMax_w\\ x+y,TMin_w\leq x+y < 0\\ x+y+2^{w},x+y<TMin_w\\ \end{aligned} \right.\]

3.4 two's complement negation

\[-^t_wx= \left\{ \begin{aligned} Tmin_w,x=Tmin_w\\ -x,x>TMin_w\\ \end{aligned} \right.\]

标签:information,--,APP,number,unsigned,vec,bit,data,two
From: https://www.cnblogs.com/UQ-44636346/p/17017159.html

相关文章

  • Allure10-测试环境信息与趋势信息
    测试环境信息测试环境信息无法通过allure特性实现,需要借助环境配置文件配置文件名必须是environment.properties文件必须放在allure生成的结果数据目录中才能生效文......
  • ARM 汇编指令
    ARM汇编指令格式如下  每一条汇编语句都可以转为32bit的数字 <c>:可选,不写表示无条件执行。举例:ADDEQ表示CPSR.Z等于1时执行ADD指令         ......
  • markdown的基本操作
    Markdown学习标题:二级标题三级标题字体加粗:Hello,World!倾斜:Hello,World!加粗倾斜Hello,World!划掉Hello,World!引用啦啦啦,小陈来学Java了分割线图片......
  • 2022年终总结
    前言一年了,一年没更新博客了,今年似乎变懒了很多,但是今年还是有很多值得写的东西,那么就趁着新年交界之际,简单对今年的生活和工作做一个总结吧。封在家的上半年时间回到3......
  • Pandas 空值数据的索引 位置 行号
    前言Pandas中的空值是非常多的,这体现了数据搜集的一个不可避免的方面。由于某些不可抗力的原因,例如用户授权,数据源数据格式的不同,会造成许多空值零散的遍布在数据中的各个......
  • Allure11-总结
    allure特性非动态特性@allure.epic、@allure.feature、@allure.story、@[email protected]、@[email protected][email protected]......
  • mp3 批量修改文件名和音乐标签
    批量修改文件名首选BulkRenameUtility,可以快速插入修改文件名,并且后面有预览,确定没有问题就点击下方的rename即可。音乐标签MusicTag,点击工具栏靠后的文件名相关,就......
  • Jetpack Compose 加载 Drawable
    DrawablePainterAlibrarywhichprovidesawaytouseAndroid drawables asJetpackCompose Painters.ThislibraryattemptstosupportmostDrawableconfigu......
  • 行止
    离别不是别离当对于离别尚未麻木的我面对别离时,还是会被那一盏孤灯所发出的微光刺痛双眼,扭头片刻间,早已泪流满面。星光黯淡,可能是因为城市的缘故,也可能是因为双眼已朦胧......
  • VuePress教程
    下面全部操作都基于VuePress1.X[1]VuePress初体验创建一个文件夹,博主就创建”VuePress“进入VuePress目录执行下面命令yarninit#npminit#安装VuePressyar......