https://itanium-cxx-abi.github.io/cxx-abi/abi.html#member-pointers
2.3 Member Pointers
2.3.1 Data Member Pointers
The basic ABI properties of data member pointer types are those of ptrdiff_t.
A data member pointer is represented as the data member's offset in bytes from the address point of an object of the base type, as a ptrdiff_t.
A null data member pointer is represented as an offset of -1.
Note that by [dcl.init], "zero initialization" of a data member pointer object stores a null pointer value into it. Under this representation, that value has a non-zero bit representation. On most modern platforms, data member pointers are the only type with this property.
Base-to-derived and derived-to-base conversions of a non-null data member pointer can be performed by adding or subtracting (respectively) the static offset of the base within the derived class. The C++ standard does not permit base-to-derived and derived-to-base conversions of member pointers to cross a virtual base relationship, and so a static offset is always known.
Data member pointers that identify members of their class will always store non-negative offsets. Unfortunately, it is possible to apply conversions to a non-null data member pointer that will cause it to hold a negative offset. If this value is -1, the member pointer will subsequently be treated as a null pointer. This is considered an irreparable defect in this ABI. Recommendation for new platforms: consider using a different representation for data member pointers, such as left-shifting the offset by one and using a non-zero low bit to indicate a non-null value.
It is relatively difficult to demonstrate this defect in well-defined code. It is possible to convert a member pointer to a derived class and then cast it back to a different base class; if the second base class is stored after the first, the resulting offset will be negative. However, this cast has undefined behavior because the member is no longer a member of a base or derived class of the member pointer's class. To demonstrate the defect, either an empty base class or an empty data member must be involved. For example:
struct alignas(2) B1 {};
struct B2 : B1 {};
struct B3 : B1 {};
struct D : B2, B3 {
    char a, b;
};
// The offset in D of the B3 base subobject is 2, but the
// offset of the data member b is 1.
auto mptr = static_cast<char B3::*>(&D::b);
2.3.2 Member Function Pointers
Several different representations of member function pointers are in use. The standard representation relies on several assumptions about the platform, such as that the low bit of a function pointer to a non-virtual member function is always zero. For platforms where this is not reasonable to guarantee, an alternate representation must be used. One such representation, used on the 32-bit ARM architecture, is also described here.
In all representations, the basic ABI properties of member function pointer types are those of the following class, where fnptr_t is the appropriate function-pointer type for a member function of this type:
struct {
    fnptr_t ptr;
    ptrdiff_t adj;
};
A member function pointer for a non-virtual member function is represented with ptr set to a pointer to the function, using the base ABI's representation of function pointers.
In the standard representation, a member function pointer for a virtual function is represented with ptr set to 1 plus the function's v-table entry offset (in bytes), converted to a function pointer as if by reinterpret_cast<fnptr_t>(uintfnptr_t(1 + offset)), where uintfnptr_t is an unsigned integer of the same size as fnptr_t.
In both of these cases, adj stores the offset (in bytes) which must be added to the this pointer before the call.
In the standard representation, a null member function pointer is represented with ptr set to a null pointer. The value of adj is unspecified for null member function pointers.
The standard representation relies on some assumptions which are true for most platforms:
- The low bit of a function pointer to a non-static member function is never set. On most platforms, this is either always true or can be made true at little cost. For example, on platforms where a function pointer is just the address of the first instruction in the function, the implementation can ensure that this address is always sufficiently aligned to make the low bit zero for non-static member functions; often this is required by the underlying architecture.
- A null function pointer can be distinguished from a virtual offset value. On most platforms, this is always true because the null function pointer is the zero value.
- The offset to a v-table entry is never odd. On most platforms, the size of a v-table entry is even because the architecture is byte-addressed and pointers are even-sized.
- A virtual call can be performed knowing only the address of a v-table entry and the type of the virtual function. On most platforms, a v-table entry is equivalent to a function pointer, and the type of that function pointer can be determined from the member pointer type.
However, there are exceptions. For example, on the 32-bit ARM architecture, the low bit of a function pointer determines whether the function begins in THUMB mode. Such platforms must use an alternate representation.
In the 32-bit ARM representation, the this-adjustment stored in adj is left-shifted by one, and the low bit of adj indicates whether ptr is a function pointer (including null) or the offset of a v-table entry. A virtual member function pointer sets ptr to the v-table entry offset as if by reinterpret_cast<fnptr_t>(uintfnptr_t(offset)). A null member function pointer sets ptr to a null function pointer and must ensure that the low bit of adj is clear; the upper bits of adj remain unspecified.
A member function pointer is null if ptr is equal to a null function pointer and (only when using the 32-bit ARM representation) the low bit of adj is clear.
Two member function pointers are equal if they are both null or if their corresponding values of ptr and adj are equal. Note that the C++ standard does not require member pointers to the same virtual member function to compare equal; implementations using this ABI will do so, but only if the member pointers are built using the same v-table offset, which they may not be in the presence of multiple inheritance or overrides with covariant return types.
Base-to-derived and derived-to-base conversions of a member function pointer can be performed by adding or subtracting (respectively) the static offset of the base within the derived class to the stored this-adjustment value. In the standard representation, this simply means adding it to adj; in the 32-bit ARM representation, the addend must be left-shifted by one. Because the adjustment does not factor into whether a member function pointer is null, this addition can be done unconditionally when performing a conversion.
A call is performed as follows:
- Add the stored adjustment to the this address.
- If the member pointer stores a v-table entry offset, load the v-table from the adjusted this address and call the v-table entry at the stored offset.
- Otherwise, call the stored function pointer.
For example:
#include <typeinfo>
#include <iostream>
#include <cxxabi.h>
using namespace std;

typedef struct {
    int a;
    int b;
} XX;

typedef struct : public XX
{
    int k;
} YY;

// Reinterpret the storage of a data member pointer as a long.
#define VALUE_OF_PTR(p) (*(long*)&p)

extern "C" int printf(const char*, ...);

struct A {
    virtual void foo() { printf("A::foo(): this = 0x%p\n", this); }
};
struct B {
    virtual void bar() { printf("B::bar(): this = 0x%p\n", this); }
};
struct C : public A, public B {
    virtual void quz() { printf("C::quz(): this = 0x%p\n", this); }
};

void (A::*pafoo)() = &A::foo; // ptr: 1, adj: 0
void (B::*pbbar)() = &B::bar; // ptr: 1, adj: 0
void (C::*pcfoo)() = &C::foo; // ptr: 1, adj: 0
void (C::*pcquz)() = &C::quz; // ptr: 9, adj: 0
void (C::*pcbar)() = &C::bar; // ptr: 1, adj: 8

// The two words of a member function pointer: ptr, then adj.
#define PART1_OF_PTR(p) (((long*)&p)[0])
#define PART2_OF_PTR(p) (((long*)&p)[1])

int main() {
    printf("&A::foo->ptr: 0x%lX, ", PART1_OF_PTR(pafoo)); // 1
    printf("&A::foo->adj: 0x%lX\n", PART2_OF_PTR(pafoo)); // 0
    printf("&B::bar->ptr: 0x%lX, ", PART1_OF_PTR(pbbar)); // 1
    printf("&B::bar->adj: 0x%lX\n", PART2_OF_PTR(pbbar)); // 0
    printf("&C::foo->ptr: 0x%lX, ", PART1_OF_PTR(pcfoo)); // 1
    printf("&C::foo->adj: 0x%lX\n", PART2_OF_PTR(pcfoo)); // 0
    printf("&C::quz->ptr: 0x%lX, ", PART1_OF_PTR(pcquz)); // 9
    printf("&C::quz->adj: 0x%lX\n", PART2_OF_PTR(pcquz)); // 0
    printf("&C::bar->ptr: 0x%lX, ", PART1_OF_PTR(pcbar)); // 1
    printf("&C::bar->adj: 0x%lX\n", PART2_OF_PTR(pcbar)); // 8

    C c;
    int XX::*p = 0;           // VALUE_OF_PTR(p) == -1
    p = &XX::a;               // VALUE_OF_PTR(p) == 0
    p = &XX::b;               // VALUE_OF_PTR(p) == 4
    int YY::* kptr = &YY::k;  // VALUE_OF_PTR(kptr) == 8

    // Call B::bar through a pointer to member of C; this is adjusted by adj.
    ((&c)->*pcbar)();

    int status;
    char *ret = abi::__cxa_demangle(typeid(pcbar).name(), 0, 0, &status);
    cout <<"pcbar type mangled name: "<<typeid(pcbar).name()<< " | pcbar type demangled name: " <<ret<<endl;
    cout <<"pcbar take : "<<sizeof(pcbar)<< " bytes in memory"<< endl;
    return 0;
}
The generated assembly (x86-64, AT&T syntax) is as follows:
main:
pushq %rbp
movq %rsp, %rbp
subq $96, %rsp
movl $0, -4(%rbp)
movq pafoo(%rip), %rsi
leaq .L.str(%rip), %rdi
xorl %eax, %eax
movb %al, -73(%rbp)
callq printf@PLT
movb -73(%rbp), %al
movq pafoo+8(%rip), %rsi
leaq .L.str.1(%rip), %rdi
callq printf@PLT
movb -73(%rbp), %al
movq pbbar(%rip), %rsi
leaq .L.str.2(%rip), %rdi
callq printf@PLT
movb -73(%rbp), %al
movq pbbar+8(%rip), %rsi
leaq .L.str.3(%rip), %rdi
callq printf@PLT
movb -73(%rbp), %al
movq pcfoo(%rip), %rsi
leaq .L.str.4(%rip), %rdi
callq printf@PLT
movb -73(%rbp), %al
movq pcfoo+8(%rip), %rsi
leaq .L.str.5(%rip), %rdi
callq printf@PLT
movb -73(%rbp), %al
movq pcquz(%rip), %rsi
leaq .L.str.6(%rip), %rdi
callq printf@PLT
movb -73(%rbp), %al
movq pcquz+8(%rip), %rsi
leaq .L.str.7(%rip), %rdi
callq printf@PLT
movb -73(%rbp), %al
movq pcbar(%rip), %rsi
leaq .L.str.8(%rip), %rdi
callq printf@PLT
movb -73(%rbp), %al
movq pcbar+8(%rip), %rsi
leaq .L.str.9(%rip), %rdi
callq printf@PLT
leaq -24(%rbp), %rdi
callq C::C() [base object constructor]
movq $-1, -32(%rbp)
movq $0, -32(%rbp)
movq $4, -32(%rbp)
movq $8, -40(%rbp)
movq pcbar+8(%rip), %rdx
movq pcbar(%rip), %rax
movq %rax, -72(%rbp)
leaq -24(%rbp), %rcx
addq %rdx, %rcx
movq %rcx, -64(%rbp)
andq $1, %rax
cmpq $0, %rax
je .LBB0_2
movq -72(%rbp), %rcx
movq -64(%rbp), %rax
movq (%rax), %rax
subq $1, %rcx
movq (%rax,%rcx), %rax
movq %rax, -88(%rbp)
jmp .LBB0_3
.LBB0_2:
movq -72(%rbp), %rax
movq %rax, -88(%rbp)
.LBB0_3:
movq -64(%rbp), %rdi
movq -88(%rbp), %rax
callq *%rax
leaq typeinfo for void (C::*)()(%rip), %rdi
callq std::type_info::name() const
movq %rax, %rdi
xorl %eax, %eax
movl %eax, %edx
leaq -44(%rbp), %rcx
movq %rdx, %rsi
callq __cxa_demangle@PLT
movq %rax, -56(%rbp)
movq std::cout@GOTPCREL(%rip), %rdi
leaq .L.str.10(%rip), %rsi
callq std::basic_ostream<char, std::char_traits<char>>& std::operator<<<std::char_traits<char>>(std::basic_ostream<char, std::char_traits<char>>&, char const*)@PLT
movq %rax, -96(%rbp)
leaq typeinfo for void (C::*)()(%rip), %rdi
callq std::type_info::name() const
movq -96(%rbp), %rdi
movq %rax, %rsi
callq std::basic_ostream<char, std::char_traits<char>>& std::operator<<<std::char_traits<char>>(std::basic_ostream<char, std::char_traits<char>>&, char const*)@PLT
movq %rax, %rdi
leaq .L.str.11(%rip), %rsi
callq std::basic_ostream<char, std::char_traits<char>>& std::operator<<<std::char_traits<char>>(std::basic_ostream<char, std::char_traits<char>>&, char const*)@PLT
movq %rax, %rdi
movq -56(%rbp), %rsi
callq std::basic_ostream<char, std::char_traits<char>>& std::operator<<<std::char_traits<char>>(std::basic_ostream<char, std::char_traits<char>>&, char const*)@PLT
movq %rax, %rdi
movq std::basic_ostream<char, std::char_traits<char>>& std::endl<char, std::char_traits<char>>(std::basic_ostream<char, std::char_traits<char>>&)@GOTPCREL(%rip), %rsi
callq std::ostream::operator<<(std::ostream& (*)(std::ostream&))@PLT
movq std::cout@GOTPCREL(%rip), %rdi
leaq .L.str.12(%rip), %rsi
callq std::basic_ostream<char, std::char_traits<char>>& std::operator<<<std::char_traits<char>>(std::basic_ostream<char, std::char_traits<char>>&, char const*)@PLT
movq %rax, %rdi
movl $16, %esi
callq std::ostream::operator<<(unsigned long)@PLT
movq %rax, %rdi
leaq .L.str.13(%rip), %rsi
callq std::basic_ostream<char, std::char_traits<char>>& std::operator<<<std::char_traits<char>>(std::basic_ostream<char, std::char_traits<char>>&, char const*)@PLT
movq %rax, %rdi
movq std::basic_ostream<char, std::char_traits<char>>& std::endl<char, std::char_traits<char>>(std::basic_ostream<char, std::char_traits<char>>&)@GOTPCREL(%rip), %rsi
callq std::ostream::operator<<(std::ostream& (*)(std::ostream&))@PLT
xorl %eax, %eax
addq $96, %rsp
popq %rbp
retq
C::C() [base object constructor]:
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
movq %rdi, -8(%rbp)
movq -8(%rbp), %rdi
movq %rdi, -16(%rbp)
callq A::A() [base object constructor]
movq -16(%rbp), %rdi
addq $8, %rdi
callq B::B() [base object constructor]
movq -16(%rbp), %rax
leaq vtable for C(%rip), %rcx
addq $16, %rcx
movq %rcx, (%rax)
leaq vtable for C(%rip), %rcx
addq $48, %rcx
movq %rcx, 8(%rax)
addq $16, %rsp
popq %rbp
retq
A::A() [base object constructor]:
pushq %rbp
movq %rsp, %rbp
movq %rdi, -8(%rbp)
movq -8(%rbp), %rax
leaq vtable for A(%rip), %rcx
addq $16, %rcx
movq %rcx, (%rax)
popq %rbp
retq
B::B() [base object constructor]:
pushq %rbp
movq %rsp, %rbp
movq %rdi, -8(%rbp)
movq -8(%rbp), %rax
leaq vtable for B(%rip), %rcx
addq $16, %rcx
movq %rcx, (%rax)
popq %rbp
retq
A::foo():
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
movq %rdi, -8(%rbp)
movq -8(%rbp), %rsi
leaq .L.str.14(%rip), %rdi
movb $0, %al
callq printf@PLT
addq $16, %rsp
popq %rbp
retq
C::quz():
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
movq %rdi, -8(%rbp)
movq -8(%rbp), %rsi
leaq .L.str.15(%rip), %rdi
movb $0, %al
callq printf@PLT
addq $16, %rsp
popq %rbp
retq
B::bar():
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
movq %rdi, -8(%rbp)
movq -8(%rbp), %rsi
leaq .L.str.16(%rip), %rdi
movb $0, %al
callq printf@PLT
addq $16, %rsp
popq %rbp
retq
pafoo:
.quad 1
.quad 0
pbbar:
.quad 1
.quad 0
pcfoo:
.quad 1
.quad 0
pcquz:
.quad 9
.quad 0
pcbar:
.quad 1
.quad 8
.L.str:
.asciz "&A::foo->ptr: 0x%lX, "
.L.str.1:
.asciz "&A::foo->adj: 0x%lX\n"
.L.str.2:
.asciz "&B::bar->ptr: 0x%lX, "
.L.str.3:
.asciz "&B::bar->adj: 0x%lX\n"
.L.str.4:
.asciz "&C::foo->ptr: 0x%lX, "
.L.str.5:
.asciz "&C::foo->adj: 0x%lX\n"
.L.str.6:
.asciz "&C::quz->ptr: 0x%lX, "
.L.str.7:
.asciz "&C::quz->adj: 0x%lX\n"
.L.str.8:
.asciz "&C::bar->ptr: 0x%lX, "
.L.str.9:
.asciz "&C::bar->adj: 0x%lX\n"
typeinfo name for void (C::*)():
.asciz "M1CFvvE"
typeinfo name for void ():
.asciz "FvvE"
typeinfo for void ():
.quad vtable for __cxxabiv1::__function_type_info+16
.quad typeinfo name for void ()
typeinfo name for C:
.asciz "1C"
typeinfo name for A:
.asciz "1A"
typeinfo for A:
.quad vtable for __cxxabiv1::__class_type_info+16
.quad typeinfo name for A
typeinfo name for B:
.asciz "1B"
typeinfo for B:
.quad vtable for __cxxabiv1::__class_type_info+16
.quad typeinfo name for B
typeinfo for C:
.quad vtable for __cxxabiv1::__vmi_class_type_info+16
.quad typeinfo name for C
.long 0
.long 2
.quad typeinfo for A
.quad 2
.quad typeinfo for B
.quad 2050
typeinfo for void (C::*)():
.quad vtable for __cxxabiv1::__pointer_to_member_type_info+16
.quad typeinfo name for void (C::*)()
.long 0
.zero 4
.quad typeinfo for void ()
.quad typeinfo for C
.L.str.10:
.asciz "pcbar type mangled name: "
.L.str.11:
.asciz " | pcbar type demangled name: "
.L.str.12:
.asciz "pcbar take : "
.L.str.13:
.asciz " bytes in memory"
vtable for C:
.quad 0
.quad typeinfo for C
.quad A::foo()
.quad C::quz()
.quad -8
.quad typeinfo for C
.quad B::bar()
vtable for A:
.quad 0
.quad typeinfo for A
.quad A::foo()
vtable for B:
.quad 0
.quad typeinfo for B
.quad B::bar()
.L.str.14:
.asciz "A::foo(): this = 0x%p\n"
.L.str.15:
.asciz "C::quz(): this = 0x%p\n"
.L.str.16:
.asciz "B::bar(): this = 0x%p\n"
Program output:
&A::foo->ptr: 0x1, &A::foo->adj: 0x0
&B::bar->ptr: 0x1, &B::bar->adj: 0x0
&C::foo->ptr: 0x1, &C::foo->adj: 0x0
&C::quz->ptr: 0x9, &C::quz->adj: 0x0
&C::bar->ptr: 0x1, &C::bar->adj: 0x8
B::bar(): this = 0x0x7ffcc5066038
pcbar type mangled name: M1CFvvE | pcbar type demangled name: void (C:: *)()
pcbar take : 16 bytes in memory
For background on the RIP-relative addressing used in the assembly, see the following:
%rip-relative addressing
x86-64 code often refers to globals using %rip-relative addressing: a global variable named a is referenced as a(%rip) rather than a.
This style of reference supports position-independent code (PIC), a security feature. It specifically supports position-independent executables (PIEs), which are programs that work independently of where their code is loaded into memory.
To run a conventional program, the operating system loads the program’s instructions into memory at a fixed address that’s the same every time, then starts executing the program at its first instruction. This works great, and runs the program in a predictable execution environment (the addresses of functions and global variables are the same every time). Unfortunately, the very predictability of this environment makes the program easier to attack.
In a position-independent executable, the operating system loads the program at varying locations: every time it runs, the program’s functions and global variables have different addresses. This makes the program harder to attack (though not impossible).
Program startup performance matters, so the operating system doesn’t recompile the program with different addresses each time. Instead, the compiler does most of the work in advance by using relative addressing.
When the operating system loads a PIE, it picks a starting point and loads all instructions and globals relative to that starting point. The PIE’s instructions never refer to global variables using direct addressing: you’ll never see movl global_int, %eax. Globals are referenced relatively instead, using deltas relative to the next %rip: movl global_int(%rip), %eax. These relative addresses work independently of the starting point. For instance, consider an instruction located at (starting-point + 0x80) that loads a variable g located at (starting-point + 0x1000) into %rax. In a non-PIE loaded at 0x400000, the instruction might be written movq 0x401000, %rax (in compiler output, movq g, %rax); but this relies on g having a fixed address. In a PIE, the instruction might be written movq 0xf79(%rip), %rax (in compiler output, movq g(%rip), %rax), which gives the same result no matter the starting point.
Starting point | mov instruction at | Next instruction at | g at     | Delta (g - next %rip)
0x400000       | 0x400080           | 0x400087            | 0x401000 | 0xF79
0x404000       | 0x404080           | 0x404087            | 0x405000 | 0xF79
0x4003F0       | 0x400470           | 0x400477            | 0x4013F0 | 0xF79
From: https://www.cnblogs.com/DesertCactus/p/18442335