
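The isa pointer diagram

Before getting into the underlying structure of an OC Class, take a look at the diagram below:

[Figure: isa pointer diagram]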

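The conclusions from the isa diagram are:
1. Classes, superclasses, and metaclasses all contain isa and superclass.

2. An instance's isa points to its class object, a class object's isa points to its metaclass, every metaclass's isa points to the root metaclass, and the root metaclass's isa points to itself.

3. A class's superclass points to its parent class, the chain continues up to the root class, and the root class's superclass is nil.

4. A metaclass's superclass points to its parent's metaclass, the chain continues up to the root metaclass, the root metaclass's superclass points to the root class, and the root class's superclass is nil.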
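
The four rules above can be checked with a small model. This is only a sketch in plain C++: Node, Hierarchy, and superclassDepth are hypothetical names made up for illustration, not runtime types, and the hierarchy is reduced to one root class and one subclass.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for a class or metaclass object; not the runtime's objc_class.
struct Node {
    std::string name;
    Node* isa = nullptr;         // what this object is an instance of
    Node* superclass = nullptr;  // parent in the inheritance chain
    explicit Node(std::string n) : name(std::move(n)) {}
};

// A tiny hierarchy: Root <- Child, their metaclasses, and one instance of Child.
struct Hierarchy {
    Node rootMeta{"Root(meta)"};
    Node root{"Root"};
    Node childMeta{"Child(meta)"};
    Node child{"Child"};
    Node instance{"a Child instance"};

    Hierarchy() {
        // Rule 2: instance -> class -> metaclass -> root metaclass -> itself.
        instance.isa = &child;
        child.isa = &childMeta;
        root.isa = &rootMeta;
        childMeta.isa = &rootMeta;
        rootMeta.isa = &rootMeta;

        // Rule 3: class -> parent class -> ... -> root class -> nil.
        child.superclass = &root;
        root.superclass = nullptr;

        // Rule 4: metaclass -> parent metaclass -> root metaclass -> root class.
        childMeta.superclass = &rootMeta;
        rootMeta.superclass = &root;
    }
};

// Counts superclass hops from n until nil is reached.
int superclassDepth(const Node* n) {
    int hops = 0;
    while (n->superclass) { n = n->superclass; ++hops; }
    return hops;
}
```

Walking superclass from the Child metaclass passes through the root metaclass and then the root class before reaching nil, which is exactly the detour rule 4 describes.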

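With isa and superclass refreshed, the next question is what data structures sit underneath classes, class objects, and metaclass objects. That is what I want to explore, so here is the source code to analyze: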

struct objc_class

struct objc_class : objc_object {
    // Class ISA;
    Class superclass; 
    cache_t cache;             // formerly cache pointer and vtable
    class_data_bits_t bits;  
    class_rw_t *data() const {
        return bits.data();
    }
    const class_ro_t *safe_ro() const {
        return bits.safe_ro();
    }
}

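The source shows no isa member. That is because objc_class inherits from objc_object, and objc_object has an isa. At runtime a class's isa points to its metaclass and its superclass points to its parent class; the isa and superclass diagram above shows these relationships.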

cache_t & class_data_bits_t

cache is the method cache. It stores frequently called methods so the next call can find them directly, which speeds up lookup.
Its structure:

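struct cache_t {
	struct bucket_t *buckets() const; // hash table that stores the cached methods
	mask_t mask() const;              // hash table capacity minus 1
	mask_t occupied() const;          // number of methods currently cached
};

struct class_data_bits_t {
    class_rw_t* data() const; // the class's information
};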

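bits stores the class's detailed information. The address of that information is obtained by computing bits & FAST_DATA_MASK. The source:

[Figure: the FAST_DATA_MASK mask value]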

bool has_rw_pointer() const {
	#if FAST_IS_RW_POINTER
	        return (bool)(bits & FAST_IS_RW_POINTER);
	#else
	        class_rw_t *maybe_rw = (class_rw_t *)(bits & FAST_DATA_MASK);
	        return maybe_rw && (bool)(maybe_rw->flags & RW_REALIZED);
	#endif
}

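The source confirms that the class's stored information is obtained through this computation. So why do it this way?
For example, to get the class information stored in class_rw_t, I only need to apply the FAST_DATA_MASK mask to bits to get its address, and through that address I can read all of the class's stored information from memory.

Applying a different mask with & pulls a different piece of data out of the same word. Apple's engineers thought this through: isa works the same way, and many other places in the runtime extract values like this, because packing several fields into one word keeps storage compact and makes extraction cheap.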
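
To make the masking idea concrete, here is a minimal sketch in plain C++. The mask value 0x00007ffffffffff8 and the two flag bits are assumptions chosen for illustration (the real values depend on the objc4 version and architecture), and packBits, dataAddress, and hasFlag are hypothetical helpers, not runtime functions. Addresses are modeled as plain integers so nothing is dereferenced.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout for illustration: low 3 bits are flags, bits 3..46 hold an address.
constexpr uint64_t kDataMask   = 0x00007ffffffffff8ULL;
constexpr uint64_t kFlagSwift  = 1ULL << 0;  // hypothetical flag
constexpr uint64_t kFlagCustom = 1ULL << 2;  // hypothetical flag

// Packs an 8-byte-aligned address together with flag bits into one word.
uint64_t packBits(uint64_t address, uint64_t flags) {
    return (address & kDataMask) | (flags & ~kDataMask);
}

// Recovers the address: the same idea as bits & FAST_DATA_MASK.
uint64_t dataAddress(uint64_t bits) { return bits & kDataMask; }

// Tests one flag without disturbing the address bits.
bool hasFlag(uint64_t bits, uint64_t flag) { return (bits & flag) != 0; }
```

Because the address is 8-byte aligned, its low three bits are always zero, so they are free to carry flags without any loss of information.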

class_rw_t, class_ro_t, class_rw_ext_t

struct class_rw_t {
     const class_ro_t *ro() const ;
     const method_array_t methods() const ; // for a class object: instance methods; for a metaclass: class methods
     
     const property_array_t properties() const;
     const protocol_array_t protocols() const;
     class_rw_ext_t *ext() const;
}
struct class_rw_ext_t {
    method_array_t methods;
    property_array_t properties;
    protocol_array_t protocols;
    uint32_t version;
}

So the class's information is stored in class_rw_t, class_ro_t, and class_rw_ext_t.

Dissecting class_rw_t
First look at the source of method_array_t, property_array_t, and protocol_array_t:

class property_array_t : 
    public list_array_tt<property_t, property_list_t, RawPtr>
{
    typedef list_array_tt<property_t, property_list_t, RawPtr> Super;

 public:
    property_array_t() : Super() { }
    property_array_t(property_list_t *l) : Super(l) { }
};


class protocol_array_t : 
    public list_array_tt<protocol_ref_t, protocol_list_t, RawPtr>
{
    typedef list_array_tt<protocol_ref_t, protocol_list_t, RawPtr> Super;

 public:
    protocol_array_t() : Super() { }
    protocol_array_t(protocol_list_t *l) : Super(l) { }
};

All of them inherit from list_array_tt. So what is list_array_tt, and what does its data structure look like? Let's go find it. The source:

template <typename Element, typename List, template<typename> class Ptr>
class list_array_tt {
 protected:
    template <bool authenticated>
    class iteratorImpl {
        const Ptr<List> *lists;
        const Ptr<List> *listsEnd;
    }
        
    using iterator = iteratorImpl<false>;
    using signedIterator = iteratorImpl<true>;

 public:
    list_array_tt() : list(nullptr) { }
    list_array_tt(List *l) : list(l) { }
    list_array_tt(const list_array_tt &other) {
        *this = other;
    }

    void attachLists(List* const * addedLists, uint32_t addedCount) {
        if (addedCount == 0) return;

        if (hasArray()) {
            // many lists -> many lists
            uint32_t oldCount = array()->count;
            uint32_t newCount = oldCount + addedCount;
            array_t *newArray =(array_t*)malloc(array_t::byteSize(newCount));
            newArray->count = newCount;
            array()->count = newCount;

            for (int i = oldCount - 1; i >= 0; i--)
                newArray->lists[i + addedCount] = array()->lists[i];
            for (unsigned i = 0; i < addedCount; i++)
                newArray->lists[i] = addedLists[i];
            free(array());
            setArray(newArray);
            validate();
        }
        else if (!list  &&  addedCount == 1) {
            // 0 lists -> 1 list
            list = addedLists[0];
            validate();
        } 
        else {
            // 1 list -> many lists
            Ptr<List> oldList = list;
            uint32_t oldCount = oldList ? 1 : 0;
            uint32_t newCount = oldCount + addedCount;
            setArray((array_t *)malloc(array_t::byteSize(newCount)));
            array()->count = newCount;
            if (oldList) array()->lists[addedCount] = oldList;
            for (unsigned i = 0; i < addedCount; i++)
                array()->lists[i] = addedLists[i];
            validate();
        }
    }
    
}

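I pulled out the main part. The purpose of attachLists is to attach one or more lists (of type List) to a list_array_tt object. The object can hold zero, one, or many lists, stored either as a single list pointer or as an array of list pointers. The parameters are addedLists, a pointer to an array of List pointers, and addedCount, an unsigned integer giving the number of lists to add.

From this I infer that it is effectively a two-dimensional array: an array of lists, each of which holds elements. So the methods, properties, and protocols members of class_rw_t use this two-dimensional layout to store the class's methods, protocols, and so on, and they are readable and writable.

What is the benefit of this design? Lists can be added to the array at runtime, which makes it convenient to store category methods once they are loaded. As the code shows, newly attached lists are placed at the front, ahead of the existing ones.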
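
The ordering that attachLists produces can be sketched with standard containers. This is a simplification, not the runtime's code: ListArray and firstListContaining are hypothetical names, std::vector replaces the manual malloc and pointer handling, and method names are plain strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using MethodList = std::vector<std::string>;  // one list of method names

// Simplified stand-in for list_array_tt: an array of lists, newest first.
struct ListArray {
    std::vector<MethodList> lists;

    // Inserts the added lists at the front, keeping existing lists after them.
    void attachLists(const std::vector<MethodList>& added) {
        lists.insert(lists.begin(), added.begin(), added.end());
    }

    // Returns the index of the first list containing name, or -1 if none does.
    int firstListContaining(const std::string& name) const {
        for (std::size_t i = 0; i < lists.size(); ++i)
            for (const auto& m : lists[i])
                if (m == name) return static_cast<int>(i);
        return -1;
    }
};
```

Since a front-to-back search reaches the most recently attached list first, a method defined in a later-attached list is found before a same-named method in the original list.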

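With the important data members of class_rw_t sorted out, what is class_rw_t actually for?

Judging from its definition, at runtime the information from OC classes and their categories is written into class_rw_t. When a method or protocol of the class is used, it is read from there, and frequently called methods are additionally stored in cache_t. Apple clearly put a great deal of care into how OC classes are handled.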

struct class_ro_t

class_rw_t contains a class_ro_t, so let's explore what that one does. Its source:

struct class_ro_t {
    WrappedPtr<method_list_t, method_list_t::Ptrauth> baseMethods;
    protocol_list_t * baseProtocols;
    const ivar_list_t * ivars;
    property_list_t *baseProperties;
}

Start with ivars. The source of the struct it points to:

struct ivar_list_t : entsize_list_tt<ivar_t, ivar_list_t, 0> {
    bool containsIvar(Ivar ivar) const {
        return (ivar >= (Ivar)&*begin()  &&  ivar < (Ivar)&*end());
    }
};

This appears to do little more than inherit from entsize_list_tt, so let's look at that source:

struct entsize_list_tt {
    uint32_t entsizeAndFlags;
    uint32_t count;
     struct iteratorImpl {
     uint32_t entsize;
        uint32_t index;  // keeping track of this saves a divide in operator-

        using ElementPtr = std::conditional_t<authenticated, Element * __ptrauth(ptrauth_key_process_dependent_data, 1, 0xdead), Element *>;

        ElementPtr element;

        typedef std::random_access_iterator_tag iterator_category;
        typedef Element value_type;
        typedef ptrdiff_t difference_type;
        typedef Element* pointer;
        typedef Element& reference;

        iteratorImpl() { }

        iteratorImpl(const List& list, uint32_t start = 0)
            : entsize(list.entsize())
            , index(start)
            , element(&list.getOrEnd(start))
        { }
     }
}

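This code defines the struct entsize_list_tt, which contains a nested struct iteratorImpl that implements an iterator, i.e. an object for traversing a container such as a list or array.

So ivars points to an ivar_list_t, a list that stores the class's instance variable information. protocol_list_t is likewise built as an array internally.

baseProtocols and baseProperties can only be read, not written.

In summary: judging from its definition, class_ro_t stores the class's instance variables, methods, and protocols. It is the class's read-only data and holds the class's initial information.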

class_rw_ext_t

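I won't go into much detail on this struct. In short, it was split out of class_rw_t to hold the members that can change at runtime (for example when a category's methods are attached): the method_array_t, property_array_t, and protocol_array_t members, plus version. A class that is never modified does not need one.

That covers the class structure and the information it stores. Here is a mind map of how the structures relate:

[Figure: mind map of the related structures]

Summary: at compile time, the class's initial information is placed in class_ro_t. At runtime, when the class's information is brought together, the runtime merges the class_ro_t information into the class_rw_t structure.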

struct method_t

Why talk about method_t? Because it is used not only in class_ro_t but elsewhere in the OC runtime as well, for example in the following source:

void method_exchangeImplementations(Method m1Signed, Method m2Signed)
{
    if (!m1Signed  ||  !m2Signed) return;

    method_t *m1 = _method_auth(m1Signed);
    method_t *m2 = _method_auth(m2Signed);

    mutex_locker_t lock(runtimeLock);

    IMP imp1 = m1->imp(false);
    IMP imp2 = m2->imp(false);
    SEL sel1 = m1->name();
    SEL sel2 = m2->name();

    m1->setImp(imp2);
    m2->setImp(imp1);


    // RR/AWZ updates are slow because class is unknown
    // Cache updates are slow because class is unknown
    // fixme build list of classes whose Methods are known externally?

    flushCaches(nil, __func__, [sel1, sel2, imp1, imp2](Class c){
        return c->cache.shouldFlush(sel1, imp1) || c->cache.shouldFlush(sel2, imp2);
    });

    adjustCustomFlagsForMethodChange(nil, m1);
    adjustCustomFlagsForMethodChange(nil, m2);
}

static IMP
_method_setImplementation(Class cls, method_t *m, IMP imp)
{
    lockdebug::assert_locked(&runtimeLock);

    if (!m) return nil;
    if (!imp) return nil;

    IMP old = m->imp(false);
    SEL sel = m->name();

    m->setImp(imp);

    // Cache updates are slow if cls is nil (i.e. unknown)
    // RR/AWZ updates are slow if cls is nil (i.e. unknown)
    // fixme build list of classes whose Methods are known externally?

    flushCaches(cls, __func__, [sel, old](Class c){
        return c->cache.shouldFlush(sel, old);
    });

    adjustCustomFlagsForMethodChange(cls, m);

    return old;
}

Both exchanging methods and setting an implementation use it under the hood. Let's explore, starting with the method_t source:

struct method_t {

    // The representation of a "big" method. This is the traditional
    // representation of three pointers storing the selector, types
    // and implementation.
    struct big {
        SEL name;
        const char *types;
        MethodListIMP imp;
    };

    // A "big" method, but name is signed. Used for method lists created at runtime.
    struct bigSigned {
        SEL __ptrauth_objc_sel name;
        const char * ptrauth_method_list_types types;
        MethodListIMP imp;
    };

    // ***HACK: This is a TEMPORARY HACK FOR EXCLAVEKIT. It MUST go away.
    // rdar://96885136 (Disallow insecure un-signed big method lists for ExclaveKit)
#if TARGET_OS_EXCLAVEKIT
    struct bigStripped {
        SEL name;
        const char *types;
        MethodListIMP imp;
    };
#endif

}

This struct nests several other structs. Simplified:

struct method_t {
    SEL name;          // method name (selector)
    const char *types; // string encoding the return type and parameter types
    MethodListIMP imp; // function pointer (address of the implementation)
};

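SEL: the method name (selector); there is nothing special about it.

Characteristics:
1. Obtained with @selector() or sel_registerName()
2. Converted to a string with sel_getName() or NSStringFromSelector()
3. Methods with the same name in different classes have the same selector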
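
Point 3 follows from selectors being unique per name. A rough way to picture it, assuming a simple interning table rather than the runtime's actual selector storage, is below; Selector, registerName, and selectorName are hypothetical names for this sketch only.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical selector handle: a pointer to the single interned copy of a name.
using Selector = const std::string*;

// Returns the same pointer for equal names, registering the name on first use.
Selector registerName(const std::string& name) {
    static std::unordered_set<std::string> table;
    return &*table.insert(name).first;
}

// Converts a selector back to its string form.
const std::string& selectorName(Selector sel) { return *sel; }
```

With one shared entry per name, comparing two selectors is a pointer comparison, no matter which class declared the method.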

Underlying definition:

/// An opaque type that represents a method selector.
typedef struct objc_selector *SEL;

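types: a string that encodes the method's return type and parameters

[Figure: screenshots showing the types value]

The value of types here is v16@0:8. Note that name, types, and IMP all live in class_ro_t, which supports the earlier point: at runtime, class_ro_t holds the class's initial state.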

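Breaking down v16@0:8:

v: the return type, here void

16: the total size in bytes of all arguments

@: the first argument, of type id (self)

0: the byte offset of the first argument

:: the second argument, of type SEL (_cmd)

8: the byte offset of the second argument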
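
The breakdown can be reproduced mechanically. The parser below is a minimal sketch that only handles encodings built from single-character type codes (it would not cope with structs, pointers, or qualifiers), and EncodedArg, EncodedSignature, readNumber, and parseEncoding are names invented for this example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

struct EncodedArg {
    char type;   // single-character type code, e.g. '@' or ':'
    int offset;  // byte offset of the argument
};

struct EncodedSignature {
    char returnType = 0;
    int totalSize = 0;  // total bytes occupied by all arguments
    std::vector<EncodedArg> args;
};

// Reads a run of decimal digits starting at pos and advances pos past it.
int readNumber(const std::string& s, std::size_t& pos) {
    int value = 0;
    while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos])))
        value = value * 10 + (s[pos++] - '0');
    return value;
}

// Parses encodings made only of single-character type codes, such as "v16@0:8".
EncodedSignature parseEncoding(const std::string& enc) {
    EncodedSignature sig;
    if (enc.empty()) return sig;
    std::size_t pos = 0;
    sig.returnType = enc[pos++];           // first character: return type
    sig.totalSize = readNumber(enc, pos);  // following digits: total argument size
    while (pos < enc.size()) {
        EncodedArg arg;
        arg.type = enc[pos++];             // type code of the next argument
        arg.offset = readNumber(enc, pos); // digits after it: that argument's offset
        sig.args.push_back(arg);
    }
    return sig;
}
```

Parsing "v16@0:8" this way yields a void return type, a total size of 16, and two arguments: '@' at offset 0 and ':' at offset 8, which are the implicit self and _cmd.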

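So what exactly is this types string? After looking it up, it is called Type Encoding.
To verify this, use the following code:
[Figure: code screenshot]

The type encodings table from Apple's official documentation:
[Figure: type encodings table]

IMP is simply a pointer to a function, so it needs no further discussion.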

struct cache_t

cache_t is the class's method cache: it caches the methods the class calls often to speed up lookup, as already covered above. Next, let's look at bucket_t.

struct bucket_t

struct bucket_t {
	cache_key_t _key; // key derived from the selector (method name)
	IMP _imp;         // address of the method implementation
};

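Underneath, this hash table is represented as an array:

[Figure: hash table layout]

Internally it is a one-dimensional array. Does that mean a lookup loops over the whole array? No. To find an entry it computes selector & mask, where mask is the mask value from cache_t. The result is the method's index in the hash table, and the function address stored at that index is then used for the call. If a different selector already occupies that index, neighboring slots are checked in turn.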
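
Here is a minimal sketch of that index computation in plain C++. It is not the runtime's implementation: MethodCache, Bucket, Key, and Imp are stand-ins, the table never grows, keys are plain integers in place of selector addresses, and collisions step forward to the next slot, which is a simplification of whatever probing order a given objc4 build uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Key = std::uintptr_t;  // stands in for the selector's address
using Imp = int (*)();       // stands in for IMP

struct Bucket {
    Key key = 0;        // 0 marks an empty slot
    Imp imp = nullptr;
};

// Fixed-capacity sketch; capacity must be a power of two so that mask = capacity - 1.
struct MethodCache {
    std::vector<Bucket> buckets;
    std::size_t mask;
    std::size_t occupied = 0;

    explicit MethodCache(std::size_t capacity)
        : buckets(capacity), mask(capacity - 1) {}

    // Stores imp under key, stepping to the next slot on a collision.
    // Returns false when no free slot is left (this sketch does not grow).
    bool insert(Key key, Imp imp) {
        std::size_t i = key & mask;
        for (std::size_t n = 0; n < buckets.size(); ++n) {
            if (buckets[i].key == 0 || buckets[i].key == key) {
                if (buckets[i].key == 0) ++occupied;
                buckets[i].key = key;
                buckets[i].imp = imp;
                return true;
            }
            i = (i + 1) & mask;
        }
        return false;
    }

    // Starts at key & mask and probes until the key or an empty slot is found.
    Imp find(Key key) const {
        std::size_t i = key & mask;
        for (std::size_t n = 0; n < buckets.size(); ++n) {
            if (buckets[i].key == key) return buckets[i].imp;
            if (buckets[i].key == 0) return nullptr;
            i = (i + 1) & mask;
        }
        return nullptr;
    }
};
```

With a capacity of 4 the mask is 3, so keys 0x10 and 0x20 both map to index 0; the second one lands in index 1 and is still found without scanning the rest of the table.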

Here is an example:

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        Student *stu = [Student new];
        [stu test];
        [stu test];
        [stu test];
        [stu test];
    }
    return 0;
}

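In the code above, the first call looks the method up in the class object and, when it is executed, places it in the cache_t cache. The second, third, and fourth calls find it in the cache.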
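
The miss-then-hit behavior can be simulated in a few lines. This is a sketch under loud assumptions: FakeClass and lookUp are hypothetical, std::unordered_map replaces both the method lists and cache_t, and the slowLookups counter exists only to make the effect observable.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

using Imp = int (*)();  // stands in for IMP

// Hypothetical class object: a method table, a cache, and a superclass link.
struct FakeClass {
    std::unordered_map<std::string, Imp> methods;  // stands in for the method lists
    std::unordered_map<std::string, Imp> cache;    // stands in for cache_t
    FakeClass* superclass = nullptr;
    int slowLookups = 0;  // how often the method tables had to be searched
};

// Finds the implementation for name: cache first, then the class chain.
Imp lookUp(FakeClass& cls, const std::string& name) {
    auto hit = cls.cache.find(name);
    if (hit != cls.cache.end()) return hit->second;  // fast path: cache hit

    ++cls.slowLookups;  // slow path: search this class, then its superclasses
    for (FakeClass* c = &cls; c != nullptr; c = c->superclass) {
        auto it = c->methods.find(name);
        if (it != c->methods.end()) {
            cls.cache[name] = it->second;  // remember it in the receiver's cache
            return it->second;
        }
    }
    return nullptr;  // not found anywhere in the chain
}
```

Calling lookUp four times for the same name increments slowLookups only once, mirroring the four [stu test] calls: one full lookup followed by three cache hits.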
