概览
类概述
BTreePage
:B+
树节点(叶子节点和内部节点)的公共父类,存储了父节点页号protected int parent
,并且使用protected final int keyField
存储在表记录的哪个字段建立索引。BTreeLeafPage
:B+
树的叶子节点,用来存放具体的表记录。BTreeInternalPage
:B+
树的内部节点,此外,private final Field[] keys
存储了索引的字段,private final int[] children
存储了孩子节点的页号,具体规则为keys[i]
的左孩子对应children[i]
,keys[i]
的右孩子对应children[i+1]
。(孩子节点可能是BTreeLeafPage
或BTreeLeafPage
)BTreeEntry
:对BTreeInternalPage
内部记录的封装BTreeHeaderPage
:作用类似HeapPage
中的header[]
数组,每个正在使用的BTreePage
都有一个对应的BTreeHeaderPage
,这些BTreeHeaderPage
组成链表,标记每一个页中记录的有效情况。BTreeRootPtrPage
:内部包含B+
树根节点的页号以及BTreeHeaderPage
链表中头节点的页号。
注意点
文件被转化为B+
树后,树中的任何节点都是顺序存储在磁盘上的,在查找字段为某个值的记录时,不需要像HeapFile
一样遍历每个页来搜索,而时通过读取文件头部存储的BTreeRootPtrPage
找到根节点位置,再从根节点依次向下索引找到对应的叶子节点,最后从叶子节点页中遍历搜索即可,省下了大量时间。
实验部分
实验1
BTreeFile.findLeafPage()
中Field f
参数不为空时,该方法要求返回可能含有f
的最左叶子节点,当Field f
为空时,返回整个B+
树的最左叶子节点,对索引过程中遍历到的内部节点,只请求读权限,对返回的叶子节点,请求Permissions perm
参数指定的权限。使用BTreeFile.getPage()
方法获取页,它是对BufferPool.getPage()
的封装,使用了Map<PageId, Page> dirtypages
记录了脏页(BufferPool
中得页中数据修改完之后才能将其标记为脏页,dirtypages
则是直接将请求写权限的页加入,相当于比BufferPool
提前标记了脏页)
private BTreeLeafPage findLeafPage(TransactionId tid, Map<PageId, Page> dirtypages, BTreePageId pid, Permissions perm,
Field f)
throws DbException, TransactionAbortedException {
if(f != null){
if(pid.pgcateg() == BTreePageId.LEAF)
return (BTreeLeafPage) getPage(tid,dirtypages,pid,perm);
else {
BTreeInternalPage page = (BTreeInternalPage) getPage(tid, dirtypages, pid, Permissions.READ_ONLY);
Iterator<BTreeEntry> iterator = page.iterator();
BTreeEntry next = null;
while(iterator.hasNext()){
next = iterator.next();
if(f.compare(Op.LESS_THAN_OR_EQ,next.getKey()))
return findLeafPage(tid,dirtypages,next.getLeftChild(),perm,f);
}
// 走到这说明f比该页所有的key都大,只能从最右边的rightChild再找
return findLeafPage(tid,dirtypages,next.getRightChild(),perm,f);
}
}else {
if(pid.pgcateg() == BTreePageId.LEAF)
return (BTreeLeafPage) getPage(tid,dirtypages,pid,perm);
else {
BTreeInternalPage page = (BTreeInternalPage) getPage(tid, dirtypages, pid, Permissions.READ_ONLY);
Iterator<BTreeEntry> iterator = page.iterator();
BTreeEntry next = iterator.next();
return findLeafPage(tid,dirtypages,next.getLeftChild(),perm,null);
}
}
}
实验2
分裂叶子节点时,将新叶子节点的中存储的首条记录“拷贝”到父内部节点中(key
6):
分裂内部节点时,将新页的中存储的首条记录“移动”到父内部节点中(key
6):
叶子节点分裂实现步骤如下:
- 创建新的右叶子页面,将新建叶子节点的
keyFiled
更新为待分裂叶子节点的keyFiled
- 获取待分裂叶子节点的反向迭代器中,反向遍历其中一半的tuple插入新的右叶子页面中
- 根据待分裂的叶子节点是否有右兄弟,将新建叶子节点以合适的方式插入双向链表中
- 通过
getParentWithEmptySlots()
递归获取可插入新记录的父节点 - 将右叶子节点的第一条记录封装成
BTreeEntry
对象插入父节点 - 更新左右叶子节点的父指针
public BTreeLeafPage splitLeafPage(TransactionId tid, Map<PageId, Page> dirtypages, BTreeLeafPage page, Field field)
throws DbException, IOException, TransactionAbortedException {
// 通过getPage方法读取的权限为Permissions.READ_WRITE的页都会加入dirtypages
BTreeLeafPage page1 = (BTreeLeafPage)getPage(tid, dirtypages, page.getId(), Permissions.READ_WRITE);
// 通过getEmptyPage创建的页会上X锁
BTreeLeafPage page2 = (BTreeLeafPage)getEmptyPage(tid, dirtypages, BTreePageId.LEAF);
try {
// 设置page2的keyFiled
java.lang.reflect.Field keyField = BTreePage.class.getDeclaredField("keyField");
keyField.setAccessible(true);
java.lang.reflect.Field modifiers = java.lang.reflect.Field.class.getDeclaredField("modifiers");
modifiers.setAccessible(true);
modifiers.setInt(keyField,keyField.getModifiers()&~Modifier.FINAL);
keyField.setInt(page2,page1.keyField);
} catch (Exception e) {
throw new RuntimeException(e);
}
int page1NumTuples = page1.getNumTuples();
int count = 0;
Iterator<Tuple> reverseIterator = page1.reverseIterator();
while(reverseIterator.hasNext()){
Tuple next = reverseIterator.next();
// BtreeLeafPage的insertTuple方法在插入tuple的时候,就会更新该tuple所属的页信息和序号,
// 这就会导致page1调用deleteTuple的时候发现该tuple不属于本页,从而报错
// page2.insertTuple(next);
// page1.deleteTuple(next);
page1.deleteTuple(next);
page2.insertTuple(next);
count++;
if(count >= page1NumTuples/2)
break;
}
// 插入兄弟指针
if(page1.getRightSiblingId() != null){
BTreeLeafPage page3 = (BTreeLeafPage)getPage(tid, dirtypages, page1.getRightSiblingId(), Permissions.READ_WRITE);
page1.setRightSiblingId(page2.getId());
page2.setLeftSiblingId(page1.getId());
page3.setLeftSiblingId(page2.getId());
page2.setRightSiblingId(page3.getId());
}else {
page1.setRightSiblingId(page2.getId());
page2.setLeftSiblingId(page1.getId());
}
// 如果原来的parent页满了,getParentWithEmptySlots会调用splitInternalPage方法并返回一个新的parent页
// 通过getParentWithEmptySlots获得的页已经上锁了
BTreeInternalPage parent = getParentWithEmptySlots(tid, dirtypages, page1.getParentId(), field);
Tuple first = page2.iterator().next();
parent.insertEntry(new BTreeEntry(first.getField(page2.keyField),page1.getId(),page2.getId()));
// 更新父指针
updateParentPointer(tid,dirtypages,parent.getId(),page1.getId());
updateParentPointer(tid,dirtypages,parent.getId(),page2.getId());
// 返回合适的页
if(field.compare(Op.LESS_THAN_OR_EQ,first.getField(page2.keyField))){
return page1;
}else {
return page2;
}
}
注意,splitLeafPage()
会根据传入的field
参数返回插入的叶子节点,不会真正执行记录插入操作。
内部节点的分裂步骤类似叶子节点,从第4步开始不同:
4. 获取左内部节点最后的BTreeEntry
记录,并从左内部节点删除该记录
5. 设置该记录的左右孩子分别为左右内部节点
6. 更新右内部节点中,所有子页的父节点为该右内部节点(这些子页的父节点一开始是左内部节点)
7. 通过getParentWithEmptySlots()
递归获取可插入新记录的父节点
8. 插入之前获取的左内部节点最后的BTreeEntry
记录
public BTreeInternalPage splitInternalPage(TransactionId tid, Map<PageId, Page> dirtypages,
BTreeInternalPage page, Field field)
throws DbException, IOException, TransactionAbortedException {
// some code goes here
//
// Split the internal page by adding a new page on the right of the existing
// page and moving half of the entries to the new page. Push the middle key up
// into the parent page, and recursively split the parent as needed to accommodate
// the new entry. getParentWithEmtpySlots() will be useful here. Don't forget to update
// the parent pointers of all the children moving to the new page. updateParentPointers()
// will be useful here. Return the page into which an entry with the given key field
// should be inserted.
BTreeInternalPage page1 = (BTreeInternalPage)getPage(tid, dirtypages, page.getId(), Permissions.READ_WRITE);
BTreeInternalPage page2 = (BTreeInternalPage)getEmptyPage(tid, dirtypages, BTreePageId.INTERNAL);
try {
java.lang.reflect.Field keyField = BTreePage.class.getDeclaredField("keyField");
keyField.setAccessible(true);
java.lang.reflect.Field modifiers = java.lang.reflect.Field.class.getDeclaredField("modifiers");
modifiers.setAccessible(true);
modifiers.setInt(keyField,keyField.getModifiers()&~Modifier.FINAL);
keyField.setInt(page2,page1.keyField);
} catch (Exception e) {
throw new RuntimeException(e);
}
int page1NumEntries = page1.getNumEntries();
Iterator<BTreeEntry> reverseIterator = page1.reverseIterator();
int count = 0;
while(reverseIterator.hasNext()){
BTreeEntry next = reverseIterator.next();
page1.deleteKeyAndRightChild(next);
page2.insertEntry(next);
count++;
if(count >= page1NumEntries/2)
break;
}
// 获取要提上parent的Entry
BTreeEntry last = page1.reverseIterator().next();
page1.deleteKeyAndRightChild(last);
last.setLeftChild(page1.getId());
last.setRightChild(page2.getId());
// 因为internal页分裂了,所以原先一半子节点被分到了新的internal页,要更新这些子节点页的父节点
updateParentPointers(tid,dirtypages,page2);
BTreeInternalPage parent = getParentWithEmptySlots(tid, dirtypages, page1.getParentId(), field);
parent.insertEntry(last);
// 因为parent页上有新加入的节点,所以也要更新该节点页的父节点
updateParentPointers(tid,dirtypages,parent);
BTreeEntry first = page2.iterator().next();
if(field.compare(Op.LESS_THAN_OR_EQ,first.getKey())){
return page1;
}else {
return page2;
}
}
实验3
删除记录时,可能会导致该记录所在的节点不满足B+
树的要求,这时要分两种情况,一种是兄弟节点含有多余的记录,将多余记录分给该节点;二是兄弟节点没有多余的记录,要将该节点和兄弟节点合并。
实现第一种情况要完成stealFromLeafPage()
,stealFromLeftInternalPage()
, stealFromRightInternalPage()
三个方法。
叶子节点分配记录的方法stealFromLeafPage()
实现流程:
- 根据
isRightSibling
判断是左兄弟还是右兄弟含有多余的记录,如果是左兄弟含有多余记录,则获取左兄弟的反向迭代器遍历数据插入叶子节点page
中,如果是右兄弟含有多余记录则获取右兄弟的正向迭代器遍历数据插入叶子节点page
中 - 更新父节点中指向这两个叶子节点的
entry
,将其key
值改为叶子节点page
中第一条记录的key
值。
public void stealFromLeafPage(BTreeLeafPage page, BTreeLeafPage sibling,
BTreeInternalPage parent, BTreeEntry entry, boolean isRightSibling) throws DbException {
// 注意,此时的page和sibling都是leaf,它们的父节点为parent不会变
int half = (page.getNumTuples() + sibling.getNumTuples())/2;
if(half < page.getMaxTuples()/2)
throw new RuntimeException("the leaf page does not have enough tuples to steal");
if(isRightSibling){
// 把右兄弟的tuple移动到page中
Iterator<Tuple> iterator = sibling.iterator();
while(page.getNumTuples() < half){
Tuple next = iterator.next();
sibling.deleteTuple(next);
page.insertTuple(next);
}
Tuple first = iterator.next();
entry.setKey(first.getField(sibling.keyField));
parent.updateEntry(entry);
}else {
// 把左兄弟的tuple移动到page中
Iterator<Tuple> reverseIterator = sibling.reverseIterator();
while(page.getNumTuples() < half){
Tuple next = reverseIterator.next();
sibling.deleteTuple(next);
page.insertTuple(next);
}
Tuple first = page.iterator().next();
entry.setKey(first.getField(page.keyField));
parent.updateEntry(entry);
}
}
内部节点分配记录的方法stealFromLeftInternalPage()
实现流程:
public void stealFromLeftInternalPage(TransactionId tid, Map<PageId, Page> dirtypages,
BTreeInternalPage page, BTreeInternalPage leftSibling, BTreeInternalPage parent,
BTreeEntry parentEntry) throws DbException, TransactionAbortedException {
page = (BTreeInternalPage) getPage(tid,dirtypages,page.getId(),Permissions.READ_WRITE);
leftSibling = (BTreeInternalPage) getPage(tid,dirtypages,leftSibling.getId(),Permissions.READ_WRITE);
parent = (BTreeInternalPage) getPage(tid,dirtypages,parent.getId(),Permissions.READ_WRITE);
int half = (page.getNumEntries() + leftSibling.getNumEntries())/2;
if(half < page.getMaxEntries()/2)
throw new RuntimeException("the internal page does not have enough entries to steal");
// 注意,BTreeFile的insertEntry方法要求新加入的entry的左右孩子指针,必须有一个和已存在的孩子指针相同
// 在插入entry时,新entry的key来自parent,左孩子为leftSibling对应位置entry的右孩子,右孩子为page第一个位置entry的左孩子
Iterator<BTreeEntry> reverseIterator = leftSibling.reverseIterator();
while(page.getNumEntries() < half){
BTreeEntry next = reverseIterator.next();
leftSibling.deleteKeyAndRightChild(next);
BTreeEntry bTreeEntry = new BTreeEntry(parentEntry.getKey(),
next.getRightChild(),
page.iterator().next().getLeftChild());
page.insertEntry(bTreeEntry);
parentEntry.setKey(next.getKey());
}
// 虽然parentEntry更新了,但是parent节点内的信息未更新,要调用updateEntry方法
parent.updateEntry(parentEntry);
// 因为有entry从左兄弟移动到了右兄弟,要更新这部分entry的父指针
updateParentPointers(tid,dirtypages,page);
}
实验4
merge
是split
的逆过程,注意deleteParentEntry
内部递归删除父节点中相应的entry
记录
public void mergeLeafPages(TransactionId tid, Map<PageId, Page> dirtypages,
BTreeLeafPage leftPage, BTreeLeafPage rightPage, BTreeInternalPage parent, BTreeEntry parentEntry)
throws DbException, IOException, TransactionAbortedException {
// Move all the tuples from the right page to the left page, update
// the sibling pointers, and make the right page available for reuse.
// Delete the entry in the parent corresponding to the two pages that are merging -
// deleteParentEntry() will be useful here
leftPage = (BTreeLeafPage) getPage(tid,dirtypages,leftPage.getId(),Permissions.READ_WRITE);
rightPage = (BTreeLeafPage) getPage(tid,dirtypages,rightPage.getId(),Permissions.READ_WRITE);
parent = (BTreeInternalPage) getPage(tid,dirtypages,parent.getId(),Permissions.READ_WRITE);
if(leftPage.getNumTuples() + rightPage.getNumTuples() > leftPage.getMaxTuples())
throw new RuntimeException("the leaf page can not hold those tuples");
// 从右leaf将tuple移动到左leaf
Iterator<Tuple> iterator = rightPage.iterator();
while(rightPage.getNumTuples() > 0){
Tuple next = iterator.next();
rightPage.deleteTuple(next);
leftPage.insertTuple(next);
}
// 设置新的左右兄弟
if(rightPage.getRightSiblingId() != null){
BTreeLeafPage rightSibling = (BTreeLeafPage) getPage(tid,dirtypages,rightPage.getRightSiblingId(),Permissions.READ_WRITE);
leftPage.setRightSiblingId(rightSibling.getId());
rightSibling.setLeftSiblingId(leftPage.getId());
}else {
leftPage.setRightSiblingId(null);
}
// 删除右leaf
setEmptyPage(tid,dirtypages,rightPage.getId().getPageNumber());
deleteParentEntry(tid,dirtypages,leftPage,parent,parentEntry);
}
public void mergeInternalPages(TransactionId tid, Map<PageId, Page> dirtypages,
BTreeInternalPage leftPage, BTreeInternalPage rightPage, BTreeInternalPage parent, BTreeEntry parentEntry)
throws DbException, IOException, TransactionAbortedException {
leftPage = (BTreeInternalPage) getPage(tid,dirtypages,leftPage.getId(),Permissions.READ_WRITE);
rightPage = (BTreeInternalPage) getPage(tid,dirtypages,rightPage.getId(),Permissions.READ_WRITE);
parent = (BTreeInternalPage) getPage(tid,dirtypages,parent.getId(),Permissions.READ_WRITE);
BTreeEntry bTreeEntry = new BTreeEntry(parentEntry.getKey(),
leftPage.reverseIterator().next().getRightChild(),
rightPage.iterator().next().getLeftChild());
leftPage.insertEntry(bTreeEntry);
Iterator<BTreeEntry> iterator = rightPage.iterator();
while (iterator.hasNext()){
BTreeEntry next = iterator.next();
rightPage.deleteKeyAndLeftChild(next);
leftPage.insertEntry(next);
}
updateParentPointers(tid,dirtypages,leftPage);
setEmptyPage(tid,dirtypages,rightPage.getId().getPageNumber());
deleteParentEntry(tid,dirtypages,leftPage,parent,parentEntry);
}
标签:dirtypages,parent,节点,next,Lab5,tid,MIT6.830,page
From: https://www.cnblogs.com/rockdow/p/18048627