MIT6.830-Lab2

Tags: null hasNext opIterators tuple next Lab2 new MIT6.830

Lab work

Exercise 1

Predicate class

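Holds the information needed to filter the records of a table: the index of the field to filter on, the comparison rule, and the operand field to compare against. Its nested enum Op defines the comparison rules. filter() can be implemented simply by calling compare() from the Field interface.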
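To make the description concrete, here is a minimal, self-contained sketch of a predicate that delegates to a field's compare(). It does not use the real SimpleDB classes: CmpOp, IntValue, and SimplePredicate are hypothetical stand-ins for Predicate.Op, IntField, and Predicate, and only three operators are modeled.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for Predicate.Op (only three rules are modeled).
enum CmpOp { EQUALS, GREATER_THAN, LESS_THAN }

// Hypothetical stand-in for IntField.
final class IntValue {
    final int value;

    IntValue(int value) {
        this.value = value;
    }

    // Same idea as Field.compare(op, other): evaluates "this op other".
    boolean compare(CmpOp op, IntValue other) {
        switch (op) {
            case EQUALS:       return value == other.value;
            case GREATER_THAN: return value > other.value;
            case LESS_THAN:    return value < other.value;
            default: throw new IllegalArgumentException("unsupported op: " + op);
        }
    }
}

// Hypothetical stand-in for Predicate: field index, comparison rule, operand.
final class SimplePredicate {
    private final int fieldIndex;
    private final CmpOp op;
    private final IntValue operand;

    SimplePredicate(int fieldIndex, CmpOp op, IntValue operand) {
        this.fieldIndex = fieldIndex;
        this.op = op;
        this.operand = operand;
    }

    // A row passes when row[fieldIndex] op operand holds.
    boolean filter(List<IntValue> row) {
        return row.get(fieldIndex).compare(op, operand);
    }
}

class PredicateSketch {
    public static void main(String[] args) {
        List<IntValue> row = Arrays.asList(new IntValue(7), new IntValue(42));
        SimplePredicate p = new SimplePredicate(1, CmpOp.GREATER_THAN, new IntValue(10));
        System.out.println(p.filter(row));
    }
}
```

The predicate itself stays tiny because all comparison logic lives in the field type, which is the same division of labor the lab code relies on.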

JoinPredicate class

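Holds the information needed to join the records of two tables: the index of the join field in each table and the join rule. Its implementation is essentially the same as Predicate.

Filter class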

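Performs the actual filtering. It holds a Predicate object and an OpIterator array (OpIterator is the interface every operator class implements so that its result set can be iterated; here the array element refers to a SeqScan instance). To simplify the code it extends Operator, an abstract class that implements OpIterator's next() and hasNext() methods.
The key part is fetchNext(): it pulls records one at a time from the iterator in the OpIterator array and uses the Predicate object to decide whether each record qualifies: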

protected Tuple fetchNext() throws NoSuchElementException,
            TransactionAbortedException, DbException {
        // Pull tuples from the child until one satisfies the predicate.
        while (opIterator[0].hasNext()) {
            Tuple next = opIterator[0].next();
            if (next == null) {
                return null;
            }
            if (predicate.filter(next)) {
                return next;
            }
        }
        // Child exhausted without a match: return null so that the last
        // non-matching tuple is never handed back by mistake.
        return null;
    }
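The following sketch shows, under simplifying assumptions, why implementing only fetchNext() is enough: a base class can derive hasNext() and next() from it with one item of lookahead. LookaheadIterator and FilterIterator are hypothetical names that mimic the roles of Operator and Filter, and the sketch uses plain java.util.Iterator and java.util.function.Predicate rather than the SimpleDB types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Hypothetical simplification of the Operator pattern: subclasses supply only
// fetchNext(); hasNext()/next() are derived with one item of lookahead.
// Elements must be non-null, because null marks the end of the stream.
abstract class LookaheadIterator<T> implements Iterator<T> {
    private T buffered; // fetched by hasNext() but not yet returned by next()

    // Returns the next item, or null when the stream is exhausted.
    protected abstract T fetchNext();

    @Override
    public boolean hasNext() {
        if (buffered == null) {
            buffered = fetchNext();
        }
        return buffered != null;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        T result = buffered;
        buffered = null;
        return result;
    }
}

// Hypothetical counterpart of Filter: skips child items that fail the test.
final class FilterIterator<T> extends LookaheadIterator<T> {
    private final Iterator<T> child;
    private final Predicate<T> predicate;

    FilterIterator(Iterator<T> child, Predicate<T> predicate) {
        this.child = child;
        this.predicate = predicate;
    }

    @Override
    protected T fetchNext() {
        while (child.hasNext()) {
            T candidate = child.next();
            if (predicate.test(candidate)) {
                return candidate;
            }
        }
        return null;
    }
}

class FilterSketch {
    // Collects the even numbers of the input through the filter iterator.
    static List<Integer> evens(List<Integer> input) {
        Iterator<Integer> it = new FilterIterator<Integer>(input.iterator(), n -> n % 2 == 0);
        List<Integer> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(evens(Arrays.asList(1, 2, 3, 4, 5, 6)));
    }
}
```

Because hasNext() buffers the fetched item, calling it several times in a row does not skip any element.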

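Join class

Join is similar to Filter, and again the key part is fetchNext(). I implemented it as a simple nested-loop join. The left variable holds the current record of the left table, right holds the current record of the right table, and leftThreadLocal remembers which left record the scan has reached. With a nested loop, one left record is taken as the base and the right table is searched for records that satisfy the join condition. As soon as one such right record is found, fetchNext() has to return, but the right table has most likely not been fully scanned yet, so the current left record must be saved and reused to scan the rest of the right table.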

protected Tuple fetchNext() throws TransactionAbortedException, DbException {
        Tuple left = null;
        Tuple right = null;
        // The condition also checks opIterators[1].hasNext() so that, after the last left
        // tuple has been taken from opIterators[0], the right child is still scanned for it.
        outer:
        while (opIterators[0].hasNext() || opIterators[1].hasNext()) {
            // Load a new left tuple on the first call, or once the right child is exhausted.
            if (!initial || !opIterators[1].hasNext()) {
                if (!opIterators[0].hasNext()) {
                    // Empty left child: there is nothing to join, and calling next() would fail.
                    left = null;
                    break;
                }
                initial = true;
                if (!opIterators[1].hasNext())
                    opIterators[1].rewind();
                leftThreadLocal.set(opIterators[0].next());
            }
            left = leftThreadLocal.get();
            right = null;
            if (left != null) {
                while (opIterators[1].hasNext()) {
                    right = opIterators[1].next();
                    if (right != null && joinPredicate.filter(left, right)) {
                        break outer;
                    }
                    right = null;
                }
            }
        }
        if (left == null || right == null) {
            return null;
        }
        // Concatenate the fields of the left tuple and the right tuple.
        Tuple tuple = new Tuple(this.getTupleDesc());
        Iterator<Field> lIter = left.fields();
        Iterator<Field> rIter = right.fields();
        int idx = 0;
        while (lIter.hasNext()) {
            tuple.setField(idx++, lIter.next());
        }
        while (rIter.hasNext()) {
            tuple.setField(idx++, rIter.next());
        }
        return tuple;
    }
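The resumable nested loop can be hard to follow with iterators and rewind() in the way, so here is a stripped-down sketch of the same idea over in-memory lists. It is an illustration only: ResumableNestedLoopJoin is a hypothetical class, rows are plain int[] arrays, and two integer positions play the role that leftThreadLocal and the right child's iterator state play in the code above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical sketch: a nested-loop join that returns one match per call and
// remembers where it stopped, so the next call resumes inside the right table.
final class ResumableNestedLoopJoin {
    private final List<int[]> left;
    private final List<int[]> right;
    private final BiPredicate<int[], int[]> condition;
    private int leftPos = 0;  // index of the current left row
    private int rightPos = 0; // index of the next right row to examine

    ResumableNestedLoopJoin(List<int[]> left, List<int[]> right,
                            BiPredicate<int[], int[]> condition) {
        this.left = left;
        this.right = right;
        this.condition = condition;
    }

    // Returns the next joined row (left fields, then right fields), or null when done.
    int[] fetchNext() {
        while (leftPos < left.size()) {
            int[] l = left.get(leftPos);
            while (rightPos < right.size()) {
                int[] r = right.get(rightPos++);
                if (condition.test(l, r)) {
                    int[] joined = Arrays.copyOf(l, l.length + r.length);
                    System.arraycopy(r, 0, joined, l.length, r.length);
                    return joined;
                }
            }
            // Right side exhausted for this left row: "rewind" it and advance the left side.
            rightPos = 0;
            leftPos++;
        }
        return null;
    }
}

class JoinSketch {
    // Joins on equality of the first column and renders each result row as text.
    static List<String> joinAll(List<int[]> left, List<int[]> right) {
        ResumableNestedLoopJoin join =
                new ResumableNestedLoopJoin(left, right, (l, r) -> l[0] == r[0]);
        List<String> out = new ArrayList<>();
        for (int[] row = join.fetchNext(); row != null; row = join.fetchNext()) {
            out.add(Arrays.toString(row));
        }
        return out;
    }

    public static void main(String[] args) {
        List<int[]> left = Arrays.asList(new int[]{1, 10}, new int[]{2, 20});
        List<int[]> right = Arrays.asList(new int[]{2, 200}, new int[]{1, 100}, new int[]{1, 101});
        System.out.println(joinAll(left, right));
    }
}
```

Keeping both positions as fields is what lets a single left row produce several output rows across successive fetchNext() calls.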

Exercise 2

Aggregator class

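The interface that every aggregate function implementation must implement. Its nested Op enum defines the aggregate operations. mergeTupleIntoGroup() assigns a record to its group according to the group-by field, in preparation for the later aggregate computation, and iterator() returns an OpIterator for traversing the aggregate results.

IntegerAggregator class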

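The implementation class for aggregating integer fields. Its member variables gbFiled, aField, and gbFieldType are the index of the group-by field, the index of the aggregate field, and the type of the group-by field. The member Map<Field,ArrayList<IntField>> map records, for each group, the aggregate-field values of the records in that group (in a SQL query with aggregation and grouping, the result set may contain only the aggregate field and the group-by fields). Start with mergeTupleIntoGroup(). The overall idea: if there is no group-by field, the aggregate fields of all records go into a single group; if there is one, each record's aggregate field is stored in the group selected by its group-by value: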

public void mergeTupleIntoGroup(Tuple tup) {
        // Without grouping, every value goes into one group stored under a fixed placeholder key.
        Field key = (gbFiled == Aggregator.NO_GROUPING)
                ? new StringField("NoGB", 4)
                : tup.getField(gbFiled);
        // Create the group's list on first use, then append this tuple's aggregate field.
        map.computeIfAbsent(key, k -> new ArrayList<>())
                .add((IntField) tup.getField(aField));
    }
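As a small illustration of the grouping step on its own, the sketch below buckets values by key with plain Java collections. KeyedValue and GroupingSketch are hypothetical names, string keys stand in for Field, and the "NoGB" placeholder is reused only to mirror the code above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical row type: a group-by key plus the value to aggregate.
final class KeyedValue {
    final String groupKey;
    final int value;

    KeyedValue(String groupKey, int value) {
        this.groupKey = groupKey;
        this.value = value;
    }
}

final class GroupingSketch {
    // Placeholder key used when there is no group-by field.
    static final String NO_GROUP_KEY = "NoGB";

    // Buckets the values by key; with grouping == false everything lands in one bucket.
    static Map<String, List<Integer>> group(List<KeyedValue> rows, boolean grouping) {
        // LinkedHashMap keeps groups in first-seen order, which makes output predictable.
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        for (KeyedValue row : rows) {
            String key = grouping ? row.groupKey : NO_GROUP_KEY;
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(row.value);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<KeyedValue> rows = Arrays.asList(
                new KeyedValue("a", 1), new KeyedValue("b", 2), new KeyedValue("a", 3));
        System.out.println(group(rows, true));
        System.out.println(group(rows, false));
    }
}
```

Storing every value per group is simple, at the cost of memory proportional to the input. Keeping only a running sum, count, minimum, and maximum per group would be a possible alternative design.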

Next is iterator(). I chose to convert the data in Map<Field,ArrayList<IntField>> map into the result of the requested aggregate function when open() is called on the returned OpIterator:

public void open() throws DbException, TransactionAbortedException {
                this.idx = 0;
                arr = new ArrayList<>();
                if(gbFiled == Aggregator.NO_GROUPING){
                    this.tupleDesc = new TupleDesc(new Type[]{Type.INT_TYPE});
                    Tuple tuple = new Tuple(this.tupleDesc);
                    ArrayList<IntField> list = map.get(new StringField("NoGB", 4));
                    tuple.setField(0,list.get(0));
                    this.arr.add(tuple);
                    int sum = list.get(0).getValue();
                    for (int i=1;i< list.size();i++){
                        IntField tupleField = (IntField) tuple.getField(0);
                        IntField listField = list.get(i);
                        switch (op){
                            case MIN:
                                if(tupleField.compare(Predicate.Op.GREATER_THAN,listField))
                                    tuple.setField(0,listField);
                                break;
                            case MAX:
                                if(tupleField.compare(Predicate.Op.LESS_THAN,listField))
                                    tuple.setField(0,listField);
                                break;
                            case AVG:
                            case SUM:
                                sum += listField.getValue();
                                break;
                            case COUNT:
                                break;
                            default:
                                throw new DbException(op+" is unimplemented");
                        }
                    }
                    if(op == SUM){
                        tuple.setField(0,new IntField(sum));
                    }
                    if(op == AVG){
                        tuple.setField(0,new IntField(sum/ list.size()));
                    }
                    if(op == Op.COUNT){
                        tuple.setField(0,new IntField(list.size()));
                    }
                }else{
                    if(gbFieldType == Type.STRING_TYPE)
                        this.tupleDesc = new TupleDesc(new Type[]{Type.STRING_TYPE,Type.INT_TYPE});
                    else
                        this.tupleDesc = new TupleDesc(new Type[]{Type.INT_TYPE,Type.INT_TYPE});
                    for(Field field: map.keySet()){
                        ArrayList<IntField> intFields = map.get(field);
                        Tuple tuple = new Tuple(this.tupleDesc);
                        tuple.setField(0,field);
                        tuple.setField(1, intFields.get(0));
                        this.arr.add(tuple);
                        int sum = intFields.get(0).getValue();
                        for (int i=1;i<intFields.size();i++){
                            IntField intField = intFields.get(i);
                            IntField tupleField = (IntField) tuple.getField(1);
                            switch (op){
                                case MIN:
                                    if(tupleField.compare(Predicate.Op.GREATER_THAN,intField))
                                        tuple.setField(1,intField);
                                    break;
                                case MAX:
                                    if(tupleField.compare(Predicate.Op.LESS_THAN,intField))
                                        tuple.setField(1,intField);
                                    break;
                                case AVG:
                                case SUM:
                                    sum += intField.getValue();
                                    break;
                                case COUNT:
                                    break;
                                default:
                                    throw new DbException(op+" is unimplemented");
                            }
                        }
                        if(op == SUM){
                            tuple.setField(1,new IntField(sum));
                        }
                        if(op == AVG){
                            tuple.setField(1,new IntField(sum/intFields.size()));
                        }
                        if(op == Op.COUNT){
                            tuple.setField(1,new IntField(intFields.size()));
                        }
                    }
                }
                this.isOpen = true;
            }
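The two branches of open() apply the same per-group computation, once to the single ungrouped list and once to each group. A compact way to see that computation in isolation is the sketch below. AggOp and AggregateFold are hypothetical names, and the helper works on plain integers rather than IntField. Like the code above, it uses integer division for AVG.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for Aggregator.Op.
enum AggOp { MIN, MAX, SUM, AVG, COUNT }

final class AggregateFold {
    // Folds one group's values into a single aggregate result.
    static int aggregate(AggOp op, List<Integer> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("cannot aggregate an empty group");
        }
        int min = values.get(0);
        int max = values.get(0);
        int sum = 0;
        for (int v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
            sum += v;
        }
        switch (op) {
            case MIN:   return min;
            case MAX:   return max;
            case SUM:   return sum;
            case AVG:   return sum / values.size(); // integer division
            case COUNT: return values.size();
            default: throw new IllegalArgumentException(op + " is unimplemented");
        }
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(3, 1, 4, 1, 5);
        for (AggOp op : AggOp.values()) {
            System.out.println(op + " = " + aggregate(op, values));
        }
    }
}
```

Extracting a helper like this would let the grouped and ungrouped cases share one code path, with only the output tuple layout differing.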

StringAggregator class

Implemented much like IntegerAggregator, so it is skipped here.

Aggregate class

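A wrapper around IntegerAggregator and StringAggregator. The member opIterators array refers to the iterator over the records to be aggregated, aggregator refers to the concrete aggregator (an IntegerAggregator or a StringAggregator), and aggIterator is that aggregator's iterator, used to traverse the aggregate results. The important method is open(): it first groups the records to be aggregated and then converts the grouped data into results: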

   public void open() throws NoSuchElementException, DbException,
            TransactionAbortedException {
        super.open();
        opIterators[0].open();
        while (opIterators[0].hasNext()){
            aggregator.mergeTupleIntoGroup(opIterators[0].next());
        }
        this.aggIterator = aggregator.iterator();
        this.aggIterator.open();
    }

From: https://www.cnblogs.com/rockdow/p/18037957
