[Source Code Reading] 3. Creating Tables

Date: 2023-06-25 18:56:04

| KW_CREATE opt_external:isExternal KW_TABLE opt_if_not_exists:ifNotExists table_name:name
            LPAREN column_definition_list:columns COMMA index_definition_list:indexes RPAREN opt_engine:engineName
            opt_keys:keys
            opt_comment:tableComment
            opt_partition:partition
            opt_distribution:distribution
            opt_rollup:index
            opt_properties:tblProperties
            opt_ext_properties:extProperties
    {:
        RESULT = new CreateTableStmt(ifNotExists, isExternal, name, columns, indexes, engineName, keys, partition,
        distribution, tblProperties, extProperties, tableComment, index);
    :}                           
    
 After parsing, CreateTableStmt ends up with the following attributes:
    protected TableName tableName;                                           // table name
    protected List<ColumnDef> columnDefs;                                    // column definitions
    private List<IndexDef> indexDefs;                                        // index definitions
    protected PartitionDesc partitionDesc;                                   // partition information
    protected DistributionDesc distributionDesc;                             // bucketing (distribution) method
    protected KeysDesc keysDesc;                                             // data model
    protected Map<String, String> properties;                                // properties
    private String comment;                                                  // table comment
    private List<AlterClause> rollupAlterClauseList = Lists.newArrayList();  // rollup
 

Executing table creation

Overall view

Creation sequence diagram: createOlapTable

 

Detailed process

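Sort key selection logic: Env.calcShortKeyColumnCount

By default the sort key (short key) has at most 3 columns. Each key column is visited in order:

● If adding this column makes the key longer than 36 bytes

○ If the column belongs to the char family, it is counted

○ If the column does not belong to the char family, it is not counted

● If adding this column does not make the key exceed 36 bytes

○ If the column is varchar, it is counted and the scan stops

○ Otherwise the column is counted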
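
The rules above can be sketched in Java as follows. This is a minimal illustration, not the code of Env.calcShortKeyColumnCount itself: the KeyColumn class, the Kind enum and the per-column byte sizes are hypothetical stand-ins. The sketch also assumes that the char family includes varchar, and that the scan stops as soon as the 36-byte limit is exceeded.

```java
import java.util.Arrays;
import java.util.List;

public class ShortKeySketch {
    // Limits quoted in the text: at most 3 columns and 36 bytes.
    static final int MAX_SHORT_KEY_COLUMNS = 3;
    static final int MAX_SHORT_KEY_BYTES = 36;

    // Hypothetical, simplified column model holding only what the rules need.
    enum Kind { CHAR, VARCHAR, OTHER }

    static final class KeyColumn {
        final String name;
        final Kind kind;
        final int indexSizeBytes; // bytes this column contributes to the key

        KeyColumn(String name, Kind kind, int indexSizeBytes) {
            this.name = name;
            this.kind = kind;
            this.indexSizeBytes = indexSizeBytes;
        }

        // Assumption: varchar is treated as part of the char family.
        boolean isCharFamily() {
            return kind == Kind.CHAR || kind == Kind.VARCHAR;
        }
    }

    static int calcShortKeyColumnCount(List<KeyColumn> keyColumns) {
        int count = 0;
        int bytes = 0;
        int limit = Math.min(keyColumns.size(), MAX_SHORT_KEY_COLUMNS);
        for (int i = 0; i < limit; i++) {
            KeyColumn col = keyColumns.get(i);
            bytes += col.indexSizeBytes;
            if (bytes > MAX_SHORT_KEY_BYTES) {
                // Over the limit: only a char-family column is still counted.
                if (col.isCharFamily()) {
                    count++;
                }
                break; // assumption: nothing after this column is considered
            }
            count++;
            if (col.kind == Kind.VARCHAR) {
                break; // a varchar column is counted, then the scan stops
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<KeyColumn> cols = Arrays.asList(
                new KeyColumn("user_id", Kind.OTHER, 8),
                new KeyColumn("city", Kind.VARCHAR, 20),
                new KeyColumn("age", Kind.OTHER, 4));
        System.out.println(calcShortKeyColumnCount(cols)); // prints 2
    }
}
```

In the usage example the scan stops at the varchar column city, so age never becomes part of the short key even though the 36-byte budget is not used up.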

 

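Table structure in detail

Every partition contains the same set of indexes

Every index contains the same number of tablets

Total tablets: number of partitions * number of indexes * number of buckets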

Total replicas: total tablets * number of replicas per tablet
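
As a quick worked example of the counts above, here is a small Java sketch. The method names are hypothetical, and the replica total assumes that every tablet holds the same number of replicas (the table's replication_num).

```java
public class TabletCountSketch {
    // Total tablets = partitions * indexes * buckets, as stated in the text.
    static long totalTablets(int partitions, int indexes, int buckets) {
        return (long) partitions * indexes * buckets;
    }

    // Assumption: every tablet has exactly replicationNum replicas.
    static long totalReplicas(long tablets, int replicationNum) {
        return tablets * replicationNum;
    }

    public static void main(String[] args) {
        // 3 partitions, 2 indexes (base index plus one rollup), 10 buckets.
        long tablets = totalTablets(3, 2, 10);
        System.out.println(tablets);                   // prints 60
        System.out.println(totalReplicas(tablets, 3)); // prints 180
    }
}
```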

 

BE selection logic

For each (Partition, Rollup, Tablet) combination, BEs are chosen to store the replicas according to the replica count

 

RPC logic

Each (Partition, Rollup) assembles the CreateReplicaTask objects accumulated under it into an AgentBatchTask and sends it through AgentTaskExecutor
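
The placement and batching steps can be pictured with the following Java sketch. Everything in it is a simplified assumption for illustration. ReplicaTask is a hypothetical stand-in for CreateReplicaTask. The round-robin choice of BEs is not the real selection policy. Grouping tasks by backend id is only one plausible way to picture what an AgentBatchTask carries.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CreateReplicaBatchSketch {
    // Hypothetical stand-in for CreateReplicaTask: one replica on one BE.
    static final class ReplicaTask {
        final long backendId;
        final long tabletId;

        ReplicaTask(long backendId, long tabletId) {
            this.backendId = backendId;
            this.tabletId = tabletId;
        }
    }

    // Picks replicationNum distinct BEs for one tablet.
    // Round-robin is an assumption made only to keep the sketch short.
    static List<Long> chooseBackends(List<Long> backendIds, int replicationNum, int startIndex) {
        if (replicationNum > backendIds.size()) {
            throw new IllegalArgumentException("not enough backends for the replica count");
        }
        List<Long> chosen = new ArrayList<>();
        for (int i = 0; i < replicationNum; i++) {
            chosen.add(backendIds.get((startIndex + i) % backendIds.size()));
        }
        return chosen;
    }

    // Builds the batch for one (partition, index): backend id -> tasks for that BE.
    static Map<Long, List<ReplicaTask>> buildBatch(List<Long> tabletIds,
                                                   List<Long> backendIds,
                                                   int replicationNum) {
        Map<Long, List<ReplicaTask>> batch = new LinkedHashMap<>();
        for (int t = 0; t < tabletIds.size(); t++) {
            long tabletId = tabletIds.get(t);
            for (long backendId : chooseBackends(backendIds, replicationNum, t)) {
                batch.computeIfAbsent(backendId, k -> new ArrayList<>())
                        .add(new ReplicaTask(backendId, tabletId));
            }
        }
        return batch;
    }
}
```

With tablets 100 and 101, backends 1, 2 and 3, and a replica count of 2, the sketch places tablet 100 on backends 1 and 2 and tablet 101 on backends 2 and 3, so backend 2 receives two tasks in the batch.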

 

How properties are applied

The following properties can be specified when creating a table:

    public static final String PROPERTIES_REPLICATION_NUM = "replication_num";
    public static final String PROPERTIES_REPLICATION_ALLOCATION = "replication_allocation";
    public static final String PROPERTIES_SHORT_KEY = "short_key";    
    public static final String PROPERTIES_ENABLE_LIGHT_SCHEMA_CHANGE = "light_schema_change";
    public static final String PROPERTIES_STORAGE_FORMAT = "storage_format";                            //V2 
    public static final String PROPERTIES_DISABLE_AUTO_COMPACTION = "disable_auto_compaction";
    public static final String PROPERTIES_COMPRESSION = "compression";
    public static final String ENABLE_UNIQUE_KEY_MERGE_ON_WRITE = "enable_unique_key_merge_on_write";
    public static final String PROPERTIES_BF_COLUMNS = "bloom_filter_columns";
    public static final String PROPERTIES_BF_FPP = "bloom_filter_fpp";
    public static final String PROPERTIES_AUTO_BUCKET = "_auto_bucket";
    public static final String PROPERTIES_ESTIMATE_PARTITION_SIZE = "estimate_partition_size";
    public static final String PROPERTIES_INMEMORY = "in_memory";
    public static final String PROPERTIES_STORAGE_POLICY = "storage_policy";                            // Policy
    public static final String PROPERTIES_TABLET_TYPE = "tablet_type";
    public static final String PROPERTIES_STORAGE_MEDIUM = "storage_medium";                            // SSD, HDD
    public static final String PROPERTIES_STORAGE_COOLDOWN_TIME = "storage_cooldown_time";
    public static final String PROPERTIES_DATA_BASE_TIME = "data_base_time_ms";
    public static final String PROPERTIES_COLOCATE_WITH = "colocate_with";
    public static final String PROPERTIES_STORAGE_TYPE = "storage_type";                                // COLUMN
    public static final String PROPERTIES_SCHEMA_VERSION = "schema_version";
    public static final String PROPERTIES_FUNCTION_COLUMN = "function_column";
    public static final String PROPERTIES_SEQUENCE_TYPE = "sequence_type";
    public static final String PROPERTIES_SEQUENCE_COL = "sequence_col";
    public static final String PROPERTIES_VERSION_INFO = "version_info";
 

BE interaction

 

 

Other

Partitions

Examples of partition clauses:

PARTITION BY RANGE(`date`)
(
    PARTITION `p201701` VALUES LESS THAN ("2017-02-01"),
    PARTITION `p201702` VALUES LESS THAN ("2017-03-01"),
    PARTITION `p201703` VALUES LESS THAN ("2017-04-01")
)

PARTITION BY LIST(`city`)
(
    PARTITION `p_cn` VALUES IN ("Beijing", "Shanghai", "Hong Kong"),
    PARTITION `p_usa` VALUES IN ("New York", "San Francisco"),
    PARTITION `p_jp` VALUES IN ("Tokyo")
)
 

Overall definition

opt_partition ::=
    /* Empty: no partition */
    {:
        RESULT = null;
    :}
    /* Range partition */
    | KW_PARTITION KW_BY KW_RANGE LPAREN ident_list:columns RPAREN
            LPAREN opt_all_partition_desc_list:list RPAREN
    {:
        RESULT = new RangePartitionDesc(columns, list);
    :}
    /* List partition */
    | KW_PARTITION KW_BY KW_LIST LPAREN ident_list:columns RPAREN
            LPAREN opt_all_partition_desc_list:list RPAREN
    {:
        RESULT = new ListPartitionDesc(columns, list);
    :}
    ;
 

Top-level design of the partition range block

###   Definition of multiple partitions
opt_all_partition_desc_list ::=
    /* Empty */
    {:
        RESULT = null;
    :}
    | all_partition_desc_list:list
    {:
        RESULT = list;
    :}
    ;
all_partition_desc_list ::=
    all_partition_desc_list:list COMMA single_partition_desc:desc
    {:
        list.add(desc);
        RESULT = list;
    :}
    | single_partition_desc:desc
    {:
        RESULT = Lists.newArrayList(desc);
    :}
    | all_partition_desc_list:list COMMA multi_partition_desc:desc
    {:
        list.add(desc);
        RESULT = list;
    :}
    | multi_partition_desc:desc
    {:
        RESULT = Lists.newArrayList(desc);
    :}
    ;
 

Design of a single partition block

## Definition of a single partition line
## PARTITION `xx` VALUES LESS THAN ("2017-02-01") => SinglePartitionDesc(PartitionKeyDesc(List<PartitionValue>))
## PARTITION `xx` VALUES [("a","b"),("a","b")) => SinglePartitionDesc(PartitionKeyDesc(List<PartitionValue>,List<PartitionValue>))
## PARTITION `p_cn` VALUES IN ("A", "B") => SinglePartitionDesc(PartitionKeyDesc(List<List<PartitionValue>>))

single_partition_desc ::=
    KW_PARTITION opt_if_not_exists:ifNotExists ident:partName KW_VALUES KW_LESS KW_THAN partition_key_desc:desc
        opt_key_value_map:properties
    {:
        RESULT = new SinglePartitionDesc(ifNotExists, partName, desc, properties);
    :}
    | KW_PARTITION opt_if_not_exists:ifNotExists ident:partName KW_VALUES fixed_partition_key_desc:desc
        opt_key_value_map:properties
    {:
        RESULT = new SinglePartitionDesc(ifNotExists, partName, desc, properties);
    :}
    /* list partition */
    | KW_PARTITION opt_if_not_exists:ifNotExists ident:partName KW_VALUES KW_IN list_partition_key_desc:desc
        opt_key_value_map:properties
    {:
        RESULT = new SinglePartitionDesc(ifNotExists, partName, desc, properties);
    :}
    ; 
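
The three shapes produced by single_partition_desc can be modeled roughly as in the Java sketch below. It is a simplified, hypothetical stand-in for PartitionKeyDesc: partition values are plain strings, and the names KeyDesc, lessThan, fixed, in and toSql are invented for this illustration.

```java
import java.util.List;

public class PartitionKeyShapesSketch {
    enum KeyType { LESS_THAN, FIXED, IN }

    // Hypothetical simplified model of the three partition key shapes.
    static final class KeyDesc {
        final KeyType type;
        final List<String> lower;          // used by FIXED only
        final List<String> upper;          // used by LESS_THAN and FIXED
        final List<List<String>> inValues; // used by IN only

        private KeyDesc(KeyType type, List<String> lower, List<String> upper,
                        List<List<String>> inValues) {
            this.type = type;
            this.lower = lower;
            this.upper = upper;
            this.inValues = inValues;
        }

        // VALUES LESS THAN (...): a single list of upper-bound values.
        static KeyDesc lessThan(List<String> upper) {
            return new KeyDesc(KeyType.LESS_THAN, null, upper, null);
        }

        // VALUES [(...), (...)): a lower list and an upper list.
        static KeyDesc fixed(List<String> lower, List<String> upper) {
            return new KeyDesc(KeyType.FIXED, lower, upper, null);
        }

        // VALUES IN (...): a list of value tuples.
        static KeyDesc in(List<List<String>> values) {
            return new KeyDesc(KeyType.IN, null, null, values);
        }

        // Renders the VALUES clause that this shape corresponds to.
        String toSql() {
            switch (type) {
                case LESS_THAN:
                    return "VALUES LESS THAN " + tuple(upper);
                case FIXED:
                    return "VALUES [" + tuple(lower) + ", " + tuple(upper) + ")";
                default:
                    StringBuilder sb = new StringBuilder("VALUES IN (");
                    for (int i = 0; i < inValues.size(); i++) {
                        if (i > 0) {
                            sb.append(", ");
                        }
                        sb.append(tuple(inValues.get(i)));
                    }
                    return sb.append(")").toString();
            }
        }
    }

    // Formats one tuple of values as ("a", "b").
    static String tuple(List<String> values) {
        StringBuilder sb = new StringBuilder("(");
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append('"').append(values.get(i)).append('"');
        }
        return sb.append(")").toString();
    }
}
```

For instance, KeyDesc.lessThan(Arrays.asList("2017-02-01")).toSql() gives VALUES LESS THAN ("2017-02-01"), which matches the first alternative of the grammar rule.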
 

Value definitions for range partitions: partition_key_desc, fixed_partition_key_desc

###   LESS THAN partition values ("a","b","c")   =>   PartitionKeyDesc(List<PartitionValue>)
partition_key_desc ::=
    KW_MAX_VALUE
    {:
        RESULT = PartitionKeyDesc.createMaxKeyDesc();
    :}
    | LPAREN partition_key_list:keys RPAREN
    {:
        RESULT = PartitionKeyDesc.createLessThan(keys);
    :}
    ;    
###  VALUES with a closed-open interval [("a","b","c"),("a","b","c")) => PartitionKeyDesc(List<PartitionValue>,List<PartitionValue>)
fixed_partition_key_desc ::=
    /* format: [(lower), (upper))*/
    LBRACKET LPAREN partition_key_list:lower RPAREN COMMA LPAREN partition_key_list:upper RPAREN RPAREN
    {:
        RESULT = PartitionKeyDesc.createFixed(lower, upper);
    :}
    ; 
    
###   "a","b","c"   =>   List<PartitionValue>
partition_key_list ::=
    /* empty */
    {:
        List<PartitionValue> l = new ArrayList<PartitionValue>();
        RESULT = l;
    :}
    | partition_key_list:l COMMA STRING_LITERAL:item
    {:
        l.add(new PartitionValue(item));
        RESULT = l;
    :}
    | partition_key_list:l COMMA KW_MAX_VALUE
    {:
        l.add(PartitionValue.MAX_VALUE);
        RESULT = l;
    :}
    | STRING_LITERAL:item
    {:
        RESULT = Lists.newArrayList(new PartitionValue(item));
    :}
    | KW_MAX_VALUE
    {:
        RESULT = Lists.newArrayList(PartitionValue.MAX_VALUE);
    :}
    ;
 

Value definitions for list partitions: list_partition_key_desc

###  LIST partition part
###   (("abc","dbf"), ("abc","dbf"))   =>   PartitionKeyDesc(List<List<PartitionValue>>) 
###  Adds the outermost ()
list_partition_key_desc ::=
    LPAREN list_partition_values_list:keys RPAREN
    {:
        RESULT = PartitionKeyDesc.createIn(keys);
    :}
    ;    
###  A single item, or several items separated by commas
list_partition_values_list ::=
    partition_value_list:item
    {:
        ArrayList<List<PartitionValue>> l = new ArrayList();
        l.add(item);
        RESULT = l;
    :}
    | list_partition_values_list:l COMMA partition_value_list:item
    {:
        l.add(item);
        RESULT = l;
    :}
    ;
###   "abc" or ("abc") or ("abc","def")    =>  List<PartitionValue>
partition_value_list ::=
    /* single partition key */
    STRING_LITERAL:item
    {:
        RESULT = Lists.newArrayList(new PartitionValue(item));
    :}
    /* multi partition keys : (1, "beijing") */
    | LPAREN partition_key_item_list:l RPAREN
    {:
        RESULT = l;
    :}
    ;  
partition_key_item_list ::=
    STRING_LITERAL:item
    {:
        RESULT = Lists.newArrayList(new PartitionValue(item));
    :}
    | partition_key_item_list:l COMMA STRING_LITERAL:item
    {:
        l.add(new PartitionValue(item));
        RESULT = l;
    :}
    ;     

 

 

From: https://www.cnblogs.com/xutaoustc/p/17503704.html
