
LightDB single-node distributed test

Date: 2022-11-26 16:34:58

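  By default, LightDB uses an integrated architecture that combines distributed and centralized modes, so even a single instance can enable the distributed architecture.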

Environment setup

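  Assume LightDB is already installed. By default, a distributed installation automatically sets up the canopy extension (the distributed edition) for create database. You can confirm that canopy is loaded with show %lib%, as follows: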

[zjh@hs-10-20-30-193 ~]$ ltsql -p23456
ltsql (13.8-22.3)
Type "help" for help.

zjh@lt_test=# show %lib%;
           name            |                                                                            setting                                                                             |                            description
---------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------
 dynamic_library_path      | $libdir                                                                                                                                                        | Sets the path for dynamically loadable modules.
 local_preload_libraries   |                                                                                                                                                                | Lists unprivileged shared libraries to preload into each backend.
 session_preload_libraries | lt_cheat_funcs                                                                                                                                                 | Lists shared libraries to preload into each backend.
 shared_preload_libraries  | canopy,lt_stat_statements,lt_stat_activity,autoinc,auto_explain,lt_prewarm,lt_cron,ltaudit,lt_hint_plan,lt_show_plans,pg_stat_kcache,lt_standby_forward,lt_ope | Lists shared libraries to preload into server.
 ssl_library               | OpenSSL                                                                                                                                                        | Name of the SSL library.
(5 rows)

zjh@lt_test=# create extension canopy;
CREATE EXTENSION
zjh@lt_test=# select * from pg_dist_node;
 nodeid | groupid | nodename | nodeport | noderack | hasmetadata | isactive | noderole | nodecluster | metadatasynced | shouldhaveshards 
--------+---------+----------+----------+----------+-------------+----------+----------+-------------+----------------+------------------
(0 rows)

Creating distributed tables

zjh@lt_test=# create table canopy_table(id int primary key,v text);
CREATE TABLE
zjh@lt_test=# select * from pg_dist_shard;
 logicalrelid | shardid | shardstorage | shardminvalue | shardmaxvalue 
--------------+---------+--------------+---------------+---------------
(0 rows)

zjh@lt_test=# select create_distributed_table('canopy_table','id');
 create_distributed_table 
--------------------------
 
(1 row)

zjh@lt_test=# select * from pg_dist_shard;
 logicalrelid | shardid | shardstorage | shardminvalue | shardmaxvalue 
--------------+---------+--------------+---------------+---------------
 canopy_table |  102008 | t            | -2147483648   | -2013265921
 canopy_table |  102009 | t            | -2013265920   | -1879048193
 canopy_table |  102010 | t            | -1879048192   | -1744830465
 canopy_table |  102011 | t            | -1744830464   | -1610612737
 canopy_table |  102012 | t            | -1610612736   | -1476395009
 canopy_table |  102013 | t            | -1476395008   | -1342177281
 canopy_table |  102014 | t            | -1342177280   | -1207959553
 canopy_table |  102015 | t            | -1207959552   | -1073741825
 canopy_table |  102016 | t            | -1073741824   | -939524097
 canopy_table |  102017 | t            | -939524096    | -805306369
 canopy_table |  102018 | t            | -805306368    | -671088641
 canopy_table |  102019 | t            | -671088640    | -536870913
 canopy_table |  102020 | t            | -536870912    | -402653185
 canopy_table |  102021 | t            | -402653184    | -268435457
 canopy_table |  102022 | t            | -268435456    | -134217729
 canopy_table |  102023 | t            | -134217728    | -1
 canopy_table |  102024 | t            | 0             | 134217727
 canopy_table |  102025 | t            | 134217728     | 268435455
 canopy_table |  102026 | t            | 268435456     | 402653183
 canopy_table |  102027 | t            | 402653184     | 536870911
 canopy_table |  102028 | t            | 536870912     | 671088639
 canopy_table |  102029 | t            | 671088640     | 805306367
 canopy_table |  102030 | t            | 805306368     | 939524095
 canopy_table |  102031 | t            | 939524096     | 1073741823
 canopy_table |  102032 | t            | 1073741824    | 1207959551
 canopy_table |  102033 | t            | 1207959552    | 1342177279
 canopy_table |  102034 | t            | 1342177280    | 1476395007
 canopy_table |  102035 | t            | 1476395008    | 1610612735
 canopy_table |  102036 | t            | 1610612736    | 1744830463
 canopy_table |  102037 | t            | 1744830464    | 1879048191
 canopy_table |  102038 | t            | 1879048192    | 2013265919
 canopy_table |  102039 | t            | 2013265920    | 2147483647
(32 rows)
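
The 32 shard boundaries above are evenly spaced over the signed 32-bit integer range. The sketch below reproduces them, assuming Canopy follows the usual hash-distribution scheme in which each range applies to the hash of the distribution column rather than to the id value itself. The function name shard_ranges is made up for illustration, and the starting shardid 102008 is simply copied from the output above.

```python
# Sketch: split the signed 32-bit hash space into equal, contiguous shard ranges.
# Assumption: ranges cover hash values of the distribution column, not raw ids.
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1
HASH_SPACE = 2**32


def shard_ranges(shard_count, first_shard_id):
    """Return (shardid, min, max) tuples that cover the int32 space evenly."""
    width = HASH_SPACE // shard_count
    ranges = []
    for i in range(shard_count):
        lo = INT32_MIN + i * width
        # The last shard ends at INT32_MAX so the whole space is covered.
        hi = INT32_MAX if i == shard_count - 1 else lo + width - 1
        ranges.append((first_shard_id + i, lo, hi))
    return ranges


ranges = shard_ranges(32, 102008)
for shardid, lo, hi in (ranges[0], ranges[16], ranges[-1]):
    print(shardid, lo, hi)
```

With 32 shards each range is 2^32 / 32 = 134217728 values wide, which is the step between consecutive shardminvalue entries in pg_dist_shard.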

zjh@lt_test=# select * from pg_dist_node;
 nodeid | groupid | nodename  | nodeport | noderack | hasmetadata | isactive | noderole | nodecluster | metadatasynced | shouldhaveshards 
--------+---------+-----------+----------+----------+-------------+----------+----------+-------------+----------------+------------------
      1 |       0 | localhost |    23456 | default  | t           | t        | primary  | default     | t              | t
(1 row)

zjh@lt_test=# create table canopy_table_detail(id int primary key,v text,branch_id varchar(100));
CREATE TABLE
zjh@lt_test=# create table canopy_table_branch(v text,branch_id varchar(100));
CREATE TABLE
zjh@lt_test=# select create_distributed_table('canopy_table_detail','id');
 create_distributed_table 
--------------------------
 
(1 row)

zjh@lt_test=# select create_reference_table('canopy_table_branch');
 create_reference_table 
------------------------
 
(1 row)
-------------- Insert data
zjh@lt_test=# insert into canopy_table select id, uuid() from generate_series(1,10000000) id;
INSERT 0 10000000
zjh@lt_test=# SELECT update_distributed_table_colocation('canopy_table_detail', colocate_with => 'canopy_table');
 update_distributed_table_colocation 
-------------------------------------
 
(1 row)
zjh@lt_test=# insert into canopy_table_detail select id, uuid(),id % 1000 from generate_series(1,1000000) id;
INSERT 0 1000000
zjh@lt_test=# select * from canopy_table_branch ;
 v | branch_id 
---+-----------
(0 rows)

zjh@lt_test=# insert into canopy_table_branch select uuid(),id from generate_series(1,1000) id;
INSERT 0 1000

Note: LightDB also supports the distributed by (col) syntax, for example: create table canopy_table_native(id int primary key,v text) distributed by (id); create table canopy_table_native(id int primary key,v text) distributed REPLICATED;

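Users can choose whichever syntax they prefer. Starting with LightDB 23c, if a distributed by (id) clause is specified in a non-distributed environment (the canopy extension is not enabled, or lightdb_arch_mode=classic), the clause is simply ignored instead of raising an error, which brings the centralized and distributed modes closer together.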

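Starting with 22.4, LightDB supports native distributed tables without a distributed by clause (mainly for POC purposes). When the canopy extension is enabled and lightdb_arch_mode=dist, create table creates a distributed table by default: the distribution column is taken from the primary key first, or from a non-unique index if there is no primary key; otherwise an error is raised. To create a local table or a reference table under the distributed architecture, specify the local clause, i.e. create local table.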

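In general, lightdb_arch_mode=off is recommended for production: create distributed tables with the distributed by clause, replicated tables with the DISTRIBUTED REPLICATED clause, and local tables with no clause. For development and learning you can enable the parameter for a more out-of-the-box experience.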

Checking distributed execution

zjh@lt_test=# explain analyze select b.branch_id,max(a.id),count(1),max(a.v) from canopy_table a,canopy_table_detail b,canopy_table_branch c where a.id=b.id and b.branch_id = c.branch_id group by b.branch_id;
                                                                                       QUERY PLAN                                                                                        
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=1000.00..1003.50 rows=200 width=262) (actual time=193.039..193.216 rows=999 loops=1)
   Group Key: remote_scan.branch_id
   Batches: 1  Memory Usage: 593kB
   ->  Custom Scan (Canopy Adaptive)  (cost=0.00..0.00 rows=100000 width=262) (actual time=183.568..185.866 rows=31968 loops=1)
         Task Count: 32
         Tuple data received from nodes: 1464 kB
         Tasks Shown: One of 32
         ->  Task
               Tuple data received from node: 46 kB
               Node: host=localhost port=23456 dbname=lt_test
               ->  HashAggregate  (cost=8334.15..8336.15 rows=200 width=262) (actual time=106.176..106.386 rows=999 loops=1)
                     Group Key: b.branch_id
                     Batches: 1  Memory Usage: 337kB
                     ->  Hash Join  (cost=16.95..8215.19 rows=11896 width=254) (actual time=0.381..91.031 rows=31230 loops=1)
                           Hash Cond: ((b.branch_id)::text = (c.branch_id)::text)
                           ->  Nested Loop  (cost=0.42..7761.80 rows=8204 width=254) (actual time=0.059..80.260 rows=31253 loops=1)
                                 ->  Seq Scan on canopy_table_detail_102056 b  (cost=0.00..375.04 rows=8204 width=222) (actual time=0.024..6.619 rows=31253 loops=1)
                                 ->  Index Scan using canopy_table_pkey_102024 on canopy_table_102024 a  (cost=0.42..0.90 rows=1 width=36) (actual time=0.002..0.002 rows=1 loops=31253)
                                       Index Cond: (id = b.id)
                           ->  Hash  (cost=12.90..12.90 rows=290 width=218) (actual time=0.307..0.308 rows=1000 loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 44kB
                                 ->  Seq Scan on canopy_table_branch_102072 c  (cost=0.00..12.90 rows=290 width=218) (actual time=0.014..0.157 rows=1000 loops=1)
                   Planning Time: 0.652 ms
                   Execution Time: 106.715 ms
 Planning Time: 1.104 ms
 Execution Time: 193.623 ms
(26 rows)
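
The plan above runs in two phases: each of the 32 tasks joins and aggregates its own shard (999 groups per task), the coordinator receives 32 x 999 = 31968 partial rows, and a final HashAggregate merges them. The sketch below is a toy model of that two-phase pattern for max and count, not Canopy's actual executor; the function names and sample rows are made up for illustration.

```python
def partial_aggregate(rows):
    """Worker-side step: group one shard's joined rows by branch_id."""
    groups = {}
    for branch_id, row_id, v in rows:
        if branch_id in groups:
            max_id, count, max_v = groups[branch_id]
            groups[branch_id] = (max(max_id, row_id), count + 1, max(max_v, v))
        else:
            groups[branch_id] = (row_id, 1, v)
    return groups


def merge_partials(partials):
    """Coordinator-side step: max of maxes, sum of counts."""
    merged = {}
    for groups in partials:
        for branch_id, (max_id, count, max_v) in groups.items():
            if branch_id in merged:
                m_id, m_count, m_v = merged[branch_id]
                merged[branch_id] = (max(m_id, max_id), m_count + count, max(m_v, max_v))
            else:
                merged[branch_id] = (max_id, count, max_v)
    return merged


# Toy data: (branch_id, id, v) rows as two shards might hold them.
shard_a = [("1", 1, "b"), ("2", 2, "a"), ("1", 1001, "a")]
shard_b = [("1", 2001, "c"), ("2", 1002, "z")]

result = merge_partials([partial_aggregate(shard_a), partial_aggregate(shard_b)])
print(result)
```

Merging the per-shard partials gives the same answer as aggregating all rows in one pass, which is why each task can work on its shard independently.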

zjh@lt_test=# select b.branch_id,max(a.id),count(1),max(a.v) from canopy_table a,canopy_table_detail b,canopy_table_branch c where a.id=b.id and b.branch_id = c.branch_id group by b.branch_id order by b.branch_id limit 10;
 branch_id |  max   | count |                 max                  
-----------+--------+-------+--------------------------------------
 1         | 999001 |  1000 | ffe56d6d-b0a3-45f8-8142-68dca1641c2f
 10        | 999010 |  1000 | fff9702e-1aa6-4b8b-a6b7-00d085ad00b3
 100       | 999100 |  1000 | fff65b94-d2cf-4b6b-95cb-4e7128d59835
 101       | 999101 |  1000 | ffb431bf-73d9-4a4b-b4bf-71380f8377be
 102       | 999102 |  1000 | ffff5f74-b8ed-4582-85c2-0e7866c14ff7
 103       | 999103 |  1000 | ffe86f77-1031-4b5f-8939-e37dffe74e7b
 104       | 999104 |  1000 | fff5f9d4-eb29-4c91-9fab-a0c6f75067be
 105       | 999105 |  1000 | ff706172-8281-4458-bd14-6b1d83941a72
 106       | 999106 |  1000 | ffbdff1e-5f42-42ce-936d-8e56d402ebfe
 107       | 999107 |  1000 | ffdd02c9-ab40-4a1e-96df-4bb0fd085cba
(10 rows)

Time: 101.436 ms
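
The result can be cross-checked against how the test data was generated. The sketch below replays the inserts in plain Python (no database involved): detail rows get branch_id = id % 1000 for ids 1 to 1000000, and the branch table holds ids 1 to 1000, both compared as text. The variable names are made up for illustration.

```python
from collections import defaultdict

# canopy_table_branch: branch_id values '1' .. '1000' (stored as varchar)
branch_ids = {str(i) for i in range(1, 1001)}

# branch_id -> [max(id), count] over the joined rows
stats = defaultdict(lambda: [0, 0])
joined_rows = 0
for row_id in range(1, 1_000_001):      # canopy_table_detail ids
    branch_id = str(row_id % 1000)      # '0' .. '999'
    if branch_id in branch_ids:         # join against canopy_table_branch
        entry = stats[branch_id]
        entry[0] = max(entry[0], row_id)
        entry[1] += 1
        joined_rows += 1

print(len(stats), joined_rows)          # number of groups, joined row count
print(sorted(stats)[:3])                # text ordering, as in the query output
print(stats["1"], stats["107"])
```

Branch '0' has detail rows but no matching branch row, and branch '1000' has no detail rows, so the join yields 999 groups and 999000 rows, consistent with rows=999 in the plans and rows=999000 in the classic Hash Join.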

Create the corresponding non-distributed tables and compare performance

zjh@lt_test=# create table canopy_table_detail_classic(id int primary key,v text,branch_id varchar(100));
CREATE TABLE
zjh@lt_test=# create table canopy_table_branch_classic(v text,branch_id varchar(100));
CREATE TABLE
zjh@lt_test=# create table canopy_table_classic(id int primary key,v text);
CREATE TABLE

zjh@lt_test=# insert into canopy_table_branch_classic select uuid(),id from generate_series(1,1000) id;
INSERT 0 1000
zjh@lt_test=# insert into canopy_table_detail_classic select id, uuid(),id % 1000 from generate_series(1,1000000) id;
INSERT 0 1000000
zjh@lt_test=# insert into canopy_table_classic select id, uuid() from generate_series(1,10000000) id;
INSERT 0 10000000
zjh@lt_test=# explain analyze select b.branch_id,max(a.id),count(1),max(a.v) from canopy_table_classic a,canopy_table_detail_classic b,canopy_table_branch_classic c where a.id=b.id and b.branch_id = c.branch_id group by b.branch_id;
                                                                                             QUERY PLAN                                                                                        
     
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----
 HashAggregate  (cost=391963.76..391973.76 rows=1000 width=47) (actual time=746.889..746.998 rows=999 loops=1)
   Group Key: b.branch_id
   Batches: 1  Memory Usage: 321kB
   ->  Hash Join  (cost=33.36..381963.76 rows=1000000 width=39) (actual time=0.204..540.833 rows=999000 loops=1)
         Hash Cond: ((b.branch_id)::text = (c.branch_id)::text)
         ->  Merge Join  (cost=0.86..368181.26 rows=1000000 width=39) (actual time=0.030..392.142 rows=1000000 loops=1)
               Merge Cond: (a.id = b.id)
               ->  Index Scan using canopy_table_classic_pkey on canopy_table_classic a  (cost=0.43..298916.92 rows=11869166 width=36) (actual time=0.013..125.229 rows=1000001 loops=1)
               ->  Index Scan using canopy_table_detail_classic_pkey on canopy_table_detail_classic b  (cost=0.42..27091.42 rows=1000000 width=7) (actual time=0.011..105.792 rows=1000000 loops=1)
         ->  Hash  (cost=20.00..20.00 rows=1000 width=3) (actual time=0.169..0.169 rows=1000 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 44kB
               ->  Seq Scan on canopy_table_branch_classic c  (cost=0.00..20.00 rows=1000 width=3) (actual time=0.005..0.093 rows=1000 loops=1)
 Planning Time: 0.306 ms
 Execution Time: 747.084 ms
(14 rows)

Using avg(a.id) in place of max(a.id), first on the classic tables and then on the distributed tables:

zjh@lt_test=# select b.branch_id,avg(a.id),count(1),max(a.v) from canopy_table_classic a,canopy_table_detail_classic b,canopy_table_branch_classic c where a.id=b.id and b.branch_id = c.branch_id group by b.branch_id order by b.branch_id limit 10;
 branch_id |         avg         | count |                 max                  
-----------+---------------------+-------+--------------------------------------
 1         | 499501.000000000000 |  1000 | ffe3c50f-1c10-441a-b972-464f63f86e65
 10        | 499510.000000000000 |  1000 | ffde5346-62aa-4b1b-ae66-2af452716a87
 100       | 499600.000000000000 |  1000 | ffb84ead-bfe4-418b-8b7f-1a4d0159a9be
 101       | 499601.000000000000 |  1000 | ffe774ab-522f-4652-9994-376cee2dc12c
 102       | 499602.000000000000 |  1000 | ff924467-32c8-4d34-b1ad-4bab49a435cb
 103       | 499603.000000000000 |  1000 | ffd1c3e4-9c15-47f1-85ee-7baafc8352a7
 104       | 499604.000000000000 |  1000 | ffe80998-0a34-44db-95f9-a17417a0f954
 105       | 499605.000000000000 |  1000 | ffdb53fc-f684-4d55-98a2-af12725767ae
 106       | 499606.000000000000 |  1000 | ffcdde21-1a7c-49f6-bc86-116c96b80af9
 107       | 499607.000000000000 |  1000 | ff0fa26f-0b36-40d3-9dec-63ae1a0655cc
(10 rows)

Time: 687.505 ms
zjh@lt_test=# select b.branch_id,avg(a.id),count(1),max(a.v) from canopy_table a,canopy_table_detail b,canopy_table_branch c where a.id=b.id and b.branch_id = c.branch_id group by b.branch_id order by b.branch_id limit 10;
 branch_id |         avg         | count |                 max                  
-----------+---------------------+-------+--------------------------------------
 1         | 499501.000000000000 |  1000 | ffe56d6d-b0a3-45f8-8142-68dca1641c2f
 10        | 499510.000000000000 |  1000 | fff9702e-1aa6-4b8b-a6b7-00d085ad00b3
 100       | 499600.000000000000 |  1000 | fff65b94-d2cf-4b6b-95cb-4e7128d59835
 101       | 499601.000000000000 |  1000 | ffb431bf-73d9-4a4b-b4bf-71380f8377be
 102       | 499602.000000000000 |  1000 | ffff5f74-b8ed-4582-85c2-0e7866c14ff7
 103       | 499603.000000000000 |  1000 | ffe86f77-1031-4b5f-8939-e37dffe74e7b
 104       | 499604.000000000000 |  1000 | fff5f9d4-eb29-4c91-9fab-a0c6f75067be
 105       | 499605.000000000000 |  1000 | ff706172-8281-4458-bd14-6b1d83941a72
 106       | 499606.000000000000 |  1000 | ffbdff1e-5f42-42ce-936d-8e56d402ebfe
 107       | 499607.000000000000 |  1000 | ffdd02c9-ab40-4a1e-96df-4bb0fd085cba
(10 rows)

Time: 100.434 ms
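
Unlike max, avg cannot be merged by combining the per-shard results directly. A common approach in hash-distributed systems, assumed here rather than confirmed for Canopy, is for each shard to return a (sum, count) pair that the coordinator then combines. The sketch below shows this for branch_id '1'; the round-robin shard placement and the function names are made up for illustration.

```python
def partial_sum_count(ids):
    """Worker-side step: return (sum, count) for one shard's rows."""
    return (sum(ids), len(ids))


def combine_avg(partials):
    """Coordinator-side step: total sum divided by total count."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


# ids of branch_id '1' in canopy_table_detail: 1, 1001, ..., 999001
branch1_ids = list(range(1, 1_000_000, 1000))

# Hypothetical placement: deal the ids round-robin onto 32 shards.
shards = [branch1_ids[k::32] for k in range(32)]
branch1_avg = combine_avg([partial_sum_count(ids) for ids in shards])
print(branch1_avg)

# Averaging per-shard averages is wrong when shards hold unequal row counts.
uneven = [[1, 1001], [2001, 3001, 4001]]
exact = combine_avg([partial_sum_count(ids) for ids in uneven])
naive = sum(sum(ids) / len(ids) for ids in uneven) / len(uneven)
print(exact, naive)
```

The combined value for branch '1' is 499501.0, matching the avg column in both result sets above.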

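As shown above, for this complex SQL the distributed LightDB setup is much faster than the centralized one, even on a single instance: the EXPLAIN ANALYZE execution time of the max query drops from 747.084 ms to 193.623 ms (about 3.9x), and the avg query drops from 687.505 ms to 100.434 ms (about 6.8x).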

From: https://www.cnblogs.com/zhjh256/p/16927575.html
