首页 > 其他分享 >Bigquery - CTE 官网

Bigquery - CTE 官网

时间:2023-09-11 18:12:36浏览次数:54  
标签:recursive expression iteration Bigquery CTE 官网 WHERE SELECT

Work with recursive CTEs

In GoogleSQL for BigQuery, a WITH clause contains one or more common table expressions (CTEs) with temporary tables that you can reference in a query expression. CTEs can be non-recursiverecursive, or both. The RECURSIVE keyword enables recursion in the WITH clause (WITH RECURSIVE).

A recursive CTE can reference itself, a preceding CTE, or a subsequent CTE. A non-recursive CTE can reference only preceding CTEs and can't reference itself. Recursive CTEs run continuously until no new results are found, while non-recursive CTEs run once. For these reasons, recursive CTEs are commonly used for querying hierarchical data and graph data.

For example, consider a graph where each row represents a node that can link to other nodes. To find the transitive closure of all reachable nodes from a particular start node without knowing the maximum number of hops, you would need a recursive CTE in the query (WITH RECURSIVE). The recursive query would start with the base case of the start node, and each step would compute the new unseen nodes that can be reached from all the nodes seen so far up to the previous step. The query concludes when no new nodes can be found.

However, recursive CTEs can be computationally expensive, so before you use them, review this guide and the WITH clause section of the GoogleSQL reference documentation.

Create a recursive CTE

To create a recursive CTE in GoogleSQL, use the WITH RECURSIVE clause as shown in the following example:

 
WITH RECURSIVE
  CTE_1 AS (
    (SELECT 1 AS iteration UNION ALL SELECT 1 AS iteration)
    UNION ALL
    SELECT iteration + 1 AS iteration FROM CTE_1 WHERE iteration < 3
  )
SELECT iteration FROM CTE_1
ORDER BY 1 ASC

The preceding example produces the following results:

 
/*-----------*
 | iteration |
 +-----------+
 | 1         |
 | 1         |
 | 2         |
 | 2         |
 | 3         |
 | 3         |
 *-----------*/

A recursive CTE includes a base term, a union operator, and a recursive term. The base term runs the first iteration of the recursive union operation. The recursive term runs the remaining iterations and must include one self-reference to the recursive CTE. Only the recursive term can include a self-reference.

In the preceding example, the recursive CTE contains the following components:

  • Recursive CTE name: CTE_1
  • Base term: SELECT 1 AS iteration
  • Union operator: UNION ALL
  • Recursive term: SELECT iteration + 1 AS iteration FROM CTE_1 WHERE iteration < 3

To learn more about the recursive CTE syntax, rules, and examples, see WITH clause in the GoogleSQL reference documentation.

Explore reachability in a directed acyclic graph (DAG)

You can use a recursive query to explore reachability in a directed acyclic graph (DAG). The following query finds all nodes that can be reached from node 5 in a graph called GraphData:

 
WITH RECURSIVE
  GraphData AS (
    --    1          5
    --   / \        / \
    --  2 - 3      6   7
    --      |       \ /
    --      4        8
    SELECT 1 AS from_node, 2 AS to_node UNION ALL
    SELECT 1, 3 UNION ALL
    SELECT 2, 3 UNION ALL
    SELECT 3, 4 UNION ALL
    SELECT 5, 6 UNION ALL
    SELECT 5, 7 UNION ALL
    SELECT 6, 8 UNION ALL
    SELECT 7, 8
  ),
  R AS (
    (SELECT 5 AS node)
    UNION ALL
    (
      SELECT GraphData.to_node AS node
      FROM R
      INNER JOIN GraphData
        ON (R.node = GraphData.from_node)
    )
  )
SELECT DISTINCT node FROM R ORDER BY node;

The preceding example produces the following results:

 
/*------*
 | node |
 +------+
 | 5    |
 | 6    |
 | 7    |
 | 8    |
 *------*/

Troubleshoot iteration limit errors

Recursive CTEs can result in infinite recursion, which occurs when the recursive term executes continuously without meeting a termination condition. To terminate infinite recursions, a limit of iterations for each recursive CTE is enforced. For BigQuery, the iteration limit is 500 iterations. Once a recursive CTE reaches the maximum number of iterations, the CTE execution is aborted with an error.

This limit exists because the computation of a recursive CTE can be expensive, and running a CTE with a large number of iterations consumes a lot of system resources and takes a much longer time to finish.

Queries that reach the iteration limit are usually missing a proper termination condition, thus creating an infinite loop, or using recursive CTEs in inappropriate scenarios.

If you experience a recursion iteration limit error, review the suggestions in this section.

Check for infinite recursion

To prevent infinite recursion, make sure the recursive term is able to produce an empty result after executing a certain number of iterations.

One way to check for infinite recursion is to convert your recursive CTE to a TEMP TABLE with a REPEAT loop for the first 100 iterations, as follows:

 
DECLARE current_iteration INT64 DEFAULT 0;

CREATE TEMP TABLE recursive_cte_name AS
SELECT base_expression, current_iteration AS iteration;

REPEAT
  SET current_iteration = current_iteration + 1;
  INSERT INTO recursive_cte_name
    SELECT recursive_expression, current_iteration
    FROM recursive_cte_name
    WHERE termination_condition_expression
      AND iteration = current_iteration - 1
      AND current_iteration < 100;
  UNTIL NOT EXISTS(SELECT * FROM recursive_cte_name WHERE iteration = current_iteration)
END REPEAT;

Replace the following values:

  • recursive_cte_name: The recursive CTE to debug.
  • base_expression: The base term of the recursive CTE.
  • recursive_expression: The recursive term of the recursive CTE.
  • termination_condition_expression: The termination expression of the recursive CTE.

For example, consider the following recursive CTE called TestCTE:

 
WITH RECURSIVE
  TestCTE AS (
    SELECT 1 AS n
    UNION ALL
    SELECT n + 3 FROM TestCTE WHERE MOD(n, 6) != 0
  )

This example uses the following values:

  • recursive_cte_nameTestCTE
  • base_expressionSELECT 1
  • recursive_expressionn + 3
  • termination_condition_expressionMOD(n, 6) != 0

The following code would therefore test the TestCTE for infinite recursion:

 
DECLARE current_iteration INT64 DEFAULT 0;

CREATE TEMP TABLE TestCTE AS
SELECT 1 AS n, current_iteration AS iteration;

REPEAT
SET current_iteration = current_iteration + 1;

INSERT INTO TestCTE
SELECT n + 3, current_iteration
FROM TestCTE
WHERE
  MOD(n, 6) != 0
  AND iteration = current_iteration - 1
  AND current_iteration < 10;

UNTIL
  NOT EXISTS(SELECT * FROM TestCTE WHERE iteration = current_iteration)
    END REPEAT;

-- Print the number of rows produced by each iteration

SELECT iteration, COUNT(1) AS num_rows
FROM TestCTE
GROUP BY iteration
ORDER BY iteration;

-- Examine the actual result produced for a specific iteration

SELECT * FROM TestCTE WHERE iteration = 2;

The preceding example produces the following results that include the iteration ID and the number of rows that were produced during that iteration:

 
/*-----------+----------*
 | iteration | num_rows |
 +-----------+----------+
 | 0         | 1        |
 | 1         | 1        |
 | 2         | 1        |
 | 3         | 1        |
 | 4         | 1        |
 | 5         | 1        |
 | 6         | 1        |
 | 7         | 1        |
 | 8         | 1        |
 | 9         | 1        |
 | 10        | 1        |
 *-----------+----------*/

These are the actual results produced during iteration 2:

 
/*----------+-----------*
 | n        | iteration |
 +----------+-----------+
 | 7        | 2         |
 *----------+-----------*/

If the number of rows is always greater than zero, which is true in this example, then the sample likely has an infinite recursion.

Verify the appropriate usage of the recursive CTE

Verify that you're using the recursive CTE in an appropriate scenario. Recursive CTEs can be expensive to compute because they're designed to query hierarchical data and graph data. If you aren't querying these two kinds of data, consider alternatives, such as using the LOOP statement with a non-recursive CTE.

Split a recursive CTE into multiple recursive CTEs

If you think your recursive CTE needs more than the maximum allowed iterations, you might be able to break down your recursive CTE into multiple recursive CTEs.

You can split a recursive CTE with a query structure similar to the following:

 
WITH RECURSIVE
  CTE_1 AS (
    SELECT base_expression
    UNION ALL
    SELECT recursive_expression FROM CTE_1 WHERE iteration < 500
  ),
  CTE_2 AS (
    SELECT * FROM CTE_1 WHERE iteration = 500
    UNION ALL
    SELECT recursive_expression FROM CTE_2 WHERE iteration < 500 * 2
  ),
  CTE_3 AS (
    SELECT * FROM CTE_2 WHERE iteration = 500 * 2
    UNION ALL
    SELECT recursive_expression FROM CTE_3 WHERE iteration < 500 * 3
  ),
  [, ...]
SELECT * FROM CTE_1
UNION ALL SELECT * FROM CTE_2 WHERE iteration > 500
UNION ALL SELECT * FROM CTE_3 WHERE iteration > 500 * 2
[...]

Replace the following values:

  • base_expression: The base term expression for the current CTE.
  • recursive_expression: The recursive term expression for the current CTE.

For example, the following code splits a CTE into three distinct CTEs:

 
WITH RECURSIVE
  CTE_1 AS (
    SELECT 1 AS iteration
    UNION ALL
    SELECT iteration + 1 AS iteration FROM CTE_1 WHERE iteration < 10
  ),
  CTE_2 AS (
    SELECT * FROM CTE_1 WHERE iteration = 10
    UNION ALL
    SELECT iteration + 1 AS iteration FROM CTE_2 WHERE iteration < 10 * 2
  ),
  CTE_3 AS (
    SELECT * FROM CTE_2 WHERE iteration = 10 * 2
    UNION ALL
    SELECT iteration + 1 AS iteration FROM CTE_3 WHERE iteration < 10 * 3
  )
SELECT iteration FROM CTE_1
UNION ALL
SELECT iteration FROM CTE_2 WHERE iteration > 10
UNION ALL
SELECT iteration FROM CTE_3 WHERE iteration > 20
ORDER BY 1 ASC

In the preceding example, 500 iterations is replaced with 10 iterations so that it's faster to see the results of the query. The query produces 30 rows, but each recursive CTE only iterates 10 times. The output looks like the following:

 
/*-----------*
 | iteration |
 +-----------+
 | 2         |
 | ...       |
 | 30        |
 *-----------*/

You could test the previous query on much larger iterations.

Use a loop instead of a recursive CTE

To avoid iteration limits, consider using a loop instead of a recursive CTE. You can create a loop with one of several loop statements, such as LOOPREPEAT, or WHILE. For more information, see Loops.

Change the recursive limit

If you think the following factors apply, contact Customer Care to raise the recursive limit:

  • You have a valid reason for your recursive CTE to run more than 500 iterations.
  • You're OK with a much longer execution.

Keep in mind that raising the recursive limit has potential risks:

  • Your CTE might fail with a different error message, such as memory exceeded or timeout.
  • If your project is using the on-demand pricing model, your CTE might still fail with a billing tier error until you switch to the capacity-based pricing model.
  • A recursive CTE with a large number of iterations consumes a lot of resources. This might impact other queries that are running within the same reservation, since they compete for shared resources.

标签:recursive,expression,iteration,Bigquery,CTE,官网,WHERE,SELECT
From: https://www.cnblogs.com/watermarks/p/17694159.html

相关文章

  • Burp Unexpected_information插件二次开发教程
    Unexpected_information是什么?Unexpected_information是一款开源的Burp插件,该插件的主要场景是用来辅助渗透测试工作,它可以自动的标记出请求包中的一些敏感信息、JS接口和一些特殊字段,防止我们在测试中疏忽了一些数据包,能发现平时渗透测试中忽略掉的一细节信息。Unexpected_i......
  • property value expectedcss(css-propertyvalueexpected)
     解决方法:  ......
  • 基于webman的CMS,企业官网通用PHP后台管理系统
    2023年9月11日10:47:00仓库地址:https://gitee.com/open-php/zx-webman-website还有laravel+scui的版本目前还未开源,电商laravel版本差不多,后续在移植webman算是比较标准的php+vue的项目CMS,企业官网通用PHP后台管理系统Englishdoc框架介绍WebmanSCUI后端开发组件php8.......
  • 优化官网的SEO排名
    要优化官网的SEO排名,可以考虑以下前端设置:优化网站结构和布局:确保网站的结构清晰,使用语义化的HTML标记,并使用CSS进行良好的布局和样式设计。这有助于搜索引擎正确解析和索引网页内容。确保网页加载速度快:优化网页的加载速度对于SEO排名很重要。您可以使用压缩和缓存静态资源......
  • 2、protected: 这种权限是为继承而设计的,protected所修饰的成员,对所有子类是可访问的,但
    2、protected:这种权限是为继承而设计的,protected所修饰的成员,对所有子类是可访问的,但只对同包的类是可访问的,对外包的非子类是不可以访问; protected权限的访问范围是:当前包所有的类+当前包以外的子类。 ......
  • 前端几个常用的官网模版记录一下
    无意间发现的几个官网模版对于一些要求不高,不需要特定设计的官网,这几个模版套一套,改一改,轻松解决!......
  • 【题解】CF1854C Expected Destruction
    你考虑,我们如果没有重合就将元素删去的操作,我们就有答案:\(n\times(m+1)-\sum\limits_{i=1}^na_i\)但是,我们显然最后的答案是小于这个的,如果有两个数在\(i\)相撞,那么我们的答案就会减少\((m-i+1)\)我们设\(f_{i,j}\)表示两个数分别在\(i\)和\(j\)的概率\((i\leqj......
  • 打包发布版时报错 Error: The apk for your currently selected variant cannot be si
    当直接运行release版本时,报错Error:Theapkforyourcurrentlyselectedvariantcannotbesigned.Pleasespecifyasigningconfigurationforthisvariant(release).解决报错:要在配置里添加签名android{signingConfigs{debug{storeFil......
  • gitee官网创建仓库和git操作【命令行】
     创建git仓库:  用户信息获取-git获取全局配置gitconfig--list--获取用户名gitconfiguser.name--获取邮箱gitconfiguser.email--创建仓库文件夹mkdirmemorandumcdmemorandum--git初始化gitinit--创建文件touchREADME.md--文件加入git控制gitaddREADME......
  • gitee官网创建仓库和git操作【TortoiseGit小乌龟】
     操作流程:登陆官网创建仓库,本地git克隆后添加文件后提交。没有账号要先注册账号。 官网地址:https://gitee.com/  新建仓库    创建后:  点击“初始化readme文件”按钮。  到本地克隆:    输入gitee的账号密码: 拉取完成: 拉取后本地文......