
52 Things: Number 4: The Complexity Class P


This is the fourth blog post in the series '52 Things Every PhD Student Should Know To Do Cryptography', and the first on the topic of Theoretical Computer Science. In this post, I've been asked to define the complexity class P. Having recently joined the Cryptography group at Bristol as a Mathematician, I knew very little theoretical computer science when I first started my PhD, and I'm sure there will be plenty of others in my situation, so this post will start right from the beginning and you can skip over parts you know already. First, we'll give an overview of what complexity means and why it matters, then we'll define Turing machines, and finally arrive at the complexity class P, concluding with an example.
Most of the content of this post is a reworking of parts of Introduction to the Theory of Computation by Michael Sipser [1], which I have found hugely helpful.

Section 1: Complexity and Big O Notation

To design efficient programs, we want to know how difficult a given task is for a computer to do. The trouble is that the processing power of a computer varies massively depending on the hardware (e.g. see last week's '52 Things' blog). So we want a measure of the difficulty of a task that doesn't depend on the specific details of the machine performing the task. One way to do this is to bound the number of operations that a certain model of a computer would take to do it. The study of such bounds is called (time) complexity theory.

Typically, though, the number of operations required will depend on the input to the task and may vary even with inputs of the same length. As a pertinent example, say we design a computer program which tells you whether or not an integer you input is prime. If we give as input the number 256, the program will probably output 'not prime' sooner than if we had given it the number 323 (even though they both have length 9 when written as binary integers, for example), since the first integer has a very small prime factor (2) and the second has larger factors (17 and 19). Therefore we usually opt for a worst-case analysis where we record the longest running time of all inputs of a particular length. So we obtain an algebraic expression t(n) that reflects the longest running time of all inputs of length n.
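To make the worst-case idea concrete, here is a minimal sketch that counts the divisions performed by naive trial division. Trial division is only an assumption about how such a primality program might work, and the function names are made up for illustration.

```python
def trial_division_steps(n):
    """Return (is_prime, steps), where steps counts the divisions tried."""
    if n < 2:
        return False, 0
    steps = 0
    d = 2
    while d * d <= n:
        steps += 1
        if n % d == 0:
            return False, steps  # found a factor, stop early
        d += 1
    return True, steps


def worst_case_steps(bits):
    """Largest step count over all integers with exactly `bits` binary digits."""
    lo, hi = 1 << (bits - 1), 1 << bits
    return max(trial_division_steps(n)[1] for n in range(lo, hi))


print(trial_division_steps(256))  # (False, 1): the factor 2 is found at once
print(trial_division_steps(323))  # (False, 16): no factor until 17
print(worst_case_steps(9))        # 21: the value t(9) for this toy program
```

Both inputs are 9 bits long, yet they need very different numbers of steps; the worst-case figure is the one that t(n) records.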
Furthermore, when the input length n becomes very large, we can neglect all but the most dominant term in the expression and also ignore any constant factors. This is called asymptotic analysis; we assume n is enormous and ask roughly how many steps the model of computation will take to 'finish' when given the worst possible input of length n, writing our answer in the form O(t(n)). For example, if we find that our process takes 6n^3 - n^2 + 1 steps, we write that it is O(n^3), since all other terms can be ignored for very large n.
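As a quick numerical check of the example above, the sketch below compares 6n^3 - n^2 + 1 with n^3. It relies on the usual formal definition of big O (f is O(g) if f(n) <= c*g(n) for all n >= n0, for some constants c and n0), which the text above does not spell out; the function name is made up.

```python
def steps(n):
    """The example step count 6n^3 - n^2 + 1."""
    return 6 * n**3 - n**2 + 1


# The ratio to n^3 settles near the constant 6, so only the n^3 term matters.
for n in (10, 1_000, 100_000):
    print(n, steps(n) / n**3)

# With c = 6 and n0 = 1 the bound steps(n) <= c * n^3 holds on this range.
assert all(steps(n) <= 6 * n**3 for n in range(1, 10_000))
```

The lower-order terms contribute less and less as n grows, and the constant 6 is absorbed into the O.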

Section 2: Turing Machines

Now we give the model that is most often used in the kind of calculations performed in Section 1. First, recall that an alphabet is a non-empty finite set and a string is a finite sequence of elements (symbols) from an alphabet. A language is simply a set of strings.

A Turing machine models what real computers can do. Its 'memory' is an infinitely long tape. At any time, each square of the tape is either blank or contains a symbol from some specified alphabet. The machine has a tape head that can move left or right along the tape, one square at a time, and read from and write to that square. At first, the tape is all blank except for the leftmost n squares, which constitute the input (none of which can be blank, so that it is clear where the input ends). The tape head starts at the leftmost square, reads the first input symbol and then decides what to do next according to a transition function. The transition function depends on what it reads at the square it is currently on and the state that the machine is currently in (like a record of what it has done so far) and returns

a new state
another symbol to write to the square it is on (though this symbol might be the same as what was already written there)
a direction to move in: left or right.

The machine will continue to move one square at a time, read a symbol, evaluate the transition function, write a symbol and move again, until its state becomes some specified accept state or reject state.
If the machine ends up in the accept state, we say it accepts its input. Similarly, it may reject its input. In either case we say the machine halts on its input. But note that it may enter a loop without accepting or rejecting, i.e. it may never halt. If a Turing machine accepts every string in some language and rejects all other strings, then we say the machine decides that language. We can think of this as the machine testing whether or not the input string is a member of the language. Given a language, if there is a Turing machine that decides it, we say the language is decidable.

The power of this model comes from the fact that a Turing machine can do everything that a real computer can do (this is called the Church-Turing thesis [2]). We define the time complexity class TIME(t(n)) to be the collection of all languages that are decidable by an O(t(n)) time Turing machine. We then turn computational problems into questions about language membership (is an input string a member of a certain language? e.g. does this string representing an integer belong to the language of strings representing prime integers?), which lets us partition computational problems into time complexity classes.
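To see the definitions in action, here is a small sketch of a Turing machine simulator. Everything in it is an assumption made for illustration: the state names, the blank symbol "_", the rule that the head stays put if it tries to move left off the leftmost square, and the choice of language (binary strings containing an even number of 1s).

```python
BLANK = "_"

# Transition function: (state, symbol read) -> (new state, symbol to write, move)
EVEN_ONES = {
    ("q_even", "0"): ("q_even", "0", "R"),
    ("q_even", "1"): ("q_odd", "1", "R"),
    ("q_even", BLANK): ("accept", BLANK, "R"),
    ("q_odd", "0"): ("q_odd", "0", "R"),
    ("q_odd", "1"): ("q_even", "1", "R"),
    ("q_odd", BLANK): ("reject", BLANK, "R"),
}


def run_tm(transitions, tape_input, start="q_even", max_steps=10_000):
    """Run the machine and return (accepted, number of steps taken)."""
    tape = dict(enumerate(tape_input))  # squares not stored are blank
    state, head, steps = start, 0, 0
    while state not in ("accept", "reject"):
        if steps >= max_steps:
            raise RuntimeError("machine did not halt within max_steps")
        symbol = tape.get(head, BLANK)
        state, write, move = transitions[(state, symbol)]
        tape[head] = write
        head = max(head + (1 if move == "R" else -1), 0)
        steps += 1
    return state == "accept", steps


print(run_tm(EVEN_ONES, "1001"))  # (True, 5): two 1s, accepted
print(run_tm(EVEN_ONES, "1011"))  # (False, 5): three 1s, rejected
```

This machine halts on every input after n + 1 steps, so it decides its language in O(n) time, which places that language in TIME(n).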

Section 3: The Complexity Class P

Finally, we arrive at the purpose of this blog! If t(n) = n^k for some k > 0 then O(t(n)) is called polynomial time. The complexity class P is the class of all languages that are decidable in polynomial time by a Turing machine. Since k could be very large, such Turing machines are not necessarily all practical (let alone 'fast'!), but this class is a rough model for what can be realistically achieved by a computer. Note that the class P is fundamentally different to those languages where t(n) has n in an exponent, such as 2^n, which grow much, much faster as n increases; so fast that even if you have a decider for some language, you may find that the universe ends before it halts on your input!
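To get a feel for the gap between polynomial and exponential growth, here is a rough back-of-the-envelope sketch. The rate of 10^9 steps per second is an assumed figure for illustration only, and n^3 is just one example of a polynomial.

```python
STEPS_PER_SECOND = 1e9  # assumed machine speed, for illustration only
SECONDS_PER_YEAR = 365 * 24 * 3600

for n in (10, 50, 100):
    print(n, n**3, 2**n)

# Running time for an input of length 100 under the two growth rates.
poly_seconds = 100**3 / STEPS_PER_SECOND
expo_years = 2**100 / STEPS_PER_SECOND / SECONDS_PER_YEAR
print(poly_seconds)  # about a thousandth of a second
print(expo_years)    # on the order of 10^13 years
```

Under these assumptions the exponential-time run takes far longer than the roughly 1.4 x 10^10 years the universe has existed so far, while the cubic one finishes almost instantly.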
We conclude with an example of a polynomial time problem. Suppose you have a directed graph (a set of nodes and edges where there is at most one edge between any pair of nodes and each edge has an arrow indicating a direction), together with two of its nodes, A and B. If we encode the graph and the two nodes as a single string, we can form a language consisting of those strings representing a graph and two nodes such that it is possible to follow the edges from the first node and eventually arrive at the second. So a decider for this language will effectively answer the question of whether there is a path from the first node A to the second node B, called the path problem, by accepting or rejecting the graph and nodes you input. We give a decider for this language and show that it decides in polynomial time.

1. First put a mark on A.
2. Scan all the edges of the graph. If you find an edge from a marked node to an unmarked node, mark the unmarked node.
3. Repeat the above until you mark no new nodes.
4. If B is marked, accept. Otherwise, reject.
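The marking procedure above can be sketched in ordinary code. This is not a Turing machine, just an illustration of the same idea; representing the graph as a list of (from, to) pairs and naming the function path_exists are assumptions made for this example.

```python
def path_exists(edges, a, b):
    """Return (reachable, scans): is there a path from a to b, and how many
    full scans of the edge list were needed."""
    marked = {a}                # put a mark on the start node
    scans = 0
    while True:
        scans += 1
        newly_marked = False
        for u, v in edges:      # scan all the edges of the graph
            if u in marked and v not in marked:
                marked.add(v)
                newly_marked = True
        if not newly_marked:    # stop once a scan marks no new node
            break
    return b in marked, scans   # accept exactly when the target is marked


edges = [(3, 4), (2, 3), (1, 2)]  # the chain 1 -> 2 -> 3 -> 4, listed backwards
print(path_exists(edges, 1, 4))   # (True, 4)
print(path_exists(edges, 4, 1))   # (False, 1)
```

Every scan except the last marks at least one new node, so the number of scans never exceeds the number of nodes; here the edge order was chosen so that this bound is met exactly.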

This process successively marks the nodes that are reachable from A: after the first scan every node reachable by a path of length 1 is marked, after the second every node reachable by a path of length at most 2, and so on. So it is clear that a Turing machine implementing the above is a decider for our language. Now we consider the time complexity of this algorithm. If we couldn't do steps 1 and 4 in polynomial time, our machine would be terrible! So we focus on steps 2 and 3. Step 2 involves searching the input and placing a mark on one square, which is clearly polynomial time in the size of the input. Step 3 repeats step 2 no more times than the number of nodes, which is necessarily less than the size of the input (since the input must encode all the nodes of the graph) and is hence polynomial (in particular, linear) in the size of the input. Therefore the whole algorithm is polynomial time and so we say the path problem is in P.

From： https://www.cnblogs.com/3cH0-Nu1L/p/18104657
