首页 > 其他分享 >COMPSCI 369 Computational Biology

COMPSCI 369 Computational Biology

时间:2024-06-10 19:44:14浏览次数:25  
标签:Biology Computational matrix tree length state COMPSCI marks answer

COMPSCI 369

THE UNIVERSITY OF AUCKLAND FIRST SEMESTER, 2023 Campus: Central City COMPUTER SCIENCE Computational Methods in Interdisciplinary Science Time allowed: THREE hours)

NOTE: This is a restricted book exam. You are allowed a single sheet of A4 paper with notes writtenon it.This exam has 16 questions, and it is worth 120 marks in total.There are 4 sections.Section A consists 4 short answer questions worth 30 marks in total.Section B consists 5 short answer questions worth 20 marks in total.Section C consists 4 short answer questions worth 32 marks in total.Section D consists 3 short answer questions worth 38 marks in total.Answer all questionsThe exam is worth 55% of the final gradePage 1 of 7COMPSCI 369

Section A: Computational Biology, Numerical Integration &

Game Theory

Computational Game Theory

  1. In lectures we discussed David Chess’s paper ‘Simulating the evolution of behavior: the iteratedprisoners’ dilemma problem’. In this paper, Chess reported on four phases in his model: “The Eraof Exploitation,” “The Nadir,” “The Growth of Trust,” and “Equilibrium.”(a) Describe each of the four phases and their relation to each other. [4 marks]b) Explain two reasons why it was necessary to use computational methods to study this model.3 marks]

Modelling Dynamical Systems

  1. The following equation specifies a discrete-time dynamical system. In this equation, α is a parameter.xt+1 = α min(xt, 1 xt)

  (a) When α < 1, there is a single fixed point. What is it? [1 mark]

  (b) When α = 1, there are an infinite number of fixed points. What are they? [2 marks]

  (c) What would be appropriate to use as labels for each axis of a bifurcation diagram of thisystem? [2 marks]

  (d) Write pseudocode for generating a bifurcation diagram for this system. [10 marks]

  1. Briefly describe the Euler and Runge-Kutta methods for numerical integration and explain therelationship between them. [4 marks]
  1. Identify a situation where Euler integration would be perfectly accurate and explain why this is thecase.[4 marks]

Page 2 of 7COMPSCI 369

Section B: Sequence Alignment

  1. The partially completed F matrix for calculating the local alignment of the sequences GCT andTAACT is given below. The score matrix is given by s(a, b) = 2 when a = b and s(a, a) = 4.The linear gap penalty is d = 3.

  (a) Complete the matrix by finding values for u, v, w and x and showing traceback pointers.[4 marks]

  (b) Give the score for the best local alignment of these two sequences and provide an alignmentthat has this score. [3 marks]

  1. What is the biological motivation for using an affine rather than a linear gap penalty? [2 marks]
  2. Computationally, how can one efficiently perform alignment with an affine gap penalty and whatis the computational cost of doing so when compared to a linear gap? Use asymptotic notation aspart of your answer. [4 marks]
  1. Describe the main barrier to finding an exact solution to the multiple alignment problem. Useasymptotic notation as part of your answer. [2 marks]
  1. Describe the main steps of the heuristic algorithm we discussed in lectures for solving the multiplelignment problem, including the use of neutral characters. (You do not need to give preciseformulae for how the distances are calculated.) [5 marks]Page 3 of 7COMPSCI 369

Section C: Simulation and HMMs

  1. What does it mean for a sequence of random variables X0, X1, X2, . . . to have the Markov property? Express your answer in plain English and in mathematical notation. [2 marks]
  1. You are given a method choice(x,prob), where the arrays x and prob are of equal length,and the sum of the elements of prob is 1. choice(x,prob) returns x[i] with probabilityprob[i].Write a pseudo-code method simHMM(a,e,L,s) that takes as input a transition matrix a, anemission matrix e, a length L and a start state s. It should return state and symbol sequences oflength L with the state sequence starting in state s. Use integers corresponding to array indices torepresent states and emissions. [6 marks]
  1. Given the method choice(x,prob) as defined in Question 11, write a pseudo-code methodandwalk(k) that simulates a random walk of length k starting at 0 where steps of -1 and +1are equally likely. Assume the argument k is a positive integer. Your method should return anarray of length k where walk[i] is the position of the random walk after i steps. Show how youcan use this method to estimate the probability that the position of a random walker after 50 stepsmore than 10 steps from its starting point. [5 marks]Page 4 of 7COMPSCI 369
  1. Consider an HMM with states A, B, C each of which emit symbols Q, R, S, T. The transitions aregiven by the following table which has omitted the transition probabilities into state C.The model starts in state A 60% of the time, state C 40% of the time and never in state B.The emission probabilities for the model are given by the following table.

  (a) Write down the values of the missing elements in the transition matrix. [2 marks]

  (b) Sketch a diagram of the HMM, showing all states, possible transitions and transition probabilities. Include the begin state but no end state. Do notinclude emission probabilities in thediagram. [3 marks]

  (c) Explain why the length of a run of Bs in a state sequence follows a geometric distribution andgive the length of an average run of Bs. [3 marks]

  (d) What is the joint probability P(x, π) of the state sequence π = ABB and the symbol sequencex = QTR? Leave your answer as a product or sum of numbers. [3 marks]

  (e) Complete the entries i, j and k in the forward matrix below using the recursion fk(i + 1) =ek(xi+1) P l alkfl(xi). Remember to show your working.[5 marks]

  (f) The forward algorithm is used to calculate P(x). When π = ABB and x =QRR, is P(x)greater than, less than, or equal to P(x, π)? Justify your answer. [3 marks]

Section D: Trees

  specify the pairwise distances, Dij , between the four sequences x1, . . . , x4.

  (a) Construct a UPGMA tree from D showing your working. [5 marks]

  (b) Will UPGMA or neighbour-joining (or both or neither) reconstruct the correct tree in thiscase? Explain your answer. [2 marks]

  (c) Describe when you would use neighbour-joining and when you would use UPGMA. [3 marks]

  (a) Explain what parsimony informative means, and identify the parsimony informative sites inthe alignment. [2 marks]

  (b) By calculating the parsimony score for each possible tree topology for these four taxa, findthe maximum parsimony tree. [5 marks]

  (c) Demonstrate (for example, on a single branch in a one of your trees) how ancestral reconstructions can be used to estimate branch length on the maximum parsimony tree. [4 marks]

  (d) Describe two significant drawbacks of the parsimony method. [3 marks]

  (a) Why do we rely on heuristic methods to find a maximum likelihood tree? Describe one suchheuristic and explain whether this heuristic will typically   find the tree that maximises thelikelihood. [4 marks]

  (b) Given mutation rate parameter µ and normalised rate matrix Q, how do you calculate theprobability that a C mutates to a T along a lineage of length   t = 3? (Recall we denote, forexample, the (A, A)th entry of a matrix B by BAA.) [3 marks]

  (c) Let X and Y be sequences of length L. How can you use the calculation in part (b) tocalculate the probability that X mutates into Y over a lineage of   length t = 3? Explain anyassumptions you are making. [2 marks]

  (d) In order to efficiently calculate the likelihood of the tree, what assumption do we make about the mutation process on different lineages? [2 marks]

  (e) In parsimony and distance based methods, sites that are constant across all sequences arenot informative about the tree. Explain whether or not the   same applies to likelihood basedmethods. [3 marks]

 

标签:Biology,Computational,matrix,tree,length,state,COMPSCI,marks,answer
From: https://www.cnblogs.com/qq99515681/p/18240948

相关文章

  • 52 Things: Number 3: Computational and storage power of different form factors
    52Things:Number3:Computationalandstoragepowerofdifferentformfactors52件事:数字3:不同外形尺寸的计算和存储能力Thisisthethirdinaseriesofblogpoststoaddressthelistof '52ThingsEveryPhDStudentShouldKnow' todoCryptography.Thes......
  • 第三届世界华人计算生物学大会 The 3rd Worldwide Chinese Computational Biology Con
    第三届世界华人计算生物学大会发布:2020年08月03日11:58浏览:52次【转】The3rdWorldwideChineseComputationalBiologyConference 时间:2020年8月3日-8月6日线上会议&实时直播:https://www.koushare.com/live/liveroom?islive=0&lid=394&roomid=132792会议官网:https://q......
  • Bioinformatics/ Computational Biology /biostats
    BioinformaticsComputationalBiologybiostats 对于这两个专业,我们可以从应用领域来区分:●Biostatistics生物统计学的研究方向可分为两类:统计遗传学和临床统计学;课程中与生物相关的内容很少,更重视学生的量化能力。●而Bioinformatics生物信息学与数据挖......
  • ICBCB 生物信息学与计算生物学国际会议(The 10th International Conference on Bioi
     十届生物信息学与计算生物学国际会议(ICBCB2022)成功举办 编辑:张谊来源:生命科学学院时间:2022年05月20日访问次数:2197  2022年5月13-15日,由浙江大学生命科学学院主办的第十届生物信息学与计算生物学国际会议(The10thInternationalConferenceonBioinformaticsand......
  • 2023-8-24 Quantom Computational Advantage Using Pertons 光量子计算优越性 2023人
    QuantomComputationalAdvantageUsingPertons光量子计算优越性|2023人工智能大会青年科学家论坛钟瀚森上海人工智能实验室论文背景:量子计算有望在许多重要任务上实现超越经典的计算能力。但长期以来受限于实验技术,无法在实际任务上演示超越经典计算机的性能。论文成......
  • 题解 P9702【[GDCPC2023] Computational Geometry】
    这题一看就不是计算几何,考虑区间DP。设凸多边形的\(n\)个顶点依次为\(P_1,P_2,\cdots,P_n\)。设\(f_{i,j}\)在\(i<j\)时表示\(P_i,P_{i+1},\cdots,P_{j-1},P_j\)组成的多边形的直径的平方,在\(i>j\)时表示\(P_i,P_{i+1},\cdots,P_n,P_1,\cdots,P_{j-1},P_j\)组......
  • Mark Fan:A computational model study on the mechanical response mechanism of asp
    WuhanJiangxiaRoadandBridgeEngineeringCo.,LtdSchoolofCivilEngineeringandArchitecture,WuhanInstituteofTechnologyMarkFan 15927602711Introduction:Asphaltisacommonlyusedmaterialinroadconstruction,anditsmechanicalpropertiespl......
  • 【Introductory Biology】Lecture 7 - Replication 复制
    文章目录SlidesRefDNA复制是指DNA双链在细胞分裂间期阶段进行以一个亲代DNA分子为模板合成子代DNA链的过程。复制的结果是一条双链变成两条一样的双链(如果复制过程正常的话),每条双链都与原来的双链一样(排除突变等不定因素)。参与DNA复制叉工作的许多酶。SlidesRef【麻省理工公开课】......
  • 【Introductory Biology】Lecture 12 - Chemical Genetics 1 - Cell Division & Segre
    文章目录SlidesRefSlidescentraldogma中心法则…Andwe’regoingtostarttalkingabouthowinformationflowsbetweencells,sofromaparentcelltoitsdaughtercells.Andwe’realsogonnatalkabouthowinformationflowsfromgenerationtothenext.Andth......
  • COMPSCI 589 问题解答
    COMPSCI589Homework4-Spring2023DueMay6,2023,11:55pmEasternTime1InstructionsThishomeworkassignmentconsistsofaprogrammingportion.Whileyoumaydiscussproblemswithyourpeers,youmustanswerthequestionsonyourownandimplementall......