[1]柳 楠*,李胜华,朱永琦.基于 Contig 的双面基因组片段填充算法研究[J].计算机技术与发展,2022,32(12):213-220.[doi:10. 3969 / j. issn. 1673-629X. 2022. 12. 032]
 LIU Nan*,LI Sheng-hua,ZHU Yong-qi.Research on Two-sided Genome Scaffold Filling Algorithm Based on Contig[J].,2022,32(12):213-220.[doi:10. 3969 / j. issn. 1673-629X. 2022. 12. 032]
点击复制

基于 Contig 的双面基因组片段填充算法研究()
分享到:

《计算机技术与发展》[ISSN:1006-6977/CN:61-1281/TN]

卷:
32
期数:
2022年12期
页码:
213-220
栏目:
新型计算应用系统
出版日期:
2022-12-10

文章信息/Info

Title:
Research on Two-sided Genome Scaffold Filling Algorithm Based on Contig
文章编号:
1673-629X(2022)12-0213-08
作者:
柳 楠* 李胜华朱永琦
山东建筑大学 计算机科学与技术学院,山东 济南 250101
Author(s):
LIU Nan* LI Sheng-huaZHU Yong-qi
School of Computer Science and Technology,Shandong Jianzhu University,Jinan 250101,China
关键词:
基因组多项式时间贪婪搜索片段重叠群算法测试邻接
Keywords:
genomepolynomial timegreedy searchcontigalgorithm testadjacency
分类号:
TP301
DOI:
10. 3969 / j. issn. 1673-629X. 2022. 12. 032
摘要:
随着生物测序技术的快速发展,人们能够在更短的时间内获得大规模的基因片段序列,如何对不完整的基因组片段进行填充使其变得完整已经被越来越多的人关注。 基于普通序列的双面基因组问题是将缺失基因 X 填充到一个不完整基因组片段 A 中,得到 A’ ,将缺失基因 Y 填充到一个不完整基因组片段 B 中,得到 B’ ,使得 A’ 和 B’ 之间的邻接数最大化。 测序获得的基因组片段通常由片段重叠群(contig) 组成,缺失基因只能插入到 contig 两端。 该算法针对基因组片段重叠群一类实例,首先为了简化基因组片段,进行合并符合条件的连续缺失串操作,其后进行具有一一对应关系的 type-2 类型缺失串的插入操作;其次采用贪婪搜索策略对 type-1 类型缺失串进行插入操作,然后对 type-3 串进行插入操作。 设计了多项式时间算法,证明了算法的正确性,分析了算法的时间复杂度,开发了算法测试平台,进一步验证了算法的有效性。
Abstract:
With the rapid development of biological sequencing technology, large scale gene scaffold sequences can be obtained byhumans in a shorter time. How to fill the incomplete genome scaffold to make them complete has attracted more and more attention. Thetwo-sided genome problem based on common sequence is to fill the missing gene X into an incomplete genome scaffold A to get A’ ,andfill the missing gene Y into an incomplete genome scaffold B to get B’ ,so as to maximize the number of adjacency between A’ and B’ . Thegenomic scaffold obtained by sequencing are usually composed of scaffold overlap groups ( contig) ,and the missing genes can only beinserted at both ends of contig. For the example of overlapping groups of genome segments,in order to simplify the genome fragment, the operation of merging the qualified continuous missing strings is carried out firstly,and then the insertion of the type-2 missing stringswith one-to-one correspondence is carried out. Secondly,the greedy search strategy is used to insert the type-1 missing string,and thenthe type-3 string is inserted. A polynomial time algorithm is designed and proved in correctness. The time complexity of the algorithm isanalyzed,and the algorithm test platform is developed to further verify the effectiveness of the algorithm.

相似文献/References:

[1]柳 楠*,朱永琦,李胜华,等.基于 Contig 的单面基因组片段填充问题研究[J].计算机技术与发展,2022,32(11):8.[doi:10. 3969 / j. issn. 1673-629X. 2022. 11. 002]
 LIU Nan*,ZHU Yong-qi,LI Sheng-hua,et al.Research Progress of One-sided Repetitive Genome Scaffold Filling Based on Contig[J].,2022,32(12):8.[doi:10. 3969 / j. issn. 1673-629X. 2022. 11. 002]
[2]柳 楠,卞忠勇,李 洋,等.基于 Contig 的单面基因组框架填充 2-近似算法[J].计算机技术与发展,2024,34(02):148.[doi:10. 3969 / j. issn. 1673-629X. 2024. 02. 022]
 LIU Nan,BIAN Zhong-yong,LI Yang,et al.A 2-Approximation Algorithm for Contig-based One-sided Genome Scaffold Filling[J].,2024,34(12):148.[doi:10. 3969 / j. issn. 1673-629X. 2024. 02. 022]

更新日期/Last Update: 2022-12-10