[1]柳 楠,卞忠勇,李 洋,等.基于 Contig 的单面基因组框架填充 2-近似算法[J].计算机技术与发展,2024,34(02):148-155.[doi:10. 3969 / j. issn. 1673-629X. 2024. 02. 022]
 LIU Nan,BIAN Zhong-yong,LI Yang,et al.A 2-Approximation Algorithm for Contig-based One-sided Genome Scaffold Filling[J].,2024,34(02):148-155.[doi:10. 3969 / j. issn. 1673-629X. 2024. 02. 022]
点击复制

基于 Contig 的单面基因组框架填充 2-近似算法()
分享到:

《计算机技术与发展》[ISSN:1006-6977/CN:61-1281/TN]

卷:
34
期数:
2024年02期
页码:
148-155
栏目:
人工智能
出版日期:
2024-02-10

文章信息/Info

Title:
A 2-Approximation Algorithm for Contig-based One-sided Genome Scaffold Filling
文章编号:
1673-629X(2024)02-0148-08
作者:
柳 楠卞忠勇李 洋朱永琦
山东建筑大学 计算机科学与技术学院,山东 济南 250101
Author(s):
LIU NanBIAN Zhong-yongLI YangZHU Yong-qi
School of Computer Science and Technology,Shandong Jianzhu University,Jinan 250101,China
关键词:
基因组框架填充近似算法贪婪策略最大匹配
Keywords:
genomescaffold fillingapproximation algorithmgreedy strategymaximum matching
分类号:
TP301
DOI:
10. 3969 / j. issn. 1673-629X. 2024. 02. 022
摘要:
随着基因测序技术的持续发展,基因组框架填充问题受到广泛关注。 该文针对基于 contig 的单面含重复基因的基因组框架填充问题开展研究。 通过设计有效的近似算法,完成根据参照基
因组,将缺失基因填充至基因测序获得的不完整框架中,提高基因组框架的完整性。 前期研究的基因组框架填充问题,缺失基因可以插入到不完整序列的任意两个基因之间,而基于片段重叠
群(contig) 的基因组框架填充,缺失基因的插入位置被限制在两个 contig 之间,更具一般性,该问题已被证明是 NP 完全问题。 现有的近似算法中,2-近似算法处理的实例具有特殊性,2. 57-近
似算法针对一般实例,但近似性能比不够理想。 该文以缺失基因、基因位点和断点三者之间的对应关系为基础,采用贪婪策略和最大匹配相结合的方式避免在填充过程中出现冗余公共邻接,
并通过生成新的 contig 增加外邻接的数量,将针对一般实例的算法近似性能比提高到 2,完成了基于 Python 的可视化程序开发,进一步验证了算法的有效性。
Abstract:
The problem of genome scaffold filling gets more and more attention with the advancement of genome sequencing technology.The contig-based one-sided genomic scaffold filling problem is researched. Effective approximation algorithms are devised to enhancethe integrity of the genome scaffold. The objective of algorithms is to insert the missing genes derived from genome sequencing into thescaffold according to the reference genome. In previous research on genome scaffold filling,missing genes can be inserted between anytwo genes in?
an incomplete sequence. While in genome scaffold based on contigs,the insertion position of missing genes is limited tobetween two contigs. This problem has been proven to be NP-complete and more general. Two related algorithms are analyzed. The 2-approximation is not for general but special case, and the approximation performance ratio of the 2. 57 - approximation algorithm isconsidered unsatisfactory. A new algorithm focusing on the correspondence between missing genes,gene slots and breakpoints is designedby using greedy method and maximal matching. The algorithm not only solves the redundant common adjacencies problem, but alsoincreases the number of external adjacencies by generating new contigs. As a result,the approximation performance for general case isimproved to a ratio of 2. The effectiveness of the algorithm is further verified by developing a visualization program based on Python.

相似文献/References:

[1]柳 楠*,朱永琦,李胜华,等.基于 Contig 的单面基因组片段填充问题研究[J].计算机技术与发展,2022,32(11):8.[doi:10. 3969 / j. issn. 1673-629X. 2022. 11. 002]
 LIU Nan*,ZHU Yong-qi,LI Sheng-hua,et al.Research Progress of One-sided Repetitive Genome Scaffold Filling Based on Contig[J].,2022,32(02):8.[doi:10. 3969 / j. issn. 1673-629X. 2022. 11. 002]
[2]柳 楠*,李胜华,朱永琦.基于 Contig 的双面基因组片段填充算法研究[J].计算机技术与发展,2022,32(12):213.[doi:10. 3969 / j. issn. 1673-629X. 2022. 12. 032]
 LIU Nan*,LI Sheng-hua,ZHU Yong-qi.Research on Two-sided Genome Scaffold Filling Algorithm Based on Contig[J].,2022,32(02):213.[doi:10. 3969 / j. issn. 1673-629X. 2022. 12. 032]

更新日期/Last Update: 2024-02-10