[1] CAO Mao-jun, XIAO Yang. Research on Logging Text Classification Method Based on K-BERT[J]. Computer Technology and Development, 2025, (05): 197-204. [doi:10.20165/j.cnki.ISSN1673-629X.2024.0390]

Research on Logging Text Classification Method Based on K-BERT

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
Issue:
2025, No. 05
Pages:
197-204
Column:
New Computing Application Systems
Publication date:
2025-05-10

Article Info

Title:
Research on Logging Text Classification Method Based on K-BERT
Article ID:
1673-629X(2025)05-0197-08
Author(s):
CAO Mao-jun, XIAO Yang
School of Computer and Information Technology, Northeast Petroleum University, Daqing 163318, Heilongjiang, China
Keywords:
K-BERT; TextCNN; logging text; text classification; logging knowledge graph
CLC number:
TP391.1
DOI:
10.20165/j.cnki.ISSN1673-629X.2024.0390
Abstract:
In petroleum exploration and development, processing and classifying well logging text data is a key step in improving the efficiency and accuracy of well logging data interpretation. However, well logging texts contain many professional terms and complex data structures, so traditional text classification techniques perform poorly on such domain-specific data and fail to meet practical application requirements. To address this problem, this paper proposes an improved K-BERT text classification method that combines the K-BERT model with the text feature extraction capability of TextCNN. K-BERT introduces a knowledge graph of the well logging domain and embeds this domain knowledge into the model, which strengthens the model's understanding of professional terminology and complex semantics and thus improves semantic capture in domain-specific text classification. TextCNN, in turn, uses the properties of convolutional neural networks to extract local features of the text and capture fine-grained textual detail, further improving classification accuracy and robustness. Together, the two components provide a new solution for classifying well logging texts. Comparative experiments show that the proposed method outperforms traditional text classification models on macro precision, macro recall, and macro F1 score, which supports its effectiveness in domain-specific text classification.
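The abstract reports results in terms of macro precision, macro recall, and macro F1. As a minimal illustrative sketch of how such macro-averaged metrics are commonly computed, the Python below averages per-class scores with equal weight per class. It assumes macro F1 is the unweighted mean of per-class F1 values (one common convention; the paper's exact definition is not given here), and the function name `macro_scores` and the labels in the usage lines are hypothetical, not taken from the paper.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return (macro precision, macro recall, macro F1).

    Assumption: macro F1 is the unweighted mean of per-class F1 scores.
    Classes are all labels that occur in y_true or y_pred.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("no samples given")

    # Per-class true positives, false positives, false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical two-class example (labels are placeholders).
y_true = ["a", "a", "b", "b"]
y_pred = ["a", "b", "b", "b"]
macro_p, macro_r, macro_f1 = macro_scores(y_true, y_pred)
```

Because every class contributes equally regardless of its sample count, macro averaging is sensitive to performance on rare classes, which is one reason it is often preferred for imbalanced domain-specific datasets.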
Last Update: 2025-05-10