Research on Application of Disease Classification Methods in Medical Insurance (病种分类方法在医保中的应用研究)


Computer Technology and Development (《计算机技术与发展》) [ISSN:1006-6977/CN:61-1281/TN]

Volume:: 31
Issue:: 2021, No. 04

Pages:: 46-51

Column:: Big Data Analysis and Mining

Publication date:: 2021-04-10

Article Info

Title:: Research on Application of Disease Classification Methods in Medical Insurance

Article ID:: 1673-629X(2021)04-0046-06

Author(s):: SHAO Yun-xia (邵云霞)¹,²; WANG Cheng (王程)¹; CHENG Bin (成彬)¹; HAN Zhen-zhen (韩珍珍)¹; HAN Yue (韩月)³
1. Institute of Applied Mathematics, Hebei Academy of Sciences, Shijiazhuang 050081, China
2. Hebei Authentication Technology Engineering Research Center, Shijiazhuang 050081, China
3. Hebei Normal University, Shijiazhuang 050000, China

Keywords:: disease-based payment; natural language processing; machine learning; deep learning; disease classification

CLC number:: TP391

DOI:: 10.3969/j.issn.1673-629X.2021.04.008

Abstract:: As China's reform of disease-based medical insurance payment steadily advances, accurate and standardized disease classification has become an urgent problem for medical insurance and a key step in the new healthcare reform. The main difficulty at present is that hospitals' disease names and disease codes are not standardized, and the mapping between them is inconsistent. To address this, a disease category prediction model based on a combination of algorithms is proposed. First, data from the front pages of inpatient medical records are preprocessed through quality checks and cleaning. Datasets are then generated by oversampling and by increasing the weight of sensitive data, to address class imbalance among disease categories and cost sensitivity. Natural language processing is used to segment the Chinese text in the dataset and map it to a vector space, and text similarity is computed to screen disease groups. SVM and Text CNN are combined into the disease prediction model, which is tested on datasets of different sample sizes, and the results are analyzed. Experiments were then run on the front-page data of more than 300,000 medical records of appendicitis patients from 2012 to 2018. The results show that SVM suits models for rare diseases with small sample sizes, where it is effective and stable, while Text CNN suits models for common diseases with large sample sizes, where it achieves high accuracy. Finally, open problems and development directions in this field are discussed.
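
Two of the steps named in the abstract, oversampling to balance disease categories and text similarity to screen disease groups, can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the character-bigram tokenizer stands in for Chinese word segmentation (the abstract does not name a segmenter), the function names and the `threshold` value are hypothetical, and the sample diagnosis strings are made up for the example.

```python
import math
import random
from collections import Counter


def char_bigrams(text):
    """Split a diagnosis string into overlapping character bigrams.

    Assumption: a stand-in for Chinese word segmentation, which a real
    pipeline would do with a dedicated segmenter.
    """
    text = text.strip()
    if len(text) < 2:
        return [text] if text else []
    return [text[i:i + 2] for i in range(len(text) - 1)]


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-bigrams count vectors."""
    va, vb = Counter(char_bigrams(a)), Counter(char_bigrams(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def screen_groups(diagnosis, group_names, threshold=0.3):
    """Return disease groups similar enough to the diagnosis, best first.

    `threshold` is a hypothetical cut-off, not a value from the paper.
    """
    scored = [(cosine_similarity(diagnosis, g), g) for g in group_names]
    return [g for s, g in sorted(scored, reverse=True) if s >= threshold]


def random_oversample(records, seed=0):
    """Randomly duplicate minority-class records until all classes match the largest.

    `records` is a list of (text, label) pairs.
    """
    rng = random.Random(seed)
    by_label = {}
    for text, label in records:
        by_label.setdefault(label, []).append((text, label))
    target = max(len(items) for items in by_label.values())
    balanced = []
    for items in by_label.values():
        balanced.extend(items)
        balanced.extend(rng.choice(items) for _ in range(target - len(items)))
    return balanced


if __name__ == "__main__":
    # Made-up group names: chronic appendicitis, acute cholecystitis, acute appendicitis.
    groups = ["慢性阑尾炎", "急性胆囊炎", "急性阑尾炎"]
    print(screen_groups("急性阑尾炎", groups))

    # Made-up records with an imbalanced label distribution.
    data = [("急性阑尾炎", "acute")] * 3 + [("慢性阑尾炎", "chronic")]
    print(Counter(label for _, label in random_oversample(data)))
```

The balanced records would then feed whichever classifier fits the sample size; per the abstract, that is SVM for rare diseases with few samples and Text CNN for common diseases with many.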
