[1] SHAO Yun-xia, WANG Cheng, CHENG Bin, et al. Research on Application of Disease Classification Methods in Medical Insurance[J]. Computer Technology and Development, 2021, 31(04): 46-51. [doi:10.3969/j.issn.1673-629X.2021.04.008]

Research on Application of Disease Classification Methods in Medical Insurance

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
31
Issue:
No. 04, 2021
Pages:
46-51
Column:
Big Data Analysis and Mining
Publication date:
2021-04-10

Article Info

Title:
Research on Application of Disease Classification Methods in Medical Insurance
Article ID:
1673-629X(2021)04-0046-06
Author(s):
SHAO Yun-xia 1,2; WANG Cheng 1; CHENG Bin 1; HAN Zhen-zhen 1; HAN Yue 3
1. Institute of Applied Mathematics, Hebei Academy of Sciences, Shijiazhuang 050081, China;
2. Hebei Authentication Technology Engineering Research Center, Shijiazhuang 050081, China;
3. Hebei Normal University, Shijiazhuang 050000, China
Keywords:
disease payment; natural language processing; machine learning; deep learning; disease classification
CLC number:
TP391
DOI:
10.3969/j.issn.1673-629X.2021.04.008
Abstract:
As the reform of China's medical insurance toward disease-based payment advances steadily, accurate and standardized disease classification has become an urgent problem for medical insurance and a key link in the smooth progress of the new medical reform. The biggest current difficulty is that hospitals' disease names and disease codes are not standardized, and the correspondence between them is confused. We therefore propose a disease category prediction model based on a combination of algorithms. First, data from the front page of inpatient medical records are preprocessed through quality checking and cleaning. Data sets are then generated by oversampling and by increasing the weight of sensitive data, to address class imbalance among disease categories and cost sensitivity. Natural language processing is used to segment the Chinese text in the data set into words and map them to a vector space, and text similarity is computed to screen disease groups. SVM and Text CNN are combined to form the disease prediction model, which is tested on data sets of different sample sizes, and the results are analyzed. Front-page medical record data from more than 300,000 appendicitis patients between 2012 and 2018 were then used for the experiment. The results show that SVM suits models for rare diseases with small sample sizes, where it is effective and stable, whereas Text CNN suits models for common diseases with large sample sizes, where it achieves high accuracy. Finally, open problems and development directions in this field are discussed.
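Two of the steps named in the abstract, oversampling to balance disease categories and text-similarity screening of disease groups, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain random oversampling, bag-of-words vectors with cosine similarity, and text that has already been segmented into tokens. The function names, the sample records, and the 0.3 threshold are all hypothetical.

```python
import math
import random
from collections import Counter


def oversample(records, seed=0):
    """Randomly duplicate minority-class records until every class
    has as many records as the largest class (assumed strategy)."""
    rng = random.Random(seed)
    by_label = {}
    for tokens, label in records:
        by_label.setdefault(label, []).append((tokens, label))
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def screen_groups(query_tokens, group_tokens, threshold=0.3):
    """Return the names of disease groups whose reference tokens are at
    least `threshold` similar to the query, most similar first."""
    query = Counter(query_tokens)
    scored = [
        (cosine_similarity(query, Counter(tokens)), name)
        for name, tokens in group_tokens.items()
    ]
    return [name for score, name in sorted(scored, reverse=True) if score >= threshold]


# Hypothetical, already-segmented diagnosis texts with disease-group labels.
records = [
    (["acute", "appendicitis"], "A"),
    (["acute", "appendicitis", "peritonitis"], "A"),
    (["chronic", "appendicitis"], "A"),
    (["appendix", "abscess"], "B"),
]
balanced = oversample(records)
```

In a fuller pipeline the balanced records would feed a classifier such as SVM or Text CNN, and real Chinese text would first need a word segmenter; both are outside this sketch.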


Last Update: 2020-04-10