[1] ZHANG Jin, HU Zi-da, LU Wen-bing, et al. Research on Similarity Detection of Project Based on Scratch[J]. Computer Technology and Development, 2023, 33(10): 143-149. [doi:10.3969/j.issn.1673-629X.2023.10.022]

Research on Similarity Detection of Project Based on Scratch

Computer Technology and Development [ISSN:1006-6977/CN:61-1281/TN]

Volume:
33
Issue:
2023, No. 10
Pages:
143-149
Section:
Artificial Intelligence
Publication Date:
2023-10-10

Article Info

Title:
Research on Similarity Detection of Project Based on Scratch
Article ID:
1673-629X(2023)10-0143-07
Author(s):
ZHANG Jin 1,2; HU Zi-da 1; LU Wen-bing 1; YANG Ding-kang 1; LI Qiang 1; LUO Yuan-sheng 2
1. School of Information Science and Engineering, Hunan Normal University, Changsha 410006, China;
2. School of Computer and Communication Engineering, Changsha University of Science & Technology, Changsha 410006, China
Keywords:
Scratch graphical programming; Siamese-BERT; CBOW; Siamese network; BERT; cosine similarity
CLC Number:
TP399
DOI:
10.3969/j.issn.1673-629X.2023.10.022
Abstract:
Scratch, a popular course in graphical programming, has attracted a large number of primary and secondary school students. Teachers usually assess how a student's project differs from the standard project by comparing the two manually, which is a heavy workload and takes considerable effort. Automatically recognizing similarity between Scratch projects can therefore help teachers check student projects quickly and improve teaching efficiency. To address this problem, a Siamese-BERT model is proposed to detect the similarity between two Scratch projects. First, the Scratch source file is parsed to extract the original block sequence, and a block reconstruction algorithm based on the logical characteristics of the blocks is proposed to sort the original block sequence into a Token sequence. The Token sequence is used as the input text for pre-training a CBOW (Continuous Bag of Words) model, which yields a word-vector model for Scratch. Then a Siamese neural network framework is trained in combination with the BERT (Bidirectional Encoder Representation from Transformers) model, and the outputs are finally fed into a cosine similarity function to compute the similarity. The dataset consists of training projects from Scratch training institutions in Changsha and practice projects by students. On this dataset, the Siamese-BERT model reaches an accuracy of 0.82, and it is more accurate than other text similarity models for Scratch project similarity detection.
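The first and last steps of the pipeline described in the abstract (turning a project into a token sequence, then scoring two projects with cosine similarity) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the block dictionaries imitate the `opcode`, `next`, and `topLevel` fields of a Scratch 3 `project.json`; the helper names `opcode_sequence` and `count_vector` are hypothetical; and simple opcode count vectors stand in for the learned Siamese-BERT embeddings. It is not the authors' block reconstruction algorithm.

```python
import math
from collections import Counter


def opcode_sequence(blocks):
    """Collect opcodes by following `next` links from each top-level block.

    `blocks` maps block id -> {"opcode": str, "next": id or None,
    "topLevel": bool}. Nested inputs such as loop bodies are ignored
    in this sketch (assumption: a flat script is enough to illustrate).
    """
    tokens = []
    # Sort the top-level ids so the resulting order is deterministic.
    top_level = sorted(bid for bid, blk in blocks.items() if blk.get("topLevel"))
    for block_id in top_level:
        current = block_id
        while current is not None:
            tokens.append(blocks[current]["opcode"])
            current = blocks[current].get("next")
    return tokens


def count_vector(tokens, vocab):
    """Bag-of-opcodes vector; a stand-in for a learned embedding."""
    counts = Counter(tokens)
    return [counts[term] for term in vocab]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Two toy projects (hypothetical data) that differ only in the last block.
project_a = {
    "a1": {"opcode": "event_whenflagclicked", "next": "a2", "topLevel": True},
    "a2": {"opcode": "motion_movesteps", "next": "a3", "topLevel": False},
    "a3": {"opcode": "looks_say", "next": None, "topLevel": False},
}
project_b = {
    "b1": {"opcode": "event_whenflagclicked", "next": "b2", "topLevel": True},
    "b2": {"opcode": "motion_movesteps", "next": "b3", "topLevel": False},
    "b3": {"opcode": "sound_play", "next": None, "topLevel": False},
}

tokens_a = opcode_sequence(project_a)
tokens_b = opcode_sequence(project_b)
vocab = sorted(set(tokens_a) | set(tokens_b))
similarity = cosine_similarity(
    count_vector(tokens_a, vocab), count_vector(tokens_b, vocab)
)
print(round(similarity, 4))
```

In the approach the abstract describes, the count vectors would be replaced by the outputs of the two Siamese branches; only the final cosine comparison is the same.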
Last Update: 2023-10-10