As a popular course in graphical programming, Scratch has attracted a large number of primary and secondary school students. The differences between students' projects and the standard projects are usually evaluated by the teacher through manual comparison and inspection, which imposes a heavy workload on teachers and consumes a great deal of their energy. Recognizing similarities between Scratch projects can therefore help teachers check students' projects quickly and improve teaching efficiency. To solve this problem, the Siamese-BERT model is proposed to detect the similarity between two Scratch projects. First, the Scratch source file is parsed to extract the sequence of original building blocks, and a building block reconstruction algorithm, based on the logical characteristics of the building blocks, is proposed to sort this sequence into a token sequence. The token sequence is used as the input text for pre-training a Continuous Bag of Words (CBOW) model, which yields a Scratch word vector model. Then, a Siamese neural network framework is trained jointly with the BERT (Bidirectional Encoder Representations from Transformers) model, and its outputs are finally fed into a cosine similarity function to calculate the similarity. The data set comes from the training projects of a Scratch training institution in Changsha City and the practice projects of students. On this data set, the accuracy of the Siamese-BERT model reaches 0.82. Compared with other text similarity models, the Siamese-BERT model is more accurate in detecting the similarity of Scratch projects.
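The first and last stages of this pipeline can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the Scratch 3 `project.json` layout, in which each block carries `opcode`, `next`, and `topLevel` fields, and the helper names `opcode_sequence` and `cosine_similarity` are hypothetical. The paper's reconstruction algorithm, the CBOW pre-training, and the Siamese-BERT encoder are not reproduced here; the sketch only follows `next` links and ignores nested blocks.

```python
import math


def opcode_sequence(blocks):
    """Flatten top-level scripts into a list of opcodes.

    `blocks` is assumed to map block ids to Scratch 3 block objects.
    Only `next` links are followed; blocks nested inside inputs
    (for example loop bodies) are ignored in this simplified sketch.
    """
    sequence = []
    for block_id, block in blocks.items():
        # Skip entries that are not block objects or do not start a script.
        if not isinstance(block, dict) or not block.get("topLevel"):
            continue
        current = block_id
        while current is not None:
            node = blocks[current]
            sequence.append(node["opcode"])
            current = node.get("next")
    return sequence


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical two-block script: "when flag clicked" followed by "move steps".
example_blocks = {
    "a": {"opcode": "event_whenflagclicked", "next": "b", "topLevel": True},
    "b": {"opcode": "motion_movesteps", "next": None, "topLevel": False},
}
tokens = opcode_sequence(example_blocks)
```

In the described model, the two vectors passed to the cosine similarity function would be the outputs of the two Siamese branches for the two projects being compared; any vectors used with this sketch are stand-ins.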