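Some thinking prompted by a bug
Requirements in brief
Convert an Excel sheet of student information into JSON.
One step of the code takes each Excel row and substitutes its data for the default values in a JSON template.
Original code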
```python
import xlrd
import json


class GetStudentInfo(object):
    def __init__(self, student_info_path):
        self.student_info_path = student_info_path
        self.template = {
            "name": "ZhangSan",
            "sex": "female",
            "grade": "6",
            "age": "12",
            "id": "0"
        }

    def create_new_student(self, name, student_id):
        new_student = self.template
        new_student['name'] = name
        new_student['id'] = student_id
        return new_student

    def get_whole_stu_info(self):
        students = {}
        tables = xlrd.open_workbook(self.student_info_path)
        table = tables.sheets()[0]
        for row in range(0, table.nrows - 1):
            name = table.cell_value(row + 1, 0)
            student_id = table.cell_value(row + 1, 1)
            new_student = self.create_new_student(name, student_id)
            students[str(row)] = new_student
        self.get_new_file(students)

    def get_new_file(self, students):
        with open('./output.json', 'w', encoding='utf-8') as file:
            json.dump(students, file, indent=4, ensure_ascii=False)


if __name__ == '__main__':
    student_info_path = './student_info.xlsx'
    data = GetStudentInfo(student_info_path)
    data.get_whole_stu_info()
```
The output file is:
```json
{
    "0": {
        "name": "GouDong",
        "sex": "female",
        "grade": "6",
        "age": "12",
        "id": 4.0
    },
    "1": {
        "name": "GouDong",
        "sex": "female",
        "grade": "6",
        "age": "12",
        "id": 4.0
    },
    "2": {
        "name": "GouDong",
        "sex": "female",
        "grade": "6",
        "age": "12",
        "id": 4.0
    },
    "3": {
        "name": "GouDong",
        "sex": "female",
        "grade": "6",
        "age": "12",
        "id": 4.0
    }
}
```
Locating the bug
```python
def create_new_student(self, name, student_id):
    new_student = self.template
    print(id(new_student))
    new_student['name'] = name
    new_student['id'] = student_id
    return new_student
```
It turns out that the `id` of `new_student` is the same on every call:
```
2110590375320
2110590375320
2110590375320
2110590375320
```
Solution
Change the method to:
```python
# Requires `import copy` at the top of the module.
def create_new_student(self, name, student_id):
    new_student = copy.deepcopy(self.template)
    print(id(new_student))
    new_student['name'] = name
    new_student['id'] = student_id
    return new_student
```
Now the printed ids differ:
```
2392740865832
2392740866072
2392740866152
2392740865752
```
The new output file is:
```json
{
    "0": {
        "name": "LiuBo",
        "sex": "female",
        "grade": "6",
        "age": "12",
        "id": 1.0
    },
    "1": {
        "name": "BoCai",
        "sex": "female",
        "grade": "6",
        "age": "12",
        "id": 2.0
    },
    "2": {
        "name": "CaiGou",
        "sex": "female",
        "grade": "6",
        "age": "12",
        "id": 3.0
    },
    "3": {
        "name": "GouDong",
        "sex": "female",
        "grade": "6",
        "age": "12",
        "id": 4.0
    }
}
```
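Analysis
In the original code, `new_student = self.template` makes `new_student` refer to `self.template` on every call, and `students[str(row)] = new_student` makes `students[str(row)]` refer to that same object. No new dict is ever created, so every entry in `students` is the one template dict. Each modification made through `new_student` therefore shows up in all entries, and they all end up holding the values of the last row.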
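The aliasing described above can be reproduced without Excel at all. The following is a minimal sketch, where the `rows` list is a hypothetical stand-in for the spreadsheet data and the template is shortened to two keys:

```python
# Hypothetical stand-in for the rows read from the spreadsheet.
rows = [("LiuBo", 1.0), ("BoCai", 2.0), ("CaiGou", 3.0), ("GouDong", 4.0)]

template = {"name": "ZhangSan", "id": "0"}
students = {}

for index, (name, student_id) in enumerate(rows):
    new_student = template          # no copy: just another name for `template`
    new_student["name"] = name
    new_student["id"] = student_id
    students[str(index)] = new_student

# Every value in `students` is the very same dict object as `template`.
all_same = all(entry is template for entry in students.values())
print(all_same)               # True
print(students["0"]["name"])  # GouDong (the last row, not the first)
```

Checking identity with `is` gives the same information as comparing the printed `id()` values, and it also shows that the template itself has been overwritten.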
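Avoiding this problem comes down to the difference between shallow and deep copies.
Assignment: only creates an alias, a reference that points at the existing object, so `id()` returns the same value as for the original. (Like a shortcut to a file.)
Shallow copy: the first level is copied, but everything nested inside is still a reference. (A new object is created with a new address, and it is filled with the addresses of the original data, like a folder that contains nothing but shortcuts.)
Deep copy: the new object gets its own memory, and so does everything inside it. It is a full clone that has no connection to the original, so changing the original leaves the clone untouched. (A new object is created with a new address, and it is filled with clones that also have new addresses, like a folder that contains real files rather than shortcuts.)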
Next, it is worth being clear about how `=`, `copy.copy` and `copy.deepcopy` differ in Python.
```python
a = 1
b = a
print('original a', a, 'address', id(a))
print('b', b, 'address', id(b))

b = 2
print('a now', a, 'address', id(a))
print('b after modification', b, 'address', id(b))
```
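After the modification, the address of `b` has changed: `b = 2` rebinds the name `b` to a different object, while `a` still refers to `1`.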
```python
c = [1, 2]
print('original c', c, 'address', id(c))
d = c
print('d', d, 'address', id(d))

d[0] = 3
print('c now', c, 'address', id(c))
print('d after modification', d, 'address', id(d))
```
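After the modification, the address of `d` is unchanged, because only an element inside `d` was modified. Since `c` and `d` are the same list, `c` shows the change as well.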
```python
e = {
    "name": "ZhangSan",
    "id": "0"
}
print('original e', e, 'address', id(e))
f = e
print('f', f, 'address', id(f))

f['id'] = '1'
print('e now', e, 'address', id(e))
print('f after modification', f, 'address', id(f))
```
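After the modification, the address of `f` is unchanged, because only a value inside `f` was modified. `e` and `f` are the same dict, so `e` shows the change as well.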
`copy.copy`: shallow copy
`copy.deepcopy`: deep copy
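The three behaviours only diverge once an object contains other mutable objects. The sketch below compares them on a nested dict; the `scores` key is a hypothetical addition used purely for illustration and is not part of the student template above.

```python
import copy

original = {"name": "ZhangSan", "scores": [90, 85]}

alias = original                   # plain assignment: same object
shallow = copy.copy(original)      # new outer dict, shared inner list
deep = copy.deepcopy(original)     # new outer dict and new inner list

print(alias is original)                        # True
print(shallow is original)                      # False
print(shallow["scores"] is original["scores"])  # True
print(deep["scores"] is original["scores"])     # False

# Mutating the nested list through the original is visible in the
# shallow copy but not in the deep copy.
original["scores"].append(70)
print(shallow["scores"])  # [90, 85, 70]
print(deep["scores"])     # [90, 85]
```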
From the official documentation of the `copy` module:
The difference between shallow and deep copying is only relevant for compound objects (objects that contain other objects, like lists or class instances):
A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.
A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.
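Note the key words "references" and "copies": a shallow copy inserts references to the original's contents, whereas a deep copy inserts copies of them.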
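One consequence worth noting for the student template in this post: all of its values are strings, which are immutable, so a shallow copy would probably have been enough to fix the bug. The sketch below assumes a module-level `template` and a standalone `create_new_student` function (a simplification of the class method above). `copy.deepcopy` remains the safer choice if the template later gains nested lists or dicts.

```python
import copy

template = {
    "name": "ZhangSan",
    "sex": "female",
    "grade": "6",
    "age": "12",
    "id": "0",
}


def create_new_student(name, student_id):
    # A shallow copy gives each student its own top-level dict.
    # template.copy() or dict(template) would behave the same way here.
    new_student = copy.copy(template)
    new_student["name"] = name
    new_student["id"] = student_id
    return new_student


first = create_new_student("LiuBo", 1.0)
second = create_new_student("BoCai", 2.0)

print(first is second)   # False
print(first["name"])     # LiuBo
print(template["name"])  # ZhangSan (the template is left untouched)
```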