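Thoughts Sparked by a Bug
Requirements in brief
Convert an Excel sheet of student information to JSON.
One step in the code takes each Excel row and substitutes its values for the defaults in a JSON template.
Original code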
```python
import xlrd
import json


class GetStudentInfo(object):
    def __init__(self, student_info_path):
        self.student_info_path = student_info_path
        # Template holding the default value of every field
        self.template = {
            "name": "ZhangSan",
            "sex": "female",
            "grade": "6",
            "age": "12",
            "id": "0"
        }

    def create_new_student(self, name, student_id):
        new_student = self.template
        new_student['name'] = name
        new_student['id'] = student_id
        return new_student

    def get_whole_stu_info(self):
        students = {}
        tables = xlrd.open_workbook(self.student_info_path)
        table = tables.sheets()[0]
        # Row 0 is skipped (presumably the header); data is read from rows 1..nrows-1
        for row in range(0, table.nrows - 1):
            name = table.cell_value(row + 1, 0)
            student_id = table.cell_value(row + 1, 1)
            new_student = self.create_new_student(name, student_id)
            students[str(row)] = new_student
        self.get_new_file(students)

    def get_new_file(self, students):
        with open('./output.json', 'w', encoding='utf-8') as file:
            json.dump(students, file, indent=4, ensure_ascii=False)


if __name__ == '__main__':
    # Note: xlrd 2.0+ reads only .xls files; opening .xlsx requires xlrd < 2.0
    student_info_path = './student_info.xlsx'
    data = GetStudentInfo(student_info_path)
    data.get_whole_stu_info()
```
The output file is:
```json
{
    "0": {
        "name": "GouDong",
        "sex": "female",
        "grade": "6",
        "age": "12",
        "id": 4.0
    },
    "1": {
        "name": "GouDong",
        "sex": "female",
        "grade": "6",
        "age": "12",
        "id": 4.0
    },
    "2": {
        "name": "GouDong",
        "sex": "female",
        "grade": "6",
        "age": "12",
        "id": 4.0
    },
    "3": {
        "name": "GouDong",
        "sex": "female",
        "grade": "6",
        "age": "12",
        "id": 4.0
    }
}
```
Locating the bug
```python
def create_new_student(self, name, student_id):
    new_student = self.template  # This line is the problem!
    print(id(new_student))
    new_student['name'] = name
    new_student['id'] = student_id
    return new_student
```
Every call prints the same id for new_student:
```
2110590375320
2110590375320
2110590375320
2110590375320
```
Solution
Change the method to:
```python
# Requires "import copy" at the top of the file
def create_new_student(self, name, student_id):
    new_student = copy.deepcopy(self.template)  # fix, option 1
    # new_student = copy.copy(self.template)  # fix, option 2
    print(id(new_student))
    new_student['name'] = name
    new_student['id'] = student_id
    return new_student
```
The printed ids now differ:
```
2392740865832
2392740866072
2392740866152
2392740865752
```
The new output file is:
```json
{
    "0": {
        "name": "LiuBo",
        "sex": "female",
        "grade": "6",
        "age": "12",
        "id": 1.0
    },
    "1": {
        "name": "BoCai",
        "sex": "female",
        "grade": "6",
        "age": "12",
        "id": 2.0
    },
    "2": {
        "name": "CaiGou",
        "sex": "female",
        "grade": "6",
        "age": "12",
        "id": 3.0
    },
    "3": {
        "name": "GouDong",
        "sex": "female",
        "grade": "6",
        "age": "12",
        "id": 4.0
    }
}
```
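Analysis
In the original code, new_student = self.template makes new_student refer to self.template on every call, and students[str(row)] = new_student makes every students[str(row)] refer to that same object. No new dict is ever created, so each time new_student is modified, every entry in students shows the latest values.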
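The effect can be reproduced without Excel. The following is a minimal sketch, assuming a trimmed-down template and a few made-up rows in place of the spreadsheet; the standalone function mirrors the original method but is not the author's exact code.

```python
template = {"name": "ZhangSan", "id": "0"}


def create_new_student(name, student_id):
    # Same pattern as the original method: no copy is made,
    # so new_student is just another name for template.
    new_student = template
    new_student["name"] = name
    new_student["id"] = student_id
    return new_student


rows = [("LiuBo", 1.0), ("BoCai", 2.0), ("CaiGou", 3.0)]  # made-up sample rows
students = {}
for index, (name, student_id) in enumerate(rows):
    students[str(index)] = create_new_student(name, student_id)

# Every value is the very same dict object, so all entries show the last row.
all_same_object = all(value is template for value in students.values())
print(all_same_object)  # True
print(students["0"])    # {'name': 'CaiGou', 'id': 3.0}
```

The `is` operator compares object identity, which is the same check as comparing the values returned by id().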
Avoiding this problem comes down to the difference between shallow and deep copies.
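- Assignment: only an alias, a reference to the existing object; its id is identical to the original's. (Like a shortcut.)
- Shallow copy: the first level is copied, but everything nested inside is still a reference. (A new object with a new address is created, and it is filled with the addresses of the original data, like a folder that contains nothing but shortcuts.)
- Deep copy: the new object gets its own memory address, different from the original's, and so does everything inside it. It is a complete clone with no ties to the original, so changing the original leaves the clone untouched. (A new object with a new address is created, and it is filled with clones that also have new addresses, like a folder that contains real files rather than shortcuts.)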
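The three cases in the list can be checked with identity tests. This is a small sketch, assuming a dict with a nested list; the "scores" field is hypothetical and does not appear in the original template.

```python
import copy

original = {"name": "ZhangSan", "scores": [90, 85]}  # "scores" is a hypothetical nested field

alias = original                # assignment: a second name for the same object
shallow = copy.copy(original)   # new outer dict, inner objects are shared
deep = copy.deepcopy(original)  # new outer dict and new inner objects

print(alias is original)                        # True
print(shallow is original)                      # False
print(shallow["scores"] is original["scores"])  # True
print(deep is original)                         # False
print(deep["scores"] is original["scores"])     # False
```

Only the deep copy owns its inner list; the shallow copy is the "folder of shortcuts" from the analogy.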
Next, three things in Python need to be told apart: the = sign, copy.copy, and copy.deepcopy.
```python
a = 1
b = a
print('original a', a, 'address', id(a))
print('b', b, 'address', id(b))
b = 2
print('a now', a, 'address', id(a))
print('modified b', b, 'address', id(b))
# output
# original a 1 address 140720364485696
# b 1 address 140720364485696
# a now 1 address 140720364485696
# modified b 2 address 140720364485728
```
After the change, b has a different address: b = 2 rebinds the name b to another object (the integer 2), while a still refers to 1.
```python
c = [1, 2]
print('original c', c, 'address', id(c))
d = c
print('d', d, 'address', id(d))
d[0] = 3
print('c now', c, 'address', id(c))
print('modified d', d, 'address', id(d))
# output
# original c [1, 2] address 2758022890888
# d [1, 2] address 2758022890888
# c now [3, 2] address 2758022890888
# modified d [3, 2] address 2758022890888
```
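After the change, d's address is the same, because only an element inside the list was modified in place; c and d still refer to the same list, so c changes too.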
```python
e = {
    "name": "ZhangSan",
    "id": "0"
}
print('original e', e, 'address', id(e))
f = e
print('f', f, 'address', id(f))
f['id'] = '1'
print('e now', e, 'address', id(e))
print('modified f', f, 'address', id(f))
# output
# original e {'name': 'ZhangSan', 'id': '0'} address 2001978290072
# f {'name': 'ZhangSan', 'id': '0'} address 2001978290072
# e now {'name': 'ZhangSan', 'id': '1'} address 2001978290072
# modified f {'name': 'ZhangSan', 'id': '1'} address 2001978290072
```
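After the change, f's address is the same, because only one value inside the dict was modified in place; e and f still refer to the same dict.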
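What separates the integer example from the list and dict examples is the operation, not the type: assigning to the name itself rebinds it, while assigning to an element changes the shared object in place. A short sketch of both operations on the same list, reusing the names c and d:

```python
c = [1, 2]
d = c

d[0] = 3    # in-place mutation: c and d still name the same list
same_after_mutation = d is c
print(c, same_after_mutation)          # [3, 2] True

d = [3, 2]  # rebinding: d now names a brand-new list, c is untouched
same_after_rebinding = d is c
print(c, d == c, same_after_rebinding)  # [3, 2] True False
```

After the rebinding, d and c hold equal values but are different objects, so later changes to d no longer show up in c.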
copy.copy: shallow copy
copy.deepcopy: deep copy
Official documentation: the copy module
The difference between shallow and deep copying is only relevant for compound objects (objects that contain other objects, like lists or class instances):
A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.
A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.
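Note the key wording: a shallow copy inserts references to the objects found in the original, whereas a deep copy inserts copies of those objects.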
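This also suggests why both options in the fix produce correct output: the template is a flat dict of strings, and create_new_student only replaces top-level keys. The sketch below illustrates that reasoning, assuming a trimmed-down flat template and a hypothetical nested one (the "scores" field is invented for illustration).

```python
import copy

flat_template = {"name": "ZhangSan", "id": "0"}
nested_template = {"name": "ZhangSan", "scores": {"math": 0}}  # hypothetical nested field

# Flat template: replacing a key in the shallow copy leaves the template alone.
student = copy.copy(flat_template)
student["name"] = "LiuBo"
print(flat_template["name"])  # ZhangSan

# Nested template: the shallow copy shares the inner dict with the template.
shallow_student = copy.copy(nested_template)
shallow_student["scores"]["math"] = 95
print(nested_template["scores"]["math"])  # 95

# A deep copy gets its own inner dict, so the template keeps its value.
deep_student = copy.deepcopy(nested_template)
deep_student["scores"]["math"] = 60
print(nested_template["scores"]["math"])  # 95
```

If the template ever gains nested fields, copy.deepcopy is the safer of the two options.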