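How do I parse a CSV string into a dictionary of dictionaries (or lists) in Python 2.7?
I've written a Python script that fetches data from an API, which returns it as 'text/csv'. What I want to do is use the headers from the CSV data to build a dictionary of dictionaries, or a dictionary of lists, whichever is more efficient.
The output is one long string, which I've split into a list and then turned into a dictionary with the code below (I've sanitized it):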
# Makes API call
resultsreturn = requests.get(url,headers=head)
# Grabs text from API call
# Data is returned in one long string:
# '"Header1,Header2,Header3,Header4\\nR1C1,R1C2,R1C3,R1C4\\nR2C1,R2C2,R2C3,R2C4"'
results_json_data = json.dumps(resultsreturn.text)
# Splits results into list:
# ['"Header1,Header2,Header3,Header4', 'R1C1,R1C2,R1C3,R1C4', 'R2C1,R2C2,R2C3,R2C4"']
list_results_split = results_json_data.split('\\n')
#Splits list into dictionary.
dict_results = dict(zip(range(len(list_results_split)), list_results_split))
Printing dict_results looks like this:
{0: '"Header1,Header2,Header3,Header4',
1: 'R1C1,R1C2,R1C3,R1C4',
2: 'R2C1,R2C2,R2C3,R2C4"'}
So what I'd like to end up with is something that looks like this:
{0: {"Header1":"R1C1", "Header2":"R1C2", "Header3":"R1C3", "Header4":"R1C4"},
1: {"Header1":"R2C1", "Header2":"R2C2", "Header3":"R2C3", "Header4":"R2C4"},
2: {"Header1":"R3C1", "Header2":"R3C2", "Header3":"R3C3", "Header4":"R3C4"}}
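I've also just noticed that the string created from results_json_data has a " at the beginning and a " at the end, which I probably need to strip out to get anything that looks like what I want. Hopefully someone can point me in the right direction; I'm fairly new to programming/Python.
Take a look at the csv module and its DictReader class. Where possible, you should use the dedicated library for handling CSV data rather than doing it yourself: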
> import csv
# first param must be an iterable producing strings (the lines of your csv data)
# this typically is a file-like object, but can be a plain list
> reader = csv.DictReader(list_results_split, delimiter=',')
> reader.fieldnames
["Header1", "Header2", "Header3", "Header4"]
> lst = list(reader)
> lst
[{"Header1":"R1C1", "Header2":"R1C2", "Header3":"R1C3", "Header4":"R1C4"},
{"Header1":"R2C1", "Header2":"R2C2", "Header3":"R2C3", "Header4":"R2C4"},
{"Header1":"R3C1", "Header2":"R3C2", "Header3":"R3C3", "Header4":"R3C4"}]
# And
> dict(enumerate(lst))
{0: {"Header1":"R1C1", "Header2":"R1C2", "Header3":"R1C3", "Header4":"R1C4"},
1: {"Header1":"R2C1", "Header2":"R2C2", "Header3":"R2C3", "Header4":"R2C4"},
2: {"Header1":"R3C1", "Header2":"R3C2", "Header3":"R3C3", "Header4":"R3C4"}}
Looking at your original string and your output, you should consider stripping the " characters before processing:
results_json_data = results_json_data.strip('"')
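Putting the pieces together, here is a minimal end-to-end sketch. It assumes the response body is plain CSV text; `raw_text` is a made-up stand-in for `resultsreturn.text`, since the real API response is not shown. The stray quotes and the literal backslash-n sequences appear to come from `json.dumps`, which wraps a string in double quotes and escapes its newlines, so one option is to skip that call and split the text directly:

```python
import csv
import json

# Hypothetical stand-in for resultsreturn.text (the real response is not shown).
raw_text = "Header1,Header2,Header3,Header4\nR1C1,R1C2,R1C3,R1C4\nR2C1,R2C2,R2C3,R2C4"

# json.dumps() on a string adds surrounding double quotes and turns each
# newline into the two characters backslash + n.
dumped = json.dumps(raw_text)

# Working on the text directly avoids both problems.
reader = csv.DictReader(raw_text.splitlines())
rows = list(reader)                    # consume the reader exactly once
dict_results = dict(enumerate(rows))   # {0: {...}, 1: {...}}

print(dict_results[0]["Header1"])  # R1C1
```

Under Python 2.7 the csv module does not support Unicode input, so text containing non-ASCII characters may need to be encoded before it is passed to the reader.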
Thanks! I'm close, but still stuck. When I call list(reader) I get the results. But when I then use dict(enumerate(reader)), I just get {} as output. – Keefer
Yes, the list() call exhausts the underlying iterator. You can either omit it or store the result: 'l = list(reader)', then 'dict(enumerate(l))'. Updating my answer... – schwobaseggl
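The exhaustion described in this comment can be reproduced with a small sketch. The sample lines are made up for illustration:

```python
import csv

# Made-up sample data: one header line and two rows.
lines = ["Header1,Header2", "R1C1,R1C2", "R2C1,R2C2"]

reader = csv.DictReader(lines)
first_pass = list(reader)               # reads every row from the reader
second_pass = dict(enumerate(reader))   # nothing is left to read

print(len(first_pass))  # 2
print(second_pass)      # {}

# Storing the list and reusing it gives the expected result.
indexed = dict(enumerate(first_pass))
print(sorted(indexed))  # [0, 1]
```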
For the fun of a (semi) one-liner:
string = """Header1,Header2,Header3,Header4
R1C1,R1C2,R1C3,R1C4
R2C1,R2C2,R2C3,R2C4"""
# splitlines() splits on line breaks only, so values containing spaces survive
lines = string.splitlines()
headers, data = lines[0].split(","), lines[1:]
d = {j:{headers[i]:data[j].split(",")[i] for i in range(len(headers))} for j in range(len(data))}
Output:
{0: {'Header2': 'R1C2', 'Header3': 'R1C3', 'Header1': 'R1C1', 'Header4': 'R1C4'},
1: {'Header2': 'R2C2', 'Header3': 'R2C3', 'Header1': 'R2C1', 'Header4': 'R2C4'}}
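A variant of the same idea, sketched here with zip and enumerate instead of index arithmetic, splits each row only once. The names `text`, `lines`, and `d` are illustrative. Like the comprehension above, it assumes no field contains a quoted comma; the csv module handles that case.

```python
text = """Header1,Header2,Header3,Header4
R1C1,R1C2,R1C3,R1C4
R2C1,R2C2,R2C3,R2C4"""

lines = text.splitlines()
headers, data = lines[0].split(","), lines[1:]

# Pair each header with the value in the same column, one dict per row.
d = {i: dict(zip(headers, row.split(","))) for i, row in enumerate(data)}

print(d[1]["Header3"])  # R2C3
```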
Can you post the raw response from the API? – jspurim
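@jspurim I can't, because the data contains information that I'm not able to post on the internet. I've reproduced the format exactly, except that I've changed the values and the number of rows/columns. The number of columns will always stay the same, but the number of rows will vary. – Keefer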