Converting a DataFrame column of lists of strings to tuples

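Question:

I have URLs in lists, and each list is one element of a DataFrame. I need to convert each of these lists of strings to a hashable type such as a tuple. I have read that a tuple written with a trailing comma, (a,), preserves the strings in the list when converting, but I can't seem to get it to work when applied to a whole DataFrame column. It is probably something simple.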

df['url'] = tuple(df['url',]) ... does not work

flatframe['url'] = flatframe['url'].apply(tuple) ... works, but does not preserve the strings

Here are a few columns of the data:

index artist ranking song songurl songtext artisturl year
2280 (Lady Antebellum,) 81 [Bartender (Lady Antebellum song)] [/wiki/Bartender_(Lady_Antebellum_song)] "Bartender (Lady Antebellum song)" /wiki/Lady_Antebellum 2014
2281 (Naughty Boy, Sam Smith) 82 [La La La (Naughty Boy song)] [/wiki/La_La_La_(Naughty_Boy_song)] "La La La (Naughty Boy song)" [/wiki/Naughty_Boy, /wiki/Sam_Smith_(singer)] 2014
2282 (Robin Thicke, T.I., Pharrell Williams) 83 [Blurred Lines] [/wiki/Blurred_Lines] "Blurred Lines" [/wiki/Robin_Thicke, /wiki/T.I., /wiki/Pharrel... 2014
2283 (Lady Gaga, R. Kelly) 84 [Do What U Want] [/wiki/Do_What_U_Want] "Do What U Want" [/wiki/Lady_Gaga, /wiki/R._Kelly] 2014

Can you provide some sample data for us to look at? – yesemsanthoshkumar

Maybe this will help... https://stackoverflow.com/questions/37994791/in-pandas-how-to-read-csv-files-with-lists-in-a-column – yesemsanthoshkumar

Can you provide a sample of the output you expect? –

Suppose the DataFrame looks like this:

import pandas as pd

# pd.set_printoptions was removed from pandas; set_option is the current equivalent.
pd.set_option("display.max_columns", 10)

# Note: ("Lady Antebellum") is a plain str, because parentheses without a
# trailing comma do not create a tuple.
df = pd.DataFrame(
    [[2280, ("Lady Antebellum"), 81, ["Bartender (Lady Antebellum song)"], ["/wiki/Bartender_(Lady_Antebellum_song)"], "Bartender (Lady Antebellum song)", "/wiki/Lady_Antebellum", 2014],
     [2281, "(Naughty Boy, Sam Smith)", 82, ["La La La (Naughty Boy song)"], ["/wiki/La_La_La_(Naughty_Boy_song)"], "La La La (Naughty Boy song)", ["/wiki/Naughty_Boy", "/wiki/Sam_Smith_(singer)"], 2014],
     [2282, "(Robin Thicke, T.I., Pharrell Williams)", 83, ["Blurred Lines"], ["/wiki/Blurred_Lines"], "Blurred Lines", ["/wiki/Robin_Thicke", "/wiki/T.I. /wiki/Pharrel"], 2014],
     [2283, "(Lady Gaga, R. Kelly)", 84, ["Do What U Want"], ["/wiki/Do_What_U_Want"], "Do What U Want", ["/wiki/Lady_Gaga", "/wiki/R._Kelly"], 2014]],
    columns=["index", "artist", "ranking", "song", "songurl", "songtext", "artisturl", "year"])

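Then you can try:

df["artisturl"] = df["artisturl"].apply(lambda x: (x,) if isinstance(x, str) else tuple(x))

This applies tuple directly only to the non-string items (the lists). A string item is wrapped into a one-element tuple instead, because calling tuple on a string treats it as a sequence of characters and returns a tuple with one entry per character.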
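The difference is easy to see in plain Python, without pandas. The following is only a small sketch; the variable names (`url`, `chars`, `wrapped_list`, `wrapped_comma`, `from_list`) are made up for illustration, and the URLs are taken from the sample data above.

```python
url = "/wiki/Lady_Antebellum"

# tuple() iterates over its argument, so a str is split into characters.
chars = tuple(url)
print(chars[:3])  # ('/', 'w', 'i')

# Wrapping the string first keeps it whole; both spellings give the same result.
wrapped_list = tuple([url])
wrapped_comma = (url,)
print(wrapped_comma)  # ('/wiki/Lady_Antebellum',)

# A list of strings converts element by element, which is what we want.
from_list = tuple(["/wiki/Lady_Gaga", "/wiki/R._Kelly"])
print(from_list)  # ('/wiki/Lady_Gaga', '/wiki/R._Kelly')
```

This is why a plain .apply(tuple) looks right for rows that hold lists but breaks the rows that hold a bare string.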

The artisturl column will then look like this:

>>> df.artisturl 
0       ('/wiki/Lady_Antebellum',) 
1 ('/wiki/Naughty_Boy', '/wiki/Sam_Smith_(singer)') 
2 ('/wiki/Robin_Thicke', '/wiki/T.I. /wiki/Pharr... 
3    ('/wiki/Lady_Gaga', '/wiki/R._Kelly') 
Name: artisturl
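If the conversion is needed for more than one column, the lambda can be pulled out into a named function. This is just one possible sketch: `to_tuple` is a hypothetical helper name, not something pandas provides, and the `rows` list stands in for a column's values so the example runs without pandas.

```python
def to_tuple(value):
    """Return value as a tuple, keeping a bare string as a single element."""
    if isinstance(value, str):
        return (value,)
    return tuple(value)


# Stand-in for a column that mixes bare strings and lists of strings.
rows = [
    "/wiki/Lady_Antebellum",
    ["/wiki/Lady_Gaga", "/wiki/R._Kelly"],
    ["/wiki/Lady_Gaga", "/wiki/R._Kelly"],
]

converted = [to_tuple(r) for r in rows]

# Tuples of strings are hashable, so they can go into a set (lists cannot).
unique = set(converted)
print(len(unique))  # 2
```

With a DataFrame, the same function could be passed to apply, for example df["artisturl"].apply(to_tuple).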