Get the second element of a tuple (in a list of tuples) as a string

Question:

I have an output that is a list of tuples. It looks like this:

annot1=[(402L, u"[It's very seldom that you're blessed to find your equal]"), 
     (415L, u'[He very seldom has them in this show or his movies]')…

I need to take only the second part of each tuple, apply 'split' to it, and get each word of the sentence separately.

At this point, I am unable to isolate the second part of the tuple (the text).

Here is my code:

def scope_match(annot1): 
    scope = annot1[1:] 
    scope_string = ''.join(scope)
    scope_set = set(scope_string.split(' '))

but I get:

TypeError: sequence item 0: expected string, tuple found

I tried annot1[1], but it gives me the second index of the text instead of the second element of the tuple.
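
To illustrate the difference between indexing the list and indexing a tuple inside it, here is a minimal sketch. It is written in Python 3 syntax as an assumption (the question uses Python 2, so the `L` suffix and `u` prefix are dropped), the sample data is copied from the question, and the names `second_tuple`, `first_text` and `texts` are illustrative only.

```python
# Sample data from the question, rewritten as Python 3 literals (assumption).
annot1 = [
    (402, "[It's very seldom that you're blessed to find your equal]"),
    (415, "[He very seldom has them in this show or his movies]"),
]

# annot1[1] is the second *tuple* in the list, not a string.
second_tuple = annot1[1]

# annot1[1:] is a list of tuples, so str.join() rejects it with a TypeError.
try:
    "".join(annot1[1:])
    join_failed = False
except TypeError:
    join_failed = True

# Index the list first, then the tuple, to reach the text itself.
first_text = annot1[0][1]

# Collect the text of every tuple in the list.
texts = [text for _id, text in annot1]
```

Indexing twice (`annot1[0][1]`) or unpacking each tuple in a loop both reach the string, after which `split` can be applied to it.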

Answer

You can do something like this with a list comprehension:

annot1=[(402L, u"[It's very seldom that you're blessed to find your equal]"), 
     (415L, u'[He very seldom has them in this show or his movies]')] 
print [a[1].strip('[]').encode('utf-8').split() for a in annot1]

Output:

[["It's", 'very', 'seldom', 'that', "you're", 'blessed', 'to', 'find', 'your', 'equal'], ['He', 'very', 'seldom', 'has', 'them', 'in', 'this', 'show', 'or', 'his', 'movies']]

You can compute the intersection of the strings at corresponding positions in annot1 and annot2 like this:

for x,y in zip(annot1,annot2): 
    print set(x[1].strip('[]').encode('utf-8').split()).intersection(y[1].strip('[]').encode('utf-8').split())
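
For readers on Python 3, the same idea can be sketched as follows. This is an assumption-laden illustration rather than the answerer's code: `words` and `common_words` are hypothetical helper names, and `annot2` is invented sample data, since the thread never shows it.

```python
def words(text):
    """Strip the surrounding brackets and split the text on whitespace."""
    return text.strip("[]").split()


def common_words(annot_a, annot_b):
    """Return one set of shared words per pair of corresponding tuples."""
    return [
        set(words(a[1])) & set(words(b[1]))
        for a, b in zip(annot_a, annot_b)
    ]


# Data from the question, as Python 3 literals.
annot1 = [
    (402, "[It's very seldom that you're blessed to find your equal]"),
    (415, "[He very seldom has them in this show or his movies]"),
]
# Hypothetical second list, made up for this example.
annot2 = [
    (1, "[It's very rare to find your equal]"),
    (2, "[She has them in her movies]"),
]

result = common_words(annot1, annot2)
# result[0] holds the words shared by the first tuples,
# result[1] the words shared by the second tuples.
```

Because `zip` stops at the shorter input, any unmatched trailing tuples in the longer list are silently ignored.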

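I need to apply 'intersection()' to the output (I have two outputs and I need to find the common words). How can I get it out of the list? 'intersection' doesn't work on a list. – norpa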

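I didn't get you. Do you want to compute the intersection between two lists or among multiple lists? I ask because annot1 seems to contain multiple tuples, judging by how you defined it in the question. – MYGz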

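I want to compute the intersection between the second elements (strings) of two lists of tuples. I have 'annot1' and 'annot2' (with tuples as shown above), and I need to compare the second element of tuple1 in annot1 with the second element of tuple1 in annot2; then the second element of tuple2 in annot1 with the second element of tuple2 in annot2... etc – norpa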

Answer

annot1 is a list of tuples. To get the string from each element, you can do something like this:

def scope_match(annot1):
    for pair in annot1:
        string = pair[1]
        print string  # or whatever you want to do

I get 'TypeError: 'long' object has no attribute '__getitem__''. Do you know what could be happening?... – norpa

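That sounds like you are trying to index a long as if it were a list. Are you sure 'annot1' is a list of tuples? It sounds like it has been overwritten with something else. – Iluvatar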
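
One way to check that suspicion is to validate the input before indexing it. The sketch below is only an illustration in Python 3 syntax (an assumption, since the thread uses Python 2); `second_elements` is a hypothetical helper name, not something from the thread.

```python
def second_elements(annot):
    """Return the second element of each tuple, failing early on bad input."""
    texts = []
    for position, pair in enumerate(annot):
        # A bare number here means the list does not hold tuples after all.
        if not isinstance(pair, tuple) or len(pair) < 2:
            raise TypeError(
                "item {} is {!r}, expected a tuple with at least two "
                "elements".format(position, pair)
            )
        texts.append(pair[1])
    return texts


# Well-formed input: a list of (id, text) tuples.
good = second_elements([(402, "[first text]"), (415, "[second text]")])

# Malformed input: a flat sequence where the first item is a number.
try:
    second_elements([402, "[first text]"])
    bad_message = None
except TypeError as exc:
    bad_message = str(exc)
```

If the check fails, the message names the offending item, which makes it easier to spot where `annot1` was rebound to a single tuple or a number.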
