pandas only writes the last row to the CSV file

Problem description:

I am taking URLs from a txt file and exporting the scraped results to a csv file. But after the whole process, my code only writes the information from the last URL. My guess is that I am missing a loop, but where? Here is my code:

import requests 
from bs4 import BeautifulSoup 
import pandas as pd 
from urllib import urlopen 

file = open('urls.txt', 'r') 
filelines = (line.strip() for line in file) 
for code in filelines: 
    site = urlopen(code) 
    soup = BeautifulSoup(site, "html.parser") 
    final = soup.find_all("span", {"class": "bd js-title-main-info"}) 
    print final 

records = [] 
for pagetxt in final: 
    print pagetxt.text 
    records.append((pagetxt.text)) 
df = pd.DataFrame(records, columns=['product name']) 
df.to_csv('test.csv', index=False, encoding='utf-8') 

Thanks

As you loop over the URLs from the file, the variable final is overwritten on every iteration, so by the time you build the DataFrame it only holds the results from the last URL. Append the data earlier, inside the loop (I have marked the changes with ######):

import requests 
from bs4 import BeautifulSoup 
import pandas as pd 
from urllib import urlopen 

file = open('urls.txt', 'r') 
filelines = (line.strip() for line in file) 
records = []       ###### 
for code in filelines: 
    site = urlopen(code) 
    soup = BeautifulSoup(site, "html.parser") 
    final = soup.find_all("span", {"class": "bd js-title-main-info"}) 
    print final 

    for pagetxt in final:              ######
        print pagetxt.text             ######
        records.append(pagetxt.text)   ######

df = pd.DataFrame(records, columns=['product name']) 
df.to_csv('test.csv', index=False, encoding='utf-8') 
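
As a side note, the print statements and `from urllib import urlopen` in the question are Python 2. A minimal sketch of the same fix for Python 3 (assuming the same urls.txt input, CSS class, and output file name as in the question) would look like this:

import pandas as pd
from bs4 import BeautifulSoup
from urllib.request import urlopen   # urlopen moved to urllib.request in Python 3

records = []
with open('urls.txt', 'r') as f:      # the with-block closes the file automatically
    for code in (line.strip() for line in f):
        soup = BeautifulSoup(urlopen(code), "html.parser")
        # collect every matching span from this page before moving to the next URL
        for pagetxt in soup.find_all("span", {"class": "bd js-title-main-info"}):
            records.append(pagetxt.text)

df = pd.DataFrame(records, columns=['product name'])
df.to_csv('test.csv', index=False, encoding='utf-8')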

Yes, it works! Thanks! – Jodmoreira