Converting Excel to CSV with Python

Problem description:

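There seem to be many posts on this question, and my solution appears to match the most common answer, but I am running into an encoding error that I don't know how to address.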

>>> def Excel2CSV(ExcelFile, SheetName, CSVFile): 
    import xlrd 
    import csv 
    workbook = xlrd.open_workbook(ExcelFile) 
    worksheet = workbook.sheet_by_name(SheetName) 
    csvfile = open(CSVFile, 'wb') 
    wr = csv.writer(csvfile, quoting=csv.QUOTE_ALL) 

    for rownum in xrange(worksheet.nrows): 
     wr.writerow(worksheet.row_values(rownum)) 

    csvfile.close() 

>>> Excel2CSV(r"C:\Temp\Store List.xls", "Open_Locations", 
       r"C:\Temp\StoreList.csv") 

Traceback (most recent call last): 
File "<pyshell#2>", line 1, in <module> 
Excel2CSV(r"C:\Temp\Store List.xls", "Open_Locations", r"C:\Temp\StoreList.csv") 
File "<pyshell#1>", line 10, in Excel2CSV 
wr.writerow(worksheet.row_values(rownum)) 
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 14: 
ordinal not in range(128) 
>>> 

Any help or insight would be greatly appreciated.

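As @davidism pointed out, the Python 2 csv module does not handle unicode. A workaround is to convert every unicode object to a str object (encoded as UTF-8) before passing it to csv: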

def Excel2CSV(ExcelFile, SheetName, CSVFile): 
    import xlrd 
    import csv 
    workbook = xlrd.open_workbook(ExcelFile) 
    worksheet = workbook.sheet_by_name(SheetName) 
    csvfile = open(CSVFile, 'wb') 
    wr = csv.writer(csvfile, quoting=csv.QUOTE_ALL) 

    for rownum in xrange(worksheet.nrows): 
        # Encode unicode cells as UTF-8 byte strings; leave other values (e.g. floats) as-is.
        wr.writerow([x.encode('utf-8') if isinstance(x, unicode) else x
                     for x in worksheet.row_values(rownum)])

    csvfile.close() 
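
The per-cell encoding can also be pulled out into a small helper, which makes it easier to check on its own. This is a minimal sketch and not part of the original answer: the name encode_row and the sample row are made up for illustration, and type(u'') is used instead of unicode so that the snippet also runs on Python 3 (where the encoded cells come back as bytes).

```python
# Text type: `unicode` on Python 2, `str` on Python 3.
text_type = type(u'')


def encode_row(row, encoding='utf-8'):
    """Return a copy of row with text cells encoded to bytes.

    Non-text cells (floats, ints, ...) are passed through unchanged.
    """
    return [x.encode(encoding) if isinstance(x, text_type) else x
            for x in row]


# Hypothetical row, shaped like the output of worksheet.row_values(rownum).
row = [u'Store\xa0A', 42.0]

# The ASCII codec cannot represent u'\xa0' (a non-breaking space),
# which is what triggers the traceback in the question.
try:
    row[0].encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding to UTF-8 succeeds; the float is left untouched.
encoded = encode_row(row)
```

On Python 2 the result of encode_row(...) can be passed straight to wr.writerow(...).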

The Python 2 csv module has problems with unicode data. You can either encode everything to UTF-8 before writing, or use the unicodecsv module to do it for you.

First, pip install unicodecsv. Then, instead of import csv, just import unicodecsv as csv. The API is the same (plus encoding options), so no other changes are needed.
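
If moving to Python 3 is an option, neither workaround should be needed, because the csv module there writes text directly. The following is a minimal sketch under that assumption: write_rows is a hypothetical helper, and the in-memory rows stand in for the values returned by worksheet.row_values(rownum). The file is opened in text mode with an explicit encoding and newline='', as the csv documentation recommends.

```python
import csv
import os
import tempfile


def write_rows(rows, csv_path, encoding='utf-8'):
    """Write an iterable of rows to csv_path, quoting every field (Python 3)."""
    # newline='' lets the csv module control line endings itself.
    with open(csv_path, 'w', newline='', encoding=encoding) as f:
        writer = csv.writer(f, quoting=csv.QUOTE_ALL)
        writer.writerows(rows)


# Hypothetical data containing the non-breaking space from the traceback.
rows = [['Name', 'Count'], ['Store\xa0A', 42.0]]
path = os.path.join(tempfile.mkdtemp(), 'StoreList.csv')
write_rows(rows, path)

# Read the file back to confirm the text survived the round trip.
with open(path, newline='', encoding='utf-8') as f:
    round_trip = list(csv.reader(f))
```

Note that csv stores everything as text, so the float comes back as the string '42.0' when the file is read again.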

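Another way to do this: cast each cell to a string and encode it as UTF-8. Do this per cell rather than on the whole row, because str() on a list returns the repr of that list, and writerow would then write one character per field.

[unicode(x).encode('utf-8') for x in worksheet.row_values(rownum)] 

The whole function:

def Excel2CSV(ExcelFile, SheetName, CSVFile): 
    import xlrd 
    import csv 
    workbook = xlrd.open_workbook(ExcelFile) 
    worksheet = workbook.sheet_by_name(SheetName) 
    csvfile = open(CSVFile, 'wb') 
    wr = csv.writer(csvfile, quoting=csv.QUOTE_ALL) 

    for rownum in xrange(worksheet.nrows): 
        # Convert every cell to unicode text, then encode it as UTF-8.
        wr.writerow([unicode(x).encode('utf-8')
                     for x in worksheet.row_values(rownum)]) 

    csvfile.close()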