Ruby CSV error: invalid byte sequence in UTF-8

Problem description:

[[email protected] RcTools]$ irb 
1.9.3p0 :001 > require 'csv' 
=> true 
1.9.3p0 :002 > master = CSV.read("./public/jobs/in/Appexchange_Applications_Companies_487.csv") 
ArgumentError: invalid byte sequence in UTF-8 
     from /home/sagar/.rvm/rubies/ruby-1.9.3-p0/lib/ruby/1.9.1/csv.rb:1855:in `sub!' 
     from /home/sagar/.rvm/rubies/ruby-1.9.3-p0/lib/ruby/1.9.1/csv.rb:1855:in `block in shift' 
     from /home/sagar/.rvm/rubies/ruby-1.9.3-p0/lib/ruby/1.9.1/csv.rb:1849:in `loop' 
     from /home/sagar/.rvm/rubies/ruby-1.9.3-p0/lib/ruby/1.9.1/csv.rb:1849:in `shift' 
     from /home/sagar/.rvm/rubies/ruby-1.9.3-p0/lib/ruby/1.9.1/csv.rb:1791:in `each' 
     from /home/sagar/.rvm/rubies/ruby-1.9.3-p0/lib/ruby/1.9.1/csv.rb:1805:in `to_a' 
     from /home/sagar/.rvm/rubies/ruby-1.9.3-p0/lib/ruby/1.9.1/csv.rb:1805:in `read' 
     from /home/sagar/.rvm/rubies/ruby-1.9.3-p0/lib/ruby/1.9.1/csv.rb:1411:in `block in read' 
     from /home/sagar/.rvm/rubies/ruby-1.9.3-p0/lib/ruby/1.9.1/csv.rb:1354:in `open' 
     from /home/sagar/.rvm/rubies/ruby-1.9.3-p0/lib/ruby/1.9.1/csv.rb:1411:in `read' 
     from (irb):2 
     from /home/sagar/.rvm/rubies/ruby-1.9.3-p0/bin/irb:16:in `<main>' 
1.9.3p0 :003 > 

But when I do this instead:

1.9.3p0 :003 > master = CSV.open("./public/jobs/in/Appexchange_Applications_Companies_487.csv","r") 
=> <#CSV io_type:File io_path:"./public/jobs/in/Appexchange_Applications_Companies_487.csv" encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\r\n" quote_char:"\""> 
1.9.3p0 :004 > 

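I just want to know why this happens and what the solution is. I want to read the CSV with CSV.read because it returns the rows as an array. So if I read the file the first way, like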

master = CSV.read("./public/jobs/in/Appexchange_Applications_Companies_487.csv") 

then reading the file returns an Array:

1.9.3p0 :008 > master.class 
=> Array 

But in the second case the class is CSV. How can I get the first way (CSV.read) to work?

Is the CSV file in UTF-8? If not, you can specify the encoding by passing it as the second argument to CSV.read. – jlundqvist 2012-08-14 12:53:37

Ruby's iconv can handle encoding problems. Please take a look at http://*.com/questions/1793284/uploaded-file-char-set-conversion-with-ruby – 2012-08-14 13:56:33

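About the error: first, make sure you are using the correct character encoding. If you are, then your CSV file probably contains invalid data. You can fix that with iconv (see the link posted by Chetan Muneshwar).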
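As a rough illustration of the encoding advice, here is a minimal sketch. It assumes the file is actually ISO-8859-1, which the question does not confirm; the sample file and its contents are hypothetical. Note that Iconv was removed from Ruby's standard library in 2.0, so on newer Rubies the built-in encoding support shown here is the usual route, and String#scrub requires Ruby 2.1 or newer.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical sample: "café" stored as ISO-8859-1, so byte 0xE9 is invalid UTF-8.
sample = Tempfile.new(['latin1_sample', '.csv'])
sample.binmode
sample.write("name,city\r\ncaf\xE9,Paris\r\n".b)
sample.close

# Option 1: the real encoding is known (assumed ISO-8859-1 here).
# "ISO-8859-1:UTF-8" means: read as ISO-8859-1, transcode fields to UTF-8.
rows = CSV.read(sample.path, encoding: 'ISO-8859-1:UTF-8')

# Option 2: the real encoding is unknown. Read the raw bytes, replace every
# invalid UTF-8 sequence with "?", then parse the cleaned string.
raw = File.binread(sample.path).force_encoding(Encoding::UTF_8)
scrubbed_rows = CSV.parse(raw.scrub('?'))
```

Option 1 preserves the original characters, so it is preferable whenever the source encoding can be determined. Option 2 loses the offending bytes and is only a fallback.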

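About the second part of your question: CSV.open only opens the file for reading; it does not read anything yet, which is why it does not raise the error and why it returns a CSV object. CSV.read opens the file, reads its entire contents, and closes it again. So simply use CSV.read to get the data from the file.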
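The difference between the two calls can be sketched as follows. The sample file is hypothetical and stands in for the CSV from the question.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical sample file standing in for the CSV from the question.
sample = Tempfile.new(['open_vs_read', '.csv'])
sample.write("id,name\n1,Alice\n2,Bob\n")
sample.close

# CSV.read opens the file, parses every row, closes it, and returns an Array.
rows_from_read = CSV.read(sample.path)

# CSV.open returns a CSV reader object; nothing is parsed until you ask for it.
# The block form closes the file afterwards and returns the block's value.
rows_from_open = CSV.open(sample.path, 'r') { |csv| csv.read }

# CSV.foreach yields one row at a time, so the whole file is never held at once.
streamed = []
CSV.foreach(sample.path) { |row| streamed << row }
```

All three approaches parse the same bytes, so a file with an invalid byte sequence fails with each of them as soon as the bad row is actually read.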