SSL: CERTIFICATE_VERIFY_FAILED error on Windows

Problem description:

I am building a simple program that goes through a list of URLs and extracts their content with Beautiful Soup. For the moment I am just trying to iterate through the list and retrieve the HTML, but I keep getting the following error:

Traceback (most recent call last): 
    File "C:\ProgramData\Anaconda3\lib\urllib\request.py", line 1318, in do_open 
    encode_chunked=req.has_header('Transfer-encoding')) 
    File "C:\ProgramData\Anaconda3\lib\http\client.py", line 1239, in request 
    self._send_request(method, url, body, headers, encode_chunked) 
    File "C:\ProgramData\Anaconda3\lib\http\client.py", line 1285, in _send_request 
    self.endheaders(body, encode_chunked=encode_chunked) 
    File "C:\ProgramData\Anaconda3\lib\http\client.py", line 1234, in endheaders 
    self._send_output(message_body, encode_chunked=encode_chunked) 
    File "C:\ProgramData\Anaconda3\lib\http\client.py", line 1026, in _send_output 
    self.send(msg) 
    File "C:\ProgramData\Anaconda3\lib\http\client.py", line 964, in send 
    self.connect() 
    File "C:\ProgramData\Anaconda3\lib\http\client.py", line 1400, in connect 
    server_hostname=server_hostname) 
    File "C:\ProgramData\Anaconda3\lib\ssl.py", line 401, in wrap_socket 
    _context=self, _session=session) 
    File "C:\ProgramData\Anaconda3\lib\ssl.py", line 808, in __init__ 
    self.do_handshake() 
    File "C:\ProgramData\Anaconda3\lib\ssl.py", line 1061, in do_handshake 
    self._sslobj.do_handshake() 
    File "C:\ProgramData\Anaconda3\lib\ssl.py", line 683, in do_handshake 
    self._sslobj.do_handshake() 
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:749) 

During handling of the above exception, another exception occurred: 

Traceback (most recent call last): 
    File "C:/Users/thoma/PycharmProjects/fyp/urls_and_prep/parsing_html.py", line 17, in <module> 
    response = urllib.request.urlopen(req) 
    File "C:\ProgramData\Anaconda3\lib\urllib\request.py", line 223, in urlopen 
    return opener.open(url, data, timeout) 
    File "C:\ProgramData\Anaconda3\lib\urllib\request.py", line 526, in open 
    response = self._open(req, data) 
    File "C:\ProgramData\Anaconda3\lib\urllib\request.py", line 544, in _open 
    '_open', req) 
    File "C:\ProgramData\Anaconda3\lib\urllib\request.py", line 504, in _call_chain 
    result = func(*args) 
    File "C:\ProgramData\Anaconda3\lib\urllib\request.py", line 1361, in https_open 
    context=self._context, check_hostname=self._check_hostname) 
    File "C:\ProgramData\Anaconda3\lib\urllib\request.py", line 1320, in do_open 
    raise URLError(err) 
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:749)> 

My program is simple, but I don't understand, and can't find any good resource explaining, what exactly is going on or how to handle it. I know it has something to do with SSL certificates, but I'm not sure how to use them, where they are installed, and so on. I'm just a bit lost at this point because I have never really worked with SSL. Any guidance or help is much appreciated. Code below:

import urllib.request
from bs4 import BeautifulSoup

file = open("all_urls.txt", "r")

for line in file:
    print(line)

    try:
        response = urllib.request.urlopen(line)
        html = response.read()
    except ValueError:
        print(ValueError)
        continue
    soup = BeautifulSoup(html, 'lxml')
    print(soup.get_text())

There are [many questions about this topic](https://stackoverflow.com/search?q=is%3Aquestion合作+蟒蛇+证书+失败). If those don't help and you want help with your specific problem, please provide enough details to reproduce it. In particular, that means the URL for which the code fails.
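To pin down which URL actually triggers the failure, one approach is to catch `urllib.error.URLError` (which wraps the underlying `SSLError`) and log the offending URL instead of only catching `ValueError`. A minimal sketch; the helper name `try_fetch` is just an illustration:

```python
import urllib.request
import urllib.error

def try_fetch(url):
    """Fetch a URL, printing the URL and the failure reason on error."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.read()
    except urllib.error.URLError as err:
        # err.reason carries the wrapped SSLError (or DNS/connection error)
        print("failed:", url, "reason:", err.reason)
        return None
```

Running the URL list through this makes it easy to spot whether every URL fails (a broken certificate store) or only some do (specific sites with bad certificates).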

Are you using Windows or Linux? The problem doesn't seem to be in Python itself but in Anaconda or the operating system. You could try some simple workarounds, such as: 1 - run the script with a Python installation other than Anaconda's; 2 - use virtualenv to isolate it from the operating system's components.
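The second suggestion can be sketched as follows, assuming a working non-Anaconda Python on the PATH (the environment name `scrape-env` is just an example):

```shell
# Create an isolated environment so Anaconda's Python and its
# certificate setup are not involved.
python -m venv scrape-env

# Activate it (Windows cmd shown; on Linux use: source scrape-env/bin/activate).
scrape-env\Scripts\activate

# Install the script's dependencies inside the clean environment.
pip install beautifulsoup4 lxml
```

If the script works inside this environment, the problem is in the Anaconda installation's certificate store rather than in the code.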


I'm using Windows with Anaconda, but I think I installed Python and some libraries before installing Anaconda. Do you think reinstalling Python/Anaconda would help? Thanks for the reply!


A standalone Python installation and Anaconda's Python live in different places. Try passing the full path to Python when running the script, for example: `C:\Program Files\Python34\python xxxxxxxx.py`

The following will fix the problem, but be sure not to use it in production, because it does not verify SSL certificates:

import urllib.request
from bs4 import BeautifulSoup
import ssl

# This is a temporary fix -- it disables certificate verification,
# so be careful of malicious links.
context = ssl._create_unverified_context()
file = open("all_urls.txt", "r")

for line in file:
    print(line)

    try:
        response = urllib.request.urlopen(line, context=context)
        html = response.read()
    except ValueError:
        print(ValueError)
        continue
    soup = BeautifulSoup(html, 'lxml')
    print(soup.get_text())

OK, that's great. I know none of the links in my list are malicious, so this should work. I'll be using this code in a crawler later, and in that case I won't know which links I'll be checking; what would you recommend then? Thanks a lot for the reply, really appreciate it.
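For a crawler that will hit unknown links, a safer option than disabling verification is to keep verification on and point Python at an up-to-date CA bundle. A minimal sketch, assuming the third-party `certifi` package is installed (`pip install certifi`); the `fetch` helper is illustrative:

```python
import ssl
import urllib.request

# Build a context that still verifies certificates. On some Windows/Anaconda
# setups the default CA store is missing, so fall back to certifi's bundle
# when that package is available; any current CA bundle file would also work.
try:
    import certifi
    context = ssl.create_default_context(cafile=certifi.where())
except ImportError:
    context = ssl.create_default_context()

def fetch(url):
    """Fetch a URL with certificate verification enabled."""
    with urllib.request.urlopen(url, context=context) as response:
        return response.read()
```

With this context, sites whose certificates fail verification raise an error instead of being silently trusted, which is what you want when crawling links you don't control.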