HTTP error when using urllib.request

Problem description:

I am trying to write a profanity checker. The code I have written so far is:

import urllib.request 

def read_text(): 
    file = open (r"C:\Users\Kashif\Downloads\abc.txt") 
    file_print = file.read() 
    print (file_print) 
    file.close() 
    check_profanity (file_print) 

def check_profanity (file_print): 
    connection = urllib.request.urlopen ("http://www.purgomalum.com/service/containsprofanity?text="+file_print) 
    output = connection.read() 
    print ("The Output is "+output) 
    connection.close() 
    read_text()

but I get the following

urllib.error.HTTPError: HTTP Error 400: Bad Request

error, and I don't know what I did wrong.

I am using Python 3.6.1.

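You should include the error stack trace. – Craicerjack

Please also include the contents of abc.txt. There may be a problem with the file itself. –

Answer

The HTTP error you are getting is usually a sign that something is wrong with the way you are requesting data from the server. According to the HTTP spec: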

400 Bad Request

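The request could not be understood by the server due to malformed syntax. The client SHOULD NOT repeat the request without modifications.

In your concrete example, the problem appears to be that the data you append to the URL is not URL-encoded. Try the quote_plus function from the urllib.parse module so that your request is accepted: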

from urllib.parse import quote_plus 

... 

encoded_file_print = quote_plus(file_print) 
url = "http://www.purgomalum.com/service/containsprofanity?text=" + encoded_file_print 
connection = urllib.request.urlopen(url)
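
To make the effect of the encoding step visible, here is a small sketch using a made-up sample string (an assumption for illustration, not the contents of abc.txt). Raw spaces, newlines and characters such as & are not valid inside a query string, which is a likely reason for the server to answer with 400:

```python
from urllib.parse import quote_plus

# Hypothetical sample text; the real input would come from the file.
sample = "hello world\nfish & chips"

# quote_plus turns spaces into "+" and percent-encodes other unsafe characters.
encoded = quote_plus(sample)
print(encoded)  # hello+world%0Afish+%26+chips

base = "http://www.purgomalum.com/service/containsprofanity?text="
url = base + encoded
print(url)
```

After encoding, the URL no longer contains whitespace or a bare &, so the whole text travels as a single value of the text parameter.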

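If that does not work, the problem may be the contents of your file. First try a simple example to verify that your script works, then try it with the file's contents.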
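
One way to run such a simple check is to catch the HTTPError and look at its status code instead of letting the script crash. The following is only a sketch: the try_request name and its opener parameter are hypothetical helpers introduced here, with opener existing solely so the function can be exercised without network access.

```python
import urllib.error
import urllib.request
from urllib.parse import quote_plus

BASE_URL = "http://www.purgomalum.com/service/containsprofanity?text="


def try_request(text, opener=urllib.request.urlopen):
    """Send text to the service and return (status, body_or_reason).

    `opener` is a hypothetical hook for testing; by default it is
    urllib.request.urlopen.
    """
    url = BASE_URL + quote_plus(text)
    try:
        with opener(url) as connection:
            # The body arrives as bytes, so decode it to get a str.
            return connection.status, connection.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # err.code is the HTTP status (for example 400),
        # err.reason is the accompanying phrase.
        return err.code, str(err.reason)
```

Calling try_request("hello") first, and only then try_request(file_print), would show whether the failure depends on the file's contents.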

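Apart from the above, there are a few other issues with your code:

No space is needed between a function name and its parentheses: file.close() or def read_text():, and so on.
Decode the content after reading it, to convert the bytes into a string: output = connection.read().decode('utf-8')
The way your functions call each other creates a circular dependency: read_text calls check_profanity, which at the end calls read_text, which calls check_profanity again, and so on. Remove the extra function calls and simply use return to hand back each function's output: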
```
content = read_text() 
has_profanity = check_profanity(content) 
print("has profanity? %s" % has_profanity) 
```
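
Putting the three points together, the two functions could be restructured roughly as follows. This is a sketch under assumptions, not the only way to do it: the path parameter, the parse_response helper, the utf-8 file encoding, and the idea that the service answers with the plain text "true" or "false" are all introduced here for illustration.

```python
import urllib.request
from urllib.parse import quote_plus

SERVICE_URL = "http://www.purgomalum.com/service/containsprofanity?text="


def read_text(path):
    """Return the file's contents instead of calling the checker directly."""
    # Assumption: the file is UTF-8 encoded.
    with open(path, encoding="utf-8") as file:
        return file.read()


def parse_response(raw):
    """Decode the raw bytes and interpret them as a boolean.

    Assumption: the service replies with the plain text "true" or "false".
    """
    return raw.decode("utf-8").strip().lower() == "true"


def check_profanity(text):
    """Query the service and return the result; it never calls read_text."""
    with urllib.request.urlopen(SERVICE_URL + quote_plus(text)) as connection:
        return parse_response(connection.read())
```

With these definitions, the calling code passes the file path to read_text and hands the returned text to check_profanity, so neither function calls the other.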

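Thank you for your help, it works perfectly. I just have one more question. I have seen this same program work perfectly in Python 2.7. Are the changes you mentioned needed because of the version change? I am a complete beginner, so I don't understand much yet –

Hmm, I don't think so. The change I made with 'quote_plus' is necessary for the request to be sent to the server correctly. I don't see how it would work in Python 2.7. If you have that code, send it and I can give it a try –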
