Error message when submitting a HIT to Amazon Mechanical Turk

Problem description:

I am having trouble submitting a HIT to the Amazon Mechanical Turk sandbox.

I submit the HIT with the following code:

external_content = """" 
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd"> 
    <ExternalURL>https://MY_HOST_GOES_HERE/</ExternalURL> 
    <FrameHeight>400</FrameHeight> 
</ExternalQuestion> 
""" 

import boto3 

import os 

region_name = 'us-east-1' 

aws_access_key_id = 'MYKEY' 
aws_secret_access_key = 'MYSECRETKEY' 

endpoint_url = 'https://mturk-requester-sandbox.us-east-1.amazonaws.com' 

# Uncomment this line to use in production 
# endpoint_url = 'https://mturk-requester.us-east-1.amazonaws.com' 

client = boto3.client('mturk', 
         endpoint_url=endpoint_url, 
         region_name=region_name, 
         aws_access_key_id=aws_access_key_id, 
         aws_secret_access_key=aws_secret_access_key, 
        ) 

# This will return $10,000.00 in the MTurk Developer Sandbox 
print(client.get_account_balance()['AvailableBalance']) 


response = client.create_hit(Question=external_content, 
          LifetimeInSeconds=60 * 60 * 24, 
          Title="Answer a simple question", 
          Description="Help research a topic", 
          Keywords="question, answer, research", 
          AssignmentDurationInSeconds=120, 
          Reward='0.05') 

# The response included several helpful fields 
hit_group_id = response['HIT']['HITGroupId'] 
hit_id = response['HIT']['HITId'] 

# Let's construct a URL to access the HIT 
sb_path = "https://workersandbox.mturk.com/mturk/preview?groupId={}" 
hit_url = sb_path.format(hit_group_id) 

print(hit_url) 

The error message I get is:

botocore.exceptions.ClientError: An error occurred (ParameterValidationError) when calling the CreateHIT operation: There was an error parsing the XML question or answer data in your request. Please make sure the data is well-formed and validates against the appropriate schema. Details: Content is not allowed in prolog. (1493572622889 s) 

What could be causing this? The XML fully conforms to the XML schema hosted on Amazon's servers.

The HTML returned by the external host is:

<!DOCTYPE html> 
<head> 
<meta http-equiv='Content-Type' content='text/html; charset=UTF-8'/> 
<script src='https://s3.amazonaws.com/mturk-public/externalHIT_v1.js' type='text/javascript'></script> 
</head> 
<body> 
<!-- HTML to handle creating the HIT form --> 
<form name='mturk_form' method='post' id='mturk_form' action='https://workersandbox.mturk.com/mturk/externalSubmit'> 
<input type='hidden' value='' name='assignmentId' id='assignmentId'/> 
<!-- This is where you define your question(s) --> 
<h1>Please name the company that created the iPhone</h1> 
<p><textarea name='answer' rows=3 cols=80></textarea></p> 
<!-- HTML to handle submitting the HIT --> 
<p><input type='submit' id='submitButton' value='Submit' /></p></form> 
<script language='Javascript'>turkSetAssignmentID();</script> 
</body> 
</html> 

Thanks

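The message "Details: Content is not allowed in prolog." is the clue. It means there is content somewhere it is not expected, in this case before the XML document's root element. This usually happens when a junk character (think smart quotes or a non-printable character) ends up in there. These can be a real pain to diagnose.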
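One way to hunt for such characters is to look at whatever sits in front of the first `<` in the string. The helper below is only a sketch: `leading_junk` is a hypothetical name, not part of boto3 or MTurk, and it assumes the XML has no legitimate non-whitespace text before its first tag.

```python
def leading_junk(xml_text):
    """Return the non-whitespace text that precedes the first '<', if any."""
    idx = xml_text.find("<")
    # If there is no '<' at all, the whole string counts as prefix.
    prefix = xml_text if idx == -1 else xml_text[:idx]
    return prefix.strip()


# Hypothetical sample: a stray double quote before the root element.
sample = '" \n<ExternalQuestion/>'
print(repr(leading_junk(sample)))
```

Printing the result with `repr` makes invisible characters show up as escape sequences, which is usually enough to spot the culprit.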

In your case, debugging is a bit easier, though still frustrating. Look at this line:

external_content = """" 

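It turns out Python needs only three quotes (""") to recognize a multi-line string definition, so your fourth " was actually being rendered as part of the XML. Change that line to: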

external_content = """ 

And you're golden. I just tested it, and it works. Sorry for all the frustration, but hopefully this unblocks you. Happy Sunday!
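If it helps, this kind of mistake can be caught locally before calling `create_hit` by parsing the question string with Python's standard-library XML parser. The following is a minimal sketch: `is_well_formed` is a hypothetical helper name, and it checks well-formedness only, not conformance to the MTurk schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# The corrected string: exactly three quotes open the literal.
fixed = """
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
    <ExternalURL>https://MY_HOST_GOES_HERE/</ExternalURL>
    <FrameHeight>400</FrameHeight>
</ExternalQuestion>
"""

# Equivalent to opening the literal with four quotes: a stray " comes first.
broken = '"' + fixed

print(is_well_formed(fixed))
print(is_well_formed(broken))
```

The stray quote ahead of the root element is what makes the second string fail to parse.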


Hahaha, you saved my day! Thanks a lot!