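How can I JSON-serialize an object from Google's Natural Language API? (No __dict__ attribute)

Question:

I am using the Google Natural Language API for a project that tags text and runs sentiment analysis on it. I want to store my NL results as JSON. If you make a direct HTTP request to Google, a JSON response is returned.

However, when using the provided Python library, an object is returned instead, and that object is not directly JSON serializable.

Here is a sample of my code: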

import os
import sys
import oauth2client.client
from google.cloud.gapic.language.v1beta2 import enums, language_service_client
from google.cloud.proto.language.v1beta2 import language_service_pb2


class LanguageReader:
    # Parses, stores and reports language data from text.

    def __init__(self, content=None):
        try:
            # Attempt to authenticate with the credentials named in the env variable.
            oauth2client.client.GoogleCredentials.get_application_default()
        except oauth2client.client.ApplicationDefaultCredentialsError:
            print("=== ERROR: Google credentials could not be authenticated! ===")
            # Use .get() so a missing variable prints None instead of raising KeyError.
            print("Current environment variable for this process is: {}".format(
                os.environ.get('GOOGLE_APPLICATION_CREDENTIALS')))
            print("Run:")
            print(" $ export GOOGLE_APPLICATION_CREDENTIALS=/YOUR_PATH_HERE/YOUR_JSON_KEY_HERE.json")
            print("to set the authentication credentials manually")
            sys.exit(1)

        self.language_client = language_service_client.LanguageServiceClient()
        self.document = language_service_pb2.Document()
        self.document.type = enums.Document.Type.PLAIN_TEXT
        self.encoding = enums.EncodingType.UTF32

        self.results = None

        if content is not None:
            self.read_content(content)

    def read_content(self, content):
        self.document.content = content
        # Call the API once and keep the response.
        self.results = self.language_client.analyze_sentiment(self.document, self.encoding)

Now, if you run:

sample_text = "I love R&B music. Marvin Gaye is the best. 'What's Going On' is one of my favorite songs. It was so sad when Marvin Gaye died."
resp = LanguageReader(sample_text).results
print(resp)

you get:

document_sentiment {
  magnitude: 2.40000009537
  score: 0.40000000596
}
language: "en"
sentences {
  text {
    content: "I love R&B music."
  }
  sentiment {
    magnitude: 0.800000011921
    score: 0.800000011921
  }
}
sentences {
  text {
    content: "Marvin Gaye is the best."
    begin_offset: 18
  }
  sentiment {
    magnitude: 0.800000011921
    score: 0.800000011921
  }
}
sentences {
  text {
    content: "\'What\'s Going On\' is one of my favorite songs."
    begin_offset: 43
  }
  sentiment {
    magnitude: 0.40000000596
    score: 0.40000000596
  }
}
sentences {
  text {
    content: "It was so sad when Marvin Gaye died."
    begin_offset: 90
  }
  sentiment {
    magnitude: 0.20000000298
    score: -0.20000000298
  }
}

This is not JSON. It is an instance of google.cloud.proto.language.v1beta2.language_service_pb2.AnalyzeSentimentResponse. It has no __dict__ attribute, so it cannot be serialized by handing the object (or its __dict__) to json.dumps().
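To make the failure concrete without needing API credentials, here is a minimal sketch. `SlotsOnly` is a hypothetical stand-in class that, like the response object, has no `__dict__`; it is not the real API type, and the numbers are just sample values.

```python
import json


class SlotsOnly:
    """Hypothetical stand-in for an object without __dict__ (not the real API type)."""
    __slots__ = ("score", "magnitude")

    def __init__(self, score, magnitude):
        self.score = score
        self.magnitude = magnitude


obj = SlotsOnly(0.4, 2.4)

# Instances of a class that defines __slots__ carry no per-instance __dict__.
has_dict = hasattr(obj, "__dict__")

# json.dumps() only knows built-in types, so it raises TypeError here.
try:
    json.dumps(obj)
    dumps_error = None
except TypeError as exc:
    dumps_error = exc

# Manual fallback: build a plain dict from the attribute names we know about.
manual = json.dumps({name: getattr(obj, name) for name in obj.__slots__}, sort_keys=True)
print(has_dict, type(dumps_error).__name__, manual)
```

The manual fallback works for a flat object like this one, but it would have to be written by hand for every nested field of a real response, which is why a library-provided converter is preferable.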

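How can I either specify that the response should be JSON, or serialize the object to JSON?

Edit: @Zach pointed me to Google's protobuf data interchange format. It looks like the best option is to use these protobuf.json_format methods: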

from google.protobuf.json_format import MessageToDict, MessageToJson 

self.dict = MessageToDict(self.results) 
self.json = MessageToJson(self.results) 
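As a rough, self-contained illustration of those two calls, the sketch below uses `FieldDescriptorProto`, a message type that ships with the protobuf package, as a stand-in for the `AnalyzeSentimentResponse` (which needs credentials to obtain). The field values are made up for the example; the assumption is that any protobuf message behaves the same way here.

```python
import json

from google.protobuf import descriptor_pb2
from google.protobuf.json_format import MessageToDict, MessageToJson

# Stand-in message with made-up values; a real API response would go here instead.
message = descriptor_pb2.FieldDescriptorProto(name="score", type_name=".demo.Sentiment")

as_dict = MessageToDict(message)   # plain Python dict
as_json = MessageToJson(message)   # JSON-formatted string

# The dict contains only built-in types, so json.dumps() accepts it.
round_trip = json.loads(json.dumps(as_dict))

print(as_dict)
print(as_json)
```

MessageToDict is handy when you want to post-process the result in Python before storing it, while MessageToJson gives you a string you can write to a file directly.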

From the docstring:

MessageToJson(message, including_default_value_fields=False, preserving_proto_field_name=False)
    Converts protobuf message to JSON format.

    Args:
      message: The protocol buffers message instance to serialize.
      including_default_value_fields: If True, singular primitive fields,
          repeated fields, and map fields will always be serialized. If
          False, only serialize non-empty fields. Singular message fields
          and oneof fields are not affected by this option.
      preserving_proto_field_name: If True, use the original proto field
          names as defined in the .proto file. If False, convert the field
          names to lowerCamelCase.

    Returns:
      A string containing the JSON formatted protocol buffer message.
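The effect of preserving_proto_field_name is easiest to see side by side. This is a small sketch, again using the bundled `FieldDescriptorProto` message with made-up values as a stand-in for the API response. Only preserving_proto_field_name is exercised; including_default_value_fields is left alone because, as far as I know, newer protobuf releases expose that option under a different name.

```python
from google.protobuf import descriptor_pb2
from google.protobuf.json_format import MessageToDict

# Stand-in message; the .proto field names involved are `name` and `json_name`.
message = descriptor_pb2.FieldDescriptorProto(name="begin_offset", json_name="beginOffset")

# Default: field names are converted to lowerCamelCase.
camel = MessageToDict(message)

# Keep the names exactly as written in the .proto file.
snake = MessageToDict(message, preserving_proto_field_name=True)

print(camel)
print(snake)
```

Only the keys change; the values (including the string "begin_offset") are left untouched. Keeping the proto names may be the better fit if the stored JSON should match the field names shown in the printed response above.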

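Thanks for the reply. How common is it for an object not to have a __dict__ attribute like this? If I define a class myself and instantiate it, the instance has one by default. Is it because the Google Natural Language API is mainly implemented in another language?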


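It looks like a usability oversight, but that is not unusual if the API is in "beta" (?). Filing an issue and/or a PR would be good. Which package are you installing? – brennan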


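I think [this is the repo](https://github.com/GoogleCloudPlatform/google-cloud-python/blob/master/language/google/cloud/proto/language/v1beta2/language_service_pb2.py) where google.cloud.proto.language.v1beta2 is implemented. I think that is where the response is parsed and the object is created. I will post an issue on that GitHub repo.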