在SQL Server 2016中存储JSON格式的数据

In this article, I continue to review the exciting features available in SQL Server 2016. One such feature is the long awaited T-SQL support for JSON formatted data. In this article we take a look at how JSON support will impact data warehouse solutions.

在本文中,我将继续回顾SQL Server 2016中可用的激动人心的功能。这样的功能之一是期待已久的T-SQL对JSON格式数据的支持。 在本文中,我们将研究JSON支持如何影响数据仓库解决方案。

背景 (Background)

Since the advent of EXtensible Markup Language (XML) many modern web applications have focused on providing data that is both human-readable and machine-readable. From a relational database perspective, SQL Server kept up with these modern web applications by providing support for XML data in a form of an XML data type and several functions that could be used to parse, query and manipulate XML formatted data.

自从可扩展标记语言(XML)出现以来,许多现代的Web应用程序都集中于提供人类可读和机器可读的数据。 从关系数据库的角度来看,SQL Server通过以XML数据类型的形式提供对XML数据的支持以及一些可用于解析,查询和操作XML格式化数据的功能,从而与这些现代Web应用程序保持同步。

As a result of being supported in SQL Server, data warehouse solutions based off SQL Server were then able to source XML-based OLTP data into a data mart. To illustrate this point, let’s take a look at the XML representation of our fictitious Fruit Sales data shown in Figure 1.

由于受到SQL Server支持,因此基于SQL Server的数据仓库解决方案便能够将基于XML的OLTP数据提供给数据集市。 为了说明这一点,让我们看一下图1所示的虚拟Fruit Sales数据的XML表示。

在SQL Server 2016中存储JSON格式的数据

To process this data in data warehouse, we would first have to convert it into relational format of rows and columns using T-SQL XML built-in functions such as the nodes() function. Figure 2 shows a script that makes use of the nodes() function to convert the sample data shown in Figure 1.

要在数据仓库中处理此数据,我们首先必须使用T-SQL XML内置函数(例如, nodes()函数将其转换为行和列的关系格式。 图2显示了一个脚本,该脚本利用node()函数来转换图1所示的样本数据。

在SQL Server 2016中存储JSON格式的数据

The results of the above script are shown in Figure 3 in a recognisable format for data warehouse.

以上脚本的结果在图3中以可识别的格式显示在数据仓库中。

在SQL Server 2016中存储JSON格式的数据

Soon after XML became a dominant language for data interchange for many modern web applications, JavaScript Object Notation (JSON) was introduced as a lightweight data-interchange format that is more convenient for web applications to process than XML. Likewise most relational database vendors released newer versions of their database systems that included the support for JSON formatted data. Unfortunately, Microsoft SQL Server was not one of those vendors and up until SQL Server 2014, JSON data was not supported. Obviously this lack of support for JSON, created challenges for data warehouse environments that are based off SQL Server.

XML成为许多现代Web应用程序中用于数据交换的主要语言后不久,JavaScript对象表示法(JSON)被引入为一种轻量级的数据交换格式,比XML更便于Web应用程序处理。 同样,大多数关系数据库供应商也发布了其数据库系统的较新版本,其中包括对JSON格式数据的支持。 不幸的是,Microsoft SQL Server并不是这些供应商之一,直到SQL Server 2014才不支持JSON数据。 显然,这种对JSON的缺乏支持给基于SQL Server的数据仓库环境带来了挑战。

Although there were workarounds (i.e. using Json.Net) to addressing the lack of JSON support in SQL Server, there was always sense that these workarounds were inadequate, time-wasting, and were forcing data warehouse development teams to pick up a new skill (i.e. learn .Net). Fortunately, the release of SQL Server 2016 has ensured that development teams can throw away their JSON workarounds as JSON is supported in SQL Server 2016.

尽管存在一些解决方法(即使用Json.Net )来解决SQL Server中缺少JSON支持的问题,但始终感觉到这些解决方法不够充分,浪费时间,并迫使数据仓库开发团队掌握新技能(即学习.NET)。 幸运的是,SQL Server 2016的发布确保开发团队可以放弃其JSON解决方法,因为SQL Server 2016支持JSON。

将JSON数据解析到数据仓库中 (Parsing JSON Data into Data Warehouse)

Similarly to XML support in SQL Server, SQL Server supports of JSON can be classified into two ways:

与SQL Server中的XML支持类似,JSONSQL Server支持可以分为两种方式:

  1. Converting Relational dataset into JSON format

    将关系数据集转换为JSON格式

  2. Converting JSON dataset into relational format

    将JSON数据集转换为关系格式

However, for the purposes of this discussion we are focusing primarily on the second part – which is converting a JSON formatted data (retrieved from OLTP sources) into a relational format of rows and columns. To illustrate our discussion points we once again make use of the fictitious fruit sales dataset. This time around the fictitious dataset has been converted into a JSON format as shown in Figure 4.

但是,出于讨论的目的,我们主要关注第二部分-将JSON格式的数据(从OLTP来源检索)转换为行和列的关系格式。 为了说明我们的讨论要点,我们再次使用虚拟水果销售数据集。 这次围绕虚拟数据集的操作已转换为JSON格式, 如图4所示。

在SQL Server 2016中存储JSON格式的数据

ISJSON函数 (ISJSON function)

As part of supporting JSON formatted data in other relational databases such as MySQL and PostgreSQL 9.2, there is a separate JSON data type that has been introduced by these vendors. Amongst other things, JSON data type conducts validation checks to ensure that values being stored are indeed of valid JSON format.

作为在其他关系数据库(例如MySQLPostgreSQL 9.2)中支持JSON格式的数据的一部分,这些供应商已经引入了一种单独的JSON数据类型。 除其他外,JSON数据类型执行验证检查以确保所存储的值确实为有效JSON格式。

Unfortunately, SQL Server 2016 (and ORACLE 12c) do not have a special data type for storing JSON data instead a variable character (varchar/nvarchar) data type is used. Therefore, a recommended practice to dealing with JSON data in SQL Server 2016 is to firstly ensure that you are indeed dealing with a valid JSON data. The simplest way to do so is to use the ISJSON function. This is a built-in T-SQL function that returns 1 for a valid JSON dataset and 0 for invalids.

不幸的是,SQL Server 2016(和ORACLE 12c )没有用于存储JSON数据的特殊数据类型,而是使用了可变字符(varchar / nvarchar)数据类型。 因此,在SQL Server 2016中处理JSON数据的建议做法是,首先确保您确实在处理有效的JSON数据。 最简单的方法是使用ISJSON函数。 这是一个内置的T-SQL函数,对于有效的JSON数据集返回1,对于无效的返回0。

Figure 5 shows us the implementation of ISJSON function whereby we validate our fictitious sample dataset.

图5向我们展示了ISJSON函数的实现,从而验证了虚拟样本数据集。

在SQL Server 2016中存储JSON格式的数据


OPENJSON函数 (OPENJSON function)

Now that we have confirmed that we are working with a valid JSON dataset, the next step is to convert the data into a table format. Again, we have a built-in T-SQL function to do this in a form of OPENJSON. OPENJSON works similar to OPENXML in that it takes in an object and convert its data into rows and columns.

现在,我们已经确认我们正在使用有效的JSON数据集,下一步是将数据转换为表格格式。 同样,我们有一个内置的T-SQL函数以OPENJSON的形式执行此操作。 OPENJSON的工作方式与OPENXML类似,因为它接受一个对象并将其数据转换为行和列。

Figure 6 shows a complete T-SQL script for converting JSON object into rows and columns.

图6显示了一个完整的T-SQL脚本,用于将JSON对象转换为行和列。

在SQL Server 2016中存储JSON格式的数据

Once we execute the above script, we get relational output shown in Figure 7.

一旦执行了上面的脚本,我们将得到图7所示的关系输出。

在SQL Server 2016中存储JSON格式的数据

Now that we have our relational dataset, we can process this data into data warehouse.

现在我们有了关系数据集,我们可以将此数据处理到数据仓库中。

JSON_VALUE函数 (JSON_VALUE function)

Prior to concluding our discussion of JSON in SQL Server 2016, it is worth mentioning that in addition to OPENJSON, you have other functions such as JSON_VALUE that could be used to query JSON data. However this function returns a scalar value which means that unlike the multiple rows and columns returned using OPENJSON, JSON_VALUE returns a single value as shown in Figure 8.

在结束我们对SQL Server 2016中JSON的讨论之前,值得一提的是,除了OPENJSON之外,您还有其他功能(例如JSON_VALUE)可用于查询JSON数据。 但是,此函数返回标量值,这意味着与使用OPENJSON返回的多个行和列不同,JSON_VALUE返回单个值,如图8所示。

在SQL Server 2016中存储JSON格式的数据

If you the JSON object that you are querying doesn’t have multiple elements, than you don’t have to specify the row index (i.e. [0]) as shown in Figure 9.

如果您要查询的JSON对象没有多个元素,则不必指定行索引(即[0]), 如图9所示。

在SQL Server 2016中存储JSON格式的数据

结论 (Conclusion)

The long wait is finally over and with the release of SQL Server 2016, JSON is now supported. Similarly to XML, T-SQL support the conversion of JSON object to relational format as well the conversion of relational tables to a JSON object. This support is implemented via built-in T-SQL functions such as OPENJSON and JSON_VALUE. In spite of all the excitement with the support of JSON is SQL Server 2016, we still don’t have a JSON data type. The ISJSON function can then be used to validate JSON text.

漫长的等待终于结束了,随着SQL Server 2016的发布,现在支持JSON。 与XML类似,T-SQL支持将JSON对象转换为关系格式,以及将关系表转换为JSON对象。 通过内置的T-SQL函数(例如OPENJSON和JSON_VALUE)实现此支持。 尽管对JSON的支持令人兴奋,但SQL Server 2016仍然没有JSON数据类型。 然后可以使用ISJSON函数来验证JSON文本。

资料下载 (Downloads)

参考 (Reference)

翻译自: https://www.sqlshack.com/warehousing-json-formatted-data-in-sql-server-2016/