Commit eb7eed5

More examples using JSONPath
1 parent b79b0a9 commit eb7eed5

1 file changed

+54
-54
lines changed

JData_specification.md

Lines changed: 54 additions & 54 deletions
Original file line number | Diff line number | Diff line change
@@ -365,8 +365,8 @@ Below is a short summary of the JData data annotation/storage keywords that can
365365

366366
* **Data grouping**: `_DataGroup_`, `_Dataset_`, `_DataRecord_`
367367
* **N-D Array**: `_ArrayType_`, `_ArraySize_`, `_ArrayIsComplex_`, `_ArrayIsSparse_`,
368-
`_ArrayData_`,`_ArrayShape_`, `_ArrayOrder_`, `_ArrayZipType_`,`_ArrayZipSize_`,
369-
`_ArrayZipData_`, `_ArrayZipEndian_`, `_ArrayZipLevel_`, `_ArrayZipOptions_`
368+
`_ArrayData_`, `_ArrayLabel_`, `_ArrayShape_`, `_ArrayOrder_`, `_ArrayZipType_`,
369+
`_ArrayZipSize_`, `_ArrayZipData_`, `_ArrayZipEndian_`, `_ArrayZipLevel_`, `_ArrayZipOptions_`
370370
* **Hash/Map**: `_MapData_`
371371
* **Table**: `_TableData_`, `_TableCols_`, `_TableRows_`, `_TableRecords_`
372372
* **Enumeration**: `_EnumKey_`, `_EnumValue_`
@@ -538,7 +538,7 @@ to, the following list
538538
```
539539

540540
In software environments where `"_DataInfo_"` can not be defined, such as the root
541-
level of the JSON documents in Apache CouchDB (https://couchdb.apache.org/), an alternative
541+
level of the JSON documents in [Apache CouchDB](https://couchdb.apache.org/), an alternative
542542
name `".datainfo"` can be used instead.
543543

544544
### Data Storage Keywords
@@ -640,8 +640,8 @@ Here, the array annotation keywords are defined below:
640640
the form of a string, for the 1st dimension, the 2nd element defines the name for the 2nd dimension,
641641
and so on. If the label of a dimension is an empty string `""`, it is undefined. If any of the elements
642642
is an array, it further defines the names/labels for the array indices along this dimension. This array
643-
must be in the form `[["label1", column_start1, column_width1], ["label2", column_start2, ...]]`,
644-
where optional intergers `column_start_i` and `column_width_i` define the start and width, respectively,
643+
must be in the form `[["label_1", column_start_1, column_width_1], ["label_2", column_start_2, ...]]`,
644+
where optional integers `column_start_i` and `column_width_i` define the start and width, respectively,
645645
of the array indices that are associated with this label.
646646

647647
To facilitate the pre-allocation of the buffer for storage of the array in the parser, when
@@ -1455,55 +1455,73 @@ and processing JData files
14551455
* **`JD_GetType(node)`**: returns the node type
14561456
* **`JD_GetLength(node)`**: returns the number of children of the specified node
14571457

1458-
### Index vector
1458+
### Element Referencing
14591459

14601460
Essentially, JData stores a serialized version of complex data using collections
14611461
of sequential or nested nodes, either in the named or indexed form. To access
14621462
any element (a leaflet, leaf or branch) of the JData document, one should use a vector
14631463
of indices that points to the specific node.
14641464

14651465
A JData-compliant parser must be able to retrieve JData elements via the below pseudo-code
1466-
interface using a linear index vector
1466+
interface using an index vector or reference string
14671467
```
14681468
JD_Node item = JD_GetNode(JD_Node root, [i1, i2, i3, i4, ...], is_compact)
1469+
or
1470+
JD_Node item = JD_GetNode(JD_Node root, "key")
14691471
```
14701472
where `i1` is the index of the data on the top-most level (relative to the root level of the
14711473
`"root"` object), `i2` is the index along the 2nd level, and so on. Each index is an
14721474
integer, starting from 1, denoting the order
14731475
of the data element among the serialized elements of the same level. In other words, if
14741476
the current level is an array object, the index is the count of the elements before this data
14751477
element plus 1; if the current level is a structure, the index is the count of the named
1476-
nodes appearing before this data plus 1. Using the tree data structure above, the linear index
1478+
nodes appearing before this data plus 1.
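
As a rough illustration of the index-vector lookup described above, the following Python sketch walks a parsed document one level per index. The function name `get_node_by_index` and the sample data are hypothetical and not part of the specification. The sketch also assumes that the parser keeps object members in document order, which holds for `dict` objects produced by `json.loads` in Python 3.7 and later.

```python
import json


def get_node_by_index(root, index_vector):
    """Follow a 1-based index vector through nested dicts and lists.

    Hypothetical helper: at each level, index i selects the i-th child,
    counting named nodes (for objects) or elements (for arrays) from 1.
    """
    node = root
    for depth, i in enumerate(index_vector, start=1):
        if isinstance(node, dict):
            children = list(node.values())  # relies on insertion order
        elif isinstance(node, list):
            children = node
        else:
            raise IndexError(f"level {depth}: node has no children")
        if not 1 <= i <= len(children):
            raise IndexError(f"level {depth}: index {i} out of range")
        node = children[i - 1]
    return node


# Sample data invented for this sketch.
doc = json.loads("""
{
  "_TreeNode_(root)": "data0",
  "_TreeChildren_": [
    {"_TreeNode_(node1)": "data1"},
    {"_TreeNode_(node2)": "data2"},
    {"_TreeNode_(node3)": "data3"}
  ]
}
""")

print(get_node_by_index(doc, [1]))        # data0
print(get_node_by_index(doc, [2, 2, 1]))  # data2
```

Because the lookup depends on member order, it is only as reliable as the ordering guarantee of the underlying container.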
1479+
1480+
In some programming environments, such as C++ and Python, object keys may be unordered.
1481+
As a result, referencing by an integer-based index vector may not be reliable. An alternative
1482+
is [JSONPath](https://goessner.net/articles/JsonPath/). JSONPath uses a series of
1483+
dot-separated names to locate an element inside a JSON data tree, such as
1484+
```
1485+
$.name1.name2.name3[0].name4
1486+
```
1487+
In this notation, `$` refers to the root of the JSON object and `$.name1` refers to the root-level
1488+
sub-element `name1`. In `$.name1.name2.name3[0]`, the `name2` child of `$.name1` has a child
1489+
named `name3`, which is an array, and `name3[0]` selects the first element of that
1490+
array. Lastly, the first element of the `name3` array has a child named `name4`.
1491+
When an object name contains `.`, `[`, or `]`, each such character must be escaped by inserting a
1492+
`\` before it. For example, `$.file.test\.json` specifies a key named `test.json` under the `$.file` object.
1493+
1494+
It is worth mentioning that JSONPath supports deep-scan (`..`) and filtering operators,
1495+
although this specification does not require the parsers to fully support all JSONPath
1496+
operators.
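
The JSONPath subset described above (the `$` root, dot-separated names, zero-based `[n]` array indices, and backslash escaping) can be sketched in Python as follows. This is only a minimal illustration: the names `parse_path` and `resolve_path` are hypothetical, the sample data are invented, and deep-scan (`..`) and filter operators are deliberately rejected rather than supported.

```python
def parse_path(path):
    """Split a restricted JSONPath string into names (str) and indices (int)."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    tokens = []
    i, n = 1, len(path)
    while i < n:
        ch = path[i]
        if ch == ".":
            # Read a name up to the next unescaped '.' or '['.
            i += 1
            name = []
            while i < n and path[i] not in ".[":
                if path[i] == "\\" and i + 1 < n:
                    i += 1  # keep the escaped character literally
                name.append(path[i])
                i += 1
            if not name:
                raise ValueError("empty name; deep scan '..' is not supported")
            tokens.append("".join(name))
        elif ch == "[":
            end = path.index("]", i)  # raises ValueError if ']' is missing
            tokens.append(int(path[i + 1:end]))
            i = end + 1
        else:
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    return tokens


def resolve_path(root, path):
    """Return the node that a restricted JSONPath string points to."""
    node = root
    for token in parse_path(path):
        if isinstance(token, int):
            if not isinstance(node, list):
                raise KeyError(f"index [{token}] applied to a non-array node")
        elif not isinstance(node, dict):
            raise KeyError(f"name {token!r} applied to a non-object node")
        node = node[token]
    return node


# Sample data invented for this sketch.
doc = {
    "name1": {"name2": {"name3": [{"name4": "leaf"}]}},
    "file": {"test.json": "found"},
}

print(resolve_path(doc, "$.name1.name2.name3[0].name4"))  # leaf
print(resolve_path(doc, r"$.file.test\.json"))            # found
```

Since this lookup goes by name rather than by position, it does not depend on the order in which object keys are stored.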
1497+
1498+
Using the tree data structure above, the linear index
14771499
of each node is listed below on the right side:
14781500

14791501
```
14801502
{
1481-
"_TreeNode_(root)": data0, <- [1]
1482-
"_TreeChildren_": [ <- [2]
1483-
{"_TreeNode_(node1)": data1}, <- [2,1]
1484-
{ <- [2,2]
1485-
"_TreeNode_(node2)": data2, <- [2,2,1]
1486-
"_TreeChildren_": [ <- [2,2,2]
1487-
{"_TreeNode_(node2.1): data2.1}, <- [2,2,2,1]
1488-
{"_TreeNode_(node2.2): data2.2} <- [2,2,2,2]
1503+
"_TreeNode_(root)": data0, <- [1] or $._TreeNode_(root)
1504+
"_TreeChildren_": [ <- [2] or $._TreeChildren_
1505+
{"_TreeNode_(node1)": data1}, <- [2,1] or $._TreeChildren_[0]
1506+
{ <- [2,2] or $._TreeChildren_[1]
1507+
"_TreeNode_(node2)": data2, <- [2,2,1] or $._TreeChildren_[1]._TreeNode_(node2)
1508+
"_TreeChildren_": [ <- [2,2,2] or $._TreeChildren_[1]._TreeChildren_
1509+
{"_TreeNode_(node2.1)": data2.1}, <- [2,2,2,1] or $._TreeChildren_[1]._TreeChildren_[0]
1510+
{"_TreeNode_(node2.2)": data2.2} <- [2,2,2,2] or $._TreeChildren_[1]._TreeChildren_[1]
14891511
]
14901512
},
1491-
{ <- [2,3]
1492-
"_TreeNode_(node3)": data3 <- [2,3,1], or [[2,3]]
1513+
{ <- [2,3] or $._TreeChildren_[2]
1514+
"_TreeNode_(node3)": data3 <- [2,3,1], or [[2,3]] or $._TreeChildren_[2]._TreeNode_(node3)
14931515
}
14941516
]
14951517
}
14961518
```
1497-
One can insert zeros to the right-side of the indexing vector if the array storing the vector
1498-
has a length longer than the depth of the assessed node. In this case, the first 0 scanning
1499-
from left to right of the indexing vector is considered the termination flag of the index.
1500-
In other words, index vectors `[2,2]`, `[2,2,0]` and `[2,2,0,0]` are equivalent.
15011519

15021520
The third parameter, `is_compact`, is a boolean flag. If set to `true`, `JD_GetNode`
15031521
shall skip the index if any of the dimensions along the indexing vector is a singlet,
15041522
i.e. the child count is 1. The compact indexing vector, enclosed by double-square-brackets
15051523
as `[[...]]`, shall be passed to `JD_GetNode` as the 2nd input when `is_compact` is `true`.
1506-
Using the above example, both index vectors [[2,3]] and [2,3,1] refer to
1524+
Using the above example, both index vectors `[[2,3]]` and `[2,3,1]` refer to
15071525
`"_TreeNode_(node3)": data3`. Please be aware that the compact indexing vector can not
15081526
distinguish between row and column vectors as the column vector in JData has a trailing
15091527
singlet dimension ([see N-D array section](#direct-storage-of-n-d-arrays)).
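
One possible reading of the compact indexing rule is sketched below in Python. It is an interpretation, not a normative algorithm: the helper name `get_node_compact` is hypothetical, and the sketch assumes that "skipping" a singlet level means descending through any node with exactly one child without consuming an index, both before and after each index is applied. It also assumes objects keep their members in document order.

```python
def get_node_compact(root, compact_vector):
    """Resolve a compact 1-based index vector, passing through singlet levels."""

    def skip_singlets(node):
        # Descend while the node is a container with exactly one child.
        while isinstance(node, (dict, list)) and len(node) == 1:
            node = next(iter(node.values())) if isinstance(node, dict) else node[0]
        return node

    node = skip_singlets(root)
    for i in compact_vector:
        if isinstance(node, dict):
            children = list(node.values())  # relies on insertion order
        elif isinstance(node, list):
            children = node
        else:
            raise IndexError("node has no children")
        if not 1 <= i <= len(children):
            raise IndexError(f"index {i} out of range")
        node = skip_singlets(children[i - 1])
    return node


# Sample data invented for this sketch, mirroring the tree example.
tree = {
    "_TreeNode_(root)": "data0",
    "_TreeChildren_": [
        {"_TreeNode_(node1)": "data1"},
        {
            "_TreeNode_(node2)": "data2",
            "_TreeChildren_": [
                {"_TreeNode_(node2.1)": "data2.1"},
                {"_TreeNode_(node2.2)": "data2.2"},
            ],
        },
        {"_TreeNode_(node3)": "data3"},
    ],
}

print(get_node_compact(tree, [2, 3]))  # data3
```

Under this reading, the compact vector `[[2,3]]` reaches the same node as the full vector `[2,3,1]`, because the object holding `"_TreeNode_(node3)"` has a single child.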
@@ -1589,28 +1607,28 @@ is the conversion from a structure to an array as shown in the below example:
15891607
```
15901608
{
15911609
"a": {
1592-
"name1": value1, <- [1,1] or ["a","name1"]
1593-
"name2": value2, <- [1,2] or ["a","name2"]
1594-
"name3": value3 <- [1,3] or ["a","name3"]
1610+
"name1": value1, <- [1,1] or ["a","name1"] or $.a.name1
1611+
"name2": value2, <- [1,2] or ["a","name2"] or $.a.name2
1612+
"name3": value3 <- [1,3] or ["a","name3"] or $.a.name3
15951613
},
1596-
"b": "value4" <- [2] or ["b"]
1614+
"b": "value4" <- [2] or ["b"] or $.b
15971615
}
15981616
```
15991617
to
16001618
```
16011619
{
16021620
"a": [
16031621
{
1604-
"name1": value1 <- [[1,1]] or [["a","name1"]]
1622+
"name1": value1 <- [[1,1]] or [["a","name1"]] or $.a[0].name1
16051623
},
16061624
{
1607-
"name2": value2 <- [[1,2]] or [["a","name2"]]
1625+
"name2": value2 <- [[1,2]] or [["a","name2"]] or $.a[1].name2
16081626
},
16091627
{
1610-
"name3": value3 <- [[1,3]] or [["a","name3"]]
1628+
"name3": value3 <- [[1,3]] or [["a","name3"]] or $.a[2].name3
16111629
}
16121630
],
1613-
"b": "value4" <- [2] or ["b"]
1631+
"b": "value4" <- [2] or ["b"] or $.b
16141632
}
16151633
```
16161634
The only permitted "non-isometric transform" is the conversion between a direct
@@ -1631,7 +1649,7 @@ A link is defined by a named leaflet or leaf as shown in the below two styles
16311649
"link_style2": {
16321650
"_DataLink_": {
16331651
"URI": "path",
1634-
"Parameters": [...],
1652+
"Parameters": "$.key1.key2...",
16351653
"MaxRecursion": 1
16361654
}
16371655
}
@@ -1642,11 +1660,11 @@ by the indexing vector string to point to a specific element of the referenced d
16421660
For example, the below link
16431661
```
16441662
{
1645-
"_DataLink_": "file:///space/test/jdfiles/tree.jdat:[1,2,2]"
1663+
"_DataLink_": "file:///space/test/jdfiles/tree.jdat:$.key1.key2..."
16461664
}
16471665
```
16481666
asks the parser to read a local file located at "/space/test/jdfiles/tree.jdat" and
1649-
load the node specified by indexing vector `[1,2,2]`, starting from the root (or super-root
1667+
load the node specified by JSONPath string `$.key1.key2...`, starting from the root (or super-root
16501668
if containing CJSON) to replace the `"_DataLink_"` node in the current document.
16511669

16521670
If using a `"_DataLink_"` structure, additional parameters can be specified via
@@ -1678,10 +1696,10 @@ Then, the data object can be referenced as shown in the below example `"global_l
16781696
"_DataLink_": "#a_unique_anchor_name"
16791697
}
16801698
"local_link2": {
1681-
"_DataLink_": [1,2,2]
1699+
"_DataLink_": "$.key1.key2..."
16821700
}
16831701
"local_compact_link3": {
1684-
"_DataLink_": [[1,2,2]]
1702+
"_DataLink_": [1,2,2]
16851703
}
16861704
```
16871705
A data link URI starting with "#" refers to the data anchor defined within the same document,
@@ -1692,24 +1710,6 @@ a local node, as shown in the `"local_link2"` example above. A compact indexing
16921710
i.e. 1-level nested vector, as shown in the `"local_compact_link3"` example above. The
16931711
behaviors of other types of data link values are not specified.
16941712

1695-
Instead of using the above described index vector to reference a sub-element inside an
1696-
externally linked JData file, this specification also permit the use of
1697-
[JSONPath specifiers](https://goessner.net/articles/JsonPath/). JSONPath uses a series of
1698-
dot-separated names to locate an element inside a JSON data tree, such as
1699-
```
1700-
$.name1.name2.name3[0].name4
1701-
```
1702-
in this notation, `$` refers to the root of the JSON object, `$.name1` refers to the root-level
1703-
sub-element `name1`; `$.name1.name2.name3[0]` specifies the `name2` child of `$.name1` has a child
1704-
named `.name3`, which is an array; using `.name3[0]` means taking the first element of the
1705-
array. Lastly, the first element of the `name3` array has a child named `.name4`.
1706-
When the object names contain `.` `[`, or `]`, they must be escaped by inserting a
1707-
`\` before. For example, `$.file.test\.json` specifies a key named `test.json` under the `$.file` object.
1708-
1709-
It is worth mentioning that JSONPath supports deep-scan (`..`) and filtering operators,
1710-
although this specification does not require the parsers to fully support all JSONPath
1711-
operators.
1712-
17131713
Recommended File Specifiers
17141714
------------------------------
17151715
