2010-01-30

Mapping Relational Data to JSON/HTTP

Information stored in a SQL database can efficiently be made available over a HTTP API with JSON as media type. This article describes a strategy to build a web representation for normalized relational data as it is found in a SQL database. When creating applications for the web, I don't have to abandon decades of best practices in relational data modeling, but can continue to build upon them.

An application layer between the database and the frontend consisting of JSON objects identified by URI:s serve two purposes. First, it formats the data nicely for a HTML+CSS+JS frontend, separating the data from the presentation. Second, it enables automated data processing as it becomes possible for various scripts and agents to access the data directly using standard web protocols and formats.

Each row in the database is represented by a JSON document, uniquely identified by an URI. Relationships between entities are represented by embedded JSON objects containing a link and additional data that is necessary for the frontend implementation to be efficient. Many to many relationships are represented both by a link and a node, where node is a reference to the object that is part of the relation managing the relationship between the entities.

This means that each entity and each entity instance is represented by an URI, and the operations GET, POST, PUT and DELETE can be used in a straightforward manner. Additionally, relations used to build many to many relationships between entities are also represented by URIs. This means that I don't need any special purpose POST commands on the participating entities to build and destroy many to many relationships. The normal pattern using POST to create, DELETE to remove and PUT to update relationships can be used.

I use the URI patterns http://<domain>/<tablename>/<primarykey>, and http://<domain>/<tablename>/<unique> where <unique> is a unique identifier for the row, not necessarily the primary key. I use the latter pattern when I want users to have the ability to create URI:s from the information they have about the object they are looking for. An alternative is to use search to find entities, but a publicly known URI pattern makes it simpler for humans to quickly reference objects. (Note that this simplification is for humans to use, not software clients, which will discover the URI:s when using the API.) The former pattern is used for relations which have no significance for humans, and are primarily used by software clients only.

Time for an example from a PDM system.



A product is described in zero or more documents. A product can also form a relationship with another product, where one is part of the other. The relationship between a product and its documents is one to many, and the relationship between products is many to many. Both relationships have a lifecycle property defining the state of the relationship. A product is identified by a class and a number. A document with the product identifier and a number. Each entity also has an autmatically assigned sequence number as primary key.

Product:
-----------------------
| id | class | number |
+----+-------+--------+
| 1 | AB | 22 |
| 2 | AB | 23 |

Where id is the primary key and <class, number> is unique.

Document:
---------------------------------------
| id | belongsto | number | lifecycle |
+----+-----------+--------+-----------+
| 1 | 1 | 445 | active |
| 2 | 1 | 811 | obsolete |

Where id is the primary key, belongsto a foreign product key and <belongsto, number> is unique.

Subproduct:
------------------------------
| parent | child | lifecycle |
+--------+-------+-----------+
| 1 | 2 | active |

Which represents the many to many relationship between products where parent and child are foreign product keys and <parent, child> is unique.

JSON representations
http://mydomain.com/product/AB+22:
{
"class": "AB",
"number": 22,
"name": "AB 22",
"immutable": [
"class",
"number"
],
"mutable": [
],
"documents": [
{"node": "http://mydomain.com/document/AB+22/445",
"link": "http://mydomain.com/document/AB+22/445",
"name": "AB 22 - 445",
"lifecycle": "active"},
{"node": "http://mydomain.com/document/AB+22/811",
"link": "http://mydomain.com/document/AB+22/811",
"name": "AB 22 - 811"},
"lifecycle": "obsolete"}
],
"children": [
{"node": "http://mydomain.com/subproduct/1/2",
"link": "http://mydomain.com/product/AB+23",
"name": "AB 23",
"lifecycle": "active"}
],
"parents": [
]
}

http://mydomain.com/product/AB+23:
{
"class": "AB",
"number": 23,
"name": "AB 23",
"immutable": [
"class",
"number"
],
"mutable": [
],
"documents": [
],
"children": [
],
"parents": [
{"node": "http://mydomain.com/subproduct/1/2",
"link": "http://mydomain.com/product/AB+22",
"name": "AB 22",
"lifecycle": "active"}
]
}

http://mydomain.com/document/AB+22/445:
{
"belongsto": {
"node": "http://mydomain.com/document/AB+22/445",
"link": "http://mydomain.com/product/AB+22",
"name": "AB 22"
},
"number": 445,
"name": "AB 22 - 445",
"lifecycle: "active",
"immutable": [
"belongsto",
"number"
],
"mutable": [
"lifecycle"
]
}

http://mydomain.com/document/AB+22/811:
{
"belongsto": {
"node": "http://mydomain.com/document/AB+22/811",
"link": "http://mydomain.com/product/AB+22",
"name": "AB 22"
},
"number": 811,
"name": "AB 22 - 811",
"lifecycle: "obsolete",
"immutable": [
"belongsto",
"number"
],
"mutable": [
"lifecycle"
]
}

http://mydomain.com/subproduct/1/2:
{
"parent": {
"node": "http://mydomain.com/subproduct/1/2",
"link": "http://mydomain.com/product/AB+22",
"name": "AB 22"
},
"child": {
"node": "http://mydomain.com/subproduct/1/2",
"link": "http://mydomain.com/product/AB+23",
"name": "AB 23"
},
"lifecycle: "active",
"immutable": [
"parent",
"child"
],
"mutable": [
"lifecycle"
]
}


name is a human readable identifier, It is present in relationships to avoid unnecessary server communication when relationships are displayed for humans to navigate. lifecycle is present in the relationships to make it possible for the frontend to act differently depending on the current state. This is also motivated by efficiency reasons.

Product properties parents and children link directly to the related products. Ihe node property links to the corresponding subproduct. link is used for normal navigation and node when the relationship itself shall be updated or deleted. node is also present in product-document relationships to make it easier to write polymorphic frontend code.

New objects are created by POSTing to http://mydomain.com/product, http://mydomain.com/document and http://mydomain.com/subproduct. Objects are removed with DELETE on the object URI. Objects are updated by PUTing a new representation to the object URI. Changes will be rejected if they compromise data integrity. Relationships can not be changed by PUTing a modified product. They can only be changed through document and subproduct objects.

mutable and immutable exists to make it easier to write polymorphic frontend code. mutable contains the fields that are overwritten in a PUT operation. Both together contains the POST parameters to use when creating a new instance.


Reblog this post [with Zemanta]

Debugging with Popper