2020-11-28

Product Information Index - Pii

Track your product structure and dependencies in a fun way. Automate the tedious task of collecting and sorting the information generated during product development. Query the result to find unimplemented requirements, failed tests or recent changes, or just browse around and impress yourself with what you have accomplished.

Pii helps us with the complex work of tracking artifacts and their relationships during product development. Pii integrates with the existing infrastructure and processes that you already use for product development and documentation. Pii automatically tracks changes and their impact on related artifacts. An artifact is the abstract idea that represents a physical thing or a digital record that exists in different versions. Custom data formats are integrated with Pii using Python functions for parsing and categorising information.

Role-Relationship Model

Pii is based on a relational database with a browser frontend. Entities are represented by UUIDs and are associated with other entities and values through relations. Entities are assigned roles (types) dynamically. Roles decide which relations the entity can join. This is unlike traditional entity-relationship modeling, where entities are modelled as relations. In the relational model of Pii, an entity is just a UUID taking on different roles to participate in relations. Let's call it an RR-Model.

Tracking Changes

Tracking changes in the contents of a file is a basic capability of Pii. A single line in tracker.py adds a file to Pii.

a = trackFile("path/to/filename", "Content-Type")

Initially, one MutableE entity representing the file and one ConstantE entity that represents the contents of the file are added to Pii. MutableE and ConstantE are roles. They are associated through the relation ContentEE. The MutableE entity is also assigned the role FileE with the additional relation PathES.

a = UUID()

a -- EntityE
a -- IdentityES -- "filename"

a -- MutableE

a -- FileE
a -- PathES -- "path/to/filename"

b = UUID()

b -- EntityE
b -- IdentityES -- "filename 2020-10-14T19:37:10.121"

b -- ConstantE

b -- ContentTypeES -- "Content-Type"
b -- ContentEB -- <11010...>
b -- ShaES -- "E0EE5BC391BB02D9891139EBBA3C674CFA1CA712"

a -- ContentEE -- b

When a change to the file content is detected, an additional ConstantE is created representing the new content.

c = UUID()

c -- EntityE
c -- IdentityES -- "filename 2020-10-14T19:42:13.443"

c -- ConstantE

c -- ContentTypeES -- "Content-Type"
c -- ContentEB -- <10110...>
c -- ShaES -- "69CE10436A9247A38A07B07BDC5A02B4CAAC3CF1"

a -- ContentEE -- c

Now we have two instances of the file stored in Pii, both related to the same file. 
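
The change detection described above can be sketched in a few lines of Python. This is only an illustration and not the code in tracker.py: the function name track_content, the in-memory list standing in for the database, and the use of SHA-1 (suggested by the 40-character digests in the example) are all assumptions.

```python
import hashlib
import uuid
from datetime import datetime, timezone


def track_content(store, mutable_id, name, content, content_type):
    """Append a new ConstantE for `content` unless the content is unchanged.

    `store` is a plain list of (left, relation, right) tuples standing in
    for the database (hypothetical). Returns the UUID of the ConstantE that
    represents the current content.
    """
    sha = hashlib.sha1(content).hexdigest().upper()

    # Look up the most recently associated content entity, if there is one.
    contents = [r for (l, rel, r) in store
                if l == mutable_id and rel == "ContentEE"]
    if contents:
        latest = contents[-1]
        latest_sha = next(r for (l, rel, r) in store
                          if l == latest and rel == "ShaES")
        if latest_sha == sha:
            return latest  # same digest: nothing to append

    constant_id = str(uuid.uuid4())
    stamp = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    store.extend([
        (constant_id, "EntityE", None),
        (constant_id, "IdentityES", f"{name} {stamp}"),
        (constant_id, "ConstantE", None),
        (constant_id, "ContentTypeES", content_type),
        (constant_id, "ContentEB", content),
        (constant_id, "ShaES", sha),
        (mutable_id, "ContentEE", constant_id),
    ])
    return constant_id
```

Calling the function twice with the same bytes returns the same ConstantE, while new bytes append a second one next to the first.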

Schema

Column types in Pii are the usual suspects: String, Timestamp, Binary, Integer, Real and Entity.

  • String S is UTF-8 encoded text. (sqlite3 Text)
  • Timestamp T is ISO8601 date and time. (sqlite3 Text)
  • Binary B is a sequence of octets. (sqlite3 Blob)
  • Integer I is an integer number. (sqlite3 Integer)
  • Real R is a real number. (sqlite3 Real)
  • Entity E is a UUID. (sqlite3 Text)

All relations - both unary and binary - have the column left L. Binary relations also have the column right R. L and R contain entity UUIDs or values from the value types. The types of the columns are shown with letters in the name of the relation. Relations also have columns for creation time T and association A.

create table ConstantE (l text, t text, a text);
create table ContentEB (l text, r blob, t text, a text);
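
Since the column types are encoded in the relation name, the create statements can be derived mechanically. Here is a small sketch of that idea. The function create_table and the regular expression are my own illustration and assume that the base name of a relation ends in a lowercase letter, so that the trailing uppercase letters are always type letters.

```python
import re

# Type letters and their sqlite3 column types, as listed above.
SQLITE_TYPES = {"S": "text", "T": "text", "B": "blob",
                "I": "integer", "R": "real", "E": "text"}

# A base name followed by one (unary) or two (binary) type letters.
RELATION_NAME = re.compile(r"^[A-Za-z]+?([STBIRE]{1,2})$")


def create_table(relation):
    """Build the create statement for a relation from its name alone."""
    match = RELATION_NAME.match(relation)
    if match is None:
        raise ValueError(f"cannot derive column types from {relation!r}")
    letters = match.group(1)
    columns = [f"l {SQLITE_TYPES[letters[0]]}"]
    if len(letters) == 2:  # binary relations also get a right column
        columns.append(f"r {SQLITE_TYPES[letters[1]]}")
    columns += ["t text", "a text"]  # creation time and association
    return f"create table {relation} ({', '.join(columns)});"


print(create_table("ConstantE"))
print(create_table("ContentEB"))
```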

Rows in relations are never removed or changed, only appended. The A column indicates if the relation should be realized (True) or if it should be broken up (False). This enables us to track the state of the relation over time, undo changes and construct views with cardinality n:n, 1:n, n:1 and 1:1 - all present at the same time for application queries. The schema can be seen as an extension of the 7:th normal form.
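
To make the append-only idea concrete, here is a minimal sketch using Python's sqlite3 module. The helper names associate and realized_pairs, and the choice to store A as the strings 'True' and 'False', are assumptions made for the example. The point is that breaking up an association is just another row, so the state at any earlier time can still be reconstructed.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("create table ContentEE (l text, r text, t text, a text)")


def associate(l, r, t, realized=True):
    """Append a row. Existing rows are never updated or deleted."""
    db.execute("insert into ContentEE values (?, ?, ?, ?)",
               (l, r, t, "True" if realized else "False"))


def realized_pairs(as_of):
    """Return the (l, r) pairs whose latest row at time `as_of` is realized."""
    rows = db.execute(
        "select l, r, a from ContentEE where t <= ? order by t, rowid",
        (as_of,))
    state = {}
    for l, r, a in rows:
        state[(l, r)] = (a == "True")  # later rows override earlier ones
    return sorted(pair for pair, realized in state.items() if realized)


associate("a", "b", "2020-10-14T19:37:10.121")
associate("a", "c", "2020-10-14T19:42:13.443")
associate("a", "b", "2020-10-15T08:00:00.000", realized=False)  # break up a -- b

print(realized_pairs("2020-10-14T23:59:59.999"))  # state before the break-up
print(realized_pairs("2020-10-15T23:59:59.999"))  # state after the break-up
```

Timestamps in the same ISO 8601 format sort correctly as plain text, which is what the comparison in the query relies on.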

For every relation, four views will be created that represent the cardinalities. For the relation ContentEE the views ContentEEcnn, ContentEEcn1, ContentEEc1n and ContentEEc11 will be available. These views only contain currently realized associations and therefore the column A is not needed.

create view MutableEcn as select l, t ...
create view ContentEEcnn as select l, r, t ...
create view ContentEEcn1 as select l, r, t ...
create view ContentEEc1n as select l, r, t ...
create view ContentEEc11 as select l, r, t ...
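
The view definitions are abbreviated above and I will not spell them out here. As a rough illustration of what the four cardinalities could mean, here is a Python sketch of one possible interpretation: the latest association wins, so cn1 keeps at most one right value per left, c1n keeps at most one left per right, and c11 enforces both. The function project and this reading of the suffixes are assumptions, not the actual view logic.

```python
def project(rows, one_left=False, one_right=False):
    """Reduce appended rows to the currently realized associations.

    rows      -- iterable of (l, r, t, a) tuples, a being True or False
    one_right -- each l keeps only its latest r (the cn1 reading)
    one_left  -- each r keeps only its latest l (the c1n reading)
    Both flags together give c11, neither gives cnn.
    """
    current = {}  # (l, r) -> t
    for l, r, t, a in sorted(rows, key=lambda row: row[2]):
        if not a:
            current.pop((l, r), None)  # the association was broken up
            continue
        for (pl, pr) in list(current):
            if (one_right and pl == l) or (one_left and pr == r):
                del current[(pl, pr)]  # replaced by the newer association
        current[(l, r)] = t
    return sorted((l, r, t) for (l, r), t in current.items())


rows = [
    ("a", "b", "2020-10-14T19:37:10.121", True),
    ("a", "c", "2020-10-14T19:42:13.443", True),
]
print(project(rows))                  # cnn: both contents
print(project(rows, one_right=True))  # cn1: only the latest content of a
print(project(rows, one_left=True))   # c1n: each content has one owner
```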

Information Model

The growth of the schema is open ended. An entity starts out as being only a UUID, which is not even stored in the database. Roles are then added to the entity. An entity can take on just one or all roles in the system at once. New roles can be added at any time. It would be tedious if the presentation layer had to search through all relations to find a particular entity. We need to add information that the presentation layer can use when it wants to display an entity and its relations. We want to specify if an entity role (unary relation) participates on the left or the right side of a binary relation. We also want to know the roles a specific entity has. The relations LeftSS, RightSS and RoleES do this for us.

a -- RoleES -- "EntityE"
a -- RoleES -- "MutableE"
a -- RoleES -- "FileE"
b -- RoleES -- "EntityE"
b -- RoleES -- "ConstantE"
c -- RoleES -- "EntityE"
c -- RoleES -- "ConstantE"

"FileE" -- LeftSS -- "PathEScn1"

"ConstantE" -- LeftSS -- "ContentTypeEScn1"
"ConstantE" -- LeftSS -- "ContentEBcn1"
"ConstantE" -- LeftSS -- "ShaEScn1"

"MutableE" -- LeftSS -- "ContentEEc1n"

"ContentEEc1n" -- RightSS -- "ConstantE"

Note that the value types S, B, T, I, R are not added to RightSS. They are derived from the relation name when needed. It is only the entity roles that we need to model in this way. 
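
Here is a sketch of how a presentation layer might use these three relations to decide which views to query for an entity. The function views_for and the in-memory lists are hypothetical, and only a few of the rows from above are included.

```python
# A subset of the rows listed above, as plain tuples.
role_es = [("a", "EntityE"), ("a", "MutableE"), ("a", "FileE"),
           ("b", "EntityE"), ("b", "ConstantE")]
left_ss = [("FileE", "PathEScn1"),
           ("ConstantE", "ShaEScn1"),
           ("MutableE", "ContentEEc1n")]
right_ss = [("ContentEEc1n", "ConstantE")]


def views_for(entity):
    """Return the views where the entity can appear on the left and on the right."""
    roles = {role for (e, role) in role_es if e == entity}
    as_left = sorted(view for (role, view) in left_ss if role in roles)
    as_right = sorted(view for (view, role) in right_ss if role in roles)
    return as_left, as_right


print(views_for("a"))  # the file: path and contents on the left side
print(views_for("b"))  # a content: its digest, and the file that owns it
```

With this information the presentation layer only has to query the handful of views that the roles of the entity point to, instead of every relation in the database.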

From now on, whenever I assign a role to an entity, e.g. x -- FileE, it also means that there will be an x -- RoleES -- "FileE" row added to RoleES in addition to the row in relation FileE. And instead of specifying the LeftSS and RightSS relations, I will use the following shorthands to define the information model.

FileE -- PathEScn1

ConstantE -- ContentTypeEScn1
ConstantE -- ContentEBcn1
ConstantE -- ShaEScn1

MutableE -- ContentEEc1n -- ConstantE
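
The shorthand maps back to LeftSS and RightSS rows in a mechanical way. A small sketch of that expansion follows. The function expand is only an illustration of the notation, not part of Pii.

```python
def expand(shorthand):
    """Expand one shorthand line into (left, relation, right) rows."""
    parts = [part.strip() for part in shorthand.split("--")]
    if len(parts) == 2:
        # Role -- View: the right side is a value type, so no RightSS row.
        role, view = parts
        return [(role, "LeftSS", view)]
    if len(parts) == 3:
        # Role -- View -- Role: both sides are entity roles.
        left, view, right = parts
        return [(left, "LeftSS", view), (view, "RightSS", right)]
    raise ValueError(f"not a shorthand line: {shorthand!r}")


for line in ["FileE -- PathEScn1", "MutableE -- ContentEEc1n -- ConstantE"]:
    for row in expand(line):
        print(row)
```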

All entities should have the role EntityE with the relation IdentityES. This is the human readable name of the entity.

EntityE -- IdentityEScn1

All relations are always optional for an entity to participate in. Any entity that hasn't specified an identity will just not have a value for that relation.

The shape and display color of an entity can be changed by its roles. The last role assigned to the entity has precedence.

"MutableE" -- ShapeSS -- "box"
"ConstantE" -- ColorSS -- "white"

Embedded Records

Now let's assume the file is a requirements specification that contains requirements. We use a function that can parse the document and extract the requirements. Add a second line to tracker.py.

rs = trackRequirements(a)

The requirements that are found will be added to Pii.

d = UUID()

d -- EntityE
d -- IdentityES -- "Requirement 17.3 Make it fast! v6.1"
d -- MutableE
d -- EmbeddedE

a -- ContainerE
a -- MemberEE -- d

e = UUID()

e -- ConstantE

e -- ContentTypeES -- "text/plain"
e -- ContentEB -- <11010...>
e -- ShaES -- "2CAF6FBAE0B00796E2B59656660941BC331FDEED"
e -- EmbeddedE

c -- ContainerE
c -- MemberEE -- e


d -- ContentEE -- e

# Information Model
ContainerE -- MemberEEc1n -- EmbeddedE

Both the mutable and the constant requirement entities are assigned the role EmbeddedE. This is not strictly necessary and could be seen as duplication of information. This is not a problem, as Pii is not the original source of the information; it just models information already available elsewhere and it will never be updated manually. Therefore it is harmless to add redundant relations that will simplify navigation for the user.

We could also use EmbeddedE for FileEs that are stored in ContainerE zip FileEs.

Note that we have not assigned the role RequirementE to the MutableE. This is a role we want to save for a higher level entity.

Versions

Different versions of an artifact form a collection that we want to keep together. The low level change tracking that we get with ConstantE and MutableE is not suitable for this task. We may for example want to have several versions available in the file system at the same time, and we may want to maintain parallel branches. We will introduce two entity roles called VersionE and ArtifactE and extend the MutableE d from above, which represents the requirement, with VersionE. The artifact will also have the role RequirementE.

d -- VersionE

f = UUID()

f -- EntityE
f -- IdentityES -- "Requirement 17.3 Make it fast!"
f -- ArtifactE

f -- RequirementE

f -- VersionEE -- d

# Information Model
ArtifactE -- VersionEEc1n -- VersionE

Branches

A branch is an entity that is both an ArtifactE and a VersionE and it is used to group versions together for easier navigation. A dot in a version number indicates that we have a branch. We will rebuild the structure above by inserting a branch between the ArtifactE f and the VersionE d.

g = UUID()

g -- EntityE
g -- IdentityES -- "Requirement 17.3 Make it fast! v6.x"
g -- ArtifactE
g -- VersionE

f -- VersionEE -- g
g -- VersionEE -- d

The VersionE can be associated with both the top ArtifactE and the branch ArtifactE. This is probably a correct representation of how the model has evolved over time. First we have the branchless versions 1, 2, 3, 4, 5, 6, 7 directly associated with the top ArtifactE f. Then the need to branch version 6 arises and we create the branch 6.x where we put both version 6 (aka 6.0) and 6.1. Nothing prevents us from also associating ArtifactE f with 6.1 directly if we want to.
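
Because a branch is both an ArtifactE and a VersionE, listing all versions of an artifact becomes a walk through the VersionEE relation. A sketch follows. The function versions_of and the in-memory data are hypothetical, with f, g and d being the entities from the example.

```python
def versions_of(artifact, version_ee, artifacts):
    """Collect every plain version reachable from `artifact`.

    version_ee -- list of (artifact, version) pairs
    artifacts  -- set of entities that have the role ArtifactE
    Branches are followed rather than reported, and a version that is
    reachable along several paths is listed only once.
    """
    found, stack, seen = [], [artifact], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        for left, right in version_ee:
            if left != node:
                continue
            if right in artifacts:
                stack.append(right)  # a branch: descend into it
            elif right not in found:
                found.append(right)
    return found


# f is the requirement, g the branch 6.x, d the version 6.1.
# The last pair is the optional direct association mentioned above.
version_ee = [("f", "g"), ("g", "d"), ("f", "d")]
print(versions_of("f", version_ee, artifacts={"f", "g"}))
```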

Designing and Building Things using Parts, Materials, Tools and Instructions

A product design is modelled as an ArtifactE. To be able to build items of this design we need to know which parts and materials to use, what tools we need and what instructions to follow. These four categories are in turn modelled as ArtifactEs and can be broken down in the same way.

ComposedE -- BoMPartEEc1n -- BoMPartE

BoMPartE -- CountEIcn1
BoMPartE -- PartEEcn1 -- PartE

ComposedE -- BoMMaterialEEc1n -- BoMMaterialE

BoMMaterialE -- AmountERcn1
BoMMaterialE -- UnitEScn1
BoMMaterialE -- MaterialEEcn1 -- MaterialE

BoM stands for Bill of Materials, which is a list of the things, and how much of each, we need to build something. The ArtifactE we are designing will come in different versions over time. Each VersionE entity of the ArtifactE will also have the relations above, but they will associate with other VersionEs of the ArtifactEs that are used as parts in each particular case.
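
Since parts can themselves be composed of parts, a common query is to flatten the structure into total counts. The following sketch shows one way to do that over the part relations. The function total_parts, the dictionaries standing in for the relations, and the bike example are made up for illustration, and the sketch assumes that the structure contains no cycles.

```python
def total_parts(composed, bom_part_ee, part_ee, count_ei,
                multiplier=1, totals=None):
    """Sum the number of leaf parts needed to build `composed`.

    bom_part_ee -- list of (composed entity, BoM line) pairs
    part_ee     -- dict from BoM line to the part it refers to
    count_ei    -- dict from BoM line to the count of that part
    """
    totals = {} if totals is None else totals
    for owner, line in bom_part_ee:
        if owner != composed:
            continue
        part = part_ee[line]
        quantity = multiplier * count_ei[line]
        if any(o == part for o, _ in bom_part_ee):
            # The part has its own BoM: descend and scale the counts.
            total_parts(part, bom_part_ee, part_ee, count_ei,
                        quantity, totals)
        else:
            totals[part] = totals.get(part, 0) + quantity
    return totals


bom_part_ee = [("bike", "l1"), ("bike", "l2"), ("wheel", "l3"), ("wheel", "l4")]
part_ee = {"l1": "wheel", "l2": "frame", "l3": "spoke", "l4": "rim"}
count_ei = {"l1": 2, "l2": 1, "l3": 32, "l4": 1}

print(total_parts("bike", bom_part_ee, part_ee, count_ei))
```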

Software and other information ArtifactEs seldom use more than one of each of the components that they are built from (licensing conditions could create exceptions). This motivates a simpler kind of relation in these cases.

AggregateE -- ComponentEEcnn -- ComponentE

We may want to use this relation for the ArtifactEs of physical products as well and only specify the BoM for the VersionEs. The reason is that the amount and exactly what is needed to build something can vary over time, but the ArtifactE represents all VersionEs of itself and should therefore not be too specific.

Finally, here are the relations for tools and instructions.

ManufactureE -- ToolEEcnn -- ToolE

WorkE -- InstructionEEcnn -- InstructionE

ElectricalE -- SchemaEEcnn -- SchemaE

MechanicalE -- DrawingEEcnn -- DrawingE

CompiledE -- SourceCodeEEcnn -- SourceCodeE

Manufacture and work should be read as nouns, e.g. a work of art, the manufacture was made of wood.

As we have seen above, digital records that are the result of the build process can be stored directly in the VersionE entity when it is assigned the role MutableE. Physical products are associated with the VersionE through the ItemEE or ProduceEE relations. The VersionE is assigned the BlueprintE role or the RecipeE role depending on what kind of product is produced.

BlueprintE -- ItemEEc1n -- ItemE

RecipeE -- ProduceEEc1n -- ProduceE

ProduceE -- UnitEScn1
ProduceE -- AmountERcn1

If you want to organise your output in batches then let the batch entity have both the BlueprintE+ItemE roles or the RecipeE+ProduceE roles just like we did with branches.

Batteries not Included

Products that don't come with all the parts necessary to function as intended are called integrated. This is typical for software that dynamically loads modules at runtime.

IntegratedE -- ModuleEEcnn -- ModuleE

Specifications

Specifications, like the requirement we modelled earlier, are something that dictates what other artifacts must be or behave like.

SpecificationE -- ImplementationEEcnn -- ImplementationE

Correctly tracked, this relation can be used to find out which changes made to a specification still remain to be implemented, a key feature for full traceability. If an ArtifactE is an ImplementationE of a SpecificationE, then all VersionEs of the SpecificationE should have at least one ImplementationE among the VersionEs of the ArtifactE. If some are missing, then we know we have things left to implement.
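
The rule above translates directly into a query. Here is a sketch in Python over in-memory pairs. The function unimplemented and the entity names are hypothetical, and for brevity it only looks at versions directly associated with each artifact and does not walk through branches.

```python
def unimplemented(spec, artifact, version_ee, implementation_ee):
    """Return the versions of `spec` with no implementation in `artifact`.

    version_ee        -- list of (artifact, version) pairs
    implementation_ee -- list of (specification, implementation) pairs
    """
    spec_versions = [v for (a, v) in version_ee if a == spec]
    artifact_versions = {v for (a, v) in version_ee if a == artifact}
    implemented = {s for (s, i) in implementation_ee
                   if i in artifact_versions}
    return [v for v in spec_versions if v not in implemented]


version_ee = [("spec", "spec v1"), ("spec", "spec v2"),
              ("app", "app v1"), ("app", "app v2")]
implementation_ee = [("spec", "app"),        # the artifact level relation
                     ("spec v1", "app v1")]  # only v1 is implemented so far

print(unimplemented("spec", "app", version_ee, implementation_ee))
```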

It is also valuable to track which TestEs aim to refute the ImplementationE with input from the SpecificationE.

SpecificationE -- TestEEcnn -- TestE

Note that a test can be an implementation of a test specification at the same time it is a refutation test for a product specification.

Tests

Tests can be organized in a hierarchical structure using the aggregate/component roles and relations. The test result associates a version of the test target with a specific version of the test that was performed.

TestResultE -- TestTargetEEcn1 -- TestTargetE
TestResultE -- TestEEcn1 -- TestE

Guides

Guides tell you how to use things. They can be user guides, service guides, and so on.

ApplianceE -- GuideEEcnn -- GuideE

Retired Entities

When an entity is no longer actively participating in the product structure, it takes on the role RetiredE.

x -- RetiredE

I Want Pii

Download the code from https://github.com/TheOtherMarcus/Pii.
