Formal Language: 2020

2020-12-07

- Every thing is a pattern

- But, but, but, physical things, like blueberries. They must not be just a pattern?

- They are. The key feature that lets us perceive a lump of matter as a blueberry is how its atoms are arranged, a pattern. The arrangement makes the lump good at reflecting blue light, among other things.

- The atoms then? They are real.

- They are also just patterns of even smaller things. Or a collection of measurements made by man-made apparatus.

- The light?

- No, not even the light. Everything you can identify as an isolated thing is just a pattern, specific arrangements of smaller things or measurements. Extended in time or space. To be able to recognize a blueberry, its pattern needs to be interpreted by a computing device, i.e. it becomes software for the device, which transforms it into a different pattern. It could become an idea stored in your memory or a rearrangement of your muscular pattern, i.e. movement.

Physical things are patterns. Abstract things are patterns. Everything is a pattern.

The things we call physical are the patterns that exclusively occupy a part of physical reality. No other pattern that exists at the same abstraction level can claim the same space at the same time. Some of our senses like touch and vision have evolved to measure physical patterns.

All patterns are real - by that I mean that they exist somewhere in spacetime - also those that traditionally have been regarded as abstract. Thoughts are real. Numbers are real. Genes are real. They are patterns that can change other patterns, for example the patterns we categorise as physical.

Abstract patterns are substrate independent. If you try to understand them by investigating what the substrate is doing - the particles - the meaning is lost.

Abstract patterns of information are easy to create as long as we have a universal computer, e.g. our mind. When we try to create patterns in the physical world we encounter resistance. Most attempts don't work out the way we would like them to. The resistance is the laws of physics.

Evolution produces computation devices that take patterns as inputs and turn them into output patterns. Life is computation. Computations that can avoid being destroyed by reality stay around. New kinds of computation are created by mutation.

Memes are patterns that run on the reality simulator running on your neurons. They can create copies of themselves in other people's minds. To be able to do that they must survive in the simulated reality and they must create patterns in the physical reality that can be perceived by other people's senses. Memes evolve due to errors in this process. New memes can also arise from our creativity, i.e. ideas trapped in our minds that evolve into a pattern capable of escaping.

It could be that there are physical things out there in reality, but we can only ever understand and process patterns, because our mind is a computing device and patterns are the only thing it can know anything about.

P.S. In Objective Knowledge, Popper uses the concepts of first, second and third world to describe the physical, the personal and the memetic worlds. The laws of physics (first world) created the first replicators by deterministic application of forces. The replicators are then able to preserve the knowledge of replication through the feedback loop of replication. Mutations create new knowledge that hangs on to the stabilising attractor of replication. Neo-Darwinian evolution takes place in this constructed environment maintained by replicators (world 1.5?). The auxiliary knowledge gives rise to new computations that are not just replication. At first they only perform simple transformations from measurement (pattern recognition) to action (pattern creation). When these computations become more and more complex, a simulation of the first world starts to emerge in the mind of the organism (second world) - a virtualisation of physical reality. A new kind of knowledge preserved by replication across minds can arise. It takes the form of memes in the third world. This knowledge survives the death of any particular person. This is not true for knowledge instantiated in the mind of a single person that didn't become memetic. This drive we have to create memetic knowledge and our intuitive recognition of good memetic knowledge have emerged from genetic mutation. How can we recreate this complex computation and run it on a computer? Just any recreation will not do. We also want this artificial person to be part of our third world - to share our memes. D.S.

2020-11-28

Product Information Index - Pii

Track your product structure and dependencies in a fun way. Automate the tedious task of collecting and sorting the information generated during product development. Query the result to find unimplemented requirements, failed tests, recent changes or just browse around and impress yourself with what you have accomplished.

Pii helps us with the complex work of tracking artifacts and their relationships during product development. Pii integrates with existing infrastructure and processes that you already use for product development and documentation. Pii automatically tracks changes and their impact on related artifacts. An artifact is the abstract idea that represents a physical thing or a digital record that exists in different versions. Custom data formats are integrated with Pii using Python functions for parsing and categorising information.

Role-Relationship Model

Pii is based on a relational database with a browser frontend. Entities are represented by UUIDs and are associated with other entities and values through relations. Entities are assigned roles (types) dynamically. Roles decide which relations the entity can join. This is unlike traditional entity-relationship modeling where entities are modelled as relations. In the relational model of Pii, an entity is just a UUID taking on different roles to participate in relations. Let's call it an RR-Model.

Tracking Changes

Tracking changes in the contents of a file is a basic capability of Pii. A single line in tracker.py adds a file to Pii.

a = trackFile("path/to/filename", "Content-Type")

Initially, one MutableE entity representing the file and one ConstantE entity that represents the contents of the file are added to Pii. MutableE and ConstantE are roles. They are associated through the relation ContentEE. The MutableE entity is also assigned the role FileE with the additional relation PathES.

a = UUID()

a -- EntityE
a -- IdentityES -- "filename"

a -- MutableE

a -- FileE
a -- PathES -- "path/to/filename"

b = UUID()

b -- EntityE
b -- IdentityES -- "filename 2020-10-14T19:37:10.121"

b -- ConstantE
b -- ContentTypeES -- "Content-Type"
b -- ContentEB -- <11010...>
b -- ShaES -- "E0EE5BC391BB02D9891139EBBA3C674CFA1CA712"

a -- ContentEE -- b

When a change to the file content is detected, an additional ConstantE is created representing the new content.

c = UUID()

c -- EntityE
c -- IdentityES -- "filename 2020-10-14T19:42:13.443"

c -- ConstantE
c -- ContentTypeES -- "Content-Type"
c -- ContentEB -- <10110...>
c -- ShaES -- "69CE10436A9247A38A07B07BDC5A02B4CAAC3CF1"

a -- ContentEE -- c

Now we have two instances of the file stored in Pii, both related to the same file.
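The change detection described above can be sketched in a few lines of Python. This is only an illustration and not the code in tracker.py: the function names content_sha and track_content are made up, a plain list of (left, relation, right) tuples stands in for the database, and SHA-1 is assumed only because the ShaES values in the example are 40 hex characters long.

```python
import hashlib
import uuid
from datetime import datetime


def content_sha(data: bytes) -> str:
    """Upper-case hex digest; SHA-1 is an assumption based on the 40-character examples."""
    return hashlib.sha1(data).hexdigest().upper()


def track_content(rows, mutable_id, name, data, content_type):
    """Append a new ConstantE for `data` unless the latest content has the same hash.

    `rows` is a list of (left, relation, right) tuples standing in for the database.
    Returns the id of the ConstantE entity that represents `data`.
    """
    sha = content_sha(data)
    # Contents already associated with this MutableE, oldest first.
    contents = [r for (l, rel, r) in rows if l == mutable_id and rel == "ContentEE"]
    if contents and (contents[-1], "ShaES", sha) in rows:
        return contents[-1]  # unchanged since last time, nothing to append
    const_id = str(uuid.uuid4())
    stamp = datetime.now().isoformat(timespec="milliseconds")
    rows.extend([
        (const_id, "EntityE", None),
        (const_id, "IdentityES", f"{name} {stamp}"),
        (const_id, "ConstantE", None),
        (const_id, "ContentTypeES", content_type),
        (const_id, "ContentEB", data),
        (const_id, "ShaES", sha),
        (mutable_id, "ContentEE", const_id),
    ])
    return const_id
```

Calling track_content twice with the same bytes returns the same ConstantE id. Calling it with different bytes appends a second ConstantE and a second ContentEE row for the same MutableE.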

Schema

Column types in Pii are the usual suspects: String, Timestamp, Binary, Integer, Real and Entity.

String S is UTF-8 encoded text. (sqlite3 Text)
Timestamp T is ISO8601 date and time. (sqlite3 Text)
Binary B is a sequence of octets. (sqlite3 Blob)
Integer I is an integer number. (sqlite3 Integer)
Real R is a real number. (sqlite3 Real)
Entity E is a UUID. (sqlite3 Text)

All relations - both unary and binary - have the column left L. Binary relations also have the column right R. L and R contain entity UUIDs or values from the value types. The types of the columns are shown by letters in the name of the relation. Relations also have columns for creation time T and association A.

create table ConstantE (l text, t text, a text);
create table ContentEB (l text, r blob, t text, a text);

Rows in relations are never removed or changed, only appended. The A column indicates if the relation should be realized (True) or if it should be broken up (False). This enables us to track the state of the relation over time, undo changes and construct views with cardinality n:n, 1:n, n:1 and 1:1 - all present at the same time for application queries. The schema can be seen as an extension of the 7th normal form.
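Here is a small sketch of what append-only rows make possible, using Python's sqlite3 module. The table layout follows the create table statements above. Storing the A column as the text 'True' or 'False', the function name realized and the sample rows are assumptions made for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("create table ContentEE (l text, r text, t text, a text)")

# Rows are only ever appended. The third row breaks up a -- b without deleting anything.
db.executemany(
    "insert into ContentEE values (?, ?, ?, ?)",
    [
        ("a", "b", "2020-10-14T19:37:10.121", "True"),
        ("a", "c", "2020-10-14T19:42:13.443", "True"),
        ("a", "b", "2020-10-14T19:50:00.000", "False"),
    ],
)


def realized(db, as_of="9999"):
    """Pairs whose most recent row at or before `as_of` is a realization.

    ISO 8601 timestamps sort correctly as text, so max(t) picks the latest row.
    """
    return db.execute(
        """
        select x.l, x.r from ContentEE as x
        where x.a = 'True'
          and x.t = (select max(y.t) from ContentEE as y
                     where y.l = x.l and y.r = x.r and y.t <= ?)
        order by x.r
        """,
        (as_of,),
    ).fetchall()


print(realized(db))                             # [('a', 'c')]
print(realized(db, "2020-10-14T19:42:13.443"))  # [('a', 'b'), ('a', 'c')]
```

The second call shows the state as it was before a -- b was broken up, which is the kind of undo and history query the append-only design is meant to allow.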

For every relation, four views will be created that represent the cardinalities. For the relation ContentEE the views ContentEEcnn, ContentEEcn1, ContentEEc1n and ContentEEc11 will be available. These views only contain currently realized associations and therefore the column A is not needed.

create view MutableEcn as select l, t ...
create view ContentEEcnn as select l, r, t ...
create view ContentEEcn1 as select l, r, t ...
create view ContentEEc1n as select l, r, t ...
create view ContentEEc11 as select l, r, t ...
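The view definitions above are elided, so the following is only a guess at how two of them could be written, not the actual Pii definitions. It assumes that the cnn view keeps every pair whose latest row is a realization, that the cn1 view keeps only the most recently realized right side for each left side, and that A is stored as the text 'True' or 'False'. The file names and paths are made up.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
create table PathES (l text, r text, t text, a text);

-- n:n view: every (l, r) pair whose most recent row is a realization.
create view PathEScnn as
  select x.l, x.r, x.t from PathES as x
  where x.a = 'True'
    and x.t = (select max(y.t) from PathES as y
               where y.l = x.l and y.r = x.r);

-- n:1 view: for each l, only the most recently realized r survives.
create view PathEScn1 as
  select x.l, x.r, x.t from PathEScnn as x
  where x.t = (select max(y.t) from PathEScnn as y where y.l = x.l);
""")

# f1 gets a second path later on; the old row is kept, not overwritten.
db.executemany(
    "insert into PathES values (?, ?, ?, ?)",
    [
        ("f1", "old/path", "2020-10-14T19:37:10.121", "True"),
        ("f2", "other/path", "2020-10-14T19:37:10.121", "True"),
        ("f1", "new/path", "2020-10-14T19:42:13.443", "True"),
    ],
)

cnn = db.execute("select l, r from PathEScnn order by l, r").fetchall()
cn1 = db.execute("select l, r from PathEScn1 order by l").fetchall()
print(cnn)  # [('f1', 'new/path'), ('f1', 'old/path'), ('f2', 'other/path')]
print(cn1)  # [('f1', 'new/path'), ('f2', 'other/path')]
```

With views like these, an application that expects one path per file reads PathEScn1, while a history browser can read PathEScnn or the underlying table.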

Information Model

The growth of the schema is open ended. An entity starts out as being only a UUID which is not even stored in the database. Roles are then added to the entity. An entity can take on just one or all roles in the system at once. New roles can be added at any time. It would be tedious if the presentation layer had to search through all relations to find a particular entity. We need to add information that the presentation layer can use when it wants to display an entity and its relations. We want to specify if an entity role (unary relation) participates on the left or the right side of a binary relation. We also want to know the roles a specific entity has. The relations LeftSS, RightSS and RoleES do this for us.

a -- RoleES -- "EntityE"
a -- RoleES -- "MutableE"
a -- RoleES -- "FileE"
b -- RoleES -- "EntityE"
b -- RoleES -- "ConstantE"
c -- RoleES -- "EntityE"
c -- RoleES -- "ConstantE"

"FileE" -- LeftSS -- "PathEScn1"

"ConstantE" -- LeftSS -- "ContentTypeEScn1"
"ConstantE" -- LeftSS -- "ContentEBcn1"
"ConstantE" -- LeftSS -- "ShaEScn1"

"MutableE" -- LeftSS -- "ContentEEc1n"
"ContentEEc1n" -- RightSS -- "ConstantE"

Note that the value types S, B, T, I, R are not added to RightSS. They are derived from the relation name when needed. It is only the entity roles that we need to model in this way.
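To illustrate how a presentation layer might use RoleES, LeftSS and RightSS, here is a minimal sketch with the three relations held in Python dictionaries. The function name relations_for and the dictionary layout are assumptions. The data repeats a subset of the rows listed above, with only one of the ConstantE relations included to keep it short.

```python
# RoleES: entity -> roles, in the order they were assigned.
role_es = {
    "a": ["EntityE", "MutableE", "FileE"],
    "b": ["EntityE", "ConstantE"],
}

# LeftSS: role -> relations (views) where the role sits on the left side.
left_ss = {
    "FileE": ["PathEScn1"],
    "ConstantE": ["ShaEScn1"],  # other ConstantE relations omitted
    "MutableE": ["ContentEEc1n"],
}

# RightSS: relation (view) -> roles that sit on its right side.
right_ss = {
    "ContentEEc1n": ["ConstantE"],
}


def relations_for(entity):
    """Return (outgoing, incoming) relation names worth querying for `entity`."""
    roles = role_es.get(entity, [])
    outgoing = [rel for role in roles for rel in left_ss.get(role, [])]
    incoming = [rel for rel, rights in right_ss.items()
                if any(role in rights for role in roles)]
    return outgoing, incoming


print(relations_for("a"))  # (['ContentEEc1n', 'PathEScn1'], [])
print(relations_for("b"))  # (['ShaEScn1'], ['ContentEEc1n'])
```

The point is that the presentation layer only has to query the handful of relations named here instead of scanning every table for the entity's UUID.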

From now on, whenever I assign a role to an entity, e.g. x -- FileE, it also means that an x -- RoleES -- "FileE" row is added to RoleES in addition to the row in relation FileE. Instead of specifying the LeftSS and RightSS relations, I will use the following shorthand to define the information model.

FileE -- PathEScn1

ConstantE -- ContentTypeEScn1
ConstantE -- ContentEBcn1
ConstantE -- ShaEScn1

MutableE -- ContentEEc1n -- ConstantE

All entities should have the role EntityE with the relation IdentityES. This is the human readable name of the entity.

EntityE -- IdentityEScn1

All relations are always optional for an entity to participate in. Any entity that hasn't specified an identity will just not have a value for that relation.

The shape and display color of an entity can be changed by its roles. The last role assigned to the entity has precedence.

"MutableE" -- ShapeSS -- "box"
"ConstantE" -- ColorSS -- "white"

Embedded Records

Now let's assume the file is a requirements specification that contains requirements. We use a function that can parse the document and extract the requirements. Add a second line to tracker.py.

rs = trackRequirements(a)

The requirements that are found will be added to Pii.

d = UUID()

d -- EntityE
d -- IdentityES -- "Requirement 17.3 Make it fast! v6.1"
d -- MutableE
d -- EmbeddedE

a -- ContainerE
a -- MemberEE -- d

e = UUID()

e -- ConstantE
e -- ContentTypeES -- "text/plain"
e -- ContentEB -- <11010...>
e -- ShaES -- "2CAF6FBAE0B00796E2B59656660941BC331FDEED"
e -- EmbeddedE

c -- ContainerE
c -- MemberEE -- e

d -- ContentEE -- e

# Information Model
ContainerE -- MemberEEc1n -- EmbeddedE

Both the mutable and the constant requirement entities are assigned the role EmbeddedE. This is not strictly necessary and could be seen as duplication of information. This is not a problem because Pii is not the original source of the information. It just models information already available elsewhere and it will never be updated manually. Therefore it is harmless to add redundant relations that will simplify navigation for the user.

We could also use EmbeddedE for FileEs that are stored in ContainerE zip FileEs.

Note that we have not assigned the role RequirementE to the MutableE. This is a role we want to save for a higher level entity.

Versions

Different versions of an artifact form a collection that we want to keep together. The low level change tracking that we get with ConstantE and MutableE is not suitable for this task. We may for example want to have several versions available in the file system at the same time and we may want to maintain parallel branches. We will introduce two entity roles called VersionE and ArtifactE and extend the MutableE d from above representing the requirement with VersionE. The artifact will also have the role RequirementE.

d -- VersionE

f = UUID()

f -- EntityE
f -- IdentityES -- "Requirement 17.3 Make it fast!"
f -- ArtifactE

f -- RequirementE

f -- VersionEE -- d

# Information Model
ArtifactE -- VersionEEc1n -- VersionE

Branches

A branch is an entity that is both an ArtifactE and a VersionE and it is used to group versions together for easier navigation. A dot in a version number indicates that we have a branch. We will rebuild the structure above by inserting a branch between the ArtifactE f and the VersionE d.

g = UUID()

g -- EntityE
g -- IdentityES -- "Requirement 17.3 Make it fast! v6.x"
g -- ArtifactE
g -- VersionE

f -- VersionEE -- g
g -- VersionEE -- d

The VersionE can be associated with both the top ArtifactE and the branch ArtifactE. This is probably a correct representation of how the model has evolved over time. First we have the branchless versions 1, 2, 3, 4, 5, 6, 7 directly associated with the top ArtifactE f. Then the need to branch version 6 arises and we create the branch 6.x where we put both version 6 (aka 6.0) and 6.1. Nothing prevents us from also associating ArtifactE f with 6.1 directly if we want to.
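A sketch of how the artifact, branch and version structure could be traversed. The entities f, g and d and their roles are taken from the listings above. The function name leaf_versions and the in-memory representation are assumptions for the example.

```python
# VersionEE pairs (left ArtifactE, right VersionE) from the example above.
version_ee = [("f", "g"), ("g", "d")]

# Roles per entity; g is a branch because it is both ArtifactE and VersionE.
roles = {
    "f": {"EntityE", "ArtifactE", "RequirementE"},
    "g": {"EntityE", "ArtifactE", "VersionE"},
    "d": {"EntityE", "MutableE", "EmbeddedE", "VersionE"},
}


def leaf_versions(artifact):
    """Collect the versions below `artifact`, looking through any branches."""
    found, seen, stack = [], set(), [artifact]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        for left, right in version_ee:
            if left != node:
                continue
            if "ArtifactE" in roles[right]:
                stack.append(right)  # a branch: descend into it
            elif right not in found:
                found.append(right)  # a plain version
    return found


print(leaf_versions("f"))  # ['d']
```

If f is also associated directly with d, as the text allows, the result is still ['d'] because each version is only collected once.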

Designing and Building Things using Parts, Materials, Tools and Instructions

A product design is modelled as an ArtifactE. To be able to build items of this design we need to know which parts and materials to use, what tools we need and what instructions to follow. These four categories are in turn modelled as ArtifactEs and can be broken down in the same way.

ComposedE -- BoMPartEEc1n -- BoMPartE

BoMPartE -- CountEIcn1
BoMPartE -- PartEEcn1 -- PartE

ComposedE -- BoMMaterialEEc1n -- BoMMaterialE

BoMMaterialE -- AmountERcn1
BoMMaterialE -- UnitEScn1
BoMMaterialE -- MaterialEEcn1 -- MaterialE

BoM stands for Bill of Materials, which is a list of the things we need to build something and how much of each. The ArtifactE we are designing will come in different versions over time. Each VersionE entity of the ArtifactE will also have the relations above, but they will associate with other VersionEs of the ArtifactEs that are used as parts in each particular case.

Software and other information ArtifactEs seldom use more than one of each of the components that they are built from (licensing conditions could create exceptions). This motivates a simpler kind of relation in these cases.

AggregateE -- ComponentEEcnn -- ComponentE

We may want to use this relation for the ArtifactEs of physical products as well and only specify the BoM for the VersionEs. The reason is that the amount and exactly what is needed to build something can vary over time, but the ArtifactE represents all VersionEs of itself and should therefore not be too specific.

Finally, here are the relations for tools and instructions.

ManufactureE -- ToolEEcnn -- ToolE

WorkE -- InstructionEEcnn -- InstructionE

ElectricalE -- SchemaEEcnn -- SchemaE

MechanicalE -- DrawingEEcnn -- DrawingE

CompiledE -- SourceCodeEEcnn -- SourceCodeE

Manufacture and work should be read as nouns, e.g. a work of art, the manufacture was made of wood.

As we have seen above, digital records that are the result of the build process can be stored directly in the VersionE entity when it is assigned the role MutableE. Physical products are associated with the VersionE through the ItemEE or ProduceEE relations. The VersionE is assigned the BlueprintE role or the RecipeE role depending on what kind of product is produced.

BlueprintE -- ItemEEc1n -- ItemE

RecipeE -- ProduceEEc1n -- ProduceE

ProduceE -- UnitEScn1
ProduceE -- AmountERcn1

If you want to organise your output in batches then let the batch entity have both the BlueprintE+ItemE roles or the RecipeE+ProduceE roles just like we did with branches.

Batteries not Included

Products that don't come with all the parts necessary to function as intended are called integrated. This is typical for software that dynamically loads modules at runtime.

IntegratedE -- ModuleEEcnn -- ModuleE

Specifications

Specifications, like the requirement we modelled earlier, are something that dictates what other artifacts must be or behave like.

SpecificationE -- ImplementationEEcnn -- ImplementationE

Correctly tracked, this relation can be used to find out which changes made to a specification still remain to be implemented. This is a key feature for full traceability. If an ArtifactE is an ImplementationE of a SpecificationE, then all VersionEs of the SpecificationE should have at least one ImplementationE among the VersionEs of the ArtifactE. If some are missing then we know we have things left to implement.
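The rule above can be expressed as a small set computation. This is a sketch under simplifying assumptions: versions are attached directly to their artifacts (no branches), the relations are plain lists of pairs, and the entity names spec, prod, spec_v1 and so on are invented for the example.

```python
def unimplemented(spec, impl, version_ee, implementation_ee):
    """Versions of `spec` that no version of `impl` implements.

    version_ee:        (ArtifactE, VersionE) pairs
    implementation_ee: (SpecificationE version, ImplementationE version) pairs
    """
    spec_versions = [v for a, v in version_ee if a == spec]
    impl_versions = {v for a, v in version_ee if a == impl}
    covered = {s for s, i in implementation_ee if i in impl_versions}
    return [v for v in spec_versions if v not in covered]


version_ee = [
    ("spec", "spec_v1"),
    ("spec", "spec_v2"),
    ("prod", "prod_v1"),
]
implementation_ee = [("spec_v1", "prod_v1")]

print(unimplemented("spec", "prod", version_ee, implementation_ee))  # ['spec_v2']
```

Here spec_v2 has no implementing version of prod yet, so it shows up as work left to do.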

It is also valuable to track which TestEs aim to refute the ImplementationE with input from the SpecificationE.

SpecificationE -- TestEEcnn -- TestE

Note that a test can be an implementation of a test specification at the same time as it is a refutation test for a product specification.

Tests

Tests can be organized in a hierarchical structure using the aggregate/component roles and relations. The test result associates a version of the test target with a specific version of the test that was performed.

TestResultE -- TestTargetEEcn1 -- TestTargetE
TestResultE -- TestEEcn1 -- TestE

Guides

Guides tell you how to use things. They can be user guides, service guides, and so on.

ApplianceE -- GuideEEcnn -- GuideE

Retired Entities

When an entity is no longer actively participating in the product structure it will take on the role RetiredE.

x -- RetiredE

I Want Pii

Download the code from https://github.com/TheOtherMarcus/Pii.

2020-10-25

The Process that Makes Product Development Possible

What is going on in our mind when we design, review, test? Could it be like evolution - the process that has designed all living things? A property of evolution that is hard to grasp is that it has no goal, not even survival. And still, knowledge is accumulated over time in the species that do survive. New ideas on how to be and act are created by random mutations, which are criticised by reality - including other species. But this doesn't seem to be the way we design things. We don't build random things and keep the ones that are useful, do we? That would take a lot of time. We design and build products much faster than that. And we do it on purpose to reach a goal, to satisfy a need. Or could it be that we are mistaken about what is really going on and that design is exactly like evolution? That the things we have are just random creations that could have been completely different. That the things that are useful to us are preserved until they no longer are and then forgotten. That designs mutate and are then outcompeted by improved versions of themselves. Take a look at testing for example. There is something about our need to test our creations that resembles evolution. We are never 100% sure that the things we design will work in reality. It is true that we don't go ahead and build any random thing, but still.

I have noticed that to be able to program a computer at all, you need to have a special ability. You need to be able to simulate in your head the effect of a code statement running on the computer. To do this you first need to be able to load the rules of interpretation the computer follows into your head. That we have this ability implies that we are universal simulators. And this, I think, is also the key to understanding how we design things in general. It means that we don't have to build everything that we come to think of to see if it will work. We can simulate things in our head instead and quickly reject those that don't work. The same principle applies to any action we could choose to take. The effects of the action can be simulated ahead of time. This means creativity can be purely random at its core just like mutations are. We quickly simulate the ideas that flow from this process and reject the ones that fail. Some ideas we act on and a few end up as products.

But still, we are pursuing goals that can be quite far away from where we start. We know ahead of time where we want to end up. This doesn't seem very random, and still it probably is. Where did the goal come from in the first place? Random creativity. And we did simulate it and we have seen that the goal is worth pursuing. The gap from here to a goal far away has to be filled with intermediate goals, random as well, and the simulation tells us which are steps in the right direction. This process of breakdown continues until the goals are trivial for us to achieve.

The simulation is not perfect and we always risk going down paths that don't lead where we want when we try our ideas in reality, or when later stages of the simulation fail to connect an intermediate goal to either the beginning or the end. At that moment we backtrack and wait for new random ideas to try out. It is good to not put unnecessary constraints on the end goal because you don't know which turns the random road to the goal will take. Be open minded and try to use whatever your creativity comes up with, maybe even adjust the goal itself to better fit the ideas that become available. If all you have is a hammer...

If we take the simulation idea further it means our minds are collectively running a set of virtual realities - simulations of physical reality. An environment where memes (in the original, broader sense) can replicate, evolve and influence our thinking. Despite being abstract, memes can affect the real world through us humans because how we act in the real world is influenced by abstractions in the virtual world. Memes are ideas that spread through various forms of communication between people. Each individual uses creativity and simulation to turn the incoming message into a personal idea that in turn has the power to change the simulation. Not all ideas are memes, only the ones that can spread between minds are.

If creativity is an unconscious process that we are born with, i.e. genetic knowledge, is it possible for us to improve it or are we stuck with what we got? If we can figure out which methods are used by creativity, e.g. recombination of already existing ideas, we could simulate these methods and thereby control our own creative process, even when the defective genes of an individual lack knowledge of that particular method. We are, after all, universal simulators.

To wrap things up. We have found design to be simulated tests of random ideas produced by our creativity. Review is just another person using the same mechanisms, and the transfer of ideas between humans also requires creativity and simulation at the receiving end. The purpose of real tests is to verify that we didn't make mistakes in the simulation and make sure the thing we have built is fit for its purpose. We will never be completely sure though, because the tests are also designed with the same process and can themselves have errors. Still, the rational thing to do is to use a product that has passed all tests, simulated or real.

If you have gotten this far I should tell you that these are not my own ideas. If you want to learn more make sure to check out David Deutsch, Donald Campbell, Karl Popper and the blogs and podcasts from Bruce Nielson (The Four Strands, The Theory of Anything), Brett Hall (ToKCast), Christofer Lövgren (Do Explain).

The Temporal and Spatial Structures of Product Development

We use time to construct products. Over time we learn things, invent solutions, encounter problems, make decisions. The information we collect is accumulated in the spatial structure of the product and the artifacts that surround it. New information changes the shape of the product and lifts it to a new configuration, slightly more refined and fit for its purpose. The project itself is a temporal structure that organizes the flow of information into and out of the development process.

Should we do the specifications first, the code first, the tests first? We should do a little of everything in parallel using the information we have acquired so far. If we don't know from the start what the product should do exactly, then we should focus on code and assembly before specification. The things we learn along the way will go into the specification. If the goal is clear from the start then the specifications should be filled with more content before time is spent on constructing the product. When we are done the final configuration should look approximately the same regardless of how we got there. We should have specifications, code, drawings, assemblies, tests and guides.

It is not only the functional aspects of the product that shape it. Of equal importance are constraints imposed by efficient industrial production and how to perform compliance testing. From the start of the development project, strive to maintain an integrated product from end to end, extending into production and verification. It is not only the product itself that needs to be constructed. A variety of specialized tools for production and test also need to be developed.

2020-09-09

Extensible C APIs

The problem we want to solve is how to add functionality to an existing C API in an elegant way. You may have seen the practice of adding one extra parameter at the end of a function call, reserved for future use. It must always be 0 in the first version of the API. The implementation should enforce that it is 0 to avoid unpleasant surprises in the future.

uint32_t do_work(int32_t a, uint8_t b, void* rfu);

Later, additional requirements make it necessary to vary how "work is done" and the variation needs one additional parameter, uint16_t c. It is added in the second version of the API.

struct do_work_extensionA {
    uint16_t c;
    void* rfu;
};
uint32_t do_work(int32_t a, uint8_t b, struct do_work_extensionA* extA);

When extA is 0, do_work() should behave exactly like the first version to be backwards compatible. Existing users can keep their code as it is until they need the new functionality.

You may have spotted the pattern, and to be sure we will add a third version of the API with yet more parameters.

struct do_work_extensionB {
    uint8_t* d;
    uint32_t e;
    void* rfu;
};
struct do_work_extensionA {
    uint16_t c;
    struct do_work_extensionB* extB;
};
uint32_t do_work(int32_t a, uint8_t b, struct do_work_extensionA* extA);

An alternative to the extension scheme described above is to use multiple entry points.

uint32_t do_work(int32_t a, uint8_t b);
uint32_t do_work_extA(int32_t a, uint8_t b, uint16_t c);
uint32_t do_work_extB(int32_t a, uint8_t b, uint16_t c, uint8_t* d, uint32_t e);

You decide which method suits you best.

2020-09-05

PCB Version and Assembly Discrimination

You should give embedded software a way to figure out its place in the world. "I just woke up. Where am I?" This can be done by giving the PCB a version number. Allocate a few GPIOs, say three, and you can have the same binary run on 8 different PCBs. The software carries a corresponding 8-bit vector that tells the software itself, or a bootloader, which PCB versions it is compatible with. 0b00000101 means the software is compatible with PCB versions 0 and 2. This prevents old, incompatible software from running on new PCBs that it has no routines to handle. It is common to forget this check until there is more than one PCB version, but it is needed from the start to prevent old versions of the software from being installed and executed on new PCB versions they don't support.
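The compatibility check itself is a single bit test. The sketch below is in Python for readability. On a real target this would be a few lines of C in the bootloader or startup code. The function names and the bit order of the GPIOs (least significant bit first) are assumptions.

```python
def read_pcb_version(gpio_levels):
    """Combine the version GPIO levels (least significant bit first) into a number."""
    version = 0
    for bit, level in enumerate(gpio_levels):
        if level:
            version |= 1 << bit
    return version


def is_compatible(compat_mask, pcb_version):
    """True if bit number `pcb_version` is set in the software's compatibility vector."""
    return bool((compat_mask >> pcb_version) & 1)


COMPAT_MASK = 0b00000101  # compatible with PCB versions 0 and 2

pcb = read_pcb_version([0, 1, 0])       # GPIO pattern for PCB version 2
print(pcb)                              # 2
print(is_compatible(COMPAT_MASK, pcb))  # True
print(is_compatible(COMPAT_MASK, 1))    # False
```

Three GPIOs give versions 0 to 7, which is why an 8-bit vector is enough to cover every PCB the pins can describe.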

The next step is to allocate a few GPIOs for variations in the assembly. These are by default high by internal pullups. Zero Ohm resistors can be mounted to sink them to ground. You now have the possibility to vary the assembled components on the board and all variants can be handled by the same binary. Exactly what it means that any one GPIO is low can be decided at the moment when the need for an alternate assembly arises.

If you have an analog pin to spare you can encode the assembly version with a voltage divider instead. The analog range is divided up into discrete intervals, each representing an assembly version. With 1% resistors it is reasonable to divide the full range (5V) into 25 separate ranges, i.e. 0.2V per range. One analog input can decode 25 different assembly versions. If you don't want the divider to always consume current you can connect the low side to a GPIO and turn off the divider by setting it high.
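Decoding the voltage into an assembly version is simple arithmetic. The sketch below uses the 5V range and 25 bands from the text. The function names, the choice to aim the divider at the middle of each band, and the clamping at the ends of the range are assumptions.

```python
V_REF = 5.0  # full analog range in volts
BANDS = 25   # number of assembly versions to distinguish
WIDTH = V_REF / BANDS  # 0.2 V per band


def nominal_voltage(version):
    """Voltage the divider should produce: the middle of the version's band."""
    return (version + 0.5) * WIDTH


def assembly_version(voltage):
    """Map a measured voltage to a band index, clamped to 0..BANDS-1."""
    index = int(voltage / WIDTH)
    return min(max(index, 0), BANDS - 1)


print(assembly_version(nominal_voltage(7)))  # 7
print(assembly_version(0.05))                # 0
print(assembly_version(5.0))                 # 24
```

Aiming for the middle of each band leaves about 0.1V of margin on either side for resistor tolerance and measurement noise.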

What if you only have one GPIO pin and want to encode more than two versions? Set the GPIO as output and charge a capacitor. Switch over to input and measure the discharge time. The more versions you need, the longer it will take to find out which is present.

It is possible to solve the problem with evolving hardware by building separate binaries for each configuration. The drawback is that you need to be careful with which binary you install on which hardware, because mistakes are not automatically detected. This is a complication I prefer to avoid.
