# The Embedding

This week I have started some actual code of the derivation of commutativity of diagrams and implications. The first half of the week has gone to splitting Diagram into Diagram and Implication, as outlined in the previous post. Nothing really unexpected happened during that part, so there isn’t much to say about it, save for the thing that the code has become clearer and better organised. Furthermore, I have gained a better understanding of some corner cases, as well as implemented more robust handling for those corner cases.

The second half of the week was considerably more exciting and thought intensive: it was related to finding diagram embeddings. As it should be clear from the last post, this functionality lies at the foundation of deciding the commutativity of diagrams and implications. In what follows, I will refer to the diagram which we need to embed as to the pattern, and to the diagram into which we need to embed as to the model. This seems to be an almost universally accepted terminology and comes from the fact that finding subgraph isomorphisms often lies at the base of various pattern matching implementations.

I have started by selecting and analysing the excellent paper by J. R. Ullman, [Ullm1976], which describes a very clear way of enumerating all possible graph embeddings. This solution, however, was not exactly what I needed. First of all, the algorithm described in details in [Ullm1976] is actually meant for undirected graphs, whereas one can clearly see arrows in diagrams. Furthermore (a thought that has occurred to me quite late), diagrams, are actually multigraphs, in the sense that there can be more than one morphism between two objects. Yet further, a diagram embedding must preserve morphism properties, in the sense that the embedding must map a morphism in the pattern to a morphism in the model, which has exactly the same properties as the morphism in the pattern.

I attempted to find whether someone has addressed the directed multigraph embedding problem before; however, I haven’t managed to find any references on the Internet, so I started thinking on adapting Ullman’s solution to my case. The first thing I figured out was that I could reduce the directed multigraph embedding problem to a directed graph embedding problem. Indeed, take a diagram and flatten down all multiple morphisms going in the same direction between the same to objects to one directed edge between these two objects. Then construct directed graph embeddings and, for each such embeddings, for each directed edge $e$ of the flattened pattern, construct injective, property-preserving, mappings from the set of morphisms of the pattern, which were flattened to $e$, into the set of morphisms associated with the edge in the flattened model, to which $e$ is mapped by the subgraph isomorphism. (These mappings are actually property-preserving embeddings in their own right, but I won’t call them so, since I’m good and I understand that the blog post has just become a bit unclear, so to say :-D )

Let’s see an example. Consider the diagram comprising two different morphisms: $f:A\rightarrow B$ and $g:A\rightarrow B$, where $f$ has the property golden; this diagram is going to be out pattern. Now, consider the model: a diagram comprising three morphisms $\alpha:C\rightarrow D$, $\beta:C\rightarrow D$, and $\gamma:C\rightarrow D$, in which $\beta$ has the property golden. Quite obviously, all of our property-preserving embeddings should map $f$ to $\beta$, while $\latex g$ can be mapped to either $\alpha$ or $\beta$. Also note that the flattened pattern in this case is the graph consisting of a single edge $(A,B)$, while the flattened model is another one-edge graph, $(C,D)$. More complex diagrams are treated in a similar fashion: flatten the pattern and the model to directed graphs, find directed graph embeddings, and then find the property-preserving morphism mappings.

There was another slight surprise underway, however. Ullman does describe some of the modifications which will make the original algorithm capable of constructing directed graph embeddings, however, he has apparently forgot to describe one of them. I will give some definitions before going into more detail. Ullman uses $A$ to refer to the adjacency matrix of the pattern, $B$ to refer to the adjacency matrix of the model, and $M$ to refer to the matrix representing a mapping of the vertices of the pattern into the vertices of the model; $M_{ij} = 1$ means that vertex $i$ in the pattern is mapped to vertex $j$ in the model.

Now, for given $A$, $B$, and $M$, compute $C = M (M B)^T$. Condition (1) in [Ullm1976] states that, if $a_{ij} = 1\Rightarrow c_{ij} = 1$, for any vertices $i$ and $j$ of the patern, then $M$ represents an embedding. (As usual, $a_{ij}$ are elements of $A$ and $c_{ij}$ are elements of $C$). When I tried to actually use this criterion for a directed graph, I found that, apparently, $C^T$ should be used, instead of $C$. The formal explanation follows. By abuse of terminology, I will use “pattern” and “model” to refer to the flattened pattern and flattened model as well.

Let $D = M B$. $d_{ij} = 1$ means that $\exists! k . (m_{ik} = 1 \mbox{ and } b_{kj} = 1)$, where $k$ is a vertex of the model. In other words, this means that the vertex $i$ of the pattern is mapped to a unique vertex $k$ of the model, such that in the model there exists the (directed) edge $(k, j)$. Obviously, if $d^T_{ij}$ is an element of $D^T$, the role of the indices is reversed, that is: the vertex $j$ of the pattern is mapped to a unique vertex $k$ of the model, such that in the model there exists the (directed) edge $(k, i)$.

Now, $c_{ij} = 1$ means that $\exists! t . (m_{it} = 1 \mbox{ and } d^T_{tj} = 1)$. Deciphering the meanings of the values of the elements of these matrices, this means that the vertex $i$ of the pattern is mapped to a vertex $t$ of the model, vertex $j$ of the pattern is mapped to a vertex $k$ of the model, such that in the model there is the edge $(k, t)$. Now, suppose there is an edge $(i, j)$ in the pattern. $c_{ij} = 1$ means $M$ maps $i$ to $t$ and $j$ to $k$, such that the model contains the edge $(k, t)$. That is, the condition (1) as stated in [Ullm1976] and applied to directed graphs checks that $M$ actually reverses the direction of edges! Therefore, one must actually use $C^T$ to check for embeddings.

Now, since the original algorithm in [Ullm1976] was designed for undirected graphs, this extra transposition did not matter, and I think this is the reason why Ullman does not mention it.

I have implemented all the things I have described so far, so diagram embeddings kinda work :-) I have played with python generators a bit, so the code only produces embeddings on the as-needed basis. I did that because I thought of the situation when any diagram embedding will suffice, but also because using generators has resulted in what I believe to be more elegant code. The code abounds in comments, so I think it shouldn’t be a problem to comprehend for someone different from myself. I don’t have a formal proof for this statement, however, so, I guess, Tom is going to be the test subject for this supposition ^_^

There are still a couple things to do, though. First of all Ullman shows a nice optimisation to his algorithm; it looks pretty simple, so I’ll add it. I will then write a couple more tests, including some crash tests involving complete graphs. I will also have to rename the function which does all this magic from subdiagram_embeddings to diagram_embeddings, for obvious (I hope :-) ) reasons.

### References

[Ullm1976] J. R. Ullman, An Algorithm for Subgraph Isomorphism, J. Association of Computing Machinery, March, 1976, 16, 31–42.