---
date: 2014-09-09
title: Patch Theory
tags: [math, code]
description: In this article I want to develop a sound mathematical theory of version control. I will propose definitions of states, changes, patches and merges.
---

## Introduction

Versioning systems have been important tools in software development
(and other areas) for a long time. In recent years, git has been the
most successful one with github as the *place to be* for open source
projects.

In the first part of this article I want to develop a sound mathematical
theory of version control. I will propose definitions of states,
changes, patches and merges.

While git was created out of the need for proper versioning within the
linux project, [darcs](http://darcs.net/) is a versioning system that is
based on science and a *theory of patches*. While my theory won't be
*based on* git, it will most likely be influenced by my daily work with
it. So in the second part I will try to reflect on that and compare my
theory to the one that darcs is based on.

My focus will of course be what mathematicians like best: Creating
theories out of thin air and proving the existence of solutions without
ever thinking about the possibility to get to this solution
algorithmically in a decent amount of time. So in the third part I will
spend a little time thinking about the possibility to implement any of
it.

I will not give an introduction to how version control works today.
There are several really good introductions on the net.

I am not sure what good this article will bring. At the very least, I
expect writing it to be fun. But I also hope that a deeper understanding
of version control will make it easier to work with it. In the very best
(and least likely) case, I will find something that truly improves
version control for everyone, like a way to eliminate the need for
manual merges altogether or making git understand that I simply changed
indentation.

## The Basics

Say we have a set `S` of *states*. We always use capital letters for
states. For two states `A, B ∈ S` we say `(A, B)` or shorter `AB` is the
*change* from state `A` to state `B` and the elements of `C := S²` are
called *changes*.

`S` is in this abstract definition just any set. But in practice it
would more likely contain each and every point in history of each and
every software project there is, has been, and ever will be.

Note that there is some kind of natural composition of changes: When I
first change from `A` to `B`, and then from `B` to `C`, I have changed
from `A` to `C` (in formula: `AB * BC = AC`).

## Patches

Now imagine you have a repo with some files, one of them called
"main.py". Then you delete this file. This, in the terms we already
defined, is simply a change from one state to another.

Now imagine you had not deleted that file, change something in a
different file and went on to delete "main.py" then. This is also a
change from on state to another, but they are different states than in
the previous example. So although we did the same thing (delete
"main.py") we have two completely different changes.

The mathematical tool for saying "these things are different, but in the
current context they are really just the same" are *equivalence
relations*. So what we need to do to get from changes to patches is to
define an equivalence relation on the set of changes.

Then again, I would much prefer an axiomatic approach over actually
defining that relation in every detail. Too strict definitions always
carry the danger of incompatibility with related theories.

### The Equivalence Relation

I propose the following axioms for the equivalence relation `~`:

1.  `AB ~ CD => BA ~ DC`

2.  `AB ~ AC => B = C`

3.  `AB ~ A'B' and BC ~ B'C' => AC ~ A'C'`

4.  `AB ~ CD => AC ~ BD`

The elements of `P := C/~` are called patches. We denote patches by
lowercase letters.

The first axiom simply says that when two changes are equivalent, their
inverts must be equivalent, too.

The second axiom states that, if a patch can be applied to a state, the
result is unambiguous.

The third axiom tells us that patches are compatible with composition.

The fourth axiom is an interesting one, mostly because it does not
convey its true meaning easily. Think of it this way: We start at state
`A` and go to state `D`, but we have two different paths. The one takes
a detour over `B` while the other goes over `C`.

Say that `AB` and `CD` are equivalent. If you think in terms of line
based patches, this is only possible if `AC` does not touch any of the
lines that `CD` changes. And because in the end, both paths arrive at
`D`, this must mean that `AC` and `CD` are equivalent.

### Corollaries

-   For 1, 2 and 4 there is actually equivalence

-   1 ensures that for every patch `a` there is an inverse patch `a⁻¹`

-   `AA ~ BC <=> B = C` follows from 2 and 4 and shows that there is
    an identity patch we call `id`.

### Applying a Patch

We can interpret a patch `a` as a function (which is compatible to the
set theoretic definition of a function) with the following domain and
codomain:

    dom(a) := {X | ∃ Y ∈ S: XY ∈ a}
    cod(a) := {Y | ∃ X ∈ S: XY ∈ a}

We say, a patch `a` is applicable to a state `A`, iff `A ∈ dom(A)`.

## Composing patches

Up until now we have a notion of what a single patch is. We know how to
apply it where it is applicable.

But a repo consists of dozens, sometime thousands of patches. So it is
important to think about how multiple patches can be composed.

As patches, as we have defined them, are in fact functions, the first
guess might be to use function composition: If `a, b ∈ P` and `cod(a)
⊆ dom(b)`, we can write `b * a`.

The problem with this is that we can not be sure that `b * a` is a
patch. Axiom 2 makes sure that all changes in that set are equivalent.
But we do not know whether elements of the equivalence class are
missing. Take for example `a⁻¹ * a`. This should equal `id`, but in fact
it is `id` restricted to `dom(a)`.

There is a second problem. What do we do if the condition `cod(a) ⊆
dom(b)` is not complied with? We could simply say that these patches
conflict, but it is an important goal to create as few conflicts as
possible.

So I prefer this definition: For two patches `a, b` and `X, Y, Z ∈ S`
such that `XY ∈ a` and `YZ ∈ b`, the *composition* of `a` and `b` is
defined as `ab := [XZ]`.

-   The result is always a patch

-   It works in many cases where `cod(a) ⊆ dom(b)` is not
    complied with

-   It only creates a conflict if `cod(a) ∩ dom(b)` is empty

### Independence

Axiom 4 can now be reformulated in terms of patches:

``ab = bc => a = c``

Patches `a, b ∈ P` for which `ab = ba` is true are called *independent*.
`id` is independent from all patches.

## Further Ideas

These are notes on ideas on some parts that may be interesting to work
on in the future.

### Simple Questions

There are some questions about patches that seem harmless but are not
all that easy to get an answer for. Examples:

-   Is there a patch `a` with `dom(a) = S` except for `id`?

-   Is there a patch `a` with `dom(a) ∩ cod(a) != 0` except
    for `id`?

### Patch sets

The composition I gave in the previous chapter is not associative.
Example: Say patches `a, b` conflict. Then `(ab)b⁻¹` is not defined,
while `a(bb⁻¹)` is.

There is an obvious solution to this problem: Patches should simply
always be applied in an order that makes most sense. That is, we do not
look at (ordered) sequences of patches, but at *patch sets*. A merge, in
this scenario is simply a union of two patch sets.

But there are obvious issues with this approach, too. It is not clear
what the "order that makes most sense" should be and whether it is
unambiguous. And while most things I said before are compatible with
common version control systems we use today, this is not.

### State as Patches

Now that we have patches we can fix one state we may call `0` and
identify every state `A ∈ S` with the patch `[0A]` where `0` is
identified with `[00] = id`.

Of course this is only a small subset of patches (more specifically, it
is `{a ∈ P | 0 ∈ dom(a)}`). But it may have some interesting structural
properties.

### Conflicts

#### Group structure

The set of patches `P` has a structure similar to a group: There is a
neutral element `id` and for every element `a`, there is an inverse
element `a⁻¹` with `aa⁻¹ = id = a⁻¹a`.

But unfortunately the composition is only defined for patches `a, b ∈ P`
that comply with the condition `cod(a) ∩ dom(b) != 0`. The theory would
get a whole lot nicer if we could manage to extend `P` to contain one or
many elements that represent conflict. I did not come up with a sensible
way to force `P` into a group, but I guess it would be worth it.

#### Manual conflict resolution

An issue that I imagine is very important in developing actual version
control systems is aiding humans in resolving conflicts. I did only
scratch that topic when I said that I wanted to provoke as few conflicts
as possible.

But sometimes there are real conflicts and in those cases it is no good
trying to be smart. So breaking a patch into multiple parts, applying
those that can be applied automatically and presenting the other to a
human in a humane way is something that needs to be done.

I did not yet think about it at all.

## Conclusion

I believe I have come up with a stable foundation for patch theory. The
idea to define patches as equivalence classes over the square of states
proved to be flexible and elegant. I have analyzed patches to have a
group-like structure, but did not yet manage to force them into actually
being a group.

I guess my thoughts are still a bit messy. So sorry for that.