codist.distance

Functions to compute the edit distance between trees with given cost functions.

Module Attributes

Change

A change operation of the form (T | Lambda -> T | Lambda, ctx) Where ctx is either an index or Lambda providing some context for the change operation.

Functions

tree_dist(tree1, tree2[, cost])

Tree edit cost using the given cost function.

tree_edit(tree1, tree2[, cost])

Tree edit cost and edit path using the given cost function.

Classes

Cost(delete, float] = >, insert, float] = >, ...)

A set of tree edit cost functions for deleting, inserting and relabelling nodes.

Change: type[tuple[T | Λ, T | Λ, int | Lambda]] = Change

A change operation of the form (T | Lambda -> T | Lambda, ctx) Where ctx is either an index or Lambda providing some context for the change operation.

  • For insertions (Λ -> T, ctx), the ctx is the postorder index of the parent node in tree2 that T is being added under

  • For deletions (T -> Λ, ctx), the ctx is the postorder index of the node in tree1 that is deleted

  • For relabelings (T1 -> T2, ctx), the ctx is the postorder index of the node in tree1 that is relabeled

Note

The ctx variable provides some context for change operations, but does not provide, for example, the indices of the siblings that are inserted as children of T for an insertion operation.

class Cost(delete: ~collections.abc.Callable[[T], float] = <function Cost.<lambda>>, insert: ~collections.abc.Callable[[T], float] = <function Cost.<lambda>>, relabel: ~collections.abc.Callable[[T, T], float] = <function Cost.<lambda>>)

A set of tree edit cost functions for deleting, inserting and relabelling nodes.

By default, returns 1 except for the case γ(a -> a) which returns 0

Parameters:
  • delete – A cost function, (T) -> float for the change operation (T -> Λ). Default is (T) -> 1

  • insert – A cost function (T) -> float for the change operation -> T). Default is (T) -> 1

  • relabel – A cost function (T1, T2) -> float for the change operation (T1 -> T2). Default is (T1, T2) -> 0 if T1 == T2 else 1

Variables:
  • delete – A cost function, (T) -> float

  • insert – A cost function (T) -> float

  • relabel – A cost function (T1, T2) -> float

tree_dist(tree1: Tree, tree2: Tree, cost: ~codist.distance.Cost = <codist.distance.Cost object>) float

Tree edit cost using the given cost function.

Parameters:
  • tree1 – the initial tree

  • tree2 – the target tree

  • cost – a Cost object defining cost functions

Returns:

The edit distance between tree1 and tree2

tree_edit(tree1: Tree, tree2: Tree, cost: ~codist.distance.Cost = <codist.distance.Cost object>) tuple[float, tuple[Change, ...]]

Tree edit cost and edit path using the given cost function.

Parameters:
  • tree1 – the initial tree

  • tree2 – the target tree

  • cost – a Cost object defining cost functions

Returns:

A tuple containing the edit distance between tree1 and tree2 and a tuple of Change operations where each change operation is a 3-tuple of the form (T | Lambda -> T | Lambda, ctx) where Lambda is a singleton string: "Λ"