finger-tree

Kevin Wortman

Sat Feb 6 01:01:32 2016

Introduction

This library implements an abstract 2-3 finger tree, as described in Finger Trees: A Simple General-purpose Data Structure by Ralf Hinze and Ross Paterson.

A finger tree is a pure data structure (a.k.a. immutable or persistent), meaning that operations which take a tree as input leave that tree unchanged and still valid. So a mutation operation, such as adding an element, works by creating and returning a new tree object which shares almost all of its structure with the input tree.

A finger tree stores a collection of elements in left-to-right order. As discussed below, the left-to-right ordering may be oblivious to element contents (i.e. unsorted), in which case the tree is analogous to a deque where left and right correspond to front and back, respectively. Or, the left-to-right ordering may correspond to increasing sorted order according to some kind of key, in which case the tree is analogous to a binary search tree.

Finger trees are remarkably versatile. Finding, adding, or removing the left or right element takes constant amortized time. Finding, adding, removing, or splitting at an arbitrary ordered element takes logarithmic amortized time. Therefore a finger tree can implement a deque competitive with doubly linked list or a self-resizing vector; a set, bag, or map competitive with a self-balancing binary search tree (e.g. AVL or red-black); or a priority queue competitive with a binary heap.

Finger trees are inherently lazy, meaning that many parts of a tree are promises which are only forced when absolutely necessary. Consequently, the worst-case time efficiency of individual procedure calls is difficult to characterize precisely. We say that a procedure takes amortized O(X) time when it contributes at most O(X) time to any sequence of finger tree operations. In particular, some of that O(X) cost may be incurred immediately, while some of the cost may be deferred to some future procedure call that forces a promise. We say that an operation takes strictly O(Y) time when it takes at most O(Y) time immediately, with no time cost deferred to future operations.

Measurements and monotone predicates

Finger tree operations expect that tree elements are measurable. Procedures that depend upon measurements take arguments with the following names and required semantics:

This exceedingly abstract measurement scheme helps finger trees achieve their versatility.

Note that the Hinze and Paterson describe the space of measurements in terms of a monoid. We avoid that terminology since it is not universally understood in the Scheme community.

Conventions and preconditions

The following naming conventions and preconditions apply to all procedures in this library.

It is an error to mutate the internal structure of a finger tree, since finger trees are derived from other trees, so mutating a tree may have unanticipated "ripple effects" on derived trees. This library provides no way of mutating internal structure, so that could only be done with some kind of introspection library.

Mutating an object in a way that changes its measurement invalidates any finger tree containing that object.

Returned finger trees are not necessarily freshly allocated. Indeed, they may be an input finger tree, or part of an input tree.

add, measure, pred, and zero must represent measurement operations, as described above, which are mutually compatible and also compatible with the finger tree in question.

element is an object which is, or will be, an element of a tree.

gen must be a SRFI 121 generator.

k must be a non-negative integer.

lst must be a list of elements.

obj may be any Scheme object.

tree must be a finger tree.

Whole-tree operations

(make-finger-tree)

Create and return an empty finger tree. Strictly O(1) time.

(finger-tree add measure [element ...])

Create and return a finger tree containing element..., added in left-to-right (forward) order. Strictly O(n) time, where n is the number of elements.

(finger-tree? obj)

Returns #true if obj is a finger tree, or #false otherwise. Strictly O(1) time.

(finger-tree-empty? tree)

Returns #true if tree is empty, or #false if tree is non-empty. Strictly O(1) time.

(finger-tree-force tree)

Force every part of tree, then return tree. Ordinarily there is no need to use this procedure, as finger trees automatically force themselves on an efficient as-needed basis. This procedure exists for special cases where it is beneficial to front-load tree operations, for example if a server builds a tree as a preprocessing step before handling requests. Strictly O(log n) time.

Sequence operations

(finger-tree-append add measure tree1 tree2 ...)

Return a finger tree containing the contents of tree1, tree2, and so on, appended together. Amortized O(k log n) time, where k is the number of trees and n is the greatest number of elements in any tree.

(finger-tree-filter add measure proc tree)

Return a finger tree containing all the elements of tree that satisfy predicate proc. Note that proc is called on each element, not on each measurement. Strictly O(n) time.

(finger-tree-fold-left proc seed tree ...)

The semantics of this procedure are identical to SRFI 1 fold, except of course that the containers are finger trees instead of lists. Strictly O(n) time.

(finger-tree-fold-right proc seed tree ...)

The semantics of this procedure are identical to SRFI 1 fold-right, except of course that the containers are finger trees instead of lists. Strictly O(n) time.

(finger-tree-for-each proc tree ...)

The semantics of this procedure are identical to SRFI 1 for-each, except of course that the containers are finger trees instead of lists. Strictly O(n) time.

(finger-tree-length tree)

Return the number of elements in tree. Strictly O(n) time.

(finger-tree-map proc tree ...)

The semantics of this procedure are identical to SRFI 1 map, except of course that the containers are finger trees instead of lists. The order of elements is preserved, so this procedure is not suitable for measurement-ordered applications such as sorted sets, since in general, mapping elements permutes their order. Strictly O(n) time.

(finger-tree-reverse add measure tree)

Return a finger tree containing tree's elements in reverse order. Strictly O(n) time.

Deque operations

(finger-tree-left tree)

Return the leftmost element of tree. It is an error to call this procedure on an empty tree. Strictly O(1) time.

(finger-tree-right tree)

Return the rightmost element of tree. It is an error to call this procedure on an empty tree. Strictly O(1) time.

(finger-tree-add-left add measure tree element)

Add element as the new leftmost element of tree, and return the resulting tree. Amortized O(1) time.

(finger-tree-add-right add measure tree element)

Add element as the new rightmost element of tree, and return the resulting tree. Amortized O(1) time.

(finger-tree-remove-left tree)

Remove the leftmost element of tree, and return the resulting tree. It is an error to call this procedure on an empty tree. Amortized O(1) time.

(finger-tree-remove-right tree)

Remove the rightmost element of tree, and return the resulting tree. It is an error to call this procedure on an empty tree. Amortized O(1) time.

Scan and split

(finger-tree-scan add measure pred zero tree match absent)

Search tree for the first element whose measurement satisfies pred. When such a matching element exists, tail-call (match element); otherwise tail-call (absent).

pred must be a monotone predicate, as discussed above.

This procedure is analogous to the search or lookup operation in a search tree. It probes tree without altering it or creating new objects, so has better constant factors than finger-tree-split.

Strictly O(log n) time, assuming that each call to pred takes O(1) time.

(finger-tree-split add measure pred zero tree match absent)

Split tree around the first element whose measurement satisfies pred. When such an element exists, tail-call (match prefix element suffix), where prefix is a finger tree containing all elements left of element, element is the match, and suffix is a finger tree containing all elements right of element. Note that prefix and/or suffix may be empty. When no match exists, tail-call (absent).

pred must be a monotone predicate, as discussed above.

This procedure is strictly more powerful than finger-tree-scan, but since it constructs two new trees, its constant factors are worse. That said, finger-tree-split can be used to implement several essential higher-level operations.

Strictly O(log n) time, assuming that each call to pred takes O(1) time.

Conversion

(generator->finger-tree add measure gen [k])

Return a finger tree containing the elements of gen. When k is given, the first k elements are taken; it is an error for gen to have fewer than k elements. Specifying k allows the procedure to avoid restructuring the tree as it goes, improving constant factors. When k is unspecified, the entire contents of gen is taken, and the tree must restructure itself as it goes, worsening constant factors. Strictly O(n) time.

(list->finger-tree add measure lst)

Return a finger tree containing the elements of lst. This procedure avoids restructuring the tree as it goes, so has fast constant factors. Strictly O(n) time.

(finger-tree->generator tree)

Return a generator that yields the elements of tree in left-to-right order. Strictly O(1) time to create the generator, and of course the generator contains n elements.

(finger-tree->reverse-generator tree)

Return a generator that yields the elements of tree in right-to-left order. Strictly O(1) time to create the generator, and of course the generator contains n elements.

(finger-tree->list tree)

Return a freshly-allocated list containing the elements of tree in left-to-right order. Strictly O(n) time.