Inductive Sets of Data

Recursively Specified Data

There are 3 ways of defining a set of values inductively:

Top Down: a Natural Number is in set S iff:
- $n = 0$
- $n - 3 \in S$
Bottom Up: the set S is the smallest set contained in N that satisfies:
- $0 \in S$
- if $n \in S$ then $n + 3 \in S$
Rules of Inference
- $\overline{0 \in S}$
- $\frac{n \in S}{(n + 3) \in S}$

We can also define sets with grammars, eg:

List-of-Int ::= ()
List-of-Int ::= (Int . List-Of-Int)

Each line is called a production. This can also be written as:

List-of-Int ::= ()
            ::= (Int . List-of-Int)

List-of-Int ::= () | (Int . List-of-Int)

List-of-Int ::= ( {Int}* )

The last line is an example of a Kleene star, there is also a Kleene plus (${Int}^+$), they both work as in regex.

A syntactic derivation can be used to show a value is part of a set, eg:

List-of-Int
 => (Int . List-of-Int)
 => (Int . (Int . List-of-Int))
 => (Int . (Int . (Int . List-of-Int)))
 => (-7  . (Int . (Int . List-of-Int)))
 => (-7  . (3   . (Int . List-of-Int)))
 => (-7  . (3   . (14  . List-of-Int)))
 => (-7  . (3   . (14  . ())))

Useful sets

Lists that contain only symbols are called s-lists:

S-list ::= ( {S-exp}* )
S-exp  ::= Symbol | S-list

eg: (a b c)
    (a (b (c) d) e f)

We have binary tree:

Bintree ::= Int | (Symbol Bintree Bintree)

eg: 1
    (a (b 1 2) 3)

And lambda calculus:

LcExp ::= Identifier
      ::= (lambda (Identifier) LcExp)
      ::= (LcExp LcExp)

eg: (lambda (x) (+ x 5))

In the example, x is a bound variable.

These rules are context free in that a rule for a category (eg Bintree) can be applied whereever Bintree is referenced. This is not the case for something like a Binary Search Tree where there are rules on the orderings of nodes that aren’t mentioned in the grammar.

Deriving Recursive Programs

“When defining a procedure that operates on inductively defined data, the structure of the program should be patterned after the structure of the data.”

Write a procedure for each nonterminal.
In each procedure, write one alternative for each production.
- When a non-terminal occurs on the right hand side, write a recursive call.
TODO: Do some of the exercises.
- Calculate list length.
- Get n-th element of list.
- Remove first occurence of element in list.
- Whether a variable occurs free in a lambda expression.
- Substitute elements in a list.

Auxiliary Procedures and Context Arguments

Sometimes this doesn’t work - we need to pass additional context into our recursive called. Eg, a method that numbers each element of a list: (a b c) -> ((0 a) (1 b) (2 c)), we need to pass the current value to the recursive call. The Auxiliary Procedure would be number-elements-from list index for the main procedure number-elements.

Exercises

1.15: (duple n x) returns a list containing n copies of x.

def duple(n, x):
    if n <= 0:
        return []
    else:
        return [x] + duple(n - 1, x)
    
assert duple(0, 3) == []
assert duple(3, 'a') == ['a', 'a', 'a']

1.16: (invert lst) where lst is a list of pairs returns a list with those pairs reversed.

def invert(lst):
    if lst:
        x, *xs = lst
        return [(x[1], x[0])] + invert(xs)
    else:
        return []

assert invert([]) == []
assert invert([(1, 2), (3, 4)]) == [(2, 1), (4, 3)]
assert invert([('a', 1), ('b', 2), ('c', 3)]) == [(1, 'a'), (2, 'b'), (3, 'c')]

1.18: (swapper s1 s2 slist) returns a list the same as slist, but with all occurrences of s1 replaced by s2 and all occurrences of s2 replaced by s1.

def swapper(s1, s2, slist):
    if slist:
        x, *xs = slist
        if type(x) == list:
            return [swapper(s1, s2, x)] + swapper(s1, s2, xs)
        elif x == s1:
            return [s2] + swapper(s1, s2, xs)
        elif x == s2:
            return [s1] + swapper(s1, s2, xs)
        else:
            return [x] + swapper(s1, s2, xs)
    else:
        return []

assert swapper('a', 'd', ['a', 'b', 'c', 'd']) == ['d', 'b', 'c', 'a']
assert (swapper('a', 'd', ['a', 'd', [], 'c', 'd']) ==
        ['d', 'a', [], 'c', 'a'])
assert (swapper('x', 'y', [['x'], 'y', ['z', ['x']]]) ==
        [['y'], 'x', ['z', ['y']]])

1.21: product(s1, s2) that produces the Cartesian product of the two lists.

def right_product(x, s2):
    if s2:
        s, *ss = s2
        return [(x, s)] + right_product(x, ss)
    else:
        return []

def product(s1, s2):
    if s1:
        s, *ss = s1
        return right_product(s, s2) + product(ss, s2)
    else:
        return []

assert (product(['a', 'b', 'c'], ['x', 'y']) ==
        [('a', 'x'), ('a', 'y'),
         ('b', 'x'), ('b', 'y'),
         ('c', 'x'), ('c', 'y')])

1.24: (every? pred lst) returns #f if any element of lst fails to satisfy pred, and returns #t otherwise.

def every(pred, lst):
    if lst:
        x, *xs = lst
        if pred(x):
            return every(pred, xs)
        else:
            return False
    else:
        return True

assert every(lambda x: x % 2 == 0, [2, 4, 6])
assert not every(lambda x: x % 2 == 0, [2, 4, 7])

1.27: (flatten slist) returns a list of the symbols contained in slist in the order in which they occur when slist is printed. Intuitively, flatten removes all the inner parentheses from its argument.

def flatten(slist):
    if slist:
        x, *xs = slist
        if type(x) == list:
            return flatten(x) + flatten(xs)
        else:
            return [x] + flatten(xs)
    else:
        return []

assert flatten([1, 2, 3]) == [1, 2, 3]
assert flatten([[1], [], [2, []], [3]]) == [1, 2, 3]

1.31: Write the following for a binary tree:

Constructors:
- leaf
- interior-node
Predicates:
- leaf?
Observers:
- lson
- rson
- contents-of for both leaves and interior nodes.

LEAF = 'leaf'
NODE = 'node'

TAG_INDEX = 0
VALUE_INDEX = 1
LSON_INDEX = 2
RSON_INDEX = 3

def leaf(value):
    return (LEAF, value)

def interior_node(value, left, right):
    return (NODE, value, left, right)

def is_leaf(bintree):
    return bintree[TAG_INDEX] == LEAF

def lson(node):
    return node[LSON_INDEX]

def rson(node):
    return node[RSON_INDEX]

def contents_of(bintree):
    return bintree[VALUE_INDEX]

1.32: Write double-tree that takes a binary tree and produces another with all the integers in the leaves doubled.

def double_tree(tree):
    if is_leaf(tree):
        return leaf(2 * contents_of(tree))
    else:
        return interior_node(
            contents_of(tree),
            double_tree(lson(tree)),
            double_tree(rson(tree)))

assert (double_tree(leaf(2)) == leaf(4))
assert (double_tree(interior_node('a', leaf(2), leaf(3))) ==
        interior_node('a', leaf(4), leaf(6)))

1.33 Write mark-leaves-with-red-depth that takes a binary tree and produces a binary tree of the same shape but where the leaves contain the number of nodes between it and the nearest interior node with the value red. The root will have the value red.

*I think there may have been an error in the example given in the question - either that or I don’t understand it.**

def mark_leaves_with_red_depth(tree, distance = 0):
    if is_leaf(tree):
        return leaf(distance)
    else:
        # This does break a bit from functional style.
        if contents_of(tree) == 'red':
            distance = 0
        else:
            distance = distance + 1
        
        return interior_node(
            contents_of(tree),
            mark_leaves_with_red_depth(lson(tree), distance),
            mark_leaves_with_red_depth(rson(tree), distance))

tree = interior_node('red',
            interior_node('bar',
                leaf(26),
                leaf(12)),
            interior_node('red',
                leaf(11),
                interior_node('quux',
                    leaf(117),
                    leaf(14))))

expected_tree = interior_node('red',
            interior_node('bar',
                leaf(1),
                leaf(1)),
            interior_node('red',
                leaf(0),
                interior_node('quux',
                    leaf(1),
                    leaf(1))))

assert mark_leaves_with_red_depth(tree) == expected_tree

1.34: Write a procedure path that takes an integer n and a binary search tree that contains that integer and returns a list of lefts and rights showing how to find n.

def path(n, bst, acc = []):
    if bst:
        current = bst[0]
        left = bst[1]
        right = bst[2]
        if n == current:
            return acc
        
        left_path = path(n, left, acc + ['left'])
        if left_path:
            return left_path
        
        right_path = path(n, right, acc + ['right'])
        if right_path:
            return right_path
        
        # Yeah, we could shorten this, but I like the symmetry.
        return []
    else:
        return []

# This one uses the fact the elements are ordered.
def path_search(n, bst, acc = []):
    if bst:
        current = bst[0]
        if n == current:
            return acc
        elif n < current:
            return path(n, bst[1], acc + ["left"])
        else:
            return path(n, bst[2], acc + ["right"])
    else:
        return []
                

bst = (14,
          (7,
              (),
              (12, (), ())),
          (26,
              (20,
                  (17, (), ()),
                  ()),
              (31, (), ())))

assert path(17, bst) == ['right', 'left', 'left']
assert path_search(17, bst) == ['right', 'left', 'left']

1.35: Write number-leaves that takes a binary tree and returns one like the original except the contents of the leaves are numbered starting from 0.

def number_leaves(bintree):
    return number_leaves_aux(bintree, 0)[0]

def number_leaves_aux(bintree, index):
    if is_leaf(bintree):
        return (leaf(index), index + 1)
    else:
        right_pair = number_leaves_aux(rson(bintree), index)
        updated_index = right_pair[1]
        left_pair = number_leaves_aux(lson(bintree), updated_index)
        updated_index = left_pair[1]
        return (interior_node(contents_of(bintree),
                right_pair[0],
                left_pair[0]),
            updated_index)

print(number_leaves(tree))