Trie

pronounced as 'Try'

Concept

trie

Search complexity can be brought to optimal limit(key length).

Insert and search costs O(key_length), however the memory requirements of trie is O(ALPHABET_SIZE * key_length * N) where N is number of keys in trie.

Operations

  • insert
  • search
  • start_with_prefix

  • delete

During delete operation we delete the key in bottom up manner using recursion. The following are possible conditions when deleting key from trie.

  1. key may not be there in trie. should not modify trie
  2. key present as unique key. delete all the nodes
  3. key is prefix key of another long key. unmark the is_word bool
  4. key present in trie, having at least one other key as prefix. delete nodes from end of key until first leaf node of longest prefix key.

Data structures to use

  • dictonary
  • array[26]

Implementation

class TrieNode(object):
    def __init__(self):
        self.is_word = False
        self.children = {}

class Trie(object):
    def __init__(self):
        # root node
        self.root = TrieNode()

    def insert(self, word):
        current = self.root
        for letter in word:
            if letter not in current.children:
                current.children[letter] = TrieNode()
            current = current.children[letter]
        current.is_word = True

    def search(self, word):
        current = self.root
        for letter in word:
            if letter not in current.children:
                return False
            current = current.children[letter]
        return current.is_word

    def startsWith(self, prefix):
        current = self.root
        for letter in prefix:
            if letter not in current.children:
                return False
            current = current.children[letter]
        return True

    def delete(self, word):
        current = self.root
        path = [current]
        for letter in word:
            if letter not in current.children: # not in trie
                print "not in the trie"
                return
            current = current.children[letter]
            path.append(current)
        if not current.is_word: # not a word in trie
            return
        # try delete the nodes reversely
        # start from the end's parent and the last letter
        for current, letter in zip(reversed(path[:-1]), reversed(word)):
            if len(current.children) <= 1:
                del current.children[letter]
                del current
            else:
                break

can also use defaultdict from collections

References