Binary Indexed Tree
The structure was first used for data compression. Now it is often used for:
- storing frequencies
- manipulatin cumulative frequency tables
Basic problem
we have n boxes, possible queries are
- add marble to box
i - sum marbles from box
kto boxl
Naive array would have O(1) and O(n).
Using Binary indexed trees the query 2 can be O(log n)
Notation
- BIT
- MaxVal
- f[i] - freq for index
i - c[i] - cumulative freq for index
i(f[1] + f[2] + ... + f[i]) - tree[i] - sum of frequencies stored in BIT, aka tree frequency
Basic idea
Each integer can be represented as sum of powers of two.(二进制)
idx is some index, Let r be the position of the last digit 1 in binary notation.
tree[idx] = f[idx - 2^r + 1] + ..... + f[idx]
suppose we are looking for cumulative frequency of index 13(1101).
c[1101] = tree[1101] + tree[1100] + tree[1000]
isolate last digit
x & ~(x-1)
Operations
Read cumulative frequency
def read(idx):
sum = 0
while idx:
sum += tree[idx]
idx -= (idx & ~(idx-1))
return sum
time: O(log MaxVal)
change single frequency at some position and update tree
update tree frequency at all indexes that are responsible for frequency whose value are changing.
In changing, we should increment value at the current index, add the last digit
to index and go on while the index <= maxVal
def update(idx, val):
while idx <= MaxVal:
tree[idx] += val
idx += (idx & ~(idx-1))
time: O(log MaxVal)
Read the actual frequency at a position
One approach is to have one additional array.
f[idx] = read[idx] - read[idx-1]->2 * O(log n)
def readSingle(idx):
sum = tree[idx]
while idx: # special case 0
stop_idx = idx - (idx & ~(idx-1))
idx -= 1 # get idx - 1
while idx != stop_idx:
sum -= tree[idx]
idx -= (idx & ~(idx-1))
scaling the entire tree by a constant factor
def scale(c):
for i in range(1, MaxVal + 1):
tree[i] = tree[i] / c
find index with given cumulative frequency
Similar as binary search
def find(target):
idx = 0
bitMask = 1 << 31
while bitMask and idx < MaxVal:
mid = idx + bitMask # mid point
if target == tree[mid]:
return mid
elif target > tree[mid]: # mid should be included in the cumulative
idx = mid # higher half
target -= tree[mid]
bitMask = bitMask >> 1 # half the range
if target != 0:
return -1
else:
return idx
2D BIT
Suppose we have a plane with dots, you can make 3 queries:
- set dot at (x, y)
- remove dot from (x, y)
- count number of dots in rectangle(0,0), (x, y)
In this case, each element of the tree will contain array - tree[max_x][max_y]
Updating indexes of x-coordinates is the same as before. For example, suppose
we are setting/removing dot (a, b), we will call update(a, b, 1)/update(a, b, -1).
def update(x, y, val):
while x <= max_x:
updatey(x, y, val)
x += (x & ~(x-1))
def updatey(x, y, val):
while y <= max_y:
tree[x][y] += val
y += (y & ~(y-1))
can be written as one function
def update(x, y, val):
while x <= max_x:
y1 = y
while y1 <= max_y:
tree[x][y] += val
y1 += (y1 & ~(y1-1))
x += (x & ~(x-1))
read function is changed similarly
Conclusion
- easy to code
- each query takes constant or logarithmic time
- require linear memory space
- can use it as an n-dimensional data structure