In this series we’re exploring the data structure - Binary Indexed Tree (BIT)  [Fenwick tree], BIT is used for storing frequencies and manipulating cumulative frequencies.
This series includes 3 parts. This is the first part of the series. 1. Binary Indexed Tree (BIT) Part-1 2. Binary Indexed Tree (BIT) Part-2 3. Binary Indexed Tree (BIT) Part-3 [In progress]
Part 1
This series includes 3 parts. This is the first part of the series. 1. Binary Indexed Tree (BIT) Part-1 2. Binary Indexed Tree (BIT) Part-2 3. Binary Indexed Tree (BIT) Part-3 [In progress]
Part 1
Consider a frequency table F, that stores frequencies of numbers in the range [1...N].  Such a table can be represented using a one dimensional array.
F[i], 1 <= i <= N
F[i] stores frequency of number i. 
In this representation frequency of a number i is stored at index i, this defines a one-to-one mapping between number and index.
Below is a sample frequency table F storing frequencies of number in the range [1..16]
For example element at index 5 stores frequency of number 5, which is 8.
Since we use number itself as table index and since the numbers ranges from 1 to N, we will not use 0th index of the table.
The problem we are trying to solve is - given such a table, we want answer following type of queries:
    Query #1: Find cumulative frequency at a number m inclusive.
    int CumFreq(m)
    Query #2: Update frequency of a number m by x.
     void Update(m, x)
    Query #3: Find/Read frequency of a number m. 
     int Read(m)
     Query #4: Find a number with given frequency f.
     int FindByFrequency(f)
Naive apporach 1: Using F table
Since table F is an array which is indexed using number,  type  #2 and #3 queries can be answered in O(1) time.
#2 Update the frequency of number m by x
   Action<in, int> Update = (m, x) => { F[m] + x; }
#3 Read frequency of a number
   Func<int> Read = (m) => { return F[m]; }
The cumulative frequency at a number m can be calculated by adding the frequency of number m to the cumulative frequency of the previous number m - 1. The cumulative frequency for the number 1 is same as its frequency since there is no cumulative frequency before it.
By just using frequency table F, achieving #1 requires iterating through the table from 1 to m
              i=m  
CF(m) = Σ F(i)
              i=1
Func<int> CumFreq = (m) => { 
  int cf = 0;
  for (int i = 1; i <= m; i++) { cf += F[i]; }
  return cf;
};
Using this method to find the cumulative frequency takes O(N).
Naive approach 2: Using CF table
Another approach is maintaining a table CF, where CF[j] stores the cumulative frequency at number j.
This approach will take O(N) time for pre-computing the CF table. 
The value stored at index j of table CF is:
             j  
CF(j) = Σ F[a]
            a=1
Using CF table type #1 and #3 queries can be answered in O(1).
#1 Cumulative frequency at number m
Func<int> CumFreq = (m) => { return CF[m]; }
#3 Find frequency of a number m
 Func<int> Read = (m) => { 
  if (m == 1) { return CF[m]; }
  return CF[m] - CF[m - 1];
}
#2 type queries takes O(N) in this case. Updating frequency of a number m by x requires adding x to cumulative frequency of all numbers following m, inclusive.
Action<int, int> Update = (m, x) => { 
  for (int i = m; i < CF.Length; i++) { CF[i] += x; }
};
Comparison of naive approaches with BIT
Before we start exploring BIT, let’s compare running time of above two naive approach with upcoming BIT approach so we know what BIT guarantee.
| 
Query | 
Using F table | 
Using CF table  | 
Using BIT | 
| 
int Read(int m)  | 
O(1) | 
O(1) | 
O(log(N))  | 
| 
void Update(int m, int x) | 
O(1)  | 
O(N) | 
O(log(N)) | 
| 
int CumFreq(int m)  | 
O(N) | 
O(1) | 
O(log(N))  | 
| 
Notes: 
 | |||


 
No comments:
Post a Comment