Thursday, June 27, 2013

The complexity for Fibonacci numbers in SAS

Fibonacci numbers have a lot of applications. There have been a lot of interesting researches regarding Fibonacci numbers since 800 years ago. Mathematically they are the numbers from a sequence, which is defined as F_n = F_{n-1} + F_{n-2}\hspace{5}with\hspace{5}F_0 =0, \hspace{2} F_1=1.

Experiment and Result

Fibonacci numbers can be derived from either the original definition that is a typical example of recursion, or a mathematically simplified form that is called closed-from expression. In SAS, a user-defined Fib1 function in PROC FCMP can realize the recursion expression, and another Fib2 function is created to express the closed-form expression. I insert both functions to a loop from 10 to 40 with step size of 5 and record the real time costed.
According to the figure above, both algorithms have no actually difference while N is relative small. When N becomes 30 or bigger, the recursion expression function spends a lot of system time. When N is even greater than 40, SAS enters a freezing status and I have to break from the command manually. Thus, the curve for the recursion expression seems to fit an exponential relationship fairly well. On the contrary, the algorithm of closed-from expression takes almost constant time for the execution,

Complexity

  • Recursion expression
    T(n) = T(n-1) + T(n-2) + O(1), which may imply O(2^{n-1}) + O(2^{n-2}) + O(1) = O(2^n).
  • Closed-form expression
    The equation is fairly straightforward and described as F_n= {\phi^n-(-\phi)^{-n}\over\sqrt{5}}\hspace{2}with\hspace{2} \phi={(1+\sqrt{5})\over2}.  Here \phi is the well-known golden ratio. Its complexity is a constant, which suggests O(n) = O(n-1) = O(1).

Conclusion

Mathematics sometimes significantly decreases the complexity of the computation, in this case, from O(2^n) to O(1). It is also useful for us to estimate how much resources are needed for specific purpose.

****(1) Bulid user-defined functions *****************;
proc fcmp outlib = work.test.fibonacci;  
   * Create the 1st function based on recursion; 
   function fib1(n);
      if n<0 then return(0);
      else if n = 1 or n = 2 then return(1);
      else return(fib1(n-1) + fib1(n-2));
   endsub;
   * Create the 2nd function based on closed form; 
   function fib2(n);
       phi = (1 + sqrt(5))/2;
       return(round((phi**n - (1-phi)**n) / sqrt(5)));
   endsub;
quit; 

****(2) Run the tests sequentially********************;
options cmplib = work.test fullstimer;
%macro test();
   * Create datasets for verification
   %do j = 10 %to 40 %by 5;
      data _fib1_&j;
         do i = 1 to &j;
            x = fib1(i);
            output;
         end;
         %put Recursion method at &j;
      run;
      data _fib2_&j;
         do i = 1 to &j;
            x = fib2(i);
            output;
         end;
         %put Closed form method at &j;
      run;
   %end;
%mend;

* Output SAS log to text file;
proc printto log = "c:\tmp\test.log" new;
run;
%test()
proc printto;
run;

****(3) Display the results***************************;
proc datasets nolist;
   delete _:;
quit;

data one;
  infile "c:\tmp\test.log" truncover;
  input @1 line $80.;
  if index(line, 'real') > 0;
run;

data two;
   set one;
   length method $20.;
   retain number 5;
   * Ignore the time used by PROC PRINTTO;
   if _n_ > 1; 
   if mod(_n_ , 2) = 0 then do; 
      method = "Recursion"; 
      number + 5;
   end;
   else method = "Closed form";
   time = input(substr(line, 21, 5), 5.2);
run;

proc sgplot data = two;
   series x = number y = time / group = method;
   yaxis label = "Time elapsed by seconds" grid;
run;

Good math, bad engineering

As a formal statistician and a current engineer, I feel that a successful engineering project may require both the mathematician’s abilit...