Once you go functional, you can never go back

Intro

Having been an imperative (procedural, object-oriented), von Neumann-style programmer for many years, I thought there was nothing better out there. A few months ago, I came across a paradigm that was new to me, at least: functional programming. Ever heard of LISP? I instantly liked it. Of course, I still do most of my work in imperative languages, but I'm slowly making the switch, at least for simpler programs. In this article, I will give you a short theoretical introduction to functional programming and state some arguments for why I think functional programming is, in some respects, better than imperative programming. There are, of course, areas where imperative programming is unbeatable, and I will try to cover those too. I will not focus on a single language, but on two slightly different functional languages: Scheme (a LISP dialect) and Haskell (a mainstream functional language with a lot of potential). Some features may not be available in both.

Introduction to functional programming

Functional programming languages fall under the declarative category, meaning algorithms are written as relations between steps, not step by step as in the imperative style. Let's take a look at the following example to make this clear.

Fibonacci sequence in C++
#include <iostream>

using namespace std;

int main() {
    int num = 0;
    cin >> num;              // how many Fibonacci numbers to print

    int i = 0;
    unsigned long a = 1;
    unsigned long b = 0;
    unsigned long fib = 0;

    while (i++ < num) {
        b = fib;
        fib = a + b;         // next Fibonacci number
        cout << fib << endl;
        a = b;
    }
    return 0;
}

Fibonacci sequence in LISP-oid Scheme
(define (fib n)
  (cond ((= n 0) 0)
        ((= n 1) 1)
        (else (+ (fib (- n 1)) (fib (- n 2))))))

Please note that the Scheme implementation is far worse complexity-wise than its C++ counterpart. You should never use naive recursion in this form, as its time complexity is exponential. This is just an example of the simple notation. Its complexity can be improved a lot (matching the C++ version) by accumulating values and passing them through the recursive calls.
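The accumulating version mentioned above can be sketched in Haskell; the helper name `go` is mine, just for illustration:

```haskell
-- Linear-time Fibonacci: instead of recomputing subproblems,
-- the last two values are accumulated and passed along.
fib :: Int -> Integer
fib n = go n 0 1
  where
    go 0 a _ = a                     -- after n steps, a holds fib(n)
    go k a b = go (k - 1) b (a + b)
```

Each recursive call does constant work, so the running time drops from exponential to linear.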

As you can see, notation in functional languages is much closer to classic mathematical notation.

Functional languages (FL) features

In order to call a language purely functional, it has to have six distinct features. I will give a short description of each.

Edit: I in no way claim that all functional languages have these six features. The list here is referenced from a book called "Introduction to Scheme". These are the features FLs should possess, but usually that's not the case (not even in my examples, Scheme and Haskell).

Lazy evaluation and non-strict semantics

Lazy eval, for short, is one of the most important features of pure FLs. Basically, nothing is calculated/evaluated until it's really required; if it's not required, it will not be calculated at all, simple as that. Sounds quite logical, don't you think? Lazy eval allows lovely things like the creation of infinite structures (e.g. the list of natural numbers) and frees the programmer from lots of unnecessary if clauses.
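As a minimal Haskell sketch of what lazy evaluation buys you (the names `naturals` and `firstTen` are just illustrative):

```haskell
-- A conceptually infinite list; thanks to laziness,
-- none of it is computed until something demands it.
naturals :: [Integer]
naturals = [0 ..]

-- Only the first ten elements are ever evaluated.
firstTen :: [Integer]
firstTen = take 10 naturals
```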

In pure FLs, all operators and all evaluations are non-strict, meaning an operator can succeed even if one of its operands is not initialized, as long as that operand is not needed. This follows from the lazy eval feature described above. Something similar is present in most imperative languages as well (the short-circuiting && and || operators in C-like languages); the difference is that in a pure FL every expression is non-strict, not just the logical operators.
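This is easy to demonstrate in Haskell: since `fst` never touches the second component of the pair, the undefined operand is simply never evaluated and no error occurs (the name `nonStrict` is mine):

```haskell
-- Non-strictness: the unused, undefined operand is never evaluated.
nonStrict :: Int
nonStrict = fst (42, undefined)
```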

Functions as first class citizens

Functions in FLs are just like any other data/objects: they can be passed to other functions as parameters, they can be return values of functions, and they can be part of any data structure (tuple, list). This allows the creation of higher-order functions (functions that manipulate other functions).
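A minimal Haskell sketch of both directions (the names `twice` and `adder` are mine):

```haskell
-- Takes a function as a parameter and applies it two times.
twice :: (a -> a) -> a -> a
twice f x = f (f x)

-- Returns a function as its result.
adder :: Int -> (Int -> Int)
adder n = \x -> x + n
```

For example, `twice (+3) 10` gives 16, and `adder 5` is itself a function that adds five to its argument.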

No side effects

Side effects, which are possible in a lot of imperative languages, are nonexistent in FLs. That's also quite logical when you think about it: in mathematics, f(x) is always equal to f(x), so why not make it so in programming as well? Controlled side effects are possible in some FLs.
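A quick Haskell illustration of this property (often called referential transparency); `square` is just an illustrative name:

```haskell
square :: Int -> Int
square x = x * x

-- With no side effects, naming a result and repeating the call
-- are interchangeable; both expressions equal 32.
viaLet :: Int
viaLet = let y = square 4 in y + y

direct :: Int
direct = square 4 + square 4
```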

Static binding

All variables are allocated at compile time and are statically bound. No need to go into more details.

No explicit flow control

As I've said in the beginning, writing an algorithm in an FL is not a matter of writing it step by step, line by line, but rather of determining relations. Therefore, you can't directly control the program flow, as it doesn't simply run from top to bottom. Controlling the flow is only possible through function calls and expression evaluation.

No commands/procedures

Everything in an FL is an expression and should be treated that way. Even the whole program itself is just an expression, an expression that needs to be evaluated. Therefore, there are no explicit procedures, just pure expressions, and since the program is an expression, its execution is just expression evaluation.
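For instance, in Haskell even `if` is an expression that yields a value, so it can appear anywhere a value can (the function name `sign` is hypothetical):

```haskell
-- `if` is an expression, not a statement: it evaluates to a value.
sign :: Int -> String
sign n = if n < 0 then "negative" else "non-negative"
```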

The only important thing left to say is that FLs are based on Church's lambda calculus. Going into detail about it is beyond the point of this article, but it's an interesting read, and I recommend it.

Enough of theory, let's get to the point.

The pros

Productivity

The thing I most adore about FLs is that algorithms are usually a lot shorter and a lot more elegant. Shorter code with less typing implies better productivity.

Notation

As you've seen in the example above, notation in FLs is very close to classic mathematical notation. This is actually a nice thing (in my opinion), because it enables you to implement algorithms directly from their definition. Let's take a look at a merge sort algorithm written in Scheme and in Haskell.

MS in Scheme
(define (mergesort x)
  (cond ((null? x) '())
        ((null? (cdr x)) x)
        (else
         (let ((half (quotient (length x) 2)))
           (combine (mergesort (firstHalf x half))
                    (mergesort (lastHalf x half)))))))

(define (combine list1 list2)
  (cond ((null? list1) list2)
        ((null? list2) list1)
        ((<= (car list1) (car list2))
         (cons (car list1) (combine (cdr list1) list2)))
        (else
         (cons (car list2) (combine list1 (cdr list2))))))

; the first N elements of L
(define (firstHalf L N)
  (if (= N 0)
      '()
      (cons (car L) (firstHalf (cdr L) (- N 1)))))

; everything after the first N elements of L
(define (lastHalf L N)
  (if (= N 0)
      L
      (lastHalf (cdr L) (- N 1))))

MS in Haskell
sort :: Ord a => [a] -> [a]
sort []         =  []
sort [x]        =  [x]
sort xs         =  merge (sort ys) (sort zs)
  where (ys,zs) = split xs
 
merge [] ys = ys
merge xs [] = xs
merge (x:xs) (y:ys)
      | x<=y      = x : merge xs (y:ys)
      | otherwise = y : merge (x:xs) ys
 
split []        =  ([], [])
split [x]       =  ([x], [])
split (x:y:xs)  =  (x:l, y:r)
  where (l, r)  =  split xs

As you can see, we can follow the definition directly (split the list in two, sort both halves, merge the halves). Implementing such an algorithm in an imperative language requires a lot more work, and the code is a lot less elegant.

MS in C++
void merge(int a[], const int low, const int mid, const int high)
{
        // Variable declarations.
        int * b = new int[high+1-low];
        int h,i,j,k;
        h=low;
        i=0;
        j=mid+1;
        // Merge the two arrays into b[] until one of them is exhausted
        while((h<=mid)&&(j<=high))
        {
                if(a[h]<=a[j])
                {
                        b[i]=a[h];
                        h++;
                }
                else
                {
                        b[i]=a[j];
                        j++;
                }
                i++;
        }
        // Copy the remaining values from whichever half is not exhausted
        if(h>mid)
        {
                for(k=j;k<=high;k++)
                {
                        b[i]=a[k];
                        i++;
                }
        }
        else
        {
                for(k=h;k<=mid;k++)
                {
                        b[i]=a[k];
                        i++;
                }
        }
        // Copy the merged result back into the original array
        for(k=0;k<=high-low;k++) 
        {
                a[k+low]=b[k];
        }
        delete[] b;
}
 
void merge_sort( int a[], const int low, const int high )               // Recursive sort ...
{
        int mid;
        if( low < high )
        {
                mid = ( low + high ) / 2;
                merge_sort( a, low, mid );
                merge_sort( a, mid + 1, high );
                merge( a, low, mid, high );
        }
}

* Algorithms were taken from wikibooks.org

Pattern matching (Haskell only)

Instead of writing a lot of ifs for different situations, we can just use pattern matching and let the language determine which pattern applies. The example below shows how easy it is to use.

List reverse in Haskell
reverse :: [a] -> [a]
reverse [] = []
reverse (x:xs) = reverse xs ++ [x]

ZF expressions (Haskell only)

ZF expressions (introduced in KRC) are one of my favorite features. ZF stands for Zermelo-Fraenkel, as the notation is based on their set theory. ZF expressions (known today as list comprehensions) basically allow programmers to use the set-builder notation used in mathematics. Let's take a look at the quicksort algorithm implemented with ZF expressions.

qs []     = []
qs [x]    = [x]
qs (x:xs) = qs [ u | u <- xs, u < x ] ++ [x] ++ qs [ u | u <- xs, u >= x ]

Whoa, only three lines! How many lines does it take in an imperative language? Take a look below; it's a lot longer.

void qsort(int a[], int lo, int hi) {
  int h, l, p, t;

  if (lo < hi) {
    l = lo;
    h = hi;
    p = a[hi];

    do {
      while ((l < h) && (a[l] <= p)) 
          l++;
      while ((h > l) && (a[h] >= p))
          h--;
      if (l < h) {
          t = a[l];
          a[l] = a[h];
          a[h] = t;
      }
    } while (l < h);

    a[hi] = a[l];
    a[l] = p;

    qsort(a, lo, l-1);
    qsort(a, l+1, hi);
  }
}

Something similar can be achieved in Scheme as well using filter higher-order function.

(define (qs l)
  (if (null? l)
      '()
      (append (qs (filter (lambda (x) (<= x (car l))) (cdr l)))
              (list (car l))
              (qs (filter (lambda (x) (> x (car l))) (cdr l))))))

Mistake chance reduction

Since there's no real flow control and there are no mutable variables, the chance of producing certain kinds of errors is greatly reduced. Eliminating side effects reduces it further.

A different way of solving problems

Solving problems in an FL is, in some ways, easier than in an imperative language, mostly because of the possibility of purely mathematical notation. This, of course, is rather personal and should be taken as both a pro and a con, since it can be either, depending on the programmer.

Lambda functions

Lambda functions are getting more and more attention in imperative programming nowadays as well. For example, they were introduced into Java in version 8. Lambda functions/expressions are, simply put, small functions (rarely, they can be big) without a name. Their best application is inline usage, where defining a separate named function would be unnecessary. The following examples show how they are defined.

Lambda in Scheme
(lambda (x y) (+ x y y)) ; f(x, y) = x + y + y

Lambda in Haskell
\x y -> x + y + y

As I've said, lambdas are also available in Java.
Lambda in Java
(x, y) -> x + y + y

Higher-order functions

As described above, higher-order functions manipulate other functions: they can take functions as parameters, and they can return functions as well. Depending on the implementation, in some FLs all functions are curried, while in others you have to define them in curried form. Some popular examples are the map and reduce functions (by reduce, I'm referring to foldr).
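A short sketch of currying in Haskell, where every function is curried by default (the names `add` and `increment` are illustrative):

```haskell
-- add takes one Int and returns a function expecting the next one.
add :: Int -> Int -> Int
add x y = x + y

-- Partial application: supplying only the first argument
-- yields a new one-argument function.
increment :: Int -> Int
increment = add 1
```

For example, `increment 41` gives 42, and `map (add 10) [1, 2, 3]` gives `[11, 12, 13]`.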

Map in Scheme
(define (map f l)
  (if (null? l)
      '()
      (cons (f (car l)) (map f (cdr l)))
  )
)

Map in Haskell
map :: (a -> b) -> [a] -> [b]  
map _ [] = []  
map f (x:xs) = f x : map f xs


Reduce in Scheme
(define (reduce f b l)
  (if (null? l)
      b
      (f (car l) (reduce f b (cdr l)))
  )
)

Reduce in Haskell
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f z []     = z
foldr f z (x:xs) = f x (foldr f z xs)

Since this is impossible in imperative languages, I can't give you a comparison.

Edit: After publishing this article, it was brought to my attention that I was plainly wrong on this one. Higher-order functions are possible in most imperative languages as well, yet they are not used as often. I apologize.

The cons

Everything has its bad sides, that's just how it is. Let's go on through some of the most prominent.

No iteration

In an FL, everything is achieved using recursion; there's no iteration of any kind (at least there shouldn't be). This can be a problem for an imperative programmer. Letting go of loops may be the hardest thing for me. Some things are just harder to solve with recursion alone.

No flow control

Even though I listed this as a pro, it's also a con. Having no control over the program flow can sometimes make you feel rather limited.

Outro

Even though functional languages may not be applicable to everything, functional programming is still a fun paradigm to learn, and I recommend it. I'm not in any way saying that you should give up everything you know and switch, as that's not something I'm willing to do either. I'm saying that for simpler algorithms, programming in the functional paradigm can make you a lot more productive by eliminating some of the bad elements of the imperative style.

Edit: After publishing the article, it received a lot of negative feedback. I just want to say that this is not an anti-imperative post. I myself am an imperative programmer, and I'm not giving up the imperative style for functional programming. The whole point of this article was a small theoretical introduction to functional languages and functional programming, with some examples of interesting features that are unavailable or rarely used in the imperative style. I once again apologize for the couple of mistakes I've made.