Article Author: Frank Antonsen
Introduction
Functional Programming (FP) is getting more and more acceptance into the mainstream. Scripting languages such as Perl and Python have added more and more concepts from FP into their cores. Even the Standard Template Library (STL) of modern C++ includes many ideas taken from FP, and the most common library, Boost, contains even more. The increased interest in FP becomes very clear when you notice that both Sun and Microsoft are working on FP languages. Sun is working on a language called Fortress, which is a mixed FP and OOP language for highly demanding numerical calculations, for which people would at present typically use an old-timer like Fortran. And Microsoft very quickly ported a functional language to the .NET platform as part of their experiments with generics.
This programming language is called F# and I introduced you to the basics in the first article of the series. In that article I showed you how to define variables and functions with the let keyword and how to execute code with the do statement. I also showed you how to work with F# in interactive mode, which is great for trying out ideas.
The previous article covered only simple data types, such as lists and tuples (they are simple seen from F#, they are not all that simple in C#). In this article I will show you how to define more complicated data types. I will be doing this by working on a fairly simple file application: Given a directory name it constructs a tree of all files and subdirectories found within that root. This is an example of a Recursive Data Structure (RDS). Given this tree, the program will then use higher order functions to carry out operations on these files and directories. For instance, finding all files matching a given name or being of a certain size or type, and/or copying these to a back-up directory. If you have followed my earlier articles, you will have seen how to do this in C#. The F# solution is going to be considerably simpler.
System Requirements
To run this code, you should install the F# compiler from Microsoft first. It also comes with a plug-in for Visual Studio 2003/2005, which you should also install, if you plan to use VS. It will run on any computer with at least the .NET framework v 1.1 installed. The samples in this article are based on version 1.1.8 of F# released in January 2006.
However, if you want to use the interactive environment, fsi, you will need to use .NET v2.0.
Installing and Compiling the Sample Code
The sample download for this article contains a VS solution containing 2 projects: StringApplication and FileApplication. These correspond to the applications that were developed in the FP in C# series so-far. To install the applications, simply download it from the ASPToday site, unzip to a folder and you can open it within VS to compile it. Note that the sample download is identical to that for the first article of the series.
The F# Version of the FileApplication
The first major application in this series of articles on Functional Programming in C# was a function which could take a directory tree and perform some action on all the files found in it. To re-write this in F#, I must first of all be able to define the tree data structure. Normally, you would probably model a tree as a list of lists in F#, but I’ll use the safer method of defining a tree data type here. This increases type safety and illustrates an important facet of the language, but it will also lead to slightly more complicated code, although it won’t be complicated compared to the original C# solution.
Defining New Data Types in F#
Let me pause a minute and talk a bit more about how data types can be defined in F#. Modern programming is very data centric, and you will usually start a project by first analyzing the data structures involved. All data structures are defined in the same way in F#. This makes it easier to program in this functional language than in most other .NET languages. To define an enum-like data structure, for instance you write
type weekdays = Monday | Tuesday | Wednesday | Thursday | Friday
| Saturday | Sunday;
You match against these just like you do against any other data, for example:
let Frenchdays day =
match day with Monday -> "lundi" | Tuesday -> "mardi" | Wednesday -> "mercredi" | Thursday -> "jeudi" | Friday -> "vendredi" | Saturday -> "samedi" | Sunday -> "dimanche"
Note that, contrary to C#, such an enum-like data structure does not define a new type. In C# you would have to write weekdays.Monday but F# will automatically infers that Monday refers to the weekdays structure (at least if there are no ambiguities). This is the power of pattern matching for you.
Similarly, a struct-like data structure is defined like this:
type person = {
name : string;
age : int;
height : int;
shoesize: int;
}
Here I do need to explicitly declare the type of a field – there is no way for the compiler to infer what type I have in mind merely by looking at the name. To explicitly declare a type, F# uses a colon after the name followed by the name of the type. You can do this in any declaration, but usually you would prefer to leave that to the automatic type inference engine. You access the fields the same way as you would in C#, i.e. using the ~.‘-operator, e.g. supposing me is a person, I can write me.name and presumably get "Frank Antonsen" as the result. This data type is called a record. But it becomes more interesting when you start adding parameters to some of the types. In C#, you may have heard of nullable types. This is a new feature added to .NET 2.0. The idea is that some data may be either null or have a specific value, typically this could be data read from a SQL data base. The F# analogue of this could for instance be:
type floatNull = Nothing | Just of float
You would use it like this:
let tostring x = match x with
Nothing -> ""
| Just x -> x.ToString()
Here, I used the standard .NET ToString() method just to show you how easy it is to call the familiar .NET methods from F#. Since this is a fairly common thing to want, all ML-languages have a more elegant generic type defined for precisely this. This is known as an option. You specify that a variable is to be an option by adding the keyword option after the type, you could make the shoesize optional by writing shoesize : int option; in the person data structure above. This is a generic data type, formally defined by something like type ‘a option = None | Some of ‘a for any type. So to use this, you would have to replace Nothing with None and Just x with Some x in the function tostring() above.
Just like you would normally put type definitions into interfaces in C#, you would place type definitions in special .fsi (for F# interface) or .mli (for ML interface of course) files. Where you would use inheritance hierarchies of interfaces to provide a hierarchical structure for your own data types in C#, you would use modules in F#. These are the ones you include with the open statement.
Back to the Application – Working with Trees
Now that you know how to define all kinds of data types in F#, I can return to the original problem: writing an application that does something to all files under a given directory. Earlier, I actually gave a "pseudo-executable pseudo-code" version of the list data type, namely:
datatype list of t = null | (elem of t) :: list;
Similarly a binary tree would be defined as:
datatype btree of t = EmptyTree | Node of t * btree of t * btree of t
This means that a binary tree of any data type t, is either the empty tree, Null, or it is a node with an associated value of type t together with the two subtrees. This isn’t valid F# (it is a mixture of standard ML and Haskell), instead I would have to write:
type ‘t btree = EmptyTree | Node of ‘t * ‘t btree * ‘t btree
To turn this into a general tree, I must replace the pair of trees with a list, for example:
type ‘t tree = EmptyTree | Node of ‘t * ‘t tree list
Actually, you could represent a tree as a list of lists, but here I’ll stick to using a specific type. Since F # doesn’t implement all the OOP parts of OCaml, or the more advanced features of standard ML, I will then have to provide explicit implementations. In this respect, the C# implementation was perhaps more elegant than the pure F# one. But, as you will see, F# provides some extra benefits. The perfect solution would probably be a mixture of the two. Actually, the F# language is rapidly growing and new features are added to it on a regular basis, so by the time you read this, F# may very well have already implemented all of these missing features.
The two most important utilities from the C# file application project were the two population methods FileList() and DirTree(), returning a list of files and a directory tree respectively. Since the standard .NET library already comes with nice functions for obtaining a list of files or subdirectories, I’ll use these. This is also used to show you how seamlessly F# integrates into the rest of the .NET world. The code is as follows: (if you are using .NET v1.1, you will need to replace the Array with CompatArray):
open System.IO
let FileList dir = let files = Directory.GetFiles(dir) in Array.to_list files
This invokes the standard GetFiles() method, returning a string[], which is then converted to a F# list of strings. To write the DirTree() function, I need to be able to append a list to a tree as a list of children to a given node and attach a whole new subtree. The easiest way to do this is like this:
///<summary>
///basic tree manipulation functions
///</summary>
let lappend1 x tr = match tr with EmptyTree -> Node (x, []) | Node(x1,x2) -> Node (x1, Node(x, []) :: x2)
let lappend tr1 tr2 = match tr2 with EmptyTree -> tr1 | Node(x1,x2) -> Node (x1, tr1 :: x2)
let rec lappend_list lst tr = match lst with [] -> tr | x1 :: x2 -> lappend_list x2 (lappend1 ×1 tr)
let rec add_children lst tr = match lst with [] -> tr | x1 :: x2 -> add_children x2 (lappend x1 tr)
Notice, by the way, that F# supports XMLDocs just like C#.
There are (at least) four different ways of appending to a tree:
- Append a single value as a leaf
- Append an entire tree as a subtree
- Append a list of single values as leaves
- Append a list of trees as children
The four functions above correspond to each of these possibilities. Incidentally, this shows a weakness in pure FP languages: I cannot overload a function to have the same name but taking arguments of different types. In this case OOP polymorphism is superior to FO polymorphism, but you should expect the FP languages to catch up. With this out of the way, the DirTree() function is straightforward to write. It’s a single if statement in F#:
let rec DirTree dir =
if (Directory.Exists(dir)) then
let files = lappend_list (FileList dir) (lappend1 dir EmptyTree);
and subdirs = CompatArray.to_list
(Directory.GetDirectories(dir));
in
add_children (List.map DirTree subdirs) files;
else EmptyTree
Here, I’ve introduced two local variables files (a tree with the path as root and the list of files as children) and subdirs (a list of subdirectories). I could of course instead have put their definitions directly into the body of the function, but this would have made the code a little too terse for my taste. Notice, by the way, the keyword and between the two local definitions – this tells F# that these two definitions belong together in a single statement. Once, you have the basic data types defined and have a small number of general utility functions at your disposal, coding in an FP-style becomes easy, at least for this kind of problems of an inherently recursive nature.
To build the directory tree I just give the path as an argument to DirTree and I get a tree I can traverse or manipulate it as I see fit. In order to be able to do this in a simple manner, I need a higher order function for applying a given function to a recursive data structure. This is rather easy, although there is one subtlety involved. Let me first show you the code, and then address the subtlety:
(* applying a function to a list or a tree *)
let rec apply f lst = match lst with [] -> () | x1 :: x2 -> (f x1; apply f x2)
let rec apply_tree f tr = match tr with EmptyTree -> () | Node(x1,x2) -> (f x1; apply (apply_tree f) x2)
The subtlety involved is the return value in the very first case. Remember that in F# everything must have a value: Well the value () above is the special unit type analogous to C#‘s void. I mentioned this type earlier, but this is the first time I actually need it. As I said, you won’t need this type very often. This situation is special, because I’m really only applying functions for their side effects (for instance printing information) and am not interested in what the function does to the elements of the tree.
Now the fun begins. All I need to do now to make this into a practical application is plug in a function of my own and write apply_tree myfunc (DirTree path), to apply it to all files and subdirectories in the directory specified by the string path.
For instance, to simply list all files and directories below a given path, I can define a new function:
let listing path = apply_tree print_string (DirTree path)
Similarly, to be able to filter all entries, I need a filter function which takes a directory tree and returns a list of elements which match a given predicate. One way of doing this in F# is like this:
let rec filter p tr = match tr with
EmptyTree -> []
| Node(x, children) -> let rec filter2 p lst = match lst with
[] -> []
| x1 :: x2 -> (filter p x1) (filter2 p x2)
in
if p x then x :: (filter2 p children)
else filter2 p children
<p></pre><!-- @END —>
This is a generic function working with a tree of any type, not just a directory trees (which are trees of
strings). Its type is (‘a -> bool) -> ‘a tree -> ‘a list. As I have said previously, in FP you often re-factor common behavior out into a generic function.
Anonymous Functions
Recall, that when I did the original C# version of the file application, I had to define a lot of small helper functions, which were really only used once. For instance, whenever I wanted to pass a function to a higher order function, e.g. to find all files older than a given date, I had to define a function and declare it as a delegate of the proper type before I could pass it as an argument. Most of these functions were probably only ever needed once, so it seems like a waste of time and space (not to mention namespace) to have to define a global function each time. FP languages have always had a simpler solution, which, incidentally, has also finally found its way into .NET v2. I’m referring to the concept of anonymous functions. In computer science literature, these sometimes go under the name of lambda expressions, since this was the name given to them by Church in his original work in mathematical logics. I prefer the less fancy, but more meaningful name "anonymous function", so that’s the name I’m going to use here.
In a nutshell, an anonymous function is a function where you declare the function body but don’t bother give it a name. Hence, it won’t be globally accessible, but will be put up for garbage collection, when it goes out of scope. Let me start with a rather advanced example just to give you an indication of the power of this. Suppose I want to define a new higher order function, compose(), which given two functions, simply returns the function defined by the composition in the usual mathematical sense. Here’s how I would do it in F#
let compose f g = fun x -> g (f x)
Simple, isn’t it? What it says, in more or less plain English, is this: the composite of two functions is the function I get by first applying the one function to the argument, then applying the second function to the result. The fun keyword introduces an anonymous function. Actually, I can re-define any function as an anonymous function by writing let f = fun x -> {function_body}, but it is hard to come up with a convincing case where you would want to do that.
Because combining functions is such a common thing in FP, F# has a number of predefined operators for just that, e.g. my compose function above is the same as the operator >>.
Another interesting usage of anonymous functions is to bind arguments. Suppose you have a function of, say, two variables. This could be a function, let’s call it member(), testing for membership ‘a > ‘a list> bool. You might then want to define a function which tests for membership of a fixed list (this could be a list of important customers or allowed/banned URL’s). In F# it goes like this:
let isImportantCustomer =
fun cust -> member cust ListOfImportantCustomers
let isAllowed = fun url -> member url ListOfAllowedUrls
In the previous article I showed how you could also bind the first element using currying. With anonymous functions you can fix any element. This has shown how you could use an anonymous function as part of a definition. You can also use it as an argument and this is what I want to use it for in this application. The function filter() which I defined earlier, takes a predicate (i.e. a Boolean-valued function of one argument, ‘a -> bool) and a tree. It then creates a list of those elements of the tree for which the predicate is true. This is a prime example of a place where I would like to insert a function on-the-fly so to speak on a "use once and throw away" basis. For instance, say I want to find all files or directories with a name with fewer than 10 characters, then I’d just write:
filter (fun x -> if String.length x < 10 then true else false)
(DirTree path)
And there is no need to define a new global function just for performing this test. Actually, since every expression has a value in F#, I can abbreviate this to:
filter (fun x-> String.length x < 10) (DirTree path)
If you’ve been following the development of LINQ and the proposal for version 3 of the C# language, you will notice that not only is type inference F#-style being ported to C#, but also the functional-style anonymous functions. Version 2 of the .NET framework already brought us type inference for delegates and anonymous delegate definitions, but version 3 promises to go even further in the direction of F#. It will be interesting to see how F# will develop in the mean time.
Working with Classes and Interfaces
Now you’ve seen how to define enum and struct-like data types and how they can even be recursive. The next natural step is to consider classes and interfaces. There are two different cases here really:
- Using classes and interfaces defined in other parts of the .NET framework, for instance written in C# or VB.NET.
- Defining new classes and interfaces in F#, either for use internally in an F# program or for use in other .NET programs as a library.
You have already seen a few examples of how to use methods defined in the general .NET framework in F#. I have for instance used the string’s Length() method and Console.WriteLine(). But these were either static methods or methods on a basic data type (the string). In F#, you generally don’t specify the type of a variable but leave that to the type inference algorithm. This works well for simple types (int, bool, double, string) and for the simple types built on top of these as lists, tuples or records. But how can it work for classes?
This is one of the places where the F# language is developing rapidly, so I will only give a brief overview here. You should check with the release notes of the latest version of F# to see if anything has changed. This section should only be taken as an indication of where the language is heading.
In C++ and C# a class is very similar to a struct (a struct is really just a special case of a class that is allocated on the stack instead of the heap). And, as you have seen, the F# analog of a C# struct is the record. So my first step is going to be modification of those.
Let me extend the simple person record I defined earlier as an example. I want to save the first and last name in different fields and have a method returning the full name. In F#, I have at least two choices. Either I can include a function as a member returning the full name (just like a property in C#) or I can augment a record with such a function. The first go would be like this:
type person = { first: string; last: string; full: unit->string}
But this won’t really work. I would have to give an implementation of the function full() for each instance of the person record. What I want is a function which is shared between all instances, but I can’t provide such a definition as part of the record definition, at least not directly. The way to go is to augment the basic record, without the full() function with such a function. F# has a special key word for this – with. You first define an anonymous record and then augment it with the with-keyword, like this:
type person = { first: string; last: string;}
with member this.full = this.first ^ " " ^ this.last
end
Here the member keyword instructs F# to augment the following function as a member function (i.e. a method) to be called with the standard dot-notation. So, the first part, the part in braces, defines the pure data part, while the with-clause adds a member function or property. Now you’ve seen how to add a function whose implementation is shared among all instances of a record, it is not surprising that defining a class is very similar. To turn the person record above into a class, I can write the following definition:
type person =
class
val first : string
val last : string
new (fst, lst) = { first = fst; last = lst; }
member this.full = this.first ^ " " ^ this.last
end
Thus, a class definition is specified like this:
- I use the class-keyword (hardly surprising)
- Every attribute (member variable) is declared with the val-keyword and the type is given
- A constructor is defined by defining the function new
- Each method is declared using the member keyword
To create an instance of my person class, I do essentially the same as in C#: I call its new method:
let me = new person("Frank", "Antonsen")
do print_string me.full
Interfaces and Inheritance
To turn the person class into an interface, I need to replace the class-keyword with interface – again, no big surprise here – and mark all methods as abstract. This is probably what you would expect. The syntax is of course different from both C# and VB.NET, but by now you should be more accustomed to it. Here is what it looks like in F#:
type IPerson =
interface
abstract full : unit -> string
end
To write a class, NewPerson, that inherits this interface, I simply write:
type NewPerson =
class
val first : string
val last : string
new (fst, lst) = { first = fst; last = lst; }
interface IPerson with
member this.full = this.first ^ " " ^ this.last
end
You can inherit from as many interfaces as you like, just like you can in C#. To inherit from another class the syntax is only slightly different. Instead of writing interface .. with as above, I have to write inherit … . And then I have to mark each function as override much the same way I have to in C#. For instance, if I want to add an age property to my first person class, I can create a new class inheriting from it like this:
type person2 =
class
inherit person
val age: int
new (fst, lst, ag) = { inherit person(fst, lst); age = ag; }
end
Here, I didn’t override the full-method so there was no need for the with-clause. If I want to override a function in a base class, I will have to mark that function as abstract.
You can only inherit from one class in F# (like C#). Notice how concise the F# definitions are when compared to C# or VB.NET.
Accessing this and base
You may have noticed that the word this appears quite often in these definitions. Actually I’ve been cheating a bit. There is no such reserved word in F#. You can use any previously unused variable to denote this, in fact the examples in the HTML pages that come with the F# distribution often use simply x, and the compiler will automatically infer that it is a placeholder for an instance of the class or record you’re defining. If the variable name you decide to use has been used in the same scope, a warning will be returned. This is rather surprising, but it allows you to use any convention you prefer. For instance, if you come from a C++/C#/Java/J# background you can use this (as I do). If you come from the VB world you can use Me instead, while if you happen to come from Smalltalk you can use self. Programming doesn’t get much more flexible than that!
Actually, there is a good reason for this, although it is a rather subtle one. Consider the way you use this in C#. Here it denotes the present of an instance of the class, but it is used as part of the definition of the class, i.e. it is used before the type (the class) is completely known. This means that this is not a normal variable, and in languages with a this-keyword there are different restrictions on its use.
Looking at it in another way, this "just" means that a class definition is a recursive definition. And, as you have seen, F# is very good at handling recursive definitions.
In the last example, where I inherited from a base class, you saw an example of how to call a method in the base class. Inside the constructor, new, for person2, I called the constructor for the base class, person, by writing the function body as { inherit person(fst, lst); age = ag; }. This was okay because I only had a simple place in which to call a method in the base class. But if I need to call base class methods in several different locations, this notation becomes cumbersome. Here, F# provides a convenient short-cut. I can inherit from a base class and assign a name to it, for example base, once and for all in the definition of the inherited class. Neither base nor this are reserved words in F#, so you can use any naming convention you please. The syntax is like this
type NewClass =
class
inherit person as base
…
end
Object Expressions – Classes on the Fly
The last thing about OOP in F# I want to mention here is the concept of object expressions. Essentially, these are anonymous classes allowing you to inline class implementations; very much like you can inline function or delegate definitions with anonymous functions. Obviously, you would normally only want to use this feature for small, light-weight classes.
A straightforward example is implementing a comparer class. The .NET framework provides you with the IComparable and IComparer interfaces which you can use in various sorting methods. These are normally fairly light-weight, and in an FP-world you would have used a comparer-function instead, but .NET is not a pure FP world, so we have to use an object implementing a simple interface. Suppose I want to sort a .NET array of person objects based on the surnames, The IComparer object doing this is:
{new IComparer with Compare(a,b) =
let alast = (unbox a).last and blast = (unbox b).last in
if alast < blast then 1 else if alast > blast then -1 else 0}
Here, the code in braces defines an anonymous object of the IComparer type. This can then be plugged in directly anywhere where an IComparer instance is used. I introduced two local variables, alast and blast, to show you how to incorporate any kind of F# code in such an anonymous object, and secondly, how to unbox in F#. This is important when you want to use C# types from F# or vice-versa. As you have seen, F# is very type-safe, and there are no implicit conversions. Thus no implicit boxing and unboxing either, unless you invoke standard .NET methods taking objects as their argument. This is actually very healthy, it forces you to explicitly tell the compiler that you are aware of losing type safety here by telling it explicitly to box or unbox a variable. This is done with the box and unbox commands, which are part of the F# core. You can think of it as similar to the unsafe construct in C#, which allows you to use older C++ idioms, even though C# is designed to be safer.
Other F# Features
Before I go, I ought to mention that because of space considerations, there are a lot of F# features I didn’t cover. Operator overloading, for instance. In C# you have a very limited list of operators you can overload. In F# you can re-define any operator, and even define new ones yourself. There is also much more to be said about the type system – constraining allowed types in expressions for example.
Furthermore, I didn’t cover mutual recursion. In F# both data types and functions can be mutually recursive, they can be defined in terms of each other. Even classes can be mutually recursive. This is a very powerful feature but you will probably not be using it unless you’re very familiar with FP in general and F# in particular.
Finally, one big omission is how to do GUI programming in F#. Now that you have a fully operative FP language available to you in the .NET platform, you would like to re-use at least the standard classes such as the WinForms. I did touch briefly on this, and although the F# syntax for this is very simple and rather elegant, it is also an area in which the new language is developing, and thus changing, rapidly.
The way you can work with OOP concepts, in particular how you can constrain types is likely to change going forward. This is a related area where a lot has happened in the FP community. Since modern functional languages uses type inference to automatically let the compiler figure out the most general type which can be assigned to an expression, there obviously may be cases where you would like to further constrain the allowed types. This is a bit like the type constraints which have been put into .NET generics, but in FP, all expressions are inherently generic, of course. However, this opens up some very interesting possibilities for multi-paradigm programming where you program part of an application in C# (or VB.NET for that matter), for instance the core GUI layer, while you program other parts in F#, for example a textual user interface (TUI) or some kind of artificial intelligence (AI) layer. The entire subject of FP-OOP interoperability is fascinating, and you can already begin to find examples of F#-C# interoperability on the F# wikis and newsgroups.
Conclusion
This was a short tour of F#. I showed you how to rewrite the C# applications discussed in earlier articles in F#. To do this, I used many of the new language’s features, such as local functions, guarded expressions, tuples and user defined data types. You also saw some of the built-in functions and how they can help you re-write your code. I also showed you how to define anonymous functions and object expressions, which are essentially anonymous classes.
All of this has been about compiling F# programs, but there is much more to the language than just a compiler. It also comes with an interactive interpreter, fsi.exe, which I mentioned in the first F# article, as well as its own profiling tools. Speaking of profiling, I haven’t covered how F# compares to C# when it comes down to raw performance, but the language is very, very fast, as you will no doubt discover this for yourself.
This two part series has hopefully given you a good idea of what F# is like and how to use it for solving the problems you face in your daily programming. If you look around at the source code distributed together with the F# compiler, you will find many very advanced examples as well. These are very interesting, because many of these demonstrate problems for which the corresponding code in C# would be very hard to write. It’s also worth mentioning that F# and ML in general, have their roots in compiler construction, so it is not surprising that the language is particularly powerful when you write parsers.

