Tuesday, August 18, 2009

erlang: testing many conditions easily with lists of funs...

Sometimes you have to test many things before being able to choose the next action...

In many languages, you'll end up using a bunch of "if then else".
But in Erlang, with the power of funs, you can write one small function that does the whole job for you :p

Here is our goal: try many functions on one argument until one of them gives an answer.
For this example, we need to determine a file's type from its filename.

Let's say that:
- the filename could be a valid 'word' temporary file,
- or an 'excel' temporary file, or
- a known file type.

First, let's define a simple function that takes a list of funs and stops evaluating them as soon as one returns a result:
% the simple case where the list is empty
any(_, []) -> undefined;

% the general case when the list contains funs...
any(Arg, [ {F, PrepareFun} | Funs] ) ->
        case F( PrepareFun(Arg) ) of
                undefined ->
                        any(Arg, Funs);

                V ->
                        V
        end.
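
To see the early exit in action, here is a minimal sketch that wraps 'any/2' in a module and runs it on a few toy checks. The module name 'any_demo', the 'demo/0' function and the three checks are made up for illustration; only 'any/2' comes from the post.

```erlang
-module(any_demo).
-export([any/2, demo/0]).

%% any/2 as defined in the post.
any(_, []) -> undefined;
any(Arg, [{F, PrepareFun} | Funs]) ->
    case F(PrepareFun(Arg)) of
        undefined -> any(Arg, Funs);
        V -> V
    end.

%% Hypothetical checks on integers, for illustration only.
demo() ->
    Identity = fun(X) -> X end,
    Checks = [
        {fun(N) when N < 0 -> negative; (_) -> undefined end, Identity},
        {fun(0) -> zero; (_) -> undefined end, Identity},
        %% This check sees the argument multiplied by 10.
        {fun(N) when N > 100 -> big; (_) -> undefined end, fun(N) -> N * 10 end}
    ],
    [any(-5, Checks), any(0, Checks), any(20, Checks), any(3, Checks)].
```

Calling any_demo:demo() should give [negative, zero, big, undefined]: the first check answers for -5, the second for 0, the third for 20 (because it sees 200), and nothing answers for 3.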

In this code you'll notice that each element of the list carries two funs:
- 'F', the test itself,
- 'PrepareFun', which transforms the argument first.
The idea is that 'PrepareFun' is applied to 'Arg' before 'F' is called, so each test receives the argument in the form it expects.
For instance, sometimes you need to extract the basename from the filename, or whatever else...

The code is a simple list iteration that recurses only if the result of the function call is 'undefined', so 'undefined' is reserved as the "no answer" value.

Now that we have a function that walks a list of funs and stops as soon as one returns a result (or the list is exhausted), let's get back to our example and build our 'fileType' function:
fileType(File) ->
        any( File, [ 
                        {fun word_temp/1, fun filename:basename/1}, 
                        {fun db_extension/1, fun lists:reverse/1},
                        {fun excel_temp/1, fun filename:basename/1}
                ]).

You can read this code like this:
"any of the funs from the list may determine the type of the file".
As soon as one of them does, evaluation stops; the funs are tried in list order.

Let's look at the functions being called: 'db_extension/1', 'word_temp/1', 'excel_temp/1'...

First 'db_extension':
We only test the end of the filename. Pattern matching with '++' works on the start of a string, so the filename is
reversed before being passed to the function, and each extension is written backwards ("pmt." is ".tmp"):
db_extension( "pmt."  ++ _ ) -> temp;
db_extension( "PMT."  ++ _ ) -> temp;
db_extension( "xcod." ++ _ ) -> doc;
db_extension( "cod."  ++ _ ) -> doc;
db_extension( "xslx." ++ _ ) -> xls;
db_extension( "slx."  ++ _ ) -> xls;
db_extension( "xtpp." ++ _ ) -> ppt;
db_extension( "tpp."  ++ _ ) -> ppt;
db_extension( _ ) -> undefined.
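
As a quick check of the reversal trick, here is a small sketch with a trimmed-down copy of 'db_extension/1'. The module name 'ext_demo', the 'demo/0' function and the two sample paths are assumptions made for the example.

```erlang
-module(ext_demo).
-export([demo/0]).

%% Trimmed-down copy of db_extension/1 from the post.
db_extension("pmt."  ++ _) -> temp;
db_extension("xcod." ++ _) -> doc;
db_extension(_) -> undefined.

demo() ->
    %% Hypothetical path; reversing it puts the extension at the front.
    Reversed = lists:reverse("/tmp/report.docx"),
    {Reversed,
     db_extension(Reversed),
     db_extension(lists:reverse("notes.txt"))}.
```

ext_demo:demo() should return {"xcod.troper/pmt/", doc, undefined}. Note that the match is case sensitive, which is presumably why the post lists both "pmt." and "PMT.".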


'word_temp/1' only looks at the start of the file's name, so it needs the basename rather than the full path; 'PrepareFun' is simply 'filename:basename/1' in this case:
word_temp( "~$"   ++ _) -> temp;
word_temp( "~WRD" ++ _) -> temp;
word_temp( "~WRL" ++ _) -> temp;
word_temp( _ ) -> undefined.
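
To show why the basename matters here, this sketch calls 'word_temp/1' on a full path and then on its basename. The module name 'word_demo', 'demo/0' and the sample path are made up for the example.

```erlang
-module(word_demo).
-export([demo/0]).

%% word_temp/1 as defined in the post.
word_temp("~$"   ++ _) -> temp;
word_temp("~WRD" ++ _) -> temp;
word_temp("~WRL" ++ _) -> temp;
word_temp(_) -> undefined.

demo() ->
    %% Hypothetical path to a Word lock file.
    Path = "/home/user/docs/~$report.docx",
    Base = filename:basename(Path),
    {word_temp(Path), Base, word_temp(Base)}.
```

word_demo:demo() should return {undefined, "~$report.docx", temp}: the full path starts with "/", so the prefix patterns only match once the directory part is stripped.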


For 'excel_temp/1', the temp file name is a number written as 8 hexadecimal digits. We use the re module to match this against the filename. In this case 'PrepareFun' is also 'filename:basename/1':
excel_temp( File ) ->
        % 8 uppercase hexadecimal digits, nothing else
        ReList = [ <<"^[0-9A-F]{8}$">> ],
        do_re(File, ReList).
        
% We could test several regular expressions, but in this specific
% case the list contains only one element...
do_re(_, []) -> undefined;
do_re(Subject, [ Re | Rest ]) ->
        case re:run(Subject, Re, [{capture,none}]) of
                nomatch ->
                        do_re(Subject, Rest);

                match ->
                        temp
        end.

The re option {capture, none} makes re:run/3 return only 'match' or 'nomatch', without the positions of the part that matched...
(this is a simple optimisation, since we don't care about the matching part)
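
Here is a small sketch comparing re:run/2 with its default capture and re:run/3 with {capture, none}. The module name 're_demo', 'demo/0' and the sample file names are assumptions; the pattern is written for eight uppercase hexadecimal digits, as described above.

```erlang
-module(re_demo).
-export([demo/0]).

demo() ->
    %% Eight uppercase hexadecimal digits, anchored at both ends.
    Re = <<"^[0-9A-F]{8}$">>,
    {re:run("3F2A90BC", Re),                        % default: match positions
     re:run("3F2A90BC", Re, [{capture, none}]),     % just match / nomatch
     re:run("report.xls", Re, [{capture, none}])}.
```

re_demo:demo() should return {{match,[{0,8}]}, match, nomatch}. With the default options you get the offset and length of the match; with {capture, none} you get the bare atom, which is all 'do_re/2' needs.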

If we look back at what we've done here, we can see that
fileType(File) ->
        any( File, [ 
                        {fun word_temp/1, fun filename:basename/1}, 
                        {fun db_extension/1, fun lists:reverse/1},
                        {fun excel_temp/1, fun filename:basename/1}
                ]).

can really easily be extended with other functions, as long as each new function takes a single parameter and returns 'undefined' when it has no answer...
fileType(File) ->
        any( File, [ 
                        {fun word_temp/1, fun filename:basename/1}, 
                        {fun db_extension/1, fun lists:reverse/1},
                        {fun excel_temp/1, fun filename:basename/1},
                        {fun firefox_temp/1, fun filename:basename/1},
                        {fun directory_temp/1, fun(X) -> X end}
                ]).
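
As a rough sketch of what adding a check involves, the module below plugs one extra function into the list. 'backup_file/1' is a hypothetical check invented for this example (editor backup files whose name ends in "~"); it is not one of the functions from the post, and 'word_temp/1' is cut down to a single clause to keep the example short.

```erlang
-module(extend_demo).
-export([file_type/1]).

%% any/2 as defined in the post.
any(_, []) -> undefined;
any(Arg, [{F, PrepareFun} | Funs]) ->
    case F(PrepareFun(Arg)) of
        undefined -> any(Arg, Funs);
        V -> V
    end.

%% Cut-down version of word_temp/1.
word_temp("~$" ++ _) -> temp;
word_temp(_) -> undefined.

%% Hypothetical extra check: expects the reversed filename,
%% so a trailing "~" shows up as the first character.
backup_file("~" ++ _) -> temp;
backup_file(_) -> undefined.

file_type(File) ->
    any(File, [
        {fun word_temp/1, fun filename:basename/1},
        {fun backup_file/1, fun lists:reverse/1}
    ]).
```

With this, extend_demo:file_type("/tmp/notes.txt~") should return temp and extend_demo:file_type("/tmp/notes.txt") should return undefined. The new check costs one function and one line in the list; 'any/2' itself does not change.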

Conclusion:
Building a list of functions is an efficient way of "testing many conditions".

1 comment:

Unknown said...

Good Stuff!

Want an Erlang Syntax Highlighter for your blog? Try this:
http://jldupont.blogspot.com/2009/06/erlang-syntax-highlighter.html
