Wednesday, July 9, 2008

Dealing with multiple pages of results easily

We you want to deal with multiples pages of responses and using MySQL, you are always dealing with some sort of "LIMIT" statement. Since erlang is a vm, you can efficiently retrieving results while the user is not asking you to... You can transparently in the background prefetch data for him. This is what I call a pager.

I've build simple module to that for me :

-module(db_pager).
-export([retrieve/1, loop/2]).

new(Request, Start, Step) ->
Prepared =
Pid = spawn(?MODULE, loop, [Start, Step]),
retrieve(Pid),

retrieve(Pid) ->
io:format("DEBUG: ~p ! {next, ~p}~n", [ Pid, self() ]),
Pid ! { next, self() },
receive
{eof, Msg} ->
io:format("eof: ~p~n", [Msg]),
{eof, Msg};

{data, Data} ->
io:format("Received: ~p~n", [Data]),
timer:sleep(3000),
retrieve(Pid);

stop ->
stop;

_E ->
io:format("Unhandled: ~p~n", [_E]),
client(Pid)
after 2000 ->
timeout
end.

select(Fields, Table) ->
[ Head | Rest ] = Fields,
[ <<"select ">>, atom_to_list(Head),
lists:foldl(fun(X, Acc) ->
[ [ <<",">> | atom_to_list(X) ] | Acc ]
end, [], Rest),
<<" from ">>, atom_to_list(Table) ].

sql(Current, Step) ->
Query = select([id,lastname,firstname], users),
Next = Current + Step,
case mysql:fetch(mysql, [
Query, <<" where id between ">>,
integer_to_list(Current), <<" and ">>, integer_to_list(Next) ]) of

{data, Data} ->
case mysql:get_result_rows(Data) of
[] ->
{eof, empty};

Res ->
%io:format("DEBUG: sql: data: ~p~n", [Data]),
{data, Res}
end;

{error, Data} ->
io:format("DEBUG: sql: data: ~p~n", [Data]),
{eof, mysql:get_result_reason(Data)}
end.

loop(Current, Step) ->
loop(Current, Step, []).

loop(Current, Step, []) ->
Res = sql(Current, Step),
loop(Current + Step, Step, Res);

loop(Current, Step, Res) ->
Next = Current + Step,
receive
{next, Who} ->
Who ! Res,
NewRes = sql(Current, Step),
loop(Next, Step, NewRes);

stop ->
stop

after 30000 ->
timeout
end.



This code build a sql query that contains some "BETWEEN" statement for some "id" field:

Query = select([id,lastname,firstname], users),
Next = Current + Step,
case mysql:fetch(mysql, [
Query, <<" where id between ">>,
integer_to_list(Current), <<" and ">>, integer_to_list(Next) ]) of


This part specifically builds a string like this:

select id,lastname,firstname where id between X and Y

Where X and Y are variable, X stands for the Current offset and Y the Next offset.

When the code is called multiple times the X and Y are automatically computed, so every row from the table will be returned one page (Y - X elems) at a time. The "loop/1" function is here for that:

loop(Current, Step, Res) ->
Next = Current + Step,
receive
{next, Who} ->
Who ! Res,
NewRes = sql(Current, Step),
loop(Next, Step, NewRes);

stop ->
stop

after 30000 ->
timeout
end.

The receiver Who will have informations one page at a time. The data sent to him is fetched before he needs to, except the first time:

loop(Current, Step, []) ->
Res = sql(Current, Step),
loop(Current + Step, Step, Res);

4 comments:

Unknown said...

What is client(Pid) in retrieve/1 supposed to call?

Antoine said...

Arg, you're right i've not pasted the "client/1" fun :), I reedit the post and fix this...

Andrew Dashin said...

Hi!

This is a bad idea to use BETWEEN :) For example you even can't be sure how many records do exist between 1 and 10.

Antoine said...

I see, you're right, I will fix and use Limit.

Sticky