EazyErl !: Monitoring your servers with sysstat (sar)

Some things are so helpful that you assume everyone knows about them, but that's not always the case. Here I'll talk about a little package that is so powerful and efficient that you won't want to switch to anything else...
Taken from the Ubuntu man page:


DESCRIPTION
       The sar command writes to standard output the contents of  selected  cumula-
       tive activity counters in the operating system. The accounting system, based
       on the values in the count and interval parameters, writes  information  the
       specified  number of times spaced at the specified intervals in seconds.  If
       the interval parameter is set to zero, the sar command displays the  average
       statistics  for the time since the system was started. The default value for
       the count parameter is 1. If its value is set to zero, then reports are gen-
       erated continuously.  The collected data can also be saved in the file spec-
       ified by the -o filename flag, in  addition  to  being  displayed  onto  the
       screen.  If filename is omitted, sar uses the standard system activity daily
       data file, the /var/log/sysstat/sadd file, where the dd parameter  indicates
       the  current  day.   By  default  all the data available from the kernel are
       saved in the data file. Exceptions are interrupts and disks data, for  which
       the  relevant  options  must  be explicitly passed to sar (or to its backend
       sadc ) when the data file is created (see options below).

"sar" comes with the sysstat package. Once it's installed you can monitor your server like never before...

Here's the description of the sysstat package from the author:

The sysstat utilities are a collection of performance monitoring tools for Linux. 
These include sar, sadf, mpstat, iostat, pidstat and sa tools. Go to the Features page to display 
a list of sysstat's features, or see the Documentation page to learn some more about them.

For example, you can watch network usage in real time:


# sar -n DEV 1 0
Linux 2.6.22-15-generic (xXxXx)  07/09/2008

11:26:36 AM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
11:26:37 AM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
11:26:37 AM      eth0      5.05      0.00      0.86      0.00      0.00      0.00      0.00

11:26:37 AM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
11:26:38 AM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
11:26:38 AM      eth0      4.00      0.00      0.45      0.00      0.00      0.00      0.00
...

Today, I'll introduce the erlang-sar package that's able to retrieve information from the sar command.

The application is composed of a collector "sar_collect", a helper module "sar_values" and the main module "sar".
Here comes a quick sample session:


% Starting the collector
sar_collect:start().

% Retrieving the data
sar:stats(cpu).
[{cpu,idle,<<"98.62">>},
 {cpu,steal,<<"0.00">>},
 {cpu,iowait,<<"0.00">>},
 {cpu,system,<<"0.18">>},
 {cpu,nice,<<"0.00">>},
 {cpu,user,<<"1.20">>}]

% Retrieving more data
sar:stats([cpu,mem]).
[{swap,swpcad,<<"33236">>},
 {swap,usage,<<"64.72">>},
 {swap,used,<<"389872">>},
 {swap,free,<<"212492">>},
 {mem,kbcached,<<"84496">>},
 {mem,kbbuffers,<<"63408">>},
 {mem,memused,<<"98.78">>},
 {mem,kbmemused,<<"508984">>},
 {mem,kbmemfree,<<"6308">>},
 {cpu,idle,<<"97.83">>},
 {cpu,steal,<<"0.00">>},
 {cpu,iowait,<<"0.75">>},
 {cpu,system,<<"0.20">>},
 {cpu,nice,<<"0.00">>},
 {cpu,user,<<"1.22">>}]

The module "sar_values" also exports an "extractor" function that can be used to build fun()s:


% build a Mem fun()
Mem = sar_values:extractor(mem).

% Calling Mem fun() on sar:stats()
Mem(sar:stats([cpu,mem])).
[{kbcached,<<"84496">>},
 {kbbuffers,<<"63480">>},
 {memused,<<"98.77">>},
 {kbmemused,<<"508976">>},
 {kbmemfree,<<"6316">>}]

% Calling it on sar:stats()
Mem(sar:stats()).
[{kbcached,<<"84496">>},
 {kbbuffers,<<"63520">>},
 {memused,<<"98.80">>},
 {kbmemused,<<"509100">>},
 {kbmemfree,<<"6192">>}]
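
Note that every value comes back as a binary. If you need real numbers (to compare against a threshold, for example), a small helper can convert them. Here's a minimal sketch; the module name "sar_num" and its functions are my own illustration and are not part of erlang-sar:

```erlang
-module(sar_num).
-export([to_number/1, to_numbers/1]).

%% Convert a binary such as <<"98.62">> or <<"6308">> to a float or
%% an integer. Anything that is not fully numeric is returned unchanged.
to_number(Bin) when is_binary(Bin) ->
    Str = binary_to_list(Bin),
    case string:to_float(Str) of
        {F, []} -> F;
        _ ->
            case string:to_integer(Str) of
                {I, []} -> I;
                _ -> Bin
            end
    end.

%% Convert the {Key, Value} pairs produced by an extractor fun().
to_numbers(Pairs) ->
    [{Key, to_number(Val)} || {Key, Val} <- Pairs].
```

With the Mem fun() from above, this could be called as "sar_num:to_numbers(Mem(sar:stats(mem)))".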

With this package you have access to all the data sar can export for you.
Here's the "sar.erl" code:


-module(sar).

-export([systat/0, stats/0, stats/1, option/1]).
-export([extract/1]).
-define(OPTIONS, "-u -r -v -c -q -n DEV").
-define(DATA, "/tmp/last").

systat() ->
 Cmd = "sadf " ++ ?DATA ++ " -- " ++ ?OPTIONS,
 execute(".", Cmd).

stats() ->
 Cmd = "sadf " ++ ?DATA ++ " -- " ++ ?OPTIONS,
 {ok, _, Bin} = execute(".", Cmd),
 extract(Bin).

stats(List) when is_list(List) ->
 Args = lists:foldl(fun(X, Acc) -> case option(X) of
     error ->
      Acc;
     T ->
      [ $ , T | Acc ]
    end end, [], List),
 Cmd = "sadf " ++ ?DATA ++ " -- " ++ lists:reverse(Args),
 {ok, _, Bin} = execute(".", lists:flatten(Cmd)),
 extract(Bin);

stats(Elem) ->
 stats([Elem]).

option(cpu) ->
 "-u";
option(disk) ->
 "-d";
option(sock) ->
 "-n SOCK";
option(eth0) ->
 "-n DEV";
option(eth1) ->
 "-n DEV";
option(eth2) ->
 "-n DEV";
option(proc) ->
 "-c";
option(run) ->
 "-q";
option(mem) ->
 "-r";
option(inode) ->
 "-v";
option(switch) ->
 "-w";
option(swaping) ->
 "-W";
option(_) ->
 error.

execute(_Host, Cmd) ->
 Port = open_port({spawn, Cmd}, [ exit_status, binary ] ),
 wait(Port, []).

wait(Port, Content) ->
 receive 
  {Port, {data, BinData}} ->
   %error_logger:info_msg("dump:~n~p~n", [BinData]),
   % chunks are accumulated newest first
   NewContent = [ BinData | Content ],
   wait(Port, NewContent);

  {Port, {exit_status, Status}} ->
   %error_logger:info_msg("exit_code: ~p~n", [Status]),
   % put the chunks back in arrival order before returning them
   {ok, Status, lists:reverse(Content)};

  {Port, eof} ->
   %error_logger:info_msg("Port closed"),
   port_close(Port),
   {ok, eof, lists:reverse(Content)};

  {Port, exit} ->
   error_logger:info_msg("Received : ~p~n", [Port]),
   lists:reverse(Content)
 end.

extract(Bin) ->
 sar_values:extract(iolist_to_binary(Bin)).

You can see the "option/1" function that lets you convert atoms into command line arguments easily. I also use this function to test whether sar is able to handle a specific parameter. For example, with the help of my webservice, I can query remote stats easily:

http://monitoring.lan/stats/q/cpu/servername
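
The check mentioned above (using "option/1" to test whether sar can handle a parameter) could look something like this. It's only a sketch: the module "sar_check" and the function "split/2" are hypothetical names, and the option function is passed in as an argument so that the example stands on its own:

```erlang
-module(sar_check).
-export([split/2]).

%% Split Names into {Supported, Unsupported}. A name counts as supported
%% when OptionFun returns anything other than the atom 'error'.
split(Names, OptionFun) when is_list(Names), is_function(OptionFun, 1) ->
    lists:partition(fun(Name) -> OptionFun(Name) =/= error end, Names).
```

With the "sar" module loaded, "sar_check:split([cpu, mem, bogus], fun sar:option/1)" would separate the names sar knows about from the ones it doesn't, so a request for an unknown statistic can be rejected before any command is run.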

Here's the "sar_collect" module:


-module(sar_collect).

% systat/0 must be exported because start/0 spawns it via ?MODULE
-export([systat/0, systat/1, start/0, start/1]).

start() ->
        spawn(?MODULE, systat, []).

start(Seconds) ->
        spawn(?MODULE, systat, [Seconds]).

% update the file every second        
systat(0) -> 
        loop(1);

systat(Seconds) ->
        loop(Seconds).

%update the file every 59 seconds
systat() ->
        loop(59).

loop(Seconds) when Seconds < 60 ->
 Cmd = lists:flatten([ "sar -o /tmp/last.tmp ", integer_to_list(Seconds),  " 1" ]),
 execute(".", Cmd),
 file:rename("/tmp/last.tmp", "/tmp/last"),
 % note: timer:sleep/1 takes milliseconds, so this pause is very short
 timer:sleep(60 - Seconds),
 receive
  stop ->
   exit(normal);

  {interval, NewSeconds} ->
   loop(NewSeconds);

  _A ->
   loop(Seconds)

 after 0 ->
  loop(Seconds) 

 end;

%default update 20 seconds (arbitrarily chosen)
loop(_Seconds) ->
 loop(20).

execute(Host, Cmd) ->
 Port = open_port({spawn, Cmd}, [ {cd, Host}, exit_status, binary ] ),
 wait(Port, []).

wait(Port, Content) ->
 receive 
  {Port, {data, _BinData}} ->
   wait(Port, Content);

  {Port, {exit_status, _Status}} ->
   ok;

  {Port, eof} ->
   port_close(Port),
   Content;

  {Port, exit} ->
   error_logger:info_msg("Received : ~p~n", [Port]),
   Content
 end.

Finally there is the "sar_values" source code:


-module(sar_values).

-export([extract/1, extractor/1, sort/1]).
-export([parse/1, parse_value/2]).

extract(Bin) ->
 extract(Bin, []).

extract(Bin, Stats) ->
 case parse(Bin) of
  {Class, Type, Rest} ->
   %io:format("~p.~p", [Class, Type]),
   case  parse_value(Rest, <<>>) of
    {more, Value, More} ->
     NewStats = [ {Class, Type, Value} | Stats ],
     extract(More, NewStats);

    {eof, Value} ->
     NewStats = [ {Class, Type, Value} | Stats ],
     NewStats 
   end;

  eof -> 
   Stats 
 end.

parse(<<"%user", Rest/binary >>) -> {cpu, user, Rest};
parse(<<"%nice", Rest/binary>>) -> {cpu, nice, Rest};
parse(<<"%system", Rest/binary>>) -> {cpu, system, Rest};
parse(<<"%iowait", Rest/binary>>) -> {cpu, iowait, Rest};
parse(<<"%steal", Rest/binary>>) ->  {cpu, steal, Rest};
parse(<<"%idle", Rest/binary>>) -> {cpu, idle, Rest};

parse(<<"kbmemfree", Rest/binary>>) -> {mem, kbmemfree, Rest};
parse(<<"kbmemused", Rest/binary>>) -> {mem, kbmemused, Rest};
parse(<<"%memused", Rest/binary>>) -> {mem, memused, Rest};
parse(<<"kbbuffers", Rest/binary>>) -> {mem, kbbuffers, Rest};
parse(<<"kbcached", Rest/binary>>) -> {mem, kbcached, Rest};

parse(<<"kbswpfree", Rest/binary>>) -> {swap, free, Rest};
parse(<<"kbswpused", Rest/binary>>) -> {swap, used, Rest};
parse(<<"%swpused", Rest/binary>>) -> {swap, usage, Rest};
parse(<<"kbswpcad", Rest/binary>>) -> {swap, swpcad, Rest};

parse(<<"dentunusd", Rest/binary>>) -> {inode, dentryunused, Rest};
parse(<<"file-sz", Rest/binary>>) -> {inode, fileopened, Rest};
parse(<<"inode-sz", Rest/binary>>) -> {inode, inodes, Rest};
parse(<<"super-sz", Rest/binary>>) -> {inode, super, Rest};
parse(<<"%super-sz", Rest/binary>>) -> {inode, superusage, Rest};
parse(<<"dquot-sz", Rest/binary>>) -> {inode, dquotsz, Rest};
parse(<<"%dquot-sz", Rest/binary>>) -> {inode, dquotszusage, Rest};
parse(<<"rtsig-sz", Rest/binary>>) -> {rtsig, count , Rest};
parse(<<"%rtsig-sz", Rest/binary>>) -> {rtsig, usage, Rest};

parse(<<"totsck", Rest/binary>>) -> {sock, total, Rest};
parse(<<"tcpsck", Rest/binary>>) -> {sock, tcp, Rest};
parse(<<"udpsck", Rest/binary>>) -> {sock, udp, Rest};
parse(<<"rawsck", Rest/binary>>) -> {sock, raw, Rest};
parse(<<"ip-frag", Rest/binary>>) -> {sock, ipfrag, Rest};

parse(<<"runq-sz", Rest/binary>>) -> {procs, running, Rest};
parse(<<"plist-sz", Rest/binary>>) -> {procs, total, Rest};

parse(<<"ldavg-15", Rest/binary>>) -> {load, min15, Rest};
parse(<<"ldavg-1", Rest/binary>>) -> {load, min1, Rest};
parse(<<"ldavg-5", Rest/binary>>) -> {load, min5, Rest};

parse(<<"pswpin/s", Rest/binary>>) -> {swaping, pswpin, Rest};
parse(<<"pswpout/s", Rest/binary>>) -> {swaping, pswpout, Rest};

parse(<<"l0", Rest/binary>>)   -> parsebis(Rest, l0);
parse(<<"eth0", Rest/binary>>) -> parsebis(Rest, eth0);
parse(<<"eth1", Rest/binary>>) -> parsebis(Rest, eth1);
parse(<<"eth2", Rest/binary>>) -> parsebis(Rest, eth2);

parse(<<>>) -> eof;
 
parse(Bin) ->
 {_, Next} = split_binary(Bin, 1), 
 parse(Next).

parsebis(<<"rxpck/s", Rest/binary>>,  Category) -> {Category, rxpck, Rest};
parsebis(<<"txpck/s", Rest/binary>>,  Category) -> {Category, txpck, Rest};
parsebis(<<"rxbyt/s", Rest/binary>>,  Category) -> {Category, rxbyt, Rest};
parsebis(<<"txbyt/s", Rest/binary>>,  Category) -> {Category, txbyt, Rest};
parsebis(<<"rxcmp/s", Rest/binary>>,  Category) -> {Category, rxcmp, Rest};
parsebis(<<"txcmp/s", Rest/binary>>,  Category) -> {Category, txcmp, Rest};
parsebis(<<"rxmcst/s", Rest/binary>>, Category) -> {Category, rxmcst, Rest};
parsebis(Bin, Category) -> 
 {_, Next} = split_binary(Bin, 1), 
 parsebis(Next, Category).

% skip tabs and spaces
parse_value(<<$\t, Rest/binary>>, Value) ->
 parse_value(Rest, Value);
parse_value(<<$ , Rest/binary>>, Value) ->
 parse_value(Rest, Value);

% a newline ends the current value
parse_value(<<$\n, Rest/binary>>, Value) ->
 {more, Value, Rest};

parse_value(<<>>, Value) ->
 {eof, Value};

parse_value(Bin, Value) ->
 {H, Next} = split_binary(Bin, 1),
 parse_value(Next, iolist_to_binary([Value, H])).

extractor(Motif) ->
 fun(L) when is_list(L) ->
  [ {Y, Z} || {X, Y, Z} <- L, X == Motif]
 end.

sort(List) ->
 lists:sort( fun({X, _V}, {Y, _W}) when X < Y -> 
   true; 
   (_A, _B) -> false 
  end, List).

Now that Erlang is at R12B, I'm not so sure that this "binary parsing code" is really as efficient as it could be...
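
For what it's worth, here is one way the value parsing might be written on a newer Erlang/OTP release. This is a sketch under assumptions, not a benchmarked replacement: it relies on the "binary" module, which as far as I know only arrived after R12B, and the module name "sar_line" and the function "value/1" are made up for illustration. It is meant to return the same tuples as "parse_value/2", but takes the whole line at once instead of rebuilding the value one byte at a time:

```erlang
-module(sar_line).
-export([value/1]).

%% Read the value up to the next newline and drop spaces and tabs,
%% returning {more, Value, Rest} or {eof, Value}.
value(Bin) when is_binary(Bin) ->
    case binary:split(Bin, <<"\n">>) of
        [Line, Rest] -> {more, clean(Line), Rest};
        [Line]       -> {eof, clean(Line)}
    end.

%% Remove every space and tab, as parse_value/2 does while scanning.
clean(Line) ->
    binary:replace(Line, [<<" ">>, <<"\t">>], <<>>, [global]).
```

Whether this is actually faster would have to be measured on real sadf output.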

2 comments:

nem said... (July 10, 2008 at 11:07 PM):
Hi there - this looks really useful, maybe akin to or better than os_mon for platforms with sar available.

Is this code available from a source control repository anywhere?

Antoine said... (July 11, 2008 at 12:58 AM):
I'll put it on Google, I think. Or somewhere else, really soon.

Wednesday, July 9, 2008
