
Ed's short guide on utmp(x)
A while ago I spent some time working with utmp(x) and
noticed that many applications and programmers make wrong assumptions about
what it is, how it works and how to write applications that use it.
That is why I decided to write this article, which will
hopefully be of some use to people who want to use this interface.
First of all, a small introduction. My name is Ed Schouten and I am a developer at The FreeBSD Project. I have been a user
since 2005 and a developer since May 2008. In the last couple of
years I've been working on FreeBSD's TTY layer, the terminal
emulator for the console, our utmpx implementation and many other
things.
What is utmp(x)?
Many operating systems provide facilities to log user
login sessions. In many cases there are three types of queries that
are of interest:
- Which users are currently logged in?
- Who logged in over the last month?
- How long ago did a certain user log in?
In order to answer all these queries, most UNIX-like operating
systems store login records in three databases, namely:
- utmp(x), which holds entries for currently active user
login sessions,
- wtmp(x), where all log entries are simply appended to
the end of the file, which may be rotated depending on the
operating system you use, and
- lastlog(x), which holds the last login record for each
user and may or may not be available, depending on the
operating system you use.
The utmp(x) and wtmp(x) databases hold various types of records, namely
user logins and user logouts, but also changes to the system clock and
system reboots. These events need to be stored as well, since they
must be taken into account when calculating the length of a user
login session.
Some history
In order to understand why things work the way they do, it helps
to know some of the history behind utmp and utmpx.
In the beginning...
A long time ago, many UNIX-like operating systems had a
header file called <utmp.h>. This header file basically
had the following contents, although the actual structure differs between
implementations.
struct utmp {
char ut_line[];
char ut_name[];
char ut_host[];
time_t ut_time;
};
This structure defines the layout of a single entry in the utmp
and wtmp databases, which were often stored at /etc/utmp or /var/run/utmp and /var/log/wtmp.
Applications that wanted to update records in the utmp database
had to know at which location in the file they had to write the new
record. This offset was often based on the name of the TTY. The ttyslot()
function was often provided to obtain this offset. It typically
returned the line number at which the TTY was listed in /etc/ttys.
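The slot-based indexing described above can be sketched as simple arithmetic. This is only an illustration: the structure name `old_utmp`, its field sizes and the helper `old_utmp_offset()` are made up for this example and do not match any particular system.

```c
#include <sys/types.h>
#include <time.h>

/* Hypothetical legacy record; the field sizes are invented for illustration. */
struct old_utmp {
    char   ut_line[8];
    char   ut_name[16];
    char   ut_host[16];
    time_t ut_time;
};

/*
 * Byte offset of the record belonging to a TTY slot, as a number such as
 * the one returned by ttyslot() would be used. Depending on the system,
 * ttyslot() reports failure as 0 or -1, so both are rejected here.
 */
off_t
old_utmp_offset(int slot)
{
    if (slot <= 0)
        return (off_t)-1;
    return (off_t)slot * (off_t)sizeof(struct old_utmp);
}
```

An application would seek to this offset in the utmp file and overwrite exactly one record, which is why every program had to agree on both the slot numbering and the record size.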
Later many operating systems also provided a structure in the
same header file which looked as follows:
struct lastlog {
char ll_line[];
char ll_host[];
time_t ll_time;
};
This structure was used by the lastlog database. The lastlog
database was often indexed by user ID, which means there is no need
to store the username as well. It can be derived from the
offset at which the entry is stored.
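To make the user ID indexing concrete, here is a rough sketch of how such a lookup could work. The structure `old_lastlog`, its field sizes and the helper `old_lastlog_read()` are hypothetical. It is meant purely as a historical illustration of the old file layout, not as something to do with today's utmpx, wtmpx or lastlogx files.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical record layout; real field sizes differ per system. */
struct old_lastlog {
    char   ll_line[8];
    char   ll_host[16];
    time_t ll_time;
};

/*
 * Read the record for `uid` from an already opened lastlog-style file.
 * The record lives at uid * sizeof(record), so no username is stored.
 * Returns 0 on success, -1 if the record is missing or unreadable.
 */
int
old_lastlog_read(FILE *fp, uid_t uid, struct old_lastlog *out)
{
    long off;

    assert(fp != NULL && out != NULL);
    off = (long)uid * (long)sizeof(*out);
    if (fseek(fp, off, SEEK_SET) != 0)
        return -1;
    if (fread(out, sizeof(*out), 1, fp) != 1)
        return -1;
    return 0;
}
```

A user who never logged in, or whose ID lies beyond the end of the file, simply has no readable record at that offset.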
System V: utmpx
The System V developers implemented a replacement for utmp and
wtmp called utmpx and wtmpx. utmpx uses a different header file
called <utmpx.h>. This header
file basically declares the following things:
struct utmpx {
char ut_user[];
char ut_id[];
char ut_line[];
pid_t ut_pid;
short ut_type;
struct timeval ut_tv;
};
#define EMPTY 0x...
#define BOOT_TIME 0x...
#define OLD_TIME 0x...
#define NEW_TIME 0x...
#define USER_PROCESS 0x...
#define INIT_PROCESS 0x...
#define LOGIN_PROCESS 0x...
#define DEAD_PROCESS 0x...
void endutxent(void);
struct utmpx *getutxent(void);
struct utmpx *getutxid(const struct utmpx *);
struct utmpx *getutxline(const struct utmpx *);
struct utmpx *pututxline(const struct utmpx *);
void setutxent(void);
This interface has many advantages over the old utmp interface,
namely:
- Records are typed. With utmp, various values for the username
and the TTY line name were reserved to denote changes in system
time or reboots. With utmpx, the ut_type field is used to hold the record
type.
- The ut_tv field provides
microsecond precision, which is better than the one-second resolution of the time_t used for utmp. This may seem
like overkill, though.
- The utmpx file is no longer indexed using ttyslot(), but by a random identifier
generated by the application, stored in ut_id. By using the same identifier for
logout records, it can easily be determined which record should be
removed from the utmpx database. This means you can also log
sessions not related to TTYs, or even multiple sessions on the
same TTY, generated by su or login.
- The interface provides utility functions, which means you
don't have to open the database files by hand. This may even imply
the utmpx database isn't stored on disk.
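The pairing of login and logout records through ut_id can be sketched as follows. The helper names `make_login_record()` and `make_logout_record()` are invented for this example, and the identifier is whatever the application chooses. The records are only built here, not written: handing them to pututxline() normally requires appropriate privileges.

```c
#include <assert.h>
#include <string.h>
#include <sys/time.h>
#include <sys/types.h>
#include <utmpx.h>

/* Store the current time in a record, member by member. */
static void
stamp_now(struct utmpx *ut)
{
    struct timeval now;

    gettimeofday(&now, NULL);
    ut->ut_tv.tv_sec = now.tv_sec;
    ut->ut_tv.tv_usec = now.tv_usec;
}

/* Fill in a login record. `id` is an identifier chosen by the caller. */
void
make_login_record(struct utmpx *ut, const char *id, const char *user,
    const char *line, pid_t pid)
{
    assert(ut != NULL && id != NULL && user != NULL && line != NULL);
    memset(ut, 0, sizeof(*ut));
    ut->ut_type = USER_PROCESS;
    ut->ut_pid = pid;
    /* Fixed-size fields; they need not be NUL-terminated when full. */
    strncpy(ut->ut_id, id, sizeof(ut->ut_id));
    strncpy(ut->ut_user, user, sizeof(ut->ut_user));
    strncpy(ut->ut_line, line, sizeof(ut->ut_line));
    stamp_now(ut);
}

/* Build the matching logout record: same ut_id, type DEAD_PROCESS. */
void
make_logout_record(struct utmpx *ut, const struct utmpx *login)
{
    assert(ut != NULL && login != NULL);
    memset(ut, 0, sizeof(*ut));
    ut->ut_type = DEAD_PROCESS;
    ut->ut_pid = login->ut_pid;
    memcpy(ut->ut_id, login->ut_id, sizeof(ut->ut_id));
    stamp_now(ut);
}
```

Because the logout record carries the same ut_id, the implementation can find and replace the matching entry without knowing anything about TTY slots.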
So what happened to wtmpx? Well, the implementation often offered
a function like updwtmpx() which
could be used to append records to the wtmpx log file.
Standardization
Eventually the utmpx interface was standardized, which means it is
now part of POSIX.
POSIX only standardizes the structure and utility
functions given above, which means there is no standardized way to update the
wtmpx and lastlogx databases, where provided.
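As an illustration of what can be done with the standardized parts alone, the following sketch answers the question of how many user sessions are currently active. The function name `count_user_sessions()` is made up for this example. It only uses setutxent(), getutxent(), endutxent() and the ut_type field.

```c
#include <stddef.h>
#include <utmpx.h>

/* Count the active user login sessions in the utmpx database. */
int
count_user_sessions(void)
{
    struct utmpx *ut;
    int n = 0;

    setutxent();                    /* rewind to the first record */
    while ((ut = getutxent()) != NULL) {
        if (ut->ut_type == USER_PROCESS)
            n++;
    }
    endutxent();                    /* release the database */
    return n;
}
```

Note that the code never names a file. Where the records come from is entirely up to the implementation.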
Operating system support
Right now there are lots of UNIX-like operating systems that
support utmp and utmpx. Below is a list of which operating system
implements what.
- Linux
implements both utmp and utmpx and they are exactly the same,
except for the spelling. Both structures are the same and the
utility functions write to the same database. The on-disk records
are equal to the structures themselves. Linux has a utmp, wtmp and
lastlog database. To keep the structure identical on both 32-bit
and 64-bit architectures for binary compatibility,
members like ut_tv are not necessarily actual
struct timevals.
- Oracle
Solaris takes an approach similar to Linux, but you must keep in
mind that the on-disk records are not the same. The header also
provides a structure futmpx
describing the on-disk record. Solaris only provides a utmpx and
wtmpx database.
- Mac
OS X has a utmpx implementation which is a subset of NetBSD's
implementation. It only implements utmpx and lastlogx. When
using pututxline(), both the utmpx
and lastlogx databases are updated. Entries are also logged into ASL,
which gives you effectively the same functionality as wtmpx. Mac
OS X still ships with <utmp.h> which is almost identical
to FreeBSD's version, but it should not be used.
- FreeBSD
8.x and older only implement utmp using the same primitive
structure as given earlier. All fields are too small to be useful.
For example, the hostname field can only hold up to 16 bytes,
which is just enough for a numerical IPv4 address.
- FreeBSD
9.0 and newer only implement utmpx, where pututxline() logs to all three
databases. Records can be extracted from all three databases using
the same API. The setutxdb()
function can be used to switch between the databases. The on-disk
format is not documented, so do not attempt to write to
these files directly. The file format is exactly the same across
all supported platforms.
- NetBSD has
a utmpx implementation which has an API similar to that of Oracle
Solaris. It also provides a utmp implementation similar to
FreeBSD's.
- OpenBSD's
utmp implementation is similar to FreeBSD's. Someone should
implement utmpx on OpenBSD. There are also many other things missing.
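Since ut_tv is not guaranteed to be a real struct timeval everywhere (as noted for Linux above), portable code should not assign or memcpy() the whole member. A minimal sketch, with the hypothetical helper name `utx_timestamp()`, copies the two fields individually instead:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>
#include <utmpx.h>

/* Convert a record's timestamp into a genuine struct timeval. */
struct timeval
utx_timestamp(const struct utmpx *ut)
{
    struct timeval tv;

    assert(ut != NULL);
    /* Copy member by member; the field widths may differ from timeval's. */
    tv.tv_sec = ut->ut_tv.tv_sec;
    tv.tv_usec = ut->ut_tv.tv_usec;
    return tv;
}
```

The result can then be passed to functions such as localtime() via its tv_sec member without worrying about the layout of the original field.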
How to write portable code
As a FreeBSD developer, I have to say utmpx support on FreeBSD
was long overdue. FreeBSD 9.0 was released in January 2012, which
means we'll be stuck with non-utmpx for quite some time to come.
In my opinion application maintainers should just do the
following:
- Use utmpx exclusively, or at least prefer utmpx over utmp.
utmpx is a standard and utmp is not, even though the Linux man
page somehow wants to talk you into using utmp. By the time your
software is released, OpenBSD will probably be the only common
operating system that does not implement utmpx.
- Only use the previously mentioned functions. Only call updwtmpx() on operating systems that
support this call. A good way to check whether this function is
available without using Autoconf is to check whether WTMPX_FILE is defined. This seems to work
quite well in practice.
- Only use the structure fields mentioned previously or at least
check whether they exist using Autoconf or similar. I've seen many
pieces of code that use ut_name
instead of ut_user. This works on
most operating systems, but this is not standardized. FreeBSD 9.0
will not support this construction, for example.
- Only include the header file you need. Don't include <utmp.h> when using utmpx. Doing so
prevents compilation on systems that only provide utmpx.
- If you have decided to ditch utmp completely, just let the
package maintainers of the respective operating systems maintain
the utmp bits. It is our fault for not implementing utmpx properly
to begin with.
- Never ever, in the name of God, attempt to open, read from
or write to the utmpx, wtmpx and lastlogx files directly.
There is never a reason which justifies doing so. I will
personally hurt kittens if you do so.
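The WTMPX_FILE check from the list above could look roughly like this. The names `wtmpx_supported()` and `wtmpx_append()` are invented for this sketch. It assumes that a system defining WTMPX_FILE also declares updwtmpx() in <utmpx.h>. On some systems the macro may only be visible when additional feature test macros are defined (glibc, for instance, appears to require _GNU_SOURCE).

```c
#include <assert.h>
#include <stddef.h>
#include <utmpx.h>

/* Returns 1 if this build can call updwtmpx(), 0 otherwise. */
int
wtmpx_supported(void)
{
#ifdef WTMPX_FILE
    return 1;
#else
    return 0;
#endif
}

/*
 * Append a record to wtmpx where updwtmpx() is available.
 * Returns 1 if the record was handed to updwtmpx(), 0 if not supported.
 */
int
wtmpx_append(const struct utmpx *ut)
{
    assert(ut != NULL);
#ifdef WTMPX_FILE
    updwtmpx(WTMPX_FILE, ut);
    return 1;
#else
    /* Nothing to do: pututxline() may already log to wtmpx itself. */
    return 0;
#endif
}
```

On systems such as FreeBSD 9.0, where pututxline() already logs to all databases, the fallback branch is all that is needed.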
Finally, when in doubt about the best thing
to do, be sure to ask the respective operating system
maintainers (which includes me). Good luck!
Last modified: Thu Jan 12 22:59:20 2012 +0100