Terminology concerning strings

Roland Illig roland.illig at gmx.de
Mon Apr 4 09:35:44 UTC 2005


Hi all,

Recently I have been doing some programming with strings, and I have
found four properties of them that need to be distinguished and that
should be named consistently throughout Midnight Commander.

* the _size_ of a string (as of any other object) is the number of
   bytes allocated for it. For arrays, it is the number of entries in
   the array. For strings it is at least _length_ + 1.

* the _length_ of a string is the number of chars (bytes) in it, as
   returned by strlen(), excluding the terminating '\0'.

* the _width_ and _height_ of a string are the dimensions of the box
   on the screen that would be needed to display the string.
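
To make the four terms concrete, here is a minimal sketch. str_box() is a hypothetical helper, not an existing mc function, and it assumes one byte per screen column and '\n' as the only line break, so it is only valid for single-byte text. Treating an empty string as one line high is also just a choice made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of mc): computes the box needed to
 * display s. Assumes one byte per screen column and '\n' as the only
 * line separator. An empty string counts as one line of width 0. */
void
str_box (const char *s, size_t *width, size_t *height)
{
    size_t cur = 0;             /* width of the line being scanned */

    assert (s != NULL && width != NULL && height != NULL);
    *width = 0;
    *height = 1;
    for (; *s != '\0'; s++) {
        if (*s == '\n') {
            (*height)++;
            cur = 0;
        } else {
            cur++;
            if (cur > *width)
                *width = cur;
        }
    }
}
```

For the string "ab\ncdef" the _length_ is 7, the _size_ is at least 8, the _width_ is 4 and the _height_ is 2.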

Currently these differences are not recognized by most of the code. 
Therefore I'd like to rename all matching variables according to this 
scheme: For the string variable s, the _size_ is called ssize, the 
_length_ is called slen, the _width_ is called swidth, and the _height_ 
is called sheight.

Example:
     char *fname = g_strdup (s);
     size_t fnamewidth, fnameheight, fnamesize, fnamelen;

     fnamelen = strlen(fname);
     fnamesize = fnamelen + 1;
     msglen(fname, &fnamewidth, &fnameheight);
     /* FIXME: currently does not work with multibyte strings */

Generally, for computing the width and height of a string, special 
routines are needed. The menu item captions, for example, don't display 
the first '&' char, so in these cases the _width_ == _length_ - 1.
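
As a sketch of such a special routine for menu captions: caption_width() below is a hypothetical name, not an existing mc function, and it assumes a one-line caption of single-byte characters in which only the first '&' is left undisplayed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of mc): width of a one-line menu
 * caption. Assumes single-byte characters and that only the first
 * '&' is not displayed. */
size_t
caption_width (const char *caption)
{
    size_t captionlen;

    assert (caption != NULL);
    captionlen = strlen (caption);
    /* One column less if a '&' is present anywhere in the caption. */
    return (strchr (caption, '&') != NULL) ? captionlen - 1 : captionlen;
}
```

For "&File" this gives a _width_ of 4 for a _length_ of 5. For a caption without '&', _width_ and _length_ are equal.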

As we currently don't have a unified string processing module (I have
proposed one as ecssup.{c,h}), we need to be aware of multibyte strings,
for which the _length_ can be greater than the _width_, even for
one-line strings.
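
To illustrate the difference for multibyte strings, here is a minimal sketch that assumes valid UTF-8 input and one screen column per code point. That assumption is wrong for combining and double-width characters, and utf8_width() is a hypothetical name, not an existing mc function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of mc): counts the code points in a
 * valid UTF-8 string by skipping continuation bytes (10xxxxxx).
 * Assumes every code point occupies exactly one screen column. */
size_t
utf8_width (const char *s)
{
    size_t width = 0;

    assert (s != NULL);
    for (; *s != '\0'; s++) {
        if (((unsigned char) *s & 0xC0) != 0x80)
            width++;
    }
    return width;
}
```

For the UTF-8 encoding of "Grüße", written in C as "Gr\xc3\xbc\xc3\x9f" "e", strlen() returns a _length_ of 7 while this function returns a _width_ of 5.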

Roland


