Terminology concerning strings
Leonard den Ottolander
leonard at den.ottolander.nl
Wed Apr 6 13:58:13 UTC 2005
Hi Egmont,
On Wed, 2005-04-06 at 13:14, Koblinger Egmont wrote:
> a) One can allocate a larger buffer than strlen+1. For example,
> x=malloc(10); strcpy(x, "asdf"); in this example length is 4, size is 10.
> Or is size==5 in this case?
I am not sure whether you should count the terminating 0 char. I would say
not, but if you do, size = strlen + sizeof(<chartype>). The +
sizeof(<chartype>) should affect the buffer allocation, not the
calculation of the string size. So for single-byte chars I would say
size = length (not + 1).
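
A minimal sketch of that distinction in C, for single-byte chars only. The
helper name copy_into_buffer is made up for illustration and is not taken
from the mc sources. The string length excludes the terminating 0 char,
while the allocation has to leave room for it:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from mc): copy src into a freshly allocated
 * buffer of buf_size bytes. Returns NULL if the buffer would be too
 * small or if malloc fails. The caller frees the result. */
char *copy_into_buffer(const char *src, size_t buf_size)
{
    assert(src != NULL);

    /* String length: number of chars, terminating '\0' not counted. */
    size_t length = strlen(src);

    /* The allocation, not the length, accounts for the terminator. */
    if (buf_size < length + sizeof(char))
        return NULL;

    char *buf = malloc(buf_size);
    if (buf == NULL)
        return NULL;

    memcpy(buf, src, length + 1);   /* copies the '\0' as well */
    return buf;
}
```

With copy_into_buffer("asdf", 10) the string length is 4 while the buffer
holds 10 bytes; a buffer of only 4 bytes is rejected because the terminator
would not fit.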
> b) Each multibyte character (e.g. any accented letters in UTF-8) counts as 1
> for length, but at least two for size.
According to
http://www.gnu.org/software/libc/manual/html_node/Extended-Char-Intro.html
wchar_t on GNU systems is 4 bytes by default. The internal (wide character)
representation of such strings always uses fixed widths, or something like
x[3] wouldn't work (without scanning the string). So if x in the above
example is a wchar_t string, you overflow the buffer nicely ;) .
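
A rough sketch of the arithmetic behind that overflow, assuming a wide
character string. The helper names wide_bytes_needed and wide_fits are
invented for this example; wcslen and sizeof(wchar_t) are standard C:

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: bytes needed to store the wide string s,
 * including the terminating L'\0'. */
size_t wide_bytes_needed(const wchar_t *s)
{
    assert(s != NULL);
    /* wcslen counts wide characters, not bytes. */
    return (wcslen(s) + 1) * sizeof(wchar_t);
}

/* Hypothetical helper: nonzero if a buffer of buf_bytes bytes can hold
 * s together with its terminator. */
int wide_fits(const wchar_t *s, size_t buf_bytes)
{
    return wide_bytes_needed(s) <= buf_bytes;
}
```

Where wchar_t is 4 bytes, wide_bytes_needed(L"asdf") is 5 * 4 = 20, so a
buffer from malloc(10) is too small even though the length is still 4.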
Leonard.
--
mount -t life -o ro /dev/dna /genetic/research