Request for discussion - how to make MC unicode capable

Pavel Roskin proski at gnu.org
Tue Feb 27 08:11:55 UTC 2007


On Sat, 2007-02-24 at 14:57 +0200, Pavel Tsekov wrote:
> Hello,
> 
> I'd like to initiate a discussion on how to make MC
> unicode deal with multibyte character sets. I'd like
> to hear from the developers of the UTF-8 patch and
> from the ncurses maintainer. Anyone else who can help
> with their expertise is also welcome. This has been
> a major drawback for quite some time and it needs to
> be addressed ASAP.

Yes, thank you for addressing this issue!  I just want to give you some
general advice based on my experience.

Don't try to keep backward compatibility from the beginning, no matter
how important it is.  Code for the most advanced API first, and then
backport the changes to older APIs if needed.

The main reason is that the new API introduces new concepts.  The
concepts are based on a better understanding of the issue.  Retaining
code that is not based on those concepts next to the new code would
create a maintenance nightmare.  In some cases, the new API enforces the
new rules.  Don't let the offenders hide behind conditional
statements.

In the case of Unicode, the new concept is the distinction between bytes
and characters.  Many functions need to be checked to make sure they
don't mix the two.  It's totally impractical to write a preprocessor
conditional every time something is changed.  It's better to change the
code for Unicode support first and then think about how to provide
backward compatibility for the whole source tree with minimal changes
throughout the code.
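
To illustrate the bytes-versus-characters distinction, here is a minimal
sketch in C.  The helper names utf8_char_count and utf8_byte_offset are
hypothetical and are not taken from mc or from any patch.  The sketch
assumes the input is valid UTF-8 and does no validation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the characters (code points) in a
 * NUL-terminated UTF-8 string.  Assumes valid UTF-8. */
static size_t utf8_char_count(const char *s)
{
    size_t count = 0;

    for (; *s != '\0'; s++) {
        /* Continuation bytes have the form 10xxxxxx; any other byte
         * starts a new character. */
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;
    }
    return count;
}

/* Hypothetical helper: return the byte offset at which the character
 * with index n_chars starts, or the byte length of the string if it
 * has fewer characters.  Useful for truncating on a character
 * boundary rather than in the middle of a multibyte sequence. */
static size_t utf8_byte_offset(const char *s, size_t n_chars)
{
    size_t i = 0;

    while (s[i] != '\0') {
        if (((unsigned char)s[i] & 0xC0) != 0x80) {
            if (n_chars == 0)
                break;
            n_chars--;
        }
        i++;
    }
    return i;
}
```

For the string "na\xc3\xafve" (the word "naive" with a diaeresis on the
i, encoded in UTF-8), strlen() reports 6 bytes while utf8_char_count()
reports 5 characters.  Code that truncates to "the first 3 characters"
by cutting at byte 3 would split the two-byte sequence, whereas
utf8_byte_offset(s, 3) gives byte 4.  Note that a character count is
still not a count of screen columns; wide and combining characters need
separate handling.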

Another reason is that the programmer's time is very expensive and
should be used properly.  A programmer should be testing how his code is
working rather than whether it compiles for an old libc.  Very few
actual bugs (i.e. incorrect runtime behavior, as opposed to often
trivial compile issues) are discovered as a result of portability
problems.  Many more bugs are discovered on the primary development
system by the main developer.

People opposing the changes are often more vocal than those who need the
changes.  The latter category may not be using mc at all.  Perhaps they
tried mc and didn't like how it looked on a Unicode-capable terminal.
Or maybe they were affected by bugs caused by distribution patches.

Those who don't want the changes can usually be satisfied by later
changes that restore the old behavior or the old resource consumption.
Again, existing users could be asked to contribute portability fixes and
optimization.  It's an easier job than converting the code to the new
concepts and untangling the mess of function interdependencies.

And those who threaten to switch to different software or to fork the
project are usually not very good contributors to begin with.  They
won't be missed.

In more practical terms, I suggest that mc use only one of ncurses or
S-Lang for Unicode.  Doing two ports would exhaust already limited
resources.
I think the preference should be given to ncurses because it's not
trying to be an interpreted language or anything else other than a
screen library.

-- 
Regards,
Pavel Roskin



