TODO list for the next version
Pavel Roskin
proski at gnu.org
Tue May 29 20:41:54 UTC 2001
> catdoc is rather popular, and if you haven't seen word2x either, it makes
> no difference to you. Besides, catdoc is already in mc.ext; you just
> have to move "#". You can skip the part about excel, but I'd still
> like catdoc to be default.
Ok, I have applied a patch based on yours. It tries "catdoc", "word2x" and
then "strings". Excel files are handled by "xls2csv" and the fallback is
also "strings".
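The fallback order relies on the shell's "||" operator: each command runs only if the one before it exited with a non-zero status. A minimal sketch of that behaviour follows. The try_* functions are hypothetical stand-ins for the real converters, so the sketch runs even where catdoc or word2x is not installed.

```shell
#!/bin/sh
# Hypothetical stand-ins for catdoc, word2x and strings (not the real tools).
try_catdoc()  { return 1; }                 # pretend catdoc is missing or failed
try_word2x()  { echo "text via word2x"; }   # pretend word2x succeeded
try_strings() { echo "text via strings"; }  # last resort, never reached here

# Same shape as the View= lines: stop at the first command that succeeds.
view_doc() {
    try_catdoc "$1" || try_word2x "$1" || try_strings "$1"
}

view_doc sample.doc
```

Note that a converter which prints partial output and then fails will still have that output shown before the next command in the chain runs.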
By the way:
1) Spaces should be backslashed in the "type" directives. It was wrong for
MS Word, but I assume that you haven't tested type recognition for Excel.
2) I removed "Document" and "Worksheet" at the end. My "file" command
(RedHat 7.1) would sometimes print "Microsoft Excel 5.0 Worksheet" which
wouldn't match. Even worse, it prints "Microsoft Office Document" for my
MS Word files, but fortunately they all have the .doc extension.
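The effect of shortening the patterns can be approximated outside mc with grep. This is only a sketch: it assumes the "type" directive behaves like an unanchored regular-expression search over the description printed by "file", and the matches helper is a hypothetical name introduced here. The descriptions are the ones quoted above.

```shell
#!/bin/sh
# matches PATTERN DESCRIPTION: succeed if DESCRIPTION contains PATTERN.
# Hypothetical helper that approximates a "type/" match using grep -E.
matches() { printf '%s\n' "$2" | grep -Eq "$1"; }

excel='Microsoft Excel 5.0 Worksheet'
office='Microsoft Office Document'

# The longer pattern misses because "5.0" sits between the words.
matches 'Microsoft Excel Worksheet' "$excel" && echo "long: match" || echo "long: no match"
# The shortened pattern matches.
matches 'Microsoft Excel' "$excel" && echo "short: match" || echo "short: no match"
# Word files reported as "Office Document" are not caught by type at all,
# so they depend on the extension regex instead.
matches 'Microsoft Word' "$office" && echo "word: match" || echo "word: no match"
```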
The resulting entries in mc.ext are:

# Microsoft Word Document
regex/\.([Dd]o[ct]|DO[CT]|[Ww]ri|WRI)$
	View=%view{ascii} catdoc -w %f || word2x -f text %f - || strings %f
type/Microsoft\ Word
	View=%view{ascii} catdoc -w %f || word2x -f text %f - || strings %f

# Microsoft Excel Worksheet
regex/\.([Xx]l[sw]|XL[SW])$
	View=%view{ascii} xls2csv %f || strings %f
type/Microsoft\ Excel
	View=%view{ascii} xls2csv %f || strings %f

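The two extension patterns can be checked against sample file names with grep -E, which accepts the same expressions. This is a rough check only; it assumes mc treats the patterns the way grep -E does, and the file names and the ext_matches helper are made up for illustration.

```shell
#!/bin/sh
WORD_RE='\.([Dd]o[ct]|DO[CT]|[Ww]ri|WRI)$'
EXCEL_RE='\.([Xx]l[sw]|XL[SW])$'

# ext_matches REGEX NAME: succeed if NAME matches REGEX (hypothetical helper).
ext_matches() { printf '%s\n' "$2" | grep -Eq "$1"; }

for name in report.doc LETTER.DOT memo.Wri notes.DoC; do
    ext_matches "$WORD_RE" "$name" && echo "$name: word" || echo "$name: no match"
done

for name in budget.xls CHART.XLW budget.xlsx; do
    ext_matches "$EXCEL_RE" "$name" && echo "$name: excel" || echo "$name: no match"
done
```

Mixed-case extensions such as "DoC" fall through, because each alternative allows only an initial capital or all capitals.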
--
Regards,
Pavel Roskin