[Refdb-devel] Latin-1 character 160 (nbsp) in manpages
David Nebauer
2006-08-18 10:06:23 UTC
Hi Markus,

The following manpages cause groff errors when they are being processed
for display: refdbib, bib2ris, db2ris, en2ris, marc2ris, med2ris,
refdbxml, refdba, refdbc and refdbd.

The error for refdbc is representative:
----------------------------------------------------------------------------
/usr/share/man/man1/refdbc.1:143: warning: can't find numbered character 160
/usr/share/man/man1/refdbc.1:143: warning: can't find numbered character 160
----------------------------------------------------------------------------

There are two Latin-1 characters 160 (non-breaking space) in line 143 of
refdbc.1.

Line 143 is: 'Table 1. refdbcrc'.

The non-breaking spaces can be made visible by changing them to hashes:
-------------------------------------------------------------------------
<< first command shows there are no hash ('#') characters in the manpage source >>
$ grep '#' refdbc.1
<< now transform each non-breaking space (octal 240) to a hash >>
$ tr '\240' '#' < refdbc.1 | grep '#'
Table#1.#refdbcrc
$
-------------------------------------------------------------------------

The hashes show the location of the two non-breaking spaces.
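
The same check can be done without the substitution trick by searching for the byte directly. The following is a minimal sketch, assuming the manpage is Latin-1 encoded so that byte 0xA0 (octal 240) is the non-breaking space; find_nbsp is a hypothetical helper name, not part of RefDB or groff:

```shell
#!/bin/sh
# Sketch: print the numbered lines of a file that contain byte 0xA0,
# the Latin-1 non-breaking space. Assumes a Latin-1 (not UTF-8) file.
# find_nbsp is a hypothetical helper name.
find_nbsp() {
    nbsp=$(printf '\240')            # the raw byte, written in octal
    LC_ALL=C grep -n "$nbsp" "$1"    # C locale: match single bytes
}

# Usage (not run here): find_nbsp /usr/share/man/man1/refdbc.1
```

The line numbers it prints should match the ones in the groff warnings.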

In every case where this error occurs it is due to a table caption line
similar to the one for refdbc.1.

These characters are not present in the xml source and are obviously
introduced during the transformation of xml to groff.

On my xterm the relevant region of transformed groff input is displayed as:
-------------------------------------------------------------------------
...
refdbc evaluates the refdbcrc configuration file at startup to
initialize it-
self. Table1.refdbcrc

| Variable | Default | Comment |
...
-------------------------------------------------------------------------

You can see groff strips out the offending characters. Incidentally, the
layout would look better if the table caption had its own line.

I know I'm being picky here but an error is an error...

Regards,
David.
Markus Hoenicka
2006-08-18 20:20:15 UTC
Hi David,
Post by David Nebauer
----------------------------------------------------------------------------
/usr/share/man/man1/refdbc.1:143: warning: can't find numbered character 160
/usr/share/man/man1/refdbc.1:143: warning: can't find numbered character 160
----------------------------------------------------------------------------
I can't reproduce this problem over here. I've tried this on FreeBSD,
Cygwin, and Debian (testing) without getting warnings like this. I
wonder whether this is a localization issue.

In any case, the DocBook stylesheets indeed put the non-breaking space
where you see it, and they do so on purpose. Just have a look at the
manpages/charmap.groff.xsl file, which contains the mapping between
the character #160 and the groff construct "\ ".

For the time being I've changed the driver file which RefDB uses to
process the man pages.
Post by David Nebauer
You can see groff strips out the offending characters. Incidentally, the
layout would look better if the table caption had its own line.
While I was at it, I also put the caption on a separate line. The only
downside is that the tables are not numbered. I don't think this is
going to cause problems as the tables are always rendered where they
belong.
Post by David Nebauer
I know I'm being picky here but an error is an error...
Actually it is a good sign if we have the time to ponder problems like
these. This means that the rest of the program is more or less working
ok.

regards,
Markus
--
Markus Hoenicka
***@cats.de
(Spam-protected email: replace the quadrupeds with "mhoenicka")
http://www.mhoenicka.de
Michael(tm) Smith
2006-08-21 03:01:42 UTC
Post by Markus Hoenicka
For the time being I've changed the driver file which RefDB uses to
process the man pages.
That driver file should no longer be needed except for backward
compatibility with older versions of the DocBook XSL stylesheets.
The current version, 1.70.1, fully supports real table output
using tbl(1) markup. See the attached example (refdbib.1).

I've also attached a patch to the driver file. The patch causes
the driver to check the value of the DocBook XSL stylesheets
$VERSION param. If the middle part is 70 or greater, it uses the
table-processing code in the stylesheets as-is; otherwise, it
falls back to outputting the table rows in paragraphs.
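
The version test itself is simple. As an illustration only (the actual patch works inside the XSLT driver file, and this is not its code), the "middle part is 70 or greater" rule could be sketched in shell like this; uses_native_tables is a hypothetical name:

```shell
#!/bin/sh
# Sketch of the version rule described above, not the actual patch.
# Succeeds if the middle component of a version such as "1.70.1"
# is 70 or greater. uses_native_tables is a hypothetical name.
uses_native_tables() {
    minor=$(printf '%s\n' "$1" | cut -d. -f2)
    case $minor in
        ''|*[!0-9]*) return 1 ;;     # missing or non-numeric: treat as old
    esac
    [ "$minor" -ge 70 ]
}

for v in 1.69.1 1.70.1; do
    if uses_native_tables "$v"; then
        echo "$v: use stylesheet table code as-is"
    else
        echo "$v: fall back to rows in paragraphs"
    fi
done
```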

--Mike
David Nebauer
2006-08-21 08:21:32 UTC
Post by Michael(tm) Smith
The current version, 1.70.1, fully supports real table output
using tbl(1) markup. See the attached example (refdbib.1).
The table output in the example looks stunningly good.

Regards,
David.
Michael(tm) Smith
2006-08-25 04:27:51 UTC
Post by David Nebauer
Post by Michael(tm) Smith
The current version, 1.70.1, fully supports real table output
using tbl(1) markup. See the attached example (refdbib.1).
The table output in the example looks stunningly good.
That's tbl(1) magic. The stylesheet just converts the DocBook
table markup to tbl markup. The only tricky parts are in the
handling of tables that have horizontal and vertical spans, but as
far as I can tell from my testing, the code for converting those
to tbl markup is right.
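
For readers who have not seen tbl markup, here is a rough sketch of what a simple table without spans looks like. It is not the stylesheet's conversion code: rows_to_tbl is a hypothetical shell helper that wraps '|'-separated rows in a tbl block, and the data row is a placeholder rather than a real refdbcrc variable.

```shell
#!/bin/sh
# Sketch: wrap '|'-separated rows from stdin in tbl(1) markup.
# $1 is the number of columns. rows_to_tbl is a hypothetical name.
rows_to_tbl() {
    printf '.TS\n'                   # start of table
    printf 'allbox tab(|);\n'        # options: box every cell, '|' separates cells
    fmt=''
    i=0
    while [ "$i" -lt "$1" ]; do      # one left-aligned column per cell
        fmt="$fmt l"
        i=$((i + 1))
    done
    printf '%s.\n' "${fmt# }"        # format line ends with a period
    cat                              # table data, copied unchanged
    printf '.TE\n'                   # end of table
}

# Header row taken from the table shown earlier in the thread;
# the second row is a placeholder.
printf 'Variable|Default|Comment\nexamplevar|examplevalue|placeholder row\n' |
    rows_to_tbl 3
```

The result is meant to be run through tbl before the man macros format the page.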

--Mike

Markus Hoenicka
2006-08-21 21:34:39 UTC
Hi Mike,
Post by Michael(tm) Smith
I've also attached a patch to the driver file. The patch causes
the driver to check the value of the DocBook XSL stylesheets
$VERSION param. If the middle part is 70 or greater, it uses the
table-processing code in the stylesheets as-is; otherwise, it
falls back to outputting the table rows in paragraphs.
The new manpage output is a real jawdropper. Thanks for providing the
patch.

regards,
Markus
Michael(tm) Smith
2006-08-18 15:22:38 UTC
Post by David Nebauer
The following manpages cause groff errors when they are being processed
for display: refdbib, bib2ris, db2ris, en2ris, marc2ris, med2ris,
refdbxml, refdba, refdbc and refdbd.
----------------------------------------------------------------------------
/usr/share/man/man1/refdbc.1:143: warning: can't find numbered character 160
/usr/share/man/man1/refdbc.1:143: warning: can't find numbered character 160
----------------------------------------------------------------------------
There are two Latin-1 characters 160 (non-breaking space) in line 143 of
refdbc.1.
Version 1.69.1 and the current 1.70.1 version of the DocBook XSL
stylesheets should be correctly converting those instances into
the corresponding roff escape ("\ "). If the characters are not
getting converted due to a stylesheet bug, I can fix the bug and
get a 1.70.2 release out relatively soon.
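
For a page that was already generated with the raw characters in it, the same substitution can be applied after the fact. This is only a workaround sketch, assuming a Latin-1 encoded page; it is not how the stylesheets perform the conversion, and fix_nbsp is a hypothetical name:

```shell
#!/bin/sh
# Sketch: replace each byte 0xA0 (Latin-1 non-breaking space) in a
# generated manpage with the roff escape "\ " (backslash, space).
# Assumes a Latin-1 file; in a UTF-8 file 0xA0 can be part of a
# longer sequence. fix_nbsp is a hypothetical helper name.
fix_nbsp() {
    nbsp=$(printf '\240')
    # "\\\\" reaches sed as "\\", which sed emits as one backslash.
    LC_ALL=C sed "s/$nbsp/\\\\ /g" "$1"
}

# Usage (not run here): fix_nbsp refdbc.1 > refdbc.fixed.1
```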

--Mike
Markus Hoenicka
2006-08-21 21:31:59 UTC
Hi Mike,
Post by Michael(tm) Smith
Post by David Nebauer
There are two Latin-1 characters 160 (non-breaking space) in line 143 of
refdbc.1.
Version 1.69.1 and the current 1.70.1 version of the DocBook XSL
stylesheets should be correctly converting those instances into
the corresponding roff escape ("\ "). If the characters are not
getting converted due to a stylesheet bug, I can fix the bug and
get a 1.70.2 release out relatively soon.
I believe this explains why I never managed to reproduce this
problem. Apparently the DocBook stylesheets were at least at 1.69.1 on
all boxes that I've tested this on. I'm glad this is one of those
"self-solving" problems.

regards,
Markus