OLD | NEW |
| (Empty) |
1 <refentry xmlns="http://docbook.org/ns/docbook" | |
2 xmlns:xlink="http://www.w3.org/1999/xlink" | |
3 xmlns:xi="http://www.w3.org/2001/XInclude" | |
4 xmlns:src="http://nwalsh.com/xmlns/litprog/fragment" | |
5 xmlns:xsl="http://www.w3.org/1999/XSL/Transform" | |
6 version="5.0" xml:id="make.index.markup"> | |
7 <refmeta> | |
8 <refentrytitle>make.index.markup</refentrytitle> | |
9 <refmiscinfo class="other" otherclass="datatype">boolean</refmiscinfo> | |
10 </refmeta> | |
11 <refnamediv> | |
12 <refname>make.index.markup</refname> | |
13 <refpurpose>Generate XML index markup in the index?</refpurpose> | |
14 </refnamediv> | |
15 | |
16 <refsynopsisdiv> | |
17 <src:fragment xml:id="make.index.markup.frag"> | |
18 <xsl:param name="make.index.markup" select="0"/> | |
19 </src:fragment> | |
20 </refsynopsisdiv> | |
21 | |
22 <refsection><info><title>Description</title></info> | |
23 | |
24 <para>This parameter enables a very neat trick for getting properly | |
25 merged, collated back-of-the-book indexes. G. Ken Holman suggested | |
26 this trick at Extreme Markup Languages 2002 and I'm indebted to him | |
27 for it.</para> | |
28 | |
29 <para>Jeni Tennison's excellent code in | |
30 <filename>autoidx.xsl</filename> does a great job of merging and | |
31 sorting <tag>indexterm</tag>s in the document and building a | |
32 back-of-the-book index. However, there's one thing that it cannot | |
33 reasonably be expected to do: merge page numbers into ranges. (I would | |
34 not have thought that it could collate and suppress duplicate page | |
35 numbers, but in fact it appears to manage that task somehow.)</para> | |
36 | |
37 <para>Ken's trick is to produce a document in which the index at the | |
38 back of the book is <quote>displayed</quote> as literal XML markup. Because | |
39 the FO processor lays out that index along with the rest of the book, all of | |
40 the page numbers in it have been resolved. In other words, instead of having | |
41 an index at the back of the book that looks like this:</para> | |
42 | |
43 <blockquote> | |
44 <formalpara><info><title>A</title></info> | |
45 <para>ap1, 1, 2, 3</para> | |
46 </formalpara> | |
47 </blockquote> | |
48 | |
49 <para>you get one that looks like this:</para> | |
50 | |
51 <blockquote> | |
52 <programlisting>&lt;indexdiv&gt;A&lt;/indexdiv&gt; | |
53 &lt;indexentry&gt; | |
54 &lt;primaryie&gt;ap1&lt;/primaryie&gt;, | |
55 &lt;phrase role="pageno"&gt;1&lt;/phrase&gt;, | |
56 &lt;phrase role="pageno"&gt;2&lt;/phrase&gt;, | |
57 &lt;phrase role="pageno"&gt;3&lt;/phrase&gt; | |
58 &lt;/indexentry&gt;</programlisting> | |
59 </blockquote> | |
60 | |
61 <para>After building a PDF file with this sort of odd-looking index, you can | |
62 extract the text from the PDF file and the result is a proper index expressed in | |
63 XML.</para> | |
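
<para>As a minimal sketch of that first pass, the parameter can be turned
on in a customization layer. The import path below is a placeholder (an
assumption, not a real location); point it at wherever the FO stylesheet
lives on your system:</para>

<programlisting><![CDATA[<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Placeholder path (assumption): adjust to your copy of the FO stylesheet -->
  <xsl:import href="/path/to/docbook-xsl/fo/docbook.xsl"/>

  <!-- Override the default of 0 so the index is emitted as literal XML markup -->
  <xsl:param name="make.index.markup" select="1"/>

</xsl:stylesheet>]]></programlisting>

<para>For the final pass, the parameter would presumably go back to its
default of 0.</para>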
64 | |
65 <para>Now you have data that's amenable to processing, and a simple Perl script | |
66 (such as <filename>fo/pdf2index</filename>) can | |
67 merge consecutive page numbers into ranges (so 1, 2, 3 becomes 1-3) and generate a proper index.</para> | |
68 | |
69 <para>Finally, reformat your original document using this literal index instead of | |
70 an automatically generated one and <quote>bingo</quote>!</para> | |
71 | |
72 </refsection> | |
73 </refentry> | |