Hyphen - hyphenation library to use converted TeX hyphenation patterns

(C) 1998 Raph Levien
(C) 2001 ALTLinux, Moscow
(C) 2006, 2007, 2008, 2010 László Németh

This was part of the libHnj library by Raph Levien.

Peter Novodvorsky from ALTLinux extracted the hyphenation part from libHnj
to use it in OpenOffice.org.

Compound word and non-standard hyphenation support by László Németh.

License is the original LibHnj license:
LibHnj is dual licensed under LGPL and MPL (see also README.libhnj).

Because LGPL allows GPL relicensing, COPYING now contains an
LGPL/GPL/MPL tri-license for explicit Mozilla source compatibility.

The original Libhnj source with OOo's patches is maintained by Rene Engelhard
and Chris Halls at Debian:

http://packages.debian.org/stable/libdevel/libhnj-dev
and http://packages.debian.org/unstable/source/libhnj


OTHER FILES

This distribution is also the source of the en_US hyphenation patterns,
"hyph_en_US.dic". See README_hyph_en_US.txt.

Source files of hyph_en_US.dic in the distribution:

hyphen.tex (en_US hyphenation patterns from plain TeX)

Source: http://tug.ctan.org/text-archive/macros/plain/base/hyphen.tex

tbhyphext.tex: hyphenation exception log from the TUGboat archive

Source of the hyphenation exception list:
http://www.ctan.org/tex-archive/info/digests/tugboat/tb0hyf.tex

Generated with the hyphenex script
(http://www.ctan.org/tex-archive/info/digests/tugboat/hyphenex.sh)

sh hyphenex.sh <tb0hyf.tex >tbhyphext.tex


INSTALLATION

./configure
make
make install

UNIT TESTS (WITH VALGRIND DEBUGGER)

make check
VALGRIND=memcheck make check

USAGE

./example hyph_en_US.dic mywords.txt

or (under Linux)

echo example | ./example hyph_en_US.dic /dev/stdin

NOTE: For Unicode-encoded input, convert your words to lowercase
before hyphenation (in a UTF-8 console environment):

cat mywords.txt | awk '{print tolower($0)}' >mywordslow.txt

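The lowercase step can also be done without relying on the console locale.
The following is a minimal Python sketch, not part of this distribution. The
function names lowercase_words and lowercase_file are made up for
illustration, and the sketch assumes a UTF-8 encoded word list with one word
per line. It uses Python's Unicode-aware str.lower().

```python
def lowercase_words(lines):
    """Return the given lines lowercased, without trailing newlines."""
    return [line.rstrip("\n").lower() for line in lines]


def lowercase_file(src_path, dst_path):
    """Write a lowercase copy of a UTF-8 word list (one word per line)."""
    with open(src_path, encoding="utf-8") as src:
        words = lowercase_words(src)
    with open(dst_path, "w", encoding="utf-8") as dst:
        for word in words:
            dst.write(word + "\n")
```

Calling lowercase_file("mywords.txt", "mywordslow.txt") should give a file
that can be passed to ./example in the same way as mywords.txt above.
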
DEVELOPMENT

See README.hyphen for the hyphenation algorithm, README.nonstandard
and doc/tb87nemeth.pdf for non-standard hyphenation,
README.compound for compound word hyphenation, and tests/*.

Description of the dictionary format:

First line contains the character encoding (ISO8859-x, UTF-8).

Possible options in the following lines:

LEFTHYPHENMIN num           minimal hyphenation distance from the left word end
RIGHTHYPHENMIN num          minimal hyphenation distance from the right word end
COMPOUNDLEFTHYPHENMIN num   min. hyph. dist. from the left compound word boundary
COMPOUNDRIGHTHYPHENMIN num  min. hyph. dist. from the right comp. word boundary

hyphenation patterns        see README.* files

NEXTWORD                    separate the two compound sets (see README.compound)

Default values:
Without explicit declarations, the hyphenmin fields of the dict struct
are zero, but in this case lefthyphenmin and righthyphenmin
default to 2 during hyphenation (for backward compatibility).

Comments

Use a percent sign at the beginning of a line to add comments to your
hyphenation patterns (after the character encoding in the first line):

% comment

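The header layout described above can be illustrated with a short Python
sketch. It is not the library's parser: read_dic_header and
effective_hyphenmin are hypothetical helper names. The only facts taken from
this README are the encoding on the first line, the four *HYPHENMIN option
names, the percent-sign comments, and the fallback to 2 when LEFTHYPHENMIN
or RIGHTHYPHENMIN is not declared.

```python
# Option names as listed in this README; everything else is illustrative.
OPTIONS = (
    "LEFTHYPHENMIN",
    "RIGHTHYPHENMIN",
    "COMPOUNDLEFTHYPHENMIN",
    "COMPOUNDRIGHTHYPHENMIN",
)


def read_dic_header(lines):
    """Return (encoding, options) read from the lines of a .dic file.

    Undeclared options stay at 0, mirroring the zeroed hyphenmin fields
    mentioned under "Default values".
    """
    lines = iter(lines)
    encoding = next(lines).strip()      # first line: character encoding
    options = dict.fromkeys(OPTIONS, 0)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("%"):
            continue                    # blank line or comment
        parts = line.split()
        if len(parts) == 2 and parts[0] in OPTIONS and parts[1].isdigit():
            options[parts[0]] = int(parts[1])
        # Anything else (patterns, NEXTWORD) is ignored by this sketch.
    return encoding, options


def effective_hyphenmin(options):
    """Apply the backward-compatible default of 2 to undeclared values."""
    left = options["LEFTHYPHENMIN"] or 2
    right = options["RIGHTHYPHENMIN"] or 2
    return left, right
```

With open("hyph_en_US.dic", encoding="latin-1") as the line source,
read_dic_header returns the declared encoding and option values, and
effective_hyphenmin shows which left and right minimums would apply under
the rule quoted above.
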
*****************************************************************************
* Warning! Correct working of Libhnj *needs* prepared hyphenation patterns. *

For example, generating hyph_en_US.dic from "hyphen.us" TeX patterns:

perl substrings.pl hyphen.us hyph_en_US.dic ISO8859-1

or, setting default LEFTHYPHENMIN and RIGHTHYPHENMIN values:

perl substrings.pl hyphen.us hyph_en_US.dic ISO8859-1 2 3
perl substrings.pl hyphen.gb hyph_en_GB.dic ISO8859-1 3 3
*****************************************************************************

OTHERS

Java hyphenation: Peter B. West (Folio project) implements a hyphenator with
non-standard hyphenation facilities based on extended Libhnj. The HyFo module
is released in binary form as jar files and in source form as zip files.
See http://sourceforge.net/project/showfiles.php?group_id=119136

László Németh
<nemeth (at) openoffice (dot) org>