| OLD | NEW |
| (Empty) |
| 1 // Copyright (c) 2012, the Dart project authors. Please see the AUTHORS file | |
| 2 // for details. All rights reserved. Use of this source code is governed by a | |
| 3 // BSD-style license that can be found in the LICENSE file. | |
| 4 | |
| 5 /** | |
| 6 * Bidi stands for Bi-directional text. | |
| 7 * According to [Wikipedia](http://en.wikipedia.org/wiki/Bi-directional_text): | |
| 8 * Bi-directional text is text containing text in both text directionalities, | |
| 9 * both right-to-left (RTL) and left-to-right (LTR). It generally involves text | |
| 10 * containing different types of alphabets, but may also refer to boustrophedon, | |
| 11 * which is changing text directionality in each row. | |
| 12 * | |
| 13 * Utility class for formatting display text in a potentially | |
| 14 * opposite-directionality context without garbling the layout. | |
| 15 * Mostly a very "slimmed-down" and dart-ified port of the Closure Bidirectional | |
| 16 * formatting library. If there is a utility in the Closure library (or ICU, or | |
| 17 * elsewhere) that you would like this formatter to make available, please | |
| 18 * contact the Dart team. | |
| 19 * | |
| 20 * Provides the following functionality: | |
| 21 * | |
| 22 * 1. *BiDi Wrapping* | |
| 23 * When text in one language is mixed into a document in another, opposite- | |
| 24 * directionality language, e.g. when an English business name is embedded in a | |
| 25 * Hebrew web page, both the inserted string and the text following it may be | |
| 26 * displayed incorrectly unless the inserted string is explicitly separated | |
| 27 * from the surrounding text in a "wrapper" that declares its directionality at | |
| 28 * the start and then resets it back at the end. This wrapping can be done in | |
| 29 * HTML mark-up (e.g. a 'span dir=rtl' tag) or - only in contexts where mark-up | |
| 30 * can not be used - in Unicode BiDi formatting codes (LRE|RLE and PDF). | |
| 31 * Providing such wrapping services is the basic purpose of the BiDi formatter. | |
| 32 * | |
| 33 * 2. *Directionality estimation* | |
| 34 * How does one know whether a string about to be inserted into surrounding | |
| 35 * text has the same directionality? Well, in many cases, one knows that this | |
| 36 * must be the case when writing the code doing the insertion, e.g. when a | |
| 37 * localized message is inserted into a localized page. In such cases there is | |
| 38 * no need to involve the BiDi formatter at all. In the remaining cases, e.g. | |
| 39 * when the string is user-entered or comes from a database, the language of | |
| 40 * the string (and thus its directionality) is not known a priori, and must be | |
| 41 * estimated at run-time. The BiDi formatter does this automatically. | |
| 42 * | |
| 43 * 3. *Escaping* | |
| 44 * When wrapping plain text - i.e. text that is not already HTML or HTML- | |
| 45 * escaped - in HTML mark-up, the text must first be HTML-escaped to prevent XSS | |
| 46 * attacks and other nasty business. This of course is always true, but the | |
| 47 * escaping cannot be done after the string has already been wrapped in | |
| 48 * mark-up, so the BiDi formatter also serves as a last chance and includes | |
| 49 * escaping services. | |
| 50 * | |
| 51 * Thus, in a single call, the formatter will escape the input string as | |
| 52 * specified, determine its directionality, and wrap it as necessary. It is | |
| 53 * then up to the caller to insert the return value in the output. | |
| 54 */ | |
| 55 | |
| 56 class BidiFormatter { | |
| 57 | |
| 58 /** The direction of the surrounding text (the context). */ | |
| 59 TextDirection contextDirection; | |
| 60 | |
| 61 /** | |
| 62 * Indicates if we should always wrap the formatted text in a `span` tag. | |
| 63 */ | |
| 64 bool _alwaysSpan; | |
| 65 | |
| 66 /** | |
| 67 * Create a formatting object with a direction. If [alwaysSpan] is true we | |
| 68 * should always use a `span` tag, even when the input directionality is | |
| 69 * neutral or matches the context, so that the DOM structure of the output | |
| 70 * does not depend on the combination of directionalities. | |
| 71 */ | |
| 72 BidiFormatter.LTR([alwaysSpan=false]) : contextDirection = TextDirection.LTR, | |
| 73 _alwaysSpan = alwaysSpan; | |
| 74 BidiFormatter.RTL([alwaysSpan=false]) : contextDirection = TextDirection.RTL, | |
| 75 _alwaysSpan = alwaysSpan; | |
| 76 BidiFormatter.UNKNOWN([alwaysSpan=false]) : | |
| 77 contextDirection = TextDirection.UNKNOWN, _alwaysSpan = alwaysSpan; | |
| 78 | |
| 79 /** Is true if the known context direction for this formatter is RTL. */ | |
| 80 bool get isRTL() => contextDirection == TextDirection.RTL; | |
| 81 | |
| 82 /** | |
| 83 * Formats a string of a given (or estimated, if not provided) | |
| 84 * [direction] for use in HTML output of the context directionality, so | |
| 85 * an opposite-directionality string is neither garbled nor garbles what | |
| 86 * follows it. | |
| 87 * If the input string's directionality doesn't match the context | |
| 88 * directionality, we wrap it with a `span` tag and add a `dir` attribute | |
| 89 * (either "dir=rtl" or "dir=ltr"). | |
| 90 * If alwaysSpan was true when constructing the formatter, the input is always | |
| 91 * wrapped with a `span` tag, omitting the `dir` attribute when it is not needed. | |
| 92 * | |
| 93 * If [resetDir] is true and the overall directionality or the exit | |
| 94 * directionality of [text] is opposite to the context directionality, | |
| 95 * a trailing unicode BiDi mark matching the context directionality is | |
| 96 * appended (LRM or RLM). If [isHtml] is false, we HTML-escape the [text]. | |
| 97 */ | |
| 98 String wrapWithSpan(String text, [bool isHtml=false, bool resetDir=true, | |
| 99 TextDirection direction]) { | |
| 100 if (direction == null) direction = estimateDirection(text, isHtml); | |
| 101 var result; | |
| 102 if (!isHtml) text = htmlEscape(text); | |
| 103 var directionChange = contextDirection.isDirectionChange(direction); | |
| 104 if (_alwaysSpan || directionChange) { | |
| 105 var spanDirection = ''; | |
| 106 if (directionChange) { | |
| 107 spanDirection = ' dir=${direction.spanText}'; | |
| 108 } | |
| 109 result = '<span$spanDirection>$text</span>'; | |
| 110 } else { | |
| 111 result = text; | |
| 112 } | |
| 113 return result.concat(resetDir ? _resetDir(text, direction, isHtml) : ''); | |
| 114 } | |
| 115 | |
| 116 /** | |
| 117 * Format [text] of a known (if specified) or estimated [direction] for use | |
| 118 * in *plain-text* output of the context directionality, so an | |
| 119 * opposite-directionality text is neither garbled nor garbles what follows | |
| 120 * it. Unlike wrapWithSpan, this makes use of unicode BiDi formatting | |
| 121 * characters instead of spans for wrapping. The returned string would be | |
| 122 * RLE+text+PDF for RTL text, or LRE+text+PDF for LTR text. | |
| 123 * | |
| 124 * If [resetDir] is true, and if the overall directionality or the exit | |
| 125 * directionality of [text] is opposite to the context directionality, | |
| 126 * a trailing unicode BiDi mark matching the context directionality is | |
| 127 * appended (LRM or RLM). | |
| 128 * | |
| 129 * In HTML, the *only* valid use of this function is inside of elements that | |
| 130 * do not allow markup, e.g. an 'option' tag. | |
| 131 * This function does *not* do HTML-escaping regardless of the value of | |
| 132 * [isHtml]. [isHtml] is used to designate if the text contains HTML (escaped | |
| 133 * or unescaped). | |
| 134 */ | |
| 135 String wrapWithUnicode(String text, [bool isHtml=false, bool resetDir=true, | |
| 136 TextDirection direction]) { | |
| 137 if (direction == null) direction = estimateDirection(text, isHtml); | |
| 138 var result = text; | |
| 139 if (contextDirection.isDirectionChange(direction)) { | |
| 140 result = '''${direction == TextDirection.RTL ? RLE : LRE}$text$PDF'''; | |
| 141 } | |
| 142 return result.concat(resetDir ? _resetDir(text, direction, isHtml) : ''); | |
| 143 } | |
| 144 | |
| 145 /** | |
| 146 * Estimates the directionality of [text] using the best known | |
| 147 * general-purpose method (using relative word counts). A | |
| 148 * TextDirection.UNKNOWN return value indicates completely neutral input. | |
| 149 * [isHtml] is true if [text] is HTML or HTML-escaped. | |
| 150 */ | |
| 151 TextDirection estimateDirection(String text, [bool isHtml=false]) { | |
| 152 return estimateDirectionOfText(text, isHtml); //TODO~!!! | |
| 153 } | |
| 154 | |
| 155 /** | |
| 156 * Returns a unicode BiDi mark matching the surrounding context's direction | |
| 157 * (not necessarily [direction], the direction of [text]). The function | |
| 158 * returns an LRM or RLM if the overall directionality or the exit | |
| 159 * directionality of [text] is opposite to the context directionality. | |
| 160 * Otherwise it returns the empty string. [isHtml] is true if [text] is HTML | |
| 161 * or HTML-escaped. | |
| 162 */ | |
| 163 String _resetDir(String text, TextDirection direction, bool isHtml) { | |
| 164 // endsWithRtl and endsWithLtr are called only if needed (short-circuit). | |
| 165 if ((contextDirection == TextDirection.LTR && | |
| 166 (direction == TextDirection.RTL || | |
| 167 endsWithRtl(text, isHtml))) || | |
| 168 (contextDirection == TextDirection.RTL && | |
| 169 (direction == TextDirection.LTR || | |
| 170 endsWithLtr(text, isHtml)))) { | |
| 171 if (contextDirection == TextDirection.LTR) { | |
| 172 return LRM; | |
| 173 } else { | |
| 174 return RLM; | |
| 175 } | |
| 176 } else { | |
| 177 return ''; | |
| 178 } | |
| 179 } | |
| 180 } | |
| OLD | NEW |
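
The wrapping behaviour described in the doc comments above can be sketched in a few self-contained lines. This is only an illustration written in current Dart syntax, not the code under review: `Dir`, `guessDirection`, `isChange`, `resetMark`, `wrapWithSpanSketch` and `wrapWithUnicodeSketch` are hypothetical names, the direction estimate is a crude first-strong-character check rather than the relative word count the class documents, the rule for what counts as a direction change is an assumption (the real `isDirectionChange` is not shown in this file), and the exit-directionality test (`endsWithRtl` / `endsWithLtr`) is left out.

```dart
import 'dart:convert' show htmlEscape;

/// Hypothetical stand-in for TextDirection in this sketch.
enum Dir { ltr, rtl, unknown }

// Unicode BiDi control characters.
const String lrm = '\u200E'; // LEFT-TO-RIGHT MARK
const String rlm = '\u200F'; // RIGHT-TO-LEFT MARK
const String lre = '\u202A'; // LEFT-TO-RIGHT EMBEDDING
const String rle = '\u202B'; // RIGHT-TO-LEFT EMBEDDING
const String pdf = '\u202C'; // POP DIRECTIONAL FORMATTING

/// Crude estimate (a simplification, not the word-count method): the
/// direction of the first character that is either in U+0590..U+08FF
/// (treated as RTL) or an ASCII letter (treated as LTR).
Dir guessDirection(String text) {
  for (final rune in text.runes) {
    if (rune >= 0x0590 && rune <= 0x08FF) return Dir.rtl;
    final isAsciiLetter =
        (rune >= 0x41 && rune <= 0x5A) || (rune >= 0x61 && rune <= 0x7A);
    if (isAsciiLetter) return Dir.ltr;
  }
  return Dir.unknown;
}

/// Assumed rule: neutral text never counts as a direction change.
bool isChange(Dir context, Dir direction) =>
    direction != Dir.unknown && direction != context;

/// Mark matching the context when the text runs the opposite way.
/// The exit-directionality check is omitted in this sketch.
String resetMark(Dir context, Dir direction) {
  if (context == Dir.ltr && direction == Dir.rtl) return lrm;
  if (context == Dir.rtl && direction == Dir.ltr) return rlm;
  return '';
}

/// Escape, estimate, and wrap plain text for HTML output.
String wrapWithSpanSketch(String text, Dir context,
    {bool alwaysSpan = false, bool resetDir = true}) {
  final direction = guessDirection(text);
  final escaped = htmlEscape.convert(text);
  final change = isChange(context, direction);
  final dirValue = direction == Dir.rtl ? 'rtl' : 'ltr';
  final dirAttr = change ? ' dir=$dirValue' : '';
  final body =
      (alwaysSpan || change) ? '<span$dirAttr>$escaped</span>' : escaped;
  return body + (resetDir ? resetMark(context, direction) : '');
}

/// Wrap plain text with embedding characters instead of markup.
/// No HTML-escaping is done here.
String wrapWithUnicodeSketch(String text, Dir context,
    {bool resetDir = true}) {
  final direction = guessDirection(text);
  final body = isChange(context, direction)
      ? '${direction == Dir.rtl ? rle : lre}$text$pdf'
      : text;
  return body + (resetDir ? resetMark(context, direction) : '');
}
```

Under these assumptions, calling `wrapWithSpanSketch` on a Hebrew string in an LTR context yields a `span` with `dir=rtl` followed by an LRM, while same-direction or neutral input passes through with escaping only. Named parameters are used here for readability; the class in the diff uses optional positional parameters instead.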