OLD | NEW |
(Empty) | |
| 1 # Strings |
| 2 |
| 3 Dart strings are immutable: once you create a string, you cannot change it. |
| 4 You can always build a string out of other strings, or assign the results |
| 5 of calling a method on a string to a new string. |
| 6 |
| 7 String literals can be written in three ways: with single quotes ('with |
| 8 embedded "double" quotes'), with double quotes ("with embedded 'single' |
| 9 quotes"), or with triple quotes ('''With single quotes''', """With double |
| 10 quotes"""). Triple quoted strings can span multiple lines with associated |
| 11 whitespace preserved. |
| 12 |
| 13 Dart does not have a char type. Indexing operations on strings give you |
| 14 one-character strings. |
| 15 |
| 16 Dart strings support string concatenation and expression interpolation. The |
| 17 String class provides methods for searching inside a string, extracting |
| 18 substrings, handling case, trimming whitespace, replacing a part of a |
| 19 string, and more. The StringBuffer class lets you programmatically build |
| 20 up a string in an efficient manner. You can use regular expressions |
| 21 (RegExp objects) to search within strings and to replace parts of strings. |
| 22 |
| 23 Dart string characters are encoded in UTF-16. Decoding UTF-16 yields Unicode |
| 24 code points. Borrowing terminology from Go, Dart uses the term `rune` for an |
| 25 integer representing a Unicode code point. The runes of a String are accessible |
| 26 through the `runes` getter. |
| 27 |
| 28 Dart strings support the full Unicode range, and cover every alphabetic system |
| 29 in use in the whole world. The String library provides support for the correct |
| 30 handling of extended UTF-16 characters. |
| 31 |
| 32 ## Concatenating Strings |
| 33 |
| 34 ### Problem |
| 35 |
| 36 You want to concatenate strings in Dart. You tried using `+`, but |
| 37 that resulted in an error. |
| 38 |
| 39 ### Solution |
| 40 |
| 41 Use adjacent string literals: |
| 42 |
| 43 var fact = 'Dart ' 'is' ' fun!'; // 'Dart is fun!' |
| 44 |
| 45 ### Discussion |
| 46 |
| 47 Adjacent literals also work over multiple lines: |
| 48 |
| 49 var fact = 'Dart ' |
| 50 'is ' |
| 51 'fun!'; // 'Dart is fun!' |
| 52 |
| 53 They also work when using multiline strings: |
| 54 |
| 55 var lunch = '''Peanut |
| 56 butter''' |
| 57 ''' and |
| 58 jelly'''; // 'Peanut\nbutter and\njelly' |
| 59 |
| 60 You can concatenate adjacent single line literals with multiline strings: |
| 61 |
| 62 var funnyGuys = 'Dewey ' 'Cheatem' |
| 63 ''' and |
| 64 Howe'''; // 'Dewey Cheatem and\n Howe' |
| 65 |
| 66 |
| 67 #### Alternatives to adjacent string literals |
| 68 |
| 69 You can also use the `concat()` method on a string to concatenate it to another |
| 70 string: |
| 71 |
| 72 var film = filmToWatch(); |
| 73 film = film.concat('\n'); // 'The Big Lebowski\n' |
| 74 |
| 75 Since `concat()` creates a new string every time it is invoked, a long chain of |
| 76 `concat()`s can be expensive. Avoid those. Use a StringBuffer instead (see |
| 77 _Incrementally building a string efficiently using a StringBuffer_, below). |
| 78 |
| 79 You can use `join()` to combine a sequence of strings: |
| 80 |
| 81 var film = ['The', 'Big', 'Lebowski'].join(' '); // 'The Big Lebowski' |
| 82 |
| 83 You can also use string interpolation to concatenate strings (see |
| 84 _Interpolating expressions inside strings_, below). |
| 85 |
| 86 |
| 87 ## Interpolating expressions inside strings |
| 88 |
| 89 ### Problem |
| 90 |
| 91 You want to create strings that contain Dart expressions and identifiers. |
| 92 |
| 93 ### Solution |
| 94 |
| 95 You can put the value of an expression inside a string by using `${expression}`. |
| 96 |
| 97 var favFood = 'sushi'; |
| 98 var whatDoILove = 'I love ${favFood.toUpperCase()}'; // 'I love SUSHI' |
| 99 |
| 100 You can skip the {} if the expression is an identifier: |
| 101 |
| 102 var whatDoILove = 'I love $favFood'; // 'I love sushi' |
| 103 |
| 104 ### Discussion |
| 105 |
| 106 An interpolated string, `string ${expression}` is equivalent to the |
| 107 concatenation of the strings 'string ' and `expression.toString()`. |
| 108 Consider this code: |
| 109 |
| 110 var four = 4; |
| 111 var seasons = 'The $four seasons'; // 'The 4 seasons' |
| 112 |
| 113 It is equivalent to the following: |
| 114 |
| 115 var seasons = 'The '.concat(4.toString()).concat(' seasons'); // 'The 4 seasons' |
| 116 |
| 117 You should consider implementing a `toString()` method for user-defined |
| 118 objects. Here's what happens if you don't: |
| 119 |
| 120 class Point { |
| 121 num x, y; |
| 122 Point(this.x, this.y); |
| 123 } |
| 124 |
| 125 var point = new Point(3, 4); |
| 126 print('Point: $point'); // "Point: Instance of 'Point'" |
| 127 |
| 128 Probably not what you wanted. Here is the same example with an explicit |
| 129 `toString()`: |
| 130 |
| 131 class Point { |
| 132 ... |
| 133 |
| 134 String toString() => 'x: $x, y: $y'; |
| 135 } |
| 136 |
| 137 print('Point: $point'); // 'Point: x: 3, y: 4' |
| 138 |
| 139 |
| 140 ## Escaping special characters |
| 141 |
| 142 ### Problem |
| 143 |
| 144 You want to put newlines, dollar signs, or other special characters in your strings. |
| 145 |
| 146 ### Solution |
| 147 |
| 148 Prefix special characters with a `\`. |
| 149 |
| 150 print('Wile\nCoyote'); |
| 151 // Wile |
| 152 // Coyote |
| 153 |
| 154 ### Discussion |
| 155 |
| 156 Dart designates a few characters as special, and these can be escaped: |
| 157 |
| 158 - \n for newline, equivalent to \x0A. |
| 159 - \r for carriage return, equivalent to \x0D. |
| 160 - \f for form feed, equivalent to \x0C. |
| 161 - \b for backspace, equivalent to \x08. |
| 162 - \t for tab, equivalent to \x09. |
| 163 - \v for vertical tab, equivalent to \x0B. |
| 164 |
| 165 If you prefer, you can use `\x` or `\u` notation to indicate the special |
| 166 character: |
| 167 |
| 168 print('Wile\x0ACoyote'); // same as print('Wile\nCoyote'); |
| 169 print('Wile\u000ACoyote'); // same as print('Wile\nCoyote'); |
| 170 |
| 171 You can also use `\u{}` notation: |
| 172 |
| 173 print('Wile\u{000A}Coyote'); // same as print('Wile\nCoyote'); |
| 174 |
| 175 You can also escape the `$` used in string interpolation: |
| 176 |
| 177 var superGenius = 'Wile Coyote'; |
| 178 print('$superGenius and Road Runner'); // 'Wile Coyote and Road Runner' |
| 179 print('\$superGenius and Road Runner'); // '$superGenius and Road Runner' |
| 180 |
| 181 If you escape a non-special character, the `\` is ignored: |
| 182 |
| 183 print('Wile \E Coyote'); // 'Wile E Coyote' |
| 184 |
| 185 |
| 186 ## Incrementally building a string efficiently using a StringBuffer |
| 187 |
| 188 ### Problem |
| 189 |
| 190 You want to collect string fragments and combine them in an efficient manner. |
| 191 |
| 192 ### Solution |
| 193 |
| 194 Use a StringBuffer to programmatically generate a string. A StringBuffer |
| 195 collects the string fragments, but does not generate a new string until |
| 196 `toString()` is called: |
| 197 |
| 198 var sb = new StringBuffer(); |
| 199 sb.write('John, '); |
| 200 sb.write('Paul, '); |
| 201 sb.write('George, '); |
| 202 sb.write('and Ringo'); |
| 203 var beatles = sb.toString(); // 'John, Paul, George, and Ringo' |
| 204 |
| 205 ### Discussion |
| 206 |
| 207 In addition to `write()`, the StringBuffer class provides methods to write a |
| 208 list of strings (`writeAll()`), write a numerical character code |
| 209 (`writeCharCode()`), write with an added newline (`writeln()`), and more. Here |
| 210 is a simple example that shows the use of these methods: |
| 211 |
| 212 var sb = new StringBuffer(); |
| 213 sb.writeln('The Beatles:'); |
| 214 sb.writeAll(['John, ', 'Paul, ', 'George, and Ringo']); |
| 215 sb.writeCharCode(33); // charCode for '!'. |
| 216 var beatles = sb.toString(); // 'The Beatles:\nJohn, Paul, George, and Ringo!' |
| 217 |
| 218 Since a StringBuffer waits until the call to `toString()` to generate the |
| 219 concatenated string, it represents a more efficient way of combining strings |
| 220 than `concat()`. See the _Concatenating Strings_ recipe for a description of |
| 221 `concat()`. |
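
As a rough sketch of the point above, the following helper collects fragments in a loop and builds the final string only once, at the end. The name `numberedList` is purely illustrative and is not part of the Dart libraries; the sketch also assumes an SDK that still accepts the `new` keyword.

```dart
// Hypothetical helper: formats each item on its own numbered line.
String numberedList(List<String> items) {
  var sb = new StringBuffer();
  for (var i = 0; i < items.length; i++) {
    // Each writeln() appends a fragment plus a newline to the buffer.
    sb.writeln('${i + 1}. ${items[i]}');
  }
  // The combined string is created only here.
  return sb.toString();
}

void main() {
  print(numberedList(['John', 'Paul', 'George', 'Ringo']));
}
```

Because every line ends with the newline that `writeln()` adds, the returned string finishes with a trailing '\n'.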
| 222 |
| 223 ## Converting between string characters and numerical codes |
| 224 |
| 225 ### Problem |
| 226 |
| 227 You want to convert string characters into numerical codes and back. |
| 228 |
| 229 ### Solution |
| 230 |
| 231 Use the `runes` getter to access a string's code points: |
| 232 |
| 233 'Dart'.runes.toList(); // [68, 97, 114, 116] |
| 234 |
| 235 var smileyFace = '\u263A'; // ☺ |
| 236 smileyFace.runes.toList(); // [9786] |
| 237 |
| 238 The number 9786 represents the code point '\u263A'. |
| 239 |
| 240 Use `string.codeUnits` to get a string's UTF-16 code units: |
| 241 |
| 242 'Dart'.codeUnits.toList(); // [68, 97, 114, 116] |
| 243 smileyFace.codeUnits.toList(); // [9786] |
| 244 |
| 245 ### Discussion |
| 246 |
| 247 Notice that using `runes` and `codeUnits` produces identical results |
| 248 in the examples above. That happens because each character in 'Dart' and in |
| 249 `smileyFace` fits within 16 bits, resulting in a code unit corresponding |
| 250 neatly with a code point. |
| 251 |
| 252 Consider an example where a character cannot be represented within 16 bits, |
| 253 the Unicode character for a Treble clef ('\u{1F3BC}'). This character consists |
| 254 of a surrogate pair: '\uD83C', '\uDFBC'. Getting the numerical value of this |
| 255 character using `codeUnits` and `runes` produces the following result: |
| 256 |
| 257 var clef = '\u{1F3BC}'; // 🎼 |
| 258 clef.codeUnits.toList(); // [55356, 57276] |
| 259 clef.runes.toList(); // [127932] |
| 260 |
| 261 The numbers 55356 and 57276 represent `clef`'s surrogate pair, '\uD83C' and |
| 262 '\uDFBC', respectively. The number 127932 represents the code point '\u{1F3BC}'. |
| 263 |
| 264 #### Using codeUnitAt() to access individual code units |
| 265 |
| 266 To access the 16-bit UTF-16 code unit at a particular index, use |
| 267 `codeUnitAt()`: |
| 268 |
| 269 'Dart'.codeUnitAt(0); // 68 |
| 270 smileyFace.codeUnitAt(0); // 9786 |
| 271 |
| 272 Using `codeUnitAt()` with the non-BMP `clef` character leads to problems: |
| 273 |
| 274 clef.codeUnitAt(0); // 55356 |
| 275 clef.codeUnitAt(1); // 57276 |
| 276 |
| 277 In either call to `clef.codeUnitAt()`, the values returned represent strings |
| 278 that are only one half of a UTF-16 surrogate pair. These are not valid UTF-16 |
| 279 strings. |
| 280 |
| 281 |
| 282 #### Converting numerical codes to strings |
| 283 |
| 284 You can generate a new string from runes or code units using the factory |
| 285 `String.fromCharCodes(charCodes)`: |
| 286 |
| 287 new String.fromCharCodes([68, 97, 114, 116]); // 'Dart' |
| 288 |
| 289 new String.fromCharCodes([73, 32, 9825, 32, 76, 117, 99, 121]); |
| 290 // 'I ♡ Lucy' |
| 291 |
| 292 new String.fromCharCodes([55356, 57276]); // 🎼 |
| 293 new String.fromCharCodes([127932]); // 🎼 |
| 294 |
| 295 You can use the `String.fromCharCode()` factory to convert a single rune or |
| 296 code unit to a string: |
| 297 |
| 298 new String.fromCharCode(68); // 'D' |
| 299 new String.fromCharCode(9786); // ☺ |
| 300 new String.fromCharCode(127932); // 🎼 |
| 301 |
| 302 Creating a string with only one half of a surrogate pair is permitted, but not |
| 303 recommended. |
| 304 |
| 305 ## Determining if a string is empty |
| 306 |
| 307 ### Problem |
| 308 |
| 309 You want to know if a string is empty. You tried `if (string) {...}`, but that |
| 310 did not work. |
| 311 |
| 312 ### Solution |
| 313 |
| 314 Use `string.isEmpty`: |
| 315 |
| 316 var emptyString = ''; |
| 317 emptyString.isEmpty; // true |
| 318 |
| 319 A string with a space is not empty: |
| 320 |
| 321 var space = ' '; |
| 322 space.isEmpty; // false |
| 323 |
| 324 ### Discussion |
| 325 |
| 326 Don't use `if (string)` to test the emptiness of a string. In Dart, all |
| 327 objects except the boolean true evaluate to false. `if(string)` will always |
| 328 be false. |
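
Since a string that contains only whitespace is not empty, a related check is sometimes useful. The sketch below combines `trim()` with `isEmpty`; the helper name `isBlank` is an assumption made for illustration, not an existing String method.

```dart
// Hypothetical helper: true when the string is empty or whitespace only.
bool isBlank(String s) => s.trim().isEmpty;

void main() {
  print(isBlank(''));       // true
  print(isBlank(' \t\n'));  // true
  print(isBlank(' x '));    // false
}
```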
| 329 |
| 330 |
| 331 ## Removing leading and trailing whitespace |
| 332 |
| 333 ### Problem |
| 334 |
| 335 You want to remove leading and trailing whitespace from a string. |
| 336 |
| 337 ### Solution |
| 338 |
| 339 Use `string.trim()`: |
| 340 |
| 341 var space = '\n\r\f\t\v'; // We'll use a variety of space characters. |
| 342 var string = '$space X $space'; |
| 343 var newString = string.trim(); // 'X' |
| 344 |
| 345 The String class has no methods to remove only leading or only trailing |
| 346 whitespace. But you can always use regExps. |
| 347 |
| 348 Remove only leading whitespace: |
| 349 |
| 350 var newString = string.replaceFirst(new RegExp(r'^\s+'), ''); // 'X $space' |
| 351 |
| 352 Remove only trailing whitespace: |
| 353 |
| 354 var newString = string.replaceFirst(new RegExp(r'\s+$'), ''); // '$space X' |
| 355 |
| 356 |
| 357 ## Calculating the length of a string |
| 358 |
| 359 ### Problem |
| 360 |
| 361 You want to get the length of a string, but are not sure how to |
| 362 correctly calculate the length when working with Unicode. |
| 363 |
| 364 ### Solution |
| 365 |
| 366 Use `string.length` to get the number of UTF-16 code units in a string: |
| 367 |
| 368 'I love music'.length; // 12 |
| 369 'I love music'.runes.length; // 12 |
| 370 |
| 371 ### Discussion |
| 372 |
| 373 For characters that fit into 16 bits, the code unit length is the same as the |
| 374 rune length: |
| 375 |
| 376 var hearts = '\u2661'; // ♡ |
| 377 hearts.length; // 1 |
| 378 hearts.runes.length; // 1 |
| 379 |
| 380 If the string contains any characters outside the Basic Multilingual |
| 381 Plane (BMP), the rune length will be less than the code unit length: |
| 382 |
| 383 var clef = '\u{1F3BC}'; // 🎼 |
| 384 clef.length; // 2 |
| 385 clef.runes.length; // 1 |
| 386 |
| 387 var music = 'I $hearts $clef'; // 'I ♡ 🎼' |
| 388 music.length; // 6 |
| 389 music.runes.length; // 5 |
| 390 |
| 391 Use `length` if you want the number of code units; use `runes.length` if you |
| 392 want the number of runes. |
| 393 |
| 394 |
| 395 ## Subscripting a string |
| 396 |
| 397 ### Problem |
| 398 |
| 399 You want to be able to access a character in a string at a particular index. |
| 400 |
| 401 ### Solution |
| 402 |
| 403 Subscript runes: |
| 404 |
| 405 var teacup = '\u{1F375}'; // 🍵 |
| 406 teacup.runes.toList()[0]; // 127861 |
| 407 |
| 408 The number 127861 represents the code point for teacup, '\u{1F375}' (🍵). |
| 409 |
| 410 ### Discussion |
| 411 |
| 412 Subscripting a string directly can be problematic. This is because the default |
| 413 `[]` implementation subscripts along code units. This means that |
| 414 for non-BMP characters, subscripting yields invalid UTF-16 characters: |
| 415 |
| 416 'Dart'[0]; // 'D' |
| 417 |
| 418 var hearts = '\u2661'; // ♡ |
| 419 hearts[0]; // '\u2661' (♡) |
| 420 |
| 421 teacup[0]; // '\uD83C', an invalid string: half of a surrogate pair. |
| 422 teacup.codeUnits.toList()[0]; // 55356, the code unit for that same half. |
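
To subscript by code point rather than by code unit, one option is a small wrapper around `runes`. This is only a sketch: `runeAt` is a hypothetical name, and because `runes` is a lazily decoded iterable, `elementAt()` walks the string from the start on each call.

```dart
// Hypothetical helper: returns the character at a rune (code point) index.
String runeAt(String s, int index) =>
    new String.fromCharCode(s.runes.elementAt(index));

void main() {
  var tea = 'A \u{1F375}'; // 'A 🍵'
  print(runeAt(tea, 0) == 'A');          // true
  print(runeAt(tea, 2) == '\u{1F375}');  // true
  print(tea[2] == '\u{1F375}');          // false: tea[2] is half a surrogate pair
}
```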
| 423 |
| 424 |
| 425 ## Processing a string one character at a time |
| 426 |
| 427 ### Problem |
| 428 |
| 429 You want to do something with each individual character in a string. |
| 430 |
| 431 ### Solution |
| 432 |
| 433 To access individual characters, map the string's runes (`happy` is 'I am ☺'): |
| 434 |
| 435 var charList = "Dart".runes.map((rune) => '*${new String.fromCharCode(rune)}*').toList(); |
| 436 // ['*D*', '*a*', '*r*', '*t*'] |
| 437 |
| 438 var runeList = happy.runes.map((rune) => [rune, new String.fromCharCode(rune)]).toList(); |
| 439 // [[73, 'I'], [32, ' '], [97, 'a'], [109, 'm'], [32, ' '], [9786, '☺']] |
| 440 |
| 441 If you are sure that the string is in the Basic Multilingual Plane (BMP), you |
| 442 can use `string.split('')`: |
| 443 |
| 444 'Dart'.split(''); // ['D', 'a', 'r', 't'] |
| 445 smileyFace.split('').length; // 1 |
| 446 |
| 447 Since `split('')` splits at the UTF-16 code unit boundaries, |
| 448 invoking it on a non-BMP character yields the string's surrogate pair: |
| 449 |
| 450 var clef = '\u{1F3BC}'; // 🎼 , not in BMP. |
| 451 clef.split('').length; // 2 |
| 452 |
| 453 The surrogate pair members are not valid UTF-16 strings. |
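
The difference matters for any per-character transformation. As an illustration, the sketch below reverses a string in two ways; both function names are hypothetical. Working on runes keeps each surrogate pair together, while working on the result of `split('')` swaps the two halves of the pair.

```dart
// Hypothetical helper: reverses by code point, so surrogate pairs stay intact.
String reverseByRunes(String s) =>
    new String.fromCharCodes(s.runes.toList().reversed);

// Hypothetical helper: reverses by code unit, which breaks surrogate pairs.
String reverseByCodeUnits(String s) => s.split('').reversed.join();

void main() {
  var tune = 'ab\u{1F3BC}'; // 'ab🎼'
  print(reverseByRunes(tune) == '\u{1F3BC}ba');      // true
  print(reverseByCodeUnits(tune) == '\u{1F3BC}ba');  // false
  print(reverseByRunes(tune).runes.length);          // 3
}
```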
| 454 |
| 455 |
| 456 ## Splitting a string into substrings |
| 457 |
| 458 ### Problem |
| 459 |
| 460 You want to split a string into substrings. |
| 461 |
| 462 ### Solution |
| 463 |
| 464 Use the `split()` method with a string or a regExp as an argument. |
| 465 |
| 466 var smileyFace = '\u263A'; |
| 467 var happy = 'I am $smileyFace'; |
| 468 happy.split(' '); // ['I', 'am', '☺'] |
| 469 |
| 470 Here is an example of using `split()` with a regExp: |
| 471 |
| 472 var nums = '2/7 3 4/5 3~/5'; |
| 473 var numsRegExp = new RegExp(r'(\s|/|~/)'); |
| 474 nums.split(numsRegExp); // ['2', '7', '3', '4', '5', '3', '5'] |
| 475 |
| 476 In the code above, the string `nums` contains various numbers, some of which |
| 477 are expressed as fractions or as int-divisions. A regExp is used to split the |
| 478 string to extract just the numbers. |
| 479 |
| 480 You can perform operations on the matched and unmatched portions of a string |
| 481 by using `splitMapJoin()` with a regExp: |
| 482 |
| 483 'Eats SHOOTS leaves'.splitMapJoin((new RegExp(r'SHOOTS')), |
| 484 onMatch: (m) => '*${m.group(0).toLowerCase()}*', |
| 485 onNonMatch: (n) => n.toUpperCase()); // 'EATS *shoots* LEAVES' |
| 486 |
| 487 The regExp matches the middle word ('SHOOTS'). A pair of callbacks is |
| 488 registered to transform the matched and unmatched substrings before the |
| 489 substrings are joined together again. |
| 490 |
| 491 |
| 492 ## Changing string case |
| 493 |
| 494 ### Problem |
| 495 |
| 496 You want to change the case of strings. |
| 497 |
| 498 ### Solution |
| 499 |
| 500 Use `string.toUpperCase()` and `string.toLowerCase()` to convert a string to |
| 501 upper-case or lower-case, respectively: |
| 502 |
| 503 var theOneILove = 'I love Lucy'; |
| 504 theOneILove.toUpperCase(); // 'I LOVE LUCY' |
| 505 theOneILove.toLowerCase(); // 'i love lucy' |
| 506 |
| 507 ### Discussion |
| 508 |
| 509 Case changes affect the characters of bi-cameral scripts like Greek and Latin (used for French): |
| 510 var zeus = '\u0394\u03af\u03b1\u03c2'; // 'Δίας' (Zeus in modern Greek) |
| 511 zeus.toUpperCase(); // 'ΔΊΑΣ' |
| 512 |
| 513 var resume = '\u0052\u00e9\u0073\u0075\u006d\u00e9'; // 'Résumé' |
| 514 resume.toLowerCase(); // 'résumé' |
| 515 |
| 516 They do not affect the characters of uni-cameral scripts like Devanagari (used for |
| 517 writing many of the languages of India): |
| 518 |
| 519 var chickenKebab = '\u091a\u093f\u0915\u0928 \u0915\u092c\u093e\u092c'; |
| 520 // 'चिकन कबाब' (in Devanagari) |
| 521 chickenKebab.toLowerCase(); // 'चिकन कबाब' |
| 522 chickenKebab.toUpperCase(); // 'चिकन कबाब' |
| 523 |
| 524 If a character's case does not change when using `toUpperCase()` and |
| 525 `toLowerCase()`, it is most likely because the character only has one |
| 526 form. |
| 527 |
| 528 ## Determining whether a string contains another string |
| 529 |
| 530 ### Problem |
| 531 |
| 532 You want to find out if a string is the substring of another string. |
| 533 |
| 534 ### Solution |
| 535 |
| 536 Use `string.contains()`: |
| 537 |
| 538 var string = 'Dart strings are immutable'; |
| 539 string.contains('immutable'); // True. |
| 540 |
| 541 You can indicate a startIndex as a second argument: |
| 542 |
| 543 string.contains('Dart', 2); // False |
| 544 |
| 545 ### Discussion |
| 546 |
| 547 The String library provides a couple of shortcuts for testing whether a string |
| 548 is a substring of another: |
| 549 |
| 550 string.startsWith('Dart'); // True. |
| 551 string.endsWith('e'); // True. |
| 552 |
| 553 You can also use `string.indexOf()`, which returns -1 if the substring is |
| 554 not found within a string, and its matching index, if it is: |
| 555 |
| 556 string.indexOf('art') != -1; // True, `art` is found in `Dart` |
| 557 |
| 558 You can also use a regExp and `hasMatch()`: |
| 559 |
| 560 new RegExp(r'ar[et]').hasMatch(string); // True, 'art' and 'are' match. |
| 561 |
| 562 |
| 563 ## Finding matches of a regExp pattern in a string |
| 564 |
| 565 ### Problem |
| 566 |
| 567 You want to use regExp to match a pattern in a string, and |
| 568 want to be able to access the matches. |
| 569 |
| 570 ### Solution |
| 571 |
| 572 Construct a regular expression using the RegExp class and find matches using |
| 573 the `allMatches()` method: |
| 574 |
| 575 var neverEatingThat = 'Not with a fox, not in a box'; |
| 576 var regExp = new RegExp(r'[fb]ox'); |
| 577 var matches = regExp.allMatches(neverEatingThat); // an Iterable of matches |
| 578 matches.map((match) => match.group(0)).toList(); // ['fox', 'box'] |
| 579 |
| 580 ### Discussion |
| 581 |
| 582 You can query the object returned by `allMatches()` to find out the number of |
| 583 matches: |
| 584 |
| 585 matches.length; // 2 |
| 586 |
| 587 To find the first match, use `firstMatch()`: |
| 588 |
| 589 regExp.firstMatch(neverEatingThat).group(0); // 'fox' |
| 590 |
| 591 To directly access the matched string, use `stringMatch()`: |
| 592 |
| 593 regExp.stringMatch(neverEatingThat); // 'fox' |
| 594 regExp.stringMatch('I like bagels and lox'); // null |
| 595 |
| 596 |
| 597 ## Substituting strings based on regExp matches |
| 598 |
| 599 ### Problem |
| 600 |
| 601 You want to match substrings within a string and make substitutions based on |
| 602 the matches. |
| 603 |
| 604 ### Solution |
| 605 |
| 606 Construct a regular expression using the RegExp class and make replacements |
| 607 using the `replaceAll()` method: |
| 608 |
| 609 'resume'.replaceAll(new RegExp(r'e'), '\u00E9'); // 'résumé' |
| 610 |
| 611 If you want to replace just the first match, use `replaceFirst()`: |
| 612 |
| 613 '0.0001'.replaceFirst(new RegExp(r'0+'), ''); // '.0001' |
| 614 |
| 615 The RegExp matches the first run of one or more 0's, which is replaced with an empty string. |
| 616 |
| 617 You can use `replaceAllMapped()` and register a function to modify the |
| 618 matches: |
| 619 |
| 620 var heart = '\u2661'; // '♡' |
| 621 var string = 'I like Ike but I $heart Lucy'; |
| 622 var regExp = new RegExp(r'[A-Z]\w+'); |
| 623 string.replaceAllMapped(regExp, (match) => match.group(0).toUpperCase()); |
| 624 // 'I like IKE but I ♡ LUCY' |
| 625 ============================================================================== |
| 626 |
| 627 |
| 628 The string recipes included in this chapter assume that you have some |
| 629 familiarity with Unicode and UTF-16. Here is a brief refresher: |
| 630 |
| 631 ### What is the Basic Multilingual Plane? |
| 632 |
| 633 The Unicode code space is divided into seventeen planes of 65,536 points each. |
| 634 The first plane (code points U+0000 to U+FFFF) contains the most |
| 635 frequently used characters and is called the Basic Multilingual Plane or BMP. |
| 636 |
| 637 ### What is a Surrogate Pair? |
| 638 |
| 639 The term 'surrogate pair' refers to a means of encoding Unicode characters |
| 640 outside the Basic Multilingual Plane. |
| 641 |
| 642 In UTF-16, two-byte (16-bit) code units are used to store Unicode |
| 643 characters. Since two bytes can only contain the 65,536 characters in the 0x0 |
| 644 to 0xFFFF range, a pair of code units is used to store values in the |
| 645 0x10000 to 0x10FFFF range. |
| 646 |
| 647 For example, the Unicode character for musical Treble-clef (🎼), with |
| 648 a value of '\u{1F3BC}', is too large to fit in 16 bits. |
| 649 |
| 650 var clef = '\u{1F3BC}'; // 🎼 |
| 651 |
| 652 '\u{1F3BC}' is composed of a UTF-16 surrogate pair: [\uD83C, \uDFBC]. |
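
The arithmetic behind that pair can be sketched as follows. The function name `toSurrogatePair` is hypothetical, and the sketch assumes its argument lies in the 0x10000 to 0x10FFFF range; it follows the standard UTF-16 rule of subtracting 0x10000 and splitting the remaining 20 bits into two 10-bit halves.

```dart
// Hypothetical helper: computes the UTF-16 surrogate pair for a non-BMP
// code point. Assumes 0x10000 <= codePoint <= 0x10FFFF.
List<int> toSurrogatePair(int codePoint) {
  var offset = codePoint - 0x10000;     // a 20-bit value
  var high = 0xD800 + (offset >> 10);   // top 10 bits
  var low = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits
  return [high, low];
}

void main() {
  print(toSurrogatePair(0x1F3BC));  // [55356, 57276]
  print('\u{1F3BC}'.codeUnits);     // [55356, 57276]
}
```

Here 55356 is 0xD83C and 57276 is 0xDFBC, matching the pair listed above.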
| 653 |
| 654 ### What is the difference between a code point and a code unit? |
| 655 |
| 656 Within the Basic Multilingual Plane, the code point for a character is |
| 657 numerically the same as the code unit for that character. |
| 658 |
| 659 'D'.runes.first; // 68 |
| 660 'D'.codeUnits.first; // 68 |
| 661 |
| 662 For non-BMP characters, each code point is represented by two code units. |
| 663 |
| 664 var clef = '\u{1F3BC}'; // 🎼 |
| 665 clef.runes.length; // 1 |
| 666 clef.codeUnits.length; // 2 |
| 667 |
| 668 ### What exactly is a character? |
| 669 |
| 670 A character is a string contained in the Universal Character Set. |
| 671 Each character maps to a single rune value (code point); BMP characters |
| 672 map to 1 code unit; non-BMP characters map to 2 code units. |
| 673 |
| 674 You can read more about the Universal Character Set at |
| 675 http://en.wikipedia.org/wiki/Universal_Character_Set. |
| 676 |
| 677 |