Chromium Code Reviews

| OLD | NEW |
|---|---|
| (Empty) | |
| 1 # Strings | |
| 2 | |
| 3 A Dart string represents a sequence of characters encoded in UTF-16. Decoding | |
| 4 UTF-16 yields Unicode code points. Borrowing terminology from Go, Dart uses | |
| 5 the term `rune` for an integer representing a Unicode code point. | |
| 6 | |
| 7 The string recipes included in this chapter assume that you have some | |
| 8 familiarity with Unicode and UTF-16. Here is a brief refresher: | |
| 9 | |
| 10 ### What is the Basic Multilingual Plane? | |
| 11 | |
| 12 The Unicode code space is divided into seventeen planes of 65,536 points each. | |
| 13 The first plane (code points U+0000 to U+FFFF) contains the most | |
| 14 frequently used characters and is called the Basic Multilingual Plane or BMP. | |
| 15 | |
| 16 ### What is a Surrogate Pair? | |
| 17 | |
| 18 The term 'surrogate pair' refers to a means of encoding Unicode characters | |
| 19 outside the Basic Multilingual Plane. | |
| 20 | |
| 21 In UTF-16, 16-bit code units are used to store Unicode | |
| 22 characters. Since a single code unit can only represent the 65,536 values in the 0x0 | |
| 23 to 0xFFFF range, a pair of code units is used to store values in the | |
| 24 0x10000 to 0x10FFFF range. | |
| 25 | |
| 26 For example, the Unicode character for the musical treble clef (🎼), with | |
| 27 a value of '\u{1F3BC}', is too large to fit in 16 bits. | |
| 28 | |
| 29 var clef = '\u{1F3BC}'; // 🎼 | |
| 30 | |
| 31 '\u{1F3BC}' is composed of a UTF-16 surrogate pair: [\uD83C, \uDFBC]. | |
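
The arithmetic behind that surrogate pair can be sketched as follows. This is a minimal illustration, not part of the reviewed file; `toSurrogatePair` is a hypothetical helper name and does not exist in the Dart core library.

```dart
// Hypothetical helper (not in dart:core): splits a code point outside the
// BMP into its UTF-16 high and low surrogate code units.
List<int> toSurrogatePair(int codePoint) {
  if (codePoint < 0x10000 || codePoint > 0x10FFFF) {
    throw ArgumentError('Not a non-BMP code point: $codePoint');
  }
  final offset = codePoint - 0x10000; // A 20-bit value.
  final high = 0xD800 + (offset >> 10); // Top 10 bits.
  final low = 0xDC00 + (offset & 0x3FF); // Bottom 10 bits.
  return [high, low];
}

void main() {
  final pair = toSurrogatePair(0x1F3BC);
  print(pair.map((unit) => unit.toRadixString(16)).toList()); // [d83c, dfbc]
  print(String.fromCharCodes(pair) == '\u{1F3BC}'); // true
}
```
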
| 32 | |
| 33 ### What is the difference between a code point and a code unit? | |
| 34 | |
| 35 Within the Basic Multilingual Plane, the code point for a character is | |
| 36 numerically the same as the code unit for that character. | |
| 37 | |
| 38 'D'.runes.first; // 68 | |
| 39 'D'.codeUnits.first; // 68 | |
| 40 | |
| 41 For non-BMP characters, each code point is represented by two code units. | |
| 42 | |
| 43 var clef = '\u{1F3BC}'; // 🎼 | |
| 44 clef.runes.length; // 1 | |
| 45 clef.codeUnits.length; // 2 | |
| 46 | |
| 47 ### What exactly is a character? | |
| 48 | |
| 49 A character is a string contained in the Universal Character Set. Each character | |
| 50 maps to a single rune value (code point); BMP characters map to 1 code | |
| 51 unit; non-BMP characters map to 2 code units. | |
| 52 | |
| 53 You can read more about the Universal Character Set at | |
| 54 http://en.wikipedia.org/wiki/Universal_Character_Set. | |
| 55 | |
| 56 ### Do I really have to deal with Unicode? | |
| 57 | |
| 58 Yes, if you want to build robust international applications, you do. | |
| 59 Besides, the String library makes working with Unicode relatively painless, | |
| 60 so there's no great overhead in doing things right. | |
| 61 | |
| 62 ## Concatenating Strings | |
| 63 | |
| 64 ### Problem | |
| 65 | |
| 66 You want to concatenate strings in Dart. You tried using `+`, but | |
| 67 that resulted in an error. | |
| 68 | |
| 69 ### Solution | |
| 70 | |
| 71 Use adjacent string literals: | |
| 72 | |
| 73 var fact = 'Dart ' 'is' ' fun!'; // 'Dart is fun!' | |
| 74 | |
| 75 ### Discussion | |
| 76 | |
| 77 Adjacent literals also work over multiple lines: | |
| 78 | |
| 79 var fact = 'Dart ' | |
| 80 'is ' | |
| 81 'fun!'; // 'Dart is fun!' | |
| 82 | |
| 83 They also work when using multiline strings: | |
| 84 | |
| 85 var lunch = '''Peanut | |
| 86 butter''' | |
| 87 ''' and | |
| 88 jelly'''; // 'Peanut\nbutter and\njelly' | |
| 89 | |
| 90 You can concatenate adjacent single line literals with multiline strings: | |
| 91 | |
| 92 var funnyGuys = 'Dewey ' 'Cheatem' | |
| 93 ''' and | |
| 94 Howe'''; // 'Dewey Cheatem and\n Howe' | |
| 95 | |
| 96 | |
| 97 #### Alternatives to adjacent string literals | |
| 98 | |
| 99 You can also use the `concat()` method on a string to concatenate it to another | |
|
floitsch
2013/03/09 00:01:41
I just gave an LGTM to Lasse for changing concat t
| |
| 100 string: | |
| 101 | |
| 102 var film = filmToWatch(); | |
| 103 film = film.concat('\n'); // 'The Big Lebowski\n' | |
| 104 | |
| 105 Since `concat()` creates a new string every time it is invoked, a long chain of | |
| 106 `concat()`s can be expensive. Avoid those. Use a StringBuffer instead (see | |
| 107 _Incrementally building a string efficiently using a StringBuffer_, below). | |
| 108 | |
| 109 You can use `join()` to combine a sequence of strings: | |
| 110 | |
| 111 var film = ['The', 'Big', 'Lebowski'].join(' '); // 'The Big Lebowski' | |
| 112 | |
| 113 You can also use string interpolation to concatenate strings (see | |
| 114 _Interpolating expressions inside strings_, below). | |
| 115 | |
| 116 | |
| 117 ## Interpolating expressions inside strings | |
| 118 | |
| 119 ### Problem | |
| 120 | |
| 121 You want to create strings that contain Dart expressions and identifiers. | |
| 122 | |
| 123 ### Solution | |
| 124 | |
| 125 You can put the value of an expression inside a string by using ${expression}. | |
| 126 | |
| 127 var favFood = 'sushi'; | |
| 128 var whatDoILove = 'I love ${favFood.toUpperCase()}'; // 'I love SUSHI' | |
| 129 | |
| 130 You can skip the {} if the expression is an identifier: | |
| 131 | |
| 132 var whatDoILove = 'I love $favFood'; // 'I love sushi' | |
| 133 | |
| 134 ### Discussion | |
| 135 | |
| 136 An interpolated string, `string ${expression}` is equivalent to the | |
| 137 concatenation of the strings 'string ' and `expression.toString()`. | |
| 138 Consider this code: | |
| 139 | |
| 140 var four = 4; | |
| 141 var seasons = 'The $four seasons'; // 'The 4 seasons' | |
| 142 | |
| 143 It is equivalent to the following: | |
|
floitsch
2013/03/09 00:01:41
It is not. the concat will make two copies, wherea
| |
| 144 | |
| 145 var seasons = 'The '.concat(4.toString()).concat(' seasons'); // 'The 4 seasons' | |
| 146 | |
| 147 You should consider implementing a `toString()` method for user-defined | |
| 148 objects. Here's what happens if you don't: | |
| 149 | |
| 150 class Point { | |
| 151 num x, y; | |
| 152 Point(this.x, this.y); | |
| 153 } | |
| 154 | |
| 155 var point = new Point(3, 4); | |
| 156 print('Point: $point'); // "Point: Instance of 'Point'" | |
| 157 | |
| 158 Probably not what you wanted. Here is the same example with an explicit | |
| 159 `toString()`: | |
| 160 | |
| 161 class Point { | |
| 162 ... | |
| 163 | |
| 164 String toString() => 'x: $x, y: $y'; | |
| 165 } | |
| 166 | |
| 167 print('Point: $point'); // 'Point: x: 3, y: 4' | |
| 168 | |
| 169 | |
| 170 ## Escaping special characters | |
| 171 | |
| 172 ### Problem | |
| 173 | |
| 174 You want to put newlines, dollar signs, or other special characters in your strings. | |
| 175 | |
| 176 ### Solution | |
| 177 | |
| 178 Prefix special characters with a `\`. | |
| 179 | |
| 180 print('Wile\nCoyote'); | |
| 181 // Wile | |
| 182 // Coyote | |
| 183 | |
| 184 ### Discussion | |
| 185 | |
| 186 Dart designates a few characters as special, and these can be escaped: | |
| 187 | |
| 188 - \n for newline, equivalent to \x0A. | |
| 189 - \r for carriage return, equivalent to \x0D. | |
| 190 - \f for form feed, equivalent to \x0C. | |
| 191 - \b for backspace, equivalent to \x08. | |
| 192 - \t for tab, equivalent to \x09. | |
| 193 - \v for vertical tab, equivalent to \x0B. | |
| 194 | |
| 195 If you prefer, you can use `\x` or `\u` notation to indicate the special | |
| 196 character: | |
| 197 | |
| 198 print('Wile\x0ACoyote'); // same as print('Wile\nCoyote'); | |
| 199 print('Wile\u000ACoyote'); // same as print('Wile\nCoyote'); | |
| 200 | |
| 201 You can also use `\u{}` notation: | |
| 202 | |
| 203 print('Wile\u{000A}Coyote'); // same as print('Wile\nCoyote'); | |
| 204 | |
| 205 You can also escape the `$` used in string interpolation: | |
| 206 | |
| 207 var superGenius = 'Wile Coyote'; | |
| 208 print('$superGenius and Road Runner'); // 'Wile Coyote and Road Runner' | |
| 209 print('\$superGenius and Road Runner'); // '$superGenius and Road Runner' | |
| 210 | |
| 211 If you escape a non-special character, the `\` is ignored: | |
| 212 | |
| 213 print('Wile \E Coyote'); // 'Wile E Coyote' | |
| 214 | |
| 215 | |
| 216 ## Incrementally building a string efficiently using a StringBuffer | |
| 217 | |
| 218 ### Problem | |
| 219 | |
| 220 You want to collect string fragments and combine them in an efficient manner. | |
| 221 | |
| 222 ### Solution | |
| 223 | |
| 224 Use a StringBuffer to programmatically generate a string. A StringBuffer | |
| 225 collects the string fragments, but does not generate a new string until | |
| 226 `toString()` is called: | |
| 227 | |
| 228 var sb = new StringBuffer(); | |
| 229 sb.write('John, '); | |
| 230 sb.write('Paul, '); | |
| 231 sb.write('George, '); | |
| 232 sb.write('and Ringo'); | |
| 233 var beatles = sb.toString(); // 'John, Paul, George, and Ringo' | |
| 234 | |
| 235 ### Discussion | |
| 236 | |
| 237 In addition to `write()`, the StringBuffer class provides methods to write a | |
| 238 list of strings (`writeAll()`), write a numerical character code | |
| 239 (`writeCharCode()`), write with an added newline (`writeln()`), and more. Here | |
| 240 is a simple example that shows the use of these methods: | |
| 241 | |
| 242 var sb = new StringBuffer(); | |
| 243 sb.writeln('The Beatles:'); | |
| 244 sb.writeAll(['John, ', 'Paul, ', 'George, and Ringo']); | |
| 245 sb.writeCharCode(33); // charCode for '!'. | |
| 246 var beatles = sb.toString(); // 'The Beatles:\nJohn, Paul, George, and Ringo!' | |
| 247 | |
| 248 Since a StringBuffer waits until the call to `toString()` to generate the | |
| 249 concatenated string, it represents a more efficient way of combining strings | |
| 250 than `concat()`. See the _Concatenating Strings_ recipe for a description of | |
| 251 `concat()`. | |
| 252 | |
| 253 ## Converting between string characters and numerical codes | |
| 254 | |
| 255 ### Problem | |
| 256 | |
| 257 You want to convert string characters into numerical codes and back. | |
| 258 | |
| 259 ### Solution | |
| 260 | |
| 261 Use the `runes` getter to access a string's code points: | |
| 262 | |
| 263 'Dart'.runes.toList(); // [68, 97, 114, 116] | |
| 264 | |
| 265 var smileyFace = '\u263A'; // ☺ | |
| 266 smileyFace.runes.toList(); // [9786] | |
| 267 | |
| 268 The number 9786 represents the code point '\u263A'. | |
| 269 | |
| 270 Use `string.codeUnits` to get a string's UTF-16 code units: | |
| 271 | |
| 272 'Dart'.codeUnits.toList(); // [68, 97, 114, 116] | |
| 273 smileyFace.codeUnits.toList(); // [9786] | |
| 274 | |
| 275 ### Discussion | |
| 276 | |
| 277 Notice that using `runes` and `codeUnits` produces identical results | |
| 278 in the examples above. That happens because each character in 'Dart' and in | |
| 279 `smileyFace` fits within 16 bits, resulting in a code unit corresponding | |
| 280 neatly with a code point. | |
| 281 | |
| 282 Consider an example where a character cannot be represented within 16 bits, | |
| 283 the Unicode character for a Treble clef ('\u{1F3BC}'). This character consists | |
| 284 of a surrogate pair: '\uD83C', '\uDFBC'. Getting the numerical value of this | |
| 285 character using `codeUnits` and `runes` produces the following result: | |
| 286 | |
| 287 var clef = '\u{1F3BC}'; // 🎼 | |
| 288 clef.codeUnits.toList(); // [55356, 57276] | |
| 289 clef.runes.toList(); // [127932] | |
| 290 | |
| 291 The numbers 55356 and 57276 represent `clef`'s surrogate pair, '\uD83C' and | |
| 292 '\uDFBC', respectively. The number 127932 represents the code point '\u{1F3BC}'. | |
| 293 | |
| 294 #### Using codeUnitAt() to access individual code units | |
| 295 | |
| 296 To access the 16-bit UTF-16 code unit at a particular index, use | |
| 297 `codeUnitAt()`: | |
| 298 | |
| 299 'Dart'.codeUnitAt(0); // 68 | |
| 300 smileyFace.codeUnitAt(0); // 9786 | |
| 301 | |
| 302 Using `codeUnitAt()` with the non-BMP `clef` character leads to problems: | |
| 303 | |
| 304 clef.codeUnitAt(0); // 55356 | |
| 305 clef.codeUnitAt(1); // 57276 | |
| 306 | |
| 307 In either call to `clef.codeUnitAt()`, the value returned is only one half | |
| 308 of a UTF-16 surrogate pair. Converted to a string on its own, it would not be | |
| 309 valid UTF-16. | |
| 310 | |
| 311 | |
| 312 #### Converting numerical codes to strings | |
| 313 | |
| 314 You can generate a new string from runes or code units using the factory | |
| 315 `String.fromCharCodes(charCodes)`: | |
| 316 | |
| 317 new String.fromCharCodes([68, 97, 114, 116]); // 'Dart' | |
| 318 | |
| 319 new String.fromCharCodes([73, 32, 9825, 32, 76, 117, 99, 121]); | |
| 320 // 'I ♡ Lucy' | |
| 321 | |
| 322 new String.fromCharCodes([55356, 57276]); // 🎼 | |
| 323 new String.fromCharCodes([127932]); // 🎼 | |
| 324 | |
| 325 You can use the `String.fromCharCode()` factory to convert a single rune or | |
| 326 code unit to a string: | |
| 327 | |
| 328 new String.fromCharCode(68); // 'D' | |
| 329 new String.fromCharCode(9786); // ☺ | |
| 330 new String.fromCharCode(127932); // 🎼 | |
| 331 | |
| 332 Creating a string with only one half of a surrogate pair is permitted, but not | |
| 333 recommended. | |
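
To check whether a string ended up with an unpaired surrogate, one option is to walk its code units. The sketch below is an illustration only; `hasLoneSurrogate` is a hypothetical helper name, not a method of the String class discussed in this recipe.

```dart
// Hypothetical helper: returns true if the string contains a high surrogate
// without a following low surrogate, or a low surrogate on its own.
bool hasLoneSurrogate(String s) {
  final units = s.codeUnits;
  for (var i = 0; i < units.length; i++) {
    final unit = units[i];
    if (unit >= 0xD800 && unit <= 0xDBFF) {
      // High surrogate: must be followed by a low surrogate.
      if (i + 1 >= units.length) return true;
      final next = units[i + 1];
      if (next < 0xDC00 || next > 0xDFFF) return true;
      i++; // Skip the low half of a well-formed pair.
    } else if (unit >= 0xDC00 && unit <= 0xDFFF) {
      return true; // Low surrogate with no preceding high surrogate.
    }
  }
  return false;
}

void main() {
  print(hasLoneSurrogate('\u{1F3BC}')); // false
  print(hasLoneSurrogate(String.fromCharCode(0xD83C))); // true
}
```
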
| 334 | |
| 335 ## Determining if a string is empty | |
| 336 | |
| 337 ### Problem | |
| 338 | |
| 339 You want to know if a string is empty. You tried ` if(string) {...}`, but that | |
| 340 did not work. | |
| 341 | |
| 342 ### Solution | |
| 343 | |
| 344 Use `string.isEmpty`: | |
| 345 | |
| 346 var emptyString = ''; | |
| 347 emptyString.isEmpty; // true | |
| 348 | |
| 349 A string with a space is not empty: | |
| 350 | |
| 351 var space = ' '; | |
| 352 space.isEmpty; // false | |
| 353 | |
| 354 ### Discussion | |
| 355 | |
| 356 Don't use `if (string)` to test the emptiness of a string. In Dart, all | |
| 357 objects except the boolean true evaluate to false. `if(string)` will always | |
| 358 be false. | |
| 359 | |
| 360 | |
| 361 ## Removing leading and trailing whitespace | |
| 362 | |
| 363 ### Problem | |
| 364 | |
| 365 You want to remove leading and trailing whitespace from a string. | |
| 366 | |
| 367 ### Solution | |
| 368 | |
| 369 Use `string.trim()`: | |
| 370 | |
| 371 var space = '\n\r\f\t\v'; // We'll use a variety of space characters. | |
| 372 var string = '$space X $space'; | |
| 373 var newString = string.trim(); // 'X' | |
| 374 | |
| 375 The String class has no methods to remove only leading or only trailing | |
| 376 whitespace. But you can always use regExps. | |
| 377 | |
| 378 Remove only leading whitespace: | |
| 379 | |
| 380 var newString = string.replaceFirst(new RegExp(r'^\s+'), ''); // 'X $space' | |
| 381 | |
| 382 Remove only trailing whitespace: | |
| 383 | |
| 384 var newString = string.replaceFirst(new RegExp(r'\s+$'), ''); // '$space X' | |
| 385 | |
| 386 | |
| 387 ## Calculating the length of a string | |
| 388 | |
| 389 ### Problem | |
| 390 | |
| 391 You want to get the length of a string, but are not sure how to | |
| 392 correctly calculate the length when working with Unicode. | |
| 393 | |
| 394 ### Solution | |
| 395 | |
| 396 Use string.length to get the number of UTF-16 code units in a string: | |
| 397 | |
| 398 'I love music'.length; // 12 | |
| 399 'I love music'.runes.length; // 12 | |
| 400 | |
| 401 ### Discussion | |
| 402 | |
| 403 For characters that fit into 16 bits, the code unit length is the same as the | |
| 404 rune length: | |
| 405 | |
| 406 var hearts = '\u2661'; // ♡ | |
| 407 hearts.length; // 1 | |
| 408 hearts.runes.length; // 1 | |
| 409 | |
| 410 If the string contains any characters outside the Basic Multilingual | |
| 411 Plane (BMP), the rune length will be less than the code unit length: | |
| 412 | |
| 413 var clef = '\u{1F3BC}'; // 🎼 | |
| 414 clef.length; // 2 | |
| 415 clef.runes.length; // 1 | |
| 416 | |
| 417 var music = 'I $hearts $clef'; // 'I ♡ 🎼' | |
| 418 music.length; // 6 | |
| 419 music.runes.length; // 5 | |
| 420 | |
| 421 Use `length` if you want the number of code units; use `runes.length` if you | |
| 422 want the number of runes. | |
|
floitsch
2013/03/09 00:01:41
You could add, that Twitter uses runes for the len
| |
| 423 | |
| 424 | |
| 425 ## Subscripting a string | |
| 426 | |
| 427 ### Problem | |
| 428 | |
| 429 You want to be able to access a character in a string at a particular index. | |
| 430 | |
| 431 ### Solution | |
| 432 | |
| 433 Subscript runes: | |
| 434 | |
| 435 var teacup = '\u{1F375}'; // 🍵 | |
| 436 teacup.runes.toList()[0]; // 127861 | |
|
floitsch
2013/03/09 00:01:41
If you want to access it only once, you can also u
| |
| 437 | |
| 438 The number 127861 represents the code point for teacup, '\u{1F375}' (🍵). | |
| 439 | |
| 440 ### Discussion | |
| 441 | |
| 442 Subscripting a string directly can be problematic. This is because the default | |
| 443 `[]` implementation subscripts along code units. This means that | |
| 444 for non-BMP characters, subscripting yields invalid UTF-16 characters: | |
| 445 | |
| 446 'Dart'[0]; // 'D' | |
| 447 | |
| 448 var hearts = '\u2661'; // ♡ | |
| 449 hearts[0]; // '\u2661' (♡) | |
| 450 | |
| 451 teacup[0]; // '\uD83C', an invalid string: half of a surrogate pair. | |
| 452 teacup.codeUnits.toList()[0]; // 55356, the code unit for that same half. | |
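
A small wrapper can make rune-based subscripting return a string instead of a number. This is a sketch under the assumption that indexing by rune is what you want; `characterAt` is a hypothetical name, not an existing String method.

```dart
// Hypothetical helper: returns the rune at `index` as a one-character string.
// Indexes count runes (code points), not UTF-16 code units.
String characterAt(String s, int index) =>
    String.fromCharCode(s.runes.elementAt(index));

void main() {
  var music = 'I \u2661 \u{1F3BC}'; // 'I ♡ 🎼'
  print(characterAt(music, 4) == '\u{1F3BC}'); // true
  print(music[4] == '\u{1F3BC}'); // false, music[4] is half of a surrogate pair
}
```

Note that `elementAt()` walks the runes from the start of the string on every call, so for repeated access it may be cheaper to call `runes.toList()` once.
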
| 453 | |
| 454 | |
| 455 ## Processing a string one character at a time | |
| 456 | |
| 457 ### Problem | |
| 458 | |
| 459 You want to do something with each individual character in a string. | |
| 460 | |
| 461 ### Solution | |
| 462 | |
| 463 To access an individual character, map the string runes: | |
| 464 | |
| 465 var charList = "Dart".runes.map((rune) => '*${new String.fromCharCode(rune)}*').toList(); | |
| 466 // ['*D*', '*a*', '*r*', '*t*'] | |
| 467 | |
| 468 var runeList = happy.runes.map((rune) => [rune, new String.fromCharCode(rune)]).toList(); | |
| 469 // [[73, 'I'], [32, ' '], [97, 'a'], [109, 'm'], [32, ' '], [9786, '☺']] | |
| 470 | |
| 471 If you are sure that the string is in the Basic Multilingual Plane (BMP), you | |
| 472 can use string.split(''): | |
| 473 | |
| 474 'Dart'.split(''); // ['D', 'a', 'r', 't'] | |
| 475 smileyFace.split('').length; // 1 | |
| 476 | |
| 477 Since `split('')` splits at the UTF-16 code unit boundaries, | |
| 478 invoking it on a non-BMP character yields the string's surrogate pair: | |
| 479 | |
| 480 var clef = '\u{1F3BC}'; // 🎼 , not in BMP. | |
| 481 clef.split('').length; // 2 | |
| 482 | |
| 483 The surrogate pair members are not valid UTF-16 strings. | |
| 484 | |
| 485 | |
| 486 ## Splitting a string into substrings | |
| 487 | |
| 488 ### Problem | |
| 489 | |
| 490 You want to split a string into substrings. | |
| 491 | |
| 492 ### Solution | |
| 493 | |
| 494 Use the `split()` method with a string or a regExp as an argument. | |
| 495 | |
| 496 var smileyFace = '\u263A'; | |
| 497 var happy = 'I am $smileyFace'; | |
| 498 happy.split(' '); // ['I', 'am', '☺'] | |
| 499 | |
| 500 Here is an example of using `split()` with a regExp: | |
| 501 | |
| 502 var nums = '2/7 3 4/5 3~/5'; | |
| 503 var numsRegExp = new RegExp(r'(\s|/|~/)'); | |
| 504 nums.split(numsRegExp); // ['2', '7', '3', '4', '5', '3', '5'] | |
| 505 | |
| 506 In the code above, the string `nums` contains various numbers, some of which | |
| 507 are expressed as fractions or as int-divisions. A regExp is used to split the | |
| 508 string to extract just the numbers. | |
| 509 | |
| 510 You can perform operations on the matched and unmatched portions of a string | |
| 511 by using `splitMapJoin()` with a regExp: | |
| 512 | |
| 513 'Eats SHOOTS leaves'.splitMapJoin((new RegExp(r'SHOOTS')), | |
| 514 onMatch: (m) => '*${m.group(0).toLowerCase()}*', | |
| 515 onNonMatch: (n) => n.toUpperCase()); // 'EATS *shoots* LEAVES' | |
| 516 | |
| 517 The regExp matches the middle word ('SHOOTS'). A pair of callbacks are | |
| 518 registered to transform the matched and unmatched substrings before the | |
| 519 substrings are joined together again. | |
| 520 | |
| 521 | |
| 522 ## Changing string case | |
| 523 | |
| 524 ### Problem | |
| 525 | |
| 526 You want to change the case of strings. | |
| 527 | |
| 528 ### Solution | |
| 529 | |
| 530 Use `string.toUpperCase()` and `string.toLowerCase()` to convert a string to | |
| 531 upper-case or lower-case, respectively: | |
| 532 | |
| 533 var theOneILove = 'I love Lucy'; | |
| 534 theOneILove.toUpperCase(); // 'I LOVE LUCY' | |
| 535 theOneILove.toLowerCase(); // 'i love lucy' | |
| 536 | |
| 537 ### Discussion | |
| 538 | |
| 539 Case changes affect the characters of bi-cameral scripts like Greek and Latin (used to write French): | |
| 540 var zeus = '\u0394\u03af\u03b1\u03c2'; // 'Δίας' (Zeus in modern Greek) | |
| 541 zeus.toUpperCase(); // 'ΔΊΑΣ' | |
| 542 | |
| 543 var resume = '\u0052\u00e9\u0073\u0075\u006d\u00e9'; // 'Résumé' | |
| 544 resume.toLowerCase(); // 'résumé' | |
| 545 | |
| 546 They do not affect the characters of uni-cameral scripts like Devanagari (used for | |
| 547 writing many of the languages of India): | |
| 548 | |
| 549 var chickenKebab = '\u091a\u093f\u0915\u0928 \u0915\u092c\u093e\u092c'; | |
| 550 // 'चिकन कबाब' (in Devanagari) | |
| 551 chickenKebab.toLowerCase(); // 'चिकन कबाब' | |
| 552 chickenKebab.toUpperCase(); // 'चिकन कबाब' | |
| 553 | |
| 554 If a character's case does not change when using `toUpperCase()` and | |
| 555 `toLowerCase()`, it is most likely because the character only has one | |
| 556 form. | |
| 557 | |
| 558 ## Determining whether a string contains another string | |
| 559 | |
| 560 ### Problem | |
| 561 | |
| 562 You want to find out if a string is the substring of another string. | |
| 563 | |
| 564 ### Solution | |
| 565 | |
| 566 Use `string.contains()`: | |
| 567 | |
| 568 var string = 'Dart strings are immutable'; | |
| 569 string.contains('immutable'); // True. | |
| 570 | |
| 571 You can indicate a startIndex as a second argument: | |
| 572 | |
| 573 string.contains('Dart', 2); // False | |
| 574 | |
| 575 ### Discussion | |
| 576 | |
| 577 The String library provides a couple of shortcuts for testing whether a string | |
| 578 is a substring of another: | |
| 579 | |
| 580 string.startsWith('Dart'); // True. | |
| 581 string.endsWith('e'); // True. | |
| 582 | |
| 583 You can also use `string.indexOf()`, which returns -1 if the substring is | |
| 584 not found within a string, and its matching index, if it is: | |
| 585 | |
| 586 string.indexOf('art') != -1; // True, `art` is found in `Dart` | |
| 587 | |
| 588 You can also use a regExp and `hasMatch()`: | |
| 589 | |
| 590 new RegExp(r'ar[et]').hasMatch(string); // True, 'art' and 'are' match. | |
| 591 | |
| 592 | |
| 593 ## Finding matches of a regExp pattern in a string | |
| 594 | |
| 595 ### Problem | |
| 596 | |
| 597 You want to use regExp to match a pattern in a string, and | |
| 598 want to be able to access the matches. | |
| 599 | |
| 600 ### Solution | |
| 601 | |
| 602 Construct a regular expression using the RegExp class and find matches using | |
| 603 the `allMatches()` method: | |
| 604 | |
| 605 var neverEatingThat = 'Not with a fox, not in a box'; | |
| 606 var regExp = new RegExp(r'[fb]ox'); | |
| 607 var matches = regExp.allMatches(neverEatingThat); // An Iterable of matches. | |
| 608 matches.map((match) => match.group(0)).toList(); // ['fox', 'box'] | |
| 609 | |
| 610 ### Discussion | |
| 611 | |
| 612 You can query the object returned by `allMatches()` to find out the number of | |
| 613 matches: | |
| 614 | |
| 615 matches.length; // 2 | |
| 616 | |
| 617 To find the first match, use `firstMatch()`: | |
| 618 | |
| 619 regExp.firstMatch(neverEatingThat).group(0); // 'fox' | |
| 620 | |
| 621 To directly access the matched string, use `stringMatch()`: | |
| 622 | |
| 623 regExp.stringMatch(neverEatingThat); // 'fox' | |
| 624 regExp.stringMatch('I like bagels and lox'); // null | |
| 625 | |
| 626 | |
| 627 ## Substituting strings based on regExp matches | |
| 628 | |
| 629 ### Problem | |
| 630 | |
| 631 You want to match substrings within a string and make substitutions based on | |
| 632 the matches. | |
| 633 | |
| 634 ### Solution | |
| 635 | |
| 636 Construct a regular expression using the RegExp class and make replacements | |
| 637 using `replaceAll()` method: | |
| 638 | |
| 639 'resume'.replaceAll(new RegExp(r'e'), '\u00E9'); // 'résumé' | |
| 640 | |
| 641 If you want to replace just the first match, use `replaceFirst()`: | |
| 642 | |
| 643 '0.0001'.replaceFirst(new RegExp(r'0+'), ''); // '.0001' | |
| 644 | |
| 645 The RegExp matches one or more 0's, and the first match is replaced with an empty string. | |
| 646 | |
| 647 You can use `replaceAllMapped()` and register a function to modify the | |
| 648 matches: | |
| 649 | |
| 650 var heart = '\u2661'; // '♡' | |
| 651 var string = 'I like Ike but I $heart Lucy'; | |
| 652 var regExp = new RegExp(r'[A-Z]\w+'); | |
| 653 string.replaceAllMapped(regExp, (match) => match.group(0).toUpperCase()); | |
| 654 // 'I like IKE but I ♡ LUCY' | |