OLD | NEW |
---|---|
(Empty) | |
1 # Strings | |
2 | |
3 A Dart string represents a sequence of characters encoded in UTF-16. Decoding | |
Alan Knight
2013/03/07 21:15:00
*A* Dart string
shailentuli
2013/03/08 22:38:26
Done.
| |
4 UTF-16 yields Unicode code points. Borrowing terminology from Go, Dart uses | |
5 the term `rune` for an integer representing a Unicode code point. | |
6 | |
7 The string recipes included in this chapter assume that you have some | |
8 familiarity with Unicode and UTF-16. Here is a brief refresher: | |
9 | |
Alan Knight
2013/03/07 21:15:00
I agree with Erik's comments that this is not what
| |
10 ### What is the Basic Multilingual Plane? | |
11 | |
12 The Unicode code space is divided into seventeen planes of 65,536 points each. | |
13 The first plane (code points U+0000 to U+FFFF) contains the most | |
floitsch
2013/03/07 17:22:07
should we stick to "rune" (here and in the rest)?
| |
14 frequently used characters and is called the Basic Multilingual Plane or BMP. | |
15 | |
16 ### What is a Surrogate Pair? | |
17 | |
18 The term 'surrogate pair' refers to a means of encoding Unicode characters | |
19 outside the Basic Multilingual Plane. | |
20 | |
21 In UTF-16, 16-bit (two-byte) code units are used to store Unicode | |
22 characters. Since 16 bits can only represent the 65,536 values in the 0x0 | |
23 to 0xFFFF range, a pair of code units is used to store values in the | |
24 0x10000 to 0x10FFFF range. | |
25 | |
26 For example, the Unicode character for a musical treble clef (🎼), with | |
27 a value of '\u{1F3BC}', is too large to fit in 16 bits. | |
28 | |
29 var clef = '\u{1F3BC}'; // 🎼 | |
30 | |
31 '\u{1F3BC}' is composed of a UTF-16 surrogate pair: [\uD83C, \uDFBC]. | |
32 | |
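The arithmetic behind that surrogate pair can be sketched as follows. This is only an illustration, not part of the String library: `toSurrogatePair` is a hypothetical helper name, the code assumes its argument is a valid code point above U+FFFF, and it is written for current null-safe Dart.

```dart
/// Splits a non-BMP code point into its UTF-16 surrogate pair.
/// Hypothetical helper for illustration; assumes
/// 0x10000 <= codePoint <= 0x10FFFF.
List<int> toSurrogatePair(int codePoint) {
  final offset = codePoint - 0x10000; // a 20-bit value
  final high = 0xD800 + (offset >> 10); // top 10 bits
  final low = 0xDC00 + (offset & 0x3FF); // bottom 10 bits
  return [high, low];
}

void main() {
  final pair = toSurrogatePair(0x1F3BC);
  print(pair.map((unit) => unit.toRadixString(16)).toList()); // [d83c, dfbc]
  print('\u{1F3BC}'.codeUnits); // [55356, 57276]
}
```

The two printed lists describe the same pair: 0xD83C is 55356 and 0xDFBC is 57276.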
33 ### What is the difference between a code point and a code unit? | |
34 | |
35 Within the Basic Multilingual Plane, the code point for a character is | |
36 numerically the same as the code unit for that character. | |
floitsch
2013/03/07 17:22:07
as *the* code unit for that char*a*cter.
shailentuli
2013/03/08 22:38:26
Done.
| |
37 | |
38 'D'.runes.first; // 68 | |
39 'D'.codeUnits.first; // 68 | |
40 | |
41 For non-BMP characters, each code point is represented by two code units. | |
42 | |
43 var clef = '\u{1F3BC}'; // 🎼 | |
44 clef.runes.length; // 1 | |
45 clef.codeUnits.length; // 2 | |
46 | |
47 ### What exactly is a character? | |
48 | |
49 A character is a string contained in the Universal Character Set. Each character | |
50 maps to a single rune value (code point); BMP characters map to 1 code | |
51 unit; non-BMP characters map to 2 code units. | |
floitsch
2013/03/07 17:22:07
map to
shailentuli
2013/03/08 22:38:26
Done.
| |
52 | |
53 You can read more about the Universal Character Set at | |
54 http://en.wikipedia.org/wiki/Universal_Character_Set. | |
55 | |
56 ### Do I have to really deal with Unicode? | |
57 | |
58 Yes, if you want to build robust international applications, you do. | |
59 Besides, the String library makes working with Unicode relatively painless, | |
60 so there's no great overhead in doing things right. | |
Alan Knight
2013/03/07 21:15:00
This seems confusing. Dealing with Unicode is not
shailentuli
2013/03/08 22:38:26
You are quite right. I am removing this section. W
| |
61 | |
62 ## Concatenating Strings | |
63 | |
64 ### Problem | |
65 | |
66 You want to concatenate strings in Dart. You tried using `+`, but | |
67 that resulted in an error. | |
68 | |
69 ### Solution | |
70 | |
71 Use adjacent string literals: | |
72 | |
73 var fact = 'Dart ' 'is' ' fun!'; // 'Dart is fun!' | |
74 | |
75 ### Discussion | |
76 | |
77 Adjacent literals also work over multiple lines: | |
78 | |
79 var fact = 'Dart ' | |
80 'is ' | |
81 'fun!'; // 'Dart is fun!' | |
82 | |
83 They also work when using multiline strings: | |
84 | |
85 var lunch = '''Peanut | |
86 butter''' | |
87 ''' and | |
88 jelly'''; // 'Peanut\nbutter and\njelly' | |
89 | |
90 You can concatenate adjacent single line literals with multiline strings: | |
91 | |
92 var funnyGuys = 'Dewey ' 'Cheatem' | |
93 ''' and | |
94 Howe'''; // 'Dewey Cheatem and\n Howe' | |
95 | |
96 | |
97 #### Alternatives to adjacent string literals | |
98 | |
99 You can also use the `concat()` method on a string to concatenate it to another | |
100 string: | |
101 | |
102 var film = filmToWatch(); | |
103 film = film.concat('\n'); // 'The Big Lebowski\n' | |
104 | |
105 Since `concat()` creates a new string every time it is invoked, a long chain of | |
106 `concat()`s can be expensive. Avoid those. Use a StringBuffer instead (see | |
107 _Incrementally building a string efficiently using a StringBuffer_, below). | |
108 | |
109 You can use `join()` to combine a sequence of strings: | |
110 | |
111 var film = ['The', 'Big', 'Lebowski'].join(' '); // 'The Big Lebowski' | |
112 | |
113 You can also use string interpolation to concatenate strings (see | |
114 _Interpolating expressions inside strings_, below). | |
115 | |
116 | |
117 ## Interpolating expressions inside strings | |
118 | |
119 ### Problem | |
120 | |
121 You want to create strings that contain Dart expressions and identifiers. | |
sethladd
2013/03/07 05:15:16
This sounds like I already know what the solution
| |
122 | |
123 ### Solution | |
124 | |
125 You can put the value of an expression inside a string by using ${expression}. | |
126 | |
127 var favFood = 'sushi'; | |
128 var whatDoILove = 'I love ${favFood.toUpperCase()}'; // 'I love SUSHI' | |
129 | |
130 You can skip the {} if the expression is an identifier: | |
131 | |
132 var whatDoILove = 'I love $favFood'; // 'I love sushi' | |
133 | |
134 ### Discussion | |
135 | |
136 An interpolated string, `string ${expression}`, is equivalent to the | |
137 concatenation of the strings 'string ' and `expression.toString()`. | |
138 Consider this code: | |
139 | |
140 var four = 4; | |
141 var seasons = 'The $four seasons'; // 'The 4 seasons' | |
142 | |
143 It is equivalent to the following: | |
floitsch
2013/03/07 17:22:07
I'm not sure we should call it "equivalent". It sh
| |
144 | |
145 var seasons = 'The '.concat(4.toString()).concat(' seasons'); // 'The 4 seasons' | |
146 | |
147 You should consider implementing a `toString()` method for user-defined | |
148 objects. Here's what happens if you don't: | |
149 | |
150 class Point { | |
151 num x, y; | |
152 Point(this.x, this.y); | |
153 } | |
154 | |
155 var point = new Point(3, 4); | |
156 print('Point: $point'); // "Point: Instance of 'Point'" | |
157 | |
158 Probably not what you wanted. Here is the same example with an explicit | |
159 `toString()`: | |
160 | |
161 class Point { | |
162 ... | |
163 | |
164 String toString() => 'x: $x, y: $y'; | |
165 } | |
166 | |
167 print('Point: $point'); // 'Point: x: 3, y: 4' | |
168 | |
169 | |
170 ## Escaping special characters | |
sethladd
2013/03/07 05:15:16
Is this the right usage of the terminology? This i
shailentuli
2013/03/08 22:38:26
I've used Alan's recommendation below.
| |
171 | |
172 ### Problem | |
173 | |
174 You want to know how to escape special characters. | |
175 | |
Alan Knight
2013/03/07 21:15:00
Better phrased in terms of the problem. e.g. You w
shailentuli
2013/03/08 22:38:26
Done.
| |
176 ### Solution | |
177 | |
178 Prefix special characters with a `\`. | |
179 | |
180 print('Wile\nCoyote'); | |
181 // Wile | |
182 // Coyote | |
183 | |
184 ### Discussion | |
185 | |
186 Dart designates a few characters as special, and these can be escaped: | |
187 | |
188 - \n for newline, equivalent to \x0A. | |
189 - \r for carriage return, equivalent to \x0D. | |
190 - \f for form feed, equivalent to \x0C. | |
191 - \b for backspace, equivalent to \x08. | |
192 - \t for tab, equivalent to \x09. | |
193 - \v for vertical tab, equivalent to \x0B. | |
194 | |
Alan Knight
2013/03/07 21:15:00
Also \$. And I don't see any mention of raw strin
shailentuli
2013/03/08 22:38:26
I'm saving them for a separate recipe which will f
| |
195 If you prefer, you can use `\x` or `\u` notation to indicate the special | |
floitsch
2013/03/07 17:22:07
We also have \u{A}
shailentuli
2013/03/08 22:38:26
I've added an example of this.
| |
196 character: | |
197 | |
198 print('Wile\x0ACoyote'); // same as print('Wile\nCoyote'); | |
199 print('Wile\u000ACoyote'); // same as print('Wile\nCoyote'); | |
200 | |
201 If you escape a non-special character, the `\` is ignored: | |
202 | |
203 print('Wile \E Coyote'); // 'Wile E Coyote' | |
204 | |
205 | |
206 ## Incrementally building a string efficiently using a StringBuffer | |
207 | |
208 ### Problem | |
209 | |
210 You want to collect string fragments and combine them in an efficient manner. | |
211 | |
212 ### Solution | |
213 | |
214 Use a StringBuffer to programmatically generate a string. A StringBuffer | |
215 collects the string fragments, but does not generate a new string until | |
216 `toString()` is called: | |
217 | |
218 var sb = new StringBuffer(); | |
sethladd
2013/03/07 05:15:16
This example doesn't show off the right way to use
| |
219 sb.write('John, '); | |
220 sb.write('Paul, '); | |
221 sb.write('George, '); | |
222 sb.write('and Ringo'); | |
223 var beatles = sb.toString(); // 'John, Paul, George, and Ringo' | |
224 | |
225 ### Discussion | |
226 | |
227 In addition to `write()`, the StringBuffer class provides methods to write a | |
228 list of strings (`writeAll()`), write a numerical character code | |
229 (`writeCharCode()`), write with an added newline (`writeln()`), and more. Here | |
230 is a simple example that shows the use of these methods: | |
231 | |
232 var sb = new StringBuffer(); | |
233 sb.writeln('The Beatles:'); | |
234 sb.writeAll(['John, ', 'Paul, ', 'George, and Ringo']); | |
235 sb.writeCharCode(33); // charCode for '!'. | |
236 var beatles = sb.toString(); // 'The Beatles:\nJohn, Paul, George, and Ringo!' | |
237 | |
238 Since a StringBuffer waits until the call to `toString()` to generate the | |
239 concatenated string, it represents a more efficient way of combining strings | |
240 than `concat()`. See the _Concatenating Strings_ recipe for a description of | |
241 `concat()`. | |
242 | |
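A StringBuffer is most useful when the fragments are produced in a loop. The following is a minimal sketch under that assumption; `formatRoster` is a hypothetical helper name, and the code targets current null-safe Dart, where `new` is optional.

```dart
/// Builds a title line followed by a comma-separated list of [names],
/// collecting every fragment in a single StringBuffer.
/// `formatRoster` is a hypothetical name used only for this sketch.
String formatRoster(String title, List<String> names) {
  final sb = StringBuffer();
  sb.writeln('$title:'); // title plus a trailing newline
  for (var i = 0; i < names.length; i++) {
    // Separator goes before every name except the first.
    if (i > 0) sb.write(i == names.length - 1 ? ', and ' : ', ');
    sb.write(names[i]);
  }
  sb.writeCharCode(33); // char code for '!'
  return sb.toString(); // the only point where a full string is created
}

void main() {
  print(formatRoster('The Beatles', ['John', 'Paul', 'George', 'Ringo']));
  // The Beatles:
  // John, Paul, George, and Ringo!
}
```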
243 ## Converting between string characters and numerical codes | |
244 | |
245 ### Problem | |
246 | |
247 You want to convert string characters into numerical codes and back. | |
sethladd
2013/03/07 05:15:16
Is there a more real-life problem here? Why would
Alan Knight
2013/03/07 21:15:00
I need to compare character in a string to numeric
| |
248 | |
249 ### Solution | |
250 | |
251 Use the `runes` getter to access a string's code points: | |
252 | |
253 'Dart'.runes.toList(); // [68, 97, 114, 116] | |
254 | |
255 var smileyFace = '\u263A'; // ☺ | |
256 smileyFace.runes.toList(); // [9786] | |
257 | |
258 The number 9786 represents the code point '\u263A'. | |
259 | |
260 Use the `codeUnits` getter to get a string's UTF-16 code units: | |
261 | |
262 'Dart'.codeUnits.toList(); // [68, 97, 114, 116] | |
263 smileyFace.codeUnits.toList(); // [9786] | |
264 | |
265 ### Discussion | |
266 | |
267 Notice that using `runes` and `codeUnits` produces identical results | |
floitsch
2013/03/07 17:22:07
no () for codeUnits
shailentuli
2013/03/08 22:38:26
Done.
| |
268 in the examples above. That happens because each character in 'Dart' and in | |
269 `smileyFace` fits within 16 bits, resulting in a code unit corresponding | |
270 neatly with a code point. | |
271 | |
272 Consider an example where a character cannot be represented within 16 bits, | |
273 the Unicode character for a Treble clef ('\u{1F3BC}'). This character consists | |
274 of a surrogate pair: '\uD83C', '\uDFBC'. Getting the numerical value of this | |
275 character using `codeUnits` and `runes` produces the following result: | |
floitsch
2013/03/07 17:22:07
no "()".
shailentuli
2013/03/08 22:38:26
Done.
| |
276 | |
277 var clef = '\u{1F3BC}'; // 🎼 | |
278 clef.codeUnits.toList(); // [55356, 57276] | |
279 clef.runes.toList(); // [127932] | |
280 | |
281 The numbers 55356 and 57276 represent `clef`'s surrogate pair, '\uD83C' and | |
282 '\uDFBC', respectively. The number 127932 represents the code point '\u{1F3BC}'. | |
283 | |
284 #### Using codeUnitAt() to access individual code units | |
285 | |
286 To access the 16-bit UTF-16 code unit at a particular index, use | |
287 `codeUnitAt()`: | |
288 | |
289 'Dart'.codeUnitAt(0); // 68 | |
290 smileyFace.codeUnitAt(0); // 9786 | |
291 | |
292 Using `codeUnitAt()` with the multi-byte `clef` character leads to problems: | |
293 | |
294 clef.codeUnitAt(0); // 55356 | |
295 clef.codeUnitAt(1); // 57276 | |
296 | |
297 In either call to `clef.codeUnitAt()`, the values returned represent strings | |
298 that are only one half of a UTF-16 surrogate pair. These are not valid UTF-16 | |
299 strings. | |
Alan Knight
2013/03/07 21:15:00
It's worth pointing out that this is always the ca
| |
300 | |
301 | |
302 #### Converting numerical codes to strings | |
303 | |
304 You can generate a new string from runes or code units using the factory | |
305 `String.fromCharCodes(charCodes)`: | |
Alan Knight
2013/03/07 21:15:00
I think it's important to point out that you can g
| |
306 | |
307 new String.fromCharCodes([68, 97, 114, 116]); // 'Dart' | |
308 | |
309 new String.fromCharCodes([73, 32, 9825, 32, 76, 117, 99, 121]); | |
310 // 'I ♡ Lucy' | |
311 | |
312 new String.fromCharCodes([55356, 57276]); // 🎼 | |
313 new String.fromCharCodes([127932]); // 🎼 | |
314 | |
315 You can use the `String.fromCharCode()` factory to convert a single rune or | |
316 code unit to a string: | |
317 | |
318 new String.fromCharCode(68); // 'D' | |
319 new String.fromCharCode(9786); // ☺ | |
320 new String.fromCharCode(127932); // 🎼 | |
321 | |
322 Creating a string with only one half of a surrogate pair is permitted, but not | |
323 recommended. | |
324 | |
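As a rough sketch of how these conversions fit together, the code below round-trips a string through its runes and checks whether a code unit is half of a surrogate pair. The helper names `isSurrogate` and `roundTrip` are hypothetical, and the code is written for current null-safe Dart.

```dart
/// Returns true if [codeUnit] is one half of a UTF-16 surrogate pair.
/// Hypothetical helper; the range 0xD800..0xDFFF is reserved for surrogates.
bool isSurrogate(int codeUnit) => codeUnit >= 0xD800 && codeUnit <= 0xDFFF;

/// Converts [text] to runes and back to a string.
String roundTrip(String text) => String.fromCharCodes(text.runes);

void main() {
  const clef = '\u{1F3BC}';
  print(roundTrip('I \u2661 $clef') == 'I \u2661 $clef'); // true
  print(clef.codeUnits.every(isSurrogate)); // true
  print('Dart'.codeUnits.any(isSurrogate)); // false
}
```

A check like `isSurrogate` is one way to notice, before calling `String.fromCharCode()`, that a code unit would produce only half of a pair.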
325 ## Determining if a string is empty | |
326 | |
327 ### Problem | |
328 | |
329 You want to know if a string is empty. You tried ` if(string) {...}`, but that | |
sethladd
2013/03/07 05:15:16
This is a great problem, well worded!
| |
330 did not work. | |
331 | |
332 ### Solution | |
333 | |
334 Use `string.isEmpty`: | |
floitsch
2013/03/07 17:22:07
or just string == "". Both are fine.
| |
335 | |
336 var emptyString = ''; | |
337 emptyString.isEmpty; // true | |
338 | |
339 A string with a space is not empty: | |
340 | |
341 var space = ' '; | |
342 space.isEmpty; // false | |
343 | |
344 ### Discussion | |
345 | |
346 Don't use `if (string)` to test the emptiness of a string. In Dart, all | |
347 objects except the boolean true evaluate to false. `if(string)` will always | |
348 be false. | |
Alan Knight
2013/03/07 21:15:00
And you will see a warning in the editor if you us
| |
349 | |
350 Don't try to explicitly test for the emptiness of a string: | |
351 | |
352 if (emptyString == anotherString) {...} | |
353 | |
354 This may work sometimes, but if `string` has an empty value that is | |
355 not a literal `''`, the comparisons will fail: | |
356 | |
357 emptyString == '\u0020'; // false | |
358 emptyString == '\u2004'; // false | |
Alan Knight
2013/03/07 21:15:00
Are you saying that the string '\u0020' is suppose
shailentuli
2013/03/08 22:38:26
This was erroneously added here. Removed.
| |
359 | |
360 | |
361 ## Removing leading and trailing whitespace | |
362 | |
363 ### Problem | |
364 | |
365 You want to remove leading and trailing whitespace from a string. | |
366 | |
367 ### Solution | |
368 | |
369 Use `string.trim()`: | |
370 | |
371 var space = '\n\r\f\t\v'; // We'll use a variety of space characters. | |
372 var string = '$space X $space'; | |
373 var newString = string.trim(); // 'X' | |
374 | |
375 The String class has no methods to remove only leading or only trailing | |
376 whitespace. But you can always use regExps for that. | |
Alan Knight
2013/03/07 21:15:00
This seems unclear. I think you mean "to remove *j
shailentuli
2013/03/08 22:38:26
Done.
| |
377 | |
378 Remove only leading whitespace: | |
379 | |
380 var newString = string.replaceFirst(new RegExp(r'^\s+'), ''); // 'X $space' | |
381 | |
382 Remove only trailing whitespace: | |
383 | |
384 var newString = string.replaceFirst(new RegExp(r'\s+$'), ''); // '$space X' | |
385 | |
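The two regExp replacements above can be wrapped in small helpers. This is a sketch only: `trimLeading` and `trimTrailing` are hypothetical names, and the code is written for current null-safe Dart.

```dart
// Hypothetical helpers wrapping the two regExps shown above.

/// Removes whitespace from the start of [s] only.
String trimLeading(String s) => s.replaceFirst(RegExp(r'^\s+'), '');

/// Removes whitespace from the end of [s] only.
String trimTrailing(String s) => s.replaceFirst(RegExp(r'\s+$'), '');

void main() {
  const padded = '\n\t X \r\n';
  print(trimLeading(padded) == 'X \r\n'); // true
  print(trimTrailing(padded) == '\n\t X'); // true
  print(trimTrailing(trimLeading(padded)) == padded.trim()); // true
}
```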
Alan Knight
2013/03/07 21:15:00
Or you could do this with the runes or codePoints,
| |
386 | |
387 ## Calculating the length of a string | |
388 | |
389 ### Problem | |
390 | |
391 You want to get the length of a string, but are not sure how to | |
392 correctly calculate the length when working with Unicode. | |
393 | |
394 ### Solution | |
395 | |
396 Use string.length to get the number of UTF-16 code units in a string: | |
397 | |
398 'I love music'.length; // 12 | |
399 'I love music'.runes.length; // 12 | |
400 | |
401 ### Discussion | |
402 | |
403 For characters that fit into 16 bits, the code unit length is the same as the | |
floitsch
2013/03/07 17:22:07
bits
shailentuli
2013/03/08 22:38:26
Done.
| |
404 rune length: | |
405 | |
406 var hearts = '\u2661'; // ♡ | |
407 hearts.length; // 1 | |
408 hearts.runes.length; // 1 | |
409 | |
410 If the string contains any characters outside the Basic Multilingual | |
411 Plane (BMP), the rune length will be less than the code unit length: | |
412 | |
413 var clef = '\u{1F3BC}'; // 🎼 | |
414 clef.length; // 2 | |
415 clef.runes.length; // 1 | |
416 | |
417 var music = 'I $hearts $clef'; // 'I ♡ 🎼' | |
418 music.length; // 6 | |
419 music.runes.length; // 5 | |
420 | |
421 Use `length` if you want the number of code units; use `runes.length` if you | |
422 want the number of runes (code points). | |
floitsch
2013/03/07 17:22:07
But what is a "character" ?
For example "é" is not
Alan Knight
2013/03/07 21:15:00
Yes, we probably need yet another section on these
shailentuli
2013/03/08 22:38:26
Changing to runes.
| |
423 | |
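A rune count is still not always the number of characters a reader perceives. The sketch below illustrates this with an accented letter written two ways; `lengths` is a hypothetical helper name, and the code is written for current null-safe Dart.

```dart
/// Returns [code unit count, rune count] for [s].
/// Hypothetical helper used only for this illustration.
List<int> lengths(String s) => [s.length, s.runes.length];

void main() {
  const precomposed = '\u00E9'; // 'é' as a single code point
  const decomposed = 'e\u0301'; // 'e' followed by a combining acute accent

  print(lengths(precomposed)); // [1, 1]
  print(lengths(decomposed)); // [2, 2]
  print(precomposed == decomposed); // false
}
```

Both strings usually render as the same letter, yet they differ in code unit length, in rune length, and under `==`.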
424 | |
425 ## Subscripting a string | |
426 | |
427 ### Problem | |
428 | |
429 You want to be able to access a character in a string at a particular index. | |
430 | |
431 ### Solution | |
432 | |
433 Subscript runes: | |
434 | |
435 var coffee = '\u{1F375}'; // 🍵 | |
436 coffee.runes.toList()[0]; // 127861 | |
437 | |
438 The number 127861 represents the code point for coffee, '\u{1F375}' (🍵 ). | |
439 | |
Alan Knight
2013/03/07 21:15:00
Technically I think that's "Teacup without handle"
shailentuli
2013/03/08 22:38:26
You are quite correct. Renaming to teacup.
| |
440 ### Discussion | |
441 | |
442 Subscripting a string directly can be problematic. This is because the default | |
443 `[]` implementation subscripts along code units. This means that | |
444 for non-BMP characters, subscripting yields invalid UTF-16 characters: | |
445 | |
446 'Dart'[0]; // 'D' | |
447 | |
448 var hearts = '\u2661'; // ♡ | |
449 hearts[0]; // '\u2661' (♡) | |
450 | |
451 coffee[0]; // Invalid string: only the code unit 55356, half of a surrogate pair. | |
452 coffee.codeUnits.toList()[0]; // 55356, the same code unit as an int. | |
453 | |
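One way to subscript by rune and still get a string back is sketched below. `runeAt` is a hypothetical helper name, not a String method, and the code is written for current null-safe Dart.

```dart
/// Returns the rune at [index] (counted in runes, not code units) as a string.
/// Hypothetical helper; it walks the string from the start, so each call
/// takes time proportional to the index.
String runeAt(String s, int index) =>
    String.fromCharCode(s.runes.elementAt(index));

void main() {
  const tea = 'a\u{1F375}b';
  print(runeAt(tea, 1) == '\u{1F375}'); // true
  print(runeAt(tea, 2)); // b
  print(tea[2] == 'b'); // false: code unit 2 is the low surrogate
}
```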
Alan Knight
2013/03/07 21:15:00
I think the recommended answer is that you just us
| |
454 | |
455 ## Processing a string one character at a time | |
456 | |
457 ### Problem | |
458 | |
459 You want to do something with each individual character in a string. | |
460 | |
461 ### Solution | |
462 | |
463 To access an individual character, map the string runes: | |
464 | |
465 var charList = "Dart".runes.map((rune) => '*${new String.fromCharCode(rune)}*').toList(); | |
floitsch
2013/03/07 17:22:07
I'm questioning the utility of these strings. "é"
| |
466 // ['*D*', '*a*', '*r*', '*t*'] | |
467 | |
468 var runeList = happy.runes.map((rune) => [rune, new String.fromCharCode(rune)]).toList(); | |
469 // [[73, 'I'], [32, ' '], [97, 'a'], [109, 'm'], [32, ' '], [9786, '☺']] | |
470 | |
471 If you are sure that the string is in the Basic Multilingual Plane (BMP), you | |
floitsch
2013/03/07 17:22:07
But then you can also just index into the string.
| |
472 can use string.split(''): | |
473 | |
474 'Dart'.split(''); // ['D', 'a', 'r', 't'] | |
475 smileyFace.split('').length; // 1 | |
476 | |
477 Since `split('')` splits at the UTF-16 code unit boundaries, | |
478 invoking it on a non-BMP character yields the string's surrogate pair: | |
479 | |
480 var clef = '\u{1F3BC}'; // 🎼 , not in BMP. | |
481 clef.split('').length; // 2 | |
482 | |
483 The surrogate pair members are not valid UTF-16 strings. | |
484 | |
485 | |
486 ## Splitting a string into substrings | |
487 | |
488 ### Problem | |
489 | |
490 You want to split a string into substrings. | |
sethladd
2013/03/07 05:15:16
can you add "based on some pattern". An example he
| |
491 | |
492 ### Solution | |
493 | |
494 Use the `split()` method with a string or a regExp as an argument. | |
495 | |
496 var smileyFace = '\u263A'; | |
497 var happy = 'I am $smileyFace'; | |
498 happy.split(' '); // ['I', 'am', '☺'] | |
499 | |
500 Here is an example of using `split()` with a regExp: | |
501 | |
502 var nums = '2/7 3 4/5 3~/5'; | |
503 var numsRegExp = new RegExp(r'(\s|/|~/)'); | |
504 nums.split(numsRegExp); // ['2', '7', '3', '4', '5', '3', '5'] | |
505 | |
506 In the code above, the string `nums` contains various numbers, some of which | |
507 are expressed as fractions or as int-divisions. A regExp is used to split the | |
508 string to extract just the numbers. | |
509 | |
510 You can perform operations on the matched and unmatched portions of a string | |
511 by using `splitMapJoin()` with a regExp: | |
512 | |
513 'Eats SHOOTS leaves'.splitMapJoin((new RegExp(r'SHOOTS')), | |
514 onMatch: (m) => '*${m.group(0).toLowerCase()}*', | |
515 onNonMatch: (n) => n.toUpperCase()); // 'EATS *shoots* LEAVES' | |
516 | |
517 The regExp matches the middle word ('SHOOTS'). A pair of callbacks are | |
518 registered to transform the matched and unmatched substrings before the | |
519 substrings are joined together again. | |
520 | |
521 | |
522 ## Changing string case | |
523 | |
524 ### Problem | |
525 | |
526 You want to change the case of strings. | |
527 | |
528 ### Solution | |
529 | |
530 Use `string.toUpperCase()` and `string.toLowerCase()` to convert a string to | |
floitsch
2013/03/07 17:22:07
convert
shailentuli
2013/03/08 22:38:26
Done.
| |
531 upper-case or lower-case, respectively: | |
floitsch
2013/03/07 17:22:07
No. this is not a good solution.
This only works i
shailentuli
2013/03/08 22:38:26
Agreed. The goal is to show people how to correctl
floitsch
2013/03/09 00:01:41
Well, that it doesn't work except in some language
| |
532 | |
533 var theOneILove = 'I love Lucy'; | |
534 theOneILove.toUpperCase(); // 'I LOVE LUCY' | |
535 theOneILove.toLowerCase(); // 'i love lucy' | |
536 | |
537 ### Discussion | |
538 | |
539 Case changes affect the characters of bi-cameral scripts like Greek and Latin (the script used for French): | |
540 var zeus = '\u0394\u03af\u03b1\u03c2'; // 'Δίας' (Zeus in modern Greek) | |
541 zeus.toUpperCase(); // 'ΔΊΑΣ' | |
542 | |
543 var resume = '\u0052\u00e9\u0073\u0075\u006d\u00e9'; // 'Résumé' | |
544 resume.toLowerCase(); // 'résumé' | |
545 | |
546 They do not affect the characters of uni-cameral scripts like Devanagari (used for | |
547 writing many of the languages of India): | |
548 | |
549 var chickenKebab = '\u091a\u093f\u0915\u0928 \u0915\u092c\u093e\u092c'; | |
550 // 'चिकन कबाब' (in Devanagari) | |
551 chickenKebab.toLowerCase(); // 'चिकन कबाब' | |
552 chickenKebab.toUpperCase(); // 'चिकन कबाब' | |
553 | |
554 If a character's case does not change when using `toUpperCase()` and | |
555 `toLowerCase()`, it is most likely because the character only has one | |
556 form. | |
557 | |
558 ## Determining whether a string contains another string | |
Alan Knight
2013/03/07 21:15:00
These should be much earlier, as they seem like ve
| |
559 | |
560 ### Problem | |
561 | |
562 You want to find out if a string is a substring of another string. | |
563 | |
564 ### Solution | |
565 | |
566 Use `string.contains()`: | |
567 | |
568 var string = 'Dart strings are immutable'; | |
569 string.contains('immutable'); // True. | |
570 | |
571 You can indicate a startIndex as a second argument: | |
572 | |
573 string.contains('Dart', 2); // False | |
574 | |
575 ### Discussion | |
576 | |
577 The String library provides a couple of shortcuts for testing whether a string | |
578 is a substring of another: | |
579 | |
580 string.startsWith('Dart'); // True. | |
581 string.endsWith('e'); // True. | |
582 | |
583 You can also use `string.indexOf()`, which returns -1 if the substring is | |
584 not found within a string, and its matching index, if it is: | |
585 | |
586 string.indexOf('art') != -1; // True, `art` is found in `Dart` | |
587 | |
588 You can also use a regExp and `hasMatch()`: | |
589 | |
590 new RegExp(r'ar[et]').hasMatch(string); // True, 'art' and 'are' match. | |
591 | |
592 | |
593 ## Finding matches of a regExp pattern in a string | |
594 | |
595 ### Problem | |
596 | |
597 You want to use regExp to match a pattern in a string, and | |
598 want to be able to access the matches. | |
599 | |
600 ### Solution | |
601 | |
602 Construct a regular expression using the RegExp class and find matches using | |
603 the `allMatches()` method: | |
604 | |
605 var neverEatingThat = 'Not with a fox, not in a box'; | |
606 var regExp = new RegExp(r'[fb]ox'); | |
607 var matches = regExp.allMatches(neverEatingThat); // An Iterable of matches, not a List. | |
608 matches.map((match) => match.group(0)).toList(); // ['fox', 'box'] | |
609 | |
610 ### Discussion | |
611 | |
612 You can query the object returned by `allMatches()` to find out the number of | |
613 matches: | |
614 | |
615 matches.length; // 2 | |
616 | |
617 To find the first match, use `firstMatch()`: | |
618 | |
619 regExp.firstMatch(neverEatingThat).group(0); // 'fox' | |
620 | |
621 To directly access the matched string, use `stringMatch()`: | |
622 | |
623 regExp.stringMatch(neverEatingThat); // 'fox' | |
624 regExp.stringMatch('I like bagels and lox'); // null | |
625 | |
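When the pattern contains capture groups, each match also exposes the groups by number through `group()`. The sketch below assumes a made-up `key=value` input format; `parsePairs` is a hypothetical helper name, and the code is written for current null-safe Dart, which is why `group()` results are followed by `!`.

```dart
/// Extracts `key=value` pairs from [input] into a map.
/// Hypothetical helper; the pattern and the input format are assumptions
/// made for this illustration.
Map<String, String> parsePairs(String input) {
  final pair = RegExp(r'(\w+)=(\w+)');
  final result = <String, String>{};
  for (final match in pair.allMatches(input)) {
    // group(0) is the whole match; group(1) and group(2) are the
    // parenthesized parts of the pattern.
    result[match.group(1)!] = match.group(2)!;
  }
  return result;
}

void main() {
  print(parsePairs('animal=fox, container=box'));
  // {animal: fox, container: box}
}
```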
626 | |
627 ## Substituting strings based on regExp matches | |
628 | |
629 ### Problem | |
630 | |
631 You want to match substrings within a string and make substitutions based on | |
632 the matches. | |
633 | |
634 ### Solution | |
635 | |
636 Construct a regular expression using the RegExp class and make replacements | |
637 using `replaceAll()` method: | |
638 | |
639 'resume'.replaceAll(new RegExp(r'e'), '\u00E9'); // 'résumé' | |
640 | |
641 If you want to replace just the first match, use `replaceFirst()`: | |
642 | |
643 '0.0001'.replaceFirst(new RegExp(r'0+'), ''); // '.0001' | |
644 | |
645 The regExp matches one or more 0's, and the first such run is replaced with an empty string. | |
646 | |
647 You can use `replaceAllMapped()` and register a function to modify the | |
648 matches: | |
649 | |
650 var heart = '\u2661'; // '♡' | |
651 var string = 'I like Ike but I $heart Lucy'; | |
652 var regExp = new RegExp(r'[A-Z]\w+'); | |
653 string.replaceAllMapped(regExp, (match) => match.group(0).toUpperCase()); | |
654 // 'I like IKE but I ♡ LUCY' | |
Alan Knight
2013/03/07 21:15:00
I think it would be nice to see some discussion of
| |
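As a further sketch of the callback form of replacement, the code below upper-cases the first letter of each word. `capitalizeWords` is a hypothetical helper name, the pattern only recognizes ASCII lowercase letters, and the code is written for current null-safe Dart.

```dart
/// Upper-cases the first letter of every word in [text].
/// Hypothetical helper; `\b[a-z]` only recognizes ASCII lowercase letters.
String capitalizeWords(String text) => text.replaceAllMapped(
    RegExp(r'\b[a-z]'), (match) => match.group(0)!.toUpperCase());

void main() {
  print(capitalizeWords('i like ike')); // I Like Ike
}
```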