Chromium Code Reviews

Side by Side Diff: recipes/src/core/strings.md

Issue 12335109: Strings recipes for the Dart Cookbook (Closed) Base URL: https://github.com/dart-lang/cookbook.git@master
Patch Set: Fixed typos. Created 7 years, 9 months ago
1 ## Concatenating Strings
2
3 ### Problem
4
5 You want to know how to concatenate strings in Dart. You tried using `+`, but
sethladd 2013/02/26 23:41:45 A more concise way: "You want to combine two or m
6 that resulted in an error.
7
8 ### Solution
9
10 Use adjacent string literals:
11
12 'Dart ' 'is' ' fun!'; // 'Dart is fun!'
sethladd 2013/02/26 23:41:45 assign to variable
13
14 ### Discussion
15
16 Adjacent literals also work over multiple lines:
17
18 'Dart '
19 'is '
20 'fun!'; // 'Dart is fun!'
21
22 They also work when using multiline strings:
23
24 '''Peanut
25 butter'''
26 ''' and
27 jelly'''; // 'Peanut\nbutter and\njelly'
28
29 You can also concatenate adjacent single line literals with multiline strings:
30
31 'Peanut ' 'butter'
32 ''' and
33 jelly'''; // 'Peanut butter and\n jelly'
34
35 #### Alternatives to adjacent string literals
36
37 Use `concat()`:
sethladd 2013/02/26 23:41:45 I'm not sure we want to show things that we genera
38
39 'Dewey'.concat(' Cheatem').concat(' and').concat(' Howe'); // 'Dewey Cheatem and Howe'
sethladd 2013/02/26 23:41:45 nice example :)
40
41 Since `concat()` creates a new string every time it is invoked, a long chain of
42 `concat()`s can be expensive; if you need to incrementally build up a long
43 string, use a StringBuffer instead (see below).
44
45 Use `join()` to combine a sequence of strings:
46
47 ['Dewey', 'Cheatem', 'and', 'Howe'].join(' '); // 'Dewey Cheatem and Howe'
48
49 You can also use string interpolation (see below).
50
51
52 ## Interpolating expressions inside strings
53
54 ### Problem
55
56 You want to create strings that contain Dart expressions and identifiers.
57
58 ### Solution
59
60 You can put the value of an expression inside a string by using ${expression}.
sethladd 2013/02/26 23:41:45 wrap in backticks?
sethladd 2013/02/26 23:41:45 Shouldn't it be more declarative? s/You can put/P
61
62 var favFood = 'sushi';
63 'I love ${favFood.toUpperCase()}'; // 'I love SUSHI'
64
65 You can skip the {} if the expression is an identifier:
66
67 'I love $favFood'; // 'I love sushi'
68
69 ### Discussion
70
71 An interpolated string, `'string ${expression}'`, is equivalent to the
72 concatenation of the strings `'string '` and `expression.toString()`.
73 Consider this code:
74
75 var four = 4;
76 'The $four seasons'; // 'The 4 seasons'
77
78 It is equivalent to the following:
79
80 'The '.concat(4.toString()).concat(' seasons'); // 'The 4 seasons'
81
82 You should consider implementing a `toString()` method for user-defined
sethladd 2013/02/26 23:41:45 This is a separate recipe. Consider breaking it ou
83 objects. Here's what happens if you don't:
84
85 class Point {
86 num x, y;
87 Point(this.x, this.y);
88 }
89
90 var point = new Point(3, 4);
91 'Point: $point'; // "Point: Instance of 'Point'"
sethladd 2013/02/26 23:41:45 assign to variable. Creating objects without varia
92
93 Probably not what you wanted. Here is the same example with an explicit
94 `toString()`:
95
96 class Point {
97 ...
98
99 String toString() => "x: $x, y: $y";
100 }
101
102 'Point: $point'; // 'Point: x: 3, y: 4'
103
104 Interpolations are not evaluated within raw strings:
105
106 r'$favFood'; // '$favFood'
sethladd 2013/02/26 23:41:45 This came out of left field. Can you add a recipe
107
108 ## Incrementally building a string efficiently using a StringBuffer
109
110 ### Problem
111
112 You want to collect string fragments and combine them in an efficient manner.
113
114 ### Solution
115
116 Use a StringBuffer to programmatically generate a string. A StringBuffer
117 collects the string fragments, but does not generate a new string until
118 `toString()` is called:
119
120 var sb = new StringBuffer();
121 sb.write("John, ");
122 sb.write("Paul, ");
123 sb.write("George, ");
124 sb.write("and Ringo");
125 sb.toString(); // "John, Paul, George, and Ringo"
126
sethladd 2013/02/26 23:41:45 assign to variable
127 ### Discussion
128
129 In addition to `write()`, the StringBuffer class provides methods to write a
130 list of strings (`writeAll()`), write a numerical character code
131 (`writeCharCode()`), write with an added newline (`writeln()`), and more. Here
sethladd 2013/02/26 23:41:45 Is this now writeCharUnit?
132 is a simple example that shows the use of these methods:
133
134 var sb = new StringBuffer();
135 sb.writeln("The Beatles:");
136 sb.writeAll(['John, ', 'Paul, ', 'George, and Ringo']);
137 sb.writeCharCode(33); // charCode for '!'.
138 sb.toString(); // 'The Beatles:\nJohn, Paul, George, and Ringo!'
139
140 Since a StringBuffer waits until the call to `toString()` to generate the
141 concatenated string, it represents a more efficient way of combining strings
142 than `concat()`. See the "Concatenating Strings" recipe for a description of
143 `concat()`.
144
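As a hedged sketch of the same idea in current Dart syntax (the helper name `joinNames` is hypothetical and not part of the patch), a StringBuffer can collect fragments in a loop and produce the final string once:

```dart
// Hypothetical helper for illustration: builds a comma-separated list
// with a StringBuffer instead of repeated concatenation.
String joinNames(List<String> names) {
  final sb = StringBuffer();
  for (var i = 0; i < names.length; i++) {
    if (i > 0) {
      // Put "and" before the last name, a plain comma before the others.
      sb.write(i == names.length - 1 ? ', and ' : ', ');
    }
    sb.write(names[i]);
  }
  // The fragments are combined into one string only here.
  return sb.toString();
}

void main() {
  print(joinNames(['John', 'Paul', 'George', 'Ringo']));
  // John, Paul, George, and Ringo
}
```

The loop writes each fragment to the buffer, so no intermediate strings are created along the way.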
145 ## Converting between string characters and numbers
sethladd 2013/02/26 23:41:45 Can you change the title? I saw "numbers" and I th
146
147 ### Problem
148
149 You want to convert string characters into numerical code units and back.
150
151 ### Solution
152
153 Use the `codeUnits` getter to access the sequence of Unicode UTF-16 code units
154 that make up a string:
155
156 'Dart'.codeUnits.toList(); // [68, 97, 114, 116]
157
158 var smileyFace = '\u263A'; // ☺
159 smileyFace.codeUnits.toList(); // [9786]
160
161 The number 9786 represents the code unit '\u263A'.
162
163 Use the `runes` getter to access a string's code points:
164
165 'Dart'.runes.toList(); // [68, 97, 114, 116]
166 smileyFace.runes.toList(); // [9786]
167
168 ### Discussion
169
170 Notice that using `runes` and `codeUnits` produces identical results
171 in the examples above. That is because each character in both 'Dart' and
172 `smileyFace` fits within 16 bits, resulting in a code unit corresponding
173 neatly with a code point.
174
175 Consider an example where a character cannot be represented within 16-bits,
176 the Unicode character for a Treble clef ('\u{1F3BC}'). This character consists
177 of a surrogate pair: '\uD83C', '\uDFBC'. Getting the numerical value of this
178 character using `codeUnits` produces the following result:
179
180 var clef = '\u{1F3BC}'; // 🎼
181 clef.codeUnits.toList(); // [55356, 57276]
182
183 The numbers 55356 and 57276 represent `clef`'s surrogate pair, '\uD83C' and
184 '\uDFBC', respectively.
185
186 #### Using the runes getter
187
188 You can also use `runes` to convert a string to its corresponding numerical values:
189
190 clef.runes.toList(); // [127932]
191
192 The number 127932 represents the code point '\u{1F3BC}'.
193
194 #### Using codeUnitAt() to access individual characters
195
196 To access the 16-Bit UTF-16 code unit at a particular index, use
197 `codeUnitAt()`:
198
199 'Dart'.codeUnitAt(0); // 68
200 smileyFace.codeUnitAt(0); // 9786
201
202 The number 9786 represents the code unit '\u263A', the `smileyFace`
203 character.
204
205 Using `codeUnitAt()` with the multi-byte `clef` character leads to problems:
206
207 clef.codeUnitAt(0); // 55356
208 clef.codeUnitAt(1); // 57276
209
210 In either call to `clef.codeUnitAt()`, the values returned represent strings
211 that are only one half of a UTF-16 surrogate pair. These are not valid UTF-16
212 strings.
213
214 #### Converting numerical values to strings
215
216 You can generate a new string from code units using the factory
217 `String.fromCharCodes(charCodes)`:
218
219 new String.fromCharCodes([68, 97, 114, 116]); // 'Dart'
220
221 var heart = '\u2661'; // ♡
222 new String.fromCharCodes([73, 32, 9825, 32, 76, 117, 99, 121]);
223 // 'I ♡ Lucy'
224
225 The charCodes can be UTF-16 code units or runes.
226
227 The Unicode character for a Treble clef is '\u{1F3BC}', with a rune value of
228 127932. Passing either code units, or a code point to `String.fromCharCodes()`
229 produces the `clef` string:
230
231 new String.fromCharCodes([55356, 57276]); // 🎼
232 new String.fromCharCodes([127932]); // 🎼
233
234 You can use the `String.fromCharCode()` factory to convert a single code unit
235 or code point to a string:
236
237 new String.fromCharCode(127932); // 🎼
238
239 Creating a string with only one half of a surrogate pair is permitted, but not
240 recommended.
241
242 ## Determining if a string is empty
243
244 ### Problem
245
246 You want to know if a string is empty. You tried ` if(string) {...}`, but that
247 did not work.
248
249 ### Solution
250
251 Use `string.isEmpty`:
252
253 var emptyString = '';
254 emptyString.isEmpty; // true
255
256 A string with a space is not empty:
257
258 var space = " ";
259 space.isEmpty; // false
260
261 ### Discussion
262
263 Don't use `if (string)` to test the emptiness of a string. In Dart, all
264 objects except the boolean true evaluate to false. `if(string)` will always
265 be false.
266
267 Don't try to explicitly test for the emptiness of a string:
268
269 if (emptyString == anotherString) {...}
270
271 This may work sometimes, but if `string` has an empty value that is
272 not a literal `''`, the comparisons will fail:
273
274 emptyString == '\u0020'; // false
275 emptyString == '\u2004'; // false
276
277 ## Removing leading and trailing whitespace
278
279 ### Problem
280
281 You want to remove leading and trailing whitespace from a string.
282
283 ### Solution
284
285 Use `string.trim()`:
286
287 var space = '\n\r\f\t\v';
288 var string = '$space X $space';
289 string.trim(); // 'X'
290
291 The String class has no methods to remove only leading or only trailing
292 whitespace, but you can always use regExps.
293
294 Remove only leading whitespace:
295
296 string.replaceFirst(new RegExp(r'^\s+'), ''); // 'X $space'
297
298 Remove only trailing whitespace:
299
300 string.replaceFirst(new RegExp(r'\s+$'), ''); // '$space X'
301
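Later Dart SDKs added `trimLeft()` and `trimRight()` to the String class, so the regExp workaround is only needed on SDKs that lack them. A minimal sketch, assuming a current SDK (the variable name `padded` is illustrative):

```dart
void main() {
  const padded = '\n\t X \t\n';

  // trimLeft() strips only the leading whitespace.
  print(padded.trimLeft() == 'X \t\n'); // true

  // trimRight() strips only the trailing whitespace.
  print(padded.trimRight() == '\n\t X'); // true
}
```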
302 ## Calculating the length of a string
303
304 ### Problem
305
306 You want to get the length of a string, but are not sure how to
307 correctly calculate the length when working with Unicode.
308
309 ### Solution
310
311 Use string.length to get the number of UTF-16 code units in a string:
312
313 'I love music'.length; // 12
314
315 ### Discussion
316
317 For characters that fit into 16 bits, the code unit length is the same as the
318 rune length:
319
320 var hearts = '\u2661'; // ♡
321
322 hearts.length; // 1
323 hearts.runes.length; // 1
324
325 If the string contains any characters outside the Basic Multilingual
326 Plane (BMP), the rune length will be less than the code unit length:
327
328 var clef = '\u{1F3BC}'; // 🎼
329 clef.length; // 2
330 clef.runes.length; // 1
331
332 var music = 'I $hearts $clef'; // 'I ♡ 🎼'
333 music.length; // 6
334 music.runes.length; // 5
335
336 Use `length` if you want the number of code units; use `runes.length` if you
337 want the number of distinct characters.
338
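The two counts can be compared side by side. This is a hedged sketch in current Dart syntax. The helper names `codeUnitCount` and `codePointCount` are hypothetical, introduced only for illustration:

```dart
// Hypothetical helpers for illustration only.
int codeUnitCount(String s) => s.length; // UTF-16 code units
int codePointCount(String s) => s.runes.length; // Unicode code points

void main() {
  const clef = '\u{1F3BC}'; // Outside the BMP, stored as a surrogate pair.
  const music = 'I \u2661 $clef';

  print(codeUnitCount(music)); // 6
  print(codePointCount(music)); // 5
}
```

Note that `runes` counts code points, so a character built from several code points (for example, a letter followed by a combining accent) still counts more than once.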
339 ## Getting the character at a specific index in a string
340
341 ### Problem
342
343 You want to be able to access a character in a string at a particular index.
344
345 ### Solution
346
347 For strings in the Basic Multilingual Plane (BMP), use [] to subscript the
348 string:
349
350 'Dart'[0]; // 'D'
351
352 var hearts = '\u2661'; // ♡
353 hearts[0]; // '\u2661' (♡)
354
355 For non-BMP characters, subscripting yields invalid UTF-16 characters:
356
357 var coffee = '\u{1F375}'; // 🍵
358 var doughnuts = '\u{1F369}'; // 🍩
359 var healthFood = '$coffee and $doughnuts'; // 🍵 and 🍩
360
361 healthFood[0]; // Invalid string, half of a surrogate pair.
362
363 You can slice the string to get the first 2 code units:
364
365 healthFood.slice(0, 2); // 🍵
366
367 #### The safer approach: subscript runes
368
369 You can always subscript runes and be sure that you are dealing with complete
370 characters:
371
372 healthFood.runes.first; // 127861
373
374 The number 127861 represents the code point for coffee, '\u{1F375}' (🍵 ).
375
376 Contrast this with the result of subscripting `codeUnits`:
377
378 healthFood.codeUnits.first; // 55356
379
380 The number 55356 represents the first of the surrogate pair for '\u{1F375}'.
381 This is not a valid UTF-16 string.
382
383 If you are dealing with non-BMP characters, avoid subscripting `codeUnits`.
384
385
386 ## Splitting a string
387
388 ### Problem
389
390 You want to split a string into substrings.
391
392 ### Solution
393
394 To split a string into a list of characters, map the string runes:
395
396 "dart".runes.map((rune) => new String.fromCharCode(rune)).toList();
397 // ['d', 'a', 'r', 't']
398
399 var smileyFace = '\u263A'; // ☺
400 var happy = 'I am $smileyFace'; // 'I am ☺'
401 happy.runes.map((charCode) => new String.fromCharCode(charCode)).toList();
402 // [I, , a, m, , ☺]
403
404 You can also use string.split(''):
405
406 'Dart'.split(''); // ['D', 'a', 'r', 't']
407 smileyFace.split('').length; // 1
408
409 Do this only if you are sure that the string is in the Basic Multilingual
410 Plane (BMP). Since `split('')` splits at the UTF-16 code unit boundaries,
411 invoking it on a non-BMP character yields the string's surrogate pair:
412
413 var clef = '\u{1F3BC}'; // 🎼, not in BMP.
414 clef.split('').length; // 2
415
416 The surrogate pair members are not valid UTF-16 strings.
417
418
419 ### Split a string using a regExp
420
421 The `split()` method takes a string or a regExp as an argument. Here is an
422 example of using `split()` with a regExp:
423
424 var nums = "2/7 3 4/5 3~/5";
425 var numsRegExp = new RegExp(r'(\s|/|~/)');
426 nums.split(numsRegExp); // ['2', '7', '3', '4', '5', '3', '5']
427
428 In the code above, the string `nums` contains various numbers, some of which
429 are expressed as fractions or as int-divisions. A regExp is used to split the
430 string to extract just the numbers.
431
432 You can perform operations on the matched and unmatched portions of a string
433 when using `split()` with a regExp:
434
435 'Eats SHOOTS leaves'.splitMapJoin((new RegExp(r'SHOOTS')),
436 onMatch: (m) => '*${m.group(0).toLowerCase()}*',
437 onNonMatch: (n) => n.toUpperCase()); // 'EATS *shoots* LEAVES'
438
439 The regExp matches the middle word ("SHOOTS"). A pair of callbacks are
440 registered to transform the matched and unmatched substrings before the
441 substrings are joined together again.
442
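In null-safe Dart, `group(0)` returns a nullable string, so the `onMatch` callback needs a null assertion. A minimal sketch under that assumption (the function name `highlight` is hypothetical):

```dart
// Hypothetical helper: lower-cases and stars each match, and upper-cases
// everything between matches, before joining the pieces back together.
String highlight(String text, Pattern pattern) => text.splitMapJoin(
      pattern,
      // m[0] is the whole match; it is non-null whenever a match exists.
      onMatch: (m) => '*${m[0]!.toLowerCase()}*',
      onNonMatch: (n) => n.toUpperCase(),
    );

void main() {
  print(highlight('Eats SHOOTS leaves', RegExp(r'SHOOTS')));
  // EATS *shoots* LEAVES
}
```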
443 ## Changing string case
444
445 ### Problem
446
447 You want to change the case of strings.
448
449 ### Solution
450
451 Use `string.toUpperCase()` and `string.toLowerCase()` to convert a string to
452 upper-case or lower-case, respectively:
453
454 var string = "I love Lucy";
455 string.toUpperCase(); // 'I LOVE LUCY'
456 string.toLowerCase(); // 'i love lucy'
457
458 ### Discussion
459
460 Case changes affect the characters of bicameral scripts like Greek and Latin (used for French):
461
462 var zeus = '\u0394\u03af\u03b1\u03c2'; // Δίας (Zeus in modern Greek)
463 zeus.toUpperCase(); // 'ΔΊΑΣ'
464
465 var resume = '\u0052\u00e9\u0073\u0075\u006d\u00e9'; // Résumé
466 resume.toLowerCase(); // 'résumé'
467
468 They do not affect the characters of uni-case scripts like Devanagari (used for
469 writing many of the languages of India):
470
471 var chickenKebab = '\u091a\u093f\u0915\u0928 \u0915\u092c\u093e\u092c';
472 // चिकन कबाब (in Devanagari)
473 chickenKebab.toLowerCase(); // चिकन कबाब
474 chickenKebab.toUpperCase(); // चिकन कबाब
475
476 If a character's case does not change when using `toUpperCase()` and
477 `toLowerCase()`, it is most likely because the character only has one
478 form.
479
480 ## Determining whether a string contains another string
481
482 ### Problem
483
484 You want to find out if a string is a substring of another string.
485
486 ### Solution
487
488 Use `string.contains()`:
489
490 var string = 'Dart strings are immutable';
491 string.contains('immutable'); // True.
492
493 You can indicate a startIndex as a second argument:
494
495 string.contains('Dart', 2); // False
496
497 ### Discussion
498
499 The String library provides a couple of shortcuts for testing whether a string
500 is a substring of another:
501
502 string.startsWith('Dart'); // True.
503 string.endsWith('e'); // True.
504
505 You can also use `string.indexOf()`, which returns -1 if the substring is
506 not found within a string, and its matching index, if it is:
507
508 string.indexOf('art') != -1; // True, `art` is found in `Dart`
509
510 You can also use a regExp and `hasMatch()`:
511
512 new RegExp(r'ar[et]').hasMatch(string); // True, 'art' and 'are' match.
513
514
515 ## Finding matches of a regExp pattern in a string
516
517 ### Problem
518
519 You want to use regExp to match a pattern in a string, and
520 want to be able to access the matches.
521
522 ### Solution
523
524 Construct a regular expression using the RegExp class and find matches using
525 the `allMatches()` method:
526
527 var string = 'Not with a fox, not in a box';
528 var regExp = new RegExp(r'[fb]ox');
529 var matches = regExp.allMatches(string); // An Iterable of Match objects.
530 matches.map((match) => match.group(0)).toList(); // ['fox', 'box']
531
532 You can query the object returned by `allMatches()` to find out the number of
533 matches:
534
535 matches.length; // 2
536
537 To find the first match, use `firstMatch()`:
538
539 regExp.firstMatch(string).group(0); // 'fox'
540
541
542 ## Substituting strings based on regExp matches
543
544 ### Problem
545
546 You want to match substrings within a string and make substitutions based on
547 the matches.
548
549 ### Solution
550
551 Construct a regular expression using the RegExp class and make replacements
552 using the `replaceAll()` method:
553
554 'resume'.replaceAll(new RegExp(r'e'), '\u00E9'); // 'résumé'
555
556 If you want to replace just the first match, use `replaceFirst()`:
557
558 '0.0001'.replaceFirst(new RegExp(r'0+'), ''); // '.0001'
559
560 The RegExp matches one or more 0's and replaces them with an empty string.
561
562 You can use `replaceAllMapped()` and register a function to modify the
563 matches:
564
565 var heart = '\u2661'; // ♡
566 var string = "I like Ike but I $heart Lucy";
567 var regExp = new RegExp(r'[A-Z]\w+');
568 string.replaceAllMapped(regExp, (match) => match.group(0).toUpperCase());
569 // 'I like IKE but I ♡ LUCY'