Chromium Code Reviews

Side by Side Diff: base/debug/format.cc

Issue 18656004: Added a new SafeSPrintf() function that implements snprintf() in an async-safe-fashion (Closed) Base URL: svn://svn.chromium.org/chrome/trunk/src
Patch Set: Addressed Julien's comments Created 7 years, 4 months ago
1 // Copyright (c) 2013 The Chromium Authors. All rights reserved.
2 // Use of this source code is governed by a BSD-style license that can be
3 // found in the LICENSE file.
4 //
5 // Author: markus@chromium.org
6
7 #include <limits>
8
9 #include "base/debug/format.h"
10
11 #if !defined(NDEBUG)
12 // In debug builds, we use RAW_CHECK() to print useful error messages, if
13 // Format() is called with broken arguments.
14 // As our contract promises that Format() can be called from any restricted
15 // run-time context, it is not actually safe to call logging functions from it;
16 // and we only ever do so for debug builds and hope for the best.
17 // We should _never_ call any logging function other than RAW_CHECK(), and
18 // we should _never_ include any logging code that is active in production
19 // builds. Most notably, we should not include these logging functions in
20 // unofficial release builds, even though those builds would otherwise have
21 // DCHECK()s enabled.
22 // In other words, please do not remove the #ifdef around this #include.
23 // Instead, in production builds we opt for returning a degraded result,
24 // whenever an error is encountered.
25 // E.g. The broken function call
26 // Format("errno = %d (%x)", errno, strerror(errno))
27 // will print something like
28 // errno = 13 (%x)
29 // instead of
30 // errno = 13 (Access denied)
31 // In most of the anticipated use cases, that's probably the preferred
32 // behavior.
33 #include "base/logging.h"
34 #define RAW_DCHECK RAW_CHECK
jln (very slow on Chromium) 2013/08/06 22:47:44 I would love a real RAW_DCHECK in logging.h. If wh
35 #else
36 #define RAW_DCHECK(x) do { if (x) { } } while (0)
37 #endif
38
39
40 namespace base {
41 namespace debug {
42
43 // The code in this file is extremely careful to be async-signal-safe.
44 //
45 // Most obviously, we avoid calling any code that could dynamically allocate
46 // memory. Doing so would almost certainly result in bugs and dead-locks.
47 // We also avoid calling any other STL functions that could have unintended
48 // side-effects involving memory allocation or access to other shared
49 // resources.
50 //
51 // But on top of that, we also avoid calling other library functions, as many
52 // of them have the side-effect of calling getenv() (in order to deal with
53 // localization) or accessing errno. The latter sounds benign, but there are
54 // several execution contexts where it isn't even possible to safely read,
55 // let alone write, errno.
56 //
57 // The stated design goal of the Format() function is that it can be called
58 // from any context that can safely call C or C++ code (i.e. anything that
59 // doesn't require assembly code).
60 //
61 // For a brief overview of some but not all of the issues with async-signal-
62 // safety, refer to:
63 // http://pubs.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_04.html
64
65 namespace {
66
67 // Increments |count| by |inc| unless this would cause |count| to overflow.
68 // Returns "false", iff an overflow was detected.
jln (very slow on Chromium) 2013/08/06 22:47:44 The documentation here isn't correct. You need to
69 inline bool IncrementCount(size_t* count, size_t inc) {
70 // "inc" is either 1 or a "padding" value. Padding is clamped at run-time to
71 // at most SSIZE_MAX. So, we know that "inc" is always in the range
72 // 1..SSIZE_MAX.
73 // This allows us to compute "SSIZE_MAX - inc" without incurring any
74 // integer overflows.
75 RAW_DCHECK((size_t)inc <= (size_t)std::numeric_limits<ssize_t>::max());
jln (very slow on Chromium) 2013/08/06 22:47:44 The first cast (of inc) isn't necessary. The seco
76 if (*count > std::numeric_limits<ssize_t>::max() - inc) {
77 *count = std::numeric_limits<ssize_t>::max();
78 return false;
79 } else {
80 *count += inc;
81 return true;
82 }
83 }
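
The clamping behavior of IncrementCount() can be illustrated in isolation. The following is a minimal sketch, not the code under review: the names SaturatingIncrement and kMaxCount are hypothetical, and std::ptrdiff_t is assumed as a portable stand-in for POSIX ssize_t.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical stand-in for SSIZE_MAX; assumes ptrdiff_t and ssize_t have the
// same range, which holds on common platforms but is not guaranteed.
constexpr std::size_t kMaxCount =
    static_cast<std::size_t>(std::numeric_limits<std::ptrdiff_t>::max());

// Adds |inc| to |*count|. If the sum would exceed kMaxCount, |*count| is
// clamped to kMaxCount and the function returns false.
inline bool SaturatingIncrement(std::size_t* count, std::size_t inc) {
  // Compare against "kMaxCount - inc" rather than computing "*count + inc",
  // so the check itself can never wrap around.
  if (inc > kMaxCount || *count > kMaxCount - inc) {
    *count = kMaxCount;
    return false;
  }
  *count += inc;
  return true;
}
```

Unlike the reviewed function, this sketch checks |inc| at run time instead of relying on a debug-only assertion, so it stays well defined for any input.
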
84
85 // Convenience method for the common case of incrementing |count| by one.
86 inline bool IncrementCountByOne(size_t* count) {
87 return IncrementCount(count, 1);
88 }
89
90 // Emits one |ch| character into the |buf| buffer of size |sz| and updates
jln (very slow on Chromium) 2013/08/06 22:47:44 "updates the count" is not clear. "the count" mea
91 // the |count|. Returns "false", iff the buffer was already full.
92 inline bool Out(char* buf, size_t sz, size_t* count, char ch) {
93 if (sz >= 1 && *count < sz - 1) {
94 buf[*count] = ch;
95 IncrementCountByOne(count);
jln (very slow on Chromium) 2013/08/06 22:47:44 Shouldn't this be if (IncrementCountByOne(count))
96 return true;
97 }
98 IncrementCountByOne(count);
99 return false;
100 }
101
102 // Inserts |padding|-|len| bytes worth of padding into the |buf| buffer of
103 // size |sz|. |ptr| marks the position where bytes should start to be emitted,
104 // and it will be updated upon return. |count| will also be incremented by the
105 // number of bytes emitted. The |pad| character is typically either a ' ' space
106 // or a '0' zero, but other non-NUL values are legal.
107 inline void Pad(char* buf, size_t sz, size_t* count, char pad, size_t padding,
108 size_t len, char** ptr) {
109 char *dst = *ptr;
110 for (; padding > len; --padding)
111 if (Out(buf, sz, count, pad))
112 ++dst;
113 else {
114 if (--padding)
115 IncrementCount(count, padding-len);
116 break;
117 }
118 *ptr = dst;
119 }
120
121 // POSIX doesn't define any async-signal-safe function for converting
122 // an integer to ASCII. Define our own version.
123 //
124 // This also gives us the ability to make the function a little more powerful
125 // and have it deal with |padding|, with truncation, and with predicting the
126 // length of the untruncated output.
127 //
128 // IToASCII() converts an (optionally signed) integer to ASCII. It never
129 // writes more than |sz| bytes. Output will be truncated as needed, and a NUL
130 // character is appended, unless |sz| is zero. It returns the number of non-NUL
131 // bytes that would be output if no truncation had happened.
132 //
133 // It supports bases 2 through 16. Padding can be done with either '0' zeros
134 // or ' ' spaces.
135 size_t IToASCII(bool sign, bool upcase, int64_t i, char* buf, size_t sz,
136 int base, size_t padding, char pad) {
jln (very slow on Chromium) 2013/08/06 22:47:44 Style: don't mix input and outputs. Normally outpu
137 // Sanity check for the "base".
138 if (base < 2 || base > 16 || (sign && base != 10)) {
139 if (static_cast<ssize_t>(sz) >= 1)
140 buf[0] = '\000';
141 return 0;
142 }
143
144 // Handle negative numbers, if requested by caller.
145 size_t count = 0;
146 size_t n = 1;
147 char* start = buf;
148 int minint = 0;
149 bool needs_minus = false;
150 uint64_t num;
151 if (sign && i < 0) {
152 // If we aren't inserting padding, or if we are padding with '0' zeros,
153 // we should insert the minus character now. It makes it easier to
154 // correctly deal with truncated padded numbers.
155 // On the other hand, if we are padding with ' ' spaces, we have to
156 // delay outputting the minus character until later.
157 if (padding <= 2 || pad == '0') {
158 ++count;
159
160 // Make sure we can write the '-' character.
161 if (++n > sz) {
162 if (sz > 0)
163 *start = '\000';
164 } else
165 *start++ = '-';
166
167 // Adjust padding, since we just output one character already.
168 if (padding)
169 --padding;
170 } else
171 needs_minus = true;
172
173 // Turn our number positive.
174 if (i == std::numeric_limits<int64_t>::min()) {
175 // The most negative integer needs special treatment.
176 minint = 1;
177 num = -(i + 1);
178 } else {
179 // "Normal" negative numbers are easy.
180 num = -i;
181 }
182 } else
183 num = i;
184
185 // Loop until we have converted the entire number. Output at least one
186 // character (i.e. '0').
187 char* ptr = start;
188 bool started = false;
189 do {
190 // Sanity check. If padding is used to fill the entire address space,
191 // don't allow more than SSIZE_MAX bytes.
192 if (++count == static_cast<size_t>(std::numeric_limits<ssize_t>::max())) {
193 RAW_DCHECK(count <
194 static_cast<size_t>(std::numeric_limits<ssize_t>::max()));
195 break;
196 }
197
198 // Make sure there is still enough space left in our output buffer.
199 if (n == sz) {
200 if (ptr > start) {
201 // It is rare that we need to output a partial number. But if asked
202 // to do so, we will still make sure we output the correct number of
203 // leading digits.
204 // Since we are generating the digits in reverse order, we actually
205 // have to discard digits in the order that we have already emitted
206 // them. This is essentially equivalent to:
207 // memmove(start, start+1, --ptr - start)
208 --ptr;
209 for (char* move = start; move < ptr; ++move)
210 *move = move[1];
211 } else
212 goto cannot_write_anything_but_nul;
213 } else
214 ++n;
215
216 // Output the next digit and (if necessary) compensate for the most
217 // negative integer needing special treatment. This works because,
218 // no matter the bit width of the integer, the decimal representation
219 // of the most negative integer always ends in 2, 4, 6, or 8.
220 if (n <= sz) {
221 if (!num && started)
222 if (needs_minus) {
223 *ptr++ = '-';
224 needs_minus = false;
225 } else
226 *ptr++ = pad;
227 else {
228 started = true;
229 *ptr++ = (upcase ? "0123456789ABCDEF" : "0123456789abcdef")
jln (very slow on Chromium) 2013/08/06 22:47:44 I would define the base strings as static const ch
230 [num%base+minint];
jln (very slow on Chromium) 2013/08/06 22:47:44 Nit: X % Y + Z
231 }
232 }
233
234 cannot_write_anything_but_nul:
235 minint = 0;
236 num /= base;
237
238 // Add padding, if requested.
239 if (padding > 0) {
240 --padding;
241
242 // Performance optimization for when we are asked to output
243 // excessive padding, but our output buffer is limited in size.
244 // Even if we output a 128bit number in binary, we would never
245 // write more than 130 characters. So, anything beyond this limit
246 // and we can compute the result arithmetically.
247 if (count > n && count - n > 130) {
248 IncrementCount(&count, padding);
249 padding = 0;
250 }
251 }
252 } while (num || padding || needs_minus);
253
254 // Terminate the output with a NUL character.
255 if (sz > 0)
256 *ptr = '\000';
257
258 // Conversion to ASCII actually resulted in the digits being in reverse
259 // order. We can't easily generate them in forward order, as we can't tell
260 // the number of characters needed until we are done converting.
261 // So, now, we reverse the string (except for the possible '-' sign).
262 while (--ptr > start) {
263 char ch = *ptr;
264 *ptr = *start;
265 *start++ = ch;
266 }
267 return count;
268 }
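
Two ideas from IToASCII() are easier to see without padding and truncation: digits are produced least-significant first and then reversed, and the most negative integer is handled by negating (i + 1) and adding one back to the lowest digit. The sketch below is a simplified illustration under stated assumptions, not the reviewed implementation: the name SignedToDecimal is hypothetical, it uses a scratch buffer, and it refuses to write when the output does not fit instead of truncating.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>

// Writes the decimal form of |value| into |buf| (capacity |sz|, including the
// terminating NUL). Returns the number of characters written, or 0 if the
// result does not fit (in which case |buf| holds an empty string if sz > 0).
inline std::size_t SignedToDecimal(std::int64_t value, char* buf,
                                   std::size_t sz) {
  char tmp[21];  // 19 digits plus '-' is the longest possible output.
  std::size_t len = 0;
  const bool negative = value < 0;
  int minint = 0;
  std::uint64_t num;
  if (negative) {
    if (value == std::numeric_limits<std::int64_t>::min()) {
      // "-value" would overflow, so negate (value + 1) and compensate by
      // adding one to the lowest digit. That digit is 7, so no carry occurs.
      minint = 1;
      num = static_cast<std::uint64_t>(-(value + 1));
    } else {
      num = static_cast<std::uint64_t>(-value);
    }
  } else {
    num = static_cast<std::uint64_t>(value);
  }

  // Emit digits in reverse order; always emit at least one digit.
  do {
    tmp[len++] = static_cast<char>('0' + num % 10 + minint);
    minint = 0;
    num /= 10;
  } while (num);
  if (negative)
    tmp[len++] = '-';

  if (len + 1 > sz) {
    if (sz > 0)
      buf[0] = '\0';
    return 0;
  }
  // Reverse into the destination and terminate.
  for (std::size_t k = 0; k < len; ++k)
    buf[k] = tmp[len - 1 - k];
  buf[len] = '\0';
  return len;
}
```

The reviewed function avoids the scratch buffer by writing into the destination directly and reversing in place, which is what lets it handle arbitrarily large padding.
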
269
270 } // anonymous namespace
271
272 ssize_t internal::FormatN(char* buf, size_t sz, const char* fmt,
273 const Arg* args, const size_t max_args) {
274 // Make sure we can write at least one NUL byte.
275 if (static_cast<ssize_t>(sz) < 1)
276 return -1;
277
278 // Iterate over format string and interpret '%' arguments as they are
279 // encountered.
280 char* ptr = buf;
281 size_t padding;
282 char pad;
283 size_t count = 0;
284 for (unsigned int cur_arg = 0;
285 *fmt &&
286 count != static_cast<size_t>(std::numeric_limits<ssize_t>::max()); ) {
287 if (*fmt++ == '%') {
288 padding = 0;
289 pad = ' ';
290 char ch = *fmt++;
291 format_character_found:
292 switch (ch) {
293 case '0': case '1': case '2': case '3': case '4':
294 case '5': case '6': case '7': case '8': case '9':
295 // Found a width parameter. Convert to an integer value and store in
296 // "padding". If the leading digit is a zero, change the padding
297 // character from a space ' ' to a zero '0'.
298 pad = ch == '0' ? '0' : ' ';
299 for (;;) {
300 const size_t max_padding = std::numeric_limits<ssize_t>::max();
301 if (padding > max_padding/10 ||
jln (very slow on Chromium) 2013/08/06 22:47:44 style "X / Y"
302 10*padding > max_padding - (ch - '0')) {
303 RAW_DCHECK(padding <= max_padding/10 &&
304 10*padding <= max_padding - (ch - '0'));
305 // Integer overflow detected. Skip the rest of the width until
306 // we find the format character, then do the normal error handling.
307 while ((ch = *fmt++) >= '0' && ch <= '9') {
308 }
309 goto fail_to_expand;
310 }
311 padding = 10*padding + ch - '0';
312 ch = *fmt++;
313 if (ch < '0' || ch > '9') {
314 // Reached the end of the width parameter. This is where the format
315 // character is found.
316 goto format_character_found;
317 }
318 }
319 break;
320 case 'c': { // Output an ASCII character.
321 // Check that there are arguments left to be inserted.
322 if (cur_arg >= max_args) {
323 RAW_DCHECK(cur_arg < max_args);
324 goto fail_to_expand;
325 }
326
327 // Check that the argument has the expected type.
328 const Arg& arg = args[cur_arg++];
329 if (arg.type_ != Arg::INT &&
330 arg.type_ != Arg::UINT) {
331 RAW_DCHECK(arg.type_ == Arg::INT ||
332 arg.type_ == Arg::UINT);
333 goto fail_to_expand;
334 }
335
336 // Apply padding, if needed.
337 Pad(buf, sz, &count, ' ', padding, 1, &ptr);
338
339 // Convert the argument to an ASCII character and output it.
340 char ch = static_cast<char>(arg.i_);
341 if (!ch)
342 goto end_of_output_buffer;
343 if (Out(buf, sz, &count, ch))
344 ++ptr;
345 break; }
346 case 'd': { // Output a signed or unsigned integer-like value.
347 // Check that there are arguments left to be inserted.
348 if (cur_arg >= max_args) {
349 RAW_DCHECK(cur_arg < max_args);
350 goto fail_to_expand;
351 }
352
353 // Check that the argument has the expected type.
354 const Arg& arg = args[cur_arg++];
355 if (arg.type_ != Arg::INT &&
356 arg.type_ != Arg::UINT) {
357 RAW_DCHECK(arg.type_ == Arg::INT ||
358 arg.type_ == Arg::UINT);
359 goto fail_to_expand;
360 }
361
362 // Our implementation of IToASCII() can handle all widths of data types
363 // and can print both signed and unsigned values.
364 IncrementCount(&count,
365 IToASCII(arg.type_ == Arg::INT, false, arg.i_,
366 ptr, sz - (ptr - buf), 10, padding, pad));
367
368 // Advance "ptr" to the end of the string that was just emitted.
369 if (sz - (ptr - buf))
370 while (*ptr)
371 ++ptr;
372 break; }
373 case 'x': // Output an unsigned hexadecimal value.
374 case 'X':
375 case 'p': { // Output a pointer value.
376 // Check that there are arguments left to be inserted.
377 if (cur_arg >= max_args) {
378 RAW_DCHECK(cur_arg < max_args);
379 goto fail_to_expand;
380 }
381
382 const Arg& arg = args[cur_arg++];
383 int64_t i;
384 switch (ch) {
385 case 'x': // Hexadecimal values are available for integer-like args.
386 case 'X':
387 // Check that the argument has the expected type.
388 if (arg.type_ != Arg::INT &&
389 arg.type_ != Arg::UINT) {
390 RAW_DCHECK(arg.type_ == Arg::INT ||
391 arg.type_ == Arg::UINT);
392 goto fail_to_expand;
393 }
394 i = arg.i_;
395
396 // The Arg() constructor automatically performed sign extension on
397 // signed parameters. This is great when outputting a %d decimal
398 // number, but can result in unexpected leading 0xFF bytes when
399 // outputting a %x hexadecimal number. Mask bits, if necessary.
400 // We have to do this here, instead of in the Arg() constructor, as
401 // the Arg() constructor cannot tell whether we will output a %d
402 // or a %x. Only the latter should experience masking.
403 if (arg.width_ < sizeof(int64_t))
404 i &= (1LL << (8*arg.width_)) - 1;
405 break;
406 default:
407 // Pointer values require an actual pointer or a string.
408 if (arg.type_ == Arg::POINTER)
409 i = reinterpret_cast<uintptr_t>(arg.ptr_);
410 else if (arg.type_ == Arg::STRING)
411 i = reinterpret_cast<uintptr_t>(arg.s_);
412 else if (arg.type_ == Arg::INT && arg.width_ == sizeof(void *) &&
413 arg.i_ == 0) // Allow C++'s version of NULL
414 i = 0;
415 else {
416 RAW_DCHECK(arg.type_ == Arg::POINTER ||
417 arg.type_ == Arg::STRING);
418 goto fail_to_expand;
419 }
420
421 // Pointers always include the "0x" prefix. This affects padding.
422 if (padding) {
423 if (pad == ' ') {
424 // Predict the number of hex digits (including "0x" prefix) that
425 // will be output for this address when it is converted to ASCII.
426 size_t chars = 2;
427 uint64_t j = i;
428 do {
429 ++chars;
430 j >>= 4;
431 } while (j);
432
433 // Output the necessary number of space characters to perform
434 // padding. We can't rely on IToASCII() to do that for us, as it
435 // would incorrectly add padding _after_ the "0x" prefix.
436 Pad(buf, sz, &count, pad, padding, chars, &ptr);
437
438 // Inform IToASCII() that it no longer needs to handle the
439 // padding.
440 padding = 0;
441 } else {
442 // Adjust for the two-character "0x" prefix.
443 padding = padding >= 2 ? padding - 2 : 0;
444 }
445 }
446
447 // Insert "0x" prefix, if there is still sufficient space in the
448 // output buffer.
449 if (Out(buf, sz, &count, '0'))
450 ++ptr;
451 if (Out(buf, sz, &count, 'x'))
452 ++ptr;
453 break;
454 }
455
456 // No matter what data type this value originated from, print it as
457 // a regular hexadecimal number.
458 IncrementCount(&count,
459 IToASCII(false, ch != 'x', i, ptr, sz - (ptr - buf),
460 16, padding, pad));
461
462 // Advance "ptr" to the end of the string that was just emitted.
463 if (sz - (ptr - buf))
464 while (*ptr)
465 ++ptr;
466 break; }
467 case 's': {
468 // Check that there are arguments left to be inserted.
469 if (cur_arg >= max_args) {
470 RAW_DCHECK(cur_arg < max_args);
471 goto fail_to_expand;
472 }
473
474 // Check that the argument has the expected type.
475 const Arg& arg = args[cur_arg++];
476 const char *s;
477 if (arg.type_ == Arg::STRING)
478 s = arg.s_ ? arg.s_ : "<NULL>";
479 else if (arg.type_ == Arg::INT && arg.width_ == sizeof(void *) &&
480 arg.i_ == 0) // Allow C++'s version of NULL
481 s = "<NULL>";
482 else {
483 RAW_DCHECK(arg.type_ == Arg::STRING);
484 goto fail_to_expand;
485 }
486
487 // Apply padding, if needed. This requires us to first check the
488 // length of the string that we are outputting.
489 if (padding) {
490 size_t len = 0;
491 for (const char* src = s; *src++; )
492 ++len;
493 Pad(buf, sz, &count, ' ', padding, len, &ptr);
494 }
495
496 // Printing a string involves nothing more than copying it into the
497 // output buffer and making sure we don't output more bytes than
498 // available space.
499 for (const char* src = s; *src; )
500 if (Out(buf, sz, &count, *src++))
501 ++ptr;
502 break; }
503 case '%':
504 // Quoted percent '%' character.
505 goto copy_verbatim;
506 fail_to_expand:
507 // C++ gives us tools to do type checking -- something that snprintf()
508 // could never really do. So, whenever we see arguments that don't
509 // match up with the format string, we refuse to output them. But
510 // since we have to be extremely conservative about being async-
511 // signal-safe, we are limited in the type of error handling that we
512 // can do in production builds (in debug builds we can use RAW_DCHECK()
513 // and hope for the best). So, all we do is pass the format string
514 // unchanged. That should eventually get the user's attention; and in
515 // the meantime, it hopefully doesn't lose too much data.
516 default:
517 // Unknown or unsupported format character. Just copy verbatim to
518 // output.
519 if (Out(buf, sz, &count, '%'))
520 ++ptr;
521 if (!ch)
522 goto end_of_format_string;
523 if (Out(buf, sz, &count, ch))
524 ++ptr;
525 break;
526 }
527 } else {
528 copy_verbatim:
529 if (Out(buf, sz, &count, fmt[-1]))
530 ++ptr;
531 }
532 }
533 end_of_format_string:
534 end_of_output_buffer:
535 *ptr = '\000';
536 IncrementCountByOne(&count);
537 return static_cast<ssize_t>(count)-1;
538 }
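
The masking step in the %x branch above can be shown on its own. This is a minimal sketch assuming the argument was widened to int64_t and its original width in bytes was recorded, as the Arg class appears to do; the name MaskToWidth is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Drops the sign-extension bits that lie above the original |width| (in
// bytes) of a value that was widened to 64 bits.
inline std::uint64_t MaskToWidth(std::int64_t value, std::size_t width) {
  std::uint64_t bits = static_cast<std::uint64_t>(value);
  // A shift by 64 bits would be undefined, so full-width values are left
  // untouched.
  if (width < sizeof(std::int64_t))
    bits &= (std::uint64_t{1} << (8 * width)) - 1;
  return bits;
}
```

With this masking, a one-byte value of -1 would print as ff under %x rather than as sixteen f digits.
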
539
540 ssize_t FormatN(char* buf, size_t N, const char* fmt) {
541 // Make sure we can write at least one NUL byte.
542 ssize_t n = static_cast<ssize_t>(N);
543 if (n < 1)
544 return -1;
545 size_t count = 0;
546
547 // In the slow-path, we deal with errors by copying the contents of
548 // "fmt" unexpanded. This means, if there are no arguments passed, the
549 // Format() function always degenerates to a version of strncpy() that
550 // de-duplicates '%' characters.
551 char* dst = buf;
552 const char* src = fmt;
553 for (; *src; ++src) {
554 char ch = *src;
555 if (!IncrementCountByOne(&count) && n > 1) {
556 --dst;
557 break;
558 }
559 if (n > 1) {
560 --n;
561 *dst++ = ch;
562 }
563 if (ch == '%' && src[1] == '%')
564 ++src;
565 }
566 IncrementCountByOne(&count);
567 *dst = '\000';
568 return static_cast<ssize_t>(count)-1;
569 }
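
The argument-free path above behaves like a bounded copy that collapses "%%" into "%" and reports the untruncated length, in the style of snprintf(). The sketch below illustrates that contract only; the name CopyCollapsingPercent is hypothetical, it returns long rather than ssize_t for portability, and it omits the SSIZE_MAX clamping that the reviewed code performs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies |fmt| into |buf| (capacity |sz|, including the terminating NUL),
// collapsing each "%%" pair into a single '%'. Returns the length the output
// would have had without truncation, or -1 if |sz| is zero.
inline long CopyCollapsingPercent(char* buf, std::size_t sz, const char* fmt) {
  if (sz < 1)
    return -1;
  std::size_t count = 0;  // Characters the full output would contain.
  std::size_t used = 0;   // Characters actually stored in |buf|.
  for (const char* src = fmt; *src; ++src) {
    if (used + 1 < sz)  // Always leave room for the NUL byte.
      buf[used++] = *src;
    ++count;
    if (*src == '%' && src[1] == '%')
      ++src;  // Skip the second '%' of an escaped pair.
  }
  buf[used] = '\0';
  return static_cast<long>(count);
}
```

A caller can detect truncation by comparing the return value against the buffer size, as with snprintf().
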
570
571 } // namespace debug
572 } // namespace base