|
|
Chromium Code Reviews|
Created:
7 years ago by gone Modified:
6 years, 11 months ago Reviewers:
rkaplow, jar (doing other things), Ted C, Yaron, Alexei Svitkine (slow), Ilya Sherman, Kibeom Kim (inactive) CC:
chromium-reviews, jar (doing other things), asvitkine+watch_chromium.org, Kibeom Kim (inactive) Base URL:
svn://svn.chromium.org/chrome/trunk/src Visibility:
Public. |
DescriptionLet MetricsService know about some Android Activities
To tune our crash states more finely, we need to track which Activity is in the
foreground when Chrome is killed. Add piping to let MetricsService know when
Activities are swapped in and out by the user.
The CL makes MetricsService track how many times particular Activities have been
launched, as well as how often Chrome was improperly closed while one of these
Activities was in the foreground (i.e. crashed). These numbers are concatenated
into a histogram, which is then sent to the server.
BUG=321346
Patch Set 1 #Patch Set 2 : Cleaning up #Patch Set 3 : Fleshing out templates #Patch Set 4 : Cleaning #
Total comments: 6
Patch Set 5 : Comment addressing #Patch Set 6 : Separating out Android code #Patch Set 7 : Copyright #
Total comments: 14
Patch Set 8 : Switching from sparse #Patch Set 9 : Use regular histograms #
Total comments: 3
Patch Set 10 : Splitting variables #
Total comments: 6
Patch Set 11 : Addressing comments #Patch Set 12 : NOTREACHED #
Total comments: 20
Patch Set 13 : Adding more activities, addressing comments #
Total comments: 12
Messages
Total messages: 27 (0 generated)
Sending this out for official review.
https://chromiumcodereview.appspot.com/103943006/diff/60001/chrome/browser/an... File chrome/browser/android/activity_type_ids.h (right): https://chromiumcodereview.appspot.com/103943006/diff/60001/chrome/browser/an... chrome/browser/android/activity_type_ids.h:17: #endif // DEFINE_ACTIVITY_TYPE_FLAG https://chromiumcodereview.appspot.com/103943006/diff/60001/chrome/browser/an... chrome/browser/android/activity_type_ids.h:27: }; // enum Ids enum Type https://chromiumcodereview.appspot.com/103943006/diff/60001/chrome/browser/me... File chrome/browser/metrics/metrics_service.cc (right): https://chromiumcodereview.appspot.com/103943006/diff/60001/chrome/browser/me... chrome/browser/metrics/metrics_service.cc:987: int foreground = pref->GetInteger(prefs::kStabilityForegroundActivity); So you'll get onForegroundActivityChanged(ACTIVITY_NONE) in onPause, right?
https://chromiumcodereview.appspot.com/103943006/diff/60001/chrome/browser/an... File chrome/browser/android/activity_type_ids.h (right): https://chromiumcodereview.appspot.com/103943006/diff/60001/chrome/browser/an... chrome/browser/android/activity_type_ids.h:17: #endif On 2013/12/10 01:48:17, Yaron wrote: > // DEFINE_ACTIVITY_TYPE_FLAG Done. https://chromiumcodereview.appspot.com/103943006/diff/60001/chrome/browser/an... chrome/browser/android/activity_type_ids.h:27: }; // enum Ids On 2013/12/10 01:48:17, Yaron wrote: > enum Type Done. https://chromiumcodereview.appspot.com/103943006/diff/60001/chrome/browser/me... File chrome/browser/metrics/metrics_service.cc (right): https://chromiumcodereview.appspot.com/103943006/diff/60001/chrome/browser/me... chrome/browser/metrics/metrics_service.cc:987: int foreground = pref->GetInteger(prefs::kStabilityForegroundActivity); On 2013/12/10 01:48:17, Yaron wrote: > So you'll get onForegroundActivityChanged(ACTIVITY_NONE) in onPause, right? Yep: https://chrome-internal-review.googlesource.com/#/c/149216/ More specifically: https://chrome-internal-review.googlesource.com/#/c/149216/4/java/apps/chrome... recordActivityPaused() gets called explicitly when an Activity is Paused to clear that out, which should handle the weird ANR cases.
lgtm Seems sensible to me but need a good look from the uma folk.
This looks ok to me, though I'd love to see (most of) the new code factored out into a new file. This file is really too large, and this at least seems like a good opportunity to keep it from getting larger. I'd really like to get Jim's thoughts on this as well, but if he doesn't reply in a reasonable amount of time, I'm ok with being the sole UMA OWNERS reviewer.
On 2013/12/12 08:55:14, Ilya Sherman wrote: > This looks ok to me, though I'd love to see (most of) the new code factored out > into a new file. This file is really too large, and this at least seems like a > good opportunity to keep it from getting larger. > > I'd really like to get Jim's thoughts on this as well, but if he doesn't reply > in a reasonable amount of time, I'm ok with being the sole UMA OWNERS reviewer. I've split off the Android code with hooks to a metrics_service_android.cc file. Is this what you had in mind?
Thanks for moving that code out into a separate file -- I think this will really help with maintainability of this file :) (We might eventually need to rethink the number of #defines that are involved, but I think it's ok as-is for now.) https://chromiumcodereview.appspot.com/103943006/diff/120001/chrome/browser/m... File chrome/browser/metrics/metrics_service.cc (right): https://chromiumcodereview.appspot.com/103943006/diff/120001/chrome/browser/m... chrome/browser/metrics/metrics_service.cc:940: #endif // defined(OS_ANDROID) I think this line needs to be moved after Alexei's CL [ https://codereview.chromium.org/81603002/ ], which just recently landed. https://chromiumcodereview.appspot.com/103943006/diff/120001/chrome/browser/m... File chrome/browser/metrics/metrics_service_android.cc (right): https://chromiumcodereview.appspot.com/103943006/diff/120001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:26: } nit: Please arrange this method in the same order in the impl file as in the header file. (I didn't check the other methods, but this comment applies throughout.) https://chromiumcodereview.appspot.com/103943006/diff/120001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:60: DCHECK(pref); nit: No need for this, since you directly access the object on line 64. https://chromiumcodereview.appspot.com/103943006/diff/120001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:67: } nit: No need for curly braces. https://chromiumcodereview.appspot.com/103943006/diff/120001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:70: } nit: No need for curly braces. https://chromiumcodereview.appspot.com/103943006/diff/120001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:100: } This is somewhat inefficient. Any reason not to use a regular enumerated histogram instead?
https://chromiumcodereview.appspot.com/103943006/diff/120001/chrome/browser/m... File chrome/browser/metrics/metrics_service_android.cc (right): https://chromiumcodereview.appspot.com/103943006/diff/120001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:100: } On 2013/12/16 23:54:59, Ilya Sherman wrote: > This is somewhat inefficient. Any reason not to use a regular enumerated > histogram instead? No reason other than I'm not familiar with all the different histogram types. The stat we want in the end is just the number of times a launch or crash happened, summed up across all devices. Each device should just send how many times a launch or crash happened across all of a user's sessions. Would the regular enumerated histogram be fine for that, or would that just result in bucketing users by how many times they crashed or launched Chrome between uploads?
https://chromiumcodereview.appspot.com/103943006/diff/120001/chrome/browser/m... File chrome/browser/metrics/metrics_service_android.cc (right): https://chromiumcodereview.appspot.com/103943006/diff/120001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:100: } On 2013/12/17 00:17:28, dfalcantara wrote: > On 2013/12/16 23:54:59, Ilya Sherman wrote: > > This is somewhat inefficient. Any reason not to use a regular enumerated > > histogram instead? > > No reason other than I'm not familiar with all the different histogram types. > The stat we want in the end is just the number of times a launch or crash > happened, summed up across all devices. Each device should just send how many > times a launch or crash happened across all of a user's sessions. Would the > regular enumerated histogram be fine for that, or would that just result in > bucketing users by how many times they crashed or launched Chrome between > uploads? An enumerated histogram should be fine for that. The difference between an enumerated histogram and a sparse histogram is mostly whether the histogram is backed by a dense representation (vector) vs. a sparse representation (map). In this case, it looks like you're emitting to a histogram with only two buckets, so a dense representation ought to be more efficient.
Still looking at that CL you linked to, but can you take a look at the enum changes in the meantime? https://chromiumcodereview.appspot.com/103943006/diff/120001/chrome/browser/m... File chrome/browser/metrics/metrics_service.cc (right): https://chromiumcodereview.appspot.com/103943006/diff/120001/chrome/browser/m... chrome/browser/metrics/metrics_service.cc:940: #endif // defined(OS_ANDROID) On 2013/12/16 23:54:59, Ilya Sherman wrote: > I think this line needs to be moved after Alexei's CL [ > https://codereview.chromium.org/81603002/ ], which just recently landed. I guess something definitely has to be done for this CL, but that's not entirely clear since his CL splits apart incrementing the crash stat from the launch count stat (among a lot of other things). I'll take a look... https://chromiumcodereview.appspot.com/103943006/diff/120001/chrome/browser/m... File chrome/browser/metrics/metrics_service_android.cc (right): https://chromiumcodereview.appspot.com/103943006/diff/120001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:26: } On 2013/12/16 23:54:59, Ilya Sherman wrote: > nit: Please arrange this method in the same order in the impl file as in the > header file. (I didn't check the other methods, but this comment applies > throughout.) Done. https://chromiumcodereview.appspot.com/103943006/diff/120001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:60: DCHECK(pref); On 2013/12/16 23:54:59, Ilya Sherman wrote: > nit: No need for this, since you directly access the object on line 64. Done. https://chromiumcodereview.appspot.com/103943006/diff/120001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:67: } On 2013/12/16 23:54:59, Ilya Sherman wrote: > nit: No need for curly braces. Done. https://chromiumcodereview.appspot.com/103943006/diff/120001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:70: } On 2013/12/16 23:54:59, Ilya Sherman wrote: > nit: No need for curly braces. Done. https://chromiumcodereview.appspot.com/103943006/diff/120001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:100: } On 2013/12/17 00:32:22, Ilya Sherman wrote: > On 2013/12/17 00:17:28, dfalcantara wrote: > > On 2013/12/16 23:54:59, Ilya Sherman wrote: > > > This is somewhat inefficient. Any reason not to use a regular enumerated > > > histogram instead? > > > > No reason other than I'm not familiar with all the different histogram types. > > The stat we want in the end is just the number of times a launch or crash > > happened, summed up across all devices. Each device should just send how many > > times a launch or crash happened across all of a user's sessions. Would the > > regular enumerated histogram be fine for that, or would that just result in > > bucketing users by how many times they crashed or launched Chrome between > > uploads? > > An enumerated histogram should be fine for that. The difference between an > enumerated histogram and a sparse histogram is mostly whether the histogram is > backed by a dense representation (vector) vs. a sparse representation (map). In > this case, it looks like you're emitting to a histogram with only two buckets, > so a dense representation ought to be more efficient. I think I've adjusted for that use case now.
https://chromiumcodereview.appspot.com/103943006/diff/160001/chrome/browser/m... File chrome/browser/metrics/metrics_service_android.cc (right): https://chromiumcodereview.appspot.com/103943006/diff/160001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:70: ActivityTypeIds::ACTIVITY_MAX_VALUE); It looks like this histogram only has two buckets, but you're allocating space for 4 buckets instead. Do you plan to ever actually OR together ACTIVITY_MAIN and ACTIVITY_FULLSCREEN for this histogram? If not, please define a separate enum specifically for use with the histogram, and use that here. (It doesn't matter too much while there are only 2 activities, but if the list were to ever grow, exponential explosion in the size of the allocated vector is not a good place to be in.)
Current hook is to insert the histogram right before it gets cut, which I guess won't help when the crashes happen before that. This might need to be part of the initial stability log, but that doesn't seem to have any histograms in it AFAICT. https://chromiumcodereview.appspot.com/103943006/diff/160001/chrome/browser/m... File chrome/browser/metrics/metrics_service_android.cc (right): https://chromiumcodereview.appspot.com/103943006/diff/160001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:70: ActivityTypeIds::ACTIVITY_MAX_VALUE); On 2013/12/17 02:13:44, Ilya Sherman wrote: > It looks like this histogram only has two buckets, but you're allocating space > for 4 buckets instead. Do you plan to ever actually OR together ACTIVITY_MAIN > and ACTIVITY_FULLSCREEN for this histogram? If not, please define a separate > enum specifically for use with the histogram, and use that here. (It doesn't > matter too much while there are only 2 activities, but if the list were to ever > grow, exponential explosion in the size of the allocated vector is not a good > place to be in.) Split it apart, but it got a bit messy.
Alexei, please take a look at the use of RecordAndroidStabilityHistograms(). How would you recommend supporting this in concert with the notion of the "initial stability log"? https://chromiumcodereview.appspot.com/103943006/diff/160001/chrome/browser/m... File chrome/browser/metrics/metrics_service_android.cc (right): https://chromiumcodereview.appspot.com/103943006/diff/160001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:70: ActivityTypeIds::ACTIVITY_MAX_VALUE); On 2013/12/18 23:46:01, dfalcantara wrote: > On 2013/12/17 02:13:44, Ilya Sherman wrote: > > It looks like this histogram only has two buckets, but you're allocating space > > for 4 buckets instead. Do you plan to ever actually OR together ACTIVITY_MAIN > > and ACTIVITY_FULLSCREEN for this histogram? If not, please define a separate > > enum specifically for use with the histogram, and use that here. (It doesn't > > matter too much while there are only 2 activities, but if the list were to > ever > > grow, exponential explosion in the size of the allocated vector is not a good > > place to be in.) > > Split it apart, but it got a bit messy. Thanks! https://chromiumcodereview.appspot.com/103943006/diff/180001/chrome/browser/m... File chrome/browser/metrics/metrics_service_android.cc (right): https://chromiumcodereview.appspot.com/103943006/diff/180001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:89: } Please add the new histograms to histograms.xml as part of this CL. https://chromiumcodereview.appspot.com/103943006/diff/180001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:122: default: Please replace the default case with an explicit "case ActivityTypeIds::ACTIVITY_TYPE_MAX_VALUE:" so that the compiler can enforce that new cases get added if the enum grows. https://chromiumcodereview.appspot.com/103943006/diff/180001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:123: DCHECK(false) << "Unknown activity type encountered: " << type; nit: DCHECK(false) -> NOTREACHED()
Addressed the comments, I think, but still waiting for input on the histogram trigger. https://chromiumcodereview.appspot.com/103943006/diff/180001/chrome/browser/m... File chrome/browser/metrics/metrics_service_android.cc (right): https://chromiumcodereview.appspot.com/103943006/diff/180001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:89: } On 2013/12/19 00:36:37, Ilya Sherman wrote: > Please add the new histograms to histograms.xml as part of this CL. Done. https://chromiumcodereview.appspot.com/103943006/diff/180001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:122: default: On 2013/12/19 00:36:37, Ilya Sherman wrote: > Please replace the default case with an explicit "case > ActivityTypeIds::ACTIVITY_TYPE_MAX_VALUE:" so that the compiler can enforce that > new cases get added if the enum grows. Done. https://chromiumcodereview.appspot.com/103943006/diff/180001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:123: DCHECK(false) << "Unknown activity type encountered: " << type; On 2013/12/19 00:36:37, Ilya Sherman wrote: > nit: DCHECK(false) -> NOTREACHED() Done.
Remind me, why did we decide to log these as a histogram rather than as part of the Stability message of the SystemProfile? It's probably fine either way, I'm just kind of curious as to what the perceived advantage of each is. It seems not ideal to have two different ways to log our core stability data, so I'd expect that we'd probably want to settle on one or the other. FWIW, I don't think there's any issue with including histograms in the initial stability log... we just hadn't been because there weren't any that were relevant. These seem relevant. https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... File chrome/browser/metrics/metrics_service.cc (right): https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... chrome/browser/metrics/metrics_service.cc:963: if (!pref->GetBoolean(prefs::kStabilityExitedCleanly)) { This check will probably need to be updated, so that Android stability stats are included. There are likely similar checks elsewhere that need to be updated as well. https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... File chrome/browser/metrics/metrics_service_android.cc (right): https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:123: NOTREACHED() << "Unknown activity type encountered: " << type; nit: No need for the rest of the logging stmt now, since there's only one |type| that can reach this code path. https://chromiumcodereview.appspot.com/103943006/diff/260001/tools/metrics/hi... File tools/metrics/histograms/histograms.xml (right): https://chromiumcodereview.appspot.com/103943006/diff/260001/tools/metrics/hi... tools/metrics/histograms/histograms.xml:21591: + <int value="3" label="Maximum Activity value (invalid)"/> nit: Please remove this bucket, since it's not a bucket that should be possible to emit.
On 2013/12/20 23:28:20, Ilya Sherman (Away thru Jan 6) wrote: > Remind me, why did we decide to log these as a histogram rather than as part of > the Stability message of the SystemProfile? It's probably fine either way, I'm > just kind of curious as to what the perceived advantage of each is. It seems > not ideal to have two different ways to log our core stability data, so I'd > expect that we'd probably want to settle on one or the other. Not sure what the exact reasoning was; I'd originally added them to the protobuf definition but jar@ requested that I move them over to histograms to avoid having to "write custom code to visualize it". I'd still argue the protobuf definition makes more sense, given that the other stability metrics are there, though.
On 2013/12/20 23:34:36, dfalcantara wrote: > On 2013/12/20 23:28:20, Ilya Sherman (Away thru Jan 6) wrote: > > Remind me, why did we decide to log these as a histogram rather than as part > of > > the Stability message of the SystemProfile? It's probably fine either way, > I'm > > just kind of curious as to what the perceived advantage of each is. It seems > > not ideal to have two different ways to log our core stability data, so I'd > > expect that we'd probably want to settle on one or the other. > > Not sure what the exact reasoning was; I'd originally added them to the protobuf > definition but jar@ requested that I move them over to histograms to avoid > having to "write custom code to visualize it". I'd still argue the protobuf > definition makes more sense, given that the other stability metrics are there, > though. Ok. Sounds like we should discuss this further, then, I s'pose.
https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... File chrome/browser/metrics/metrics_service.cc (right): https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... chrome/browser/metrics/metrics_service.cc:1292: RecordAndroidStabilityHistograms(); One problem with logging these as histograms (compared to system profile proto) is that they won't be picked up by the initial stability log (which doesn't have histograms). The initial stability log was added to more correctly track stability for experiments. Basically, the idea is that crashes from a previous session should report correct field trial stats, and the initial stability log does this by saving the system profile from the last session and using it for the initial stability log. So either these should be done as actual entries in the proto (which as Jim points out would not be immediately supported by the dashboard without server-side changes - though can still be queried manually) or to add a mechanism that allows for adding a specific set of whitelisted histograms to the initial stability log (right now the initial stability log has no histograms). The latter approach would require actually implementing such a mechanism, but it might not be very hard. https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... File chrome/browser/metrics/metrics_service_android.cc (right): https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:67: UMA_HISTOGRAM_ENUMERATION("Chrome.Android.Activity.LaunchCounts", Nit: Make a helper function in the anon namespace for this that just takes in the activity type id. (Histogram macros expand to a blob of code and its better for code size to not repeat them - also prevents accidental typos in the name). https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:113: case ActivityTypeIds::ACTIVITY_TYPE_NONE: Can you add a comment explaining what this means? https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:126: pref->SetInteger(prefs::kStabilityLaunchedActivityFlags, launched_activities); Why use an intermediate pref for this, rather than incrementing kStabilityLaunchCountMainActivity and kStabilityLaunchCountFullScreenActivity here directly? Is it to avoid double counting? If so, can you just keep |launched_activities| a local var and record ones you've seen already and log the new ones? https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/common/pr... File chrome/common/pref_names.cc (right): https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/common/pr... chrome/common/pref_names.cc:1497: // ID of the Activity that is currently in the foreground. What kind of ID and how is it stored (string / int)? Document this. https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/common/pr... chrome/common/pref_names.cc:1501: // ORed set of flags indicating which Activities were launched during the Which flags? Can you point to the corresponding constants/enums in the code in this comment? https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/common/pr... chrome/common/pref_names.cc:1508: "user_experience_metrics.stability.launch_count_main_activity"; Instead of special casing main activity, full screen activity, etc, would it make more sense to use a list item here, where each entry has activity_type, launch_count, crash_count, is_foreground? Seems like that would be cleaner and more extensible in the future. (See how we store plugins for UMA stability for an example of this).
https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... File chrome/browser/metrics/metrics_service_android.cc (right): https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:126: pref->SetInteger(prefs::kStabilityLaunchedActivityFlags, launched_activities); On 2013/12/23 16:28:01, Alexei Svitkine wrote: > Why use an intermediate pref for this, rather than incrementing > kStabilityLaunchCountMainActivity and kStabilityLaunchCountFullScreenActivity > here directly? > > Is it to avoid double counting? If so, can you just keep |launched_activities| a > local var and record ones you've seen already and log the new ones? (I mean class instance var, rather than a local var.)
On 2013/12/23 16:29:51, asvitkine (OOO until Jan 6th) wrote: > https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... > File chrome/browser/metrics/metrics_service_android.cc (right): > > https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... > chrome/browser/metrics/metrics_service_android.cc:126: > pref->SetInteger(prefs::kStabilityLaunchedActivityFlags, launched_activities); > On 2013/12/23 16:28:01, Alexei Svitkine wrote: > > Why use an intermediate pref for this, rather than incrementing > > kStabilityLaunchCountMainActivity and kStabilityLaunchCountFullScreenActivity > > here directly? > > > > Is it to avoid double counting? If so, can you just keep |launched_activities| > a > > local var and record ones you've seen already and log the new ones? > > (I mean class instance var, rather than a local var.) Adding jar's comments here since most discussion is happening here: IMO, it is faster/easier to add a histogram (when practical) than to add protobufs. The latter requires client and server change, carefully sequenced, and and also requires some custom dremel or sawzall processing. The former is predominantly a client-only addition, often taking a day or so to establish. If you prefer the protobuf route, have at it. It is not unreasonable. Note that if you take the histogram route, you can still harvest the data via dremel. The one drawback is that it is harder to get correlations between the crash data, and other histograms (as all histograms are in repeating fields.). Ilya might have additional pro/con comments.
Changed the CL to reflect changes for the comments that could be immediately addressed and added several new IDs for other Activities we want to watch crashes for. I have no strong preference either way for whether these are stored as histograms or protobufs, but we really need to get this in for M34 because we're flying blind on many potential crashes. On Android, Chrome doesn't track any crash stats outside of those that occur in the main Chrome browser Activity, which means that we're not getting any stats for application shortcut Activities, the settings Activity, or whatever other Activities I've missed. A concrete decision on what path to take is most welcome. https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... File chrome/browser/metrics/metrics_service.cc (right): https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... chrome/browser/metrics/metrics_service.cc:963: if (!pref->GetBoolean(prefs::kStabilityExitedCleanly)) { On 2013/12/20 23:28:21, Ilya Sherman (Away thru Jan 6) wrote: > This check will probably need to be updated, so that Android stability stats are > included. There are likely similar checks elsewhere that need to be updated as > well. The check should still hold: we're still counting on that stat to track when a UMA session failed to end cleanly. This CL just tries to make the crash location a lot clearer. https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... File chrome/browser/metrics/metrics_service_android.cc (right): https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:67: UMA_HISTOGRAM_ENUMERATION("Chrome.Android.Activity.LaunchCounts", On 2013/12/23 16:28:01, asvitkine (OOO until Jan 6th) wrote: > Nit: Make a helper function in the anon namespace for this that just takes in > the activity type id. (Histogram macros expand to a blob of code and its better > for code size to not repeat them - also prevents accidental typos in the name). Changed it to loop over the possibilities instead. https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:113: case ActivityTypeIds::ACTIVITY_TYPE_NONE: On 2013/12/23 16:28:01, asvitkine (OOO until Jan 6th) wrote: > Can you add a comment explaining what this means? Added a more explicit ACTIVITY_UNKNOWN here to track that some other Activity we haven't enumerated has been launched. https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:123: NOTREACHED() << "Unknown activity type encountered: " << type; On 2013/12/20 23:28:21, Ilya Sherman (Away thru Jan 6) wrote: > nit: No need for the rest of the logging stmt now, since there's only one |type| > that can reach this code path. Redid this bit. Should be cleaner now. https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:126: pref->SetInteger(prefs::kStabilityLaunchedActivityFlags, launched_activities); On 2013/12/23 16:29:52, asvitkine (OOO until Jan 6th) wrote: > On 2013/12/23 16:28:01, Alexei Svitkine wrote: > > Why use an intermediate pref for this, rather than incrementing > > kStabilityLaunchCountMainActivity and kStabilityLaunchCountFullScreenActivity > > here directly? > > > > Is it to avoid double counting? If so, can you just keep |launched_activities| > a > > local var and record ones you've seen already and log the new ones? > > (I mean class instance var, rather than a local var.) tl;dr: Yeah, it's to avoid double-counting, but it's also done so that we don't lose the stat during a crash. Android sessions are currently defined as the time between when the main browser Activity (i.e. browser window) is brought to the foreground and sent to the background. We define it this way because Android is free to kill the Chrome process whenever it's not visible, meaning that it'd be difficult to properly log a "clean shutdown". This is also why getting a clean browser shutdown is impractical on Android: if we try and perform a shutdown every time the app went to the background, a user temporarily switching to another app would incur a slow cold start every time they switched back. Instead, we compromise and end UMA sessions once in the background. This is increasingly problematic, however, since we have other Activities that can trigger a Chrome launch without ever bringing the main browser Activity to the foreground (e.g. application shortcuts or FullScreenActivities). Consequently, these Activities are untracked by the metrics service and crashes go unnoticed. I'm trying to expand the definition of the UMA session to include when any Chrome Activity is in the foreground through this CL, which is important because the user can easily swap between the main browser Activity and a FullScreenActivity without ever backgrounding Chrome, which feels like it should count as a single session. We don't want to count these as separate launches because they're analogously just swapping between two desktop windows. Incrementing the counter whenever the foreground Activity would then result in inflated counts. Instead, we record a set of flags indicating which Activities were launched for that given session and increment them when a new session starts. Since we don't immediately save the histograms out to disk, we store them in prefs to ensure we don't lose them between crashes. https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/common/pr... File chrome/common/pref_names.cc (right): https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/common/pr... chrome/common/pref_names.cc:1497: // ID of the Activity that is currently in the foreground. On 2013/12/23 16:28:01, asvitkine (OOO until Jan 6th) wrote: > What kind of ID and how is it stored (string / int)? Document this. Done. https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/common/pr... chrome/common/pref_names.cc:1501: // ORed set of flags indicating which Activities were launched during the On 2013/12/23 16:28:01, asvitkine (OOO until Jan 6th) wrote: > Which flags? Can you point to the corresponding constants/enums in the code in > this comment? Done. https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/common/pr... chrome/common/pref_names.cc:1508: "user_experience_metrics.stability.launch_count_main_activity"; On 2013/12/23 16:28:01, asvitkine (OOO until Jan 6th) wrote: > Instead of special casing main activity, full screen activity, etc, would it > make more sense to use a list item here, where each entry has activity_type, > launch_count, crash_count, is_foreground? Seems like that would be cleaner and > more extensible in the future. (See how we store plugins for UMA stability for > an example of this). Which activity is in a singular value; only one can be in the foreground at any given time. It gets translated to a crash state later. I've switched to lists otherwise; thanks for the heads up. https://chromiumcodereview.appspot.com/103943006/diff/260001/tools/metrics/hi... File tools/metrics/histograms/histograms.xml (right): https://chromiumcodereview.appspot.com/103943006/diff/260001/tools/metrics/hi... tools/metrics/histograms/histograms.xml:21591: + <int value="3" label="Maximum Activity value (invalid)"/> On 2013/12/20 23:28:21, Ilya Sherman (Away thru Jan 6) wrote: > nit: Please remove this bucket, since it's not a bucket that should be possible > to emit. Done.
On 2013/12/28 01:33:13, dfalcantara wrote: > Changed the CL to reflect changes for the comments that could be immediately > addressed and added several new IDs for other Activities we want to watch > crashes for. I have no strong preference either way for whether these are > stored as histograms or protobufs, but we really need to get this in for M34 > because we're flying blind on many potential crashes. On Android, Chrome > doesn't track any crash stats outside of those that occur in the main Chrome > browser Activity, which means that we're not getting any stats for application > shortcut Activities, the settings Activity, or whatever other Activities I've > missed. > > A concrete decision on what path to take is most welcome. > > https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... > File chrome/browser/metrics/metrics_service.cc (right): > > https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... > chrome/browser/metrics/metrics_service.cc:963: if > (!pref->GetBoolean(prefs::kStabilityExitedCleanly)) { > On 2013/12/20 23:28:21, Ilya Sherman (Away thru Jan 6) wrote: > > This check will probably need to be updated, so that Android stability stats > are > > included. There are likely similar checks elsewhere that need to be updated > as > > well. > > The check should still hold: we're still counting on that stat to track when a > UMA session failed to end cleanly. This CL just tries to make the crash > location a lot clearer. > > https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... > File chrome/browser/metrics/metrics_service_android.cc (right): > > https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... > chrome/browser/metrics/metrics_service_android.cc:67: > UMA_HISTOGRAM_ENUMERATION("Chrome.Android.Activity.LaunchCounts", > On 2013/12/23 16:28:01, asvitkine (OOO until Jan 6th) wrote: > > Nit: Make a helper function in the anon namespace for this that just takes in > > the activity type id. (Histogram macros expand to a blob of code and its > better > > for code size to not repeat them - also prevents accidental typos in the > name). > > Changed it to loop over the possibilities instead. > > https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... > chrome/browser/metrics/metrics_service_android.cc:113: case > ActivityTypeIds::ACTIVITY_TYPE_NONE: > On 2013/12/23 16:28:01, asvitkine (OOO until Jan 6th) wrote: > > Can you add a comment explaining what this means? > > Added a more explicit ACTIVITY_UNKNOWN here to track that some other Activity we > haven't enumerated has been launched. > > https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... > chrome/browser/metrics/metrics_service_android.cc:123: NOTREACHED() << "Unknown > activity type encountered: " << type; > On 2013/12/20 23:28:21, Ilya Sherman (Away thru Jan 6) wrote: > > nit: No need for the rest of the logging stmt now, since there's only one > |type| > > that can reach this code path. > > Redid this bit. Should be cleaner now. > > https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/browser/m... > chrome/browser/metrics/metrics_service_android.cc:126: > pref->SetInteger(prefs::kStabilityLaunchedActivityFlags, launched_activities); > On 2013/12/23 16:29:52, asvitkine (OOO until Jan 6th) wrote: > > On 2013/12/23 16:28:01, Alexei Svitkine wrote: > > > Why use an intermediate pref for this, rather than incrementing > > > kStabilityLaunchCountMainActivity and > kStabilityLaunchCountFullScreenActivity > > > here directly? > > > > > > Is it to avoid double counting? If so, can you just keep > |launched_activities| > > a > > > local var and record ones you've seen already and log the new ones? > > > > (I mean class instance var, rather than a local var.) > > tl;dr: Yeah, it's to avoid double-counting, but it's also done so that we don't > lose the stat during a crash. > > Android sessions are currently defined as the time between when the main browser > Activity (i.e. browser window) is brought to the foreground and sent to the > background. We define it this way because Android is free to kill the Chrome > process whenever it's not visible, meaning that it'd be difficult to properly > log a "clean shutdown". This is also why getting a clean browser shutdown is > impractical on Android: if we try and perform a shutdown every time the app went > to the background, a user temporarily switching to another app would incur a > slow cold start every time they switched back. Instead, we compromise and end > UMA sessions once in the background. > > This is increasingly problematic, however, since we have other Activities that > can trigger a Chrome launch without ever bringing the main browser Activity to > the foreground (e.g. application shortcuts or FullScreenActivities). > Consequently, these Activities are untracked by the metrics service and crashes > go unnoticed. > > I'm trying to expand the definition of the UMA session to include when any > Chrome Activity is in the foreground through this CL, which is important because > the user can easily swap between the main browser Activity and a > FullScreenActivity without ever backgrounding Chrome, which feels like it should > count as a single session. We don't want to count these as separate launches > because they're analogously just swapping between two desktop windows. > Incrementing the counter whenever the foreground Activity would then result in > inflated counts. > > Instead, we record a set of flags indicating which Activities were launched for > that given session and increment them when a new session starts. Since we don't > immediately save the histograms out to disk, we store them in prefs to ensure we > don't lose them between crashes. > > https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/common/pr... > File chrome/common/pref_names.cc (right): > > https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/common/pr... > chrome/common/pref_names.cc:1497: // ID of the Activity that is currently in the > foreground. > On 2013/12/23 16:28:01, asvitkine (OOO until Jan 6th) wrote: > > What kind of ID and how is it stored (string / int)? Document this. > > Done. > > https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/common/pr... > chrome/common/pref_names.cc:1501: // ORed set of flags indicating which > Activities were launched during the > On 2013/12/23 16:28:01, asvitkine (OOO until Jan 6th) wrote: > > Which flags? Can you point to the corresponding constants/enums in the code in > > this comment? > > Done. > > https://chromiumcodereview.appspot.com/103943006/diff/260001/chrome/common/pr... > chrome/common/pref_names.cc:1508: > "user_experience_metrics.stability.launch_count_main_activity"; > On 2013/12/23 16:28:01, asvitkine (OOO until Jan 6th) wrote: > > Instead of special casing main activity, full screen activity, etc, would it > > make more sense to use a list item here, where each entry has activity_type, > > launch_count, crash_count, is_foreground? Seems like that would be cleaner and > > more extensible in the future. (See how we store plugins for UMA stability for > > an example of this). > > Which activity is in a singular value; only one can be in the foreground at any > given time. It gets translated to a crash state later. I've switched to lists > otherwise; thanks for the heads up. > > https://chromiumcodereview.appspot.com/103943006/diff/260001/tools/metrics/hi... > File tools/metrics/histograms/histograms.xml (right): > > https://chromiumcodereview.appspot.com/103943006/diff/260001/tools/metrics/hi... > tools/metrics/histograms/histograms.xml:21591: + <int value="3" label="Maximum > Activity value (invalid)"/> > On 2013/12/20 23:28:21, Ilya Sherman (Away thru Jan 6) wrote: > > nit: Please remove this bucket, since it's not a bucket that should be > possible > > to emit. > > Done. Ping?
I think if you're going to go with histograms, you should ensure these histograms get added to the initial stability log. Currently, PrepareInitialStabilityLog doesn't add any histograms. I think you would need to change it to get the state of a set of specific histograms (i.e. the ones you're adding) and add those. This probably requires adding a bit of extra infrastructure to be able to filter the histograms to save... https://chromiumcodereview.appspot.com/103943006/diff/400001/chrome/browser/a... File chrome/browser/android/activity_type_ids.cc (right): https://chromiumcodereview.appspot.com/103943006/diff/400001/chrome/browser/a... chrome/browser/android/activity_type_ids.cc:11: // static Remove this, since this isn't a static function in a class. https://chromiumcodereview.appspot.com/103943006/diff/400001/chrome/browser/a... File chrome/browser/android/activity_type_ids.h (right): https://chromiumcodereview.appspot.com/103943006/diff/400001/chrome/browser/a... chrome/browser/android/activity_type_ids.h:9: #ifndef DEFINE_ACTIVITY_ID This is a bit messy. I suggest splitting into two files, one that just has the macros and names (e.g. like net_error_list.h[1]) and another that includes it that's meant to be used directly by C++ code (e.g. with a header guard, the enum declaration, etc). [1] https://code.google.com/p/chromium/codesearch#chromium/src/net/base/net_error... Then, you won't need these extra ifdefs. https://chromiumcodereview.appspot.com/103943006/diff/400001/chrome/browser/m... File chrome/browser/metrics/metrics_service_android.cc (right): https://chromiumcodereview.appspot.com/103943006/diff/400001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:44: PrefService* pref = g_browser_process->local_state(); Can this be passed as a param? Same for the function below. So this file doesn't need to reach into g_browser_process. https://chromiumcodereview.appspot.com/103943006/diff/400001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:53: for (int activity_type = ActivityTypeIds::ACTIVITY_NONE; Nit: You use activity_type here but activity in another function. Standardize on one of the two. https://chromiumcodereview.appspot.com/103943006/diff/400001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:112: CHECK(type < ActivityTypeIds::ACTIVITY_MAX_VALUE); I think that's why you have GetActivityType() to validate the range. Why do it here too? Presumably by the time its already an enum, it should be valid (vs. when its an int coming from the Java side). https://chromiumcodereview.appspot.com/103943006/diff/400001/chrome/browser/m... chrome/browser/metrics/metrics_service_android.cc:116: if (activity == type) return; Put return on next line.
On 2014/01/09 20:20:11, Alexei Svitkine wrote: > I think if you're going to go with histograms, you should ensure these > histograms get added to the initial stability log. Yeah, that's what I'm stuck on. No one has given a concrete "put it in histogram form" or "put it in the protobuf definition".
On 2014/01/09 21:00:55, dfalcantara wrote: > On 2014/01/09 20:20:11, Alexei Svitkine wrote: > > I think if you're going to go with histograms, you should ensure these > > histograms get added to the initial stability log. > > Yeah, that's what I'm stuck on. No one has given a concrete "put it in > histogram form" or "put it in the protobuf definition". I think either way is okay, with the following considerations: pros of histograms: you can use the dashboard to see the values without needing custom dremel or building new dashboards cons of histograms: to do it correctly, you need to change the code that generates initial stability log to include these histograms (only) With protobuf changes, you don't need the extra changes for the stability log so the client-code is potentially simpler, but you lose the histograms support in the dashboard. Given the above, I'm ok with either but slightly leaning more towards histograms provided you implement the support to include them (and only these) in the initial stability log.
Adding Kibeom to look into adding histogram support.
Continuing at https://codereview.chromium.org/137623002/ https://codereview.chromium.org/103943006/diff/400001/chrome/browser/android/... File chrome/browser/android/activity_type_ids.cc (right): https://codereview.chromium.org/103943006/diff/400001/chrome/browser/android/... chrome/browser/android/activity_type_ids.cc:11: // static On 2014/01/09 20:20:12, Alexei Svitkine wrote: > Remove this, since this isn't a static function in a class. Done. https://codereview.chromium.org/103943006/diff/400001/chrome/browser/android/... File chrome/browser/android/activity_type_ids.h (right): https://codereview.chromium.org/103943006/diff/400001/chrome/browser/android/... chrome/browser/android/activity_type_ids.h:9: #ifndef DEFINE_ACTIVITY_ID On 2014/01/09 20:20:12, Alexei Svitkine wrote: > This is a bit messy. > > I suggest splitting into two files, one that just has the macros and names (e.g. > like net_error_list.h[1]) and another that includes it that's meant to be used > directly by C++ code (e.g. with a header guard, the enum declaration, etc). > > [1] > https://code.google.com/p/chromium/codesearch#chromium/src/net/base/net_error... > > Then, you won't need these extra ifdefs. Done. https://codereview.chromium.org/103943006/diff/400001/chrome/browser/metrics/... File chrome/browser/metrics/metrics_service_android.cc (right): https://codereview.chromium.org/103943006/diff/400001/chrome/browser/metrics/... chrome/browser/metrics/metrics_service_android.cc:44: PrefService* pref = g_browser_process->local_state(); On 2014/01/09 20:20:12, Alexei Svitkine wrote: > Can this be passed as a param? Same for the function below. > > So this file doesn't need to reach into g_browser_process. Done. https://codereview.chromium.org/103943006/diff/400001/chrome/browser/metrics/... chrome/browser/metrics/metrics_service_android.cc:53: for (int activity_type = ActivityTypeIds::ACTIVITY_NONE; On 2014/01/09 20:20:12, Alexei Svitkine wrote: > Nit: You use activity_type here but activity in another function. Standardize on > one of the two. Done. https://codereview.chromium.org/103943006/diff/400001/chrome/browser/metrics/... chrome/browser/metrics/metrics_service_android.cc:112: CHECK(type < ActivityTypeIds::ACTIVITY_MAX_VALUE); On 2014/01/09 20:20:12, Alexei Svitkine wrote: > I think that's why you have GetActivityType() to validate the range. Why do it > here too? Presumably by the time its already an enum, it should be valid (vs. > when its an int coming from the Java side). Done. https://codereview.chromium.org/103943006/diff/400001/chrome/browser/metrics/... chrome/browser/metrics/metrics_service_android.cc:116: if (activity == type) return; On 2014/01/09 20:20:12, Alexei Svitkine wrote: > Put return on next line. Done. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
