Issue 2400283003: [Findit] Add skeleton code for delta test script.

Sharu Jiang

Patchset #1 (id:1) has been deleted

4 years, 2 months ago (2016-10-08 00:35:14 UTC) #1

Sharu Jiang

Description was changed from ========== [Findit] Add skeleton code for delta test script. BUG=605369 ========== ...

4 years, 2 months ago (2016-10-08 00:41:30 UTC) #3

Sharu Jiang

katesonia@chromium.org changed reviewers: + mbarbella@google.com, stgao@chromium.org, wrengr@chromium.org

4 years, 2 months ago (2016-10-08 00:41:30 UTC) #4

Sharu Jiang

katesonia@chromium.org changed reviewers: + chanli@chromium.org, lijeffrey@chromium.org

4 years, 2 months ago (2016-10-08 00:41:46 UTC) #5

Sharu Jiang

katesonia@chromium.org changed reviewers: + mbarbella@chromium.org - mbarbella@google.com

4 years, 2 months ago (2016-10-08 00:43:19 UTC) #6

wrengr

https://codereview.chromium.org/2400283003/diff/40001/appengine/findit/common/cache_decorator.py File appengine/findit/common/cache_decorator.py (right): https://codereview.chromium.org/2400283003/diff/40001/appengine/findit/common/cache_decorator.py#newcode140 appengine/findit/common/cache_decorator.py:140: func (function): An abitrary function. was correct before: "arbitrary" ...

4 years, 2 months ago (2016-10-11 22:57:44 UTC) #8

Sharu Jiang

https://codereview.chromium.org/2400283003/diff/40001/appengine/findit/common/cache_decorator.py File appengine/findit/common/cache_decorator.py (right): https://codereview.chromium.org/2400283003/diff/40001/appengine/findit/common/cache_decorator.py#newcode140 appengine/findit/common/cache_decorator.py:140: func (function): An abitrary function. On 2016/10/11 22:57:43, wrengr ...

4 years, 2 months ago (2016-10-12 01:18:20 UTC) #9

wrengr

https://codereview.chromium.org/2400283003/diff/40001/appengine/findit/crash/findit_for_chromecrash.py File appengine/findit/crash/findit_for_chromecrash.py (right): https://codereview.chromium.org/2400283003/diff/40001/appengine/findit/crash/findit_for_chromecrash.py#newcode28 appengine/findit/crash/findit_for_chromecrash.py:28: def fields(self): On 2016/10/12 01:18:20, sharu jiang wrote: > ...

4 years, 2 months ago (2016-10-12 20:15:17 UTC) #10

Sharu Jiang

https://codereview.chromium.org/2400283003/diff/40001/appengine/findit/crash/findit_for_chromecrash.py File appengine/findit/crash/findit_for_chromecrash.py (right): https://codereview.chromium.org/2400283003/diff/40001/appengine/findit/crash/findit_for_chromecrash.py#newcode28 appengine/findit/crash/findit_for_chromecrash.py:28: def fields(self): On 2016/10/12 20:15:17, wrengr wrote: > On ...

4 years, 2 months ago (2016-10-12 22:58:27 UTC) #11

stgao

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py File appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py (right): https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py#newcode120 appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:120: # TODO(katesoina): Implement run-azalea.py. Is the TODO still valid? ...

4 years, 2 months ago (2016-10-14 01:40:40 UTC) #12

Sharu Jiang

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py File appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py (right): https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py#newcode120 appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:120: # TODO(katesoina): Implement run-azalea.py. On 2016/10/14 01:40:40, stgao wrote: ...

4 years, 2 months ago (2016-10-15 01:24:47 UTC) #14

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_...
File appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py
(right):

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:120: #
TODO(katesoina): Implement run-azalea.py.
On 2016/10/14 01:40:40, stgao wrote:
> Is the TODO still valid?

run-azalea.py is not finished yet, however, right, better add Todo there instead
of here.

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:121:
command = 'python %s %s' % ('run-azalea.py', result_path) + (
On 2016/10/14 01:40:40, stgao wrote:
> what's the current working dir? Will it matter here?

In line 114, we make sure the current working dir is the delta test dir.

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:122: ' -v'
if verbose else '')
On 2016/10/14 01:40:40, stgao wrote:
> Can we use a argument list instead of a string?

Done.

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:130:
p.communicate(input=json.dumps(crashes))
On 2016/10/14 01:40:40, stgao wrote:
> Should we pass information through a file instead?

We can, we can cache the CrashIterator results(batches of crashes) and let
run-azalea read those cache, but as I stated in the design doc
https://docs.google.com/document/d/1kfu_HIJFCSwOWQleRpSkOH4t_D0UFY5vDSSausaZu...,
this may caused some disk space since whenever args changed, we need to cache
those crash_infos (it's relatively big compared to culprit result) again. 

I can do this in another cl.

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:135:
logging.info('Fail to get results.')
On 2016/10/14 01:40:40, stgao wrote:
> Should it be an error or info? Same for other usage of logging.

Done.

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:139: return
pickle.load(f)
On 2016/10/14 01:40:40, stgao wrote:
> Just curious: why we use pickle instead of json?

The results are a general concept, it can be a object like Culprit object.

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:183: for
git_hash in [git_hash1, git_hash2]:
On 2016/10/14 01:40:40, stgao wrote:
> So for each crash, we switch the checkout twice. If we have 100 crashes, we
have
> to switch checkout 200 times.
> 
> Can we avoid such an overhead?

This is not for each crash, it is for each batch of 1000 crashes
(batch_run=True). So I think the overhead should be ok.

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_...
File appengine/findit/util_scripts/crash_queries/delta_test/run-delta-test.py
(right):

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/run-delta-test.py:56:
'--after',
On 2016/10/14 01:40:40, stgao wrote:
> same as the other CL, why two different names?

As answered in the other cl, this is similar to the args of 'git log', have two
options for people to remember :)

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/run-delta-test.py:88: #
If in verbose mode, prints debug infos about Gqlquery.
On 2016/10/14 01:40:40, stgao wrote:
> Just Gqlquery?

Oops, needs to be updated.

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/run-delta-test.py:117:
logging.info('Delta results already existed in\n%s', delta_path)
On 2016/10/14 01:40:40, stgao wrote:
> For this case, what's the expected action by the user?

In this case, user can check out the csv result file.

stgao

Look good after comments are addressed. I'm fine with such util scripts having no tests, ...

4 years, 2 months ago (2016-10-20 01:40:20 UTC) #17

Look good after comments are addressed.

I'm fine with such util scripts having no tests, but you may want to test it out
before committing.

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_...
File appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py
(right):

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:121:
command = 'python %s %s' % ('run-azalea.py', result_path) + (
On 2016/10/15 01:24:47, sharu jiang wrote:
> On 2016/10/14 01:40:40, stgao wrote:
> > what's the current working dir? Will it matter here?
> 
> In line 114, we make sure the current working dir is the delta test dir.

Is this new process run in the same shell as that process in #115 above? Did the
`cd DIR` really change the current working dir? Have you tried in a command
line?

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:130:
p.communicate(input=json.dumps(crashes))
On 2016/10/15 01:24:47, sharu jiang wrote:
> On 2016/10/14 01:40:40, stgao wrote:
> > Should we pass information through a file instead?
> 
> We can, we can cache the CrashIterator results(batches of crashes) and let
> run-azalea read those cache, but as I stated in the design doc
>
https://docs.google.com/document/d/1kfu_HIJFCSwOWQleRpSkOH4t_D0UFY5vDSSausaZu...,
> this may caused some disk space since whenever args changed, we need to cache
> those crash_infos (it's relatively big compared to culprit result) again. 

I don't quite understand this. Maybe we could chat in person.

> 
> I can do this in another cl.

If this is the plan, let's add a TODO with a bug.

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:139: return
pickle.load(f)
On 2016/10/15 01:24:47, sharu jiang wrote:
> On 2016/10/14 01:40:40, stgao wrote:
> > Just curious: why we use pickle instead of json?
> 
> The results are a general concept, it can be a object like Culprit object.

Acknowledged.

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:183: for
git_hash in [git_hash1, git_hash2]:
On 2016/10/15 01:24:47, sharu jiang wrote:
> On 2016/10/14 01:40:40, stgao wrote:
> > So for each crash, we switch the checkout twice. If we have 100 crashes, we
> have
> > to switch checkout 200 times.
> > 
> > Can we avoid such an overhead?
> 
> This is not for each crash, it is for each batch of 1000 crashes
> (batch_run=True). So I think the overhead should be ok.

sg

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_...
File appengine/findit/util_scripts/crash_queries/delta_test/run-delta-test.py
(right):

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/run-delta-test.py:56:
'--after',
On 2016/10/15 01:24:47, sharu jiang wrote:
> On 2016/10/14 01:40:40, stgao wrote:
> > same as the other CL, why two different names?
> 
> As answered in the other cl, this is similar to the args of 'git log', have
two
> options for people to remember :)

The other was updated. Why not here?
Two names for the same argument is uncommon.

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/run-delta-test.py:117:
logging.info('Delta results already existed in\n%s', delta_path)
On 2016/10/15 01:24:47, sharu jiang wrote:
> On 2016/10/14 01:40:40, stgao wrote:
> > For this case, what's the expected action by the user?
> 
> In this case, user can check out the csv result file.

Why not add that the the message? It is not clear at least to me. And what do
you mean by "check out the csv result file"?

https://codereview.chromium.org/2400283003/diff/160001/appengine/findit/util_...
File appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py
(right):

https://codereview.chromium.org/2400283003/diff/160001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:126:
p.communicate(input=json.dumps(crashes))
Just a note: not sure if the passing a big chunk of data through stdio would be
a problem or not.

https://codereview.chromium.org/2400283003/diff/160001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:134: with
open(result_path) as f:
As the result_path is passed in, why this function want to read the file and
return the content?

https://codereview.chromium.org/2400283003/diff/160001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:192:
dev_null_handle = open(os.devnull, 'w')
Should we do a with clause here? Otherwise a file handle is leaking.

https://codereview.chromium.org/2400283003/diff/160001/appengine/findit/util_...
File appengine/findit/util_scripts/crash_queries/delta_test/run-azalea.py
(right):

https://codereview.chromium.org/2400283003/diff/160001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/run-azalea.py:19: def
GetCulprits(crashes, client, verbose=False):
What's the verbose for? If not used, could we remove for now?

lijeffrey

https://codereview.chromium.org/2400283003/diff/40001/appengine/findit/common/cache_decorator.py File appengine/findit/common/cache_decorator.py (right): https://codereview.chromium.org/2400283003/diff/40001/appengine/findit/common/cache_decorator.py#newcode140 appengine/findit/common/cache_decorator.py:140: func (function): An abitrary function. nit: the previous spelling ...

4 years, 2 months ago (2016-10-20 15:52:15 UTC) #18

Sharu Jiang

Sure, the script has been tested. https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py File appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py (right): https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py#newcode121 appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:121: command = 'python ...

4 years, 2 months ago (2016-10-20 22:39:07 UTC) #20

Sure, the script has been tested.

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_...
File appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py
(right):

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:121:
command = 'python %s %s' % ('run-azalea.py', result_path) + (
On 2016/10/20 01:40:20, stgao wrote:
> On 2016/10/15 01:24:47, sharu jiang wrote:
> > On 2016/10/14 01:40:40, stgao wrote:
> > > what's the current working dir? Will it matter here?
> > 
> > In line 114, we make sure the current working dir is the delta test dir.
> 
> Is this new process run in the same shell as that process in #115 above? Did
the
> `cd DIR` really change the current working dir? Have you tried in a command
> line?

Yes, the run-azalea.py is in the same dir as this file(delta test dir). I tested
the skeleton delta test in command line, it works.

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:130:
p.communicate(input=json.dumps(crashes))
On 2016/10/20 01:40:20, stgao wrote:
> On 2016/10/15 01:24:47, sharu jiang wrote:
> > On 2016/10/14 01:40:40, stgao wrote:
> > > Should we pass information through a file instead?
> > 
> > We can, we can cache the CrashIterator results(batches of crashes) and let
> > run-azalea read those cache, but as I stated in the design doc
> >
>
https://docs.google.com/document/d/1kfu_HIJFCSwOWQleRpSkOH4t_D0UFY5vDSSausaZu...,
> > this may caused some disk space since whenever args changed, we need to
cache
> > those crash_infos (it's relatively big compared to culprit result) again. 
> 
> I don't quite understand this. Maybe we could chat in person.
> 
> > 
> > I can do this in another cl.
> 
> If this is the plan, let's add a TODO with a bug.

Done.

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_...
File appengine/findit/util_scripts/crash_queries/delta_test/run-delta-test.py
(right):

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/run-delta-test.py:56:
'--after',
On 2016/10/20 01:40:20, stgao wrote:
> On 2016/10/15 01:24:47, sharu jiang wrote:
> > On 2016/10/14 01:40:40, stgao wrote:
> > > same as the other CL, why two different names?
> > 
> > As answered in the other cl, this is similar to the args of 'git log', have
> two
> > options for people to remember :)
> 
> The other was updated. Why not here?
> Two names for the same argument is uncommon.

Done.

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/run-delta-test.py:117:
logging.info('Delta results already existed in\n%s', delta_path)
On 2016/10/20 01:40:20, stgao wrote:
> On 2016/10/15 01:24:47, sharu jiang wrote:
> > On 2016/10/14 01:40:40, stgao wrote:
> > > For this case, what's the expected action by the user?
> > 
> > In this case, user can check out the csv result file.
> 
> Why not add that the the message? It is not clear at least to me. And what do
> you mean by "check out the csv result file"?

I meant open the csv file to see the results. 

Modify the code so, even we already have csv result, we still print the delta to
the screen.

https://codereview.chromium.org/2400283003/diff/160001/appengine/findit/util_...
File appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py
(right):

https://codereview.chromium.org/2400283003/diff/160001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:31: """Dict
representation of delta."""
On 2016/10/20 15:52:14, lijeffrey wrote:
> is it possible to include an example of what the resulting dict should look
> like?

Done.

https://codereview.chromium.org/2400283003/diff/160001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:41: value2
= value2.ToDict()
On 2016/10/20 15:52:14, lijeffrey wrote:
> does value2 need to be checked if callable(value2.ToDict)?

The result1 and result2 should be the same kind, so no need to check value2. 

Added a Note of that in Delta class doc.

https://codereview.chromium.org/2400283003/diff/160001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:72: 
On 2016/10/20 15:52:14, lijeffrey wrote:
> nit: include in the docstring that set1 and set2 are dicts (?)

Done.

https://codereview.chromium.org/2400283003/diff/160001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:81: if
crash_id not in set2:
On 2016/10/20 15:52:14, lijeffrey wrote:
> nit: set2.keys()?

This is valid usage for dict.

https://codereview.chromium.org/2400283003/diff/160001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:111:
logging.info('Switch to git %s', git_hash)
On 2016/10/20 15:52:14, lijeffrey wrote:
> nit: Switching?

Done.

https://codereview.chromium.org/2400283003/diff/160001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:126:
p.communicate(input=json.dumps(crashes))
On 2016/10/20 01:40:20, stgao wrote:
> Just a note: not sure if the passing a big chunk of data through stdio would
be
> a problem or not.

It works fine for the largest batch, and the crash data won't be largely
expanded, so it should be ok.

Added a todo, I will address this potential problem by caching the crashes data
and let the run-azalea read corresponding cache file to get input.

https://codereview.chromium.org/2400283003/diff/160001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:131:
logging.error('Fail to get results.')
On 2016/10/20 15:52:14, lijeffrey wrote:
> nit: Failed

Done.

https://codereview.chromium.org/2400283003/diff/160001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:134: with
open(result_path) as f:
On 2016/10/20 01:40:20, stgao wrote:
> As the result_path is passed in, why this function want to read the file and
> return the content?

We can read the file outside this function however I think encapsulate this part
in GetResults would be cleaner.

https://codereview.chromium.org/2400283003/diff/160001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:192:
dev_null_handle = open(os.devnull, 'w')
On 2016/10/20 01:40:20, stgao wrote:
> Should we do a with clause here? Otherwise a file handle is leaking.

Done.

https://codereview.chromium.org/2400283003/diff/160001/appengine/findit/util_...
File appengine/findit/util_scripts/crash_queries/delta_test/delta_util.py
(right):

https://codereview.chromium.org/2400283003/diff/160001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_util.py:34: return
GIT_HASH_PATTERN.match(str(revision)) or revision == 'master'
On 2016/10/20 15:52:14, lijeffrey wrote:
> nit: just to be safe, use revision.lower() == 'master'

Done.

https://codereview.chromium.org/2400283003/diff/160001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_util.py:47:
logging.error('Fail to parse git hash for %s\nStacktrace:\n%s',
On 2016/10/20 15:52:15, lijeffrey wrote:
> nit: Failed

Done.

https://codereview.chromium.org/2400283003/diff/160001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/delta_util.py:83:
logging.info('Writing delta diff to %s\n', file_path)
On 2016/10/20 15:52:14, lijeffrey wrote:
> nit: move this logging line closer to where the write call is actually being
> made

Done.

https://codereview.chromium.org/2400283003/diff/160001/appengine/findit/util_...
File appengine/findit/util_scripts/crash_queries/delta_test/run-azalea.py
(right):

https://codereview.chromium.org/2400283003/diff/160001/appengine/findit/util_...
appengine/findit/util_scripts/crash_queries/delta_test/run-azalea.py:19: def
GetCulprits(crashes, client, verbose=False):
On 2016/10/20 01:40:20, stgao wrote:
> What's the verbose for? If not used, could we remove for now?

Done.

stgao

lgtm https://codereview.chromium.org/2400283003/diff/160001/appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py File appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py (right): https://codereview.chromium.org/2400283003/diff/160001/appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py#newcode134 appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:134: with open(result_path) as f: On 2016/10/20 22:39:06, sharu ...

4 years, 2 months ago (2016-10-21 01:41:39 UTC) #21

Sharu Jiang

https://codereview.chromium.org/2400283003/diff/160001/appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py File appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py (right): https://codereview.chromium.org/2400283003/diff/160001/appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py#newcode134 appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:134: with open(result_path) as f: On 2016/10/21 01:41:39, stgao wrote: ...

4 years, 2 months ago (2016-10-21 18:45:53 UTC) #22

Sharu Jiang

https://codereview.chromium.org/2400283003/diff/200001/appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py File appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py (right): https://codereview.chromium.org/2400283003/diff/200001/appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py#newcode136 appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:136: args = ['python', 'run-azalea.py', result_path, '--client', client_id] On 2016/10/21 ...

4 years, 2 months ago (2016-10-21 18:59:06 UTC) #24

Sharu Jiang

The patchset sent to the CQ was uploaded after l-g-t-m from stgao@chromium.org Link to the ...

4 years, 2 months ago (2016-10-21 19:04:53 UTC) #26

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2400283003/260001

4 years, 2 months ago (2016-10-21 19:05:00 UTC) #27

wrengr

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py File appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py (right): https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py#newcode183 appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:183: for git_hash in [git_hash1, git_hash2]: On 2016/10/15 01:24:47, sharu ...

4 years, 1 month ago (2016-10-24 17:30:36 UTC) #29

Sharu Jiang

The patchset sent to the CQ was uploaded after l-g-t-m from stgao@chromium.org Link to the ...

4 years, 1 month ago (2016-10-24 17:55:43 UTC) #32

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2400283003/300001

4 years, 1 month ago (2016-10-24 17:55:53 UTC) #33

wrengr

https://codereview.chromium.org/2400283003/diff/40001/appengine/findit/crash/findit_for_chromecrash.py File appengine/findit/crash/findit_for_chromecrash.py (right): https://codereview.chromium.org/2400283003/diff/40001/appengine/findit/crash/findit_for_chromecrash.py#newcode28 appengine/findit/crash/findit_for_chromecrash.py:28: def fields(self): On 2016/10/12 22:58:27, sharu jiang wrote: > ...

4 years, 1 month ago (2016-10-24 18:09:07 UTC) #34

https://codereview.chromium.org/2400283003/diff/40001/appengine/findit/crash/...
File appengine/findit/crash/findit_for_chromecrash.py (right):

https://codereview.chromium.org/2400283003/diff/40001/appengine/findit/crash/...
appengine/findit/crash/findit_for_chromecrash.py:28: def fields(self):
On 2016/10/12 22:58:27, sharu jiang wrote:
> On 2016/10/12 20:15:17, wrengr wrote:
> > On 2016/10/12 01:18:20, sharu jiang wrote:
> > > On 2016/10/11 22:57:43, wrengr wrote:
> > > > What's the purpose of this? Since _fields is a private member of
> namedtuple,
> > > it
> > > > make sense to leave it private so that it's clear when callers are
> depending
> > > on
> > > > the internals of this class in ways they shouldn't be (i.e., which are
> > > fragile).
> > > 
> > > This is used in delta test, and later may used to drop empty field in
> Culprit.
> > 
> > I still don't understand. Why can't delta use _fields directly? If all we
need
> > is to compare one Culprit to another, then we should provide the appropriate
> > comparison functions rather than having client code rely on internals.
> > 
> > This class serves as the API for the library to pass information to the
> > application; thus, it needs to serve the purposes of the library first and
> > foremost. I assume you want to drop fields because of serializing to JSON,
but
> > that's an application concern not a library concern. The library shouldn't
> have
> > to care at all about how the application decides to serialize things.
> 
> shouldn't _fields supposed to be private and not used directly? This is not
for
> comparison. 
> 
> For "that's an application concern not a library concern", in this case, I
think
> the serialization part should be in clients' code (findit_for_*) instead of
> Culprit, to do that, the "fields" may still be needed.

(FWIW, whenever I talk about "client code" or "clients (of X)", I mean any code
that uses X; as distinct from "users", i.e. humans, who use X. I do not
specifically mean the AppEngine applications. I believe this has been a source
of confusion, so I will avoid the word "client" below.)

If we're going to be turning Culprit objects into anonymous dicts, then that
should be done in/by the Culprit class itself. That way, code using the class
isn't forced to muck around with the innards of the class— thereby becoming
fragile to changes in/to the class. If code using the class can simply call a
ToDict method (or similar) and expect that the method will return some sort of
JSON-serializable dictionary, then the code using the class becomes independent
of how the class is implemented; which in turn frees us up to change the
internal representation of Culprit objects without needing to change all the
code that uses the Culprit class.

If we're serializing Culprit objects into JSON, then we surely want to serialize
them in the same way regardless of which AppEngine application happens to be
working with that serialized format. Thus, serialization is not
application-specific, so it should not be implemented by/in application code. It
is much less fragile for application code to assume that objects already know
how to serialize *themselves*, and then just ask them to do so (by calling the
ToDict method) whenever necessary. Yes, in terms of execution control flow, the
application code is where the request for serialization is made, but that
doesn't mean that's where the implementation of serialization should live.

commit-bot: I haz the power

Description was changed from ========== [Findit] Add skeleton code for delta test script. Skeleton code ...

4 years, 1 month ago (2016-10-24 18:10:41 UTC) #35

commit-bot: I haz the power

Committed patchset #9 (id:300001) as https://chromium.googlesource.com/infra/infra/+/6cb32efbba33bf0bd0fe76a14f80a7d5172a3bb5

4 years, 1 month ago (2016-10-24 18:10:42 UTC) #36

stgao

https://codereview.chromium.org/2400283003/diff/40001/appengine/findit/crash/findit_for_chromecrash.py File appengine/findit/crash/findit_for_chromecrash.py (right): https://codereview.chromium.org/2400283003/diff/40001/appengine/findit/crash/findit_for_chromecrash.py#newcode28 appengine/findit/crash/findit_for_chromecrash.py:28: def fields(self): On 2016/10/24 18:09:06, wrengr wrote: > On ...

4 years, 1 month ago (2016-10-24 18:17:26 UTC) #37

Message was sent while issue was closed.

https://codereview.chromium.org/2400283003/diff/40001/appengine/findit/crash/...
File appengine/findit/crash/findit_for_chromecrash.py (right):

https://codereview.chromium.org/2400283003/diff/40001/appengine/findit/crash/...
appengine/findit/crash/findit_for_chromecrash.py:28: def fields(self):
On 2016/10/24 18:09:06, wrengr wrote:
> On 2016/10/12 22:58:27, sharu jiang wrote:
> > On 2016/10/12 20:15:17, wrengr wrote:
> > > On 2016/10/12 01:18:20, sharu jiang wrote:
> > > > On 2016/10/11 22:57:43, wrengr wrote:
> > > > > What's the purpose of this? Since _fields is a private member of
> > namedtuple,
> > > > it
> > > > > make sense to leave it private so that it's clear when callers are
> > depending
> > > > on
> > > > > the internals of this class in ways they shouldn't be (i.e., which are
> > > > fragile).
> > > > 
> > > > This is used in delta test, and later may used to drop empty field in
> > Culprit.
> > > 
> > > I still don't understand. Why can't delta use _fields directly? If all we
> need
> > > is to compare one Culprit to another, then we should provide the
appropriate
> > > comparison functions rather than having client code rely on internals.
> > > 
> > > This class serves as the API for the library to pass information to the
> > > application; thus, it needs to serve the purposes of the library first and
> > > foremost. I assume you want to drop fields because of serializing to JSON,
> but
> > > that's an application concern not a library concern. The library shouldn't
> > have
> > > to care at all about how the application decides to serialize things.
> > 
> > shouldn't _fields supposed to be private and not used directly? This is not
> for
> > comparison. 
> > 
> > For "that's an application concern not a library concern", in this case, I
> think
> > the serialization part should be in clients' code (findit_for_*) instead of
> > Culprit, to do that, the "fields" may still be needed.
> 
> (FWIW, whenever I talk about "client code" or "clients (of X)", I mean any
code
> that uses X; as distinct from "users", i.e. humans, who use X. I do not
> specifically mean the AppEngine applications. I believe this has been a source
> of confusion, so I will avoid the word "client" below.)
> 
> If we're going to be turning Culprit objects into anonymous dicts, then that
> should be done in/by the Culprit class itself. That way, code using the class
> isn't forced to muck around with the innards of the class— thereby becoming
> fragile to changes in/to the class. If code using the class can simply call a
> ToDict method (or similar) and expect that the method will return some sort of
> JSON-serializable dictionary, then the code using the class becomes
independent
> of how the class is implemented; which in turn frees us up to change the
> internal representation of Culprit objects without needing to change all the
> code that uses the Culprit class.
> 
> If we're serializing Culprit objects into JSON, then we surely want to
serialize
> them in the same way regardless of which AppEngine application happens to be
> working with that serialized format. Thus, serialization is not
> application-specific, so it should not be implemented by/in application code.
It
> is much less fragile for application code to assume that objects already know
> how to serialize *themselves*, and then just ask them to do so (by calling the
> ToDict method) whenever necessary. Yes, in terms of execution control flow,
the
> application code is where the request for serialization is made, but that
> doesn't mean that's where the implementation of serialization should live.

+1 for Wren's suggestion here

Sharu Jiang

https://codereview.chromium.org/2400283003/diff/40001/appengine/findit/crash/findit_for_chromecrash.py File appengine/findit/crash/findit_for_chromecrash.py (right): https://codereview.chromium.org/2400283003/diff/40001/appengine/findit/crash/findit_for_chromecrash.py#newcode28 appengine/findit/crash/findit_for_chromecrash.py:28: def fields(self): On 2016/10/24 18:17:26, stgao wrote: > On ...

4 years, 1 month ago (2016-10-24 18:50:20 UTC) #38

Message was sent while issue was closed.

https://codereview.chromium.org/2400283003/diff/40001/appengine/findit/crash/...
File appengine/findit/crash/findit_for_chromecrash.py (right):

https://codereview.chromium.org/2400283003/diff/40001/appengine/findit/crash/...
appengine/findit/crash/findit_for_chromecrash.py:28: def fields(self):
On 2016/10/24 18:17:26, stgao wrote:
> On 2016/10/24 18:09:06, wrengr wrote:
> > On 2016/10/12 22:58:27, sharu jiang wrote:
> > > On 2016/10/12 20:15:17, wrengr wrote:
> > > > On 2016/10/12 01:18:20, sharu jiang wrote:
> > > > > On 2016/10/11 22:57:43, wrengr wrote:
> > > > > > What's the purpose of this? Since _fields is a private member of
> > > namedtuple,
> > > > > it
> > > > > > make sense to leave it private so that it's clear when callers are
> > > depending
> > > > > on
> > > > > > the internals of this class in ways they shouldn't be (i.e., which
are
> > > > > fragile).
> > > > > 
> > > > > This is used in delta test, and later may used to drop empty field in
> > > Culprit.
> > > > 
> > > > I still don't understand. Why can't delta use _fields directly? If all
we
> > need
> > > > is to compare one Culprit to another, then we should provide the
> appropriate
> > > > comparison functions rather than having client code rely on internals.
> > > > 
> > > > This class serves as the API for the library to pass information to the
> > > > application; thus, it needs to serve the purposes of the library first
and
> > > > foremost. I assume you want to drop fields because of serializing to
JSON,
> > but
> > > > that's an application concern not a library concern. The library
shouldn't
> > > have
> > > > to care at all about how the application decides to serialize things.
> > > 
> > > shouldn't _fields supposed to be private and not used directly? This is
not
> > for
> > > comparison. 
> > > 
> > > For "that's an application concern not a library concern", in this case, I
> > think
> > > the serialization part should be in clients' code (findit_for_*) instead
of
> > > Culprit, to do that, the "fields" may still be needed.
> > 
> > (FWIW, whenever I talk about "client code" or "clients (of X)", I mean any
> code
> > that uses X; as distinct from "users", i.e. humans, who use X. I do not
> > specifically mean the AppEngine applications. I believe this has been a
source
> > of confusion, so I will avoid the word "client" below.)
> > 
> > If we're going to be turning Culprit objects into anonymous dicts, then that
> > should be done in/by the Culprit class itself. That way, code using the
class
> > isn't forced to muck around with the innards of the class— thereby becoming
> > fragile to changes in/to the class. If code using the class can simply call
a
> > ToDict method (or similar) and expect that the method will return some sort
of
> > JSON-serializable dictionary, then the code using the class becomes
> independent
> > of how the class is implemented; which in turn frees us up to change the
> > internal representation of Culprit objects without needing to change all the
> > code that uses the Culprit class.
> > 
> > If we're serializing Culprit objects into JSON, then we surely want to
> serialize
> > them in the same way regardless of which AppEngine application happens to be
> > working with that serialized format. Thus, serialization is not
> > application-specific, so it should not be implemented by/in application
code.
> It
> > is much less fragile for application code to assume that objects already
know
> > how to serialize *themselves*, and then just ask them to do so (by calling
the
> > ToDict method) whenever necessary. Yes, in terms of execution control flow,
> the
> > application code is where the request for serialization is made, but that
> > doesn't mean that's where the implementation of serialization should live.
> 
> +1 for Wren's suggestion here

I agree on what you said about the serializing culprit object into JSON, however
I still have one question, should I just use _fields directly in delta test?
Shouldn't _smth be private?

Sharu Jiang

https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py File appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py (right): https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py#newcode183 appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py:183: for git_hash in [git_hash1, git_hash2]: On 2016/10/24 17:30:36, wrengr ...

4 years, 1 month ago (2016-10-24 19:14:32 UTC) #39

wrengr

On 2016/10/24 19:14:32, sharu jiang wrote: > https://codereview.chromium.org/2400283003/diff/100001/appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py > File appengine/findit/util_scripts/crash_queries/delta_test/delta_test.py > (right): > > ...

4 years, 1 month ago (2016-10-24 20:48:08 UTC) #40

wrengr

4 years, 1 month ago (2016-10-24 21:22:57 UTC) #41

Message was sent while issue was closed.

https://codereview.chromium.org/2400283003/diff/40001/appengine/findit/crash/...
File appengine/findit/crash/findit_for_chromecrash.py (right):

https://codereview.chromium.org/2400283003/diff/40001/appengine/findit/crash/...
appengine/findit/crash/findit_for_chromecrash.py:28: def fields(self):
On 2016/10/24 18:50:20, sharu jiang wrote:
> On 2016/10/24 18:17:26, stgao wrote:
> > On 2016/10/24 18:09:06, wrengr wrote:
> > > On 2016/10/12 22:58:27, sharu jiang wrote:
> > > > On 2016/10/12 20:15:17, wrengr wrote:
> > > > > On 2016/10/12 01:18:20, sharu jiang wrote:
> > > > > > On 2016/10/11 22:57:43, wrengr wrote:
> > > > > > > What's the purpose of this? Since _fields is a private member of
> > > > namedtuple,
> > > > > > it
> > > > > > > make sense to leave it private so that it's clear when callers are
> > > > depending
> > > > > > on
> > > > > > > the internals of this class in ways they shouldn't be (i.e., which
> are
> > > > > > fragile).
> > > > > > 
> > > > > > This is used in delta test, and later may used to drop empty field
in
> > > > Culprit.
> > > > > 
> > > > > I still don't understand. Why can't delta use _fields directly? If all
> we
> > > need
> > > > > is to compare one Culprit to another, then we should provide the
> > appropriate
> > > > > comparison functions rather than having client code rely on internals.
> > > > > 
> > > > > This class serves as the API for the library to pass information to
the
> > > > > application; thus, it needs to serve the purposes of the library first
> and
> > > > > foremost. I assume you want to drop fields because of serializing to
> JSON,
> > > but
> > > > > that's an application concern not a library concern. The library
> shouldn't
> > > > have
> > > > > to care at all about how the application decides to serialize things.
> > > > 
> > > > shouldn't _fields supposed to be private and not used directly? This is
> not
> > > for
> > > > comparison. 
> > > > 
> > > > For "that's an application concern not a library concern", in this case,
I
> > > think
> > > > the serialization part should be in clients' code (findit_for_*) instead
> of
> > > > Culprit, to do that, the "fields" may still be needed.
> > > 
> > > (FWIW, whenever I talk about "client code" or "clients (of X)", I mean any
> > code
> > > that uses X; as distinct from "users", i.e. humans, who use X. I do not
> > > specifically mean the AppEngine applications. I believe this has been a
> source
> > > of confusion, so I will avoid the word "client" below.)
> > > 
> > > If we're going to be turning Culprit objects into anonymous dicts, then
that
> > > should be done in/by the Culprit class itself. That way, code using the
> class
> > > isn't forced to muck around with the innards of the class— thereby
becoming
> > > fragile to changes in/to the class. If code using the class can simply
call
> a
> > > ToDict method (or similar) and expect that the method will return some
sort
> of
> > > JSON-serializable dictionary, then the code using the class becomes
> > independent
> > > of how the class is implemented; which in turn frees us up to change the
> > > internal representation of Culprit objects without needing to change all
the
> > > code that uses the Culprit class.
> > > 
> > > If we're serializing Culprit objects into JSON, then we surely want to
> > serialize
> > > them in the same way regardless of which AppEngine application happens to
be
> > > working with that serialized format. Thus, serialization is not
> > > application-specific, so it should not be implemented by/in application
> code.
> > It
> > > is much less fragile for application code to assume that objects already
> know
> > > how to serialize *themselves*, and then just ask them to do so (by calling
> the
> > > ToDict method) whenever necessary. Yes, in terms of execution control
flow,
> > the
> > > application code is where the request for serialization is made, but that
> > > doesn't mean that's where the implementation of serialization should live.
> > 
> > +1 for Wren's suggestion here
> 
> I agree on what you said about the serializing culprit object into JSON,
however
> I still have one question, should I just use _fields directly in delta test?
> Shouldn't _smth be private?

Things with a single underscore are generally "protected" in the Java
terminology, whereas things with two underscores are "private" in the Java sense
(though double underscore is also used to refer to builtins, so there's some
confusion here. Also, I don't know whether Google follows the one- vs
two-underscore tradition of the Python community at large).

As far as the Delta class goes, there are two salient approaches to take. We can
either make the class specific to being a delta of Culprit objects, or we can
try to make it generic for some large class of values (e.g., JSON-serializable
things).

If we make it Culprit-specific, then the Delta class is a "friend" in the C++
terminology, so the fact that it accesses the internals of the Culprit class
isn't unexpected. Whenever we update the Culprit class we'll have to make sure
Delta is kept in sync; but if Delta is the only "friend" of Culprit, then it's
cleaner to have Delta depend directly on the innards of Culprit rather than
setting up a bunch of boilerplate for other potential "friend" classes which
don't actually exist.

On the other hand, if we want Delta to be generic over a large class of values,
then it makes more sense IMO to define Delta to operate over dict (and list,
set, etc; i.e., Python built-in types, or JSON-serializable types, or something
similar). In this case we'd call ToDict before computing the delta, and so Delta
doesn't need to know what fields the Culprit object itself has (it can query the
dict for its keys). Of course, if we only ever use Delta for the deltas of
Culprits, then being generic over all dicts(etc) may make things unnecessarily
complicated. (I'm guessing in our case it wouldn't actually be too complicated,
but the concern is still worth mentioning.)

(A third option is to be generic over types which provide certain helper
functions. This would allow avoiding the conversion to Dict, while still letting
the Delta class remain agnostic to the internals of the classes it operates
over. The idea here is to generalize the notion of "delta" or "equality" to the
notion of "zipping" two structures together. My MSE thesis was on how to do this
cleanly and efficiently, so if we want to pursue this path I already know how to
go about it; but that may be like using a cannon as a flyswatter.)

Issue 2400283003: [Findit] Add skeleton code for delta test script. (Closed)

Description

Patch Set 1 #

Patch Set 2 : Address comments. #

Patch Set 3 : seperate --date arguments and some renames. #

Patch Set 4 : Address comments. #

Patch Set 5 : Rebase. #

Patch Set 6 : Address comments. #

Patch Set 7 : Update doc strings. #

Patch Set 8 : Rename run-azalea to run-predator #

Patch Set 9 : Rebase. #

Messages