|
|
Created:
8 years, 11 months ago by Bob Nystrom Modified:
8 years, 10 months ago CC:
reviews_dartlang.org Visibility:
Public. |
DescriptionIntegrate MDN content into API documentation.
Most of the code under mdn/ is Jacob's magic with a little extra hackery
on my part.
Committed: https://code.google.com/p/dart/source/detail?r=3750
Patch Set 1 #Patch Set 2 : Remove temp code. #
Total comments: 296
Patch Set 3 : Respond to review. #Patch Set 4 : Respond to review. #
Created: 8 years, 10 months ago
(Patch set is too large to download)
Messages
Total messages: 14 (0 generated)
https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/apidoc.... File utils/apidoc/apidoc.dart (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/apidoc.... utils/apidoc/apidoc.dart:24: print('Parsing MDN data...'); Seems like a low value message to print. did you mean to check these print statements in? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/apidoc.... utils/apidoc/apidoc.dart:40: final mdn; final Map mdn; https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/apidoc.... utils/apidoc/apidoc.dart:157: // We got nothing! why not return "" typically that will simplify upstream logic removing unneeded null checks. ? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/apidoc.... utils/apidoc/apidoc.dart:172: final mdn = 'https://developer.mozilla.org'; moz --> MOZILLA_URL, mdn--> MDN_URL etc https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/apidoc.... utils/apidoc/apidoc.dart:221: mdnUrl = mdnType['srcUrl']; code smell: fn has an unexpected side effect of setting a mdn url. Either change the fn name to make it clearer that this side effect exists or remove the side effect. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/apidoc.... utils/apidoc/apidoc.dart:278: } else if (member.name.startsWith('get:')) { instead of hard coding 4, define 'get:' as a constant and use .length either that or use a regexp https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... File utils/apidoc/mdn/extract.dart (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:46: // TODO(rnystrom): Hack! Copied from domTypes.json. Rather than hacking this here, use the same trick as in the documentLoaded method at the bottom of the file to load in the JSON list of all types. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:999: for (Element e in document.queryAll('[style]')) { this is the wrong place to do this. Only do this right before extracting the html fragment. Otherwise you will risk defeating the logic determining whether a block of text is on a line or not. genCleanHtml() is the right place. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:1027: var href = a.attributes['href']; use a.href instead of a.attributes['href'] https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:1033: lookUpType(maybeType) { return value not used. Name is also misleading. Change to fixHrefForType or something... alternately stop this from being a function at all and instead just add a local variable that indicates the type name. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:1056: match = INNER_LINK.firstMatch(href); add a TODO that this is the wrong way to be doing this. rather than guessing at which file name matches which dart type, we actually have the mapping of MDN docs to class names if we performed this logic later in the pipeline. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:1185: documentLoaded(null); Looks like this was a hack I forgot to remove. Switch back to waiting for the load event so that measurement stats are more reliable. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/dartdoc/dartdo... File utils/dartdoc/dartdoc.dart (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/dartdoc/dartdo... utils/dartdoc/dartdoc.dart:881: * Converts [absolute] which is understood to be a full path from the root of typo: i believe you meant [fullpath] not [absolute] https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/dartdoc/dartdo... utils/dartdoc/dartdoc.dart:884: String relativePath(String fullPath) { add a todo to create a decent url class in dart that handles these cases. code in extract.dart is also trying to answer these same url questions. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/dartdoc/static... File utils/dartdoc/static/styles.css (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/dartdoc/static... utils/dartdoc/static/styles.css:72: a[rel~="custom"]:after, why are "custom" links styled as if they were external?
Thanks! https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/apidoc.... File utils/apidoc/apidoc.dart (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/apidoc.... utils/apidoc/apidoc.dart:24: print('Parsing MDN data...'); On 2012/01/30 21:36:21, Jacob wrote: > Seems like a low value message to print. did you mean to check these print > statements in? The stages can take a little while, so I thought it might be helpful to let the user know that it's still doing stuff. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/apidoc.... utils/apidoc/apidoc.dart:40: final mdn; On 2012/01/30 21:36:21, Jacob wrote: > final Map mdn; Why bother? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/apidoc.... utils/apidoc/apidoc.dart:157: // We got nothing! On 2012/01/30 21:36:21, Jacob wrote: > why not return "" > typically that will simplify upstream logic removing unneeded null checks. > ? Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/apidoc.... utils/apidoc/apidoc.dart:172: final mdn = 'https://developer.mozilla.org'; On 2012/01/30 21:36:21, Jacob wrote: > moz --> MOZILLA_URL, mdn--> MDN_URL etc The only purpose for having these variables is to make the subsequent string easier to read and easier to fit in 80 chars, so I wanted them as short as possible. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/apidoc.... utils/apidoc/apidoc.dart:221: mdnUrl = mdnType['srcUrl']; On 2012/01/30 21:36:21, Jacob wrote: > code smell: fn has an unexpected side effect of setting a mdn url. > Either change the fn name to make it clearer that this side effect exists or > remove the side effect. Renamed to "include...". https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/apidoc.... utils/apidoc/apidoc.dart:278: } else if (member.name.startsWith('get:')) { On 2012/01/30 21:36:21, Jacob wrote: > instead of hard coding 4, define 'get:' as a constant and use .length > either that or use a regexp Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... File utils/apidoc/mdn/extract.dart (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:46: // TODO(rnystrom): Hack! Copied from domTypes.json. On 2012/01/30 21:36:21, Jacob wrote: > Rather than hacking this here, use the same trick as in the documentLoaded > method at the bottom of the file to load in the JSON list of all types. For now added a comment explaining we should be doing this better. This is definitely a temp fix. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:999: for (Element e in document.queryAll('[style]')) { On 2012/01/30 21:36:21, Jacob wrote: > this is the wrong place to do this. Only do this right before extracting the > html fragment. Otherwise you will risk defeating the logic determining whether > a block of text is on a line or not. > genCleanHtml() is the right place. Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:1027: var href = a.attributes['href']; On 2012/01/30 21:36:21, Jacob wrote: > use > a.href instead of a.attributes['href'] See preceding comment. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:1033: lookUpType(maybeType) { On 2012/01/30 21:36:21, Jacob wrote: > return value not used. Name is also misleading. Change to > fixHrefForType or something... alternately stop this from being a function at > all and instead just add a local variable that indicates the type name. Renamed and removed return. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:1056: match = INNER_LINK.firstMatch(href); On 2012/01/30 21:36:21, Jacob wrote: > add a TODO that this is the wrong way to be doing this. rather than guessing at > which file name matches which dart type, we actually have the mapping of MDN > docs to class names if we performed this logic later in the pipeline. Added a comment up by the giant list of type names. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:1185: documentLoaded(null); On 2012/01/30 21:36:21, Jacob wrote: > Looks like this was a hack I forgot to remove. > Switch back to waiting for the load event so that measurement stats are more > reliable. Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/dartdoc/dartdo... File utils/dartdoc/dartdoc.dart (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/dartdoc/dartdo... utils/dartdoc/dartdoc.dart:881: * Converts [absolute] which is understood to be a full path from the root of On 2012/01/30 21:36:21, Jacob wrote: > typo: i believe you meant [fullpath] not [absolute] Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/dartdoc/dartdo... utils/dartdoc/dartdoc.dart:884: String relativePath(String fullPath) { On 2012/01/30 21:36:21, Jacob wrote: > add a todo to create a decent url class in dart that handles these cases. code > in extract.dart is also trying to answer these same url questions. Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/dartdoc/static... File utils/dartdoc/static/styles.css (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/dartdoc/static... utils/dartdoc/static/styles.css:72: a[rel~="custom"]:after, On 2012/01/30 21:36:21, Jacob wrote: > why are "custom" links styled as if they were external? The mozilla docs seemed to have some links that used that? Changed it to look for 'http:' and 'https' in the href and style those instead.
https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/apidoc.... File utils/apidoc/apidoc.dart (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/apidoc.... utils/apidoc/apidoc.dart:40: final mdn; On 2012/01/31 20:56:29, Bob Nystrom wrote: > On 2012/01/30 21:36:21, Jacob wrote: > > final Map mdn; > > Why bother? When your documentation is describing the type with words it is a sign you should have instead just listed the type. /** Big ball of JSON containing the scraped MDN documentation */ could change to /** Scraped MDN documentation */ final Map mdn; https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/apidoc.... utils/apidoc/apidoc.dart:172: final mdn = 'https://developer.mozilla.org'; On 2012/01/31 20:56:29, Bob Nystrom wrote: > On 2012/01/30 21:36:21, Jacob wrote: > > moz --> MOZILLA_URL, mdn--> MDN_URL etc > > The only purpose for having these variables is to make the subsequent string > easier to read and easier to fit in 80 chars, so I wanted them as short as > possible. At least make them all caps as they are constants. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... File utils/apidoc/mdn/extract.dart (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:1027: var href = a.attributes['href']; On 2012/01/31 20:56:29, Bob Nystrom wrote: > On 2012/01/30 21:36:21, Jacob wrote: > > use > > a.href instead of a.attributes['href'] > > See preceding comment. See which preceding comment? This seems like a trivial fix which illustrates the correct way to use the DOM
Thanks! https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/apidoc.... File utils/apidoc/apidoc.dart (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/apidoc.... utils/apidoc/apidoc.dart:40: final mdn; On 2012/01/31 21:08:19, Jacob wrote: > On 2012/01/31 20:56:29, Bob Nystrom wrote: > > On 2012/01/30 21:36:21, Jacob wrote: > > > final Map mdn; > > > > Why bother? > > When your documentation is describing the type with words it is a sign you > should have instead just listed the type. > /** Big ball of JSON containing the scraped MDN documentation */ > could change to > /** Scraped MDN documentation */ > final Map mdn; Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/apidoc.... utils/apidoc/apidoc.dart:172: final mdn = 'https://developer.mozilla.org'; On 2012/01/31 21:08:19, Jacob wrote: > On 2012/01/31 20:56:29, Bob Nystrom wrote: > > On 2012/01/30 21:36:21, Jacob wrote: > > > moz --> MOZILLA_URL, mdn--> MDN_URL etc > > > > The only purpose for having these variables is to make the subsequent string > > easier to read and easier to fit in 80 chars, so I wanted them as short as > > possible. > > At least make them all caps as they are constants. Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... File utils/apidoc/mdn/extract.dart (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:1027: var href = a.attributes['href']; On 2012/01/31 21:08:19, Jacob wrote: > On 2012/01/31 20:56:29, Bob Nystrom wrote: > > On 2012/01/30 21:36:21, Jacob wrote: > > > use > > > a.href instead of a.attributes['href'] > > > > See preceding comment. > > See which preceding comment? This seems like a trivial fix which illustrates the > correct way to use the DOM This one: // Get the raw attribute because we *don't* want the browser to fully- // qualify the name for us since it has the wrong base address for the page. If you do a.href, it returns a resolved fully-qualified URL (which is in this case in correct because the page doesn't know its correct base URL). In other words, if foo/bar.html has <a href="bang.html"> then a.href gives you "foo/bang.html" which isn't what we want.
lgtm
On 2012/01/31 21:27:59, Jacob wrote: > lgtm \o/
This code is... pretty hairy. Obviously this has been checked in already, and I think that's correct (since we want these docs up ASAP), but we should be aware that by doing so we've incurred a considerable amount of technical debt. Despite spending a day and a half going through this, I have only a vague understanding of the overall functioning and a poor understanding of most of the specifics. This is a piece of code we're presumably going to be using for a while, as well as modifying as the HTML libraries shift and we want to refine our scraping; as such, it needs to get to a point where it's straightforward for someone who hasn't seen it before to understand and modify it. To that end, the code needs improved style and documentation (especially the overview in the README), and there are number of places where some refactoring would be extremely helpful. I'm not sure what of this code is Jacob's and what's Bob's, but since this is the CL for all of it I've reviewed it all. I assume each of you will address the comments for the code you wrote. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/apidoc.... File utils/apidoc/apidoc.dart (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/apidoc.... utils/apidoc/apidoc.dart:113: // we can delete this code entirely. I'm not sure about this for the use case of a user familiar with the JS DOM coming to Dart. Without these annotations, there's no clear path from knowing how to do something in JS to figuring out how to do it in Dart. It seems like we should at least include annotations in dart:dom as a lookup table for these users. The main benefit of annotations in dart:html is Ctrl-F searchability and the "oh yeah I know what that does" factor, which might be worth abandoning for increased neatness. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/apidoc.... utils/apidoc/apidoc.dart:259: } It breaks my heart that it takes seven lines in Dart to do what would be a one-liner in Ruby. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/REA... File utils/apidoc/mdn/README.txt (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/REA... utils/apidoc/mdn/README.txt:2: database.filtered.json. After reading this, I'm having trouble understanding how all this processing fits together. Why is so much of this in JS? What's run in a browser? Why? How? By whom? When? Answering these questions will make this much clearer for someone coming in without much knowledge. I'm being pretty demanding in my comments here because this MDN extraction is a complex piece of code, and it seems crucial that anyone interacting with it have a clear overview of how it works. This README should provide that overview, but I think as-is it falls short. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/REA... utils/apidoc/mdn/README.txt:5: - read data/domTypes.json What's in this file? Where does it come from? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/REA... utils/apidoc/mdn/README.txt:14: - for each output/search/<type>.json: Isn't there only one of these files for each type? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/REA... utils/apidoc/mdn/README.txt:21: extract.sh Should probably mention which directory this needs to be run from. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/REA... utils/apidoc/mdn/README.txt:23: - run extractRunner.js Is this the same as the compiled extract.dart? If not, what's the relation between the two? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/REA... utils/apidoc/mdn/README.txt:26: - read data/dartIdl.json What's in this file? Where does it come from? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/REA... utils/apidoc/mdn/README.txt:31: data on how that file should be processed s/that file/the HTML file/ What sort of data? What uses the data? "An args file" doesn't tell me much. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/REA... utils/apidoc/mdn/README.txt:32: - invoke dump render tree on that file Make it more explicit that this invokes it in a headless browser and so runs extract.dart.js in the context of the HTML page. Will this access the JSON file? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/REA... utils/apidoc/mdn/README.txt:33: - when that returns, parse the console output and add it to database.json Does this mean output/database.json? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/REA... utils/apidoc/mdn/README.txt:35: - save output/database.json Somewhat confusing given that you just said you were writing to it, and you don't also say "save output/errors.json". https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/REA... utils/apidoc/mdn/README.txt:37: extract.dart Is this run within extractRunner.js? How is its functionality different than that of extractRunner.js? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/REA... utils/apidoc/mdn/README.txt:38: - xhr output/extract/<type><index>.html.json Is this different than the "read *.json" you're doing in extractRunner.js? If so, how are you reading files there? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/REA... utils/apidoc/mdn/README.txt:43: - run postProcess.dart Is this run via DumpRenderTree? On the VM? On Frog+Node? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/REA... utils/apidoc/mdn/README.txt:44: - go through the results for each type looking for the best match Mention what files you're using here. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/REA... utils/apidoc/mdn/README.txt:47: - write output/obsolete.html What are all these files for? Why are they in HTML? Are they meant to be human-readable? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/REA... utils/apidoc/mdn/README.txt:48: - write output/database.filtered.json which is the best matches Is this just a mapping of type names to the contents of output/extract/*.json? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/cra... File utils/apidoc/mdn/crawl.js (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/cra... utils/apidoc/mdn/crawl.js:1: var http = require('http'); Why is this file in JS? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/cra... utils/apidoc/mdn/crawl.js:20: throw "Unexpected url:" + link; Space after ":" https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... File utils/apidoc/mdn/extract.dart (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:1: #import ("dart:html"); This whole file could use considerably more documentation. I had a very difficult time following it. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:17: Element e = n; You're just calling e.dynamic on the next line :-p https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:31: final DART_REMOVED = "dart_removed"; What are the semantics of this class? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:259: _pageDomain = pageUrl.substring(0, pageUrl.indexOf("/", "https://".length)); Seems like this would be easier to understand with a RegExp. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:271: RegExp fullUrlRegExp = new RegExp("^https?://"); Style nit: no need to assign this to a variable. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:291: Element e = new Element.tag("div"); Style nit: redundant type declaration here (and quite a few other places in this file). https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:302: for (Node child in n.nodes) { Style nit: redundant type declaration here and elsewhere. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:326: if (isSkippableType(child) == false) { !isSkippableType(child) https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:344: window.postMessage("START_DART_MESSAGE_UNIQUE_IDENTIFIER$dbJson", "*"); I'm pretty confused about why this is happening. I may figure it out as I read more of the CL, but it would be nice to have some documentation here explaining it. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:367: while (root.nodes.length == 1) { "while (root.nodes.length == 1 && root.nodes.first is Element)" seems cleaner. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:396: String genPrettyHtml(DocumentFragment fragment) { Redundant method https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:471: if (firstLine.toString().indexOf(stripWebkit(prop)) == -1) { This seems conceptually distinct from what the rest of the method is doing. Style nit: "!contains()" is more readable than "indexOf() == -1". https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:479: if (a.text.indexOf(prop) != -1) { Style nit: "a.text.contains(prop)" https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:487: Element findTigherRoot(Element elem, Element root) { s/Tigher/Tighter/ https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:489: while(root != candidate) { Style nit: space after "while" https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:499: SectionParseResult filteredHtml(Element elem, Element root, String prop, This is pretty hard to follow. If you don't end up rewriting it (or even if you do), a few well-placed comments might help. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:510: if (current.classes.contains(DART_REMOVED)) { Style nit: use && rather than nested ifs https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:568: && txt.indexOf(")") != -1) { Style nit: #contains rather than #indexOf https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:577: Element findBest(Element root, List<Text> allText, String prop, String propType) { Line length https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:579: Element cand; Style nit: var cand = root.query("#" + prop); https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:583: cand = root.query("[id=" + prop + "\\(\\)]"); Style nit: use interpolation https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:585: if (cand != null) { Redundant if statement; everything in here will be skipped anyway if cand != null. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:596: // If you are at least 70 pixels from the left, something is definitely fishy and we shouldn't even consider this candidate. Line length https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:633: RegExp firstLower = new RegExp("^[a-z]"); Doesn't need a variable. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:637: void scrapeSection(Element root, String sectionSelector, This is another method that could really use some documentation. Ideally it would also be broken up into smaller, semantically-named sub-methods. Style nit: Inconsistent formatting. Fit as many parameters as possible onto each line; indent parameters four spaces, don't align. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:652: if (!match.id.startsWith("section") && !(match.id == "pageText")) { Style nit: != https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:653: throw "Enexpected element $match"; "Unexpected" https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:666: var txt = r.text.trim().split(" ")[0].toLowerCase(); Style nit: final https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:682: ElementList $row = r.elements; What's with the "$"? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:683: if ($row.length == 0 || $row.first.classes.contains(".header")) { No need to manually continue if length is 0... the for loop won't run anyway. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:688: Element e = $row[k]; Unnecessary variable. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:725: // "currentType": $($row[1]).text().trim(), // find("code") ? These comments should be cleaned up. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:726: entry["name"] = fullNameCleanup(nameStr.length > 0 ? nameStr : goodName); Long line https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:732: entry["help"] = (helpIndex == -1 || $row[helpIndex] == null) ? altHelp : genPrettyHtmlFromElement($row[helpIndex]); Long line https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:783: if (!cand.classes.contains(DART_REMOVED) && !inTable(cand) ) { // XXX use a neg selector. Long line https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:790: // Don't set here to avoid layouts... cand.classes.add(DART_REMOVED); Long lines https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:799: Element e = pmap[prop]; Style nit: unnecessary variable https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:807: if (e.id.indexOf(matchElement.id) != -1) { Style nit: #contains rather than #indexOf https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:825: //"jsSignature" : nameStr Remove https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:844: if (constRegExp.hasMatch(name)) return true; Style nit: Unnecessary variables https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:847: void markRemoved(var e) { Why not just have this take an Element and use list.forEach(markRemoved)? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:866: name = name.replaceFirst(regExp, "webkit"); Style nit: Both of these variable assignments are unnecessary https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:877: // TODO(jacobr): workaround bug in: Reference bug ID https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:947: void resourceLoaded() { Why is this in its own function? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:984: // Inject CSS to insure lines don't wrap unless it was intentional. s/insure/ensure/ https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:1033: lookUpType(maybeType) { It seems unnecessary to me that this function modifies href. If Dart supported falseyness, I'd suggest having it return the new href so you could write "href = lookUpType(...) || href", but that's much more awkward without falseyness. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:1072: if (title.toLowerCase().indexOf(currentTypeTiny.toLowerCase()) == -1) { Style nit: !#contains rather than #indexOf == -1 https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:1085: if (foundMatch == false) { !foundMatch https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:1101: List members = dbEntry['members']; Style nit: declare this closer to where it's used https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:1111: scrapeSection(root, "#Constants, #Error_codes, #State_constants", currentType, members, 'constants'); Line length https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:1115: "[id^=Properties], #Notes, [id^=Other_properties], #Attributes, #DOM_properties, #Event_handlers, #Event_Handlers", Line length https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... File utils/apidoc/mdn/extract.sh (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.sh:1: ../../../frog/minfrog --out=output/extract.dart.js --enable_type_checks --compile-only extract.dart Line length https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... File utils/apidoc/mdn/extractRunner.js (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:1: var fs = require('fs'); It's not clear why this is in JS either. The reasoning should definitely be documented in the file and/or the README. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:17: var dartIdl = JSON.parse(fs.readFileSync('data/dartIdl.json', 'utf8').toString()); Line lengths. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:29: function parseFile(type, onDone, entry, file, searchResultIndex) { Why is this a separate function? It's only called once and the name isn't any more semantic than a comment would be. Moving it here just makes the code more non-local. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:43: inputFile = inputFile.substr(1); It would be much clearer to just do "var matchIndex = inputFile.indexOf('<!DOCTYPE', 1)". https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:53: // Disable all existing javascript in the input file to speedup parsing and Grammar nit: "speed up" https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:65: console.warn("Skipping 404 file"); List the filename. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:69: throw "Unexpected file format for " + file; Why are we throwing here instead of warning? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:72: // Remove all easy to remove script tags to speed page load. That's not what this code is doing. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:75: ' if (window.layoutTestController) {\n' + Why are we feature-detecting here? Are we planning to run this on multiple browsers? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:92: '<script type="text/javascript" src="../../output/extract.dart.js"></script>') + Line length https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:96: var dumpFileName = "output/extract/" + file; Style nit: unnecessary variable. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:108: 'DumpRenderTree ' + absoluteDumpFileName; TODO: Make this run on platforms other than OS X. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:110: var child = exec(cmd, Unused variable. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:114: var msg = msg.substring(0, msg.indexOf(END_DART_MESSAGE)); Shouldn't have "var". Actually, this stuff should probably just use a regexp. Style nit: do this stuff after error checking. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:130: JSON.stringify(errorFiles, null, ' '), 'utf8'); Why is this written again for every error? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:140: // parallelism. This comment should probably be attached to numProcesses. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:143: function processNextTask() { If you're trying to do stuff in parallel, this would probably benefit from using async file IO. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:158: function createTask(type, entry, index) { This also doesn't seem worth a function. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:168: if (entries != null) { Style nit: if (!entries) https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ful... File utils/apidoc/mdn/full_run.sh (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ful... utils/apidoc/mdn/full_run.sh:3: # See output/database.html for a human readable view of the extracted data. This script should definitely be mentioned in the README. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... File utils/apidoc/mdn/postProcess.dart (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:1: #library("dump"); Should probably have the same name as the file. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:9: Set matchedTypes; These should have generic parameters. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:15: /** Returns whether the type has any property matching the specified name. */ s/property/member/ https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:17: Map data = allProps[type]; This type declaration would be unnecessary if allProps had a fully defined type. There are various other unnecessary types in this file. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:23: List<String> sortStringCollection(Collection<String> collection) { I'd call this sortCollection and type it as "Collection<Comparable> -> List<Comparable>". Seems like methods should accept as broad a type as possible, and the caller should cast to something more specific. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:31: Map getMembersMap(Map entry) { Either change the docstring or make this take a List. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:33: Map members = new Map<String, Object>(); s/Object/Dynamic/. Also use a map literal. I guess "<Dynamic>{}" just simplifies to "{}", so even better. Also the type declaration for "members" is unnecessary. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:45: String propType) { Helper methods should be declared within the functions that use them as helpers. This also means you don't have to take sb, type, and members as parameters. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:52: sb.add(""" Style nit: incorrect indentation. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:73: * Score an entry using heuristics to determine how well a entry Grammar nit: "an entry". https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:76: * with class level comments, etc. It'd be nice to have an example here of where we get conflicts and what they look like. Maybe just links to different MDN pages we have to choose between for some case. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:78: num scoreEntry(Map entry, String type) { s/num/int/ https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:81: score++; Why would we ever want to choose a skipped entry? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:85: for (String name in members.getKeys()) { Why are we converting the members list into a map, only to iterate over it as though it were a list? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:103: num score = scoreEntry(entry, type); Indentation. Alternately, "if (entry == null) continue" on the previous line. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:132: final sb = new StringBuffer(); The vast number of string buffers being used in this method make my head spin. In lieu of an actual template system, I think it would be cleaner to write this code as though one existed. That is, have an individual method for each file that will be created; put all string buffer management into those methods. Then main's responsibility becomes simply packaging the data up into the structures that each of these faux-template methods needs. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:134: sb.add(""" Maybe add a TODO here for using a template system once we have one. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:141: font: 14px/1.428 "Lucida Grande", "Lucida Sans Unicode", Lucida, Arial, Helvetica, sans-serif; Line length. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:225: -- <a target="_blank" href="${entry == null ? "???" : entry["srcUrl"]}">scraped url</a> Line lengths. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:235: Map members = getMembersMap(entry); Indentation, here and many other places below https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:289: ["summary", "constructor", "compatibility", "specification", "seeAlso"]) Where's the opening { for this loop? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:328: title = '<h4>Dart type: $type</h4><h2>${title}</h2>'; Style nit: $title, here and line 330. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:344: ${sbExamples.toString()} Style nit: $sbExamples https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:371: <h3>Skipped generating documentation for $numSkipped classes due to no plausible matching files</h3> Line lengths https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:391: font: 14px/1.428 "Lucida Grande", "Lucida Sans Unicode", Lucida, Arial, Helvetica, sans-serif; Line length https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:432: font: 14px/1.428 "Lucida Grande", "Lucida Sans Unicode", Lucida, Arial, Helvetica, sans-serif; Line length https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:455: <table> Indentation https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:458: <th>Type</th><th>Name</th><th>Description</th><th>IDL</th><th>Status</th> Line length https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/sea... File utils/apidoc/mdn/search.js (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/sea... utils/apidoc/mdn/search.js:42: if (err) throw err; Indentation https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/dartdoc/dartdo... File utils/dartdoc/dartdoc.dart (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/dartdoc/dartdo... utils/dartdoc/dartdoc.dart:877: return md.markdownToHtml(comment); Seems like maybe a _maybeMarkdownToHtml is called for?
I'll deal with the remaining postProcess comments tomorrow. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/cra... File utils/apidoc/mdn/crawl.js (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/cra... utils/apidoc/mdn/crawl.js:1: var http = require('http'); On 2012/02/01 00:10:39, nweiz wrote: > Why is this file in JS? Because the dart server libraries currently lack the functionality we need. When they do this script should be ported. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/cra... utils/apidoc/mdn/crawl.js:20: throw "Unexpected url:" + link; On 2012/02/01 00:10:39, nweiz wrote: > Space after ":" Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... File utils/apidoc/mdn/extract.dart (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:1: #import ("dart:html"); On 2012/02/01 00:10:39, nweiz wrote: > This whole file could use considerably more documentation. I had a very > difficult time following it. I agree that this script needs a larger comment describing the design but I disagree that most of the small methods should have comments because they are mainly self explanatory. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:17: Element e = n; On 2012/02/01 00:10:39, nweiz wrote: > You're just calling e.dynamic on the next line :-p Notice that this method is documented as a hack to work around the lack of sync measurement. This code will go away soon so is not worth cleaning up. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:31: final DART_REMOVED = "dart_removed"; On 2012/02/01 00:10:39, nweiz wrote: > What are the semantics of this class? documented https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:291: Element e = new Element.tag("div"); On 2012/02/01 00:10:39, nweiz wrote: > Style nit: redundant type declaration here (and quite a few other places in this > file). I wrote this code using the editor and for now that is needed to get it to provide a pleasant auto complete experience. Removed a few types and replaced with final. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:302: for (Node child in n.nodes) { On 2012/02/01 00:10:39, nweiz wrote: > Style nit: redundant type declaration here and elsewhere. Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:326: if (isSkippableType(child) == false) { On 2012/02/01 00:10:39, nweiz wrote: > !isSkippableType(child) Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:344: window.postMessage("START_DART_MESSAGE_UNIQUE_IDENTIFIER$dbJson", "*"); On 2012/02/01 00:10:39, nweiz wrote: > I'm pretty confused about why this is happening. I may figure it out as I read > more of the CL, but it would be nice to have some documentation here explaining > it. See the comment above... this is a hack to work around a bad bug in the JSON parser. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:367: while (root.nodes.length == 1) { On 2012/02/01 00:10:39, nweiz wrote: > "while (root.nodes.length == 1 && root.nodes.first is Element)" seems cleaner. Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:396: String genPrettyHtml(DocumentFragment fragment) { On 2012/02/01 00:10:39, nweiz wrote: > Redundant method removed https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:471: if (firstLine.toString().indexOf(stripWebkit(prop)) == -1) { On 2012/02/01 00:10:39, nweiz wrote: > This seems conceptually distinct from what the rest of the method is doing. > > Style nit: "!contains()" is more readable than "indexOf() == -1". Added a comment explaining why this is the way it is https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:479: if (a.text.indexOf(prop) != -1) { On 2012/02/01 00:10:39, nweiz wrote: > Style nit: "a.text.contains(prop)" Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:487: Element findTigherRoot(Element elem, Element root) { On 2012/02/01 00:10:39, nweiz wrote: > s/Tigher/Tighter/ Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:489: while(root != candidate) { On 2012/02/01 00:10:39, nweiz wrote: > Style nit: space after "while" Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:499: SectionParseResult filteredHtml(Element elem, Element root, String prop, On 2012/02/01 00:10:39, nweiz wrote: > This is pretty hard to follow. If you don't end up rewriting it (or even if you > do), a few well-placed comments might help. Clarified the above comment as a TODO... this code is lousy. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:510: if (current.classes.contains(DART_REMOVED)) { On 2012/02/01 00:10:39, nweiz wrote: > Style nit: use && rather than nested ifs Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:568: && txt.indexOf(")") != -1) { On 2012/02/01 00:10:39, nweiz wrote: > Style nit: #contains rather than #indexOf When making comments like that just note that all cases of indexof(...) != -1 should change to contains rather than commenting on every case. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:577: Element findBest(Element root, List<Text> allText, String prop, String propType) { On 2012/02/01 00:10:39, nweiz wrote: > Line length Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:579: Element cand; On 2012/02/01 00:10:39, nweiz wrote: > Style nit: var cand = root.query("#" + prop); I prefer Element cand = root.query("#$prop") https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:583: cand = root.query("[id=" + prop + "\\(\\)]"); On 2012/02/01 00:10:39, nweiz wrote: > Style nit: use interpolation Done. This code started its life as JS. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:585: if (cand != null) { On 2012/02/01 00:10:39, nweiz wrote: > Redundant if statement; everything in here will be skipped anyway if cand != > null. Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:596: // If you are at least 70 pixels from the left, something is definitely fishy and we shouldn't even consider this candidate. On 2012/02/01 00:10:39, nweiz wrote: > Line length Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:633: RegExp firstLower = new RegExp("^[a-z]"); On 2012/02/01 00:10:39, nweiz wrote: > Doesn't need a variable. Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:637: void scrapeSection(Element root, String sectionSelector, On 2012/02/01 00:10:39, nweiz wrote: > This is another method that could really use some documentation. Ideally it > would also be broken up into smaller, semantically-named sub-methods. > > Style nit: Inconsistent formatting. Fit as many parameters as possible onto each > line; indent parameters four spaces, don't align. Added todo. cleanup indentation https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:652: if (!match.id.startsWith("section") && !(match.id == "pageText")) { On 2012/02/01 00:10:39, nweiz wrote: > Style nit: != Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:653: throw "Enexpected element $match"; On 2012/02/01 00:10:39, nweiz wrote: > "Unexpected" Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:666: var txt = r.text.trim().split(" ")[0].toLowerCase(); On 2012/02/01 00:10:39, nweiz wrote: > Style nit: final Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:682: ElementList $row = r.elements; On 2012/02/01 00:10:39, nweiz wrote: > What's with the "$"? This was jquery code to start with.... using $ is a jquery coding style i like similar to hungarian notation. removed. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:683: if ($row.length == 0 || $row.first.classes.contains(".header")) { On 2012/02/01 00:10:39, nweiz wrote: > No need to manually continue if length is 0... the for loop won't run anyway. That is needed. otherwise you will throw an exception on row.first.classes.contains... as row.first will be null https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:688: Element e = $row[k]; On 2012/02/01 00:10:39, nweiz wrote: > Unnecessary variable. Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:725: // "currentType": $($row[1]).text().trim(), // find("code") ? On 2012/02/01 00:10:39, nweiz wrote: > These comments should be cleaned up. removed https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:726: entry["name"] = fullNameCleanup(nameStr.length > 0 ? nameStr : goodName); On 2012/02/01 00:10:39, nweiz wrote: > Long line Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:732: entry["help"] = (helpIndex == -1 || $row[helpIndex] == null) ? altHelp : genPrettyHtmlFromElement($row[helpIndex]); On 2012/02/01 00:10:39, nweiz wrote: > Long line Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:783: if (!cand.classes.contains(DART_REMOVED) && !inTable(cand) ) { // XXX use a neg selector. On 2012/02/01 00:10:39, nweiz wrote: > Long line Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:790: // Don't set here to avoid layouts... cand.classes.add(DART_REMOVED); On 2012/02/01 00:10:39, nweiz wrote: > Long lines please add one comment to the top of a file with long lines instead of commenting on each line. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:799: Element e = pmap[prop]; On 2012/02/01 00:10:39, nweiz wrote: > Style nit: unnecessary variable Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:807: if (e.id.indexOf(matchElement.id) != -1) { On 2012/02/01 00:10:39, nweiz wrote: > Style nit: #contains rather than #indexOf Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:825: //"jsSignature" : nameStr On 2012/02/01 00:10:39, nweiz wrote: > Remove Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:844: if (constRegExp.hasMatch(name)) return true; On 2012/02/01 00:10:39, nweiz wrote: > Style nit: Unnecessary variables Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:847: void markRemoved(var e) { On 2012/02/01 00:10:39, nweiz wrote: > Why not just have this take an Element and use list.forEach(markRemoved)? I don't follow https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:866: name = name.replaceFirst(regExp, "webkit"); On 2012/02/01 00:10:39, nweiz wrote: > Style nit: Both of these variable assignments are unnecessary Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:947: void resourceLoaded() { On 2012/02/01 00:10:39, nweiz wrote: > Why is this in its own function? obsolete. removed. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:984: // Inject CSS to insure lines don't wrap unless it was intentional. On 2012/02/01 00:10:39, nweiz wrote: > s/insure/ensure/ Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:1072: if (title.toLowerCase().indexOf(currentTypeTiny.toLowerCase()) == -1) { On 2012/02/01 00:10:39, nweiz wrote: > Style nit: !#contains rather than #indexOf == -1 done. see above comments about only marking one instances of an issue instead of every one https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:1085: if (foundMatch == false) { On 2012/02/01 00:10:39, nweiz wrote: > !foundMatch Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:1101: List members = dbEntry['members']; On 2012/02/01 00:10:39, nweiz wrote: > Style nit: declare this closer to where it's used It is 6 lines above where it is used. I think this spot is better given that the markRemoved calls are also related to figuring out what the members are. if anything this should be higher not lower in the fn. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:1111: scrapeSection(root, "#Constants, #Error_codes, #State_constants", currentType, members, 'constants'); On 2012/02/01 00:10:39, nweiz wrote: > Line length Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:1115: "[id^=Properties], #Notes, [id^=Other_properties], #Attributes, #DOM_properties, #Event_handlers, #Event_Handlers", On 2012/02/01 00:10:39, nweiz wrote: > Line length Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... File utils/apidoc/mdn/extractRunner.js (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:1: var fs = require('fs'); On 2012/02/01 00:10:39, nweiz wrote: > It's not clear why this is in JS either. The reasoning should definitely be > documented in the file and/or the README. Same reason as other js script. This should be rewritten in dart once server side support is better. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:17: var dartIdl = JSON.parse(fs.readFileSync('data/dartIdl.json', 'utf8').toString()); On 2012/02/01 00:10:39, nweiz wrote: > Line lengths. Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:29: function parseFile(type, onDone, entry, file, searchResultIndex) { On 2012/02/01 00:10:39, nweiz wrote: > Why is this a separate function? It's only called once and the name isn't any > more semantic than a comment would be. Moving it here just makes the code more > non-local. Seems like a reasonable function name to me. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:43: inputFile = inputFile.substr(1); On 2012/02/01 00:10:39, nweiz wrote: > It would be much clearer to just do "var matchIndex = > inputFile.indexOf('<!DOCTYPE', 1)". Can't do that now that a toLowerCase is also needed. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:53: // Disable all existing javascript in the input file to speedup parsing and On 2012/02/01 00:10:39, nweiz wrote: > Grammar nit: "speed up" speedup seems like it is valid. http://en.wikipedia.org/wiki/Speedup https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:65: console.warn("Skipping 404 file"); On 2012/02/01 00:10:39, nweiz wrote: > List the filename. Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:69: throw "Unexpected file format for " + file; On 2012/02/01 00:10:39, nweiz wrote: > Why are we throwing here instead of warning? because that indicates a serious bug https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:72: // Remove all easy to remove script tags to speed page load. On 2012/02/01 00:10:39, nweiz wrote: > That's not what this code is doing. yep. i used to and then stopped. removed comment. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:75: ' if (window.layoutTestController) {\n' + On 2012/02/01 00:10:39, nweiz wrote: > Why are we feature-detecting here? Are we planning to run this on multiple > browsers? added comment explaining this. // We feature detect whether the browser supports layoutTestController // so we only clear the document content when running in the test shell // and not when debugging using a normal browser. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:92: '<script type="text/javascript" src="../../output/extract.dart.js"></script>') + On 2012/02/01 00:10:39, nweiz wrote: > Line length Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:96: var dumpFileName = "output/extract/" + file; On 2012/02/01 00:10:39, nweiz wrote: > Style nit: unnecessary variable. Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:108: 'DumpRenderTree ' + absoluteDumpFileName; On 2012/02/01 00:10:39, nweiz wrote: > TODO: Make this run on platforms other than OS X. Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:110: var child = exec(cmd, On 2012/02/01 00:10:39, nweiz wrote: > Unused variable. Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:114: var msg = msg.substring(0, msg.indexOf(END_DART_MESSAGE)); On 2012/02/01 00:10:39, nweiz wrote: > Shouldn't have "var". Actually, this stuff should probably just use a regexp. > > Style nit: do this stuff after error checking. Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:130: JSON.stringify(errorFiles, null, ' '), 'utf8'); On 2012/02/01 00:10:39, nweiz wrote: > Why is this written again for every error? So that if you press control-c you always have an up to date errors.json file. important as the script takes a while to run. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:140: // parallelism. On 2012/02/01 00:10:39, nweiz wrote: > This comment should probably be attached to numProcesses. Done. Also moved to the top of the file. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:143: function processNextTask() { On 2012/02/01 00:10:39, nweiz wrote: > If you're trying to do stuff in parallel, this would probably benefit from using > async file IO. I strongly disagree because the bottleneck is in the extraction not reading the files from disk. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:158: function createTask(type, entry, index) { On 2012/02/01 00:10:39, nweiz wrote: > This also doesn't seem worth a function. I disagree https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:168: if (entries != null) { On 2012/02/01 00:10:39, nweiz wrote: > Style nit: if (!entries) I disagree. I prefer != null as it more clearly specifies the intent and would be what you would need to write in dart. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ful... File utils/apidoc/mdn/full_run.sh (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ful... utils/apidoc/mdn/full_run.sh:3: # See output/database.html for a human readable view of the extracted data. On 2012/02/01 00:10:39, nweiz wrote: > This script should definitely be mentioned in the README. Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... File utils/apidoc/mdn/postProcess.dart (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:1: #library("dump"); On 2012/02/01 00:10:39, nweiz wrote: > Should probably have the same name as the file. Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:9: Set matchedTypes; On 2012/02/01 00:10:39, nweiz wrote: > These should have generic parameters. I don't find that helpful when dealing with JSON.
BTW, to clarify, Bob and I needed to collaborate on this code far before it was in a code quality state to check in. Thanks for all the detailed comments but keep in mind that this code wasn't expected to yet be at nearly the code quality bar to check in. Perhaps checking in to an experimental directory would have made that clearer.
https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/cra... File utils/apidoc/mdn/crawl.js (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/cra... utils/apidoc/mdn/crawl.js:1: var http = require('http'); On 2012/02/01 07:48:26, Jacob wrote: > On 2012/02/01 00:10:39, nweiz wrote: > > Why is this file in JS? > Because the dart server libraries currently lack the functionality we need. > When they do this script should be ported. It would be good to document exactly what's keeping all the JS scripts as JS. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... File utils/apidoc/mdn/extract.dart (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:1: #import ("dart:html"); On 2012/02/01 07:48:26, Jacob wrote: > On 2012/02/01 00:10:39, nweiz wrote: > > This whole file could use considerably more documentation. I had a very > > difficult time following it. > > I agree that this script needs a larger comment describing the design but I > disagree that most of the small methods should have comments because they are > mainly self explanatory. Some of the methods are certainly self-explanatory, but some of them are less so than they seem at first glance. For example, it would be useful to know what criteria determined the first line and why we needed to find it for "findFirstLine", it's not immediately obvious what all the parameters for "findBest" mean, etc. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:291: Element e = new Element.tag("div"); On 2012/02/01 07:48:26, Jacob wrote: > On 2012/02/01 00:10:39, nweiz wrote: > > Style nit: redundant type declaration here (and quite a few other places in > this > > file). > I wrote this code using the editor and for now that is needed to get it to > provide a pleasant auto complete experience. Removed a few types and replaced > with final. I'm generally not too keen on making code harder to read in order to make it easier to write. Do you know how soon the editor is going to be able to infer the types of variables? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:344: window.postMessage("START_DART_MESSAGE_UNIQUE_IDENTIFIER$dbJson", "*"); On 2012/02/01 07:48:26, Jacob wrote: > On 2012/02/01 00:10:39, nweiz wrote: > > I'm pretty confused about why this is happening. I may figure it out as I read > > more of the CL, but it would be nice to have some documentation here > explaining > > it. > See the comment above... this is a hack to work around a bad bug in the JSON > parser. The comment made me think that only the "\n" stuff was a JSON bug. How does postMessage fit in to that? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:579: Element cand; On 2012/02/01 07:48:26, Jacob wrote: > On 2012/02/01 00:10:39, nweiz wrote: > > Style nit: var cand = root.query("#" + prop); > I prefer > Element cand = root.query("#$prop") Agreed about "#$prop", but I'd prefer "final" since #query always returns an Element (editor concerns aside). https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:847: void markRemoved(var e) { On 2012/02/01 07:48:26, Jacob wrote: > On 2012/02/01 00:10:39, nweiz wrote: > > Why not just have this take an Element and use list.forEach(markRemoved)? > I don't follow Type this as "void markRemoved(Element e)" and remove the check for "e is Element". Then below, replace e.g. "markRemoved(document.queryAll('.pageToc, footer, header, #nav-toolbar'))" with "document.queryAll('.pageToc, footer, header, #nav-toolbar').forEach(markRemoved)". https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... File utils/apidoc/mdn/extractRunner.js (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:29: function parseFile(type, onDone, entry, file, searchResultIndex) { On 2012/02/01 07:48:26, Jacob wrote: > On 2012/02/01 00:10:39, nweiz wrote: > > Why is this a separate function? It's only called once and the name isn't any > > more semantic than a comment would be. Moving it here just makes the code more > > non-local. > > Seems like a reasonable function name to me. The name is reasonable, but "// parse the HTML file" is reasonable as well. My point is that moving this into a separate function means the reader of the code has to jump around the file more to follow the control flow. The semantics of the parameters also become more obscure; it's not clear that onDone always points to processNextTask, it's not clear that file is always type + searchResultIndex + '.html', it's not clear that type and entry come from domTypes and cacheData and thus what their structure and semantics are. There's certainly something to be said for functions that are only called once being used to organize the code, but I don't think this is the right chunk of code to pull out. The large number of parameters it takes is an indication of this. I feel like good candidates for pulling out are chunks of code that do a single discrete operation that could be plausibly re-used, rather than code like this which is just part of the main business logic of the script. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:53: // Disable all existing javascript in the input file to speedup parsing and On 2012/02/01 07:48:26, Jacob wrote: > On 2012/02/01 00:10:39, nweiz wrote: > > Grammar nit: "speed up" > > speedup seems like it is valid. > http://en.wikipedia.org/wiki/Speedup "Speedup" is a noun; "speed up" is the verb form. http://en.wiktionary.org/wiki/speedup https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:130: JSON.stringify(errorFiles, null, ' '), 'utf8'); On 2012/02/01 07:48:26, Jacob wrote: > On 2012/02/01 00:10:39, nweiz wrote: > > Why is this written again for every error? > > So that if you press control-c you always have an up to date errors.json file. > important as the script takes a while to run. I see. Useful inline comment? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:158: function createTask(type, entry, index) { On 2012/02/01 07:48:26, Jacob wrote: > On 2012/02/01 00:10:39, nweiz wrote: > > This also doesn't seem worth a function. > > I disagree Why? I don't think it adds any clarity over just passing a function to tasks.push. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:168: if (entries != null) { On 2012/02/01 07:48:26, Jacob wrote: > On 2012/02/01 00:10:39, nweiz wrote: > > Style nit: if (!entries) > I disagree. I prefer != null as it more clearly specifies the intent and would > be what you would need to write in dart. I'm not sure I like the idea of writing Javascript as though it were dart, as opposed to writing it idiomatically, but I guess an argument could be made either way. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... File utils/apidoc/mdn/postProcess.dart (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:9: Set matchedTypes; On 2012/02/01 07:48:26, Jacob wrote: > On 2012/02/01 00:10:39, nweiz wrote: > > These should have generic parameters. > I don't find that helpful when dealing with JSON. I find it helpful as a reader of the code. I want to know things like how deeply-nested the objects are, and once you can get rid of the type-declared variables you'll need the full declaration to get type inference.
Fixes are present in https://chromiumcodereview.appspot.com/9315026 https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/cra... File utils/apidoc/mdn/crawl.js (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/cra... utils/apidoc/mdn/crawl.js:1: var http = require('http'); On 2012/02/01 21:23:02, nweiz wrote: > On 2012/02/01 07:48:26, Jacob wrote: > > On 2012/02/01 00:10:39, nweiz wrote: > > > Why is this file in JS? > > Because the dart server libraries currently lack the functionality we need. > > When they do this script should be ported. > > It would be good to document exactly what's keeping all the JS scripts as JS. that seems low value. Just look at each call to a Node method (e.g. http. and fs.) and you have something that is missing in Dart currently. However I have added a TODO to convert to dart when all these methods are available. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... File utils/apidoc/mdn/extract.dart (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:344: window.postMessage("START_DART_MESSAGE_UNIQUE_IDENTIFIER$dbJson", "*"); On 2012/02/01 21:23:02, nweiz wrote: > On 2012/02/01 07:48:26, Jacob wrote: > > On 2012/02/01 00:10:39, nweiz wrote: > > > I'm pretty confused about why this is happening. I may figure it out as I > read > > > more of the CL, but it would be nice to have some documentation here > > explaining > > > it. > > See the comment above... this is a hack to work around a bad bug in the JSON > > parser. > > The comment made me think that only the "\n" stuff was a JSON bug. How does > postMessage fit in to that? postMessage is just a natural way you communicate from Dart to JS. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:579: Element cand; On 2012/02/01 21:23:02, nweiz wrote: > On 2012/02/01 07:48:26, Jacob wrote: > > On 2012/02/01 00:10:39, nweiz wrote: > > > Style nit: var cand = root.query("#" + prop); > > I prefer > > Element cand = root.query("#$prop") > > Agreed about "#$prop", but I'd prefer "final" since #query always returns an > Element (editor concerns aside). Fixed. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:847: void markRemoved(var e) { On 2012/02/01 21:23:02, nweiz wrote: > On 2012/02/01 07:48:26, Jacob wrote: > > On 2012/02/01 00:10:39, nweiz wrote: > > > Why not just have this take an Element and use list.forEach(markRemoved)? > > I don't follow > > Type this as "void markRemoved(Element e)" and remove the check for "e is > Element". Then below, replace e.g. "markRemoved(document.queryAll('.pageToc, > footer, header, #nav-toolbar'))" with "document.queryAll('.pageToc, footer, > header, #nav-toolbar').forEach(markRemoved)". Added a todo clarifying that the way the code is currently written is ugly but it will become less so when ElementList also supports .classes, etc. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:877: // TODO(jacobr): workaround bug in: On 2012/02/01 00:10:39, nweiz wrote: > Reference bug ID removed todo. I forget what this was about. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... File utils/apidoc/mdn/extractRunner.js (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:29: function parseFile(type, onDone, entry, file, searchResultIndex) { On 2012/02/01 21:23:02, nweiz wrote: > On 2012/02/01 07:48:26, Jacob wrote: > > On 2012/02/01 00:10:39, nweiz wrote: > > > Why is this a separate function? It's only called once and the name isn't > any > > > more semantic than a comment would be. Moving it here just makes the code > more > > > non-local. > > > > Seems like a reasonable function name to me. > > The name is reasonable, but "// parse the HTML file" is reasonable as well. My > point is that moving this into a separate function means the reader of the code > has to jump around the file more to follow the control flow. The semantics of > the parameters also become more obscure; it's not clear that onDone always > points to processNextTask, it's not clear that file is always type + > searchResultIndex + '.html', it's not clear that type and entry come from > domTypes and cacheData and thus what their structure and semantics are. > > There's certainly something to be said for functions that are only called once > being used to organize the code, but I don't think this is the right chunk of > code to pull out. The large number of parameters it takes is an indication of > this. I feel like good candidates for pulling out are chunks of code that do a > single discrete operation that could be plausibly re-used, rather than code like > this which is just part of the main business logic of the script. here's a different take on why this is a good function: if there was test code for this file, you would absolutely want to have the test code just call parseFile on the specific file being tested. This function encapsulates just the logic of parsing a single file. The function that calls it includes the cruft to handle scheduling tasks to run this fn. Even though this fn is only called from one context, having that separation is still handy IMHO. Obviously this is a borderline case and you could design either way and it would be reasonable. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:53: // Disable all existing javascript in the input file to speedup parsing and On 2012/02/01 21:23:02, nweiz wrote: > On 2012/02/01 07:48:26, Jacob wrote: > > On 2012/02/01 00:10:39, nweiz wrote: > > > Grammar nit: "speed up" > > > > speedup seems like it is valid. > > http://en.wikipedia.org/wiki/Speedup > > "Speedup" is a noun; "speed up" is the verb form. > http://en.wiktionary.org/wiki/speedup Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:130: JSON.stringify(errorFiles, null, ' '), 'utf8'); On 2012/02/01 21:23:02, nweiz wrote: > On 2012/02/01 07:48:26, Jacob wrote: > > On 2012/02/01 00:10:39, nweiz wrote: > > > Why is this written again for every error? > > > > So that if you press control-c you always have an up to date errors.json file. > > > important as the script takes a while to run. > > I see. Useful inline comment? Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:158: function createTask(type, entry, index) { On 2012/02/01 21:23:02, nweiz wrote: > On 2012/02/01 07:48:26, Jacob wrote: > > On 2012/02/01 00:10:39, nweiz wrote: > > > This also doesn't seem worth a function. > > > > I disagree > > Why? I don't think it adds any clarity over just passing a function to > tasks.push. Keep in mind this is JavaScript not Dart so you'd have to write something silly like: (function (type, entry, index) { return function () { var file = type + index + '.html'; parseFile(type, processNextTask, entry, file, index); }; )(type, entry, index); If you wanted to avoid writing a function due to JavaScript's rule that only a function call creates a new context. I think the existing code is far more readable. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:168: if (entries != null) { On 2012/02/01 21:23:02, nweiz wrote: > On 2012/02/01 07:48:26, Jacob wrote: > > On 2012/02/01 00:10:39, nweiz wrote: > > > Style nit: if (!entries) > > I disagree. I prefer != null as it more clearly specifies the intent and would > > be what you would need to write in dart. > > I'm not sure I like the idea of writing Javascript as though it were dart, as > opposed to writing it idiomatically, but I guess an argument could be made > either way. writing JavaScript as if it were dart is the right pragmatic choice when the code will be ported to dart. If this code was going to stay in dart I would agree that idiomatic JavaScript is the only acceptable way to go. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... File utils/apidoc/mdn/postProcess.dart (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:1: #library("dump"); Added a class level comment to clarify that this file needs to split up into a quick and dirty pretty printer and a post process step. // TODO(jacobr): this file conflates pretty printing of the JSON database with // filtering the database to select the best matches per file. // Separate out the two tasks as quick and dirty coding is correct for pretty // printing but more carefully documented code is required for the code // filtering the database. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:9: Set matchedTypes; On 2012/02/01 21:23:02, nweiz wrote: > On 2012/02/01 07:48:26, Jacob wrote: > > On 2012/02/01 00:10:39, nweiz wrote: > > > These should have generic parameters. > > I don't find that helpful when dealing with JSON. > > I find it helpful as a reader of the code. I want to know things like how > deeply-nested the objects are, and once you can get rid of the type-declared > variables you'll need the full declaration to get type inference. I don't want to write Map<String, Map<String, Map<String, var>>> Tried the middle ground of specifying one level more of the type hierarchy. Not sure if this will work on dartium or trigger type errors. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:15: /** Returns whether the type has any property matching the specified name. */ On 2012/02/01 00:10:39, nweiz wrote: > s/property/member/ Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:17: Map data = allProps[type]; On 2012/02/01 00:10:39, nweiz wrote: > This type declaration would be unnecessary if allProps had a fully defined type. > > There are various other unnecessary types in this file. Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:23: List<String> sortStringCollection(Collection<String> collection) { On 2012/02/01 00:10:39, nweiz wrote: > I'd call this sortCollection and type it as "Collection<Comparable> -> > List<Comparable>". Seems like methods should accept as broad a type as possible, > and the caller should cast to something more specific. There is little value in being generic when I only need to sort strings. This is a helper method in a pretty printer not a generic library call. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:31: Map getMembersMap(Map entry) { On 2012/02/01 00:10:39, nweiz wrote: > Either change the docstring or make this take a List. removed comment as it was confusing https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:33: Map members = new Map<String, Object>(); On 2012/02/01 00:10:39, nweiz wrote: > s/Object/Dynamic/. Also use a map literal. I guess "<Dynamic>{}" just simplifies > to "{}", so even better. > > Also the type declaration for "members" is unnecessary. Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:45: String propType) { On 2012/02/01 00:10:39, nweiz wrote: > Helper methods should be declared within the functions that use them as helpers. > This also means you don't have to take sb, type, and members as parameters. agreed. done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:52: sb.add(""" On 2012/02/01 00:10:39, nweiz wrote: > Style nit: incorrect indentation. this is the correct indentation so that the output HTML is nicely indented. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:73: * Score an entry using heuristics to determine how well a entry On 2012/02/01 00:10:39, nweiz wrote: > Grammar nit: "an entry". Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:76: * with class level comments, etc. On 2012/02/01 00:10:39, nweiz wrote: > It'd be nice to have an example here of where we get conflicts and what they > look like. Maybe just links to different MDN pages we have to choose between for > some case. added a big comment /** * Score entries using similarity heuristics calculated from the observed and * expected list of members. We could be much less naive and penalize spurious * methods, prefer entries with class level comments, etc. This method is * needed becase we extract entries for each of the top search results for * each class name and rely on these scores to determine which entry was * best. Typically all scores but one will be zero. Multiple pages have * non-zero scores when MDN has multiple pages on the same class or pages on * similar classes (e.g. HTMLElement and Element), or pages on Mozilla * specific classes that are similar to DOM classes (Console). */ https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:78: num scoreEntry(Map entry, String type) { On 2012/02/01 00:10:39, nweiz wrote: > s/num/int/ there isn't any reason why the score won't be a floating point number in the future. thus num is correct. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:85: for (String name in members.getKeys()) { On 2012/02/01 00:10:39, nweiz wrote: > Why are we converting the members list into a map, only to iterate over it as > though it were a list? to remove any possible duplicates from the original list. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:103: num score = scoreEntry(entry, type); On 2012/02/01 00:10:39, nweiz wrote: > Indentation. > > Alternately, "if (entry == null) continue" on the previous line. control flow with continue is less readable in general. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:132: final sb = new StringBuffer(); On 2012/02/01 00:10:39, nweiz wrote: > The vast number of string buffers being used in this method make my head spin. > In lieu of an actual template system, I think it would be cleaner to write this > code as though one existed. That is, have an individual method for each file > that will be created; put all string buffer management into those methods. Then > main's responsibility becomes simply packaging the data up into the structures > that each of these faux-template methods needs. I agree that this code is ugly due to the lack of a template system. Added a TODO. However, it isn't worthwhile to cleanup the style for this debug output generation code here as it is throw away debugging code. The only worthwhile code that needs to be maintainable in this file is the post processing code. As mentioned in the TODO at the top of the file I should split that out from this debugging code. The post processing code should be maintainable, the debugging output code should be just quick and dirty and so no extra work should be performed to make it production quality. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:134: sb.add(""" On 2012/02/01 00:10:39, nweiz wrote: > Maybe add a TODO here for using a template system once we have one. Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:141: font: 14px/1.428 "Lucida Grande", "Lucida Sans Unicode", Lucida, Arial, Helvetica, sans-serif; On 2012/02/01 00:10:39, nweiz wrote: > Line length. Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:225: -- <a target="_blank" href="${entry == null ? "???" : entry["srcUrl"]}">scraped url</a> On 2012/02/01 00:10:39, nweiz wrote: > Line lengths. Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:235: Map members = getMembersMap(entry); On 2012/02/01 00:10:39, nweiz wrote: > Indentation, here and many other places below Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:289: ["summary", "constructor", "compatibility", "specification", "seeAlso"]) On 2012/02/01 00:10:39, nweiz wrote: > Where's the opening { for this loop? wow... i wish that was a compile warning or at least a lint warning. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:328: title = '<h4>Dart type: $type</h4><h2>${title}</h2>'; On 2012/02/01 00:10:39, nweiz wrote: > Style nit: $title, here and line 330. Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:344: ${sbExamples.toString()} On 2012/02/01 00:10:39, nweiz wrote: > Style nit: $sbExamples Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:371: <h3>Skipped generating documentation for $numSkipped classes due to no plausible matching files</h3> On 2012/02/01 00:10:39, nweiz wrote: > Line lengths Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:391: font: 14px/1.428 "Lucida Grande", "Lucida Sans Unicode", Lucida, Arial, Helvetica, sans-serif; On 2012/02/01 00:10:39, nweiz wrote: > Line length Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:432: font: 14px/1.428 "Lucida Grande", "Lucida Sans Unicode", Lucida, Arial, Helvetica, sans-serif; On 2012/02/01 00:10:39, nweiz wrote: > Line length Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:455: <table> On 2012/02/01 00:10:39, nweiz wrote: > Indentation Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:458: <th>Type</th><th>Name</th><th>Description</th><th>IDL</th><th>Status</th> On 2012/02/01 00:10:39, nweiz wrote: > Line length Done.
https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/cra... File utils/apidoc/mdn/crawl.js (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/cra... utils/apidoc/mdn/crawl.js:1: var http = require('http'); On 2012/02/02 05:26:38, Jacob wrote: > On 2012/02/01 21:23:02, nweiz wrote: > > On 2012/02/01 07:48:26, Jacob wrote: > > > On 2012/02/01 00:10:39, nweiz wrote: > > > > Why is this file in JS? > > > Because the dart server libraries currently lack the functionality we need. > > > When they do this script should be ported. > > > > It would be good to document exactly what's keeping all the JS scripts as JS. > > that seems low value. Just look at each call to a Node method (e.g. http. and > fs.) > and you have something that is missing in Dart currently. However I have added > a TODO to convert to dart when all these methods are available. It was non-obvious to me. It's important that someone coming across this in three months is able to understand whether or not it should be converted to Dart. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... File utils/apidoc/mdn/extract.dart (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:344: window.postMessage("START_DART_MESSAGE_UNIQUE_IDENTIFIER$dbJson", "*"); On 2012/02/02 05:26:38, Jacob wrote: > On 2012/02/01 21:23:02, nweiz wrote: > > On 2012/02/01 07:48:26, Jacob wrote: > > > On 2012/02/01 00:10:39, nweiz wrote: > > > > I'm pretty confused about why this is happening. I may figure it out as I > > read > > > > more of the CL, but it would be nice to have some documentation here > > > explaining > > > > it. > > > See the comment above... this is a hack to work around a bad bug in the JSON > > > parser. > > > > The comment made me think that only the "\n" stuff was a JSON bug. How does > > postMessage fit in to that? > > postMessage is just a natural way you communicate from Dart to JS. What does that have to do with JSON parser bugs? https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... File utils/apidoc/mdn/extractRunner.js (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:29: function parseFile(type, onDone, entry, file, searchResultIndex) { On 2012/02/02 05:26:38, Jacob wrote: > On 2012/02/01 21:23:02, nweiz wrote: > > On 2012/02/01 07:48:26, Jacob wrote: > > > On 2012/02/01 00:10:39, nweiz wrote: > > > > Why is this a separate function? It's only called once and the name isn't > > any > > > > more semantic than a comment would be. Moving it here just makes the code > > more > > > > non-local. > > > > > > Seems like a reasonable function name to me. > > > > The name is reasonable, but "// parse the HTML file" is reasonable as well. My > > point is that moving this into a separate function means the reader of the > code > > has to jump around the file more to follow the control flow. The semantics of > > the parameters also become more obscure; it's not clear that onDone always > > points to processNextTask, it's not clear that file is always type + > > searchResultIndex + '.html', it's not clear that type and entry come from > > domTypes and cacheData and thus what their structure and semantics are. > > > > There's certainly something to be said for functions that are only called once > > being used to organize the code, but I don't think this is the right chunk of > > code to pull out. The large number of parameters it takes is an indication of > > this. I feel like good candidates for pulling out are chunks of code that do a > > single discrete operation that could be plausibly re-used, rather than code > like > > this which is just part of the main business logic of the script. > > here's a different take on why this is a good function: > if there was test code for this file, you would absolutely want to have the test > code just call parseFile on the specific file being tested. > This function encapsulates just the logic of parsing a single file. The > function that calls it includes the cruft to handle scheduling tasks to run this > fn. Even though this fn is only called from one context, having that separation > is still handy IMHO. > Obviously this is a borderline case and you could design either way and it would > be reasonable. This function also includes reading the input file off the disk and writing it back, things that render it much more difficult to test. If we were architecting for testability, I would still recommend inlining it and factoring out more discrete and idempotent sub-functions. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extractRunner.js:158: function createTask(type, entry, index) { On 2012/02/02 05:26:38, Jacob wrote: > On 2012/02/01 21:23:02, nweiz wrote: > > On 2012/02/01 07:48:26, Jacob wrote: > > > On 2012/02/01 00:10:39, nweiz wrote: > > > > This also doesn't seem worth a function. > > > > > > I disagree > > > > Why? I don't think it adds any clarity over just passing a function to > > tasks.push. > > Keep in mind this is JavaScript not Dart so you'd have to write something silly > like: > (function (type, entry, index) { > return function () { > var file = type + index + '.html'; > parseFile(type, processNextTask, entry, file, index); > }; > )(type, entry, index); > > If you wanted to avoid writing a function due to JavaScript's rule that only a > function call creates a new context. > I think the existing code is far more readable. Good point, I had forgotten about Javascript's crazy binding rules. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... File utils/apidoc/mdn/postProcess.dart (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:9: Set matchedTypes; On 2012/02/02 05:26:38, Jacob wrote: > On 2012/02/01 21:23:02, nweiz wrote: > > On 2012/02/01 07:48:26, Jacob wrote: > > > On 2012/02/01 00:10:39, nweiz wrote: > > > > These should have generic parameters. > > > I don't find that helpful when dealing with JSON. > > > > I find it helpful as a reader of the code. I want to know things like how > > deeply-nested the objects are, and once you can get rid of the type-declared > > variables you'll need the full declaration to get type inference. > I don't want to write > Map<String, Map<String, Map<String, var>>> > Tried the middle ground of specifying one level more of the type hierarchy. Not > sure if this will work on dartium or trigger type errors. A single-level declaration is okay, I suppose, as long as all the variables assigned to the map values also have single-level generic types declared. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:23: List<String> sortStringCollection(Collection<String> collection) { On 2012/02/02 05:26:38, Jacob wrote: > On 2012/02/01 00:10:39, nweiz wrote: > > I'd call this sortCollection and type it as "Collection<Comparable> -> > > List<Comparable>". Seems like methods should accept as broad a type as > possible, > > and the caller should cast to something more specific. > > There is little value in being generic when I only need to sort strings. This > is a helper method in a pretty printer not a generic library call. There's little value in being String-specific, either. I think making the method broader when there's no cost to doing so is a good idea, even if you don't anticipate re-using it. It expresses the intent of the method better, which makes the code easier to understand. When the method is called "sortStringCollection", I expect it to do something String-specific and have to spend extra mental cycles figuring out that it doesn't. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:52: sb.add(""" On 2012/02/02 05:26:38, Jacob wrote: > On 2012/02/01 00:10:39, nweiz wrote: > > Style nit: incorrect indentation. > this is the correct indentation so that the output HTML is nicely indented. "<tr>" and "</tr>" should have the same indentation. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:76: * with class level comments, etc. On 2012/02/02 05:26:38, Jacob wrote: > On 2012/02/01 00:10:39, nweiz wrote: > > It'd be nice to have an example here of where we get conflicts and what they > > look like. Maybe just links to different MDN pages we have to choose between > for > > some case. > > added a big comment > > /** > * Score entries using similarity heuristics calculated from the observed and > * expected list of members. We could be much less naive and penalize spurious > * methods, prefer entries with class level comments, etc. This method is > * needed becase we extract entries for each of the top search results for > * each class name and rely on these scores to determine which entry was > * best. Typically all scores but one will be zero. Multiple pages have > * non-zero scores when MDN has multiple pages on the same class or pages on > * similar classes (e.g. HTMLElement and Element), or pages on Mozilla > * specific classes that are similar to DOM classes (Console). > */ This is great. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:103: num score = scoreEntry(entry, type); On 2012/02/02 05:26:38, Jacob wrote: > On 2012/02/01 00:10:39, nweiz wrote: > > Indentation. > > > > Alternately, "if (entry == null) continue" on the previous line. > control flow with continue is less readable in general. It seems to me like it's the same case as "if (...) return". I find it more readable than deeply-nested conditionals. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:132: final sb = new StringBuffer(); On 2012/02/02 05:26:38, Jacob wrote: > On 2012/02/01 00:10:39, nweiz wrote: > > The vast number of string buffers being used in this method make my head spin. > > In lieu of an actual template system, I think it would be cleaner to write > this > > code as though one existed. That is, have an individual method for each file > > that will be created; put all string buffer management into those methods. > Then > > main's responsibility becomes simply packaging the data up into the structures > > that each of these faux-template methods needs. > I agree that this code is ugly due to the lack of a template system. Added a > TODO. However, it isn't worthwhile to cleanup the style for this debug output > generation code here as it is throw away debugging code. The only worthwhile > code that needs to be maintainable in this file is the post processing code. As > mentioned in the TODO at the top of the file I should split that out from this > debugging code. The post processing code should be maintainable, the debugging > output code should be just quick and dirty and so no extra work should be > performed to make it production quality. When will the debugging code be removed?
https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/cra... File utils/apidoc/mdn/crawl.js (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/cra... utils/apidoc/mdn/crawl.js:1: var http = require('http'); On 2012/02/02 19:54:34, nweiz wrote: > On 2012/02/02 05:26:38, Jacob wrote: > > On 2012/02/01 21:23:02, nweiz wrote: > > > On 2012/02/01 07:48:26, Jacob wrote: > > > > On 2012/02/01 00:10:39, nweiz wrote: > > > > > Why is this file in JS? > > > > Because the dart server libraries currently lack the functionality we > need. > > > > When they do this script should be ported. > > > > > > It would be good to document exactly what's keeping all the JS scripts as > JS. > > > > that seems low value. Just look at each call to a Node method (e.g. http. and > > fs.) > > and you have something that is missing in Dart currently. However I have > added > > a TODO to convert to dart when all these methods are available. > > It was non-obvious to me. It's important that someone coming across this in > three months is able to understand whether or not it should be converted to > Dart. The trouble is if I made a list of which methods still need to be implemented by the dart server side support then that list will be obsolete in 3 months so there isn't value in explicitly listing that here. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... File utils/apidoc/mdn/extract.dart (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:1: #import ("dart:html"); On 2012/02/01 21:23:02, nweiz wrote: > On 2012/02/01 07:48:26, Jacob wrote: > > On 2012/02/01 00:10:39, nweiz wrote: > > > This whole file could use considerably more documentation. I had a very > > > difficult time following it. > > > > I agree that this script needs a larger comment describing the design but I > > disagree that most of the small methods should have comments because they are > > mainly self explanatory. > > Some of the methods are certainly self-explanatory, but some of them are less so > than they seem at first glance. For example, it would be useful to know what > criteria determined the first line and why we needed to find it for > "findFirstLine", it's not immediately obvious what all the parameters for > "findBest" mean, etc. Done. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/ext... utils/apidoc/mdn/extract.dart:344: window.postMessage("START_DART_MESSAGE_UNIQUE_IDENTIFIER$dbJson", "*"); On 2012/02/02 19:54:34, nweiz wrote: > On 2012/02/02 05:26:38, Jacob wrote: > > On 2012/02/01 21:23:02, nweiz wrote: > > > On 2012/02/01 07:48:26, Jacob wrote: > > > > On 2012/02/01 00:10:39, nweiz wrote: > > > > > I'm pretty confused about why this is happening. I may figure it out as > I > > > read > > > > > more of the CL, but it would be nice to have some documentation here > > > > explaining > > > > > it. > > > > See the comment above... this is a hack to work around a bad bug in the > JSON > > > > parser. > > > > > > The comment made me think that only the "\n" stuff was a JSON bug. How does > > > postMessage fit in to that? > > > > postMessage is just a natural way you communicate from Dart to JS. > > What does that have to do with JSON parser bugs? The postMessage line isn't related to the JSON bug. Added a comment to clarify its use. // Use postMessage to end the JSON to JavaScript. TODO(jacobr): use a simple // isolate based Dart-JS interop solution in the future. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... File utils/apidoc/mdn/postProcess.dart (right): https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:23: List<String> sortStringCollection(Collection<String> collection) { On 2012/02/02 19:54:34, nweiz wrote: > On 2012/02/02 05:26:38, Jacob wrote: > > On 2012/02/01 00:10:39, nweiz wrote: > > > I'd call this sortCollection and type it as "Collection<Comparable> -> > > > List<Comparable>". Seems like methods should accept as broad a type as > > possible, > > > and the caller should cast to something more specific. > > > > There is little value in being generic when I only need to sort strings. This > > is a helper method in a pretty printer not a generic library call. > > There's little value in being String-specific, either. I think making the method > broader when there's no cost to doing so is a good idea, even if you don't > anticipate re-using it. It expresses the intent of the method better, which > makes the code easier to understand. When the method is called > "sortStringCollection", I expect it to do something String-specific and have to > spend extra mental cycles figuring out that it doesn't. having a sortCollection method would be reasonable. However keep in mind that in Dart that would mean the output would need to be a List<Collection> instead of a List<String> which is a little annoying. Anyway, if you feel really strongly about this I could go either way. FWIW, this method could easily become string specific if we decided it was important to use a case insensitive string comparison method, etc. Anyway, as a design principle I believe that it is rarely worth it to generalize code before you have to. If needed it would be trivial to refactor this to a general method in the future. Being more specific makes code less abstract so often easier to follow. Obviously in this case the code is very simple so it isn't a big deal one way or the other. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:52: sb.add(""" On 2012/02/02 19:54:34, nweiz wrote: > On 2012/02/02 05:26:38, Jacob wrote: > > On 2012/02/01 00:10:39, nweiz wrote: > > > Style nit: incorrect indentation. > > this is the correct indentation so that the output HTML is nicely indented. > > "<tr>" and "</tr>" should have the same indentation. fixed https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:81: score++; On 2012/02/01 00:10:39, nweiz wrote: > Why would we ever want to choose a skipped entry? I doubt we would. Added a TODO to consider removing skipped entries completely instead of just giving them lower scores. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:103: num score = scoreEntry(entry, type); On 2012/02/02 19:54:34, nweiz wrote: > On 2012/02/02 05:26:38, Jacob wrote: > > On 2012/02/01 00:10:39, nweiz wrote: > > > Indentation. > > > > > > Alternately, "if (entry == null) continue" on the previous line. > > control flow with continue is less readable in general. > > It seems to me like it's the same case as "if (...) return". I find it more > readable than deeply-nested conditionals. Some people go so far as to prohibit if (...) return; calls at the top of methods I think those are fine and can improve readability in a lot of cases. However, returns in arbitrary locations in the middle of methods are bad and significantly hurt readability by making the code flow more surprising. As code is written once and often read multiple times I prefer less surprising code to more terse code for cases when the extra code bloat is small. https://chromiumcodereview.appspot.com/9225039/diff/3003/utils/apidoc/mdn/pos... utils/apidoc/mdn/postProcess.dart:132: final sb = new StringBuffer(); On 2012/02/02 19:54:34, nweiz wrote: > On 2012/02/02 05:26:38, Jacob wrote: > > On 2012/02/01 00:10:39, nweiz wrote: > > > The vast number of string buffers being used in this method make my head > spin. > > > In lieu of an actual template system, I think it would be cleaner to write > > this > > > code as though one existed. That is, have an individual method for each file > > > that will be created; put all string buffer management into those methods. > > Then > > > main's responsibility becomes simply packaging the data up into the > structures > > > that each of these faux-template methods needs. > > I agree that this code is ugly due to the lack of a template system. Added a > > TODO. However, it isn't worthwhile to cleanup the style for this debug output > > generation code here as it is throw away debugging code. The only worthwhile > > code that needs to be maintainable in this file is the post processing code. > As > > mentioned in the TODO at the top of the file I should split that out from this > > debugging code. The post processing code should be maintainable, the > debugging > > output code should be just quick and dirty and so no extra work should be > > performed to make it production quality. > > When will the debugging code be removed? in my copious free time. If you think it is a priority you are welcome to split out the actual logic from the code that generates debugging html. |