utils/apidoc/mdn/README.txt - Issue 9225039: Integrate MDN content into API documentation.

Issue 9225039: Integrate MDN content into API documentation. (Closed) Base URL: https://dart.googlecode.com/svn/branches/bleeding_edge/dart

Patch Set: Remove temp code. Created 8 years, 10 months ago

OLD	NEW
(Empty)
	1 Here's a rough walkthrough of how this works. The ultimate output file is

	2 database.filtered.json.
	nweiz 2012/02/01 00:10:39 After reading this, I'm having trouble understanding how all this processing fits together. Why is so much of this in JS? What's run in a browser? Why? How? By whom? When? Answering these questions will make this much clearer for someone coming in without much knowledge. I'm being pretty demanding in my comments here because this MDN extraction is a complex piece of code, and it seems crucial that anyone interacting with it have a clear overview of how it works. This README should provide that overview, but I think as-is it falls short.
	3

	4 search.js

	5 - read data/domTypes.json
	nweiz 2012/02/01 00:10:39 What's in this file? Where does it come from?
	6 - for each dom type:

	7 - search for page on www.googleapis.com

	8 - write search results to output/search/<type>.json

	9 . this is a list of search results and urls to pages
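
The search.js steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: it assumes data/domTypes.json boils down to a list of type names, the function names (searchOutputPath, planSearches, fakeSearch) are hypothetical, and the real network search is replaced by a stub.

```javascript
// Sketch of the search.js loop. All names here are hypothetical and the
// search request itself is stubbed out; only the file layout comes from
// the README.
const path = require("path");

// Where the results for one DOM type are written: output/search/<type>.json
function searchOutputPath(type) {
  return path.posix.join("output", "search", type + ".json");
}

// For each DOM type, run a search and pair the results with the file that
// would hold them. `search` stands in for the real request.
function planSearches(domTypes, search) {
  return domTypes.map((type) => ({
    type,
    file: searchOutputPath(type),
    results: search(type),
  }));
}

// Stub search: returns a list of results, each with a title and a URL.
const fakeSearch = (type) => [
  { title: type, url: "https://example.invalid/" + type },
];

// Assumed shape of domTypes.json: a plain list of type names.
const plan = planSearches(["Node", "Element"], fakeSearch);
console.log(plan.map((p) => p.file));
```

Keeping the path computation separate from the search call makes the one-file-per-type layout easy to check without touching the network.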

	10

	11 crawl.js

	12 - read data/domTypes.json

	13 - for each dom type:

	14 - for each output/search/<type>.json:
	nweiz 2012/02/01 00:10:39 Isn't there only one of these files for each type?
	15 - for each result in the file:

	16 - try to scrape that cached MDN page from webcache.googleusercontent.com

	17 - write mdn page to output/crawl/<type><index of result>.html

	18 - write output/crawl/cache.json

	19 . it maps types -> search result page urls and titles
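
As a rough illustration of the crawl.js steps above, the sketch below walks the search results per type, names each scraped page output/crawl/<type><index>.html, and builds the type-to-results map that would go into cache.json. The function names and the exact shape of the cache entries are assumptions, and the page fetch is a stub rather than a real request to the cached page.

```javascript
// Sketch of the crawl.js loop. Names and data shapes are hypothetical;
// only the output file naming comes from the README.

// File for the index-th search result of a type.
function crawlPagePath(type, index) {
  return "output/crawl/" + type + index + ".html";
}

// searchResults: { <type>: [{ title, url }, ...] }
// fetchPage(url): returns the page HTML, or null when scraping fails.
function crawl(searchResults, fetchPage) {
  const pages = {}; // file path -> HTML that would be written
  const cache = {}; // type -> [{ title, url }], as in cache.json (shape assumed)
  for (const type of Object.keys(searchResults)) {
    cache[type] = [];
    searchResults[type].forEach((result, index) => {
      cache[type].push({ title: result.title, url: result.url });
      const html = fetchPage(result.url);
      if (html !== null) {
        // "try to scrape": only successful fetches produce a file.
        pages[crawlPagePath(type, index)] = html;
      }
    });
  }
  return { pages, cache };
}

// Stub fetch that fails for one URL to show the skip behaviour.
const demo = crawl(
  { Node: [{ title: "Node", url: "u0" }, { title: "Node (old)", url: "u1" }] },
  (url) => (url === "u1" ? null : "<html>" + url + "</html>")
);
console.log(Object.keys(demo.pages), demo.cache);
```

Recording every result in the cache while writing files only for successful fetches keeps the <index> in each file name aligned with the position in the search results.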

	20

	21 extract.sh
	nweiz 2012/02/01 00:10:39 Should probably mention which directory this needs to be run from.
	22 - compile extract.dart to js

	23 - run extractRunner.js
	nweiz 2012/02/01 00:10:39 Is this the same as the compiled extract.dart? If not, what's the relation between the two?
	24 - read data/domTypes.json

	25 - read output/crawl/cache.json

	26 - read data/dartIdl.json
	nweiz 2012/02/01 00:10:39 What's in this file? Where does it come from?
	27 - for each scraped search result page:

	28 - create a cleaned up html page in output/extract/<type><index>.html that

	29 contains the scraped content + a script tag that includes extract.dart.js.

	30 - create an args file in output/extract/<type><index>.html.json with some

	31 data on how that file should be processed
	nweiz 2012/02/01 00:10:39 s/that file/the HTML file/ What sort of data? What uses the data? "An args file" doesn't tell me much.
	32 - invoke dump render tree on that file
	nweiz 2012/02/01 00:10:39 Make it more explicit that this invokes it in a headless browser and so runs extract.dart.js in the context of the HTML page. Will this access the JSON file?
	33 - when that returns, parse the console output and add it to database.json
	nweiz 2012/02/01 00:10:39 Does this mean output/database.json?
	34 - add any errors to output/errors.json

	35 - save output/database.json
	nweiz 2012/02/01 00:10:39 Somewhat confusing given that you just said you were writing to it, and you don't also say "save output/errors.json".
	36

	37 extract.dart
	nweiz 2012/02/01 00:10:39 Is this run within extractRunner.js? How is its functionality different than that of extractRunner.js?
	38 - xhr output/extract/<type><index>.html.json
	nweiz 2012/02/01 00:10:39 Is this different than the "read .json" you're doing in extractRunner.js? If so, how are you reading files there?
	39 - all sorts of shenanigans to actually pull the content out of the html

	40 - build a JSON object with the results

	41 - do a postmessage with that object so extractRunner.js can pull it out

	42

	43 - run postProcess.dart
	nweiz 2012/02/01 00:10:39 Is this run via DumpRenderTree? On the VM? On Frog+Node?
	44 - go through the results for each type looking for the best match
	nweiz 2012/02/01 00:10:39 Mention what files you're using here.
	45 - write output/database.html

	46 - write output/examples.html

	47 - write output/obsolete.html
	nweiz 2012/02/01 00:10:39 What are all these files for? Why are they in HTML? Are they meant to be human-readable?
	48 - write output/database.filtered.json which is the best matches
	nweiz 2012/02/01 00:10:39 Is this just a mapping of type names to the contents of output/extract/*.json?
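
One way to picture the "best match" step is the sketch below. The README does not say how the best match is chosen or what database.filtered.json looks like, so everything here is an assumption: the database is taken to map each type to a list of candidate entries, the scoring function is supplied by the caller, and the `members` count used in the demo is a placeholder score, not the criterion used by postProcess.dart.

```javascript
// Sketch of a best-match filter. The data shape, the names, and the scoring
// are all hypothetical; postProcess.dart may work quite differently.

// database: { <type>: [candidate, ...] }
// score(candidate): number, higher is better.
function filterBestMatches(database, score) {
  const filtered = {};
  for (const [type, candidates] of Object.entries(database)) {
    if (candidates.length === 0) continue; // nothing extracted for this type
    // Keep the highest-scoring candidate; the earliest one wins a tie.
    filtered[type] = candidates.reduce((best, candidate) =>
      score(candidate) > score(best) ? candidate : best
    );
  }
  return filtered;
}

// Demo with a placeholder score: the number of members found on the page.
const database = {
  Node: [
    { title: "Node (stub page)", members: 2 },
    { title: "Node", members: 5 },
  ],
  Unmatched: [],
};
const best = filterBestMatches(database, (c) => c.members);
console.log(JSON.stringify(best, null, 2));
```

Passing the score in as a function keeps the selection loop independent of whatever criterion the real post-processing applies.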