lib/src/line_splitter.dart - Issue 1255643002: New, simpler and faster line splitter.

Side by Side Diff: lib/src/line_splitter.dart

Issue 1255643002: New, simpler and faster line splitter. (Closed) Base URL: https://github.com/dart-lang/dart_style.git@master

Patch Set: Optimize nesting. Reformat. Created 5 years, 5 months ago

Use n/p to move between diff chunks; N/P to move between comments. Draft comments are only viewable by you.

Jump to:

View unified diff | Download patch

OLD	NEW
1 // Copyright (c) 2014, the Dart project authors. Please see the AUTHORS file	1 // Copyright (c) 2015, the Dart project authors. Please see the AUTHORS file

2 // for details. All rights reserved. Use of this source code is governed by a	2 // for details. All rights reserved. Use of this source code is governed by a

3 // BSD-style license that can be found in the LICENSE file.	3 // BSD-style license that can be found in the LICENSE file.

4	4

5 library dart_style.src.line_splitter;	5 library dart_style.src.line_splitter;

6	6

7 import 'dart:math' as math;	7 import 'package:collection/priority_queue.dart';

8	8

9 import 'chunk.dart';	9 import 'chunk.dart';

10 import 'debug.dart' as debug;	10 import 'debug.dart' as debug;

11 import 'line_prefix.dart';

12 import 'line_writer.dart';	11 import 'line_writer.dart';

13 import 'rule/rule.dart';	12 import 'rule/rule.dart';

14	13 import 'rule_set.dart';

15 /// Takes a series of [Chunk]s and determines the best way to split them into	14

16 /// lines of output that fit within the page width (if possible).	15 /// To ensure the solver doesn't go totally pathological on giant code, we cap

17 ///	16 /// it at a fixed number of attempts.

18 /// Imagine a naïve solution. You start at the first chunk in the list and work	17 ///

19 /// your way forward. For each chunk, you try every possible value of the rule	18 /// If the optimal solution isn't found after this many tries, it just uses the

20 /// that the chunk uses. Then, you ask the rule if that chunk should split or	19 /// best it found so far.

21 /// not when the rule has that value. If it does, then you try every possible	20 const _maxAttempts = 50000;

22 /// way you could modify the expression nesting context at that point.	21

23 ///	22 /// Takes a set of chunks and determines the best values for its rules in order

24 /// For each of those possible results, you start with that as a partial	23 /// to fit it inside the page boundary.

25 /// solution. You move onto the next chunk and repeat the process, building on	24 ///

26 /// to that partial solution. When you reach the last chunk, you determine the	25 /// This problem is exponential in the number of rules and a single expression

27 /// cost of that set of rule values and keep track of the best one.	26 /// in Dart can be quite large, so it isn't feasible to brute force this. For

28 ///	27 /// example:

29 /// Once you've tried every single possible permutation of rule values and	28 ///

30 /// nesting stacks, you will have exhaustively covered every possible way the	29 /// outer(

31 /// line can be split and you'll know which one had the lowest cost. That's	30 /// fn(1 + 2, 3 + 4, 5 + 6, 7 + 8),

32 /// your result.	31 /// fn(1 + 2, 3 + 4, 5 + 6, 7 + 8),

33 ///	32 /// fn(1 + 2, 3 + 4, 5 + 6, 7 + 8),

34 /// The obvious problem is that this is exponential in the number of values for	33 /// fn(1 + 2, 3 + 4, 5 + 6, 7 + 8));

35 /// each rule and the expression nesting levels, both of which can be quite	34 ///

36 /// large with things like long method chains containing function literals,	35 /// There are 509,607,936 ways this can be split.

37 /// large collection literals, etc. To tame that, this uses dynamic programming.	36 ///

38 ///	37 /// The problem is even harder because we may not be able to easily tell if a

39 /// As the naïve solver is working its way down the chunks, it's building up a	38 /// given solution is the best one. It's possible that there is no solution

40 /// partial solution. This solution has:	39 /// that fits in the page (due to long strings or identifiers) so the winning

41 ///	40 /// solution may still have overflow characters. This makes it hard to know

42 /// * A length—the number of chunks in it.	41 /// when we are done and can stop looking.

43 /// * The set of rule values we have selected for the rules used by those	42 ///

44 /// chunks.	43 /// There are a couple of pieces of domain knowledge we use to cope with this:

45 /// * The expression nesting stack that is currently active at the point where	44 ///

46 /// the current chunk splits.	45 /// - Changing a rule from unsplit to split will never lower its cost. A

47 ///	46 /// solution with all rules unsplit will always be the one with the lowest

48 /// We'll call this partial solution a [LinePrefix]. The naïve solution starts	47 /// cost (zero). Conversely, setting all of its rules to the maximum split

49 /// with an empty prefix (zero length, no rule values, no expression nesting).	48 /// value will always have the highest cost.

50 /// It grabs the next chunk after that prefix and says, OK, given this prefix	49 ///

51 /// what are all of the possible ways we can split that chunk (rule values and	50 /// (You might think there is a converse rule about overflow characters. The

52 /// nesting stacks). Create a new, one-chunk-longer prefix for each of those	51 /// solution with the fewest splits will have the most overflow, and the

53 /// and then recursively solve each of their suffixes.	52 /// solution with the most splits will have the least overflow. Alas, because

54 ///	53 /// of indentation, that isn't always the case. Adding a split may increase

55 /// Normally, this would basically be a depth-first traversal of the entire	54 /// overflow in some cases.)

56 /// possible solution space. However, there is a simple optimization we can	55 ///

57 /// make. When brute forcing the solution tree, we will often solve many	56 /// - If all of the chunks for a rule are inside lines that already fit in the

58 /// prefixes with the same length.	57 /// page, then splitting that rule will never improve the solution.

59 ///	58 ///

60 /// For example, if the first chunk's rule can take three different values, we	59 /// We start off with a [SolveState] where all rules are unbound (which

61 /// will solve three line prefixes of length one, one for each rule value.	60 /// implicitly treats them as unsplit). For a given solve state, we can produce

62 /// Those in turn may have lots of permutations leading to even more solutions	61 /// a set of expanded states that takes some of the rules in the first long

63 /// all with line prefixes of length two, and so on.	62 /// line and bind them to split values. This always produces new solve states

64 ///	63 /// with higher cost (but often fewer overflow characters) than the parent

65 /// In many cases, we have to treat these line prefixes separately. If two	64 /// state.

66 /// prefixes have the same length, but fix a different value for some rule, they	65 ///

67 /// may lead to different ways of splitting the suffix since that rule may	66 /// We take these expanded states and add them to a work list sorted by cost.

68 /// appear in a later chunk too.	67 /// Since unsplit rules always have lower cost solutions, we know that no state

69 ///	68 /// we enqueue later will ever have a lower cost than the ones we already have

70 /// But in many cases, the line prefixes are equivalent. For example, consider:	69 /// enqueued.

71 ///	70 ///

72 /// function(	71 /// Then we keep pulling states off the work list and expanding them and adding

73 /// first(arg, arg, arg, arg, arg, arg),	72 /// the results back into the list. We do this until we hit a solution where

74 /// second,	73 /// all characters fit in the page. The first one we find will have the lowest

75 /// third(arg, arg, arg, arg, arg, arg));	74 /// cost and we're done.

76 ///	75 ///

77 /// The line splitter will try a number of different ways to split the argument	76 /// We also keep running track of the best solution we've found so far that

78 /// list to `first`. For each of those, it then has to try all of the different	77 /// has the fewest overflow characters and the lowest cost. If no solution fits,

79 /// ways it could split `third`. However, those two are unrelated. The way	78 /// we'll use this one.

80 /// you split `first` has no impact on how you split `third`.	79 ///

81 ///	80 /// As a final escape hatch for pathologically nasty code, after trying some

82 /// The precise way to state that is that as the line prefix grows past	81 /// fixed maximum number of solve states, we just bail and return the best

83 /// `first()` and its argument list, it contains less state. It discards the	82 /// solution found so far.

84 /// rule value it selected for the argument list rule since that rule doesn't	83 ///

85 /// appear anywhere in the suffix and can't affect it. Any expression nesting	84 /// Even with the above algorithmic optimizations, complex code may still

86 /// inside the argument list has been popped off by the time we get to `second`.	85 /// require a lot of exploring to find an optimal solution. To make that fast,

87 ///	86 /// this code is carefully profiled and optimized. If you modify this, make

88 /// We'll try every possible way to split `first()`'s argument list and then	87 /// sure to test against the benchmark to ensure you don't regress performance.

89 /// recursively solve the remainder of the line for each of those possible

90 /// splits. But each of those remainders will end up having identical prefixes.

91 /// For any given [LinePrefix], there is a single best solution for the suffix.

92 /// So, once we've found it, we can just reuse it the next time that same

93 /// prefix comes up.

94 ///

95 /// In other words, we add a memoization table, [_bestSplits]. It is a map from

96 /// [LinePrefix]es to [SplitSet]s. For each prefix, it stores the best set of

97 /// splits to use for the following suffix. With this, instead of exploring the

98 /// entire solution tree, which would be exponential, we just traverse the

99 /// solution graph where entire branches of the tree that are identical

100 /// collapse back together.

101 ///

102 /// The trick to making this work is storing the minimal amount of data in the

103 /// line prefix to uniquely identify a suffix while still collapsing as many

104 /// branches as possible. In practice, this means aggressively forgetting state

105 /// when the prefix grows. In particular, we do not need to keep track of

106 /// rule values we selected in the prefix if that rule does not affect anything

107 /// in the suffix.

108 ///

109 /// There are a couple of other smaller scale optimizations too. If we ever hit

110 /// a solution whose cost is zero, we know we won't do better, so we stop

111 /// searching. If we make it to the end of the chunk list and we haven't gone

112 /// past the end of the line, we know we don't need to split anywhere.

113 ///

114 /// But the memoization is the main thing that makes this tractable.

115 class LineSplitter {	88 class LineSplitter {

116 final LineWriter _writer;	89 final LineWriter _writer;

117	90

118 /// The list of chunks being split.	91 /// The list of chunks being split.

119 final List<Chunk> _chunks;	92 final List<Chunk> _chunks;

120	93

	94 /// The set of soft rules whose values are being selected.

	95 final List<Rule> _rules;

	96

121 /// The number of characters of additional indentation to apply to each line.	97 /// The number of characters of additional indentation to apply to each line.

122 ///	98 ///

123 /// This is used when formatting blocks to get the output into the right	99 /// This is used when formatting blocks to get the output into the right

124 /// column based on where the block appears.	100 /// column based on where the block appears.

125 final int _blockIndentation;	101 final int _blockIndentation;

126	102

127 /// Memoization table for the best set of splits for the remainder of the	103 /// The starting column of the first line.

128 /// line following a given prefix.	104 final int _firstLineIndent;

129 final _bestSplits = <LinePrefix, SplitSet>{};	105

130	106 /// The list of solve states to explore further.

131 /// The rules that appear in the first `n` chunks of the line where `n` is	107 ///

132 /// the index into the list.	108 /// This is sorted lowest-cost first. This ensures that as soon as we find a

133 final _prefixRules = <Set<Rule>>[];	109 /// solution that fits in the page, we know it will be the lowest cost one

134	110 /// and can stop looking.

135 /// The rules that appear after the first `n` chunks of the line where `n` is	111 final _workList = new HeapPriorityQueue<SolveState>();

136 /// the index into the list.	112

137 final _suffixRules = <Set<Rule>>[];	113 /// The lowest cost solution found so far.

	114 SolveState _bestSolution;

138	115

139 /// Creates a new splitter for [_writer] that tries to fit [chunks] into the	116 /// Creates a new splitter for [_writer] that tries to fit [chunks] into the

140 /// page width.	117 /// page width.

141 LineSplitter(this._writer, this._chunks, this._blockIndentation);	118 LineSplitter(this._writer, List<Chunk> chunks, int blockIndentation,

142	119 int firstLineIndent,

143 /// Convert the line to a [String] representation.	120 {bool flushLeft: false})

144 ///	121 : _chunks = chunks,

145 /// It will determine how best to split it into multiple lines of output and	122 // Collect the set of soft rules that we need to select values for.

146 /// write the result to [writer].	123 _rules = chunks

	124 .map((chunk) => chunk.rule)

	125 .where((rule) => rule != null && rule is! HardSplitRule)

	126 .toSet()

	127 .toList(growable: false),

	128 _blockIndentation = blockIndentation,

	129 _firstLineIndent = flushLeft ? 0 : firstLineIndent + blockIndentation {

	130 // Store the rule's index in the rule so we can get from a chunk to a rule

	131 // index quickly.

	132 for (var i = 0; i < _rules.length; i++) {

	133 _rules[i].index = i;

	134 }

	135 }

	136

	137 /// Determine the best way to split the chunks into lines that fit in the

	138 /// page, if possible.

	139 ///

	140 /// Returns a [SplitSet] that defines where each split occurs and the

	141 /// indentation of each line.

147 ///	142 ///

148 /// [firstLineIndent] is the number of characters of whitespace to prefix the	143 /// [firstLineIndent] is the number of characters of whitespace to prefix the

149 /// first line of output with.	144 /// first line of output with.

150 SplitSolution apply(int firstLineIndent, {bool flushLeft}) {	145 SplitSet apply() {

	146 // Start with a completely unbound, unsplit solution.

	147 _workList.add(new SolveState(this, new RuleSet(_rules.length)));

	148

	149 var attempts = 0;

	150 while (!_workList.isEmpty) {

	151 var state = _workList.removeFirst();

	152

	153 if (state.isBetterThan(_bestSolution)) {

	154 _bestSolution = state;

	155

	156 // Since we sort solutions by cost the first solution we find that

	157 // fits is the winner.

	158 if (_bestSolution.overflowChars == 0) break;

	159 }

	160

	161 if (debug.traceSplitter) {

	162 var best = state == _bestSolution ? " (best)" : "";

	163 debug.log("$state$best");

	164 debug.dumpLines(_chunks, _firstLineIndent, state.splits);

	165 debug.log();

	166 }

	167

	168 if (attempts++ > _maxAttempts) break;

	169

	170 // Try bumping the rule values for rules whose chunks are on long lines.

	171 state.expand();

	172 }

	173

151 if (debug.traceSplitter) {	174 if (debug.traceSplitter) {

152 debug.log(debug.green("\nSplitting:"));	175 debug.log("$_bestSolution (winner)");

153 debug.dumpChunks(0, _chunks);	176 debug.dumpLines(_chunks, _firstLineIndent, _bestSolution.splits);

154 debug.log();	177 debug.log();

155 }	178 }

156	179

157 // Ignore the trailing rule on the last chunk since it isn't used for	180 return _bestSolution.splits;

158 // anything.	181 }

159 var ruleChunks = _chunks.take(_chunks.length - 1);	182 }

160	183

161 // Pre-calculate the set of rules appear before and after each length. We	184 /// A possibly incomplete solution in the line splitting search space.

162 // use these frequently when creating [LinePrefix]es and they only depend	185 ///

163 // on the length, so we can cache them up front.	186 /// A single [SolveState] binds some subset of the rules to values while

164 for (var i = 0; i < _chunks.length; i++) {	187 /// leaving the rest unbound. If every rule is bound, the solve state describes

165 _prefixRules.add(ruleChunks.take(i).map((chunk) => chunk.rule).toSet());	188 /// a complete solution to the line splitting problem. Even if rules are

166 _suffixRules.add(ruleChunks.skip(i).map((chunk) => chunk.rule).toSet());	189 /// unbound, a state can also usually be used as a solution by treating all

167 }	190 /// unbound rules as unsplit. (The usually is because a state that constrains

168	191 /// an unbound rule to split can't be used with that rule unsplit.)

169 var prefix = new LinePrefix(firstLineIndent + _blockIndentation,	192 ///

170 flushLeft: flushLeft);	193 /// From a given solve state, we can explore the search tree to more refined

171 var solution = new SplitSolution(prefix);	194 /// solve states by producing new ones that add more bound rules to the current

172 _tryChunkRuleValues(solution, prefix);	195 /// state.

173 return solution;	196 class SolveState implements Comparable<SolveState> {

174 }	197 final LineSplitter _splitter;

175	198 final RuleSet _ruleValues;

176 /// Finds the best set of splits to apply to the remainder of the chunks	199

177 /// following [prefix].	200 /// The unbound rules in this state that can be bound to produce new more

178 ///	201 /// refined states.

179 /// This can only be called for a suffix that begins a new line. (In other	202 ///

180 /// words, the last chunk in the prefix cannot be unsplit.)	203 /// Keeping this set small is the key to make the entire line splitter

181 SplitSet _findBestSplits(LinePrefix prefix) {	204 /// perform well. If we consider too make rules at each state, our

182 // Use the memoized result if we have it.	205 /// exploration of the solution space is too branchy and we waste time on

183 if (_bestSplits.containsKey(prefix)) {	206 /// dead end solutions.

184 if (debug.traceSplitter) {	207 ///

185 debug.log("memoized splits for $prefix = ${_bestSplits[prefix]}");	208 /// Here is the key insight. The line splitter treats any unbound rule as

186 }	209 /// being unsplit. This means refining a solution always means taking a rule

187 return _bestSplits[prefix];	210 /// that is unsplit and making it split. That monotonically increases the

188 }	211 /// cost, but may help fit the solution inside the page.

189	212 ///

190 if (debug.traceSplitter) {	213 /// We want to keep the cost low, so the only reason to consider making a

191 debug.log("find splits for $prefix");	214 /// rule split is if it reduces an overflowing line. It's also the case that

192 debug.indent();	215 /// splitting an earlier rule will often reshuffle the rest of the line.

193 }	216 ///

194	217 /// Taking that into account, the only rules we consider binding to extend a

195 var solution = new SplitSolution(prefix);	218 /// solve state are unbound rules inside the first line that is overflowing.

196 _tryChunkRuleValues(solution, prefix);	219 /// Even if a line has dozens of rules, this generally keeps the branching

197	220 /// down to a few. It also means rules inside lines that already fit are

198 if (debug.traceSplitter) {	221 /// never touched.

199 debug.unindent();	222 ///

200 debug.log("best splits for $prefix = ${solution.splits}");	223 /// There is one other set of rules that go in here. Sometimes a bound rule

201 }	224 /// in the solve state constrains some other unbound rule to split. In that

202	225 /// case, we also consider that active so we know to not leave it at zero.

203 return _bestSplits[prefix] = solution.splits;	226 final _liveRules = new Set<Rule>();

204 }	227

205	228 /// The set of splits chosen for this state.

206 /// Updates [solution] with the best rule value selection for the chunk	229 SplitSet get splits => _splits;

207 /// immediately following [prefix].	230 SplitSet _splits;

208 void _tryChunkRuleValues(SplitSolution solution, LinePrefix prefix) {	231

209 // If we made it to the end, this prefix can be solved without splitting	232 /// The number of characters that do not fit inside the page with this set of

210 // any chunks.	233 /// splits.

211 if (prefix.length == _chunks.length - 1) {	234 int get overflowChars => _overflowChars;

212 solution.update(this, new SplitSet());	235 int _overflowChars;

213 return;	236

214 }	237 /// Whether we can treat this state as a complete solution by leaving its

215	238 /// unbound rules unsplit.

216 var chunk = _chunks[prefix.length];	239 ///

217	240 /// This is generally true but will be false if the state contains any

218 // See if we've already selected a value for the rule.	241 /// unbound rules that are constrained to not be zero by other bound rules.

219 var value = prefix.ruleValues[chunk.rule];	242 /// This avoids picking a solution that leaves those rules at zero when they

220	243 /// aren't allowed to be.

221 if (value == null) {	244 bool _isComplete = true;

222 // No, so try every possible value for the rule.	245

223 for (value = 0; value < chunk.rule.numValues; value++) {	246 SolveState(this._splitter, this._ruleValues) {

224 _tryRuleValue(solution, prefix, value);	247 _calculateSplits();

225 }	248 _calculateCost();

226 } else if (value == -1) {	249 }

227 // A -1 "value" means, "any non-zero value". In other words, the rule has	250

228 // to split somehow, but can split however it chooses.	251 /// Orders this state relative to [other].

229 for (value = 1; value < chunk.rule.numValues; value++) {	252 ///

230 _tryRuleValue(solution, prefix, value);	253 /// This is the best-first ordering that the [LineSplitter] uses in its

231 }	254 /// worklist. It prefers cheaper states even if they overflow because this

232 } else {	255 /// ensures it finds the best solution first as soon as it finds one that

233 // Otherwise, it's constrained to a single value, so use it.	256 /// fits in the page so it can early out.

234 _tryRuleValue(solution, prefix, value);	257 int compareTo(SolveState other) {

235 }	258 // TODO(rnystrom): It may be worth sorting by the estimated lowest number

236 }	259 // of overflow characters first. That doesn't help in cases where there is

237	260 // a solution that fits, but may help in corner cases where there is no

238 /// Updates [solution] with the best solution that can be found by setting	261 // fitting solution.

239 /// the chunk after [prefix]'s rule to [value].	262

240 void _tryRuleValue(SplitSolution solution, LinePrefix prefix, int value) {	263 if (splits.cost != other.splits.cost) {

241 // If we already have an unbeatable solution, don't bother trying this one.	264 return splits.cost.compareTo(other.splits.cost);

242 if (solution.cost == 0) return;	265 }

243	266

244 var chunk = _chunks[prefix.length];	267 if (overflowChars != other.overflowChars) {

245	268 return overflowChars.compareTo(other.overflowChars);

246 if (chunk.rule.isSplit(value, chunk)) {	269 }

247 // The chunk is splitting in an expression, so try all of the possible	270

248 // nesting combinations.	271 // Distinguish states based on the rule values just so that states with the

249 var ruleValues = _advancePrefix(prefix, value);	272 // same cost range but different rule values don't get considered identical

250 var longerPrefixes = prefix.split(chunk, _blockIndentation, ruleValues);	273 // by HeapPriorityQueue.

251 for (var longerPrefix in longerPrefixes) {	274 for (var rule in _splitter._rules) {

252 _tryLongerPrefix(solution, prefix, longerPrefix);	275 var value = _ruleValues.getValue(rule);

253	276 var otherValue = other._ruleValues.getValue(rule);

254 // If we find an unbeatable solution, don't try any more.	277

255 if (solution.cost == 0) break;	278 if (value != otherValue) return value.compareTo(otherValue);

256 }	279 }

257 } else {	280

258 // We didn't split here, so add this chunk and its rule value to the	281 // If we get here, this state is identical to [other].

259 // prefix and continue on to the next.	282 return 0;

260 var extended = prefix.extend(_advancePrefix(prefix, value));	283 }

261 _tryChunkRuleValues(solution, extended);	284

262 }	285 /// Returns `true` if this state is a better solution to use as the final

263 }	286 /// result than [other].

264	287 bool isBetterThan(SolveState other) {

265 /// Updates [solution] with the solution for [prefix] assuming it uses	288 // If this state contains an unbound rule that we know can't be left

266 /// [longerPrefix] for the next chunk.	289 // unsplit, we can't pick this as a solution.

267 void _tryLongerPrefix(	290 if (!_isComplete) return false;

268 SplitSolution solution, LinePrefix prefix, LinePrefix longerPrefix) {	291

269 var remaining = _findBestSplits(longerPrefix);	292 // Anything is better than nothing.

270	293 if (other == null) return true;

271 // If it wasn't possible to split the suffix given this nesting stack,	294

272 // skip it.	295 // Prefer the solution that fits the most in the page.

273 if (remaining == null) return;	296 if (overflowChars != other.overflowChars) {

274	297 return overflowChars < other.overflowChars;

275 solution.update(this, remaining.add(prefix.length, longerPrefix.column));	298 }

276 }	299

277	300 // Otherwise, prefer the best cost.

278 /// Determines the set of rule values for a new [LinePrefix] one chunk longer	301 return splits.cost < other.splits.cost;

279 /// than [prefix] whose rule on the new last chunk has [value].	302 }

280 ///	303

281 /// Returns a map of [Rule]s to values for those rules for the values that	304 /// Enqueues more solve states to consider based on this one.

282 /// span the prefix and suffix of the [LinePrefix].	305 ///

283 Map<Rule, int> _advancePrefix(LinePrefix prefix, int value) {	306 /// For each unbound rule in this state that occurred in the first long line,

284 // Get the rules that appear in both in and after the new prefix. These are	307 /// enqueue solve states that bind that rule to each value it can have and

285 // the rules that already have values that the suffix needs to honor.	308 /// bind all previous rules to zero. (In other words, try all subsolutions

286 var prefixRules = _prefixRules[prefix.length + 1];	309 /// where that rule becomes the first new rule to split at.)

287 var suffixRules = _suffixRules[prefix.length + 1];	310 void expand() {

288	311 var unsplitRules = _ruleValues.clone();

289 var nextRule = _chunks[prefix.length].rule;	312

290 var updatedValues = {};	313 // Walk down the rules looking for unbound ones to try.

291	314 var triedRules = 0;

292 for (var prefixRule in prefixRules) {	315 for (var rule in _splitter._rules) {

293 var ruleValue =	316 if (_liveRules.contains(rule)) {

294 prefixRule == nextRule ? value : prefix.ruleValues[prefixRule];	317 // We found one worth trying, so try all of its values.

295	318 for (var value = 1; value < rule.numValues; value++) {

296 if (suffixRules.contains(prefixRule)) {	319 var boundRules = unsplitRules.clone();

297 // If the same rule appears in both the prefix and suffix, then preserve	320

298 // its exact value.	321 var mustSplitRules;

299 updatedValues[prefixRule] = ruleValue;	322 var valid = boundRules.tryBind(_splitter._rules, rule, value, (rule) {

300 }	323 if (mustSplitRules == null) mustSplitRules = [];

301	324 mustSplitRules.add(rule);

302 // If we haven't specified any value for this rule in the prefix, it	325 });

303 // doesn't place any constraint on the suffix.	326

304 if (ruleValue == null) continue;	327 // Make sure we don't violate the constraints of the bound rules.

305	328 if (!valid) continue;

306 // Enforce the constraints between rules.	329

307 for (var suffixRule in suffixRules) {	330 var state = new SolveState(_splitter, boundRules);

308 if (suffixRule == prefixRule) continue;	331

309	332 // If some unbound rules are constrained to split, remember that.

310 // See if the prefix rule's value constrains any values in the suffix.	333 if (mustSplitRules != null) {

311 var value = prefixRule.constrain(ruleValue, suffixRule);	334 state._isComplete = false;

312	335 state._liveRules.addAll(mustSplitRules);

313 // Also consider the backwards case, where a later rule in the suffix	336 }

314 // constrains a rule in the prefix.	337

315 if (value == null) {	338 _splitter._workList.add(state);

316 value = suffixRule.reverseConstrain(ruleValue, prefixRule);

317 }	339 }

318	340

319 if (value != null) {	341 // Stop once we've tried all of the ones we can.

320 updatedValues[prefixRule] = ruleValue;	342 if (++triedRules == _liveRules.length) break;

321 updatedValues[suffixRule] = value;	343 }

	344

	345 // Fill in previous unbound rules with zero.

	346 if (!_ruleValues.contains(rule)) {

	347 // Pass a dummy callback because zero will never fail. (If it would

	348 // have, that rule would already be bound to some other value.)

	349 if (!unsplitRules.tryBind(_splitter._rules, rule, 0, (_) {})) {

	350 break;

322 }	351 }

323 }	352 }

324 }	353 }

325	354 }

326 return updatedValues;	355

	356 /// Calculates the [SplitSet] for this solve state, assuming any unbound

	357 /// rules are set to zero.

	358 void _calculateSplits() {

	359 // Figure out which expression nesting levels got split and need to be

	360 // assigned columns.

	361 var usedNestingLevels = new Set();

	362 for (var i = 0; i < _splitter._chunks.length - 1; i++) {

	363 var chunk = _splitter._chunks[i];

	364 if (chunk.rule.isSplit(_getValue(chunk.rule), chunk)) {

	365 usedNestingLevels.add(chunk.nesting);

	366 chunk.nesting.clearTotalUsedIndent();

	367 }

	368 }

	369

	370 for (var nesting in usedNestingLevels) {

	371 nesting.refreshTotalUsedIndent(usedNestingLevels);

	372 }

	373

	374 _splits = new SplitSet(_splitter._chunks.length);

	375 for (var i = 0; i < _splitter._chunks.length - 1; i++) {

	376 var chunk = _splitter._chunks[i];

	377 if (chunk.rule.isSplit(_getValue(chunk.rule), chunk)) {

	378 var indent = 0;

	379 if (!chunk.flushLeftAfter) {

	380 // Add in the chunk's indent.

	381 indent = _splitter._blockIndentation + chunk.indent;

	382

	383 // And any expression nesting.

	384 indent += chunk.nesting.totalUsedIndent;

	385 }

	386

	387 _splits.add(i, indent);

	388 }

	389 }

	390 }

	391

	392 /// Gets the value to use for [rule], either the bound value or `0` if it

	393 /// isn't bound.

	394 int _getValue(Rule rule) {

	395 if (rule is HardSplitRule) return 0;

	396

	397 return _ruleValues.getValue(rule);

327 }	398 }

328	399

329 /// Evaluates the cost (i.e. the relative "badness") of splitting the line	400 /// Evaluates the cost (i.e. the relative "badness") of splitting the line

330 /// into [lines] physical lines based on the current set of rules.	401 /// into [lines] physical lines based on the current set of rules.

331 int _evaluateCost(LinePrefix prefix, SplitSet splits) {	402 void _calculateCost() {

332 assert(splits != null);	403 assert(_splits != null);

333	404

334 // Calculate the length of each line and apply the cost of any spans that	405 // Calculate the length of each line and apply the cost of any spans that

335 // get split.	406 // get split.

336 var cost = 0;	407 var cost = 0;

337 var length = prefix.column;	408 _overflowChars = 0;

338	409

339 var splitRules = new Set();	410 var length = _splitter._firstLineIndent;

340	411

341 endLine() {	412 // The unbound rules in use by the current line. This will be null after

342 // Punish lines that went over the length. We don't rule these out	413 // the first long line has completed.

343 // completely because it may be that the only solution still goes over	414 var currentLineRules = [];

344 // (for example with long string literals).	415

345 if (length > _writer.pageWidth) {	416 endLine(int end) {

346 cost += (length - _writer.pageWidth) * Cost.overflowChar;	417 // Track lines that went over the length. It is only rules contained in

347 }	418 // long lines that we may want to split.

348 }	419 if (length > _splitter._writer.pageWidth) {

349	420 _overflowChars += length - _splitter._writer.pageWidth;

	421

	422 // Only try rules that are in the first long line, since we know at

	423 // least one of them will be split.

	424 if (currentLineRules != null && currentLineRules.isNotEmpty) {

	425 _liveRules.addAll(currentLineRules);

	426 currentLineRules = null;

	427 }

	428 } else {

	429 // The line fit, so don't keep track of its rules.

	430 if (currentLineRules != null) {

	431 currentLineRules.clear();

	432 }

	433 }

	434 }

	435

350 // The set of spans that contain chunks that ended up splitting. We store	436 // The set of spans that contain chunks that ended up splitting. We store

351 // these in a set so a span's cost doesn't get double-counted if more than	437 // these in a set so a span's cost doesn't get double-counted if more than

352 // one split occurs in it.	438 // one split occurs in it.

353 var splitSpans = new Set();	439 var splitSpans = new Set();

354	440

355 for (var i = prefix.length; i < _chunks.length; i++) {	441 for (var i = 0; i < _splitter._chunks.length; i++) {

356 var chunk = _chunks[i];	442 var chunk = _splitter._chunks[i];

357	443

358 length += chunk.text.length;	444 length += chunk.text.length;

359	445

360 if (i < _chunks.length - 1) {	446 // Ignore the split after the last chunk.

361 if (splits.shouldSplitAt(i)) {	447 if (i == _splitter._chunks.length - 1) break;

362 endLine();

363	448

364 splitSpans.addAll(chunk.spans);	449 if (_splits.shouldSplitAt(i)) {

	450 endLine(i);

365	451

366 if (chunk.rule != null && !splitRules.contains(chunk.rule)) {	452 splitSpans.addAll(chunk.spans);

367 // Don't double-count rules if multiple splits share the same

368 // rule.

369 splitRules.add(chunk.rule);

370 cost += chunk.rule.cost;

371 }

372	453

373 // Include the cost of the nested block.	454 // Include the cost of the nested block.

374 if (chunk.blockChunks.isNotEmpty) {	455 if (chunk.blockChunks.isNotEmpty) {

375 cost += _writer.formatBlock(chunk, splits.getColumn(i)).cost;	456 cost +=

376 }	457 _splitter._writer.formatBlock(chunk, _splits.getColumn(i)).cost;

	458 }

377	459

378 // Start the new line.	460 // Start the new line.

379 length = splits.getColumn(i);	461 length = _splits.getColumn(i);

380 } else {	462 } else {

381 if (chunk.spaceWhenUnsplit) length++;	463 if (chunk.spaceWhenUnsplit) length++;

382	464

383 // Include the nested block inline, if any.	465 // Include the nested block inline, if any.

384 length += chunk.unsplitBlockLength;	466 length += chunk.unsplitBlockLength;

	467

	468 // If we might be in the first overly long line, keep track of any

	469 // unbound rules we encounter. These are ones that we'll want to try to

	470 // bind to shorten the long line.

	471 if (currentLineRules != null &&

	472 chunk.rule != null &&

	473 !chunk.isHardSplit &&

	474 !_ruleValues.contains(chunk.rule)) {

	475 currentLineRules.add(chunk.rule);

385 }	476 }

386 }	477 }

387 }	478 }

388	479

	480 // Add the costs for the rules that split.

	481 _ruleValues.forEach(_splitter._rules, (rule, value) {

	482 // A rule may be bound to zero if another rule constrains it to not split.

	483 if (value != 0) cost += rule.cost;

	484 });

	485

389 // Add the costs for the spans containing splits.	486 // Add the costs for the spans containing splits.

390 for (var span in splitSpans) cost += span.cost;	487 for (var span in splitSpans) cost += span.cost;

391	488

392 // Finish the last line.	489 // Finish the last line.

393 endLine();	490 endLine(_splitter._chunks.length);

394	491

395 return cost;	492 _splits.setCost(cost);

	493 }

	494

	495 String toString() {

	496 var buffer = new StringBuffer();

	497

	498 buffer.writeAll(

	499 _splitter._rules.map((rule) {

	500 var valueLength = "${rule.fullySplitValue}".length;

	501

	502 var value = "?";

	503 if (_ruleValues.contains(rule)) {

	504 value = "${_ruleValues.getValue(rule)}";

	505 }

	506

	507 value = value.padLeft(valueLength);

	508 if (_liveRules.contains(rule)) {

	509 value = debug.bold(value);

	510 } else {

	511 value = debug.gray(value);

	512 }

	513

	514 return value;

	515 }),

	516 " ");

	517

	518 buffer.write(" \$${splits.cost}");

	519

	520 if (overflowChars > 0) buffer.write(" (${overflowChars} over)");

	521 if (!_isComplete) buffer.write(" (incomplete)");

	522 if (splits == null) buffer.write(" invalid");

	523

	524 return buffer.toString();

396 }	525 }

397 }	526 }

398

399 /// Keeps track of the best set of splits found so far for a suffix of some

400 /// prefix.

401 class SplitSolution {

402 /// The prefix whose suffix we are finding a solution for.

403 final LinePrefix _prefix;

404

405 /// The best set of splits currently found.

406 SplitSet get splits => _bestSplits;

407 SplitSet _bestSplits;

408

409 /// The lowest cost currently found.

410 int get cost => _lowestCost;

411 int _lowestCost;

412

413 /// Whether a solution that fits within a page has been found yet.

414 bool get isAdequate => _lowestCost != null && _lowestCost < Cost.overflowChar;

415

416 SplitSolution(this._prefix);

417

418 /// Compares [splits] to the best solution found so far and keeps it if it's

419 /// better.

420 void update(LineSplitter splitter, SplitSet splits) {

421 var cost = splitter._evaluateCost(_prefix, splits);

422

423 if (_lowestCost == null \|\| cost < _lowestCost) {

424 _bestSplits = splits;

425 _lowestCost = cost;

426 }

427

428 if (debug.traceSplitter) {

429 var best = _bestSplits == splits ? " (best)" : "";

430 debug.log(debug.gray("$_prefix $splits \$$cost$best"));

431 debug.dumpLines(splitter._chunks, _prefix, splits);

432 debug.log();

433 }

434 }

435 }

436

437 /// An immutable, persistent set of enabled split [Chunk]s.

438 ///

439 /// For each chunk, this tracks if it has been split and, if so, what the

440 /// chosen column is for the following line.

441 ///

442 /// Internally, this uses a sparse parallel list where each element corresponds

443 /// to the column of the chunk at that index in the chunk list, or `null` if

444 /// there is no active split there. This had about a 10% perf improvement over

445 /// using a [Set] of splits or a persistent linked list of split index/indent

446 /// pairs.

447 class SplitSet {

448 List<int> _columns;

449

450 /// Creates a new empty split set.

451 SplitSet() : this._(const []);

452

453 SplitSet._(this._columns);

454

455 /// Returns a new [SplitSet] containing the union of this one and the split

456 /// at [index] with next line starting at [column].

457 SplitSet add(int index, int column) {

458 var newIndents = new List(math.max(index + 1, _columns.length));

459 newIndents.setAll(0, _columns);

460 newIndents[index] = column;

461

462 return new SplitSet._(newIndents);

463 }

464

465 /// Returns `true` if the chunk at [splitIndex] should be split.

466 bool shouldSplitAt(int index) =>

467 index < _columns.length && _columns[index] != null;

468

469 /// Gets the zero-based starting column for the chunk at [index].

470 int getColumn(int index) => _columns[index];

471

472 String toString() {

473 var result = [];

474 for (var i = 0; i < _columns.length; i++) {

475 if (_columns[i] != null) {

476 result.add("$i:${_columns[i]}");

477 }

478 }

479

480 return result.join(" ");

481 }

482 }

OLD	NEW

« no previous file with comments | « lib/src/line_prefix.dart ('k') | lib/src/line_writer.dart » ('j') | no next file with comments »