The previous paragraph layout algorithm had a couple of flaws:
- It always produced line break opportunities between runs although on
the textual level there might have been none.
- It didn't handle trailing spacing correctly in some cases.
- It wouldn't have been easily adaptable to Knuth-Plass style optimal
line breaking because it was fundamentally structured first-fit
run-by-run.
The new paragraph layout algorithm fixes these flaws. It proceeds
roughly in the following stages:
1. Collect all text in the paragraph.
2. Compute BiDi embedding levels.
3. Shape all runs, lay out all children and store the resulting items in
a reusable (possibly even cacheable) `ParLayout`.
4. Iterate over all line breaks in the concatenated text.
5. Construct lightweight `LineLayout` objects for full lines instead of
runs. These mostly borrow from the `ParLayout` and only reshape the
first and last run if necessary. The design allows using HarfBuzz's
UNSAFE_TO_BREAK mechanism to make reshaping more efficient. The size
of a `LineLayout` can be measured without building the line's frame.
6. Build only the selected lines' frames and stack them.
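The breaking and measuring stages above can be illustrated with a minimal first-fit sketch. It is not the actual implementation: `measure` and `break_lines` are hypothetical names, every character is assumed to be one unit wide, and break opportunities are simplified to "after each ASCII space". What it shows is that breaking works on the whole paragraph text rather than run by run, and that trailing spaces do not count toward a line's width.

```rust
/// Hypothetical stand-in for measuring a shaped line: every char is one
/// unit wide and trailing whitespace is ignored. A real implementation
/// would sum glyph advances instead.
fn measure(line: &str) -> usize {
    line.trim_end().chars().count()
}

/// First-fit line breaking over the concatenated paragraph text.
/// Assumption: a break opportunity exists after each ASCII space.
fn break_lines(text: &str, max_width: usize) -> Vec<&str> {
    let mut lines = Vec::new();
    let mut start = 0; // byte offset where the current line begins
    let mut last_fit = None; // most recent break opportunity seen

    // Candidate break points: after each space and at the end of the text.
    let breaks = text
        .match_indices(' ')
        .map(|(i, _)| i + 1)
        .chain(std::iter::once(text.len()));

    for end in breaks {
        // If the line up to this candidate is too wide, break at the
        // previous opportunity (if there is one on this line).
        if measure(&text[start..end]) > max_width {
            if let Some(fit) = last_fit.take() {
                lines.push(text[start..fit].trim_end());
                start = fit;
            }
        }
        last_fit = Some(end);
    }

    // Emit the remainder, but avoid a spurious empty trailing line.
    if start < text.len() || lines.is_empty() {
        lines.push(text[start..].trim_end());
    }
    lines
}
```

For example, `break_lines("the quick brown fox", 10)` yields the two lines `"the quick"` and `"brown fox"`. A word wider than `max_width` is simply left to overflow in this sketch.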
This creates a smaller state machine helper type for softness coalescing, which does not own the resulting nodes. While this creates a bit more duplication in stack and par builder, it makes it a lot easier to integrate additional logic into the paragraph builder.
Furthermore:
- Line breaks are now "hard", that is, not coalesced with each other.
- Text nodes with equal style are now merged, allowing, for example, `f{}i` to form a ligature.
- New naming scheme
  - TextNode instead of NodeText
  - CallExpr instead of ExprCall
  - ...
- Less glob imports
- Removes Value::Args variant
- Removes prelude
- Renames Layouted to Fragment
- Moves font into env
- Moves shaping into layout
- Moves frame into separate module
Adds top-edge and bottom-edge parameters to the font function. These define how
the box around a word is computed. The possible values are:
- ascender
- cap-height (default top edge)
- x-height
- baseline (default bottom edge)
- descender
The defaults are chosen so that it's easy to create good-looking designs with
vertical alignment. Since they are much tighter than what most other software
uses by default, the default leading had to be increased to 50% of the font size
and paragraph spacing to 100% of the font size.
The values cap-height and x-height fall back to ascender in case they are zero
because this value may occur in fonts that don't have glyphs with cap- or
x-height (like Twitter Color Emoji). Since cap-height is the default top edge,
doing no fallback would break things badly.
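The edge resolution and the fallback described above could look roughly like the following sketch. The type and method names (`VerticalFontMetric`, `FontMetrics`, `resolve`, `box_height`) and the em-based units are assumptions made for illustration, not the actual API.

```rust
/// Metrics a text edge can refer to, mirroring the values listed above.
#[derive(Clone, Copy, Debug, PartialEq)]
enum VerticalFontMetric {
    Ascender,
    CapHeight,
    XHeight,
    Baseline,
    Descender,
}

/// Hypothetical font metrics in em units, measured from the baseline.
struct FontMetrics {
    ascender: f64,
    cap_height: f64,
    x_height: f64,
    descender: f64,
}

impl FontMetrics {
    /// Resolves an edge to a distance from the baseline. A cap-height or
    /// x-height of zero falls back to the ascender.
    fn resolve(&self, edge: VerticalFontMetric) -> f64 {
        let or_ascender = |v: f64| if v == 0.0 { self.ascender } else { v };
        match edge {
            VerticalFontMetric::Ascender => self.ascender,
            VerticalFontMetric::CapHeight => or_ascender(self.cap_height),
            VerticalFontMetric::XHeight => or_ascender(self.x_height),
            VerticalFontMetric::Baseline => 0.0,
            VerticalFontMetric::Descender => self.descender,
        }
    }

    /// Height of the box spanned by the given top and bottom edges.
    fn box_height(&self, top: VerticalFontMetric, bottom: VerticalFontMetric) -> f64 {
        self.resolve(top) - self.resolve(bottom)
    }
}
```

With the default edges (cap-height and baseline), a font whose cap-height is zero would otherwise produce a box of height zero. With the fallback it gets the ascender height instead.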
Removes softness in favor of a simple boolean for pages and a more fine-grained u8
for spacing. This is needed to make paragraph spacing consume line spacing
created by hard line breaks.
This makes expansion behaviour inheritable by placing it into the area and passing it down during layouting instead of computing some approximation of what we want during execution.
- Only add line spacing between lines. Previously, line spacing was added below
every line, making `#box[word]` higher than just `word`.
- Compute box height of text as `ascender - descender` so that the full word is
contained in the box.
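As a small sketch of the two rules above (function names are hypothetical, and the descender is assumed to be negative, as is common for font metrics):

```rust
/// Total height of stacked lines when spacing is only added *between*
/// lines, so a single line is exactly as tall as its own box.
fn stack_height(line_heights: &[f64], line_spacing: f64) -> f64 {
    let gaps = line_heights.len().saturating_sub(1) as f64;
    line_heights.iter().sum::<f64>() + gaps * line_spacing
}

/// Box height of text computed as `ascender - descender`.
/// Assumption: `descender` is negative (it lies below the baseline).
fn text_box_height(ascender: f64, descender: f64) -> f64 {
    ascender - descender
}
```

With this rule, `stack_height(&[10.0], 5.0)` is `10.0`, whereas adding the spacing below every line would have given `15.0`.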
The name run was a relic of the time when a line consisted of a set of runs with the same alignment. While these runs still exist conceptually, they are all stored flatly together in what has now been renamed from `run` to `line`.
- Make page break behaviour more consistent
- Allow skipping reference image testing for single tests with `// compare-ref: false` (useful for tests which only check error messages)
This basically reverts the earlier change from parbreaks to par nodes because:
- It is simpler and less nested
- It works way better with functions that layout their body inline like `font`, which were buggy before
The original reasons for changing to par nodes were:
- the envisioned design of the layouter at that time (based on dynamic nodes etc.), which is not relevant anymore
- possibly existing benefits with regards to incremental compilation, which are unsure and outweighed by the immediate benefits of the parbreak-representation
- Refactors the tokenizer to be lazy: It does not emit pre-parsed function tokens, but instead allows its mode to be changed. The modes are tracked on a stack to allow nested compute/typesetting (pop/push).
- Introduces delimited groups into the parser, which make it easy to parse delimited expressions without handling the delimiters in the parsing code for the group's content. A group is started with `start_group`. When reaching the group's end (matching delimiter) the eat and peek methods will simply return `None` instead of the delimiter, stopping the content parser and bubbling up the call stack until `end_group` is called to clear up the situation.
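The delimited-group mechanism can be sketched as follows. This is a simplified illustration, not the real parser: the `Token` type, the `groups` stack and the restriction to parentheses are assumptions; only the method names `start_group`, `end_group`, `eat` and `peek` come from the description above.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Token {
    LeftParen,
    RightParen,
    Comma,
    Num(i64),
}

struct Parser {
    tokens: Vec<Token>,
    pos: usize,
    /// Closing delimiters of the currently open groups (innermost last).
    groups: Vec<Token>,
}

impl Parser {
    fn new(tokens: Vec<Token>) -> Self {
        Self { tokens, pos: 0, groups: Vec::new() }
    }

    /// Returns the next token, or `None` at the end of input or when the
    /// innermost group's closing delimiter has been reached.
    fn peek(&self) -> Option<Token> {
        let token = self.tokens.get(self.pos).copied()?;
        if self.groups.last() == Some(&token) { None } else { Some(token) }
    }

    /// Like `peek`, but also advances past the returned token.
    fn eat(&mut self) -> Option<Token> {
        let token = self.peek()?;
        self.pos += 1;
        Some(token)
    }

    /// Consumes the opening parenthesis and opens a group.
    fn start_group(&mut self) {
        assert_eq!(self.eat(), Some(Token::LeftParen));
        self.groups.push(Token::RightParen);
    }

    /// Closes the innermost group and consumes its closing delimiter,
    /// if the parser is positioned at it.
    fn end_group(&mut self) {
        let close = self.groups.pop().expect("no open group");
        if self.tokens.get(self.pos) == Some(&close) {
            self.pos += 1;
        }
    }
}

/// A content parser that knows nothing about delimiters: it simply runs
/// until `eat` returns `None`.
fn parse_numbers(p: &mut Parser) -> Vec<i64> {
    let mut nums = Vec::new();
    while let Some(token) = p.eat() {
        if let Token::Num(n) = token {
            nums.push(n);
        }
    }
    nums
}
```

For the token stream `( 1 , 2 ) 3`, calling `start_group`, then `parse_numbers`, then `end_group` collects `[1, 2]` and leaves the parser positioned at `3`. The content parser never sees the closing parenthesis.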
- In addition to syntax trees there are now `Value`s, which syntax trees can be evaluated into (e.g. the tree is `5+5` and the value is `10`)
- Parsing is completely pure, function calls are not parsed into nodes, but into simple call expressions, which are resolved later
- Functions aren't dynamic nodes anymore, but simply functions which receive their arguments as a table and the layouting context
- Functions may return any `Value`
- Layouting is powered by functions which return the new `Commands` value, which informs the layouting engine what to do
- When a function returns a non-`Commands` value, the layouter simply dumps the value into the document in monospace
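A minimal sketch of the tree-to-value evaluation and the `Commands` fallback described above. All type shapes here (`Expr`, `Value`, `Command`) and the helpers `eval` and `into_commands` are hypothetical simplifications; only the idea that `5+5` evaluates to `10` and that non-`Commands` values are dumped in monospace comes from the notes.

```rust
/// Hypothetical, heavily reduced syntax tree.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Int(i64),
    Add(Box<Expr>, Box<Expr>),
}

/// Hypothetical layouting command.
#[derive(Debug, Clone, PartialEq)]
enum Command {
    Monospace(String),
}

/// Values that syntax trees evaluate into and that functions may return.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Commands(Vec<Command>),
}

/// Evaluates a tree into a value, e.g. the tree `5+5` into the value `10`.
fn eval(expr: &Expr) -> Result<Value, String> {
    match expr {
        Expr::Int(n) => Ok(Value::Int(*n)),
        Expr::Add(a, b) => match (eval(a)?, eval(b)?) {
            (Value::Int(x), Value::Int(y)) => Ok(Value::Int(x + y)),
            (x, y) => Err(format!("cannot add {:?} and {:?}", x, y)),
        },
    }
}

/// Commands pass through unchanged; any other value is dumped into the
/// document as monospace text.
fn into_commands(value: Value) -> Vec<Command> {
    match value {
        Value::Commands(commands) => commands,
        Value::Int(n) => vec![Command::Monospace(n.to_string())],
    }
}
```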
- Use fontdock for indexing fonts and querying
- Typst binary now automatically indexes and uses system fonts in addition to a fixed font folder!
- Removes subsetting support for now (was half-finished anyways, plan is to use harfbuzz for subsetting in the future)
- Adds font width configuration support
- Problems -> Diagnostics
- Position -> Pos
- offset_spans -> Offset trait
- Size -> Length (and some more size types renamed)
- Paper into its own module
- scope::Parser -> parsing::CallParser
- Create `Decorations` alias
- Remove lots of double newlines
- Switch from f32 to f64
- Forced line breaks with backslash followed by whitespace
- (Multiline) raw text in backticks
- Set font class fallbacks with [font.family] (e.g. [font.family: monospace=("CMU Typewriter Text")])
- More sophisticated procedure to find end of function, which accounts for comments, strings, raw text and nested functions (this is a mix of a feature and a bug fix)
- Dynamic models instead of SyntaxTrees
- No more ParseResult/LayoutResult
- Errors and Decorations which are propagated to parent contexts
- Models are finally clonable