Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not end paragraph before ::: in fenced divs #428

Merged
merged 2 commits into from
Apr 3, 2024
Merged

Conversation

Witiko
Copy link
Owner

@Witiko Witiko commented Mar 21, 2024

Closes #407.

Tasks

@Witiko Witiko added bug lua Related to the Lua interface and implementation syntax extension Related to syntax extensions and dialects of markdown labels Mar 21, 2024
@Witiko Witiko added this to the 3.5.0 milestone Mar 21, 2024
@Witiko
Copy link
Owner Author

Witiko commented Mar 25, 2024

I pressed on using the original approach of @Omikhleia, which relied on buffering the contents of a fenced div before parsing them. However, I realized that buffering the body of a fenced div would have so many different corner cases just to properly handle nested fenced divs and fenced code blocks that this approach is not worth the effort. I stashed my changes here.

Instead, I mapped out the viable approach that we should implement. Our issue is with the handling of trailing div fences when the blankBeforeFencedDiv option has been disabled, which it is by default:

markdown/markdown.dtx

Lines 30960 to 30981 in 415379f

% If the `blank_before_div_fence` parameter is `false`, we will have the
% closing div at the beginning of a line break the current paragraph if
% we are currently nested in a div.
%
% \end{markdown}
% \begin{macrocode}
local function check_div_level(s, i, current_level) -- luacheck: ignore s i
current_level = tonumber(current_level)
return current_level > 0
end
local is_inside_div = Cmt(Cb("div_level"), check_div_level)
local fencestart = is_inside_div * fenced_div_end
if not blank_before_div_fence then
self.update_rule("EndlineExceptions", function(previous_pattern)
if previous_pattern == nil then
previous_pattern = parsers.EndlineExceptions
end
return previous_pattern + fencestart
end)
end

Namely, the pattern is_inside_div is not sufficient to decide whether we should break a paragraph before a line that looks like the trailing div fence. This is because the line may be nested in a block-level element and will not end a fenced div, see the example from #403 (comment):

::: {#some-identifier}
This is the beginning of a div

> This is a blockquote that contains three colons:
> :::

This is the end of a div
:::

Here, the line > ::: will break the paragraph but it will not end the fenced div. This causes us to incorrectly produce a paragraph break before the three colons rather than the soft line break that we would expect and that is produced when we set the option blankBeforeFencedDiv to true.

In #427, @lostenderman followed my suggestion and tried to fix the issue by tracking whether we are on the top level of a fenced div or nested in a block-level element using a named capture group is_top_level. This allowed us to strengthen the condition for breaking a paragraph by requiring that not only are we in a fenced div but we are on the top level of a fenced div.

However, this approach does not work well with nested fenced divs, which would require that the named capture group is a stack rather than a scalar, so that we can know whether we are still at the top level of a parent fenced div when we have emerged from a nested fenced div. More crucially, this approach does not fix the issue for trailing lines of blockquotes and other elements such as list items, which should be able to close a div but are technically still nested in a block-level element:

::: {#some-identifier}
> This is a blockquote followed by the end of a div:
:::

Therefore, we need a different approach.

A more relaxed condition for the closing div fence is that it has the same indent level as the opening div fence. Currently, we use the function increment_div_level() to store the current nesting level inside the named capture group div_level. This is awkward, because we keep converting between a string and an integer, and also limiting, because we can only store a scalar value efficiently. Instead, we may use a similar hack as @lostenderman used in #226 for the named capture group indent_info that carries an anonymous Lua table instead of a string. We can use the same technique to make the named capture group div_level from a string that stores an integer into an array of opening indent levels for each level of fenced div nesting. Then, we would only break a paragraph when the closing div fence would have the same indent level. We should be able to tease this information out of the named capture group indent_info.

if fenced_div_indent_table.indents ~= nil then
num_opening_indents = #fenced_div_indent_table.indents
end
return num_current_indents == num_opening_indents
Copy link
Owner Author

@Witiko Witiko Mar 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lostenderman This feels like the conceptually correct approach. However, num_current_indents is currently still wrong for e.g. lazy continuation lines:

::: {#an-identifier}
> a
:::  <!-- Here, `num_current_indents` is still 1 but `num_opening_indents` is 0. -->

This is also corroborated by the CI, which fails on the single test case in fenced-divs.test that involves what-would-be lazy continuation lines if the fenced div syntax extension were disabled:

::: {#another-identifier}
> This is a blockquote immediately in a
> div
:::

I will appreciate your advice on refining the current approach. In the Endline pattern, we call the parsers.check_minimal_indent and parsers.check_optional_indent patterns before the EnclineExceptions pattern and these update indent_table.trail with information about the current line. Perhaps we could use indent_table.trail to better determine num_current_indents? I am still not comfortable around the indent-info / trail-info objects and how to interpret their properties and will appreciate a nudge in the right direction (or a commit).

Copy link
Owner Author

@Witiko Witiko Mar 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lostenderman If you are busy, I will likely figure it out eventually at this point. However, you would likely be much more efficient at this, since the indent-info / trail-info objects are your code. I will appreciate your insight.

@Witiko Witiko requested a review from lostenderman March 26, 2024 00:24
% \end{markdown}
% \begin{macrocode}
self.initialize_named_group("div_level", "0")
% Initialize a named group named `fenced_div_level` for tracking how deep
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
% Initialize a named group named `fenced_div_level` for tracking how deep
% Initialize a named group `fenced_div_level` for tracking how deep

@Witiko Witiko force-pushed the main branch 2 times, most recently from 4cf1e86 to bd7ac0a Compare April 1, 2024 13:24
@Witiko Witiko merged commit 32b52ba into main Apr 3, 2024
11 of 14 checks passed
@Witiko Witiko deleted the fix/fenced-divs branch June 16, 2024 14:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug lua Related to the Lua interface and implementation syntax extension Related to syntax extensions and dialects of markdown
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Do not end paragraph before ::: in fenced divs
1 participant