Preserve underscores in emphasis. #760

Merged
merged 3 commits into from
Aug 15, 2022

nojaf (Collaborator) commented Aug 8, 2022

Fixes #389.
This isn't an airtight fix but will work in the most common situations.
It worked for the Fantomas documentation.

nhirschey (Collaborator) commented

As written, this fails for Markdown emphasis next to commas, semicolons, and similar punctuation, as in the test below (you can test the spec here). I suggest a fix below that is a little tighter but still not perfect.

[<Test>]
let ``Underscore inside italic and bold near punctuation is preserved`` () =
    let doc = "This is **bold_bold**, and this _italic_; and _this_too_: again."

    let expected =
        "<p>This is <strong>bold_bold</strong>, and this <em>italic</em>; and <em>this_too</em>: again.</p>\r\n"
        |> properNewLines

    Markdown.ToHtml doc |> shouldEqual expected

(*
Actual:
"<p>This is **bold_bold**, and this _italic_; and _this_too_: again.</p>"
*)

It will pass if you instead don't end emphasis at an underscore that is immediately followed by a letter or number:

/// Succeeds when the specified character list starts with a letter or number
let inline (|AlphaNum|_|) input =
    let re = """^[a-zA-Z0-9]"""
    let match' = Regex.Match(Array.ofList input |> String, re)
    if match'.Success then
        let entity = match'.Value
        let _, rest = List.splitAt entity.Length input
        Some(char entity, rest)
    else
        None

/// Matches a list if it starts with a sub-list that is delimited
/// using the specified delimiters. Returns a wrapped list and the rest.
///
/// This is similar to `List.Delimited`, but it skips over escaped characters.
let (|DelimitedMarkdown|_|) bracket input =
    let _startl, endl = bracket, bracket
    // Like List.partitionUntilEquals, but skip over escaped characters
    let rec loop acc =
        function
        | EscapedChar (x, xs) -> loop (x :: '\\' :: acc) xs
        | input when List.startsWith endl input ->
            let rest = List.skip bracket.Length input
            match rest with
            // A delimiter followed by a letter or digit is literal text: keep scanning
            | AlphaNum (x, xs) -> loop (x :: endl @ acc) xs
            | _ -> Some (List.rev acc, input)
        | x :: xs -> loop (x :: acc) xs
        | [] -> None
    // If it starts with 'startl', let's search for 'endl'
    if List.startsWith bracket input then
        match loop [] (List.skip bracket.Length input) with
        | Some (pre, post) -> Some(pre, List.skip bracket.Length post)
        | None -> None
    else
        None
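
For anyone who wants to try the rule outside the parser, here is a minimal self-contained sketch. It is not the merged code: `startsWith` and `EscapedChar` are simplified stand-ins for the library helpers used above, `AlphaNum` uses plain character comparisons instead of a regex, and `tryEmphasis` is a hypothetical wrapper added only for experimenting with strings.

```fsharp
open System

/// Stand-in (assumption) for the library's `List.startsWith`.
let rec startsWith (prefix: char list) (input: char list) =
    match prefix, input with
    | [], _ -> true
    | p :: ps, x :: xs when p = x -> startsWith ps xs
    | _ -> false

/// Succeeds when the list starts with an ASCII letter or digit.
let (|AlphaNum|_|) (input: char list) =
    match input with
    | c :: rest when (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9') ->
        Some(c, rest)
    | _ -> None

/// Simplified stand-in (assumption): a backslash followed by any character.
let (|EscapedChar|_|) (input: char list) =
    match input with
    | '\\' :: c :: rest -> Some(c, rest)
    | _ -> None

/// Returns the delimited body and the text after the closing delimiter.
let (|DelimitedMarkdown|_|) (bracket: char list) (input: char list) =
    let rec loop acc =
        function
        | EscapedChar (x, xs) -> loop (x :: '\\' :: acc) xs
        | rest when startsWith bracket rest ->
            match List.skip bracket.Length rest with
            // Delimiter followed by a letter or digit: keep it as literal text
            | AlphaNum (x, xs) -> loop (x :: (List.rev bracket @ acc)) xs
            // Otherwise this is the closing delimiter
            | after -> Some(List.rev acc, after)
        | x :: xs -> loop (x :: acc) xs
        | [] -> None

    if startsWith bracket input then
        loop [] (List.skip bracket.Length input)
    else
        None

/// Hypothetical helper: string in, (body, rest) out.
let tryEmphasis (delimiter: string) (text: string) =
    let bracket = List.ofSeq delimiter
    match List.ofSeq text with
    | DelimitedMarkdown bracket (body, rest) ->
        Some(String(List.toArray body), String(List.toArray rest))
    | _ -> None

printfn "%A" (tryEmphasis "_" "_this_too_: again.")   // Some ("this_too", ": again.")
printfn "%A" (tryEmphasis "**" "**bold_bold**, and")  // Some ("bold_bold", ", and")
printfn "%A" (tryEmphasis "_" "snake_case")           // None
```

The sketch pushes the delimiter onto the accumulator reversed, which makes no difference for symmetric delimiters such as `_` or `**`. Like the suggestion above, it only looks at the character after the delimiter, so it remains an approximation of the emphasis rules in the spec.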

nojaf (Collaborator, Author) commented Aug 9, 2022

Thanks!

@dsyme dsyme merged commit a4a5c93 into fsprojects:main Aug 15, 2022
@nojaf nojaf deleted the fix-389 branch August 15, 2022 14:43

Successfully merging this pull request may close these issues.

Markdown parser gets multiple-underscores-inside-italics wrong