-
-
Notifications
You must be signed in to change notification settings - Fork 224
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extract askama_parser from askama_derive #782
Conversation
e9782e7
to
c58643c
Compare
Could you please split up the commit a little? Your explanations sound sensible, but it would be a bit easier to review if every change was its own commit (removing Span, making Config 'static again, moving the code). |
Seems like we can now create the parser outside of the proc macro so that looks good to me. 👍 |
ee1994a
to
109bb62
Compare
@Kijewski I've split things up here into hopefully bite-sized pieces. Please let me know if you there's anything else you think could use improvement. |
Upon further review, I'm not sure the cut is in exactly the right place here. Towit: |
9136054
to
51d49e8
Compare
Thanks for your patience @Kijewski @djc @GuillaumeGomez. I've re-split the crates per my previous comment, and I think the commit sequence looks good. Please take another look and let me know any feedback you have, thank you! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with most of the direction here, though I have bunch of (mostly stylistic) nits.
askama_derive/src/parser/syntax.rs
Outdated
@@ -0,0 +1,22 @@ | |||
#[derive(Debug)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like tiny modules like this. Please move it to the bottom of parser/mod.rs
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
askama_derive/src/parser/error.rs
Outdated
use std::borrow::Cow; | ||
use std::fmt; | ||
|
||
#[derive(Debug, Clone)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes no sense to stick CompileError
in parser
. We can just return Cow<'static, str>
directly in the Result
from parse()
, instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Roger that. I had not grokked that it's not used internally within the parser, just at that top-level parse
method. After further reflection, I'd prefer an error type with public accessors to get the line and column numbers and input snippet, and perhaps in the future additional details.
After looking at the three error cases returned by the parse
method, I believe they can be coalesced into a single case (logically, if not at the type level) with the combinators nom::combinator::all_consuming
and nom::compbinator::complete
, so I'd propose doing that as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with having a fully-formed Error
type, if you want to wire something up -- it just shouldn't be the CompileError
type that's used for all the other stuff in derive
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
askama_derive/src/generator.rs
Outdated
@@ -1,7 +1,8 @@ | |||
use crate::config::{get_template_source, read_config_file, Config, WhitespaceHandling}; | |||
use crate::heritage::{Context, Heritage}; | |||
use crate::input::{Print, Source, TemplateInput}; | |||
use crate::parser::{parse, CompileError, Cond, CondTest, Expr, Loop, Node, Target, When, Whitespace, Ws}; | |||
use crate::parser::node::{Cond, CondTest, Loop, Node, Target, When, Whitespace, Ws}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the current number of items, re-exporting items from the nested modules in the parser
module seems like a better solution to me than exposing parser
's internal module structure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I imported like this to make the change minimal, but if I had been writing the code from scratch all the references here would be in the style node::Cond
, node::When
, etc. But I hear you on the blanket imports here. If we're expecting them to be imported directly, would it be a good idea to make the names more verbose (e.g. LoopNode
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so? Again, the number of types exposed here is pretty small, and pretty much all of them are nodes anyway. You added a bunch of documentation, I think that should be good enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
askama_parser/src/expr.rs
Outdated
@@ -35,17 +35,17 @@ pub(crate) enum Expr<'a> { | |||
} | |||
|
|||
impl Expr<'_> { | |||
pub(super) fn parse(i: &str) -> IResult<&str, Expr<'_>> { | |||
pub(crate) fn parse(i: &str) -> IResult<&str, Expr<'_>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are actually identical in this case, just leave them as-is for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
askama_parser/src/lib.rs
Outdated
pub(crate) mod expr; | ||
pub(crate) mod node; | ||
pub(crate) mod syntax; | ||
pub mod error; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be kept pub(crate)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No longer applicable.
//! which represent either some literal text in the template or one of the | ||
//! three types of template tags: expressions, blocks, or comments. | ||
//! | ||
//! The main entry point to this crate is the [`parse()`](./fn.parse.html) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is a very nice public API. Maybe change it to wrap a struct Ast<'a> { nodes: Vec<Node<'a>> }
and change parse()
to become Ast::from_str()
? (In a separate commit.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd agree, and propose that the type be named Template
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like Template
since that's already what the trait is called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initially I was thinking std::str::FromStr
, but upon review I noticed this:
FromStr does not have a lifetime parameter, and so you can only parse types that do not contain a lifetime parameter themselves.
Perhaps you mean a non-trait from_str
method? Since it also takes a Syntax
parameter, I'm wondering if from_str
fully captures the meaning.
As to the new struct's name, some things stood out. The parse_template
helper is what produces this top-level result and the bodies of loops, if and else clauses, etc. and it makes sense to unify the concepts. Those elements are frequently referred to as blocks. Perhaps struct Block<'a> { nodes : Vec<Node<'a>> }
?
askama_parser/src/node.rs
Outdated
#[derive(Debug, PartialEq)] | ||
pub enum Node<'a> { | ||
/// Literal text to output directly. | ||
/// | ||
/// The first and third tuple elements are the left- and right-side |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If making this API public, consider changing these variants to name their fields instead of using tuple variants. (Separate commit.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following the API refactor, these tuple variants all seem self-explanatory, please advise if you disagree.
askama_parser/src/node.rs
Outdated
Name(&'a str), | ||
/// Destruture a tuple value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Done.
askama_parser/src/node.rs
Outdated
/// | ||
/// Second field is "minus/plus sign was used on the right part of the item". | ||
/// First tuple value is the setting (`-`/`+`/`~`) for the left side of the block. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would probably name the fields here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
askama_parser/src/node.rs
Outdated
pub ws3: Ws, | ||
} | ||
|
||
/// A single branch of a match block. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not very fond of type aliases these days. I think we should make When
a proper struct
with fields, and same for Cond
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
askama_derive/src/generator.rs
Outdated
Err(e) => syn::Error::new(Span::call_site(), e) | ||
.to_compile_error() | ||
.into(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point I guess we could implement to_compile_error()
manually (untested):
format!("::core::compile_error!({:?});", format_args!("{e}")).parse()
If we can omit the proc_macro2
dependency altogether in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, although proc_macro2
is still required by a number of the integration crates.
For those following along at home, the requested changes not related to API improvements and documentation are up at https://github.com/djc/askama/compare/main...couchand:askama:2023-03/extract-parser |
3880bf4
to
a1481da
Compare
I've rewritten the patch set with the idea that we want to keep everything within There are a large number of commits that are each the implementation of some big picture idea for one of the AST node types, repeated for each node type. These goals are all in pursuit of making the AST well-structured, so that containment within sub-blocks is clearly maintained. The major sticking point to that was the treatment of
I appreciate your patience reviewing this patch set, hopefully I have made each commit small enough that it is not too much of a bear to go through it. Happy to take any feedback or suggestions! |
I like the PR very much! Every change is well thought out and well explained. Actually the only thing I don't like is the name "Tag". I don't associate anything with the name, but maybe that's just me. I guess if we apply the names XML uses, then our "Tag" is an "Element", our "Lit" is "Data", and our "Node" is, well, a "Node". |
I would prefer to merge something closer to the original scope of the PR, before going into increasing depth of changing the structure of the AST types to your preferences. In particular, it would be nice to get the documentation changes in separately, because one potentially useful input to judging AST type refactoring is to review the diff's line count changes. In its current state (with a bunch of changes with naming choices I question and other choices that don't have obvious value to me), I don't think I'll be able to prioritize digesting this PR among the other ~70 GitHub notifications I should be attending to. |
a1481da
to
26e5eff
Compare
Thanks @Kijewski!
I hear you @djc. If you'd prefer separate PRs for the refactorings rather than including it all here, I'm happy to split those commits out. Before doing so, I'd like to get general consensus on the overall direction so I'm not spinning my wheels. (It's a lot of commits because there are many variants but it's only a handful of "changes", and without the context of the end goal each can seem to lack purpose). This PR works backwards from the last commit (which extracts the crate). Per previous discussion, it seems wise to figure out the public API before extracting the code rather than after. So based on your previous comments I worked to design that public API in a way that's internally consistent and exposes the structure a downstream consumer would need.
I'd suggest one of the best ways to judge this refactor is to look at
Happy to take naming suggestions. The structs currently called |
I'm not sure what overall direction you mean. If by overall direction you mean a bunch of the refactoring of the
"internally consistent" is too abstract to be meaningful to me, and "the structure a downstream consumer" seems quite specific to your opinions -- I think the existing code generator is a downstream consumer and while it might not be the prettiest code, it does okay.
I can get behind the idea that the whitespace handling could be more principled. Maybe that should be a separate first PR, with minimal changes to the rest of the structure and naming?
I don't like The way to avoid bikeshedding is to avoid renaming things for now while we get the structure right, then renaming things can come last. |
Ok, to be honest I'm struggling with what's actionable here. Do you want me to:
I mean the shape of the new public API. Here is one proposal, if it's not right let's discuss where it falls short so that my next attempt can meet your goals.
These changes are mostly driven by the generator code, the whitespace handling in particular. I've even removed the new things I'd added for other potential users. But, I'd point out, we also shouldn't totally ignore potential new users when designing a new public API.
I hear what you're saying, but I'm not sure if there's a big win to be had there? I don't know if it really makes sense without the structural changes -- the structure is what makes it principled, just as with structured programming generally. By my count, that PR would have 51 of the 64 commits from this one, everything from 1366d70 to 09a264a. Is there a smaller subset of these commits that you'd suggest go first?
Fair point, though OTOH, it's already a bit muddled given usage like: https://github.com/djc/askama/blob/077ba18bf95832e042157db6284d50460f56ab3e/askama_derive/src/parser/node.rs#L66 https://github.com/djc/askama/blob/077ba18bf95832e042157db6284d50460f56ab3e/askama_derive/src/heritage.rs#L62 https://github.com/djc/askama/blob/077ba18bf95832e042157db6284d50460f56ab3e/askama_derive/src/heritage.rs#L76 https://github.com/djc/askama/blob/077ba18bf95832e042157db6284d50460f56ab3e/askama_derive/src/generator.rs#L638
FWIW,
Well, not exactly. In the old structure,
If I may, a
I played around with having an enum with variants for these (Jinja docs call them "delimiters"), but I didn't find it to be practically useful either for the generator code or for writing clear documentation, so I would now consider that a non-goal. What is incredibly useful is correlating whitespace suppression to the structure it concerns, which is why my focus is there.
I strongly agree. Let's focus on the purpose of the structures first and then later identify the right names for them. |
"just be patient" won't get us going anywhere. Honestly, I don't have clear idea on how to navigate this either. Maybe an idea would be to first transpose the Another approach would be to start by landing smaller parts that we have consensus on already, to at least make some progress.
It's not obvious to me that it's useful/necessary to wrap (the equivalent of)
In my mind what you call |
#834 extracts a parser crate and improves the parser API a bunch. |
I'm going to close this, as the stated goal (extract parser crate) has been achieved and the remaining changes will need substantial work to rebase -- it probably makes sense to start a new PR instead. |
This builds on the great work by @Kijewski in #772 to make the parser code more maintainable, and replaces the draft PR on this matter at #748.
An attempt is made to keep the code unchanged where possible. Several minor changes were made to try to keep the dependencies and API surface minimal, including:
span
fromCompileError
(it's alwaysSpan::call_site
anyway)Syntax
struct -- in my back-of-the-napkin analysis I don't see how that one-time copy of six short strings would be performance-sensitive.Next steps include bringing over the parser AST documentation that I had started on #748, and additional API cleanup as necessary.
c.f. #760