Annotations 2.0 #75

FabioBatSilva · 2016-03-20T18:32:55Z

Move namespace to Doctrine\Annotations (Removing Common)
Uses hoa/compiler instead of doctrine/lexer (see: grammar)
Drop AnnotationRegistry and all autoload magic
Drop Attribute/Attributes annotations
Drop SimpleAnnotationReader
Drop FileCacheReader
Drop IndexedReader
Requires PHP 7

TODO:

Local reviews (checkout + run locally):

schmittjoh · 2016-03-21T18:16:25Z

How does it affect userland? Any benefits/goals?

stof · 2016-03-21T18:31:39Z

composer.json

    ],
    "require": {
-        "php": ">=5.3.2",
-        "doctrine/lexer": "1.*"
+        "php":          ">=7.0.0",


Please don't align constraints. It becomes a nightmare when adding new constraints: alignment changes cause merge conflicts for nothing.

localheinz · 2016-03-23

@FabioBatSilva

Aligning constraints tells us you're manually editing composer.json, which should not be necessary when requiring dependencies.

stof · 2016-03-21T18:46:48Z

Drop Attribute/Attributes annotations

Why is it dropped?

yourwebmaker · 2016-03-21T18:50:00Z

src/Configuration.php

+    }
+
+    /**
+     * @return \Doctrine\Annotations\Metadata\MetadataFactory


Sometimes you're using the FQCN and sometimes an alias. It'd be better to use only one.

guilhermeblanco · 2016-03-21

@FabioBatSilva keep FQCN

donquixote · 2017-08-26T18:20:32Z

src/AnnotationReader.php

@@ -82,7 +82,7 @@ public function getClassAnnotations(ReflectionClass $class) : array
    /**
     * {@inheritDoc}
     */
-    public function getClassAnnotation(ReflectionClass $class, $annotationName)
+    public function getClassAnnotation(ReflectionClass $class, string $annotationName)


I would suggest to split decorative CS changes and BC-breaking signature changes into separate distinct commits.
Adding a type hint changes the signature, which breaks any implementing classes.

And then, the decorative code style changes can go into a dedicated PR to the 1.x branch.

donquixote · 2017-08-27T20:50:49Z

src/Builder.php

+     * Constructor.
+     *
+     * @param \Doctrine\Annotations\Resolver        $resolver
+     * @param \Doctrine\Annotations\MetadataFactory $metadataFactory


The type is incorrect, it should be \Doctrine\Annotations\Metadata\MetadataFactory.

donquixote · 2017-08-27T20:53:35Z

src/Metadata/MetadataFactory.php

+
+        $class       = new \ReflectionClass($className);
+        $constructor = $class->getConstructor();
+        $docComment  = $class->getDocComment();


Unused local variable $docComment.

donquixote · 2017-08-27T21:03:56Z

Did anyone run a benchmark to compare the hoa compiler to the one currently in use?
Maybe it is all "fast enough" so we don't have to worry. Just asking.

donquixote · 2017-08-28T02:06:29Z

src/Metadata/MetadataFactory.php

@@ -107,11 +107,11 @@ public function getMetadataFor(string $className)
     */
    private function isAnnotation(ReflectionClass $class, array $annotations) : bool
    {
-        if ($class->isSubclassOf('Doctrine\Annotations\Annotation')) {
+        if ($class->isSubclassOf(Annotation::CLASS)) {


::CLASS should be lowercase ::class.

donquixote · 2017-08-28T03:11:43Z

Architecture: I propose to drop Metadata and MetadataFactory.
Instead, have one factory object per class (or give it a different name, dunno (*)).

class Builder
{
    [..]

    /**
     * @param Context   $context
     * @param Reference $reference
     *
     * @return object
     */
    public function create(Context $context, Reference $reference)
    {
        $target    = $reference->nested ? Target::TARGET_ANNOTATION : $context->getTarget();
        $fullClass = $this->resolver->resolve($context, $reference->name);
        $values    = $reference->values;

        if (null === $factory = $this->factoryProvider->classGetFactory($fullClass)) {
            throw InvalidAnnotationException::notAnnotationException($fullClass, $reference->name, $context->getDescription());
        }

        return $factory->instantiate($context, $values, $target);
    }
}

Now all the metadata stuff can be encapsulated in the $factory object.
There can even be separate factory classes depending on how the class should be constructed, and how the annotation values are transformed into arguments.
There can also be a ClassNotFoundFactory, which always throws an exception in the ->instantiate() method.
The interface for those factories really only has this one method.

(*) I was initially going to call the "factory" "instantiator". But this name already exists in Doctrine\Instantiator\Instantiator. So...
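A minimal sketch of what such per-class factories could look like. Every name below (AnnotationFactory, ConstructorFactory, ClassNotFoundFactory, FactoryProvider, Hello) is hypothetical and does not come from the PR, and the Context and Target arguments from the snippet above are reduced to a plain description string to keep the example self-contained. Unlike the snippet above, this provider never returns null; it hands out a ClassNotFoundFactory instead.

```php
<?php
// Hypothetical interface: the single method every per-class factory exposes.
interface AnnotationFactory
{
    /** @return object */
    public function instantiate(string $contextDescription, array $values);
}

// Passes the annotation values to the constructor, in declaration order.
final class ConstructorFactory implements AnnotationFactory
{
    private $class;

    public function __construct(string $class)
    {
        $this->class = $class;
    }

    public function instantiate(string $contextDescription, array $values)
    {
        $class = $this->class;

        return new $class(...array_values($values));
    }
}

// Stands in for a class that could not be found; instantiate() always throws.
final class ClassNotFoundFactory implements AnnotationFactory
{
    private $class;

    public function __construct(string $class)
    {
        $this->class = $class;
    }

    public function instantiate(string $contextDescription, array $values)
    {
        throw new \RuntimeException(
            sprintf('Annotation class "%s" not found in %s.', $this->class, $contextDescription)
        );
    }
}

// Hypothetical provider: decides once per class which factory applies.
final class FactoryProvider
{
    public function classGetFactory(string $class): AnnotationFactory
    {
        return class_exists($class)
            ? new ConstructorFactory($class)
            : new ClassNotFoundFactory($class);
    }
}

// Example annotation class, invented for illustration.
final class Hello
{
    public $message;

    public function __construct(string $message)
    {
        $this->message = $message;
    }
}
```

The builder then only needs `$provider->classGetFactory($fullClass)->instantiate(...)`; whether a real implementation would cache factories per class is left open here.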

donquixote · 2017-08-28T06:16:09Z

src/Reflection/ReflectionMethod.php

+        }
+
+        return $this->imports = array_merge($classImports, $traitImports);
+    }


I think this is confusing. If the method is coming from a trait, then the annotation is living in the trait file, and the imports should be from the trait's file only. If the method is declared in the class itself, then the annotation is living in the class file, and it should use the imports from the class file. I don't see a case where it should combine imports from different files.

Here is how to find out if a method is defined in a trait, https://stackoverflow.com/a/45912866/246724.

Btw, imo all of this reflection + inheritance adds unnecessary complexity.

Well, forbidding people to use inheritance and traits when using annotations would mean that nobody would migrate to version 2

From a functional perspective, it should stay the exact same, maybe with some API changes, but no behavioral ones.

The point here is getting rid of our own hacky parser, using a formalised one (HOA's)

Oh, maybe I was unclear.

all of this reflection + inheritance adds unnecessary complexity.

What I mean is we don't need to inherit from \ReflectionClass to find the imports. Instead, have an ImportFinder or something like that.

Of course, people who write annotated classes should be allowed to use inheritance, and traits!

From a functional perspective, it should stay the exact same, maybe with some API changes, but no behavioral ones.

If the code in the PR is replicating existing behavior, then it needs to stay this way.
Or, if we agree that the old behavior is wrong, we could have two implementations of ImportFinder or of the class name resolver: One that operates the BC way, another that operates the "correct" way.

Why would we say that "the old behavior is wrong"?

Consider this example:

File T.php:

<?php
namespace Acme\Foo;

use Acme\Annotation\Hello;

trait T {
    /**
     * @Hello("I am an annotation on a trait method.")
     * @Goodbye("I am annotation on a trait method, but the import is in the class file.")
     */
    function foo() {}
}

File C.php:

<?php
namespace Acme\Bar;

use Acme\Annotation\Goodbye;

class C {
    // Fully qualified, because the trait lives in Acme\Foo, not Acme\Bar.
    use \Acme\Foo\T;
}

With the behavior proposed in the PR, which I assume is also the current behavior, the second annotation @Goodbye(..) will use the import Acme\Annotation\Goodbye from the class file.

I am saying this is wrong. It should only use the imports from the trait file. So the @Hello(..) should work, but the @Goodbye(..) should not.

This would be consistent with how the language itself works.
Imports are only available within the same file.

Well, personally I think having annotations on a method in a trait is probably a bad idea anyway. But if we support it, it should at least be "correct". Unless, of course, it is for BC reasons.

The point here is getting rid of our own hacky parser, using a formalised one (HOA's)

Which I assume will be more maintainable, more reliable, more understandable (people have to look at the grammar only). So yeah, seems like a good idea.

Personally I care more about the registry going away.

About the methods defined in traits: It gets even more interesting if the method is renamed.

trait T {
    /**
     * @Hello()
     */
    function foo() {}
}

class C {
    use T {
        foo as bar;
    }
}

$m = new \ReflectionMethod('C', 'bar');
$reader->getMethodAnnotations($m);

The current behavior will not understand that the method is defined in a trait under a different name.
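One way to detect this with plain reflection, as a sketch only: findDeclaringTrait(), GreetingTrait and Greeter are hypothetical names, not part of the PR, and the code assumes PHP 7.1+. It undoes a "foo as bar" rename via ReflectionClass::getTraitAliases() and then compares file name and start line with the candidate method in each used trait.

```php
<?php
/**
 * Hypothetical helper: returns the trait that physically declares $method,
 * or null if the method is written in the class body itself.
 */
function findDeclaringTrait(\ReflectionMethod $method): ?\ReflectionClass
{
    $class   = $method->getDeclaringClass();
    $aliases = $class->getTraitAliases(); // alias => "TraitName::original"
    $name    = $method->getName();

    // Undo a "foo as bar" rename, if there is one.
    if (isset($aliases[$name])) {
        [, $name] = explode('::', $aliases[$name], 2);
    }

    foreach ($class->getTraits() as $trait) {
        if (!$trait->hasMethod($name)) {
            continue;
        }

        $candidate = $trait->getMethod($name);

        // Same file and same start line means the body was copied from this trait.
        if ($candidate->getFileName() === $method->getFileName()
            && $candidate->getStartLine() === $method->getStartLine()
        ) {
            // The trait may itself have imported the method from a nested trait.
            return findDeclaringTrait($candidate) ?? $trait;
        }
    }

    return null;
}

// Example types, invented for illustration.
trait GreetingTrait
{
    /** @Hello() */
    function foo() {}
}

class Greeter
{
    use GreetingTrait {
        foo as bar;
    }

    function own() {}
}
```

With the trait known, a reader could take the imports from the trait's file only and look up the doc comment under the original method name.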

And about properties in traits - this is more difficult. There is no \ReflectionProperty::getFileName().
https://stackoverflow.com/questions/18257158/how-to-extract-start-line-of-a-property-declaration-in-php

As a heuristic, we could say that:

If none of the traits of the class has a property with the same name, then the property belongs to the class, obviously.

If one or more of the traits define the property then we compare the doc comment. See https://3v4l.org/KY9nl.
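The heuristic above could be sketched roughly as follows. guessPropertyOrigin(), HasId and User are hypothetical names used only for illustration. The sketch looks at direct traits only and cannot tell the two apart if the class redeclares the property with an identical doc comment.

```php
<?php
/**
 * Hypothetical helper: guesses whether a property comes from the class body
 * or from one of the traits the class uses directly.
 */
function guessPropertyOrigin(\ReflectionProperty $property): \ReflectionClass
{
    $class = $property->getDeclaringClass();
    $name  = $property->getName();

    foreach ($class->getTraits() as $trait) {
        // A trait is a candidate only if it declares a property with this name
        // and carries exactly the same doc comment.
        if ($trait->hasProperty($name)
            && $trait->getProperty($name)->getDocComment() === $property->getDocComment()
        ) {
            return $trait;
        }
    }

    // No trait matches, so the property belongs to the class itself.
    return $class;
}

// Example types, invented for illustration.
trait HasId
{
    /** @Id */
    public $id;
}

class User
{
    use HasId;

    /** @Name */
    public $name;
}
```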

Maybe all of this should be discussed in a separate issue. I only brought it up here because the PR affects the code where this behavior is implemented.

Ocramius · 2017-08-28

@donquixote we kinda fixed all of these horrors in roave/better-reflection, although it is not the primary aim here to provide very precise reflection of ugly stuff like traits.

Hywan · 2017-08-28T09:36:32Z

Just a link about hoa/compiler performances, https://blog.hoa-project.net/2016/08-Performance-boost-for-Hoa-Compiler.html. cc @donquixote

donquixote · 2017-08-28T15:46:29Z

Thanks @Hywan !
This does not compare it with a hand-written parser, only with previous hoa parsers.

In general, a hand-written or a generated parser should be faster than a parser combinator, or one that needs to interpret a piece of grammar in every step. Since it adds overhead and indirection to every micro-operation, the difference could be something like factor 2 (or more, or less). I remember this, because I used to experiment with the vektah parser combinator.

Your linked article, "Exporting the parser into PHP code", claims that the parser can be exported to PHP code. If this is equivalent to a generator, then it should be similarly fast as a hand-written parser.

Of course, even if it would take 2x longer, does not mean we have to care, if overall it is still "fast enough".

stof · 2017-08-28T18:03:43Z

I'm still against the namespace change if it does not have a migration path:

If packages are not able to easily support both v1 and v2 of the library, a project cannot migrate to v2 until all its dependencies using doctrine/annotations are migrated, and it cannot update other dependencies already migrated until that time. Given the number of packages relying on doctrine/annotations out there to parse annotations, such a community split is bad news.
Just migrating to v2 is not an option for packages needing to keep support for PHP 5.x (meaning that if they cannot support both versions, they will stay on v1 forever).

So this means we need a continuous migration path if you want to keep the namespace change.
This could be done by doing a new 1.x release adding class aliases using the new namespace. See how Twig did it for instance.
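A minimal sketch of the class-alias approach, using PHP's built-in class_alias(). LegacyAnnotation is a stand-in invented for this example; it plays the role of a 1.x class such as Doctrine\Common\Annotations\Annotation and sits in the global namespace only to keep the sketch in one file. How a real 1.x bridge release would organise its aliases is not specified in this thread.

```php
<?php
// Stand-in for an existing 1.x class (hypothetical, for illustration only).
class LegacyAnnotation
{
    public $value;

    public function __construct(array $data = [])
    {
        $this->value = $data['value'] ?? null;
    }
}

// What a 1.x bridge release could ship: the old class exposed under the new name.
class_alias(LegacyAnnotation::class, 'Doctrine\Annotations\Annotation');

// Code written against the new namespace now runs on top of the 1.x class.
$annotation = new \Doctrine\Annotations\Annotation(['value' => 'foo']);
```

Both names refer to the same class, so instanceof checks and type hints written against either name accept the same objects. Packages could then switch to the new namespace while still allowing v1 to be installed.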

Hywan · 2017-08-29T08:26:19Z

This does not compare it with a hand-written parser, only with previous hoa parsers.

Yes it doesn't, but it gives an overview of the last big improvements :-). I should have clarified this, sorry.

In general, a hand-written or a generated parser should be faster than a parser combinator, or one that needs to interpret a piece of grammar in every step. Since it adds overhead and indirection to every micro-operation, the difference could be something like factor 2 (or more, or less). I remember this, because I used to experiment with the vektah parser combinator.

True and false (well, you have started your sentence with “In general” 😉). In PHP, a parser combinator might be slower than a generated parser because of function calls and indirections, but it will not be the bottleneck I guess. The real bottleneck is the data copy. In a parser combinator, you have to copy the data being parsed into each parser. Even if PHP does a COW (Copy-On-Write), each data split (substr) will generate a new copy, and so it's going to be slow. This particular problem can also be present in a generated parser, but the API surface is much smaller. For instance, the lexer for hoa/compiler does not consume the data by taking substrings; it just reads it with an offset: https://github.com/hoaproject/Compiler/blob/c86ccfbce9b9cad17cf84ffdf5c505c695d83d7a/Llk/Lexer.php#L276-L282 This is highly optimized.
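To illustrate reading with an offset instead of splitting the input, here is a small sketch. It is not hoa/compiler's code: tokenize() and its token table are invented for this example, and using preg_match()'s offset argument together with a \G anchor is only an assumption about how such a lexer can be written.

```php
<?php
/**
 * Hypothetical mini-lexer: walks the input with an offset and never calls substr().
 *
 * @return array list of [tokenName, matchedText] pairs
 */
function tokenize(string $text): array
{
    // Token table invented for illustration; "skip" tokens are dropped.
    $tokens = [
        'at'          => '@',
        'identifier'  => '[A-Za-z_][A-Za-z0-9_]*',
        'paren_open'  => '\(',
        'paren_close' => '\)',
        'string'      => '"[^"]*"',
        'skip'        => '\s+',
    ];

    $out    = [];
    $offset = 0;
    $length = strlen($text);

    while ($offset < $length) {
        foreach ($tokens as $name => $regex) {
            // \G anchors the match at $offset; the subject string is never copied.
            if (preg_match('~\G(?:' . $regex . ')~', $text, $match, 0, $offset) === 1) {
                if ($name !== 'skip') {
                    $out[] = [$name, $match[0]];
                }

                $offset += strlen($match[0]);
                continue 2;
            }
        }

        throw new \RuntimeException("Unrecognized input at offset $offset");
    }

    return $out;
}
```

Only the small matched fragments are allocated; the remaining input stays in one string for the whole run.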

In a parser combinator however, the lexer and the parser phases are “merged”, so the memory peak should be smaller than in a generated parser. However, regarding the last improvement in hoa/compiler, the lexer now works as a buffered iterator, so the behavior is similar to a parser combinator: https://github.com/hoaproject/Compiler/blob/c86ccfbce9b9cad17cf84ffdf5c505c695d83d7a/Llk/Parser.php#L162-L165

A parser combinator is like a hand-written parser, except it has a predefined formalism, is more testable, is more re-usable etc. Compared to a generated parser, the API is larger. However, a generated parser can be seen as a parser combinator with a small API. hoa/compiler has only one method: _parse, which adapts its behavior whether it meets a token, a concatenation, a choice, or a repetition, which are the rules (the grammar description language intrinsics/constructions). One method also means a better caching by the VM, and the CPU.

This thread is not the place to debate about this, but: A generated parser, a parser combinator, or a hand-written parser can all be fast and efficient, or slow and ineffective. It really depends on how they are implemented. They all have pros and cons. I personally prefer a parser combinator when working with Rust (see nom) because it is testable and brings interesting guarantees, while when working with PHP, I prefer a generated parser.

Your linked article, "Exporting the parser into PHP code", claims that the parser can be exported to PHP code. If this is equivalent to a generator, then it should be similarly fast as a hand-written parser.

I would claim that a hand-written parser is most of the time not fast. You have to re-optimise and re-implement everything, like the lexer (a good one is not simple) and the parser with all the optimisation. And the error-management, the AST builder, the memory management, the profiling etc. It's better to have a hackable compiler toolchain I guess.

But indeed, once a Hoa\Compiler\Llk\Parser is compiled into PHP code, it just creates an instance of Hoa\Compiler\Llk\Parser directly without loading the grammar from a textual file. It builds the grammar as a set of rules, which is fast to instantiate, and cacheable by the VM. The lexing and parsing in themselves are optimized and use a very small API, which is also cacheable correctly by the VM.

The most obvious way to be faster now is to use really good data structures instead of generic array, but we are limited by the language (I want php-ds in the core, pleaaase).

Of course, even if it would take 2x longer, does not mean we have to care, if overall it is still "fast enough".

Correct. I don't want to speak for the Doctrine team, but my understanding of the problem is the following: Replace a hand-written, hard to maintain, hacky, and maybe buggy parser with a formal parser which is easy to maintain and fast enough. hoa/compiler plays this role. Also, hoa/compiler brings interesting algorithms to generate data from a grammar (it is called Grammar-based Testing). More resources about this:

https://hoa-project.net/En/Literature/Hack/Compiler.html#Generation the documentation,
https://hoa-project.net/En/Literature/Research/Amost12.pdf the research paper,
https://keynote.hoa-project.net/Amost12/EDGB12.pdf the presentation of the research paper,
https://mnt.io/2014/09/30/generate-strings-based-on-regular-expressions/ explaining how to generate data from a regular expression.

These algorithms can help to test the Doctrine annotations, and DQL.

donquixote · 2017-08-29T16:37:04Z

I would claim that a hand-written parser is most of the time not fast. You have to re-optimise and re-implement everything, like the lexer (a good one is not simple) and the parser with all the optimisation. And the error-management, the AST builder, the memory management, the profiling etc. It's better to have a hackable compiler toolchain I guess.

Maybe I should have said "hardcoded" rather than "hand-written".
And instead of "should be faster", I should have said we are comparing the fastest theoretically possible parsers of each category. If you make all the right choices, what remains is the overhead and indirection. E.g. for this same reason, a parser in C would be faster than one in PHP.
You don't need a lexer or memory management, if all you do is string index lookups, like here: https://github.com/donquixote/annotation-parser/blob/1.0/src/Parser/AnnotationParser.php. Also an AST doesn't have to be complicated.

This is not an argument against the hoa parser, just a conversation.

Hywan · 2017-08-30T11:47:14Z

We agree 😃.

Hywan · 2017-09-14T23:20:01Z

src/Parser/grammar.pp

+#constant:
+    <identifier> (<colon> <colon> <identifier>)?
+
+string:


If a rule recognizes only one token, then it's faster to just use this token. I assume being fast is important in this context.

Same for the text, number, and identifier rules.

range-of-motion · 2019-08-26T14:16:37Z

What's going on with this? I'd very much like to see this because of multi line support.

alcaeus · 2020-04-01T11:27:52Z

Closing this PR: there has been a second effort to create 2.0 which has been just as successful as this one. We'll be revisiting this at a later date.

Ocramius added the enhancement label Mar 21, 2016

Ocramius added this to the v2.0.0 milestone Mar 21, 2016

stof reviewed Mar 21, 2016

yourwebmaker reviewed Mar 21, 2016

donquixote reviewed Aug 26, 2017

donquixote reviewed Aug 27, 2017

donquixote reviewed Aug 28, 2017

donquixote mentioned this pull request Aug 29, 2017

annotations 2.0: Rename library to doctrine/annotations2 ? #144

Closed

FractalizeR mentioned this pull request Sep 4, 2017

Multi-line annotation issues zircote/swagger-php#326

Closed

Hywan reviewed Sep 14, 2017

alcaeus mentioned this pull request Dec 15, 2017

The future of Common 3.0 doctrine/common#826

Open

Majkl578 mentioned this pull request Feb 3, 2018

Enhancement: Use doctrine/coding-standard #177

Closed

2 tasks

Majkl578 mentioned this pull request Apr 27, 2018

Open 2.0 - PHP 7.2, change namespace #193

Merged

Merge pull request #1 from lennerd/2.0

bd9ed95

Annotations 2.0: Read annotations from functions

Majkl578 changed the base branch from 2.0 to master May 7, 2018 23:07

Majkl578 mentioned this pull request Dec 18, 2018

[meta] Annotations 2.0 (ng) #232

Closed

27 tasks

Majkl578 mentioned this pull request Feb 23, 2019

Introduce annotation metadata #247

Merged

alcaeus mentioned this pull request Dec 16, 2019

Address deprecations from persistence doctrine/orm#7953

Merged

alcaeus force-pushed the master branch from 8a68219 to f9deab6 Compare April 1, 2020 11:21

alcaeus closed this Apr 1, 2020

alcaeus removed the enhancement label Apr 1, 2020

alcaeus removed this from the 2.0.0 milestone Apr 1, 2020
