feat(parsers) Parser registry #723
Even if it doesn't really do anything on its own, as you said, I'm interested in having this merged. Thanks for this!
src/Core/Scheduler/Scheduler.js
Outdated
@@ -175,6 +202,18 @@ Scheduler.prototype.getProtocolProvider = function getProtocolProvider(protocol)
    return this.providers[protocol];
};

Scheduler.prototype.addFormatParser = function addParser(format, parser) {
Why not take an array as format, so we could batch a lot of formats into one parser?
No strong opinion on this... parser is provided as a class rather than an instance, so the change is only cosmetic.
done, parsers now have mimetypes, extensions and format strings that are all used as registration keys
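As a rough illustration of the registration keys described above, here is a minimal sketch of a registry that files one parser under its format string, its extensions and its mimetypes. The `ParserRegistry` class, its `add`/`get` methods and the `geojsonParser` descriptor are all hypothetical names made up for this example; they are not the code from this PR.

```javascript
// Hypothetical registry (not the PR's code): one parser, several lookup keys.
class ParserRegistry {
    constructor() {
        this.parsers = new Map(); // key -> array of parsers registered under it
    }

    add(parser) {
        // A Set avoids registering the same parser twice when, for example,
        // the format string and one extension are identical.
        const keys = new Set(
            [parser.format, ...(parser.extensions || []), ...(parser.mimetypes || [])]
                .map(key => key.toLowerCase()));
        for (const key of keys) {
            if (!this.parsers.has(key)) {
                this.parsers.set(key, []);
            }
            this.parsers.get(key).push(parser);
        }
    }

    get(key) {
        return this.parsers.get(key.toLowerCase()) || [];
    }
}

// Made-up parser descriptor, used only to exercise the registry.
const geojsonParser = {
    format: 'geojson',
    extensions: ['json', 'geojson'],
    mimetypes: ['application/json', 'application/geo+json'],
    parse: text => JSON.parse(text),
};

const registry = new ParserRegistry();
registry.add(geojsonParser);

console.log(registry.get('application/geo+json')[0] === geojsonParser); // true
console.log(registry.get('kml').length); // 0
```

Storing an array per key is one possible design: it leaves room for several parsers claiming the same extension, at the cost of needing a second step to choose among them.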
src/Core/Scheduler/Scheduler.js
Outdated
import B3dmParser from '../../Parser/B3dmParser';
import PotreeBinParser from '../../Parser/PotreeBinParser';
import PotreeCinParser from '../../Parser/PotreeCinParser';
// import GLTFLoader from '../../Parser/GLTFLoader';
Do you plan to support it before merging ? If not, please remove those comments.
Ok, I will remove them
done
What I meant by "parser needs more work" in the other PR: as we create a registry, we should use it in other providers instead of referencing parsers directly. This will make it possible to use different parsers in a provider according to the needed output format. This work should be done in this PR to avoid an inconsistent state in the codebase (I don't expect it to be too complicated anyway). Thanks!
Even better: why not move them out of providers, and have something like this:
    return provider.executeCommand(cmd)
        .then(blob => parser.parse(blob, options))
        .then((result) => {
            ...
The problem here could be that we don't have anything to parse for textures, but a "fake" parser for images could be created: I can also see (later on) having the
I quickly added my thoughts here zarov@cc623b6
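The provider-then-parser chain suggested above could look roughly like the sketch below. It only assumes what the comment states: a provider exposing `executeCommand(cmd)` and a parser exposing `parse(blob, options)`. The `execute` helper, `fakeProvider`, `jsonParser` and `passthroughParser` are hypothetical stand-ins, with the pass-through object playing the role of the "fake" parser for data that needs no parsing.

```javascript
// Stand-in provider: resolves with the data a real provider would have fetched.
const fakeProvider = {
    executeCommand: cmd => Promise.resolve(cmd.payload),
};

// "Fake" parser for data that needs no parsing (e.g. an already decoded image).
const passthroughParser = {
    parse: blob => Promise.resolve(blob),
};

// Simple real parser, used here only to show a non-trivial parse step.
const jsonParser = {
    parse: blob => Promise.resolve(JSON.parse(blob)),
};

// The chain itself: the provider fetches, the parser transforms.
function execute(provider, parser, cmd, options) {
    return provider.executeCommand(cmd)
        .then(blob => parser.parse(blob, options));
}

execute(fakeProvider, jsonParser, { payload: '{"type":"FeatureCollection","features":[]}' })
    .then(result => console.log(result.type)); // logs "FeatureCollection"
```

With this shape the caller never needs to know whether a given layer has a real parser or the pass-through one.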
This PR uses the scheduler as registry for parsers, but it doesn't use the scheduler for the parsing itself. That doesn't make too much sense to me.
That may be just me but I actually don't see the value of doing the parsing in the scheduler. What is the benefit compared to doing it in the provider? I understand that the core thing of the scheduler is the priority queue. Here for the parsing the priority queue is not used, so what's the point of doing the parsing within the scheduler?
@elemoine the idea is to better separate the things we currently have in itowns: instead of adding everything in providers, we could have a simpler chain, managed here in the scheduler:
Provider -> (Fetcher ->) Parser -> Stylizer
I think having those blocks called here and there will be a pain to maintain later, and moving it all outside of providers makes things easier. See the discussion in #182
To add to what @zarov said: parsers are also good candidates for workers. If we start using workers, we'll need to schedule calls to them too. This is a step in this direction.
As I anticipated, that means it turns into a big PR, but you asked for it...
I will probably not have time in the near future to go much further on this PR, so do not ask me to cover all providers and parsers :). In the meantime, I created a branch for using the format registry on 3dtiles : format_registry_3dtiles. I think it works but I have not tested it much and changes are not trivial so it may deserve its own PR.
:) Yes of course, but I sincerely think the raster provider is obsolete as it contains too much hardcoded stuff, and that the existing functionality of rasterizing single files will be superseded by a properly configured FileProvider, in the same way that the new functionality of rasterizing WFS features will be offered by a properly configured WfsProvider. So that will be treated in another PR once this PR and #705 are merged (this PR is already big enough), similar to the upgrade of the 3dTilesProvider.
(comments of @zarov have been addressed)
Do we even want to introduce a At the last codesprint we discussed iTowns' API and the need to move away from the current layer/provider design and instead implement something similar to OpenLayers or vector-tiles = splitting sources and layers. So IMHO we should work toward this goal instead of adding new features built on the current broken layer/provider API.
We definitely agree that legacy providers have to be removed/reworked/replaced and that the provider/layer API should be redesigned. To stay on topic here, I encourage you to use more relevant PR/issues :
Putting that aside, do you have any thoughts on the scope of this PR, which is introducing a parser registry, normalizing the parser interface and using it in the providers of geometry layers ? (putting aside 3dtiles to limit the size of the PR, and the TileProvider which is not reading from any source as far as I understand)
@mbredif if you don't have time to work on this, I'll gladly take over, in another branch and PR (as I said, I'm really interested in seeing more progress on this). It needs more work here, particularly in documentation.
        that.parsers[format].push(parser);
    }
    register(parser.format);
    parser.extensions.forEach(register);
I think it should be simple and only rely on format. We ditched mimetypes in #597, and I fail to see how we could benefit by having extensions and mimetypes on top of format.
I am not so sure either, but these 3 (extensions, mimetypes, format) seem complementary :
- extension may be the only thing you get when you are handed over a file (by url or drag n drop)
- mimetype is needed by wfs (in the query string)
- mimetype may be provided in the response header of fetches
- format is for me the name of the parser given by itowns, with more uniform naming conventions
- ...
What might still be missing is some kind of accept function that analyses the file content (header?) to determine whether it is an acceptable file to parse (e.g. reading the first 4 bytes of tiles in 3dTiles formats).
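To make the idea of an accept function concrete, here is a small sketch that checks the first 4 bytes of a buffer against a magic string. The 3D Tiles tile formats do start with a 4-byte ASCII magic (`b3dm`, `i3dm`, `pnts`, `cmpt`); everything else here, including the `accept` property, `readMagic` and `findAcceptingParser`, is a hypothetical shape chosen for illustration and not an existing iTowns API.

```javascript
// Reads up to the first 4 bytes of an ArrayBuffer as an ASCII string.
function readMagic(buffer) {
    const bytes = new Uint8Array(buffer, 0, Math.min(4, buffer.byteLength));
    return String.fromCharCode(...bytes);
}

// Hypothetical parser descriptors: each one declares which content it accepts.
const b3dmLikeParser = {
    format: 'b3dm',
    accept: buffer => readMagic(buffer) === 'b3dm',
};
const pntsLikeParser = {
    format: 'pnts',
    accept: buffer => readMagic(buffer) === 'pnts',
};

// Returns the first parser whose accept() recognises the content, if any.
function findAcceptingParser(parsers, buffer) {
    return parsers.find(parser => parser.accept(buffer));
}

// Fake tile content: the magic followed by a few arbitrary bytes.
const tile = Uint8Array.from('pnts\u0001\u0000\u0000\u0000', c => c.charCodeAt(0)).buffer;

console.log(findAcceptingParser([b3dmLikeParser, pntsLikeParser], tile).format); // pnts
```

An accept function like this complements the extension and mimetype keys: it is only usable once the content has been fetched, but it does not depend on how the file was named or served.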
> extension may be the only thing you get when you are handed over a file (by url or drag n drop)
Extension is not really reliable, and can be confusing: an xml file can be a KML, GPX or even something else.
> mimetype is needed by wfs (in the query string)
It isn't really a mimetype, see for example the [use case in GeoServer](http://docs.geoserver.org/stable/en/user/services/wfs/outputformats.html). But I see the necessity here.
> mimetype may be provided in the response header of fetches
Yes, but then we should maybe rework the whole format/mimetype thing.
> format is for me the name of the parser given by itowns, with more uniform naming conventions
Agreed, that's why I think it should only be registered with the format option.
I'm with @zarov : I'd rather not have 3 different ways of declaring a format.
What we could have, though, is a format detector: if the format isn't explicitly specified, we can try to deduce it from: url, filename, magic bytes, etc (something similar to what is done in 3dTileProvider and RasterProvider)
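A format detector along those lines might be sketched as follows: an explicit format always wins, the URL extension narrows the candidates, and a cheap content check settles ambiguous cases such as `.xml`. All names here (`detectFormat`, `candidatesByExtension`, `sniffers`) and the sniffing heuristics are assumptions made for this example, not what 3dTileProvider or RasterProvider actually do.

```javascript
// Extension -> candidate formats. 'xml' is deliberately ambiguous.
const candidatesByExtension = new Map([
    ['geojson', ['geojson']],
    ['kml', ['kml']],
    ['gpx', ['gpx']],
    ['xml', ['kml', 'gpx']],
]);

// Very rough content checks, only meant to tell the candidates apart.
const sniffers = {
    kml: text => /<kml[\s>]/.test(text),
    gpx: text => /<gpx[\s>]/.test(text),
    geojson: text => text.trimStart().startsWith('{'),
};

// Extracts the lowercase file extension from a URL, ignoring query and hash.
function extensionOf(url) {
    const path = url.split(/[?#]/)[0];
    const name = path.slice(path.lastIndexOf('/') + 1);
    const dot = name.lastIndexOf('.');
    return dot === -1 ? '' : name.slice(dot + 1).toLowerCase();
}

// Returns a format name, or undefined when nothing could be deduced.
function detectFormat({ format, url = '', text = '' }) {
    if (format) {
        return format; // explicitly specified: no guessing
    }
    const candidates = candidatesByExtension.get(extensionOf(url)) || Object.keys(sniffers);
    if (candidates.length === 1) {
        return candidates[0];
    }
    return candidates.find(name => sniffers[name](text));
}

console.log(detectFormat({
    url: 'https://example.com/data/track.xml?v=2',
    text: '<?xml version="1.0"?><gpx version="1.1"></gpx>',
})); // gpx
```

Keeping detection separate from registration means the registry itself can stay keyed on format alone, as argued above.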
@@ -256,6 +274,29 @@ Scheduler.prototype.getProtocolProvider = function getProtocolProvider(protocol)
    return this.providers[protocol];
};

Scheduler.prototype.addFormatParser = function addFormatParser(parser) {
    // eslint-disable-next-line no-console
To not forget to remove the log before merging, you should remove this comment imho ;)
The Parser normalization part of this PR is for me compulsory, but I must confess that I am now not entirely sold on a parser registry maintained by the scheduler. In fact I started on the premise that the protocol provider registry was a good thing, but I am not sure of it any more: if we get rid of JSON layers, we could ask the user code to create protocol and parser objects and hand them over in the layer options, which would not require any protocol registry. I will soon open a new issue to discuss this. For now, a middle ground could be to have each provider have its own registry as a simple array of default parsers in the preprocess function, but leave the opportunity to pass in the layer options a WDYT?
Please go ahead 😀 .
It is an option indeed, and then we wouldn't have to maintain a list of relation between format/mimetype and parsers.
I feel that going to this is like the current
It is to me the best solution, everything assembled in one place, the Scheduler, based on the configuration given by the user. We could drop the list of default providers and let the user decide which one to use with their layer. But I really don't think we will benefit from splitting everything into multiple classes.
Let me rephrase to show that I am sure we are already on the same page 😃 . I agree that file extension only is not reliable (or even sometimes against the spec as for 3dtiles), but sometimes it works and some other times it may still make it easy to filter out some parsers.
In this case, maybe parsers for formats mentioned in the wfs spec may have an optional 'wfsOutputFormat' string. (maybe with some defaults for others)
I agree, it is a small step to prepare for an upcoming more drastic decoupling that would require more discussion. But if there is already an agreement for a centralized parser registry, then go for it !
Following iTowns#723 and the model of Providers, a Parser registry has been partially added to the Scheduler. You can now register a Parser using scheduler.addFormatParser(formatName, parser). In the Scheduler queue, the Parser is called right after the Provider. It takes the data blob returned by the Provider, parses it, and returns the result. Parser selection relies on the format set in the layer. For layers that don't have a format, or have a format that is not supported, a fake parser that immediately returns the blob is used. This commit introduces three parsers: GeoJson, Gpx and Kml. This allows refactoring the RasterProvider a bit, in the hope of letting go of it in the future.
Following iTowns#723 and the model of Providers, a Parser registry has been partially added to the Scheduler. You can now register a Parser using scheduler.addFormatParser(formatName, parser). In the Scheduler queue, the Parser is called right after the Provider. It takes the data blob returned by the Provider, parses it, and returns the result. Parser selection relies on the format set in the layer. For layers that don't have a format, or have a format that is not supported, a fake parser that immediately returns the blob is used. This commit introduces three parsers: GeoJson, Gpx and Kml. This allows refactoring the RasterProvider a bit, in the hope of letting go of it in the future. BREAKING CHANGE: GpxParser doesn't return a THREE.Mesh anymore.
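The selection rule described in this commit message can be sketched as below. Only the `addFormatParser(formatName, parser)` signature comes from the message itself; `MiniScheduler`, `getFormatParser`, `parse` and the `passthrough` member are hypothetical names for a stripped-down stand-in, not the real iTowns Scheduler.

```javascript
// Minimal stand-in for the Scheduler: only the parser registry is sketched.
class MiniScheduler {
    constructor() {
        this.parsers = new Map();
        // "Fake" parser: used when a layer has no format or an unsupported one.
        this.passthrough = { parse: blob => Promise.resolve(blob) };
    }

    addFormatParser(formatName, parser) {
        this.parsers.set(formatName, parser);
    }

    // Hypothetical helper: picks the parser for a format, or the fallback.
    getFormatParser(formatName) {
        return this.parsers.get(formatName) || this.passthrough;
    }

    // Would be called right after the provider has returned `blob` for `layer`.
    parse(layer, blob) {
        return this.getFormatParser(layer.format).parse(blob, layer);
    }
}

const scheduler = new MiniScheduler();
const geojsonLikeParser = { parse: blob => Promise.resolve(JSON.parse(blob)) };
scheduler.addFormatParser('geojson', geojsonLikeParser);

// A layer with a registered format gets its parser, any other gets the fallback.
console.log(scheduler.getFormatParser('geojson') === geojsonLikeParser); // true
console.log(scheduler.getFormatParser(undefined) === scheduler.passthrough); // true
```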
#966 covers this, we can close this PR
Description
This PR introduces a registry of format parsers in the scheduler, similar to the protocol registry that is already there.
Motivation and Context
The goal is to decouple format parsing from the providers, so that protocols, formats and styling may be mixed and matched (instead of current providers that hardcode formats).
(extracted from #705)