-
Notifications
You must be signed in to change notification settings - Fork 182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Default constructors with full data #2945
Comments
I haven't figured out a way to export the macros yet. Currently they have to be invoked locally, as they need the data's module structure. Using
What if we make
The script could be configurable with a config file for locales, fallback, etc. |
If we check-in the built data (rather than build.rs), we could consider Still not sure about how to handle CLDR versions. A few choices:
Related: #2715 |
|
I like this. We get tons of control over the lifecycle of CLDR data in this crate. E.g. we could say that the crate will support up to 3 CLDR versions, and by default will be the latest (So using one of the cldr-xx features is unstable, but unstable in a very well defined way. Ideally we use deprecation warnings if possible; unsure if it is) Plus, it makes the CLDR version something that the final application can select: if there are intermediate crates using ICU4X default data, their choices do not artificially constrain the final application
I think this is a good idea, yeah.
We could use features. Perhaps some major defaults of "top X locales" and "all locales", plus individual features for each locale. Unsure.
It's ... possible! But I think this may have a problem with multiple crates: if two crates in the same deptree that depend on ICU4X default to this kind of thing, they'll end up stepping over each others' toes. I don't really like that. I think the way to make it pluggable is to have a feature that makes this use a GlobalPluggableProvider, which looks for a particularly defined C function to get the provider (e.g. your code must define |
Hmm. So the It still needs to work with DCE. |
For ICU4X docs, it would be nice if we could use the |
Yep
It would have to go through Any or Buffer provider, but you'd still get DCE normally, yes? It's no different from using a blob provider normally. |
That feels pretty brittle IMO (with baked data), we could generate testdata into the default crate, but then we won't be testing actual "normal" data loading paths, it would be a weird special thing we do. Basically, we can't pull in real baked testdata due to the circular dependency problem. We could do |
By DCE, I mean that individual data keys worth of data should be garbage collected. Actually I don't quite see how |
Oh, you wouldn't get that, no.
Every macro would just wrap the same provider in that case, a provider that calls the C function. |
We could have an extern C function for each individual data key. 🤔 |
Does this work? #[cfg(feature = "custom_global_data")]
include!(env!("ICU4X_CUSTOM_GLOBAL_DATA_PATH"));
#[cfg(not(feature = "custom_global_data"))]
include!("default_global_data/mod.rs"); If you want custom global data you need to enable the |
Yeah I like that |
Conclusion:
|
LGTM to me as well. Occurs to me that the long term name for this might be "autodata", since @robertbastian doesn't like "globaldata" as much. (we're not using "globaldata" publicly so far, this would mostly just be an internal name) |
I missed the end of this discussion and don't agree with the conclusion
I think forcing users to make this explicit choice is bad, and we will keep seeing users struggle. Understanding what
This is a high barrier of entry. There should be an easy path that doesn't require this level of knowledge, and ICU4X should reach those users. A user should be able to use the latest version of ICU4X to do i18n without needing to learn what the word "provider" means. Also I really don't like the long names, And, extending this to |
Bikeshed for constructor names: #3536 |
I'll note that the three bullet points (especially the first two) go together; I'm more comfortable using the new constructors in docs tests if they signal data provenance. |
I was thinking that perhaps a more scalable way to handle recursive constructors would be to do what LocaleCanonicalizer is doing: the lowest-level "core" constructor is one that contains data being loaded for this specific class, with all other classes passed in as arguments. All the other constructors simply call the appropriate constructor on child classes and then invoke the core constructor with the results. |
FFI API discussion:
LGTM: @robertbastian, @sffc, @eggrobin, @Manishearth |
We have built data providers as a first-class feature in ICU4X. We currently tutor clients on how to build their data file and detail all the knobs at their disposal, which is essential to ICU4X's mission. Looking forward to 2.0 and beyond, though, we want to guarantee ICU4X's success when it comes to ease of use and adoption.
Eventually I think we want to work toward something like
I think we can make this work with an optional dependency
icu_provider_default
crate that exports declarative macros, one per key, which are then invoked in the context of the component crate. This way we can avoid the cyclic dependency issue of baked data needing to import everyone's provider modules.There are numerous questions to be answered, including:
icu_provider_default
crate?Also see:
The text was updated successfully, but these errors were encountered: