Display the entire IWP layer #24
Reminders:

**High Ice IWP run 2/21**: Out of memory error on Delta; the error cancelled staging.

- Staging: 11,954 staged files were transferred to /scratch and the job was cancelled after this; it ran for 4.5 hours.
**High Ice IWP run 2/22**

- Staging
- Merging Staged: Went well overall. Got an error output, but this was the only one I saw. By the time merging concluded, the head node contained 15,113 files (2.03 GB).
- Raster Highest
- Raster Lower
- Web Tiling: Done!
**Investigating the scarcity of IWP in new web tiles**

The web tiles produced by the new batch of IWP data are far sparser than the last batch of web tiles.

Feb 22, 2023 workflow run:
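To put numbers on the scarcity rather than eyeballing the rendered maps, one option is to compare tile counts per zoom level between the two batches. Here is a rough sketch, assuming both batches sit on disk as z/x/y PNG trees; the directory names are hypothetical placeholders:

```python
# Compare per-zoom web tile counts between two batches of IWP web tiles.
# Directory paths are hypothetical; assumes a z/x/y.png tile layout.
from pathlib import Path
from collections import Counter

def tiles_per_zoom(root):
    """Count tiles at each zoom level under a z/x/y tile tree."""
    counts = Counter()
    for png in Path(root).rglob("*.png"):
        z = png.relative_to(root).parts[0]  # first directory is the zoom level
        counts[int(z)] += 1
    return counts

old_counts = tiles_per_zoom("web_tiles_old")  # previous batch
new_counts = tiles_per_zoom("web_tiles_new")  # Feb 22 batch

for z in sorted(set(old_counts) | set(new_counts)):
    print(f"z{z}: old={old_counts[z]:>7} new={new_counts[z]:>7}")
```

A large per-zoom drop would point at missing input files or a staging problem, while similar counts with emptier tiles would point at deduplication removing too many polygons.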
Category: Permafrost Subsurface Features
**IWP dataset on Google Kubernetes Engine**

With the successful execution of a small run of the kubernetes & parsl workflow on the Google Kubernetes Engine (GKE) (nice work @shishichen! 🎉), we have an updated game plan for processing the entire IWP workflow (high, med, and low) within one run, with deduplication between all regions and adjacent files.
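For context on what that setup looks like, here is a minimal sketch of a Parsl configuration targeting a Kubernetes cluster such as GKE; the container image, namespace, and scaling numbers are hypothetical placeholders, not the values from the actual run.

```python
# Minimal sketch of a Parsl config for a Kubernetes cluster such as GKE.
# The image name, namespace, and block/worker counts are hypothetical.
import parsl
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.providers import KubernetesProvider
from parsl.addresses import address_by_route

config = Config(
    executors=[
        HighThroughputExecutor(
            label="k8s_htex",
            # Workers in pods reach the submitting node over the pod network.
            address=address_by_route(),
            max_workers=4,  # workers per pod
            provider=KubernetesProvider(
                image="example-registry/iwp-workflow:latest",  # hypothetical image
                namespace="default",
                init_blocks=1,   # pods started up front
                max_blocks=10,   # upper bound on pods as load grows
            ),
        )
    ]
)

parsl.load(config)
# Any @python_app functions submitted after this point run inside the pods.
```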
@julietcohen @shishichen a quick thought as we're preparing for this layer integration. This is probably obvious to you, but I thought I'd throw it out there just in case. As the high, medium, and low images have been tiled and deduplicated separately, we need to combine the two output datasets and deal with duplicate polygons. I think the main issue is that we need to deduplicate the regions where High data overlaps with Med/Low data. This is not the whole dataset, and should primarily be on the boundaries where the datasets overlap.

If we query to find the list of tiles/images that overlap at the boundaries of those datasets, that list should be much smaller than the full list of all dataset images and tiles, and would save a huge amount of processing time, at the cost of a more complicated selection process for images and then a merging process of old and new tiles.

As an example, I made up the following scenario with High (grey) and Med/Low (salmon) images. In this case, only images H1, H2, ML1, and ML2 need to be reprocessed, and they only affect the tiles in rows 3 and 4; the tiles in rows 1, 2, 5, and 6 can be copied straight across to the output dataset without any reprocessing. All of this can be determined ahead of time via calculations on the image footprints, which should be very fast. Does that make sense to you? One thing I wondered about was whether images like H3 that overlap H1 in row 3 would have an impact on tile row 3. Need to think about that.
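To make the footprint calculation concrete, here is a minimal sketch using geopandas, assuming each dataset's image footprints are available as GeoJSON; the file names and the `tiles_for_geometry()` helper are hypothetical stand-ins for whatever TMS indexing the viz workflow actually uses.

```python
# Sketch of the boundary-overlap selection described above.
# File names and tiles_for_geometry() are hypothetical placeholders.
import geopandas as gpd

high = gpd.read_file("high_footprints.geojson")      # one row per High image
medlow = gpd.read_file("medlow_footprints.geojson")  # one row per Med/Low image

# The only region needing re-deduplication is where the two datasets overlap.
overlap_zone = high.unary_union.intersection(medlow.unary_union)

# Images touching the overlap zone (H1, H2, ML1, ML2 in the example) must be
# reprocessed; every other image's tiles can be copied straight across.
high_redo = high[high.intersects(overlap_zone)]
medlow_redo = medlow[medlow.intersects(overlap_zone)]

# Affected tiles are those intersecting any reprocessed image footprint
# (rows 3 and 4 in the example); everything else is untouched.
affected_tiles = set()
for geom in list(high_redo.geometry) + list(medlow_redo.geometry):
    affected_tiles.update(tiles_for_geometry(geom))  # hypothetical helper

# Re: the H3 question above, a conservative option is to widen the selection
# to any image that intersects an already-selected image in the same dataset:
high_redo = high[high.intersects(high_redo.unary_union)]
```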
Thanks for the description and the visual, Matt! That all aligns with my understanding as well. Reminders for where the data is stored and published:

- DOI: A2KW57K57: This DOI is associated with the published metadata package that will be updated with the tiles that have all been deduplicated between high, med, and low.
- DOI: A24F1MK7Q: This DOI is not associated with a metadata package. This DOI only exists as a subdirectory within
This issue tracks the progress of generating web tiles & 3D tiles for the entire Ice Wedge Polygon dataset.