Resizing During Training and Eval #8
Comments
Thank you for your interest in our work! The question you raised has already been considered in our experimental setup. Since our model was trained on lower-resolution images, it has not been exposed to high-resolution ones, making it unnecessary to test on high-resolution images. Of course, while we did not conduct related experiments, we speculate that the testing performance would improve with high-resolution images, as they contain more information than their lower-resolution counterparts. However, we believe that this change would have a minor impact on performance, which is not central to our core contributions. To facilitate easier evaluation of the model, we chose to assess it on low-resolution images of 1024x512. |
Thank you for the quick reply! When reporting other methods and comparing to yours, do you evaluate them under the same setting (at downsampled size)? |
If you're still seeking answers or further clarification, we encourage you to explore our latest checkpoints. Aimed at enhancing real-world applicability and showcasing the exceptional capabilities of our approach, we meticulously carried out two experimental series: synthetic-to-real and cityscapes-to-acdc. We've made available the corresponding checkpoints at Cityscapes and UrbanSyn+GTAV+Synthia, both of which have demonstrated remarkable results. To ensure peak performance, these configurations were rigorously trained and tested using their native resolutions. For usage instructions, refer to the discussion here. |
In our paper, the performance metrics for other methods were sourced directly from their original publications. Since the PEFT and DGSS methods mentioned in Tables 2 and 3 were not adapted for VFMs, we replicated them under the configurations previously described. In other words, every metric presented in each table either comes from its original publication or is obtained under configurations that are strictly identical to those used for our method in the same table. |
Hi! I have two questions regarding the question above:
Thank you for your help |
|
Thank you very much for all your answers! This is really helpful. You said that you provided validation for DINOv2 on 1024x2048 resolution images, but the link you provided is for 1024x1024. Am I missing something, or is it just a typo? Thank you! |
During training, images will be resized to 1024x2048 and then cropped to 1024x1024. For validation, a sliding window of size 1024x1024 is used on the 1024x2048 images. Hence, I refer to it as a checkpoint at 1024x2048. |
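For readers following along, the sliding-window validation described above can be sketched roughly as follows. This is a minimal NumPy illustration, not the repository's actual inference code: `predict_fn` is a hypothetical stand-in for the model, and the 768-pixel stride is an assumption (the thread only states the 1024x1024 window on 1024x2048 images).

```python
import math

import numpy as np


def sliding_window_positions(size, window, stride):
    """Start offsets that cover [0, size); the last window is clamped to the edge."""
    if size <= window:
        return [0]
    n = math.ceil((size - window) / stride) + 1
    return [min(i * stride, size - window) for i in range(n)]


def slide_inference(image, predict_fn, num_classes,
                    window=(1024, 1024), stride=(768, 768)):
    """Average per-class scores over overlapping windows, then take the argmax.

    image: array of shape (H, W, C).
    predict_fn: hypothetical callable mapping an (h, w, C) crop to
        (num_classes, h, w) scores. It stands in for the real model.
    stride: assumed value, not taken from the thread.
    """
    height, width = image.shape[:2]
    scores = np.zeros((num_classes, height, width), dtype=np.float64)
    counts = np.zeros((height, width), dtype=np.float64)
    for top in sliding_window_positions(height, window[0], stride[0]):
        for left in sliding_window_positions(width, window[1], stride[1]):
            bottom = min(top + window[0], height)
            right = min(left + window[1], width)
            crop = image[top:bottom, left:right]
            scores[:, top:bottom, left:right] += predict_fn(crop)
            counts[top:bottom, left:right] += 1
    # Every pixel is covered at least once, so counts is never zero.
    return (scores / counts).argmax(axis=0)
```

With a 1024x2048 input and these assumed settings, the height needs a single window and the width needs three overlapping ones, so the model always sees 1024x1024 crops while the prediction covers the full 1024x2048 image.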
Thank you very much for your help! |
Hi! I noticed that the training pipeline for DG on gta-->cv trains on 512x512 crops of a downsampled GTA image (1280, 720). However, during evaluation on Cityscapes, you evaluate on 512x512 crops of a downsampled Cityscapes image (1024, 512).
Was this intended, given that evaluation should occur at the original image size for Cityscapes (2048, 1024)?