image input size #5085

mahilaMoghadami · 2023-09-08T08:00:13Z

hello

What do these lines in the config file mean?

MIN_SIZE_TRAIN: (800, 900, 1000, 1100, 1200)
MAX_SIZE_TRAIN: 1999

How does the detector handle variable input sizes?
Suppose the input image is (850, 950). How does the detector resize this image and process it in the network?

thank you

chogerlate · 2023-09-23T00:27:59Z

MIN_SIZE_TRAIN: (800, 900, 1000, 1100, 1200) # the shorter edge is resized to one of these values, picked at random
MAX_SIZE_TRAIN: 1999 # upper bound on the longer edge: after resizing, the longer edge of the processed image will not exceed 1999

Together, these two lines describe how the raw image is resized while keeping its aspect ratio unchanged.

Why?

Resizing an image in one step without considering the aspect ratio could distort the image. This is because the height and width of the image could be scaled by different factors, causing the objects in the image to appear stretched or squished. This distortion could negatively impact the performance of the model, as the features learned during training may not accurately represent the objects in the distorted image.
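To make the distortion concrete, here is a small sketch that compares the per-axis scale factors of a naive resize to a fixed square with an aspect-preserving resize. The helper `scale_factors` and the 1000x1000 target are illustrative assumptions, not part of the config or of any library.

```python
def scale_factors(old_hw, new_hw):
    """Return the (vertical, horizontal) scale factors of a resize.

    Hypothetical helper for illustration only.
    """
    old_h, old_w = old_hw
    new_h, new_w = new_hw
    return new_h / old_h, new_w / old_w


# Naive resize: force an 850x950 image into a fixed 1000x1000 square.
naive_sy, naive_sx = scale_factors((850, 950), (1000, 1000))

# Aspect-preserving resize: shorter edge 850 -> 1000, longer edge scaled
# by the same factor (950 * 1000 / 850, rounded to 1118).
kept_sy, kept_sx = scale_factors((850, 950), (1000, 1118))

print(f"naive: height x{naive_sy:.3f}, width x{naive_sx:.3f}")
print(f"kept:  height x{kept_sy:.3f}, width x{kept_sx:.3f}")
```

In the naive case the two factors differ noticeably, so objects get squished along one axis. In the aspect-preserving case they match up to rounding.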

Answers to your two questions:

1. How does the detector handle variable input sizes?

Those two config lines only control the resize step of the training pipeline: each training image is rescaled, with its aspect ratio preserved, before it is passed on to the model.
At inference time these two training settings are not used. If this is a Detectron2 config, the test-time counterparts are MIN_SIZE_TEST and MAX_SIZE_TEST.

2. Imagine the input image is (850, 950). How does a detector resize this image and proceed with it in the network?

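- Resize the shorter edge (850) to a value picked at random from your MIN_SIZE_TRAIN: (800, 900, 1000, 1100, 1200), scaling the longer edge (950) by the same factor.
- Check the longer edge against MAX_SIZE_TRAIN. In this case nothing needs adjusting: even with the largest choice (1200), the longer edge becomes about 1341, well below 1999.
- Pass the resized image on to the model pipeline.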
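The steps above can be sketched in plain Python. This is a minimal illustration of the shortest-edge resize logic as described in this thread, not the framework's actual implementation. The function name `resized_shape` is hypothetical, and treating (850, 950) as (height, width) is an assumption. The result is the same either way.

```python
import random

MIN_SIZE_TRAIN = (800, 900, 1000, 1100, 1200)
MAX_SIZE_TRAIN = 1999


def resized_shape(height, width, short_edge, max_size):
    """Compute the output (height, width) of an aspect-preserving resize.

    Hypothetical helper: scales the shorter edge to `short_edge`, then
    shrinks the whole image if the longer edge would exceed `max_size`.
    """
    scale = short_edge / min(height, width)
    new_h, new_w = height * scale, width * scale
    # Enforce the cap on the longer edge, keeping the aspect ratio.
    longest = max(new_h, new_w)
    if longest > max_size:
        shrink = max_size / longest
        new_h, new_w = new_h * shrink, new_w * shrink
    # Round to the nearest integer pixel size.
    return int(new_h + 0.5), int(new_w + 0.5)


# Every possible training size for an (850, 950) image.
for short_edge in MIN_SIZE_TRAIN:
    print(short_edge, resized_shape(850, 950, short_edge, MAX_SIZE_TRAIN))

# During training, one value is drawn at random per image.
picked = random.choice(MIN_SIZE_TRAIN)
print("picked:", picked, resized_shape(850, 950, picked, MAX_SIZE_TRAIN))
```

For the (850, 950) image the cap is never reached, so the output is simply the chosen short edge and a proportionally scaled long edge. The cap only matters for elongated images, for example a (500, 2000) input with a short edge of 800 would end up as (500, 1999).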

mahilaMoghadami added the "documentation" label (Problems about existing documentation or comments) on Sep 8, 2023
