image input size #5085

Open
mahilaMoghadami opened this issue Sep 8, 2023 · 1 comment
Labels
documentation Problems about existing documentation or comments

Comments

@mahilaMoghadami

Hello,

What do these lines in the config file mean?

MIN_SIZE_TRAIN: (800, 900, 1000, 1100, 1200)
MAX_SIZE_TRAIN: 1999

  1. How does the detector handle variable input sizes?
  2. Suppose the input image is (850, 950). How does the detector resize this image and process it in the network?

Thank you.

@mahilaMoghadami mahilaMoghadami added the documentation Problems about existing documentation or comments label Sep 8, 2023
@chogerlate

MIN_SIZE_TRAIN: (800, 900, 1000, 1100, 1200) # target length of the shorter edge (one value is picked at random for each training image)
MAX_SIZE_TRAIN: 1999 # cap on the longer edge, ensuring that the longer edge of a processed image does not exceed 1999

These two lines describe how a raw image is resized before training. Both edges are scaled by the same factor, so the aspect ratio of the image is maintained.

Why?

Resizing an image in one step without considering the aspect ratio could distort the image. This is because the height and width of the image could be scaled by different factors, causing the objects in the image to appear stretched or squished. This distortion could negatively impact the performance of the model, as the features learned during training may not accurately represent the objects in the distorted image.

Answers to your two questions:

1. How does the detector handle variable input sizes?

  • During training, those two lines in your configuration are only the first step of the model pipeline: each image is resized to a randomly chosen scale while its aspect ratio is maintained, so the detector does not need a fixed input size.
  • During inference, these two `_TRAIN` settings do not apply; test-time resizing is usually configured separately.

2. Imagine the input image is (850, 950). How does a detector resize this image and proceed with it in the network?

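  1. Resize the shorter edge (850) to a value picked at random from your MIN_SIZE_TRAIN: (800, 900, 1000, 1100, 1200), scaling the longer edge (950) by the same factor.
  2. Check the longer edge against MAX_SIZE_TRAIN. In this case there is no need to adjust anything: even for the largest choice (1200), the longer edge becomes about 950 × 1200 / 850 ≈ 1341, which is below 1999.
  3. Pass the processed image to the model pipeline.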
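
The resize steps above can be sketched in plain Python. This is only an illustration of the arithmetic, not the library's actual code: the function name `resize_shortest_edge` is hypothetical, (850, 950) is assumed to be (height, width), and rounding to the nearest integer is an assumption.

```python
import random


def resize_shortest_edge(height, width, short_edge, max_size):
    """Return (new_height, new_width) for an aspect-preserving resize.

    Hypothetical helper: scales the shorter edge to `short_edge`, then
    shrinks further if the longer edge would exceed `max_size`.
    """
    scale = short_edge / min(height, width)
    if height < width:
        new_h, new_w = short_edge, width * scale
    else:
        new_h, new_w = height * scale, short_edge

    # If the longer edge overshoots the cap, shrink both edges by the same factor.
    longest = max(new_h, new_w)
    if longest > max_size:
        shrink = max_size / longest
        new_h, new_w = new_h * shrink, new_w * shrink

    # Round to the nearest integer pixel size (assumed rounding rule).
    return int(new_h + 0.5), int(new_w + 0.5)


MIN_SIZE_TRAIN = (800, 900, 1000, 1100, 1200)
MAX_SIZE_TRAIN = 1999

# Every possible outcome for an (850, 950) image.
for short_edge in MIN_SIZE_TRAIN:
    print(short_edge, resize_shortest_edge(850, 950, short_edge, MAX_SIZE_TRAIN))

# During training, one of the choices is picked at random per image.
picked = random.choice(MIN_SIZE_TRAIN)
print("picked:", resize_shortest_edge(850, 950, picked, MAX_SIZE_TRAIN))
```

For the (850, 950) image, none of the five choices pushes the longer edge past 1999, so the cap never triggers. It would matter for a very elongated image: under the same assumptions, `resize_shortest_edge(500, 2000, 800, 1999)` first scales to 800 × 3200 and then shrinks to 500 × 1999.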
