Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor scraper to exit properly when exceptions are raised #288

Merged
merged 3 commits into from
Aug 9, 2024

Conversation

dan-niles
Copy link
Collaborator

Fix #285

Changes:

  • Return 1 whenever a exception is raised.
  • Move self.zim_file.finish() to an else section in the try-exception block.
  • Move the removal of temp directory and files into the finally section in the try-exception block.
  • Create delete_callback in utils.py since the one in zimscraperlib.filesystem does not check for a file's existence before trying to delete it.
    • Whenever an exception is raised there could be callbacks that were already initiated. Since the temp folder gets deleted now, these callbacks would throw an error saying FileNotFound. Checking for the existence of the files let's us avoid these errors.

@dan-niles dan-niles self-assigned this Aug 7, 2024
@dan-niles dan-niles requested a review from benoit74 August 7, 2024 13:50
Copy link

codecov bot commented Aug 7, 2024

Codecov Report

Attention: Patch coverage is 0% with 54 lines in your changes missing coverage. Please review.

Project coverage is 1.53%. Comparing base (60b85b5) to head (d4e3386).

Files Patch % Lines
scraper/src/youtube2zim/scraper.py 0.00% 50 Missing ⚠️
scraper/src/youtube2zim/utils.py 0.00% 4 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##            main    #288      +/-   ##
========================================
- Coverage   1.54%   1.53%   -0.01%     
========================================
  Files         11      11              
  Lines       1102    1105       +3     
  Branches     162     164       +2     
========================================
  Hits          17      17              
- Misses      1085    1088       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@dan-niles dan-niles marked this pull request as ready for review August 7, 2024 13:53
Copy link
Collaborator

@benoit74 benoit74 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this code makes much more sense. I did not reproduced the problem you had on your tests you showed me.

My only concern is that this change shows that we do not delete the build_dir should a problem occur somewhere at the beginning of the function... which is a bit sad.

I suggest we encapsulate the whole method in the try block.

Since we do not call finish when an exception occurs, we can get rid of self.zim_file.can_finish = False statements, so that we do not need to take care of whether self.zim_file has been initialized or not.

I let you test this as well but on my machine it works as expected

@dan-niles
Copy link
Collaborator Author

I moved all the code inside the run method into the try block and removed the self.zim_file.can_finish = False statements in b8b9b4b.

@dan-niles dan-niles requested a review from benoit74 August 9, 2024 03:59
Copy link
Collaborator

@benoit74 benoit74 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about that, but I think we need to fix the CHANGELOG as well, we've done way more than only handle "too many download failed" errors. You might keep existing entry and add a new one (mentioning something around cleaning up properly on exception during scraper run) pointing to this PR number (this is what we usually do when we have a change without a linked issue).

@dan-niles dan-niles requested a review from benoit74 August 9, 2024 07:06
Copy link
Collaborator

@benoit74 benoit74 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you

@benoit74 benoit74 merged commit 05d2b5d into main Aug 9, 2024
10 checks passed
@benoit74 benoit74 deleted the fix-exception-issue branch August 9, 2024 07:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Youtube videos download is failing but scraper succeeds
2 participants