Refactor scraper to exit properly when exceptions are raised #288

dan-niles · 2024-08-07T13:50:13Z

Changes:

Return 1 whenever an exception is raised.
Move self.zim_file.finish() to the else clause of the try-except block.
Move the removal of the temp directory and files into the finally clause of the try-except block.
Create delete_callback in utils.py, since the one in zimscraperlib.filesystem does not check for a file's existence before trying to delete it.
- When an exception is raised, some callbacks may already have been initiated. Since the temp folder is now deleted, these callbacks would fail with a FileNotFound error. Checking that the files exist lets us avoid these errors.
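
To illustrate the last point, here is a minimal sketch of why the existence check matters. It is not the code from this PR: `naive_delete` is a hypothetical helper, the `delete_callback` signature is simplified and assumed, and the file name is made up.

```python
import shutil
import tempfile
from pathlib import Path


def naive_delete(fpath: Path) -> None:
    """Delete without checking; raises FileNotFoundError if the file is gone."""
    fpath.unlink()


def delete_callback(fpath: Path) -> None:
    """Simplified, assumed version: delete the file only if it still exists."""
    if fpath.exists():
        fpath.unlink()


# Simulate a file that a pending callback is still expected to remove.
tmp_dir = Path(tempfile.mkdtemp())
pending = tmp_dir / "video.mp4"  # hypothetical file name
pending.write_bytes(b"data")

# An exception occurred elsewhere, so cleanup removed the whole temp folder.
shutil.rmtree(tmp_dir)

# The callback fires afterwards: the unchecked version fails...
try:
    naive_delete(pending)
except FileNotFoundError:
    naive_failed = True
else:
    naive_failed = False

# ...while the checked version is a harmless no-op.
delete_callback(pending)
```

The check keeps late callbacks from adding noise on top of the original exception.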

codecov · 2024-08-07T13:52:21Z

Codecov Report

Attention: Patch coverage is 0% with 54 lines in your changes missing coverage. Please review.

Project coverage is 1.53%. Comparing base (60b85b5) to head (d4e3386).

Files	Patch %	Lines
scraper/src/youtube2zim/scraper.py	0.00%	50 Missing ⚠️
scraper/src/youtube2zim/utils.py	0.00%	4 Missing ⚠️

Additional details and impacted files

@@           Coverage Diff            @@
##            main    #288      +/-   ##
========================================
- Coverage   1.54%   1.53%   -0.01%     
========================================
  Files         11      11              
  Lines       1102    1105       +3     
  Branches     162     164       +2     
========================================
  Hits          17      17              
- Misses      1085    1088       +3


benoit74

I think this code makes much more sense. I could not reproduce the problem you showed me in your tests.

My only concern is that this change shows that we do not delete the build_dir should a problem occur somewhere at the beginning of the function... which is a bit sad.

I suggest we encapsulate the whole method in the try block.

Since we do not call finish when an exception occurs, we can get rid of self.zim_file.can_finish = False statements, so that we do not need to take care of whether self.zim_file has been initialized or not.

I'll let you test this as well, but on my machine it works as expected.

dan-niles · 2024-08-09T03:59:05Z

I moved all the code inside the run method into the try block and removed the self.zim_file.can_finish = False statements in b8b9b4b.
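
For readers following the thread, the resulting control flow can be sketched roughly as below. This is an illustration under assumptions, not the actual scraper.py: `SketchScraper`, `SketchZimFile` and the `fail_at` switch are hypothetical, and only `zim_file`, `build_dir`, `finish()` and `run` come from the discussion above.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional


class SketchZimFile:
    """Hypothetical stand-in for the ZIM creator; it only records finish()."""

    def __init__(self) -> None:
        self.finished = False

    def finish(self) -> None:
        self.finished = True


class SketchScraper:
    """Illustrative scraper; fail_at is a made-up switch to simulate errors."""

    def __init__(self, build_dir: Path, fail_at: Optional[str] = None) -> None:
        self.build_dir = build_dir
        self.fail_at = fail_at  # "start", "download" or None
        self.zim_file: Optional[SketchZimFile] = None

    def run(self) -> int:
        try:
            # Early setup lives inside the try block, so a failure here
            # still reaches the cleanup in finally.
            if self.fail_at == "start":
                raise ValueError("simulated failure before the ZIM file exists")
            self.zim_file = SketchZimFile()
            if self.fail_at == "download":
                raise RuntimeError("simulated failure while downloading")
        except Exception as exc:
            print(f"Scraper failed: {exc}")
            return 1
        else:
            # Reached only when nothing raised, so zim_file is initialized
            # and no can_finish flag is needed.
            self.zim_file.finish()
            return 0
        finally:
            # Runs on success and on failure, before either return completes.
            shutil.rmtree(self.build_dir, ignore_errors=True)


if __name__ == "__main__":
    demo_dir = Path(tempfile.mkdtemp())
    exit_code = SketchScraper(demo_dir, fail_at="start").run()
    print(exit_code, demo_dir.exists())
```

Because finish() is only called from the else clause, an early failure never touches an uninitialized zim_file, and build_dir is removed in every case.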

benoit74

Sorry about that, but I think we need to fix the CHANGELOG as well: we've done far more than just handle "too many download failed" errors. You might keep the existing entry and add a new one (mentioning something about cleaning up properly on exceptions during the scraper run) pointing to this PR number, which is what we usually do when a change has no linked issue.

benoit74

LGTM, thank you

Refactor scraper.py to exit properly when exceptions are raised

ebe3fb9

dan-niles self-assigned this Aug 7, 2024

dan-niles requested a review from benoit74 August 7, 2024 13:50

dan-niles marked this pull request as ready for review August 7, 2024 13:53

benoit74 requested changes Aug 8, 2024

Move all code inside the run method into a try block

b8b9b4b

dan-niles requested a review from benoit74 August 9, 2024 03:59

benoit74 requested changes Aug 9, 2024

Update CHANGELOG

d4e3386

dan-niles requested a review from benoit74 August 9, 2024 07:06

benoit74 approved these changes Aug 9, 2024

benoit74 merged commit 05d2b5d into main Aug 9, 2024
10 checks passed

benoit74 deleted the fix-exception-issue branch August 9, 2024 07:28
