"Yes" for all checkboxes does not work for all PDF rendering engines. #4055

Open
sarahkittyy opened this issue Nov 15, 2024 · 2 comments

@sarahkittyy

Description of the bug

I ran into a tricky issue. /Yes is not a valid ON state for every checkbox widget in a PDF, contrary to what the implementation and the docs assume.

Rather, most PDF rendering engines accept "Yes" as an ON state even when it is not listed in the widget's /AP/N dictionary.

Try iterating over all widgets in a PDF with checkboxes, such as any IRS tax form, and set them to "Yes". The result renders correctly in browsers, but in SodaPDF's online editor the checkmarks do not appear. SodaPDF is an example of a renderer that does not assume "Yes" is a valid default ON state.

For my form specifically, the valid ON state is actually /5 (that is, "/V /5"). I tried setting widget.field_value = '5' and also widget.field_value = '/5', but some internal code automatically changes this value to Off and does not respect my input. widget.on_state() does not help, since it only returns Yes and not the '5' I need.
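As an illustration of how the ON state name could be looked up instead of assumed, here is a minimal sketch. The helper name `pick_on_state` is hypothetical, and it assumes its input has the shape that `widget.button_states()` returns: a dict with "normal" and "down" keys whose values are either a list of state names or None.

```python
from typing import Optional


def pick_on_state(states: dict) -> Optional[str]:
    """Return the first state name that is not "Off", or None if there is none.

    `states` is assumed to look like the result of Widget.button_states(),
    e.g. {"normal": ["5", "Off"], "down": None}. This helper is hypothetical
    and not part of PyMuPDF.
    """
    for key in ("normal", "down"):
        # A missing key or a None value is treated as an empty list.
        for name in states.get(key) or []:
            if name != "Off":
                return name
    return None
```

Unlike indexing with `[0]`, this does not depend on the order of the names and does not fail when one of the two lists is None.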

Then I tried setting the key directly on the object with doc.xref_set_key(widget.xref, 'V', '5'). But after a widget.update(), the value is changed back. And without calling widget.update(), the change is not reflected in the saved document.

In other pdf libraries like pdfrw, the change to /V is respected and checking the checkbox works as expected.

This is clearly a bug: the widget's update() method needs to respect the value present in widget.field_value. In the meantime, do you have any recommendations for updating a widget in the document without calling widget.update()? For example, if I use doc.xref_set_key to change the widget's /V value manually, how can I make that change persist until I call doc.save()?
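One possible direction for the manual route, offered only as an untested sketch: `xref_set_key` takes the value as PDF source text, so '5' would be written as the integer 5, while the name object needs a leading slash ('/5'). A checkbox's visible state is also governed by /AS, so the sketch writes both keys. The helper name `set_checkbox_on` is hypothetical. The sketch assumes /V sits on the widget object itself and not on a parent field, and it has not been checked against f7004.pdf.

```python
def set_checkbox_on(doc, xref: int, state: str) -> None:
    """Write /V and /AS as PDF name objects on the object at `xref`.

    `doc` is assumed to be an open pymupdf.Document (anything with a
    compatible xref_set_key method works). Hypothetical helper, not a
    PyMuPDF API.
    """
    # "5" -> "/5": the leading slash makes it a PDF name, not the integer 5.
    name = "/" + state.lstrip("/")
    doc.xref_set_key(xref, "V", name)   # field value
    doc.xref_set_key(xref, "AS", name)  # appearance state that viewers display


# Possible usage (assumption: widget.update() is NOT called afterwards,
# since the issue reports that it resets the value):
#   set_checkbox_on(doc, widget.xref, "5")
#   doc.save("f7004-filled.pdf")
```

Whether this survives saving for a given form depends on where the field stores /V; if the widget is a kid of a parent field, the value may need to be written on the parent instead.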

Thank you.

How to reproduce the bug

import pymupdf

doc = pymupdf.open('f7004.pdf')  # IRS tax form 7004; any PDF with checkboxes will do
for page in doc:
    for widget in page.widgets():
        if widget.field_type == pymupdf.PDF_WIDGET_TYPE_CHECKBOX:
            print(doc.xref_get_key(widget.xref, 'V'))  # before: the OFF state, /Off
            # For this form the first 'down' state is '5'; it should become the ON value.
            widget.field_value = widget.button_states().get('down')[0]
            widget.update()
            print(doc.xref_get_key(widget.xref, 'V'))  # after: still /Off

PyMuPDF version

1.24.13

Operating system

Linux

Python version

3.12

@JorjMcKie
Collaborator

Without a reproducing file we cannot deal with this post.
In this case we need an example where a "non-/Yes" value is provided which we do not handle as "ON".
Supporting non-/Yes as ON on input is an enhancement request, not a bug: the PDF spec clearly recommends using /Yes as the ON value:
[image: excerpt from the PDF specification]

@sarahkittyy
Author

https://www.irs.gov/pub/irs-pdf/f7004.pdf will reproduce this issue with the code above
