-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deduplication Bug in Semgrep JSON Report Causing Mitigation of the Original Finding #11227
Comments
I tested the same scenario, but this time I re-uploaded the same report using the UI (Ad Hoc Import), and the problem did not occur. It seems the issue only arises when importing the report through subsequent API imports, and possibaly related to engagement type, which are different (CI/CD vs Interactive). Here are the steps I followed:
I plan to create an engagement with the type "Interactive" and then upload the test report into that engagement. This approach will be used instead of "auto creating context" in the import-scan request to determine if it resolves the issue. Update: |
@farsheedify
HTH |
Thanks for your reply @mtesauro
I can provide more accurate test data/scenario if it can help. |
@farsheedify This is super interesting. Is it possible you can add a sanitized scan file that shows this behavior so we can reproduce the issue? |
Sure. I ran the scan on some sample files. To reproduce the issue, first upload semgrep-report_Line31.json using the following curl command: curl -X "POST" "https://demo.defectdojo.org/api/v2/import-scan/" -H "accept: application/json" -H "Authorization: Token ${{ secrets.DEFECTDOJO_TOKEN }}" -H "Content-Type: multipart/form-data" -F [email protected] -F "product_type_name=clubpay" -F "active=true" -F "verified=true" -F "close_old_findings=true" -F "engagement_name=${GITHUB_RUN_ID}(SAST)" -F "build_id=${GITHUB_RUN_ID}" -F "minimum_severity=Info" -F "close_old_findings_product_scope=true" -F "scan_date=$TODAY" -F "engagement_end_date=$TODAY" -F "commit_hash=${GITHUB_SHA}" -F "product_name=${product_name}" -F "auto_create_context=true" -F "scan_type=Semgrep JSON Report" -F "tags=${GITHUB_REF_NAME}" -F "source_code_management_uri=${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}" Next, upload semgrep-report_second_run_Line24.json using the same curl command. After the second upload, view all findings for the product. You will see two findings with the title: python.lang.security.audit.dangerous-subprocess-use-tainted-env-args.dangerous-subprocess-use-tainted-env-args The first finding (Original Finding) will be marked as "inactive, verified, mitigated" and is detected on sample/brute.py (Line 31). The second finding will be marked as "inactive, duplicate" and is detected on sample/brute.py (Line 24). |
@farsheedify Thanks this helps a ton. |
Bug description
In our organization, we are experiencing an issue with SAST scans where changes in code (the position of the vulnerable line) cause DefectDojo to mark the finding as mitigated and create a new finding for the same bug at the new line position. Currently, we are uploading the report with the scan type set to GitLab SAST Report in our CI/CD. As I understand, the default behavior for deduplication of static scans includes the line number in the algorithm.
I noticed that for Semgrep JSON Reports, DefectDojo uses the fingerprint ID provided by Semgrep for deduplication. I decided to test this change in the demo instance of DefectDojo to see if it could resolve our issue with SAST findings being closed and reopened due to line changes.
I scanned a sample repository on GitHub with Semgrep and generated a JSON report. I uploaded the report using the following curl command on the demo server:
curl -X "POST" "https://demo.defectdojo.org/api/v2/import-scan/" -H "accept: application/json" -H "Authorization: Token ***" -H "Content-Type: multipart/form-data" -F [email protected] -F "product_type_name=ptype_name" -F "active=true" -F "verified=true" -F "close_old_findings=true" -F "engagement_name=${GITHUB_RUN_ID}(SAST)" -F "build_id=${GITHUB_RUN_ID}" -F "minimum_severity=Info" -F "close_old_findings_product_scope=true" -F "scan_date=$TODAY" -F "engagement_end_date=$TODAY" -F "commit_hash=${GITHUB_SHA}" -F "product_name=${product_name}" -F "auto_create_context=true" -F "scan_type=Semgrep JSON Report" -F "tags=${GITHUB_REF_NAME}" -F "source_code_management_uri=${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}"
Originally, a bug was found on the 26th line of a file. After changing that line of code to the 21st line and rerunning the scan, the original bug was marked as mitigated, and the new bug was marked as a duplicate. Now, the bug has disappeared from the active bugs. The image below summarizes the situation. The bug and the file are the same. It seems that DefectDojo correctly marked the new finding as a duplicate of the previous finding (since only the line changed, but the fingerprints are the same). However, it incorrectly marked the original bug as mitigated, causing it to be removed from the active findings.
Steps to reproduce
Steps to reproduce the behavior:
Expected behavior
When rerunning the scan on the same file, if there is a change in the line of code, DefectDojo should:
1.Check the Fingerprint First: Use the fingerprint in the Semgrep report to identify findings.
2. Mark New Findings Correctly: Mark newly imported findings with similar instances (with different line numbers) as active and verified, since they can be assumed the latest instances.
3. Handle Original Findings Appropriately: Mark the original findings as duplicates (since they have outdated line of code).
Since the new finding contains the latest, updated line of code, it should be kept as the active finding. Ideally, DefectDojo should update the original finding's line of code without creating a new finding. This approach ensures that:
The text was updated successfully, but these errors were encountered: