[filebeat][gcs] Added support for more mime types, offset tracking via cursor, automatic splitting at root level #34155

Merged: 15 commits, Jan 23, 2023

@ShourieG ShourieG commented Jan 2, 2023

Type of change

  • Enhancement

What does this PR do?

This PR adds support for more MIME types, such as ndjson, json.gz, and gzipped JSON. It also adds support for the following:

  1. Parsing multiline JSON files
  2. Offset tracking via cursor state
  3. Automatic splitting at the root level when the root-level object is an array

Why is it important?

This extends the functionality of the input and removes potential bugs that could occur when it is used at scale.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
    - [ ] I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Author's Checklist

  1. Documentation.
  2. Config format.
  3. Code errors.
  4. Design choices.

Related issues

@ShourieG ShourieG requested a review from a team as a code owner January 2, 2023 10:58
@ShourieG ShourieG requested review from cmacknz and fearful-symmetry and removed request for a team January 2, 2023 10:58
@botelastic botelastic bot added the needs_team label Jan 2, 2023

mergify bot commented Jan 2, 2023

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @ShourieG? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v8./d.0 is the label to automatically backport to the 8./d branch. /d is the digit

@elasticmachine

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

@botelastic botelastic bot removed the needs_team label Jan 2, 2023
@ShourieG ShourieG added the needs_team and 8.7-candidate labels Jan 2, 2023
@botelastic botelastic bot removed the needs_team label Jan 2, 2023
@ShourieG ShourieG added the needs_team and backport-v8.6.0 labels Jan 2, 2023
@botelastic botelastic bot removed the needs_team label Jan 2, 2023

elasticmachine commented Jan 2, 2023

💚 Build Succeeded

Build stats

  • Start Time: 2023-01-23T05:47:19.352+0000

  • Duration: 76 min 13 sec

Test stats 🧪

Test Results
Failed 0
Passed 2593
Skipped 168
Total 2761

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@efd6 efd6 left a comment

Initial review

Resolved review threads (all outdated):

  • x-pack/filebeat/docs/inputs/input-gcs.asciidoc (1)
  • x-pack/filebeat/input/gcs/input_test.go (2)
  • x-pack/filebeat/input/gcs/job.go (5)
  • x-pack/filebeat/input/gcs/state.go (2)

@ShourieG

@efd6 I have addressed most of the suggestions except two, for which I have left comments.

@ShourieG

@efd6 I have updated the PR.

Resolved review threads:

  • x-pack/filebeat/input/gcs/job.go (8, of which 7 outdated)

@ShourieG

@efd6 updated the PR with the latest changes


efd6 commented Jan 23, 2023

This is an enhancement, so I'm not sure that it should be backported unless there is a specific reason to do so.

@ShourieG

> This is an enhancement, so I'm not sure that it should be backported unless there is a specific reason to do so.

@efd6 There are a few customers who are eager to try the GCS integration/input, and without the backport they would have to wait until 8.7 to use it properly without issues. Since this is an early beta, I feel we can backport it, so that with the release of 8.6.1 users get a much more stable input with some essential features that were missing initially.

@ShourieG ShourieG merged commit 69ebd98 into elastic:main Jan 23, 2023
mergify bot pushed a commit that referenced this pull request Jan 23, 2023
…a cursor, automatic splitting at root level (#34155)

* initial commit -m

* added support for more mime types, offset tracking via cursor & root level split func

* updated asciidoc

* updated NOTICE.txt

* updated PR according to suggestions

* optimised code blocks as per PR suggestions

* addressed linting issues

* updated with PR suggestions

(cherry picked from commit 69ebd98)
@ShourieG ShourieG deleted the gcs/add_mime_types branch January 23, 2023 07:50
ShourieG added a commit that referenced this pull request Jan 23, 2023
…a cursor, automatic splitting at root level (#34155) (#34338)

(cherry picked from commit 69ebd98)

Co-authored-by: ShourieG
chrisberkhout pushed a commit that referenced this pull request Jun 1, 2023
…a cursor, automatic splitting at root level (#34155)

Labels: 8.7-candidate, backport-v8.6.0, enhancement

Successfully merging this pull request may close this issue: Add support for more MIME types & multiline JSON to GCS input

3 participants