Fix FLUME-3233: only use inode to identify files then taildirSource will support file rename/rotation #336

Open
wants to merge 1 commit into
base: trunk

pphust commented Dec 22, 2020

The issue:
When a file is renamed or rotated, as log4j and similar logging systems do, Flume's taildirSource currently treats it as a new file and collects all of its contents again. This causes duplicated data, as described in FLUME-3233, FLUME-3219, FLUME-3094, FLUME-3216 and FLUME-2777.
The usual workaround is to monitor only the original *.log and not the renamed *.log.xxx. But for the two reasons below, we must monitor both *.log and the renamed *.log.xxx:
1. Some logging systems write asynchronously, so content may be flushed to disk after the file has been renamed. If we do not monitor the renamed *.log.xxx, that content is only sent when Flume closes the inactive file. Flume does send it eventually, but with a delay that is currently determined by idleTimeout (default 120 seconds). In many cases that is unacceptable.
2. Sometimes both the service and Flume are shut down. If the service restarts first, it may write to *.log and rename it to *.log.xxx before Flume has restarted. If we do not monitor the renamed *.log.xxx, that data will certainly be lost.
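
To illustrate why the inode is a usable identity across rotation, here is a minimal sketch (not code from this PR) that renames a file and compares its inode before and after. It assumes a Unix-like file system where the JDK's "unix:ino" attribute is available; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class InodeRenameDemo {
    // Reads the inode number through the JDK's "unix" attribute view.
    // Assumption: a Unix-like platform; elsewhere this view is unsupported.
    static long inodeOf(Path file) {
        try {
            return ((Number) Files.getAttribute(file, "unix:ino")).longValue();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Creates app.log, "rotates" it to app.log.1 by renaming it in the same
    // directory, and reports whether the inode is the same before and after.
    static boolean inodeSurvivesRename() {
        try {
            Path dir = Files.createTempDirectory("taildir-demo");
            Path log = Files.write(dir.resolve("app.log"),
                    "line 1\n".getBytes(StandardCharsets.UTF_8));
            long before = inodeOf(log);
            Path rotated = Files.move(log, dir.resolve("app.log.1"));
            return before == inodeOf(rotated);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("inode unchanged after rename: " + inodeSurvivesRename());
    }
}
```

The path changes from app.log to app.log.1, but a rename within the same file system keeps the inode, which is what makes it possible to recognize the rotated file as one that has already been read.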

The solution:
This PR adds a new inodeOnly parameter to resolve the data duplication problem when monitoring both *.log and *.log.xxx, which gives taildirSource the ability to support file rename/rotation. By default, inodeOnly is false and Flume works exactly as it does now. When inodeOnly is set to true in the config, Flume uses only the inode to identify a file, so a renamed or rotated file is recognized as the same file. Both problems above are then solved.
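
The difference between the two modes can be sketched roughly as below. This is a simplified model for illustration only, not the code in this PR or Flume's actual classes: PositionTracker, record and resumeOffset are hypothetical names, and the only thing taken from the description above is the inodeOnly flag and its default of false.

```java
import java.util.HashMap;
import java.util.Map;

class PositionTracker {
    // Last known path and read offset for one inode.
    static final class Entry {
        final String path;
        final long offset;
        Entry(String path, long offset) { this.path = path; this.offset = offset; }
    }

    private final Map<Long, Entry> byInode = new HashMap<>();
    private final boolean inodeOnly;

    PositionTracker(boolean inodeOnly) { this.inodeOnly = inodeOnly; }

    // Records how far the file with this inode has been read.
    void record(long inode, String path, long offset) {
        byInode.put(inode, new Entry(path, offset));
    }

    // Returns the offset at which reading should resume for the given file.
    long resumeOffset(long inode, String path) {
        Entry e = byInode.get(inode);
        if (e == null) {
            return 0L; // never seen before: read from the start
        }
        if (inodeOnly || e.path.equals(path)) {
            return e.offset; // recognized as the same file: continue
        }
        return 0L; // path changed and inodeOnly is off: treated as a new file
    }

    public static void main(String[] args) {
        PositionTracker legacy = new PositionTracker(false);
        legacy.record(42L, "/var/log/app.log", 1024L);
        // After rotation the same inode shows up under a new name.
        System.out.println(legacy.resumeOffset(42L, "/var/log/app.log.1")); // prints 0

        PositionTracker fixed = new PositionTracker(true);
        fixed.record(42L, "/var/log/app.log", 1024L);
        System.out.println(fixed.resumeOffset(42L, "/var/log/app.log.1")); // prints 1024
    }
}
```

With inodeOnly off, the renamed file restarts at offset 0 and its contents are sent again; with inodeOnly on, reading resumes at the recorded offset, so only content flushed after the rename is collected.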

pphust (Author) commented Dec 22, 2020

@rgoers could you help review this?

pphust force-pushed the fileRotationSupport branch from 60d05a5 to 3b9b99d on December 23, 2020 04:07