Config parameters that influence the log retention time:
log.roll.hours # the maximum time before Kafka rolls over to a new log segment.
log.retention.hours # how long a log file is kept before it is deleted. Kafka deletes an old segment only when there is more than 1 log segment file.
log.retention.bytes # the log size (per partition) at which the log cleanup thread starts deleting old segments.
log.segment.bytes # the max size of a log segment. Once the max size is reached, a new segment is created. The default is 1 GB.
The log delete policy is triggered as soon as either log.retention.hours or log.retention.bytes is met, or both.
Setting only log.retention.hours does not guarantee that messages in Kafka are deleted once they are older than that value.
When the log cleanup condition is satisfied, Kafka deletes or compacts the old segments only if there is more than 1 log segment file.
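
The rule above can be sketched as a small Python model. This is a simplified illustration of the behavior described in these notes, not Kafka's actual implementation: the Segment class and count_deletable function are hypothetical names, the last segment in the list is assumed to be the active one and is never counted, and segments are checked from oldest to newest.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    newest_message_age_hours: float  # age of the newest message in the segment
    size_bytes: int


def count_deletable(segments, retention_hours, retention_bytes):
    """Count how many of the oldest segments are eligible for deletion.

    Assumption: the last list entry is the active segment and is never
    deleted. A negative retention value disables that check.
    """
    closed = segments[:-1]
    n = 0
    # Time check: walk from the oldest segment, stop at the first one
    # that is still within the retention time.
    for seg in closed:
        if retention_hours >= 0 and seg.newest_message_age_hours > retention_hours:
            n += 1
        else:
            break
    # Size check on what remains: drop old segments while the log
    # would still hold at least retention_bytes afterwards.
    if retention_bytes >= 0:
        excess = sum(s.size_bytes for s in segments[n:]) - retention_bytes
        for seg in closed[n:]:
            if excess - seg.size_bytes >= 0:
                excess -= seg.size_bytes
                n += 1
            else:
                break
    return n
```

With a single segment file the function always returns 0, however old the messages are, which is the case these notes warn about.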
Then, how to produce a new log segment file?
1. Set log.roll.hours to a value less than log.retention.hours. This ensures there is a new log segment by the time log.retention.hours is met.
PS: setting log.segment.bytes to a relatively small value also makes Kafka create a new log segment once the segment grows beyond that size, but this does not ensure that expired logs can be deleted, because there may not be enough messages to fill a log segment file even when log.segment.bytes is relatively small.
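
The two ways of getting a new segment can be sketched as one check. This is a minimal illustration, assuming a segment rolls when either its age or its size limit is hit; should_roll and its parameters are hypothetical names, and broker-side details such as roll jitter are ignored.

```python
def should_roll(segment_age_hours, segment_size_bytes, incoming_bytes,
                roll_hours=168, segment_bytes=1024 ** 3):
    """Return True if a new segment would be started before the next append.

    Assumption: age is measured from the first message in the segment,
    and an empty segment is never rolled on time alone.
    """
    time_expired = segment_size_bytes > 0 and segment_age_hours >= roll_hours
    size_exceeded = segment_size_bytes + incoming_bytes > segment_bytes
    return time_expired or size_exceeded
```

On a low-traffic topic the size condition may never become true within the retention time, so only the time condition gives a predictable roll.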
If log.retention.hours needs precise control, log.roll.hours should be a fraction of log.retention.hours, log.segment.delete.delay.ms should be set to 0 (default is 60000 ms), and log.retention.check.interval.ms should be set to a small value (default is 300000 ms; a very small check interval is not recommended, because the checks cost too many resources).
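
To see why these four settings matter together, here is a rough upper bound on how old a message can get before its file is removed. It is a sketch under the model used in these notes, assuming steady traffic so that segments roll on time; worst_case_age_hours is a hypothetical helper, not a Kafka API.

```python
MS_PER_HOUR = 3_600_000


def worst_case_age_hours(retention_hours, roll_hours,
                         check_interval_ms=300_000, delete_delay_ms=60_000):
    """Rough upper bound on a message's age when its segment file is removed.

    Assumption: the oldest message in a segment was written just after the
    segment was opened, the newest just before it rolled, and the segment
    becomes eligible only once its newest message passes the retention time.
    The retention check and the delete delay add on top of that.
    """
    return (retention_hours + roll_hours
            + (check_interval_ms + delete_delay_ms) / MS_PER_HOUR)
```

For example, retention 10 and roll 5 with the default intervals give roughly 15.1 hours, while roll 1 with a zero delete delay and a 60000 ms check interval gives roughly 11.02 hours.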
Finally, if logs older than 10 hours should be cleaned, what should the config be?
log.roll.hours = 5 # this ensures there is more than 1 file when log.retention.hours is met and the cleaner thread is triggered. Other values such as 1 or 2 also work.
log.retention.hours = 10 # this controls the log retention time.
If log.roll.hours is larger than 10, there will be only 1 file after 10 hours have passed, and Kafka will not delete the log that has existed longer than the retention hours.
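
A quick sanity check for this rule could look like the sketch below. It assumes the settings have already been parsed into a dict of numbers; check_retention_config is a hypothetical helper, and 168 is used as the fallback for both settings when they are not set.

```python
def check_retention_config(props):
    """Return a list of warnings for a retention setup.

    Assumption: props maps config names to numeric values that were
    already parsed from the broker config.
    """
    warnings = []
    roll = props.get("log.roll.hours", 168)
    retention = props.get("log.retention.hours", 168)
    if roll >= retention:
        warnings.append(
            "log.roll.hours (%s) is not less than log.retention.hours (%s): "
            "there may be only 1 segment file when the retention time is met"
            % (roll, retention)
        )
    return warnings
```

Calling it with log.roll.hours = 5 and log.retention.hours = 10 returns an empty list, while log.roll.hours = 12 produces one warning.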