-
Notifications
You must be signed in to change notification settings - Fork 25.4k
Add retry for AccessDeniedException in AbstractFileWatchingService #128653
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Pinging @elastic/es-core-infra (Team:Core/Infra) |
Hi @jfreden, I've created a changelog YAML for you. |
Looks like the failing test is a known issue: #128604 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add a test for this similar to AbstractFileWatchingServiceTests#testRegisterWatchKeyRetry
?
Thanks for the review @mark-vieira ! Great idea, I've added a test for that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
…lastic#128653) * Unmute testSymlinkUpdateTriggerReload * Add retry for AccessDeniedException in AbstractFileWatchingService * Update docs/changelog/128653.yaml
…lastic#128653) * Unmute testSymlinkUpdateTriggerReload * Add retry for AccessDeniedException in AbstractFileWatchingService * Update docs/changelog/128653.yaml
This PR addresses #128619 where this appears to be happening:
settings.json
is written toconfig/operator/..TIMESTAMP_TEMP_FOLDER_1/settings.json
config/operator/..DATA -> config/operator/..TIMESTAMP_TEMP_FOLDER_1
is createdconfig/operator/settings.json -> config/operator/..DATA/settings.json
is createdsettings.json
settings.json
is written toconfig/operator/..TIMESTAMP_TEMP_FOLDER_2/settings.json
config/operator/..DATA
is removedconfig/operator/..DATA -> config/operator/..TIMESTAMP_TEMP_FOLDER_2
is createdsettings.json
In (6) the
..DATA
symlink could be attempted to be resolved in parallel with the removal (because the files inconfig/operator
are being reprocessed), if that happensWindowsFS
could throw anAccessDeniedException
and therefore theAbstractFileWatchingService
throws an exception and stops. This is roughly how things work in Kubernetes so I feel hesitant modifying the test or muting this on Windows (even though kubernetes pods running windows seem unlikely + this is currently only used internally).The proposed fix in this PR is to add retries for
AccessDeniedException
.