Skip to content

Add retry for AccessDeniedException in AbstractFileWatchingService #128653

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jun 3, 2025

Conversation

jfreden
Copy link
Contributor

@jfreden jfreden commented May 30, 2025

This PR addresses #128619 where this appears to be happening:

  1. settings.json is written to config/operator/..TIMESTAMP_TEMP_FOLDER_1/settings.json
  2. A symlink config/operator/..DATA -> config/operator/..TIMESTAMP_TEMP_FOLDER_1 is created
  3. A symlink config/operator/settings.json -> config/operator/..DATA/settings.json is created
  4. File watcher picks up settings.json
  5. An updated settings.json is written to config/operator/..TIMESTAMP_TEMP_FOLDER_2/settings.json
  6. Symlink config/operator/..DATA is removed
  7. Symlink config/operator/..DATA -> config/operator/..TIMESTAMP_TEMP_FOLDER_2 is created
  8. File watcher picks up settings.json

In (6) the ..DATA symlink could be attempted to be resolved in parallel with the removal (because the files in config/operator are being reprocessed), if that happens WindowsFS could throw an AccessDeniedException and therefore the AbstractFileWatchingService throws an exception and stops. This is roughly how things work in Kubernetes so I feel hesitant modifying the test or muting this on Windows (even though kubernetes pods running windows seem unlikely + this is currently only used internally).

The proposed fix in this PR is to add retries for AccessDeniedException.

@jfreden jfreden added the test-windows Trigger CI checks on Windows label May 30, 2025
@jfreden jfreden changed the title Unmute testSymlinkUpdateTriggerReload Add retry for AccessDeniedException in AbstractFileWatchingService May 30, 2025
@jfreden jfreden added :Core/Infra/Settings Settings infrastructure and APIs >bug labels May 30, 2025
@jfreden jfreden marked this pull request as ready for review May 30, 2025 11:37
@jfreden jfreden requested a review from a team as a code owner May 30, 2025 11:37
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@elasticsearchmachine
Copy link
Collaborator

Hi @jfreden, I've created a changelog YAML for you.

@elasticsearchmachine elasticsearchmachine added the Team:Core/Infra Meta label for core/infra team label May 30, 2025
@jfreden
Copy link
Contributor Author

jfreden commented May 30, 2025

Looks like the failing test is a known issue: #128604

Copy link
Contributor

@mark-vieira mark-vieira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add a test for this similar to AbstractFileWatchingServiceTests#testRegisterWatchKeyRetry?

@jfreden
Copy link
Contributor Author

jfreden commented Jun 2, 2025

Thanks for the review @mark-vieira ! Great idea, I've added a test for that.

@jfreden jfreden requested a review from mark-vieira June 2, 2025 07:55
Copy link
Contributor

@mark-vieira mark-vieira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@jfreden jfreden merged commit 2696451 into elastic:main Jun 3, 2025
21 of 23 checks passed
joshua-adams-1 pushed a commit to joshua-adams-1/elasticsearch that referenced this pull request Jun 3, 2025
…lastic#128653)

* Unmute testSymlinkUpdateTriggerReload

* Add retry for AccessDeniedException in AbstractFileWatchingService

* Update docs/changelog/128653.yaml
Samiul-TheSoccerFan pushed a commit to Samiul-TheSoccerFan/elasticsearch that referenced this pull request Jun 5, 2025
…lastic#128653)

* Unmute testSymlinkUpdateTriggerReload

* Add retry for AccessDeniedException in AbstractFileWatchingService

* Update docs/changelog/128653.yaml
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Core/Infra/Settings Settings infrastructure and APIs Team:Core/Infra Meta label for core/infra team test-windows Trigger CI checks on Windows v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants