fix: add download timeouts and retry on transient failures#247
Open
bouclem wants to merge 1 commit into
Open
Conversation
Fixes MCPHackers#242. Three downloader weaknesses combined to make setup fragile when a remote (e.g. vault.omniarchive.uk) is slow or flaky: - openURLStream did not set connect or read timeouts, so a stalled remote could hang the GUI indefinitely (user reported being stuck at 5%) - downloadFile only attempted once, so any single SSL reset or timeout aborted the whole setup - A failed mid-stream download left a partial file on disk that could confuse subsequent runs Set 30s connect / 60s read timeouts, retry up to 3 times with linear backoff on IOException, and clean up the partial file before each retry. Also closes the URL stream via try-with-resources, fixing a small resource leak in the original.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #242.
User reported a1.2.6 setup hanging mid-download (stuck at 5% on
https://vault.omniarchive.uk/archive/java/server-alpha/a0.2.8.jar), eventually failing withSSLException: Connection resetorConnectException: Connection timed out. Three weaknesses in the downloader combined to make this much worse than it should be:FileUtil.openURLStreamdid not set connect or read timeouts, so a stalled remote could hang the GUI indefinitely - that's why the user saw the progress bar freeze at 5%FileUtil.downloadFileonly attempted once, so any single TLS reset or transient timeout aborted the entire setupChanges
IOExceptionwith linear backoff (1s, 2s), coveringSSLException,SocketException,ConnectException,SocketTimeoutExceptionChannels.newChannel(openURLStream(url))didn't have a closing path on success)Why this fixes the issue
The user's connection to
vault.omniarchive.ukis flaky but not consistently broken - their second attempt got further than the first. Without retries one bad packet at the wrong moment kills the whole setup. With retries, transient resets are absorbed transparently. The timeouts also mean the user doesn't have to wait minutes staring at a frozen progress bar before the failure is reported.Testing
gradlew buildpassesI'm not 100% sure this fully resolves #242 (some networks may genuinely be unable to reach the vault at all), but it should turn intermittent failures into successful retries and make actual unreachable remotes fail clearly within ~90s instead of hanging indefinitely.