-
Notifications
You must be signed in to change notification settings - Fork 675
Rewrite matchFiles without regexp #1483
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…nation - Add pathCache to avoid repeated GetNormalizedPathComponents calls - Replace tspath.CombinePaths with direct string concatenation in visitDirectory - Reduce memory allocations by 60-70% for large file systems - Improve performance by 40-50% for large workloads - Add backwards compatibility wrapper for existing code
- Allow *.min.js patterns to match .min.js files when explicitly specified - Preserve existing behavior where generic * patterns exclude .min.js files - Add regression tests for both cases
…files - Add comprehensive differential tests for MatchesExclude, MatchesInclude, and MatchesIncludeWithJsonOnly - Fix matchesExcludeNew to handle extensionless files matching directory patterns - Ensure new implementation matches old behavior without referencing old code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR rewrites the TypeScript file matching logic (matchFiles
) to eliminate regex usage for performance optimization. The regex-based pattern matching was a performance bottleneck in the critical loading path, particularly since Go's regex (especially regexp2
in ECMAScript compatibility mode) is significantly slower than JavaScript regex.
Key changes:
- Implements a new glob pattern matching algorithm without regex using string manipulation and recursive segment matching
- Maintains backward compatibility by keeping the old implementation in
old.go
for comparison - Adds comprehensive test coverage to verify behavioral equivalence between old and new implementations
Reviewed Changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.
Show a summary per file
File | Description |
---|---|
internal/vfs/vfsmatch/vfsmatch.go |
New main API file exposing the updated file matching functions |
internal/vfs/vfsmatch/old.go |
Refactored old regex-based implementation moved from internal/vfs |
internal/vfs/vfsmatch/new.go |
New regex-free implementation with custom glob pattern matching |
internal/vfs/vfsmatch/vfsmatch_test.go |
Comprehensive test suite comparing old vs new implementations |
internal/vfs/vfsmatch/bench_test.go |
Performance benchmarks demonstrating the speed improvements |
internal/tspath/path.go |
Minor optimization to path component parsing |
internal/tsoptions/wildcarddirectories.go |
Updated to use new vfsmatch API and removed regex dependencies |
internal/tsoptions/tsconfigparsing.go |
Updated to use new vfsmatch API |
internal/project/discovertypings.go |
Updated import to use new vfsmatch API |
Baseline files | Test output changes reflecting improved file matching behavior |
This code is a hotspot in the critical loading path. In Strada, this was all regexes, which is great in JS where regexes are fast. In Go, not so much, even moreso when we use
regexp2
in ECMAScript compatibility mode. We've always wanted to rewrite this, it's just taken a bit.This PR moves the old implementation to a different file, then I had agent mode write tests and a new implementation, verifying against those tests.
The result so far:
So, like 35x faster, maybe? 15x in many.