Add new invisible_characters rule #6424
Open
kapitoshka438 wants to merge 8 commits into realm:main from kapitoshka438:invisible_characters
+316 −45
Commits
c53a277 Add new invisible_character rule (sm-eminiakhmetov)
40977d7 Make the rule correctable (sm-eminiakhmetov)
58f8483 Corrections optimizations (sm-eminiakhmetov)
ffdd71f Added include_hex_codes configuration element to allow adding more vi… (sm-eminiakhmetov)
f57e9a8 Conform UnicodeScalar to AcceptableByConfigurationElement (sm-eminiakhmetov)
3a57e6f Attempt to optimize memory efficiency (sm-eminiakhmetov)
d1fefa8 Replace zero-width joiner with soft hyphen in examples (sm-eminiakhmetov)
e71d360 Fixes for Windows/Linux (sm-eminiakhmetov)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
183 changes: 183 additions & 0 deletions
Source/SwiftLintBuiltInRules/Rules/Lint/InvisibleCharacterRule.swift
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,183 @@ | ||
| import SwiftSyntax | ||
| | ||
| @SwiftSyntaxRule(correctable: true) | ||
| struct InvisibleCharacterRule: Rule { | ||
| var configuration = InvisibleCharacterConfiguration() | ||
| | ||
| // swiftlint:disable invisible_character | ||
| static let description = RuleDescription( | ||
| identifier: "invisible_character", | ||
| name: "Invisible Character", | ||
| description: """ | ||
| Disallows invisible characters like zero-width space (U+200B), \ | ||
| zero-width non-joiner (U+200C), and FEFF formatting character (U+FEFF) \ | ||
| in string literals as they can cause hard-to-debug issues. | ||
| """, | ||
| kind: .lint, | ||
| nonTriggeringExamples: [ | ||
| Example(#"let s = "HelloWorld""#), | ||
| Example(#"let s = "Hello World""#), | ||
| Example(#"let url = "https://example.com/api""#), | ||
| Example(##"let s = #"Hello World"#"##), | ||
| Example(""" | ||
| let multiline = \"\"\" | ||
| Hello | ||
| World | ||
| \"\"\" | ||
| """), | ||
| Example(#"let empty = """#), | ||
| Example(#"let tab = "Hello\tWorld""#), | ||
| Example(#"let newline = "Hello\nWorld""#), | ||
| Example(#"let unicode = "Hello 👋 World""#), | ||
| ], | ||
| triggeringExamples: [ | ||
| Example(#"let s = "Hello↓World" // U+200B zero-width space"#), | ||
| Example(#"let s = "Hello↓World" // U+200C zero-width non-joiner"#), | ||
| Example(#"let s = "Hello↓World" // U+FEFF formatting character"#), | ||
| Example(#"let url = "https://example↓.com" // U+200B in URL"#), | ||
| Example(""" | ||
| // U+200B in multiline string | ||
| let multiline = \"\"\" | ||
| Hello↓World | ||
| \"\"\" | ||
| """), | ||
| Example(#"let s = "Test↓String↓Here" // Multiple invisible characters"#), | ||
| Example(#"let s = "Hel↓lo" + "World" // string concatenation with U+200C"#), | ||
| Example(#"let s = "Hel↓lo \(name)" // U+200C in interpolated string"#), | ||
| Example(""" | ||
| // | ||
| // additional_code_points: ["00AD"] | ||
| // | ||
| let s = "Hello↓World" | ||
| """, | ||
| configuration: [ | ||
| "additional_code_points": ["00AD"], | ||
| ] | ||
| ), | ||
| Example(""" | ||
| // | ||
| // additional_code_points: ["200D"] | ||
| // | ||
| let s = "Hello↓World" | ||
| """, | ||
| configuration: [ | ||
| "additional_code_points": ["200D"], | ||
| ] | ||
| ), | ||
| ], | ||
| corrections: [ | ||
| Example(#"let s = "HelloWorld""#): Example(#"let s = "HelloWorld""#), | ||
| Example(#"let s = "HelloWorld""#): Example(#"let s = "HelloWorld""#), | ||
| Example(#"let s = "HelloWorld""#): Example(#"let s = "HelloWorld""#), | ||
| Example(#"let url = "https://example.com""#): Example(#"let url = "https://example.com""#), | ||
| Example(""" | ||
| let multiline = \"\"\" | ||
| HelloWorld | ||
| \"\"\" | ||
| """): Example(""" | ||
| let multiline = \"\"\" | ||
| HelloWorld | ||
| \"\"\" | ||
| """), | ||
| Example(#"let s = "TestStringHere""#): Example(#"let s = "TestStringHere""#), | ||
| Example(#"let s = "Hello" + "World""#): Example(#"let s = "Hello" + "World""#), | ||
| Example(#"let s = "Hello \(name)""#): Example(#"let s = "Hello \(name)""#), | ||
| Example( | ||
| #"let s = "HelloWorld""#, | ||
| configuration: [ | ||
| "additional_code_points": ["00AD"], | ||
| ] | ||
| ): Example( | ||
| #"let s = "HelloWorld""#, | ||
| configuration: [ | ||
| "additional_code_points": ["00AD"], | ||
| ] | ||
| ), | ||
| Example( | ||
| #"let s = "HelloWorld""#, | ||
| configuration: [ | ||
| "additional_code_points": ["200D"], | ||
| ] | ||
| ): Example( | ||
| #"let s = "HelloWorld""#, | ||
| configuration: [ | ||
| "additional_code_points": ["200D"], | ||
| ] | ||
| ), | ||
| ] | ||
| ) | ||
| // swiftlint:enable invisible_character | ||
| } | ||
| | ||
| private extension InvisibleCharacterRule { | ||
| final class Visitor: ViolationsSyntaxVisitor<ConfigurationType> { | ||
| override func visitPost(_ node: StringLiteralExprSyntax) { | ||
| let violatingCharacters = configuration.violatingCharacters | ||
| for segment in node.segments { | ||
| guard let stringSegment = segment.as(StringSegmentSyntax.self) else { | ||
| continue | ||
| } | ||
| let text = stringSegment.content.text | ||
| let scalars = text.unicodeScalars | ||
| guard scalars.contains(where: { violatingCharacters.contains($0) }) else { | ||
| continue | ||
| } | ||
| var utf8Offset = 0 | ||
| var previousScalar: UnicodeScalar? | ||
| var previousUtf8Size = 0 | ||
| | ||
| for scalar in scalars { | ||
| defer { | ||
| previousScalar = scalar | ||
| previousUtf8Size = scalar.utf8.count | ||
| utf8Offset += scalar.utf8.count | ||
| } | ||
| guard violatingCharacters.contains(scalar) else { | ||
| continue | ||
| } | ||
| | ||
| let characterName = InvisibleCharacterConfiguration.defaultCharacterDescriptions[scalar.value] | ||
| ?? scalar.escaped(asASCII: true) | ||
| | ||
| // Check if this scalar forms a grapheme cluster with the previous one. | ||
| // This is needed on Windows and Linux where NSString operations on grapheme clusters | ||
| // can delete more than intended when removing a combining character like ZWJ. | ||
| let formsCombinedCluster: Bool | ||
| if let prev = previousScalar { | ||
| let combined = String(prev) + String(scalar) | ||
| formsCombinedCluster = combined.count == 1 | ||
| } else { | ||
| formsCombinedCluster = false | ||
| } | ||
| | ||
| let correctionStart: AbsolutePosition | ||
| let replacement: String | ||
| | ||
| if formsCombinedCluster, let prev = previousScalar { | ||
| // Include previous scalar in the correction range and use it as replacement | ||
| correctionStart = stringSegment.content.positionAfterSkippingLeadingTrivia | ||
| .advanced(by: utf8Offset - previousUtf8Size) | ||
| replacement = String(prev) | ||
| } else { | ||
| correctionStart = stringSegment.content.positionAfterSkippingLeadingTrivia | ||
| .advanced(by: utf8Offset) | ||
| replacement = "" | ||
| } | ||
| | ||
| let position = stringSegment.content.positionAfterSkippingLeadingTrivia.advanced(by: utf8Offset) | ||
| violations.append( | ||
| ReasonedRuleViolation( | ||
| position: position, | ||
| reason: "String literal should not contain invisible character \(characterName)", | ||
| correction: .init( | ||
| start: correctionStart, | ||
| end: position.advanced(by: scalar.utf8.count), | ||
| replacement: replacement | ||
| ) | ||
| ) | ||
| ) | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } | ||
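
The scanning logic in the visitor above can be tried outside SwiftLint. The following is a minimal sketch, not the rule's implementation: it swaps SwiftSyntax positions for plain UTF-8 byte offsets, and the names `defaultInvisibles`, `Finding`, and `findInvisibleScalars` are hypothetical. It walks the Unicode scalars of a string, tracks the running byte offset, and widens the replaced range when the flagged scalar forms a single grapheme cluster with the scalar before it.

```swift
// Hypothetical stand-in for the configured set; the three defaults mirror the diff.
let defaultInvisibles: Set<Unicode.Scalar> = ["\u{200B}", "\u{200C}", "\u{FEFF}"]

/// One flagged scalar and the byte range a correction would replace (hypothetical type).
struct Finding {
    let utf8Offset: Int      // byte offset of the invisible scalar within the text
    let removalStart: Int    // byte offset where the replaced range begins
    let replacement: String  // text written back in place of the removed bytes
}

func findInvisibleScalars(in text: String, flagged: Set<Unicode.Scalar>) -> [Finding] {
    var findings: [Finding] = []
    var utf8Offset = 0
    var previous: Unicode.Scalar?
    for scalar in text.unicodeScalars {
        // Runs at the end of every iteration, including the early `continue` below.
        defer {
            previous = scalar
            utf8Offset += scalar.utf8.count
        }
        guard flagged.contains(scalar) else { continue }
        if let prev = previous, (String(prev) + String(scalar)).count == 1 {
            // The scalar joins the previous one in a single grapheme cluster:
            // widen the range to cover both and write the previous scalar back.
            findings.append(Finding(utf8Offset: utf8Offset,
                                    removalStart: utf8Offset - prev.utf8.count,
                                    replacement: String(prev)))
        } else {
            findings.append(Finding(utf8Offset: utf8Offset,
                                    removalStart: utf8Offset,
                                    replacement: ""))
        }
    }
    return findings
}

let findings = findInvisibleScalars(in: "Test\u{200B}String\u{200B}Here", flagged: defaultInvisibles)
print(findings.map(\.utf8Offset))
```

U+200B occupies three bytes in UTF-8, so the two offsets printed here are 4 and 13. Counting in UTF-8 bytes rather than characters matches how the visitor advances `AbsolutePosition` with `advanced(by:)`.
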
39 changes: 39 additions & 0 deletions
Source/SwiftLintBuiltInRules/Rules/RuleConfigurations/InvisibleCharacterConfiguration.swift
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,39 @@ | ||
| import SwiftLintCore | ||
| | ||
| @AutoConfigParser | ||
| struct InvisibleCharacterConfiguration: SeverityBasedRuleConfiguration { | ||
| static let defaultCharacterDescriptions: [UInt32: String] = [ | ||
| 0x200B: "U+200B (zero-width space)", | ||
| 0x200C: "U+200C (zero-width non-joiner)", | ||
| 0xFEFF: "U+FEFF (zero-width no-break space)", | ||
| ] | ||
| | ||
| @ConfigurationElement(key: "severity") | ||
| private(set) var severityConfiguration = SeverityConfiguration<Parent>.error | ||
| @ConfigurationElement( | ||
| key: "additional_code_points", | ||
| postprocessor: { | ||
| let defaultScalars = defaultCharacterDescriptions.keys.compactMap { UnicodeScalar($0) } | ||
| $0.formUnion(defaultScalars) | ||
| } | ||
| ) | ||
| private(set) var violatingCharacters = Set<UnicodeScalar>() | ||
| } | ||
| | ||
| extension UnicodeScalar: AcceptableByConfigurationElement { | ||
| public init(fromAny value: Any, context ruleID: String) throws(Issue) { | ||
| guard let hexCode = value as? String, | ||
| let codePoint = UInt32(hexCode, radix: 16), | ||
| let scalar = Self(codePoint) else { | ||
| throw .invalidConfiguration( | ||
| ruleID: ruleID, | ||
| message: "\(value) is not a valid Unicode scalar code point." | ||
| ) | ||
| } | ||
| self = scalar | ||
| } | ||
| | ||
| public func asOption() -> OptionType { | ||
| .string(.init(value, radix: 16, uppercase: true)) | ||
| } | ||
| } |
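
The hex parsing in `init(fromAny:context:)` and the rendering in `asOption()` can be exercised in isolation. This is a rough sketch using only the standard library; `scalar(fromHex:)` and `hexString(for:)` are hypothetical helpers, not part of SwiftLint, and they return `nil` where the diff throws a configuration issue.

```swift
/// Hypothetical helper: parse a hex string such as "00AD" into a Unicode scalar.
/// Returns nil for non-hex input and for values that are not valid scalars
/// (surrogates, or anything above U+10FFFF).
func scalar(fromHex hexCode: String) -> Unicode.Scalar? {
    guard let codePoint = UInt32(hexCode, radix: 16) else { return nil }
    return Unicode.Scalar(codePoint)
}

/// Hypothetical helper: render a scalar with the same String initializer
/// that `asOption()` in the diff uses.
func hexString(for scalar: Unicode.Scalar) -> String {
    String(scalar.value, radix: 16, uppercase: true)
}

if let softHyphen = scalar(fromHex: "00AD") {
    print(hexString(for: softHyphen))
}
print(scalar(fromHex: "D800") == nil)    // surrogate code point, not a scalar
print(scalar(fromHex: "U+200B") == nil)  // the "U+" prefix is not hex
```

One thing this makes visible: the rendered form carries no leading zeros, so a value configured as `"00AD"` is rendered as `"AD"`.
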