Skip to content

Commit 4c57e48

Browse files
authored
fix: polynomial regex on uncontrolled input (#3196)
* fix: add possessive quantifier to regex match for formatted strings (Fixes https://github.com/faker-ruby/faker/security/code-scanning/14) Adds a possessive quantifier (an extra `+`) to the regex to prevent backtracking. This is to prevent the issue 'Polynomial regular expression used on uncontrolled data code'. It seems a bit overkill, given how these formatted strings are used, but I don't think it hurts either. See reference: https://ruby-doc.org/3.4.1/Regexp.html#class-Regexp-label-Greedy-2C+Lazy-2C+or+Possessive+Matching * fix: add possessive quantifier to regex match for regexify (Fixes https://github.com/faker-ruby/faker/security/code-scanning/13) Similar to commit fa6d5df. Adds a possessive quantifier (an extra `+`) to the regex to prevent backtracking. See reference: https://ruby-doc.org/3.4.1/Regexp.html#class-Regexp-label-Greedy-2C+Lazy-2C+or+Possessive+Matching
1 parent 2258715 commit 4c57e48

File tree

1 file changed

+8
-7
lines changed

1 file changed

+8
-7
lines changed

lib/faker.rb

Lines changed: 8 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -95,12 +95,12 @@ def regexify(reg)
9595
reg
9696
.gsub(%r{^/?\^?}, '').gsub(%r{\$?/?$}, '') # Ditch the anchors
9797
.gsub(/\{(\d+)\}/, '{\1,\1}').gsub('?', '{0,1}') # All {2} become {2,2} and ? become {0,1}
98-
.gsub(/(\[[^\]]+\])\{(\d+),(\d+)\}/) { |_match| Regexp.last_match(1) * sample(Array(Range.new(Regexp.last_match(2).to_i, Regexp.last_match(3).to_i))) } # [12]{1,2} becomes [12] or [12][12]
99-
.gsub(/(\([^)]+\))\{(\d+),(\d+)\}/) { |_match| Regexp.last_match(1) * sample(Array(Range.new(Regexp.last_match(2).to_i, Regexp.last_match(3).to_i))) } # (12|34){1,2} becomes (12|34) or (12|34)(12|34)
100-
.gsub(/(\\?.)\{(\d+),(\d+)\}/) { |_match| Regexp.last_match(1) * sample(Array(Range.new(Regexp.last_match(2).to_i, Regexp.last_match(3).to_i))) } # A{1,2} becomes A or AA or \d{3} becomes \d\d\d
101-
.gsub(/\((.*?)\)/) { |match| sample(match.gsub(/[()]/, '').split('|')) } # (this|that) becomes 'this' or 'that'
102-
.gsub(/\[([^\]]+)\]/) { |match| match.gsub(/(\w-\w)/) { |range| sample(Array(Range.new(*range.split('-')))) } } # All A-Z inside of [] become C (or X, or whatever)
103-
.gsub(/\[([^\]]+)\]/) { |_match| sample(Regexp.last_match(1).chars) } # All [ABC] become B (or A or C)
98+
.gsub(/(\[[^\]]++\])\{(\d+),(\d+)\}/) { |_match| Regexp.last_match(1) * sample(Array(Range.new(Regexp.last_match(2).to_i, Regexp.last_match(3).to_i))) } # [12]{1,2} becomes [12] or [12][12]
99+
.gsub(/(\([^)]++\))\{(\d+),(\d+)\}/) { |_match| Regexp.last_match(1) * sample(Array(Range.new(Regexp.last_match(2).to_i, Regexp.last_match(3).to_i))) } # (12|34){1,2} becomes (12|34) or (12|34)(12|34)
100+
.gsub(/(\\?.)\{(\d+),(\d+)\}/) { |_match| Regexp.last_match(1) * sample(Array(Range.new(Regexp.last_match(2).to_i, Regexp.last_match(3).to_i))) } # A{1,2} becomes A or AA or \d{3} becomes \d\d\d
101+
.gsub(/\((.*?)\)/) { |match| sample(match.gsub(/[()]/, '').split('|')) } # (this|that) becomes 'this' or 'that'
102+
.gsub(/\[([^\]]++)\]/) { |match| match.gsub(/(\w-\w)/) { |range| sample(Array(Range.new(*range.split('-')))) } } # All A-Z inside of [] become C (or X, or whatever)
103+
.gsub(/\[([^\]]++)\]/) { |_match| sample(Regexp.last_match(1).chars) } # All [ABC] become B (or A or C)
104104
.gsub('\d') { |_match| sample(Numbers) }
105105
.gsub('\w') { |_match| sample(Letters) }
106106
end
@@ -133,7 +133,8 @@ def fetch_all(key)
133133
# formatted translation: e.g., "#{first_name} #{last_name}".
134134
def parse(key)
135135
fetched = fetch(key)
136-
parts = fetched.scan(/(\(?)#\{([A-Za-z]+\.)?([^}]+)\}([^#]+)?/).map do |prefix, kls, meth, etc|
136+
137+
parts = fetched.scan(/(\(?)#\{([A-Za-z]+\.)?([^}]+)\}([^#]++)?/).map do |prefix, kls, meth, etc|
137138
# If the token had a class Prefix (e.g., Name.first_name)
138139
# grab the constant, otherwise use self
139140
cls = kls ? Faker.const_get(kls.chop) : self

0 commit comments

Comments
 (0)