
Regular expression not working (all characters are shifted) #52

@dataexcess

Hi there,

I am using a regular expression to extract the URL of the 'Visually similar' button on a Google results page.
This is the regex I use:
"href=((?:(?!href).)*?)>Vis"

It works perfectly when tested on https://regexr.com/.
Here is some example text to match against:


enu-panel" role="menu" tabindex="-1" jsaction="keydown:Xiq7wd;mouseover:pKPowd;mouseout:O9bKS" data-ved="2ahUKEwjRitfP9vXvAhXf_7sIHViuBewQqR8wAXoECAMQBQ"><li class="action-menu-item" role="menuitem"><a class="fl" href="https://webcache.googleusercontent.com/search?q=cache:8lDNWm_duSMJ:https://www.trustedshops.com/+&amp;cd=2&amp;hl=en&amp;ct=clnk&amp;gl=de\" ping="/url?sa=t&source=web&rct=j&url=https://webcache.googleusercontent.com/search%3Fq%3Dcache:8lDNWm_duSMJ:https://www.trustedshops.com/%2B%26cd%3D2%26hl%3Den%26ct%3Dclnk%26gl%3Dde&amp;ved=2ahUKEwjRitfP9vXvAhXf_7sIHViuBewQIDABegQIAxAG\">Cached<div class="IsZvec"><span class="aCOpRe">Trusted Shops is the European Trustmark for online shops with money-back guarantee for consumers. Trusted Shops offers a comprehensive service to raise ...<div class="ULSxyf"><div jsmodel="gpo5Gf" class="LnbJhc" data-count="28" style="position:relative" data-iu="1" data-hveid="CAIQAA" data-ved="2ahUKEwjRitfP9vXvAhXf_7sIHViuBewQ8w0oAHoECAIQAA"><div class="e2BEnf U7izfe mfMhoc"><a class="ekf0x hSQtef" href="/search?tbs=simg:CAESiQIJQs8eCt9yzs0a_1QELELCMpwgaOgo4CAQSFNcy_1TP2GfwQ9zXEDcYqnTbjEq4kGhqVVToFJUcTvzott-6Sl5Qp4R6jBL5G5bKsuyAFMAQMCxCOrv4IGgoKCAgBEgTTC8hDDAsQne3BCRqdAQofCgxvZmZpY2UgY2hhaXLapYj2AwsKCS9tLzA4cTF4cAofCgxzd2l2ZWwgY2hhaXLapYj2AwsKCS9tLzBncTZreAoiCg9mdXJuaXR1cmUgc3R5bGXapYj2AwsKCS9qLzl3MHFqcwobCghmb3IgdGVlbtqliPYDCwoJL2EvNnEzMDY3ChgKBXNvbGlk2qWI9gMLCgkvYS8zbWcxY20M&q=trusted+shop&tbm=isch&sa=X&ved=2ahUKEwjRitfP9vXvAhXf_7sIHViuBewQjJkEegQIAhAB"><div class="iv236"><span class="iJddsb" style="height:20px;width:20px"><svg focusable="false" viewbox="0 0 24 24"><path d="M14 13l4 5H6l4-4 1.79 1.78L14 13zm-6.01-2.99A2 2 0 0 0 8 6a2 2 0 0 0-.01 4.01zM22 5v14a3 3 0 0 1-3 2.99H5c-1.64 0-3-1.36-3-3V5c0-1.64 1.36-3 3-3h14c1.65 0 3 1.36 3 3zm-2.01 0a1 1 0 0 0-1-1H5a1 1 0 0 0-1 1v14a1 1 0 0 0 1 1h7v-.01h7a1 1 0 0 0 1-1V5"><div class="iJ1Kvb"><h3 class="GmE3X" aria-level="2" role="heading">Visually similar images<div 
style="padding-bottom:0" id="iur">

<div jsmodel="" jscontroller="IkchZc" jsaction="PdWSXe:h5M12e;rcuQ6b:npT2md" jsdata="X2sNs;;CiOOHU"><div data-h="130" data-nr="4" style="margin-right:-2px;margin-bottom:-2px"><div jsname="dTDiAc" class="eA0Zlc qN5nNb tapJqb ivg-i" data-docid="DrX4TNBpITAGoM" jsdata="XZxcdf;DrX4TNBpITAGoM;CiOOJI" data-ved="2ahUKEwjRitfP9vXvAhXf_7sIHViuBewQ5r0BegQIIRAA"><a href="/search?q=trusted+shop&tbm=isch&source=iu&ictx=1&tbs=simg:CAESiQIJQs8eCt9yzs0a_1QELELCMpwgaOgo4CAQSFNcy_1TP2GfwQ9zXEDcYqnTbjEq4kGhqVVToFJUcTvzott-6Sl5Qp4R6jBL5G5bKsuyAFMAQMCxCOrv4IGgoKCAgBEgTTC8hDDAsQne3BCRqdAQofCgxvZmZpY2UgY2hhaXLapYj2AwsKCS9tLzA4cTF4cAofCgxzd2l2ZWwgY2hhaXLapYj2AwsKCS9tLzBncTZreAoiCg9mdXJuaXR1cmUgc3R5bGXapYj2AwsKCS9qLzl3MHFqcwobCghmb3IgdGVlbtqliPYDCwoJL2EvNnEzMDY3ChgKBXNvbGlk2qWI9gMLCgkvYS8zbWcxY20M&fir=DrX4TNBpITAGoM%252CO1KPcBx95JlNxM%252C&vet=1&usg=AI4_-


When I use this exact same regex with your Swift Regex library, I do not get the expected result. Instead I get the following:


FNcy_1TP2GfwQ9zXEDcYqnTbjEq4kGhqVVToFJUcTvzott-6Sl5Qp4R6jBL5G5bKsuyAFMAQMCxCOrv4IGgoKCAgBEgTTC8hDDAsQne3BCRqdAQofCgxvZmZpY2UgY2hhaXLapYj2AwsKCS9tLzA4cTF4cAofCgxzd2l2ZWwgY2hhaXLapYj2AwsKCS9tLzBncTZreAoiCg9mdXJuaXR1cmUgc3R5bGXapYj2AwsKCS9qLzl3MHFqcwobCghmb3IgdGVlbtqliPYDCwoJL2EvNnEzMDY3ChgKBXNvbGlk2qWI9gMLCgkvYS8zbWcxY20M&q=trusted+shop&tbm=isch&sa=X&ved=2ahUKEwjRitfP9vXvAhXf_7sIHViuBewQjJkEegQIAhAB"><div class="iv236"><span class="iJddsb" style="height:20px;width:20px"><svg focusable="false" viewbox="0 0 24 24"><path d="M14 13l4 5H6l4-4 1.79 1.78L14 13zm-6.01-2.99A2 2 0 0 0 8 6a2 2 0 0 0-.01 4.01zM22 5v14a3 3 0 0 1-3 2.99H5c-1.64 0-3-1.36-3-3V5c0-1.64 1.36-3 3-3h14c1.65 0 3 1.36 3 3zm-2.01 0a1 1 0 0 0-1-1H5a1 1 0 0 0-1 1v14a1 1 0 0 0 1 1h7v-.01h7a1 1 0 0 0 1-1V5">

<div class="iJ1Kvb"><h3 class="GmE3X" aria-level="2" role="heading">Visually similar images<div s


As you can see, the result somehow extends beyond the ">Vis" that ends the pattern. In addition, many characters are missing from the start of the expected capture: all the characters immediately after "href=".

I tried rewriting my regex many times, but since it is confirmed to work in regex testers, I have to conclude that something is wrong with this Regex CocoaPod.

Please help! Thank you
A link to the regex helper tool: regexr.com/5qfv4
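
For reference, here is a minimal sketch of one way such a shift can arise. It is not a confirmed diagnosis of this pod. It uses Foundation's NSRegularExpression directly, and the helper names `correctCapture` and `shiftedCapture` as well as the shortened sample string are made up for illustration. NSRegularExpression reports ranges in UTF-16 code units, while Swift `String` offsets count Characters, so text that contains "\r\n" pairs or emoji before the match (an assumption about the real page) moves the result forward if the two are mixed up.

```swift
import Foundation

// Hypothetical helper: converts the UTF-16 based NSRange back to String
// indices with Range(_:in:), which is the conversion Foundation provides.
func correctCapture(of pattern: String, in text: String) -> String? {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return nil }
    let whole = NSRange(text.startIndex..<text.endIndex, in: text)
    guard let match = regex.firstMatch(in: text, options: [], range: whole),
          let range = Range(match.range(at: 1), in: text) else { return nil }
    return String(text[range])
}

// Hypothetical helper with a deliberate bug: it treats the UTF-16 location
// and length as Character offsets, which shifts the slice forward whenever
// earlier text has Characters made of more than one UTF-16 code unit.
func shiftedCapture(of pattern: String, in text: String) -> String? {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return nil }
    let whole = NSRange(location: 0, length: text.utf16.count)
    guard let match = regex.firstMatch(in: text, options: [], range: whole) else { return nil }
    let r = match.range(at: 1)
    guard let start = text.index(text.startIndex, offsetBy: r.location, limitedBy: text.endIndex),
          let end = text.index(start, offsetBy: r.length, limitedBy: text.endIndex) else { return nil }
    return String(text[start..<end])
}

// Made-up, shortened sample. Each "\r\n" and the emoji is one Character
// but two UTF-16 code units.
let sample = "\r\n\r\n🙂<a class=\"fl\" href=\"/cached\">Cached</a>"
    + "<a href=\"/search?tbm=isch\">Visually similar</a>"
let pattern = "href=((?:(?!href).)*?)>Vis"

let captured = correctCapture(of: pattern, in: sample)
let shifted = shiftedCapture(of: pattern, in: sample)
print(captured ?? "no match")
print(shifted ?? "no match")
```

With this sample the second result loses characters at the start and runs past ">Vis", which resembles the output above. Whether the pod does this internally would need to be checked in its source.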
