Help needed with regex and html [SOLVED]

MultiplexLayout
Posts: 56
Joined: 2020-09-23 19:21
Has thanked: 7 times

#1 Post by MultiplexLayout »

I am trying to modify an EPUB file that uses image files in place of accented letters, and I am replacing those images with Unicode characters. To do this I need to run a find and replace with a regex that matches a single image tag, from its opening angle bracket to its closing one. However, consider the following:

Code:

<img alt="image" src="c0011-01.jpg"/>s</strong> because the <strong>mi</strong> is short, but <strong>a-m<img alt="image" src="c0007-01.jpg"/>-tus</strong> because the <strong>m<img alt="image" src="c0025-01.jpg"/>
I want to target the image tag containing c0007-01.jpg, but it is flanked by two other image tags. Every regex I have tried matches from the first image tag (c0011-01.jpg) through to the third (c0025-01.jpg). I need a regex that:
  • starts at the "<" and ends at the ">" (so I can execute a find and replace cleanly)
  • must contain c0007-01.jpg
  • does not contain any additional "<" within
If I have a regex that fulfills the above criteria, I'm fairly sure that it will only target the tag I want. Any help would be greatly appreciated.
Last edited by MultiplexLayout on 2021-05-10 14:39, edited 1 time in total.

dilberts_left_nut
Administrator
Posts: 5346
Joined: 2009-10-05 07:54
Location: enzed
Has thanked: 12 times
Been thanked: 66 times

Re: Help needed with regex and html

#2 Post by dilberts_left_nut »

You need a 'non-greedy' match.
Probably include a 'NOT <' term.
I can't spit one out ATM, but that might help your search.
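
To make the two hints concrete, here is a minimal sketch in Python's `re` module (an assumption; the thread never says which tool or regex flavour is in use). It runs three candidate patterns against the sample markup from the first post. The variable names are made up for illustration. It suggests that a non-greedy quantifier on its own is not enough, because the match still begins at the first `<img`, whereas a 'NOT <' character class keeps the match inside one tag.

```python
import re

# Sample markup copied from the first post, joined into one string.
html = (
    '<img alt="image" src="c0011-01.jpg"/>s</strong> because the <strong>mi</strong> '
    'is short, but <strong>a-m<img alt="image" src="c0007-01.jpg"/>-tus</strong> '
    'because the <strong>m<img alt="image" src="c0025-01.jpg"/>'
)

# Greedy: .* grabs as much as it can, so the match runs from the first
# "<img" to the last "/>" in the text.
greedy = re.search(r'<img.*c0007-01\.jpg.*/>', html).group(0)

# Non-greedy only: the end is now correct, but the match still starts at
# the first "<img", because the engine takes the leftmost possible start.
lazy = re.search(r'<img.*?c0007-01\.jpg.*?/>', html).group(0)

# 'NOT <' term: [^<>] cannot cross a tag boundary, so only the tag that
# actually contains the file name can match.
bounded = re.search(r'<img[^<>]*c0007-01\.jpg[^<>]*/>', html).group(0)

print(greedy == html)
print(lazy)
print(bounded)
```

With the character class in place, it no longer matters whether the quantifiers are greedy or lazy, since neither can step past a `<` or `>`.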

MultiplexLayout
Posts: 56
Joined: 2020-09-23 19:21
Has thanked: 7 times

Re: Help needed with regex and html

#3 Post by MultiplexLayout »

dilberts_left_nut wrote:You need a 'non-greedy' match.
Probably include a 'NOT <' term.
I can't spit one out ATM, but that might help your search.
This gave me the insight I needed. Thank you. For anyone stumbling on this thread, the following regex solved my problem:

Code:

<img[^\/]*?c0007-01\.jpg[^\/]*?\/>
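
For anyone who wants to script the whole find and replace, the following is a rough sketch in Python (an assumption; the thread does not say which tool did the actual replacing). The `REPLACEMENTS` table and the `replace_images` helper are hypothetical names. The character "ā" is only a guess for illustration, because the thread does not say which letter each image stands for. The pattern is a variation on the regex above: it uses `[^<>]` rather than `[^\/]`, so a `src` value that contains a directory path should still match.

```python
import re

# Hypothetical mapping from image file name to replacement character.
# The thread does not state which letter each image represents.
REPLACEMENTS = {"c0007-01.jpg": "ā"}


def replace_images(html, mapping):
    """Replace each <img .../> tag whose attributes mention a mapped file name."""
    for filename, char in mapping.items():
        # re.escape makes the "." in the file name match a literal dot only.
        pattern = r"<img[^<>]*?" + re.escape(filename) + r"[^<>]*?/>"
        # A function replacement avoids any backslash processing of `char`.
        html = re.sub(pattern, lambda _match, c=char: c, html)
    return html


sample = (
    '<strong>a-m<img alt="image" src="c0007-01.jpg"/>-tus</strong> '
    'because the <strong>m<img alt="image" src="c0025-01.jpg"/>'
)
print(replace_images(sample, REPLACEMENTS))
```

Tags whose file names are not in the table are left untouched, so the table can be filled in one image at a time.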
