After I had finally managed to craft this beast, I felt so proud that I just had to share it:
/(<img([^\/]|\/(?!>))*)>/U
If I were to translate this into plain language, it would go something like this: after each occurrence of <img, skip all non-/ characters, and also every / that is not immediately followed by a >, up to the next occurrence of >. If there is no match at that point (that is, no > without an immediately preceding /), move on to the next occurrence of <img.
And here it is again with each part of the expression in parentheses, between quotation marks, following the corresponding translation: after each occurrence of <img, skip all (“*”) non-/ characters (“[^\/]”), and also (“|”) every / that is not immediately followed by a > (“\/(?!>)”), up to the next occurrence of > (the “U” modifier makes the quantifier ungreedy, so it stops at the first one). If there is no match at that point (that is, no > without an immediately preceding /), move on to the next occurrence of <img.
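To check the translation against a few inputs, here is a minimal sketch in Python. It rests on some assumptions: Python's re module has no U modifier, so the lazy quantifier *? stands in for it, the slashes are left unescaped because there are no pattern delimiters, and the sample strings are made up for illustration.

```python
import re

# Assumption: Python's re has no U (ungreedy) modifier, so *? replaces * here.
pattern = re.compile(r'(<img([^/]|/(?!>))*?)>')

samples = [
    '<img src="a.png">',       # unterminated: should match
    '<img src="pics/a.png">',  # a / not followed by > is skipped, so this matches too
    '<img src="a.png" />',     # already terminated: should not match
]

# True where the pattern finds an unterminated <img> tag.
results = [bool(pattern.search(s)) for s in samples]
print(results)  # [True, True, False]
```

The third sample fails because the scan gets stuck on the / that sits directly before the >: it cannot be skipped, and it is not a > either.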
The outermost parentheses (/(<img...)>/U) grab the contents of the unterminated <img> tag apart from the closing angle bracket, so that it’s easy to pair with a “ />” to properly terminate it. The other parentheses are for defining subpatterns.
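The replacement step described above could look roughly like this in Python. This is a sketch under assumptions, not necessarily how the pattern was originally used: the function name terminate_img_tags is hypothetical, and *? again stands in for the U modifier, which Python's re module lacks.

```python
import re

# Assumption: *? stands in for the U modifier, which Python's re lacks.
UNTERMINATED_IMG = re.compile(r'(<img([^/]|/(?!>))*?)>')

def terminate_img_tags(html):
    """Append ' />' to every <img ...> tag that is not already terminated."""
    # \1 is the outermost group: the tag text without its closing >.
    return UNTERMINATED_IMG.sub(r'\1 />', html)

fixed = terminate_img_tags('<p><img src="pics/a.png"><img src="b.png" /></p>')
print(fixed)  # <p><img src="pics/a.png" /><img src="b.png" /></p>
```

The second tag in the sample is already terminated, so it is left untouched.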