A friend asked for a regex that matches a paragraph that contains only upper-case text inside a nested hierarchy of tags. Some examples:
Matches:
<p class="abcdefg"><a href="1.htm"><span>HELLO THERE</span></a></p>
<p class="c8"><span class="c7">BY ERIC D. JAMES, MD</span></p>
<p style="border:1px solid red">HELLO DARLING</p>
Fails:
<p class="c8"><span class="c7">BY Eric James, MD</span></p>
<p style="border:1px solid red">Hello Darling</p>
<p class="abcdefg"><a href="1.htm"><span>HELLO THeRE</span></a></p>
I came up with the following expression:
/<p[^>]*>(<[^>]*>)*[^a-z<]+(<\/[^p][^>]*>)*<\/p[^>]*>/
It doesn’t handle tags interspersed with text (for example, <p>HELLO <b>THERE</b></p>) or nested paragraph tags.
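For anyone who wants to try the expression, here is a minimal sketch in Python that runs it against the examples above. The choice of Python, the name `is_all_caps_paragraph`, and the extra interspersed-tag sample are my assumptions for illustration. The pattern itself is the one from this post, split across lines only so each piece can carry a comment.

```python
import re

# The expression from the post, broken into commented pieces.
ALL_CAPS_PARAGRAPH = re.compile(
    r"<p[^>]*>"          # opening <p ...> tag
    r"(<[^>]*>)*"        # any run of tags before the text
    r"[^a-z<]+"          # text containing no lower-case ASCII letters
    r"(</[^p][^>]*>)*"   # closing tags whose name does not start with 'p'
    r"</p[^>]*>"         # closing </p> tag
)


def is_all_caps_paragraph(html: str) -> bool:
    """Return True if the expression finds a match anywhere in html."""
    return ALL_CAPS_PARAGRAPH.search(html) is not None


should_match = [
    '<p class="abcdefg"><a href="1.htm"><span>HELLO THERE</span></a></p>',
    '<p class="c8"><span class="c7">BY ERIC D. JAMES, MD</span></p>',
    '<p style="border:1px solid red">HELLO DARLING</p>',
]

should_fail = [
    '<p class="c8"><span class="c7">BY Eric James, MD</span></p>',
    '<p style="border:1px solid red">Hello Darling</p>',
    '<p class="abcdefg"><a href="1.htm"><span>HELLO THeRE</span></a></p>',
]

# Hypothetical sample for the known limitation: text interspersed with tags.
interspersed = "<p>HELLO <b>THERE</b></p>"

for sample in should_match + should_fail + [interspersed]:
    print(is_all_caps_paragraph(sample), sample)
```

Note that `[^a-z<]+` only excludes lower-case ASCII letters, so digits, punctuation, and spaces all count as "upper-case" text here. The interspersed sample is rejected even though its text is entirely upper-case, which is the limitation mentioned above.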