commit bd5ea6b1b3745d807d6ded9261ba49df1c292e55
Author: Dominic Mayers
Date:   Wed Jan 3 15:43:41 2024 -0500

    parser: Find the closing tag with a balanced inner text

    The current parser code fails when the input contains two levels of
    nesting with an empty ref tag. For example, it fails on AB. You will
    not see this on a real page, because other problems hide it. With
    the series of patches that I am offering for the Cite extension,
    those other problems disappear and this limitation becomes visible.

    The faulty code is this preg_match call in Preprocessor_Hash.php:

        preg_match( "/<\/" . preg_quote( $name, '/' ) . "\s*>/i",
            $text, $matches, PREG_OFFSET_CAPTURE, $offset )

    It expects $offset to be the start of the inner text. In the above
    example, $offset will be 5. The code then computes the inner text
    with the expression

        $inner = substr( $text, $offset, $matches[0][1] - $offset );

    Because the pattern stops at the first closing tag found after
    $offset, without checking whether that tag closes a nested element,
    the inner text is cut short. In the above example, the returned
    $inner is AB. The correct value is AB.

    This commit defines a function pregFindClosingTag() to replace the
    faulty preg_match call.

    Bug: T22707
    Change-Id: I5e6c17f553623d4f80a8f73b527fb4b9a7dbb421
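
The difference between stopping at the first closing tag and stopping at the balanced one can be sketched as follows. This is only an illustration under stated assumptions: findBalancedClosingTag() is a hypothetical helper, not the pregFindClosingTag() added by this commit (whose signature is not shown here), and the sample input is an assumed string chosen so that the inner text starts at offset 5.

```php
<?php
/**
 * Hypothetical helper for illustration only; not the commit's
 * pregFindClosingTag(). Scans forward from $offset and returns
 * [ position, length ] of the closing tag that balances the element
 * whose inner text starts at $offset, or null if there is none.
 */
function findBalancedClosingTag( string $text, string $name, int $offset ): ?array {
	// Group 1 is "/" for a closing tag; group 2 holds whatever follows
	// the tag name, which ends in "/" for a self-closing tag.
	$pattern = "/<(\\/?)" . preg_quote( $name, '/' ) . "(?=[\\s\\/>])([^>]*)>/i";
	$depth = 0;
	$pos = $offset;
	while ( preg_match( $pattern, $text, $m, PREG_OFFSET_CAPTURE, $pos ) === 1 ) {
		[ $tag, $tagPos ] = $m[0];
		$pos = $tagPos + strlen( $tag );
		if ( $m[1][0] === '/' ) {
			if ( $depth === 0 ) {
				// No nested element is still open: this is the match.
				return [ $tagPos, strlen( $tag ) ];
			}
			$depth--;
		} elseif ( substr( rtrim( $m[2][0] ), -1 ) !== '/' ) {
			// A nested opening tag that is not self-closing.
			$depth++;
		}
	}
	return null;
}

// Assumed sample input: an outer ref containing an empty nested ref.
$text = '<ref>A<ref></ref>B</ref>';
$offset = 5;

// First-match behaviour, as in the preg_match call quoted above.
preg_match( "/<\\/ref\\s*>/i", $text, $matches, PREG_OFFSET_CAPTURE, $offset );
$naiveInner = substr( $text, $offset, $matches[0][1] - $offset );

// Balanced behaviour with the hypothetical helper.
$found = findBalancedClosingTag( $text, 'ref', $offset );
$balancedInner = substr( $text, $offset, $found[0] - $offset );
```

With this input the first-match approach stops at the nested closing tag, while the balanced scan skips it and stops at the outer one. The depth counter is the simplest way to get that behaviour; a recursive PCRE pattern would be an alternative design.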