Why does replace-regexp backwards work so differently?
C-u - M-x replace-regexp \w+
The - prefix arg replaces backwards, but it hits one char at a time, as if the plus sign weren’t there. The same replacement forwards (without the prefix arg) does hit one word at a time. What’s going on, @[email protected]?
Because once it hits the ultimate character of a word, \w+ matches that (single) character; next time it matches the penultimate character, etc. You’d need \W\w+ to make it look far enough back to the beginning of the word.
So it’s literally looking at the regexp backwards and forwards at the same time 😵‍💫
The regexp itself always looks forward, the BACKWARD argument just determines which direction the point should move after a match.
Consider the string abc. From the end, moving backwards, when does it match \w+, and what does it match? When it reaches c, it matches c. And from the front, moving forwards? When it reaches a, it matches abc. This is why it acts differently.
Yes, I got that, that wasn’t the weird part. The weird part is why the matcher is searching char-by-char backwards in the first place as opposed to skipping match-by-match.
I’ll use “\b\w+”, that seems to work well. \W\w+ was not good since it caught the spaces.
(Thanks for your patient repeated replies, BTW, I don’t mean to come across as ungrateful.)
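For anyone who finds this thread later, the behaviour discussed above can be modelled in a few lines of Python. This is only a sketch, not Emacs code. It assumes the backward replacement works as described in the replies: step back one character at a time, try an ordinary forward match at that position, and never let the match run past the old point. The function names are made up for illustration, and Python's \w is not identical to Emacs's syntax-table-based \w.

```python
import re

def search_backward(pattern, text, point):
    """Model of one backward search: try a forward match at each position
    before `point`, nearest first, never matching past `point`."""
    for start in range(point - 1, -1, -1):
        # pos=start anchors the attempt; endpos=point caps the match end.
        m = pattern.match(text, start, point)
        if m:
            return m
    return None

def matches_backward(regexp, text):
    """Collect successive backward matches, starting from the end of text."""
    pattern = re.compile(regexp)
    point = len(text)
    found = []
    while True:
        m = search_backward(pattern, text, point)
        if m is None:
            break
        found.append(m.group())
        point = m.start()  # point moves to the beginning of the match
    return found

def matches_forward(regexp, text):
    """Ordinary left-to-right matching, for comparison."""
    return re.findall(regexp, text)

if __name__ == "__main__":
    print(matches_forward(r"\w+", "abc"))        # ['abc']
    print(matches_backward(r"\w+", "abc"))       # ['c', 'b', 'a']
    print(matches_backward(r"\W\w+", "foo bar")) # [' bar']
    print(matches_backward(r"\b\w+", "foo bar")) # ['bar', 'foo']
```

In this model \w+ is satisfied as soon as the scan reaches the last character of a word, so it never grows leftwards. \W\w+ drags the preceding space into the match and skips a word at the very start of the text. \b\w+ can only succeed at a word boundary, which is why it picks up whole words.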