regex - Python regular expression to replace everything but specific words
I am trying to do the following with a regular expression:

    import re

    x = re.compile('[^(going)|^(you)]')    # words to keep (everything else is replaced)
    s = 'I am going home now, thank you.'  # string to modify
    print(re.sub(x, '_', s))
The result is:
'_____going__o___no______n__you_'
The result I want is:
'_____going_________________you_'
Since ^ can only be used for negation inside brackets [], this result makes sense, but I'm not sure how else to go about it.
I even tried '([^g][^o][^i][^n][^g])|([^y][^o][^u])', but that yields '_g_h___y_'.
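To see why the first attempt behaves that way, it may help to check that everything inside [...] is read as a set of single characters, so the parentheses, the | and the second ^ are plain literals. The sketch below compares the original class with a hand-written equivalent; the names `pattern`, `equivalent` and `masked` are only illustrative.

```python
import re

s = 'I am going home now, thank you.'

# Inside [...], '(', ')', '|' and a non-leading '^' are literal characters,
# so this class means "any one character that is not ( ) | ^ g i n o u y".
pattern = re.compile(r'[^(going)|^(you)]')
equivalent = re.compile(r'[^()|^ginouy]')

masked = pattern.sub('_', s)
print(masked)                             # the 'o' and 'n' in other words survive
print(masked == equivalent.sub('_', s))   # both classes behave the same
```

Any letter that happens to appear in "going" or "you" is kept wherever it occurs, which is why stray o and n characters remain in the output.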
It's not quite as easy as it first appears, since there is no "not" in REs except ^ inside [ ], which only matches one character (as you found). Here is my solution:

    import re

    def subit(m):
        # group 1 is the text to mask, group 2 is the word to keep (or '' at end of line)
        stuff, word = m.groups()
        return ("_" * len(stuff)) + word

    s = 'I am going home now, thank you.'  # string to modify
    print(re.sub(r'(.+?)(going|you|$)', subit, s))
This gives:
_____going_________________you_
To explain: the RE (I always use raw strings) matches one or more of any character (.+) but is non-greedy (?). This is captured in the first parentheses group (the brackets). That is followed by either "going" or "you" or the end-of-line ($), which is captured in the second group.
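One way to check this reading of the pattern is to list what each match captures before any substitution happens. This is only an inspection sketch using re.finditer; the variable name `matches` is illustrative.

```python
import re

s = 'I am going home now, thank you.'

# Each match is (text to be masked, keyword that ended the non-greedy run).
# The last match ends at end-of-line, so its second group is the empty string.
matches = [m.groups() for m in re.finditer(r'(.+?)(going|you|$)', s)]
for stuff, word in matches:
    print(repr(stuff), repr(word))
```

The string is consumed in three matches: the text up to "going", the text up to "you", and the trailing full stop up to the end of the line.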
The subit function (you can call it anything within reason) is called for each substitution. A match object is passed in, from which we can retrieve the captured groups. For the first group we only need its length, since we are replacing each character with an underscore. The returned string is substituted for the text that matched the pattern.
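If the list of words to keep varies, the same idea can be wrapped in a helper. The following is a sketch under assumptions rather than part of the original answer: the function name `mask_except` and its parameters `keep_words` and `fill` are hypothetical, and re.escape is used so that words containing regex metacharacters are matched literally.

```python
import re

def mask_except(text, keep_words, fill='_'):
    # Build the "going|you" alternation from the supplied words.
    alternation = '|'.join(re.escape(w) for w in keep_words)
    pattern = re.compile(r'(.+?)(' + alternation + r'|$)')

    def subit(m):
        # Mask the non-greedy run, then append the kept word (or '' at end of line).
        stuff, word = m.groups()
        return fill * len(stuff) + word

    return pattern.sub(subit, text)

result = mask_except('I am going home now, thank you.', ['going', 'you'])
print(result)
```

Because .+? must consume at least one character, a kept word at the very start of the text (or directly after another kept word) is masked as well; for example, mask_except('going home', ['going']) returns ten underscores.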