How to get rid of \r\n and <br /> elements in a string

I’ve web-scraped addresses, and the address strings contain unwanted elements like “\r\n” and “<br />”. How do I remove them?

Rosemount Viaduct,<br />\r\nAberdeen<br />\r\n


You can clean up these HTML leftovers with a regular expression:

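    import re

    value = "Rosemount Viaduct,<br />\r\nAberdeen<br />\r\n"
    # Remove each <br /> tag together with the \r\n that follows it
    clean_value = re.sub(r'<br\s*/?>\r\n', '', value)
    print(clean_value)  # Rosemount Viaduct,Aberdeen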
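
If the scraped strings are less regular than the sample (for example `<br>` or `<BR/>` variants, or a tag that is not always followed by `\r\n`), a slightly more tolerant approach may help. The sketch below is an assumption about what the data might look like, not part of the original answer; `clean_address` is a hypothetical helper name. It replaces every `<br>` variant with a space and then collapses all runs of whitespace, including `\r` and `\n`, into single spaces:

```python
import re

def clean_address(raw):
    """Hypothetical helper: strip <br> tags and newline characters from a scraped address."""
    # Replace any <br>, <br/>, or <br /> tag (case-insensitive) with a space
    without_tags = re.sub(r'<br\s*/?>', ' ', raw, flags=re.IGNORECASE)
    # str.split() with no arguments splits on any whitespace (spaces, \r, \n, tabs),
    # so joining the pieces collapses every run of whitespace into one space
    return ' '.join(without_tags.split())

print(clean_address("Rosemount Viaduct,<br />\r\nAberdeen<br />\r\n"))
# Rosemount Viaduct, Aberdeen
```

Unlike removing the matches outright, this keeps a space between the address parts, so "Rosemount Viaduct," and "Aberdeen" do not run together.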