I want to delete all JavaScript found in a string with PHP. I'm trying to do it through preg_replace like this: $text = preg_replace("/<script.*<\/script>/mis", "", $text);

But after doing this, $text is an empty string. What am I doing wrong? When I try this regex on http://www.phpliveregex.com/ it seems to work, but in real use it returns nothing.

  • Try $text = preg_replace("/<script.*?<\/script>/mis", "", $text); Commented Jun 1, 2015 at 16:37
  • That seems to work, thank you! Would you care to elaborate on why that extra question mark solved the problem? Regexes are one of the features of development I don't seem to fully understand. Commented Jun 1, 2015 at 17:04
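To illustrate what the extra question mark changes, here is a minimal sketch comparing the greedy pattern from the question with the lazy one from the comment. The sample string is hypothetical. Whether greedy matching alone explains the asker's empty result is an assumption; preg_replace also returns null when PCRE hits an error (for example a backtrack limit on large input), so the sketch checks for that as well.

```php
<?php
// Hypothetical sample input with two separate script blocks.
$text = 'Hello <script>a();</script> world <script>b();</script>!';

// Greedy: .* extends to the LAST </script> in the string,
// so the text between the two script blocks is removed too.
$greedy = preg_replace('/<script.*<\/script>/mis', '', $text);

// Lazy: .*? stops at the FIRST </script> after each <script,
// so each block is removed on its own.
$lazy = preg_replace('/<script.*?<\/script>/mis', '', $text);

echo $greedy, PHP_EOL; // Hello !
echo $lazy, PHP_EOL;   // Hello  world !

// preg_replace returns null on failure; preg_last_error() reports why.
if ($lazy === null) {
    echo 'PCRE error code: ', preg_last_error(), PHP_EOL;
}
```

The s modifier makes the dot match newlines as well, which is why a greedy .* can run across the whole document when scripts appear near both the top and the bottom.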

1 Answer

Typically you want to neutralize not just JavaScript but every kind of HTML tag, escaping them all so that none can introduce a vulnerability. You can do so using http://docs.php.net/manual/en/function.htmlspecialchars.php or http://docs.php.net/manual/en/function.htmlentities.php

Then you don't have to bother with regex at all.
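As a rough sketch of the escaping approach, assuming a hypothetical input string and UTF-8 text, htmlspecialchars turns the markup characters into entities, so the script tag survives as inert text instead of being removed:

```php
<?php
// Hypothetical input containing a script tag.
$text = 'Hi <script>alert("x")</script>';

// ENT_QUOTES escapes both double and single quotes;
// the UTF-8 charset is an assumption about the input.
$escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

echo $escaped, PHP_EOL;
// Hi &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;
```

Note that this keeps the script source in the string, only in escaped form, so it suits display rather than stripping.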

1 Comment

This escapes them; I need to remove them. I'm indexing mail bodies into Amazon CloudSearch, and for compression (and because I only need to search the text of the mails, not JavaScript, in my index) I need to remove all of the scripts. Also, many mails have CDATA sections in the body, which give me a lot of trouble when parsing the XML.