Strip Html from Text in JavaScript except p tags? [duplicate]

Question

I need to switch between RichEditor and TextEditor modes with JavaScript. To do that I have to convert HTML to text while the content is still in HTML editor mode, so I need to keep just the p tags; all other HTML can be stripped.

@JavaCoder: Thinking about it again, there is one situation that would make the regex break. See edited answer, and decide if this would apply to you.
– Tomalak, Commented May 6, 2009 at 9:23
Tomalak's answer would work in most situations, but please keep in mind that stripping HTML down to "safe" HTML is extremely difficult and has serious security implications. If you're sending the resulting HTML to the server, never count on client-side validation. Even if you don't send it back to the server, building it based on GET or POST values can open security holes. Read up on XSS and CSRF.
– Neall, Commented May 6, 2009 at 11:32

Tomalak · Accepted Answer · 2012-06-14 21:37:46Z

11

Regex replace (globally, case-insensitively):

</?(?:(?!p\b)[^>])*>

with the empty string.

Explanation:

<          # "<"
/?         # optional "/"
(?:        # non-capturing group
  (?!      #   negative look-ahead: a position not followed by...
    p\b    #     "p" and a word boundary
  )        #   end look-ahead
  [^>]     #   any single char but ">"
)*         # end non-capturing group, repeated as often as possible
>          # ">"
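
In JavaScript, the replace described above could look roughly like the sketch below. The function name stripAllButP is made up for illustration; the pattern is the regex from this answer, and the g and i flags supply the "globally, case-insensitively" part.

```javascript
// Hypothetical helper name; the pattern is the regex from the answer above.
function stripAllButP(html) {
  // g = replace every match, i = ignore case (so <P> is kept as well)
  return html.replace(/<\/?(?:(?!p\b)[^>])*>/gi, "");
}

console.log(stripAllButP('<div><p class="x">Hello <b>world</b></p></div>'));
// prints: <p class="x">Hello world</p>
```

One thing to be aware of: because the look-ahead is tested before every character inside the tag, a tag that contains a standalone "p" anywhere (for example <a title="p">) is not matched and therefore stays in the output.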

This is one of the few situations where applying regex to HTML can actually work.

Some might object and say that the use of a literal "<" within an attribute value was actually not forbidden, and therefore would potentially break the above regex. They would be right.

The regex would break in this situation, replacing the underlined part:

<p class="foo" title="unusual < title">
                              ---------

If such a thing is possible with your input, then you might have to use a more advanced tool to do the job - a parser.
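
As a rough illustration of what "more advanced" could mean, here is a small hand-written scanner that skips over quoted attribute values before deciding where a tag ends. It is only a sketch and not a real HTML parser: the function name stripTagsExceptPQuoted is hypothetical, and comments, script blocks, and stray "<" characters in text are deliberately not handled.

```javascript
// Hypothetical helper: removes every tag except <p ...> and </p>.
// Quoted attribute values are skipped, so "<" or ">" inside quotes
// does not start or end a tag.
function stripTagsExceptPQuoted(html) {
  let out = "";
  let i = 0;
  while (i < html.length) {
    if (html[i] !== "<") {
      out += html[i]; // plain text is copied through unchanged
      i++;
      continue;
    }
    // Look for the ">" that closes this tag, ignoring quoted sections.
    let j = i + 1;
    let quote = null;
    while (j < html.length) {
      const c = html[j];
      if (quote !== null) {
        if (c === quote) quote = null; // closing quote
      } else if (c === '"' || c === "'") {
        quote = c; // opening quote
      } else if (c === ">") {
        break;
      }
      j++;
    }
    if (j >= html.length) {
      out += html.slice(i); // no closing ">": treat the rest as text
      break;
    }
    const tag = html.slice(i, j + 1);
    if (/^<\/?p\b/i.test(tag)) out += tag; // keep only p tags
    i = j + 1;
  }
  return out;
}

console.log(stripTagsExceptPQuoted('<p class="foo" title="unusual < title">text</p><b>x</b>'));
// prints: <p class="foo" title="unusual < title">text</p>x
```

For anything security-sensitive, a proper parser or sanitizer remains the safer choice; this only shows how the quoted "<" case can be avoided.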

edited Jun 14, 2012 at 21:37

answered May 6, 2009 at 8:39

Tomalak


4 Comments

JavaCoder Over a year ago

How can I make that to function ?

Tomalak Over a year ago

@JavaCoder: Is this a "give me teh codez" question? There are literally tons of JavaScript regex tutorials available on the internet. I am sure you would manage to find one that tells you how to accomplish a replace, if only you went looking for one. Even Rafael's answer below shows you how to do it. But if you are asking how to make a JavaScript function, you might not yet be ready to use regular expressions at all. And your nickname would be wrong.

Lodder Over a year ago

@Tomalak out of all the strip-tag answers I have seen, I think you are the first to explain what each bit of code does, which I am really grateful for because I like to know what code does when I use it... so thanks and +1 ;)

Chandrew Over a year ago

I modified it a bit to grab the whole tag, contents and closing </p>: </?(?:(?!p\b)[^>])*>.*</p>

Rafael · Accepted Answer · 2009-05-06 09:23:02Z

3

This should help

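var html = '<img src=""><p>content</p><span style="color: red">content</span>';
// replace() returns a new string, so the result has to be stored
var stripped = html.replace(/<(?!\s*\/?\s*p\b)[^>]*>/gi, '');
// stripped is now '<p>content</p>content'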

Explanation of the regex:

replace all parts

beginning with "<",
not followed by (the negative look-ahead "(?!...)")
- any number of white-space characters "\s*"
- an optional "/" character
- any number of white-space characters again
- and the tag name followed by a word boundary (here "p\b"),
containing any characters other than ">" - "[^>]*"
and ending with a ">" character
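
Since the tag name is the only variable part of the regex above, the same idea can be sketched for a list of tags to keep. The function stripTagsExcept and its allowed parameter are hypothetical names introduced here, not part of the original answer, and the sketch assumes the list is non-empty and contains only plain tag names.

```javascript
// Hypothetical generalisation of the regex above: keep any tag whose
// name is listed in `allowed`, strip every other tag.
function stripTagsExcept(html, allowed) {
  // Assumption: `allowed` is non-empty and holds plain tag names
  // (letters/digits only), so no regex escaping is done here.
  var names = allowed.join("|");
  var pattern = new RegExp("<(?!\\s*\\/?\\s*(?:" + names + ")\\b)[^>]*>", "gi");
  return html.replace(pattern, "");
}

var html = '<img src=""><p>content</p><span style="color: red">content</span>';
console.log(stripTagsExcept(html, ["p"]));
// prints: <p>content</p>content
console.log(stripTagsExcept(html, ["p", "span"]));
// prints: <p>content</p><span style="color: red">content</span>
```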

edited May 6, 2009 at 9:23

answered May 6, 2009 at 8:38

Rafael


LanreSmith · Accepted Answer · 2023-06-04 16:11:12Z

0

var input = 'b<p on>b <p>good p</p> a<a>a h1<h1>h1 p<pre>p img<img src/>img';
// Keeps only bare <p> and </p>; a <p> with attributes (like "<p on>") is stripped too.
var output = input.replace(/(<(?!\/?p\s*>)([^>]+)>)/ig, '');
console.log(output);
// output: bb <p>good p</p> aa h1h1 pp imgimg

answered Jun 4, 2023 at 16:11

LanreSmith
