I have the following HTML parser:

from HTMLParser import HTMLParser

class MLStripper(HTMLParser):
    def __init__(self):
        self.reset()
        self.fed = []

    def handle_data(self, d):
        self.fed.append(d)

    def get_data(self):
        return ''.join(self.fed)

def strip_tags(html):
    s = MLStripper()
    s.feed(html)
    return s.get_data()

I would like to use this on the following pandas DataFrame:

 df = pd.DataFrame([['<br> test </br>', 1]], columns=('body', 'ticketID'))

I assumed it would work like this:

 for row in df.iterrows():
     input = row['body']
     print(strip_tags(input))

But this gives me a TypeError. Any thoughts on where this goes wrong?

  • Can you please add whole error message? Commented Jan 25, 2017 at 13:53
  • @Frits Please be more generous, use 4 spaces for indentation. 1 space is too low. Commented Jan 25, 2017 at 13:53
  • Include an input and output. Commented Jan 25, 2017 at 13:55

1 Answer

From the docs:

DataFrame.iterrows()

Iterate over DataFrame rows as (index, Series) pairs.

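So each iteration yields a tuple of the index and the row, not the row alone. In the question's loop, row is bound to that whole tuple, so row['body'] indexes a tuple with a string, which raises the TypeError.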
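To make the failure concrete without needing pandas, here is a minimal sketch that uses a plain list of (index, row) pairs as a stand-in for what df.iterrows() yields. The names pairs, body_without_unpacking and body_with_unpacking are hypothetical, and a dict plays the role of the row Series purely for illustration.

```python
# Stand-in for the (index, row) pairs produced by df.iterrows().
# Assumption: a plain dict substitutes for the pandas Series here.
pairs = [(0, {'body': '<br> test </br>', 'ticketID': 1})]


def body_without_unpacking(pairs):
    """Mirror the loop from the question: row is the whole tuple."""
    for row in pairs:
        # Indexing a tuple with a string raises TypeError.
        return row['body']


def body_with_unpacking(pairs):
    """Unpack each pair so row is the row itself."""
    for index, row in pairs:
        return row['body']


try:
    body_without_unpacking(pairs)
except TypeError as exc:
    print('TypeError:', exc)

print(body_with_unpacking(pairs))
```

The same unpacking is what fixes the loop over the real DataFrame.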

Working Code:

for index, row in df.iterrows():
    input = row['body']
    print(strip_tags(input))
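
The parser in the question uses the Python 2 import path (from HTMLParser import HTMLParser). For readers on Python 3, the following is a rough sketch of the same stripper using the standard library's html.parser module. It keeps the MLStripper and strip_tags names from the question. The call to super().__init__() is needed on Python 3, and the close() call is an addition here, intended to flush any text the parser may still be buffering.

```python
from html.parser import HTMLParser


class MLStripper(HTMLParser):
    """Collect only the text content of an HTML fragment."""

    def __init__(self):
        # On Python 3 the base initializer must run; it also calls reset().
        super().__init__()
        self.fed = []

    def handle_data(self, d):
        # Called for each run of text between tags.
        self.fed.append(d)

    def get_data(self):
        return ''.join(self.fed)


def strip_tags(html):
    s = MLStripper()
    s.feed(html)
    s.close()  # addition: flush any buffered trailing text
    return s.get_data()


print(strip_tags('<br> test </br>'))
```

With this version in place, the loop above can be used unchanged.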