5

I'm looking for Python code that takes an HTML page and inserts the linked CSS style definitions it uses directly into the page, so that the externally referenced CSS file(s) are no longer needed.

I need this to turn existing pages from a web site into single files that can be sent as email attachments. Thanks for any help.

4 Comments
  • You can easily pull the CSS out, but what CSS path will you use to identify each piece of CSS? I'm not sure how easily you can do that programmatically. Commented Nov 22, 2010 at 15:23
  • dup? stackoverflow.com/questions/781382/… Commented Nov 22, 2010 at 15:50
  • No, not a dup. The aim here is to put everything into one file. Commented Nov 22, 2010 at 15:55
  • Note that Outlook's HTML mail renderer is Word, so anything other than an inline style="..." on each tag can produce very odd results. Commented Nov 22, 2010 at 19:29

3 Answers

4

Sven's answer helped me, but it didn't work out of the box. The following did it for me:

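import bs4  # BeautifulSoup 4; BeautifulSoup 3 has been replaced

# Parse the page, naming the parser explicitly so bs4 does not have to guess
with open("index.html") as f:
    soup = bs4.BeautifulSoup(f.read(), "html.parser")

# Swap each <link rel="stylesheet"> for a <style> tag holding the file's CSS
for link in soup.find_all("link", rel="stylesheet"):
    style = soup.new_tag("style")
    style["type"] = "text/css"
    with open(link["href"]) as css_file:  # assumes href is a local path
        style.string = css_file.read()
    link.replace_with(style)

with open("output.html", "w") as out:
    out.write(str(soup))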

2

You will have to code this yourself, but BeautifulSoup will get you a long way. Assuming all your files are local, you can do something like:

from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3
soup = BeautifulSoup(open("index.html").read())
stylesheets = soup.findAll("link", {"rel": "stylesheet"})
for s in stylesheets:
    # Replace each <link> with a <style> block holding the stylesheet's text
    s.replaceWith('<style type="text/css" media="screen">' +
                  open(s["href"]).read() +
                  '</style>')
open("output.html", "w").write(str(soup))

If the files are not local, you can use Python's urllib or urllib2 (urllib.request in Python 3) to retrieve them.
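
For the non-local case, a minimal Python 3 sketch might look like the following. The helper name `read_stylesheet` and its `base_dir` parameter are made up for illustration; it only treats `http`/`https` hrefs as remote and assumes UTF-8 when the server does not declare a charset. Protocol-relative hrefs (`//example.com/x.css`) are not handled.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlopen


def read_stylesheet(href, base_dir="."):
    """Return the CSS text behind a <link> href (hypothetical helper)."""
    scheme = urlparse(href).scheme
    if scheme in ("http", "https"):
        # Remote stylesheet: download it and decode with the declared
        # charset, falling back to UTF-8 (an assumption).
        with urlopen(href) as response:
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset)
    # Anything else is treated as a path relative to the HTML file's directory.
    with open(os.path.join(base_dir, href), encoding="utf-8") as css_file:
        return css_file.read()
```

In the loop above, `open(s["href"]).read()` could then be swapped for `read_stylesheet(s["href"])`.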

3 Comments

That, and hope he doesn't use any @import rules :)
@Frédéric: Good point. I'm sure I missed quite a few cases, but that's just to get Scott going :)
Thanks. This was a great help.
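
On the @import point: one rough way to cope, assuming the imported files are local too, is to expand the rules in the CSS text before it goes into the <style> tag. The function name `expand_imports` is hypothetical, the regular expression only covers the common `@import "x.css";` and `@import url(x.css);` forms, and any media list on the rule is dropped, so treat this as a sketch rather than a CSS parser.

```python
import os
import re

# Matches @import "x.css"; and @import url("x.css"); with an optional media list.
IMPORT_RE = re.compile(
    r"""@import\s+(?:url\(\s*)?["']?([^"')\s;]+)["']?\s*\)?[^;]*;""",
    re.IGNORECASE,
)


def expand_imports(css, base_dir=".", seen=None):
    """Replace local @import rules in css with the imported file's text."""
    seen = set() if seen is None else seen

    def replace(match):
        path = os.path.normpath(os.path.join(base_dir, match.group(1)))
        if path in seen:
            # Already inlined once; skipping also guards against import cycles.
            return ""
        seen.add(path)
        with open(path, encoding="utf-8") as css_file:
            nested = css_file.read()
        # Imported files may contain @import rules relative to their own folder.
        return expand_imports(nested, os.path.dirname(path), seen)

    return IMPORT_RE.sub(replace, css)
```

Remote imports would still need to be fetched separately.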
2

You can use pynliner. Example from their documentation:

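from pynliner import Pynliner

html = "html string"
css = "css string"
p = Pynliner()
p.from_string(html).with_cssString(css)
output = p.run()  # HTML with the CSS applied as inline style attributes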
