1

Basically, a page generates some dynamic content, and I want to get that dynamic content, not just the static HTML. I haven't been able to do this with cURL. Help please.

2
  • 1
    that's impossible using just curl... Commented Jun 12, 2013 at 22:30
  • You need to find out where to properly get it from. Most likely, the JS is either making an ajax call, which you can curl directly to scrape the data, or the data is hardcoded in a JS/HTML file that's loaded with the normal page load. Commented Jun 12, 2013 at 22:58

2 Answers

3

You can't with just cURL.

cURL will grab the specific raw (static) files from the site, but to get JavaScript-generated content, you would have to load those files into a browser-like environment that supports JavaScript and all the other host objects the script uses, so that the script can run.

Then once the script runs, you would have to access the DOM to grab whatever content you wanted from it.

This is why most search engines don't index JavaScript-generated content: it's not easy.


If you're trying to gather info from one specific site, look into exactly how the site gets the data itself and see whether you can get the data directly from that source. For example, the data may be embedded in JS in the page (in which case you can just parse it out of that JS), it may be obtained through an ajax call (in which case you can maybe just make that ajax call directly), or it may arrive by some other method.
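To make those two routes concrete, here is a rough Python sketch. Everything specific in it is made up for illustration: the `pageData` variable, the sample HTML, and the helper names `extract_embedded_json` and `fetch_json` are assumptions, not anything from a real site. It also assumes the embedded object literal happens to be valid JSON, which is common but not guaranteed.

```python
import json
import re
import urllib.request

# Hypothetical page source: the data is assigned to a JS variable in a script tag.
SAMPLE_HTML = """
<html><body>
<div id="list"></div>
<script>
var pageData = {"items": [{"name": "alpha", "price": 3}, {"name": "beta", "price": 5}]};
render(pageData);
</script>
</body></html>
"""


def extract_embedded_json(html, var_name):
    """Parse the literal assigned to `var_name`, assuming it is valid JSON."""
    match = re.search(r"var\s+" + re.escape(var_name) + r"\s*=\s*", html)
    if match is None:
        raise ValueError("variable %r not found in page" % var_name)
    # raw_decode parses one JSON value starting at the given index
    # and ignores whatever JS follows it.
    value, _end = json.JSONDecoder().raw_decode(html, match.end())
    return value


def fetch_json(url):
    """Call an ajax endpoint directly (the URL is whatever the page itself requests)."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


data = extract_embedded_json(SAMPLE_HTML, "pageData")
print([item["name"] for item in data["items"]])
```

To find the URL to pass to `fetch_json`, watch the network tab of your browser's developer tools while the page loads. Some endpoints also expect cookies or extra headers, which this sketch does not handle.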


0

You could try Selenium at http://seleniumhq.org, which drives a real browser and so supports JS.
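A minimal sketch of what that might look like with Selenium's Python bindings, assuming the `selenium` package (version 4 or later) and a local Chrome install are available. The function name `fetch_rendered_html` and the choice to wait for `document.readyState` are my own assumptions; a page that loads data after that point would need a more specific wait condition.

```python
def fetch_rendered_html(url, timeout_seconds=10):
    """Load `url` in headless Chrome and return the DOM after scripts have run."""
    # Imported here so the sketch can be read and loaded without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("--headless=new")  # run Chrome without a visible window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Wait until the browser reports that the document has finished loading.
        WebDriverWait(driver, timeout_seconds).until(
            lambda d: d.execute_script("return document.readyState") == "complete"
        )
        # page_source reflects the current DOM, including JS-generated content.
        return driver.page_source
    finally:
        driver.quit()
```

You would call it as `html = fetch_rendered_html("http://example.com/")` and then parse `html` the same way you would parse cURL output.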

1 Comment

This might be what I'm looking for. I'll try this. Thanks for the link.
