3

Does anyone know of a small, fast JavaScript engine with DOM layer support, written in C or C++?

The problem: I need rudimentary support for JavaScript in a crawler application, and am wondering if there are any options other than:

a) Integrating headless WebKit (slows down crawling tremendously).

b) Integrating SpiderMonkey and writing the DOM layer myself (not looking forward to this option, and not sure it's even worth it, speed-wise).

Any other options?

Thanks!

1

2 Answers

2

Throw in my vote for WebKit (or some other existing code). Why bother reinventing the wheel, especially when the wheel is really fancy and complicated and has spent years in development?

If you really wanted, you could write some code that checks for JavaScript first, so you only pass off the jobs that need it. Then, write filters to ignore common ad networks and analytics packages. If it were me, though, I'd rather be consistent with how I am crawling.
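A rough sketch of that pre-check in C++, assuming the crawler already has the raw HTML in a string. The function name `needsJavaScript` and the ignore list `kIgnoredScriptHosts` are hypothetical, made up for illustration; the hosts listed are only examples. It is a plain substring scan rather than a real HTML parser, so treat it as a heuristic:

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical ignore list: substrings of script URLs that belong to ad or
// analytics services and can be skipped. Extend it to suit your crawl.
static const std::vector<std::string> kIgnoredScriptHosts = {
    "google-analytics.com",
    "googletagmanager.com",
    "doubleclick.net",
};

static std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// True if the (lowercased) opening tag mentions one of the ignored hosts.
static bool isIgnoredScript(const std::string& openTagLower) {
    for (const auto& host : kIgnoredScriptHosts) {
        if (openTagLower.find(host) != std::string::npos) return true;
    }
    return false;
}

// Returns true if the page contains at least one <script> tag that is not
// on the ignore list, i.e. the page should go to the full browser engine.
bool needsJavaScript(const std::string& html) {
    const std::string lower = toLower(html);
    const std::string needle = "<script";
    std::size_t pos = 0;
    while ((pos = lower.find(needle, pos)) != std::string::npos) {
        const std::size_t after = pos + needle.size();
        // Skip tags that merely start with "script", such as <scripted>.
        if (after < lower.size()) {
            const unsigned char c = static_cast<unsigned char>(lower[after]);
            if (!std::isspace(c) && c != '>' && c != '/') {
                pos = after;
                continue;
            }
        }
        const std::size_t end = lower.find('>', pos);
        if (end == std::string::npos) return true;  // malformed: be conservative
        if (!isIgnoredScript(lower.substr(pos, end - pos + 1))) return true;
        pos = end + 1;
    }
    return false;
}
```

A crawler would call `needsJavaScript(body)` on each fetched page and hand only the pages that return `true` to WebKit. Note the limits: inline scripts always count as needing JavaScript (even inline analytics snippets), and inline event handlers such as `onclick` are not detected at all, which is part of why crawling everything the same way is the more reliable choice.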

Also, don't assume that you only need rudimentary support; there are some really funky websites out there that do a ton of DOM altering. If you expect your crawling to be reliable, be prepared to support what browsers support. The easiest way to do that is to use the same code that the browsers are using.


5 Comments

Except that the engines themselves don't provide DOM; they rely on the browser to do so.
@Ignacio, WebKit is not just the JavaScript engine (that's JavaScriptCore). It includes WebCore and JavaScriptCore. WebCore has the DOM functionality.
@Matthew: Sure, but he says "The easiest way to do that is use one of the engines browsers use". This is false, since the engine itself typically does not provide DOM support.
@Ignacio, I updated my post to use less specific language, so as not to confuse.
@Ignacio, the JavaScript engine doesn't provide DOM support. But the layout engine (which WebKit includes) does, and it is used by browsers. Gecko is another candidate layout engine, also used by browsers.
0

Correction: V8 does not support DOM, just JavaScript, so not what you were looking for...

V8:

3 Comments

From the link: "The DOM is not, however, typically provided by the JavaScript engine but instead by a browser. The same is true of V8—Google Chrome provides the DOM."
V8 has no DOM layer support, afaik.
@Ignacio, @John I did not catch the DOM part - thanks guys! I corrected the post.
