0

I am trying to scrape the real-time values of a website's JavaScript variables from my Node.js application.

I have tried "cheerio", but it didn't work. It only returns the target page's HTML as a string, not the runtime values of the variables.

For example, the value of "google.authuser" on "www.google.com".

screenshot

Please suggest a simple solution. Thanks.

2 Comments

  • You are looking for a headless browser (one that runs the website's code), then. Commented Jan 28, 2019 at 17:44
  • +1 @Bergi, a headless browser will do the job. You can then do something like .executeScript(...) and get the value back. Commented Jan 28, 2019 at 17:46

1 Answer

3

Thanks for recommending a headless browser. Puppeteer, the Node.js API for headless Chrome, worked for me.

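const puppeteer = require('puppeteer');

async function crawl() {
    const browser = await puppeteer.launch();

    const page = await browser.newPage();
    await page.goto('http://www.google.com');

    // The callback runs inside the page, where the site's globals exist.
    // page.evaluate waits for the returned promise and hands back its value.
    const data = await page.evaluate(() => {
        return Promise.resolve({
            number: google.authuser
        });
    });

    // browser.close() returns a promise, so wait for it before returning.
    await browser.close();
    return data;
}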

1 Comment

From page.evaluate() doc: "If the function passed to the page.evaluate returns a Promise, then page.evaluate would wait for the promise to resolve and return its value." So your code can be simplified a bit: const data = await page.evaluate(() => google.authuser);
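
To make the simplification above concrete, here is a minimal sketch of a small helper built around the same calls (launch, newPage, goto, evaluate, close). The name `readPageValue` and its parameters are hypothetical and not part of Puppeteer. The Puppeteer module is passed in as an argument, which is an assumption made only so the helper can be exercised without launching a real browser. The try/finally is an addition not present in the answer; it closes the browser even if navigation or evaluation fails.

```javascript
// Hypothetical helper: reads one runtime value from a page.
// `puppeteer` is expected to be the Puppeteer module (or an object with the
// same launch/newPage/goto/evaluate/close methods).
async function readPageValue(puppeteer, url, pageFunction) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url);
    // pageFunction runs inside the page, so it can see the site's globals.
    // If it returns a promise, page.evaluate waits for it to resolve.
    return await page.evaluate(pageFunction);
  } finally {
    // Close the browser even when goto or evaluate throws.
    await browser.close();
  }
}
```

With Puppeteer installed, a call might look like `readPageValue(require('puppeteer'), 'http://www.google.com', () => google.authuser)`. The returned value has to be serializable, since it is passed from the page back to Node.js.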
