Why is this code able to fetch data from the first page below and collect it into numerically indexed arrays, while it fails to do the same for the second page?

http://nimishprabhu.com

https://www.fiverr.com/search/gigs?utf8=%E2%9C%93&source=guest-homepage&locale=en&search_in=everywhere&query=php

For the second page, the output looks like this, which is not correct:

Array ( [0] => mailto:[email protected] ) 
Array ( [0] => https://collector.fiverr.com/api/v1/collector/noScript.gif?appId=PXK3bezZfO
        [1] => https://collector.fiverr.com/api/v1/collector/pxPixel.gif?appId=PXK3bezZfO ) 
Array ( [0] => One Small Step )

Code:

<?php

/*
2.
FINDING HTML ELEMENTS BASED ON THEIR TAG NAMES

Suppose you wanted to find each and every image on a webpage or say, each 
and every hyperlink. 
We will be using “find” function to extract this information from the 
object. Doing it using Simple HTML DOM Parser :
*/

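require_once 'simple_html_dom.php';

$html = file_get_html('https://www.fiverr.com/search/gigs?utf8=%E2%9C%93&source=guest-homepage&locale=en&search_in=everywhere&query=php');

// file_get_html() returns false when the page is empty, too large, or cannot be fetched
if ($html === false) {
  die('Could not load the page.');
}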

//to fetch all hyperlinks from a webpage
$links = array();
foreach($html->find('a') as $a) {
  $links[] = $a->href;
}
print_r($links);
echo "<br />";

//to fetch all images from a webpage
$images = array();
foreach($html->find('img') as $img) {
  $images[] = $img->src;
}
print_r($images);
echo "<br />";

//to find h1 headers from a webpage
$headlines = array();
foreach($html->find('h1') as $header) {
  $headlines[] = $header->plaintext;
}
print_r($headlines);
echo "<br />";

?>

Any suggestions and code samples are welcome for my learning purposes. I am a self-taught student.

1 Answer

The reason is that the page you are trying to download (fiverr.com) loads most of its content dynamically with JavaScript. PHP cannot handle this: it only sees the HTML that the server sends and does not execute JavaScript, so the parser never sees content that scripts add afterwards. Since this is for learning purposes, you can simply try a different website.

However, if you want a working solution, you should look into Selenium. It is a browser automation tool that drives a real browser (optionally in headless mode), so the page's JavaScript runs just as it would for a normal visitor. Through its WebDriver API you can then read the fully rendered content of websites like fiverr.com.
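
To illustrate why a PHP parser misses script-generated content, here is a minimal sketch that uses PHP's built-in DOMDocument rather than Simple HTML DOM. The markup in $serverHtml and the extractLinks() helper are made up for this example and are not taken from fiverr.com; they only imitate a page that adds a link through JavaScript.

```php
<?php
// Hypothetical server response: one static link, plus a script that
// would add a second link when run in a real browser.
$serverHtml = <<<'HTML'
<html><body>
  <a href="/static-link">Static link</a>
  <div id="results"></div>
  <script>
    var a = document.createElement('a');
    a.href = '/dynamic-link';
    document.getElementById('results').appendChild(a);
  </script>
</body></html>
HTML;

// Illustrative helper: collect the href of every <a> element
// that exists in the raw markup.
function extractLinks(string $html): array
{
    // Collect libxml parse warnings internally instead of printing them
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $links[] = $a->getAttribute('href');
    }
    return $links;
}

$links = extractLinks($serverHtml);
print_r($links);
```

Only /static-link ends up in the array, because the script is treated as plain text and never executed. A browser, or a tool that drives one, would also see /dynamic-link.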
