A question was recently asked about how to get the domain of an arbitrary URL that is available as a string.

Unfortunately, the question has been closed, and the answers linked so far only point to solutions using regular expressions (which fail for special cases like .co.uk) or to static solutions that hard-code those exceptions (which, of course, might change over time).

So I went looking for a generic solution that will work at any time, and I found one. (At least, a couple of tests came out positive.)

If you find a domain for which this solution does not work, feel free to mention it, and I'll try to improve the snippet to cover that case as well.

1 Answer

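To find the domain of any given URL string, a three-step solution seems to work best: first, extract the hostname from the URL; second, rebuild the hostname label by label, starting at the top-level domain, and pick the first candidate that has a valid A record; third, make sure the resolved IP is not just one of the ISP's default servers for invalid domains.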

I ran only a few tests, and the results are as expected. The function prints its result directly, but it can be modified to return the domain name instead:

<?php

getDomain("http://www.stackoverflow.com");
getDomain("http://www.google.co.uk");
getDomain("http://books.google.co.uk");
getDomain("http://a.b.c.google.co.uk");
getDomain("http://www.nominet.org.uk/intelligence/statistics/registration/");
getDomain("http://invalid.fail.pooo");
getDomain("http://AnotherOneThatShouldFail.com");


function getDomain($url){
  echo "Searching Domain for '".$url."': ";
  //Step 1: Get the actual hostname
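  $parts = parse_url($url);
  if (!is_array($parts) || empty($parts["host"])){
    //parse_url() found no hostname (malformed URL or missing scheme)
    echo " not resolved.<br />";
    return;
  }
  $actualHostname = $parts["host"];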

  //step 2: Top-Down approach: check DNS Records for the first valid A-record.
  //Re-Assemble url step-by-step, i.e. for www.google.co.uk, check: 
  // - uk
  // - co.uk
  // - google.co.uk (will match here)
  // - www.google.co.uk (will be skipped)

  $domainParts = explode(".", $actualHostname);
  for ($i= count($domainParts)-1; $i>=0; $i--){
    $domain = "";
    $currentCountry = null;
    for ($j = count($domainParts)-1; $j>=$i; $j--){
      $domain = $domainParts[$j] . "." . $domain;

      if ($currentCountry == null){
        $currentCountry = $domainParts[$j];
      }
    }
    $domain = trim($domain, ".");
    $validRecord = checkdnsrr($domain, "A"); //looking for A records (IPv4 address records)

    if ($validRecord){
       //If the host can be resolved to an ip, it seems valid.
       //if hostname is returned, its invalid.  
       $hostIp = gethostbyname($domain);
       $validRecord = ($hostIp != $domain);

       if ($validRecord){
         //last check: DNS server might answer with one of ISPs default server ips for invalid domains.
         //perform a test on this by querying a domain of the same "country" that is invalid for sure to obtain an
         //ip list of ISPs default servers. Then compare with the response of current $domain.
         $defaultIps = gethostbynamel("iiiiiiiiiiiiiiiiiinvaliddomain." . $currentCountry);
         //gethostbynamel() returns false if nothing resolves; only call in_array() on a real list.
         $validRecord = !($defaultIps !== false && in_array($hostIp, $defaultIps));
       }
    }

    //valid record?
    if ($validRecord){
      //return $domain;
      echo $domain."<br />";
      return;
    }
  }
  //return null;
  echo " not resolved.<br />";
}


?>

Output of the example above:

Searching Domain for 'http://www.stackoverflow.com': stackoverflow.com
Searching Domain for 'http://www.google.co.uk': google.co.uk
Searching Domain for 'http://books.google.co.uk': google.co.uk
Searching Domain for 'http://a.b.c.google.co.uk': google.co.uk
Searching Domain for 'http://www.nominet.org.uk/intelligence/statistics/registration/': nominet.org.uk
Searching Domain for 'http://invalid.fail.pooo': not resolved.
Searching Domain for 'http://AnotherOneThatShouldFail.com': not resolved.
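
To make the order of the lookups in step 2 more concrete, here is a minimal sketch that only builds the list of candidate names, without touching DNS. The helper name candidateDomains is hypothetical and not part of the snippet above; lower-casing and trimming the hostname are my own assumptions.

```php
<?php
// Hypothetical helper (not part of the original snippet): list the names
// the top-down loop would test, shortest suffix first.
function candidateDomains(string $hostname): array {
    // Assumption: normalise case and drop a trailing dot before splitting.
    $parts = explode(".", strtolower(trim($hostname, ".")));
    $candidates = [];
    for ($i = count($parts) - 1; $i >= 0; $i--) {
        // Join the labels from position $i to the end of the hostname.
        $candidates[] = implode(".", array_slice($parts, $i));
    }
    return $candidates;
}
```

For "www.google.co.uk" this yields uk, co.uk, google.co.uk and www.google.co.uk, in that order, which matches the sequence described in the code comments.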

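This is only a very limited set of test cases, and the approach relies on the assumption that a registered domain always has an A record. A domain that publishes only other record types (AAAA or MX, for example) would not be found.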

As a nice side effect, this also validates URLs: as the last two examples show, a URL that is merely well-formed but does not resolve is rejected.
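
As mentioned, the function can be changed to return the domain instead of printing it. The following is only a rough sketch of that idea, not a drop-in replacement: the names findDomain, $isRegistered, $dnsCheck and $fakeCheck are hypothetical, and the DNS-based check shown here leaves out the comparison against the ISP's default IPs for brevity. Passing the check in as a callback is my own design choice, so the search logic can be tried without network access.

```php
<?php
// Sketch: same top-down search, but returning the result instead of echoing it.
// $isRegistered is a hypothetical callback: fn(string $name): bool.
function findDomain(string $url, callable $isRegistered): ?string {
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === "") {
        return null; // no usable hostname in the URL
    }
    $parts = explode(".", strtolower($host));
    for ($i = count($parts) - 1; $i >= 0; $i--) {
        $candidate = implode(".", array_slice($parts, $i));
        if ($isRegistered($candidate)) {
            return $candidate; // shortest suffix that passes the check
        }
    }
    return null; // nothing passed the check
}

// DNS-based check (needs network access): an A record must exist.
$dnsCheck = function (string $name): bool {
    return checkdnsrr($name, "A");
};

// Offline stand-in for experimenting without DNS.
$fakeCheck = function (string $name): bool {
    return in_array($name, ["google.co.uk", "www.google.co.uk"], true);
};
```

Calling findDomain("http://books.google.co.uk", $fakeCheck) returns "google.co.uk" with the offline stand-in; swapping in $dnsCheck would perform real lookups instead.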

best, dognose
