1

I'm writing a simple "httpref" program that captures search engine responses into a large text file. The log file contains a lot of data, and every so often there is a ?q= parameter whose value ends at the next &. For example:

http://www.bing.com/search?q=blogging+for+answers&qs=n&form=QBRE&pq=blogging+for+answers&sc=8-18&sp=-1&sk=

What I'd like to do is create an array using explode that keeps only the desired result, i.e. "blogging+for+answers". The text file contains a lot of junk, and I only want to keep the text between each ?q= and the following &.

From what I've seen there isn't a simple parser that extracts parameter data from a URL string.

For example: $q[0] = "blogging+for+answers"
Every time I use explode with just ?q=, it takes the entire text up to the next occurrence of ?q=; I only want the text up to the &.

2
  • Parse the URL using functions like parse_str, which are available in PHP. Commented Sep 25, 2012 at 7:11
  • Are you parsing a whole page or just a URL? "Next occurrence of ?q=" implies a whole page. Commented Sep 25, 2012 at 7:38

4 Answers

4

Use parse_url to get the query string component, then parse_str to extract the query string variables you want.

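<?php
$url = "http://www.bing.com/search?q=blogging+for+answers&qs=n&form=QBRE&pq=blogging+for+answers&sc=8-18&sp=-1&sk=";
// Extract only the query string (everything after the "?").
$querystring = parse_url($url, PHP_URL_QUERY);
// Split the query string into an associative array of decoded parameters.
parse_str($querystring, $arr);
echo $arr['q']; // the decoded value of the q parameter
?>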
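One detail worth knowing: parse_str URL-decodes the values it extracts, so the + signs in the query come back as spaces. A minimal sketch, assuming the raw "blogging+for+answers" form from the question is what should be stored (the shortened URL and the variable names $decoded and $raw are only illustrative):

```php
<?php
$url = "http://www.bing.com/search?q=blogging+for+answers&qs=n&form=QBRE";

// parse_url() with PHP_URL_QUERY returns only the part after the "?".
$querystring = parse_url($url, PHP_URL_QUERY);

// parse_str() fills $arr with the decoded parameters.
parse_str($querystring, $arr);

$decoded = $arr['q'];            // plus signs decoded to spaces
$raw     = urlencode($decoded);  // spaces re-encoded as plus signs

echo $decoded, "\n", $raw, "\n";
```

If the decoded form is fine for reporting, the urlencode step can simply be dropped.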

5 Comments

If you look at the last line of his question, he implies that he is parsing a whole file, not just a URL. parse_url will only parse a URL.
True. So he would want to extract each URL from his log with something like preg_match, then apply parse_url and parse_str to the resulting URLs.
That is what I was thinking. You can see the answer below with "-2" :)
I am parsing a whole file using file_get_contents(). So how would I use preg_match to get the entire URL from each instance in the file and assign the URLs to different array variables? What I'm seeking to do is create an array that you can easily print_r and quickly see all the keyword matches.
This is the pattern for URLs: %^((https?://)|(www\.))([a-z0-9-].?)+(:[0-9]+)?(/.*)?$%i But did you try my code below? It will just return the keyword match of each string. I tested it.
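The approach discussed in these comments (pull each URL out of the file, then run parse_url and parse_str on it) could look roughly like the sketch below. The function name extract_queries, the URL pattern, and the sample log text are assumptions made for illustration; in real use, $log would come from file_get_contents().

```php
<?php
// Hypothetical helper: collect the q= value of every URL found in a block of text.
function extract_queries(string $text): array
{
    $q = [];
    // Assumption: URLs start with http:// or https:// and end at whitespace, a quote, or an angle bracket.
    preg_match_all('~https?://[^\s\'"<>]+~i', $text, $matches);

    foreach ($matches[0] as $url) {
        $query = parse_url($url, PHP_URL_QUERY);
        if (!is_string($query)) {
            continue; // no query string, or the URL could not be parsed
        }
        parse_str($query, $params);
        if (isset($params['q']) && is_string($params['q'])) {
            $q[] = $params['q']; // decoded value, with "+" turned into spaces
        }
    }
    return $q;
}

// Made-up sample standing in for the contents of the log file.
$log = "junk http://www.bing.com/search?q=blogging+for+answers&qs=n&form=QBRE more junk\n"
     . "other http://example.com/page.html text\n"
     . "ref \"http://www.google.com/search?hl=en&q=php+parse+url\" end\n";

print_r(extract_queries($log));
```

Because parse_str does the parameter splitting, this version does not care whether q is the first parameter or a later one.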
1

Try parse_url().

The following is sample code:

<?php
    $url = 'http://username:password@hostname/path?arg=value#anchor';
    print_r(parse_url($url));
    echo parse_url($url, PHP_URL_PATH);
?>


0

This will give you $q as an array as you described above.

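$file = file_get_contents($filename);
// Match q= right after ? or &, and capture its value up to the next &, quote, or whitespace.
preg_match_all("/https?:[^'\"\\s]*?[?&]q=([^'\"&\\s]*)/", $file, $matches);

// With the default PREG_PATTERN_ORDER, $matches[1] holds every captured value.
$q = $matches[1];

EDIT

Explanation of the expression: it finds each occurrence of http*****?q=(result) or http*****?****&q=(result), where the result ends at the next &, quote, or whitespace, and returns the captured results in $matches[1]. Because the pattern also stops at spaces and quotes, text outside the URL is never added to $matches.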
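The shape of $matches depends on the flag passed to preg_match_all, which matters when deciding whether to loop over it. A small sketch of the two layouts, using a made-up two-line sample and a simplified pattern (both are assumptions for illustration, as are the names $byGroup, $bySet, and $q2):

```php
<?php
// Made-up sample text standing in for the log file contents.
$file = "a http://www.bing.com/search?q=blogging+for+answers&qs=n b\n"
      . "c http://www.google.com/search?hl=en&q=php+regex d\n";

// Simplified pattern (an assumption): a URL, then q= right after ? or &,
// then the value up to the next &, quote, or whitespace.
$pattern = '~https?://[^\s\'"]*?[?&]q=([^\s\'"&]*)~';

// Default layout (PREG_PATTERN_ORDER): $byGroup[1] lists every captured value.
preg_match_all($pattern, $file, $byGroup);
$q = $byGroup[1];

// PREG_SET_ORDER: one entry per match, so $match[1] is that match's captured value.
preg_match_all($pattern, $file, $bySet, PREG_SET_ORDER);
$q2 = [];
foreach ($bySet as $match) {
    $q2[] = $match[1];
}

print_r($q);
print_r($q2);
```

Both arrays end up with the same contents; the flag only changes how the results are grouped.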

2 Comments

I don't understand why this is being downvoted. Is it because it doesn't explain how the regex works? Regular expressions are obviously non-obvious to folks new to them, but this is a perfectly valid answer.
You're right, so I edited it to add an explanation. I was once a Perl guy, so regular expressions are a given.
0
<?php
    function multiexplode ($delimiters,$data) {
        $MakeReady = str_replace($delimiters, $delimiters[0], $data);
        $Return    = explode($delimiters[0], $MakeReady);
        return  $Return;
    }
    $data = "Ami jani na, amita maybe-tomar () kam";
 
    $Exploded = multiexplode(array(", ","-"," () "),$data);
 
    echo"<PRE>"; print_r($Exploded); echo"</PRE>";
 
?>



Array
(
    [0] => Ami jani na
    [1] => amita maybe
    [2] => tomar
    [3] => kam
)

