PHP Cookbook: Web Automation

Fetching a URL with the GET Method

Problem

You want to retrieve the contents of a URL. For example, you want to include part of one web page in another page's content.

Solution

Pass the URL to fopen( ) and get the contents of the page with fread( ):

$page = '';
$fh = fopen('http://www.example.com/robots.txt','r') or die($php_errormsg);
while (! feof($fh)) {
    $page .= fread($fh,1048576);
}
fclose($fh);
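
The read loop above can be wrapped in a small reusable function. This is only a minimal sketch: the name fetch_url and the $chunk_size parameter are hypothetical, not part of PHP, and it returns false instead of calling die( ) when the stream can't be opened. Because fopen( ) treats local paths and URLs alike, the same function also reads local files.

```php
<?php
// Hypothetical helper wrapping the fopen()/fread() loop.
// Returns the contents as a string, or false if the stream can't be opened.
function fetch_url($url, $chunk_size = 8192) {
    // @ suppresses the warning; failure is reported through the return value
    $fh = @fopen($url, 'r');
    if ($fh === false) {
        return false;
    }
    $page = '';
    while (! feof($fh)) {
        $chunk = fread($fh, $chunk_size);
        if ($chunk === false) {
            break; // read error: return what has been read so far
        }
        $page .= $chunk;
    }
    fclose($fh);
    return $page;
}
```

For example, fetch_url('http://www.example.com/robots.txt') would return the file's contents if URL fopen wrappers are enabled.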

You can use the cURL extension:

$c = curl_init('http://www.example.com/robots.txt');
curl_setopt($c, CURLOPT_RETURNTRANSFER, 1);
$page = curl_exec($c);
curl_close($c);

You can also use the HTTP_Request class from PEAR:

require 'HTTP/Request.php';
 
$r = new HTTP_Request('http://www.example.com/robots.txt');
$r->sendRequest();
$page = $r->getResponseBody();

Discussion

You can put a username and password in the URL if you need to retrieve a protected page. In this example, the username is david, and the password is hax0r. Here's how to do it with fopen( ):

$page = '';
$fh = fopen('http://david:hax0r@www.example.com/secrets.html','r')
    or die($php_errormsg);
while (! feof($fh)) {
    $page .= fread($fh,1048576);
}
fclose($fh);

Here's how to do it with cURL:

$c = curl_init('http://www.example.com/secrets.html');
curl_setopt($c, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($c, CURLOPT_USERPWD, 'david:hax0r');
$page = curl_exec($c);
curl_close($c);

Here's how to do it with HTTP_Request:

$r = new HTTP_Request('http://www.example.com/secrets.html');
$r->setBasicAuth('david','hax0r');
$r->sendRequest();
$page = $r->getResponseBody();
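
By default, all three approaches send the credentials using HTTP Basic authentication, which puts the base64-encoded username and password in an Authorization request header. The sketch below shows roughly what ends up on the wire for a URL with embedded credentials. The function name basic_auth_header_from_url is hypothetical, and the code is an illustration of the header format rather than what any of these libraries runs internally.

```php
<?php
// Hypothetical helper: build the Authorization header that HTTP Basic
// authentication would send for the credentials embedded in a URL.
// Returns null if the URL can't be parsed or carries no username.
function basic_auth_header_from_url($url) {
    $parts = parse_url($url);
    if ($parts === false || ! isset($parts['user'])) {
        return null;
    }
    $pass = isset($parts['pass']) ? $parts['pass'] : '';
    // parse_url() leaves percent-escapes in place, so decode them first
    $credentials = rawurldecode($parts['user']) . ':' . rawurldecode($pass);
    return 'Authorization: Basic ' . base64_encode($credentials);
}
```

Base64 is an encoding, not encryption, so credentials sent this way over plain HTTP can be read by anyone who can observe the request.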

While fopen( ) follows redirects in Location response headers, HTTP_Request does not. cURL follows them only when the CURLOPT_FOLLOWLOCATION option is set:

$c = curl_init('http://www.example.com/directory');
curl_setopt($c, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($c, CURLOPT_FOLLOWLOCATION, 1);
$page = curl_exec($c);
curl_close($c);
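
When following redirects, it can help to cap how many are followed and to record where the request ended up. The sketch below assumes the cURL extension is loaded. The function name fetch_following_redirects and the shape of its return array are hypothetical; CURLOPT_MAXREDIRS and CURLINFO_EFFECTIVE_URL are standard cURL constants.

```php
<?php
// Hypothetical helper: fetch a URL, following at most $max_redirects
// redirects. Returns an array with either 'page' and 'final_url' keys
// on success or an 'error' key on failure.
function fetch_following_redirects($url, $max_redirects = 5) {
    $c = curl_init($url);
    curl_setopt($c, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($c, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($c, CURLOPT_MAXREDIRS, $max_redirects);
    $page = curl_exec($c);
    if ($page === false) {
        $result = array('error' => curl_error($c));
    } else {
        $result = array(
            'page'      => $page,
            // the URL of the last request made, after any redirects
            'final_url' => curl_getinfo($c, CURLINFO_EFFECTIVE_URL),
        );
    }
    curl_close($c);
    return $result;
}
```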

cURL can do a few different things with the page it retrieves. If the CURLOPT_RETURNTRANSFER option is set, curl_exec( ) returns a string containing the page:

$c = curl_init('http://www.example.com/files.html');
curl_setopt($c, CURLOPT_RETURNTRANSFER, 1);
$page = curl_exec($c);
curl_close($c);

To write the retrieved page to a file, open a file handle for writing with fopen( ) and set the CURLOPT_FILE option to the file handle:

$fh = fopen('local-copy-of-files.html','w') or die($php_errormsg);
$c = curl_init('http://www.example.com/files.html');
curl_setopt($c, CURLOPT_FILE, $fh);
curl_exec($c);
curl_close($c);
fclose($fh);

To pass the cURL resource and the contents of the retrieved page to a function, set the CURLOPT_WRITEFUNCTION option to the name of the function:

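// save the URL and the page contents in a database
// (cURL may call this function once per chunk of the response)
function save_page($c,$page) {
    $info = curl_getinfo($c);
    mysql_query("INSERT INTO pages (url,page) VALUES ('" .
                mysql_escape_string($info['url']) . "', '" .
                mysql_escape_string($page) . "')");
    // cURL aborts the transfer unless the callback returns the
    // number of bytes it handled
    return strlen($page);
}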
 
$c = curl_init('http://www.example.com/files.html');
curl_setopt($c, CURLOPT_WRITEFUNCTION, 'save_page');
curl_exec($c);
curl_close($c);
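
cURL may call the write function several times, once per chunk of the response, and it expects the function to return the number of bytes it handled. A callback that needs the whole page therefore has to accumulate the chunks. The following is a minimal sketch, assuming PHP 5.3 or later for closures; make_collector and fetch_with_collector are hypothetical names.

```php
<?php
// Hypothetical helper: returns a callback suitable for
// CURLOPT_WRITEFUNCTION that appends each chunk to $buffer.
function make_collector(&$buffer) {
    return function ($c, $chunk) use (&$buffer) {
        $buffer .= $chunk;
        // report every byte as handled so cURL continues the transfer
        return strlen($chunk);
    };
}

// Hypothetical usage: collect a whole page through the callback.
function fetch_with_collector($url) {
    $page = '';
    $c = curl_init($url);
    curl_setopt($c, CURLOPT_WRITEFUNCTION, make_collector($page));
    $ok = curl_exec($c);
    curl_close($c);
    return $ok ? $page : false;
}
```

Once curl_exec( ) returns, the accumulated string can be processed or stored in one step instead of once per chunk.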

If none of CURLOPT_RETURNTRANSFER, CURLOPT_FILE, or CURLOPT_WRITEFUNCTION is set, cURL prints out the contents of the returned page.

The fopen( ) function and the include and require directives can retrieve remote files only if URL fopen wrappers are enabled. URL fopen wrappers are enabled by default and are controlled by the allow_url_fopen configuration directive. On Windows, however, include and require can't retrieve remote files in versions of PHP earlier than 4.3, even if allow_url_fopen is on.
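
A script can check the directive at runtime before it tries to open a URL. This is a sketch that assumes PHP 5.2 or later for filter_var( ); the function name url_fopen_enabled is hypothetical.

```php
<?php
// Hypothetical helper: report whether URL fopen wrappers are enabled.
// ini_get() returns the setting as a string such as "1", "0", "On", or "",
// so normalize it to a real boolean.
function url_fopen_enabled() {
    return filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);
}
```

If this returns false, fopen( ) can't open remote URLs, and the cURL extension or HTTP_Request is the alternative.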

See Also

The recipe "Fetching a URL with the POST Method" covers fetching a URL with the POST method; Recipe 18.3 discusses opening remote files with fopen( ); documentation on fopen( ) at http://www.php.net/fopen, include at http://www.php.net/include, curl_init( ) at http://www.php.net/curl-init, curl_setopt( ) at http://www.php.net/curl-setopt, curl_exec( ) at http://www.php.net/curl-exec, and curl_close( ) at http://www.php.net/curl-close; the PEAR HTTP_Request class at http://pear.php.net/package-info.php?package=HTTP_Request.



Created: March 27, 2003
Revised: March 27, 2003

URL: http://webreference.com/programming/php/chap11/1