PHP - Using Curl To Get Data Off A Website Help Please
OK, I have the initial cURL working but need to figure out how to extract data I want off that webpage to display or store in a database, I tried using dom and xpath, but because of the way the page displays using css, i think its not picking it up. Here is my cURL script:
<?php $userAgent = 'Googlebot/2.1 (http://www.googlebot.com/bot.html)'; $target_url = "www.test.com"; $ch = curl_init(); curl_setopt($ch, CURLOPT_USERAGENT, $userAgent); curl_setopt($ch, CURLOPT_URL,$target_url); curl_setopt($ch, CURLOPT_FAILONERROR, true); curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); curl_setopt($ch, CURLOPT_AUTOREFERER, true); curl_setopt($ch, CURLOPT_RETURNTRANSFER,true); curl_setopt($ch, CURLOPT_TIMEOUT, 10); $html = curl_exec($ch); if (!$html) { echo "<br />cURL error number:" .curl_errno($ch); echo "<br />cURL error:" . curl_error($ch); exit; } // parse the html into a DOMDocument $dom = new DOMDocument(); $dom->loadHTML($html); // grab all the on the page $xpath = new DOMXPath($dom); $hrefs = $xpath->evaluate("/html/body//td"); for ($i = 0; $i < $hrefs->length; $i++) { $href = $hrefs->item($i); $url = $href->getAttribute('href'); storeLink($url,$target_url); echo "<br />Link stored: $url"; } ?> and here is a snippet of the source of the page I am getting: <span id="lblTest"><h1 id='surrZipTitle'>Agents in Surrounding Zip Codes</h1><table cellpadding='0' cellspacing='0' border='0' class='tblDent'><tr><td class='tdEliteTitle'><span class='caaSubHead3 addwidth'>H.K. Dent Elite</span></td></tr><tr><td class='tdEliteContent'><table cellpadding='0' cellspacing='0' border='0'><tr><td valign='top'><span class='caaAgencyName2 addwidth'>PROFESSIONAL INS ASSOC, INC.</span></td><td valign='top'> </td></tr></table><table cellpadding='0' cellspacing='0' border='0'><tr><td width='360px' valign='top'><div class='addressBlock'><span>4444 MANZANITA AVE STE 6</span><br /><span>CARMICHAEL , CA 95608-1488</span><br /><a class='faaBlueLink' id='lnkContact' href='http://www.safeco.com/portal/server.pt/gateway/PTARGS_0_20656_395_362_0_43/http%3B/por-portlets-prd.int.apps.safeco.com%3B13425/dotcom/FindAnAgent/find-an-agent/contactanagent.aspx?RequestType=agency&level=elite&Id=0415199904150295&lat=38.646142&lng=-121.327623' onclick='oOobj4.Preferences.Plugins.Events.poX=0;'>Contact & Directions</a> <a class='faaBlueLink' id='lnkWebSite' style='display: none;' href='http://' target='_blank' onclick="return trackEvent('/External-Link/AgentWebsite/ ','PROFESSIONAL INS ASSOC, INC. ');">Website</a></div></td><td valign='top'> </td></tr></table></td></tr></table><table cellpadding='0' cellspacing='0' border='0' class='tblDent'><tr><td class='tdEliteTitle'><span class='caaSubHead3 addwidth'>H.K. Dent Elite</span></td></tr><tr><td class='tdEliteContent'><table cellpadding='0' cellspacing='0' border='0'><tr><td valign='top'><span class='caaAgencyName2 addwidth'>AMERICAN AIM AUTO INS AGY, INC</span></td><td valign='top'> </td></tr></table><table cellpadding='0' cellspacing='0' border='0'><tr><td width='360px' valign='top'><div class='addressBlock'><span>5339 SAN JUAN AVE</span><br /><span>FAIR OAKS , CA 95628-3318</span><br /><a class='faaBlueLink' id='lnkContact' href='http://www.safeco.com/portal/server.pt/gateway/PTARGS_0_20656_395_362_0_43/http%3B/por-portlets-prd.int.apps.safeco.com%3B13425/dotcom/FindAnAgent/find-an-agent/contactanagent.aspx?RequestType=agency&level=elite&Id=0415911704151222&lat=38.66237&lng=-121.292429' onclick='oOobj4.Preferences.Plugins.Events.poX=0;'>Contact & Directions</a> So basically I want to extract the agency name like "<span class='caaAgencyName2 addwidth'>PROFESSIONAL INS ASSOC, INC.</span>" and the address which always use the same div class like "caaAgencyName2" and "addressBlock". How can this be accomplished? Similar TutorialsHello, I am trying to use cURL to login to a website, but I can't seem to get it working. Website I'm trying to login to: http://www.uniquearticlewizard.com/amember/member.php Here is what their form code looks like: Code: [Select] <form name="login" method="post" action="/amember/member.php"> <table class="vedit" > <tr> <th>Username</th> <td><input type="text" name="amember_login" size="15" value="" /></td> </tr> <tr> <th>Password</th> <td><input type="password" name="amember_pass" size="15" /></td> </tr> <tr> <td colspan="2" style='padding:0px; padding-bottom: 2px;'> <input type="checkbox" name="remember_login" value="1"> <span class="small">Remember my password?</span> </td> </tr> </table> <input type="hidden" name="login_attempt_id" value="1291657877" /> <br /> <span class='button'><input type="submit" value=" Login " /></span> <span class='button'><input type="button" value=" Back " onclick="history.back(-1)" /></span> </form> As you can see they are using a javascript button to submit the form, which doesn't have a name attribute. So I'm not sure how to get around this and tell cURL to submit the form. When I Googled I found something that said just submit the other information and it will submit itself, but I'm not sure if that's right. Here is my attempt, but I just get a blank screen. I think the script is working, but something on there end is exiting out due to me not supplying a required piece of information. I'm not sure what that is though. Code: [Select] <?php set_time_limit(0); $options = array( CURLOPT_RETURNTRANSFER => true, // return web page CURLOPT_HEADER => false, // don't return headers CURLOPT_FOLLOWLOCATION => true, // follow redirects CURLOPT_ENCODING => "", // handle all encodings CURLOPT_USERAGENT => "spider", // who am i CURLOPT_AUTOREFERER => true, // set referer on redirect CURLOPT_CONNECTTIMEOUT => 120, // timeout on connect CURLOPT_TIMEOUT => 120, // timeout on response CURLOPT_MAXREDIRS => 10, // stop after 10 redirects ); $ch = curl_init( "http://www.uniquearticlewizard.com/amember/member.php" ); curl_setopt_array( $ch, $options ); $content = curl_exec( $ch ); $err = curl_errno( $ch ); $errmsg = curl_error( $ch ); $header = curl_getinfo( $ch ); curl_close( $ch ); $header['content'] = $content; preg_match('/name="login_attempt_id" value="(.*)" \/>/', $header['content'], $form_id); $value = $form_id[1]; $ch = curl_init(); // SET URL FOR THE POST FORM LOGIN curl_setopt($ch, CURLOPT_URL, 'http://www.uniquearticlewizard.com/amember/member.php'); // ENABLE HTTP POST curl_setopt ($ch, CURLOPT_POST, 1); $data = array('amember_login' => '*****', 'amember_pass' => '*****', 'login_attempt_id' => $value, 'remember_login' => '1'); // SET POST PARAMETERS : FORM VALUES FOR EACH FIELD curl_setopt($ch, CURLOPT_POSTFIELDS, $data); // IMITATE CLASSIC BROWSER'S BEHAVIOUR : HANDLE COOKIES curl_setopt ($ch, CURLOPT_COOKIEJAR, 'cookie.txt'); # Setting CURLOPT_RETURNTRANSFER variable to 1 will force cURL # not to print out the results of its query. # Instead, it will return the results as a string return value # from curl_exec() instead of the usual true/false. curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1); // EXECUTE 1st REQUEST (FORM LOGIN) $store = curl_exec($ch); echo $store; curl_close ($ch); ?> They do have a form value that changes on every page refresh, it just tracks the login attempt (which is a long number). I was able to scrape that and put it in the form with the correct value. I thought adding that would successfully log me in, but apparently there is something else going on. Any help would be greatly appreciated! Hi! I'm into a little project where I want to retrieve data from the swedish news website DN's SOS Live page (http://www.dn.se/nyheter/soslive). On the page there is an iFrame and there is the data I want to retrieve. The iFrame address is: http://div.dn.se/dn/sos/soslive.php?id= ... er/soslive Here is the code I have and it stoped working yesterday. Code: [Select] function curl_download($Url){ // is cURL installed yet? if (!function_exists('curl_init')){ die('Sorry cURL is not installed!'); } // OK cool - then let's create a new cURL resource handle $ch = curl_init(); // Now set some options (most are optional) // Set URL to download curl_setopt($ch, CURLOPT_URL, $Url); // Set a referer curl_setopt($ch, CURLOPT_REFERER, "http://www.dn.se"); // User agent curl_setopt($ch, CURLOPT_USERAGENT, "MozillaXYZ/1.0"); // Include header in result? (0 = yes, 1 = no) curl_setopt($ch, CURLOPT_HEADER, 0); // Should cURL return or print out the data? (true = return, false = print) curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // Timeout in seconds curl_setopt($ch, CURLOPT_TIMEOUT, 10); // Download the given URL, and return output $output = curl_exec($ch); // Close the cURL resource, and free system resources curl_close($ch); return $output; } $sosURL = 'http://div.dn.se/dn/sos/soslive.php?id=p://www.dn.se/nyheter/soslive'; $data = curl_download($sosURL); The data variable is empty. I notice now that when I enter the webaddress "http://div.dn.se/dn/sos/soslive.php?id=p://www.dn.se/nyheter/soslive" there is no content, although this is the web address in the iFrame. How has DN solved this and how could I get around it? Best regards Stefan I am trying to create a remote login to one website using mine. The users will need to enter their username and password on my site, and if they are registered to my website, their login credentials will be sent to another website and a page will be retrieved.
I am stuck at sending the users' data to the original site. The original site's viewsource is this..
<form method=post> <input type="hidden" name="action" value="logon"> <table border=0> <tr> <td>Username:</td> <td><input name="username" type="text" size=30></td> </tr> <tr> <td>Password:</td> <td><input name="password" type="password" size=30></td> </tr> <td></td> <td align="left"><input type=submit value="Sign In"></td> </tr> <tr> <td align="center" colspan=2><font size=-1>Don't have an Account ?</font> <a href="?action=newuser"><font size=-1 color="#0000EE">Sign UP Now !</font></a></td> </tr> </table>I have tried this code, but not works. <?php $username="username"; $password="password"; $url="http://www.example.com/index.php"; $postdata = "username=".$username."&password=".$password; $ch = curl_init(); curl_setopt ($ch, CURLOPT_URL, $url); curl_setopt ($ch, CURLOPT_SSL_VERIFYPEER, FALSE); curl_setopt ($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6"); curl_setopt ($ch, CURLOPT_TIMEOUT, 60); curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, 1); curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1); curl_setopt ($ch, CURLOPT_REFERER, $url); curl_setopt ($ch, CURLOPT_POSTFIELDS, $postdata); curl_setopt ($ch, CURLOPT_POST, 1); $result = curl_exec ($ch); header('Location: track.html'); //echo $result; curl_close($ch); ?>Any help would be appreciated, Thanks in advance. Hi, I'm trying to auto login to a website(created by me ;-) ) using the curl. as I am new to this I don't know how to make this possible. following is the code I tried but this is not submitting the data in the other site. the 'usr_name' and 'password' the field names in the page "http://localhost/myproject/users/login". and I have given a print_r in that site and it is displaying Array ( [loginType] => L [step] => confirmation [usr_name] => dasd@hotmail.com [password] => test123 ) but not submitting the form. please help me.... this is the code i've tried..I got this from web... $login = "http://localhost/myproject/users/login"; $param="loginType=L&step=confirmation&usr_name=dasd@hotmail.com&password=test123"; $c = curl_init(); curl_setopt($c, CURLOPT_URL, $login); curl_setopt($c, CURLOPT_COOKIEJAR, "cookies.txt"); curl_setopt($c, CURLOPT_COOKIEFILE, "cookies.txt"); curl_setopt($c, CURLOPT_POST, 1); curl_setopt($c, CURLOPT_POSTFIELDS, $param); curl_setopt($c, CURLOPT_RETURNTRANSFER, 1); curl_setopt($c, CURLOPT_FOLLOWLOCATION, 1); echo curl_exec($c); thanks in advance.... Hello! I would like to use cURL to login to the website: lockerz.com I have some code, but it doesn't seem to work: <?php // INIT CURL $ch = curl_init(); // SET URL FOR THE POST FORM LOGIN curl_setopt($ch, CURLOPT_URL, 'http://lockerz.com/auth/login'); // ENABLE HTTP POST curl_setopt ($ch, CURLOPT_POST, 1); // SET POST PARAMETERS : FORM VALUES FOR EACH FIELD curl_setopt ($ch, CURLOPT_POSTFIELDS, 'email-email=EMAIL@hotmail.com&password-password=PASSWPRD'); // IMITATE CLASSIC BROWSER'S BEHAVIOUR : HANDLE COOKIES curl_setopt ($ch, CURLOPT_COOKIEJAR, 'cookie.txt'); # Setting CURLOPT_RETURNTRANSFER variable to 1 will force cURL # not to print out the results of its query. # Instead, it will return the results as a string return value # from curl_exec() instead of the usual true/false. curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1); // EXECUTE 1st REQUEST (FORM LOGIN) $store = curl_exec ($ch); // SET FILE TO DOWNLOAD curl_setopt($ch, CURLOPT_URL, 'http://lockerz.com/auction'); // EXECUTE 2nd REQUEST (FILE DOWNLOAD) $content = curl_exec ($ch); // CLOSE CURL curl_close ($ch); echo $content; ?> Thank you very much if you can help! Hello, I'm using curl to grab a new solar image once an hour or so from the Solar Dynamics Observatory (example below). I'm trying to archive new images and am struggling with that. If I download an image, the filetime() function returns the current time since I downloaded it and wrote it to a fresh file. The result is that the file is always "new", even if the image hasn't changed on the SDO website. Do you have an idea on how to check the last modified time of a file through curl or other means so that I'm not downloading duplicate images? Thanks a ton! //fetch image $ch = curl_init("http://www.somewebsite.com/theimage.jpg"); curl_setopt($ch,CURLOPT_FOLLOWLOCATION,1); $file = "../images/latest/theimage.jpg"; $fp = fopen($file, "w"); curl_setopt($ch, CURLOPT_FILE, $fp); curl_setopt($ch, CURLOPT_HEADER, 0); curl_exec($ch); curl_close($ch); fclose($fp); //this part is no good... //get last modified date if (file_exists($file)) { $filetime = filemtime($file); } I am trying to go to http://lirr42.mta.info/ and get the data from the submitted form. I have rebuilt the form exactly as needed, and done a few adjustments. The posts that I am sending from MY form, are the same as the one on the site. This site allows public data mining by the way. So does someone seem something wrong with my code? I know the basics are working. I have another page that is working fine. It's just this one ends up returning something..it returns part of the data but it's something wrong with it. Any advice is appreciated. Code: [Select] <?php define( 'DEBUG', true ); define( 'DEFAULT_STOP', 'Broadway' ); // report all errors during development/debug mode if ( DEBUG ) error_reporting( E_ALL ); else error_reporting( 0 ); include( 'simplehtmldom/simple_html_dom.php' ); $mta_post_url = 'http://lirr42.mta.info/schedules.php'; // GTFS Specification File Definitions $gtfs_files = array ( 'agency' => 'agency.txt', // required 'stops' => 'stops.txt', // required 'routes' => 'routes.txt', // required 'trips' => 'trips.txt', // required 'stop_times' => 'stop_times.txt', // required 'calendar' => 'calendar.txt', // required 'calendar_dates' => 'calendar_dates.txt', 'fare_rules' => 'fare_rules.txt', 'fare_attributes' => 'fare_attributes.txt', 'shapes' => 'shapes.txt', 'frequencies' => 'frequencies.txt', 'transfers' => 'transfers.txt' ); $gtfs_pickup_codes = array ( 0 => 'Regularly scheduled pickup', 1 => 'No pickup available', 2 => 'Must phone agency to arrange pickup', 3 => 'Must coordinate with driver to arrange pickup' ); $gtfs_dropoff_codes = array ( 0 => 'Regularly scheduled drop off', 1 => 'No drop off available', 2 => 'Must phone agency to arrange drop off', 3 => 'Must coordinate with driver to arrange drop off' ); // load stops file $stopsFile = fopen( 'data/lir/' . $gtfs_files[ 'stops' ], "r" ); if ( $stopsFile ) { // get (and toss) header row $stopsHeader = fgetcsv( $stopsFile, 1000 ); // print_r( $stopsHeader ); // will hold HTML output for Select dropdown for stops $stopSelectOptions = ''; // build array from file data while ( $data = fgetcsv( $stopsFile, 1000 ) ) { $stopsData[ $data[1] ] = $data[0]; } // get a sorted array of the stop names ( yes this could be done with array_multisort or usort but I didnj't feel like it right now ) $stopNames = array_keys( $stopsData ); sort( $stopNames ); // loop through sorted array and create SELECT options with default SELECTED at Grand Central Terminal foreach( $stopNames as $stop ) $stopSelectOptions .= sprintf( '<option %s value="%d">%s</option>', ( DEFAULT_STOP == $stop ) ? 'selected="selected"' : '' , $stopsData[ $stop ], $stop ) . "\n\r"; } fclose( $stopsFile ); // load ads file $adsFile = fopen( 'ads.txt', "r" ); if ( $adsFile ) { // get (and toss) header row $adsHeader = fgetcsv( $adsFile, 1000 ); // will hold ads available for stops $ads = array(); // build array from file data while ( $data = fgetcsv( $adsFile, 1000 ) ) { $ads[ $data[0] ] = array ( 'filename' => $data[1], 'url' => $data[2], 'text' => $data[3] ); } fclose( $adsFile ); } // printout the html header require_once( 'header.php' ); if ( 'POST' == $_SERVER[ 'REQUEST_METHOD' ] ) { $from_stop = $_POST[ 'FromStation' ]; // @todo Filter! http://www.php.net/manual/en/filter.filters.validate.php $orig_station = $from_stop; $to_stop = $_POST[ 'ToStation' ]; // @todo Filter! http://www.php.net/manual/en/filter.filters.validate.php $dest_station = $to_stop; if ( $to_stop == $from_stop ) { $error = 'Originating and Destination stops are the same.'; } else { print_ad( $orig_station, $dest_station, $location='top' ); $travel_date = $_POST[ 'RequestDate' ]; $requestTime = $_POST[ 'RequestTime' ]; $am_pm = $_POST[ 'RequestAMPM' ]; $filter = $_POST[ 'sortBy' ]; /* * Lets post this to MTA.info */ $mta_curl = curl_init(); $mta_curl_options = array ( CURLOPT_FAILONERROR => true, CURLOPT_FOLLOWLOCATION => false, // CURLOPT_MUTE => true, CURLOPT_POST => true, CURLOPT_RETURNTRANSFER => true, CURLOPT_CONNECTTIMEOUT => 30, CURLOPT_TIMEOUT => 30, CURLOPT_URL => $mta_post_url, CURLOPT_REFERER => 'http://lirr42.mta.info/', CURLOPT_USERAGENT => 'MTA Info Scrape/1.0', CURLOPT_POSTFIELDS => http_build_query( $_POST ) ); curl_setopt_array( $mta_curl, $mta_curl_options ); $mta_response = curl_exec( $mta_curl ); if ( false === $mta_response ) { curl_close( $mta_curl ); die( sprintf( 'Response Code:%s, Curl Error No:%s, Curl Error Message:%s', curl_getinfo( $mta_curl, CURLINFO_HTTP_CODE ),curl_errno( $mta_curl ), curl_error( $mta_curl ) ) ); } else { curl_close( $mta_curl ); echo '<pre>' . htmlentities( $mta_response ) . '</pre>'; $html = new simple_html_dom(); $html->load( $mta_response ); $schedule_table = $html->find( 'table', 0 ); // find second table in the response $row_count = 0; // will hold HTML output for table $stopSchedule = '<table><th>Departs</th><th>Arrives</th><th>Minutes</th><th>Transfer</th><th>Fare</th></tr>'; foreach ( $schedule_table->find('tr') as $schedule_row ) { $row_count++; // check for and skip header row if ( 1 == $row_count ) continue; // check for and skip last row which first TD has rowspan attribute if ( $schedule_row->children(0)->colspan ) { continue; } // extract table data for this row $depart_time = $schedule_row->children( 0 )->innertext; $arrive_time = $schedule_row->children( 2 )->innertext; $minutes_traveled = $schedule_row->children( 4 )->innertext; $transfer = $schedule_row->children( 5 )->innertext; $fare = $schedule_row->children( 6 )->plaintext; // add table row to html output $stopSchedule .= sprintf( '<tr><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td></tr>', $depart_time, $arrive_time, $minutes_traveled, $transfer, $fare ) . "\n\r"; } // end foreach schedule_row // add table end tag to html output $stopSchedule .= '</table>'; // output table to browser echo $stopSchedule; print_ad( $orig_station, $dest_station, $location='bottom' ); } // end if mta response } // end if not same orig and dest } // end if POST if ( 'GET' == $_SERVER[ 'REQUEST_METHOD' ] || ! empty( $error ) ) { print_ad( null, null, $location = 'default' ); ?> <form method="post"> <?php if ( ! empty ( $error ) ) { echo sprintf( '<div class="error">%s</div>', $error ); } ?> From:<br /> <select id="orig_station" name="FromStation" tabindex="2"> <?php echo $stopSelectOptions; ?> </select> <br /> <br /> To:<br /> <select id="dest_station" name="ToStation" tabindex="3"> <?php echo $stopSelectOptions; ?> </select> <br /> <br /> <input id="date1" name="RequestDate" size="12" maxlength="10" tabindex="4" value="<?php echo date( 'm/d/Y' ); ?>" /> <?php $currentHour = trim( date( 'h' ) ); $currentMinutes = trim( date( 'i' ) ); $currentMinutesFloor = sprintf( '%02d' , $currentMinutes - ( $currentMinutes % 30 ) ); $currentAMPM = trim ( date( 'A' ) ); $start_time = strtotime( '1:00 AM' ); $end_time = strtotime( '1:00 PM' ); ?> <select name="RequestTime" tabindex="5"> <?php while( $start_time < $end_time ) { $option_time = date( 'h:i', $start_time ); $option_hour = trim( date( 'h', $start_time ) ); $option_minutes = trim( date( 'i', $start_time ) ); $option_selected = ( $option_hour == $currentHour && $option_minutes == $currentMinutesFloor ); echo sprintf( '<option %s>%s</option>', ( $option_selected ) ? 'selected="selected"' : '', $option_time ) . "\r\n"; $start_time = $start_time + 30*60; } ?> </select> <select name="RequestAMPM" tabindex="6"> <option value="AM" <?php echo ( 'AM' == $currentAMPM ) ? 'selected="SELECTED"' : ''; ?>>AM</option> <option value="PM" <?php echo ( 'PM' == $currentAMPM ) ? 'selected="SELECTED"' : ''; ?>>PM</option> </select> <input name="sortBy" type="hidden" value="1" /> <input type="submit" name="submit" value="Check Schedule" /> </form> <!-- images/base/ ic_menu_back.png ic_menu_home.png --> <?php } require_once( 'footer.php' ); ?> I'm trying to make a League of Legends (a video game) community website, both as a personal project and for practice. Now the game has a lot of champions, each of whom have 5 unique abilities. Now, I thought about manually inputting all the details about each champion into a MySQL database, but that would long and tedious, and I don't really have the time for it now. Also, the game patches very oftern (like, once every 2 weeks) which changes many of the stats, etc. of the champion, and it is not possible for me to keep manually updating these every time there is a patch. Fortunately, there is a League of Legends Wiki which has all the data I need in their specific champion pages, which they keep updated per patch. So I was wondering if there was any way to get the data from the divs in the wiki, and have it display on my site. What I want to do in my website is that whenever someone types a champion's name (in a post or whatever), I want it to display a hover-over dialog with some of the champions details. And a lot of other features such as that. In plain English I need a way to : > Tell PHP to go to the wiki's source code on a specific page > Find a specific div container > Get X data from there > Pass X data into a function to display the hover-over I think this way, I would not have to maintain a database as I can leech off the wiki's data. I have not coded anything like this before, so I would like a few pointers as to how to achieve this. Any help will be appreciated! Hi there, OK, there is what I'm trying to accomplish - I want to send POST instantly when the page is loaded without "Submit" button or anything like that. Here how it looks using JavaScript: Code: [Select] <form name="postdata" action="http://www.mywebsite.com/post.php" method="post"> <input name="data" type="text" value="1"/> </form> <script type="text/javascript" language="JavaScript"> document.postdata.submit(); </script> Works fine with JavaScript, however i can't use it. Here is reference for CURL: http://php.net/manual/en/book.curl.php I found this curl_post function at php.net to simplify things up. <?php /** * Send a POST requst using cURL * @param string $url to request * @param array $post values to send * @param array $options for cURL * @return string */ function curl_post($url, array $post = NULL, array $options = array()) { $defaults = array( CURLOPT_POST => 1, CURLOPT_HEADER => 0, CURLOPT_URL => $url, CURLOPT_FRESH_CONNECT => 1, CURLOPT_RETURNTRANSFER => 1, CURLOPT_FORBID_REUSE => 1, CURLOPT_TIMEOUT => 4, CURLOPT_POSTFIELDS => http_build_query($post) ); $ch = curl_init(); curl_setopt_array($ch, ($options + $defaults)); if( ! $result = curl_exec($ch)) { trigger_error(curl_error($ch)); } curl_close($ch); return $result; } $test = curl_post("http://www.mywebsite.com/post.php", array("data" => "1")); print($test); ?> Looks like it uses GET instead POST (used sniffer to verify this). I'm trying to send POST data to other host, if that matters. Any help would be appreciated. Thanks in advance. I have json code and i want to prine only {"currency":"DOGE","available":"0","reserved":"420.000000000"} data. my json data [{"currency":"1ST","available":"0","reserved":"0"},{"currency":"DOGE","available":"0","reserved":"420.000000000"},{"currency":"ZSC","available":"0","reserved":"0"}] I use this code
<?php ?> First of all... this is a GREAT forum. Been lurking around here for a long time and searched all over but still can't find a solution to a problem I'm facing. Here's my cURL code - I am basically retrieving data from a web site: <?php $curl_handle=curl_init(); curl_setopt($curl_handle,CURLOPT_URL,'http://myprimemobile.com/update.html'); curl_setopt($curl_handle,CURLOPT_CONNECTTIMEOUT,2); curl_setopt($curl_handle,CURLOPT_RETURNTRANSFER,1); $buffer = curl_exec($curl_handle); curl_close($curl_handle); if (empty($buffer)) { print "Connection Timeout<p>"; } else { print $buffer; } ?> If you take a look at the site http://myprimemobile.com/update.html -- it will begin to update new information. I want to basically retrieve the data from that page but in my own way. I want to loop through the page and search for all the names, addresses, and phone numbers only (not store hours or anything else), and want to put it in a variable so I can create a table with all that information on my page. Based on the code, all the information is stored in the $buffer variable. I want to break this variable down to extract certain data in a 'div' or a 'table'. Please help me out. Thanks! I'm using cURL to crawl and scrape data from a website. This website contains tables with rows of data. When I send a cURL POST for the underlying data at a specific row(A), it will return the expected data. But when I move to the second row(B), the data returns blank or specifically, a tons of spaces (or &#nbsp's.) When I access the cURL's POST location by browser, I can see (B)'s data. The only difference in the 2 POST's are location ID's for the data. I don't think it's a problem with JavaScript as I can successfully return data from row (A) as I mentioned. ok i have been trying for over a week now, and i just relearning php again after a few years of being lazy. lol here is a code snippet. Quote //prep the members $myFile2 = "members.txt"; $fh2 = fopen($myFile2, 'r'); $theData2 = fread($fh2, filesize($myFile2)); fclose($fh2); //echo $theData2; //grab the stats! $url = 'http://link'; $postdata = 'players='.$theData2.'&fields=all'; $ch = curl_init($url); curl_setopt($ch, CURLOPT_POST, true); curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); curl_setopt($ch, CURLOPT_POSTFIELDS, $postdata); $data = curl_exec($ch); curl_close($ch); $data = json_decode($data,true); the orininal code Quote //prep the members $members = "member,member1,member two,member blue"; //grab the stats! $url = 'http://link'; $postdata = 'players='.$members.'&fields=all'; $ch = curl_init($url); curl_setopt($ch, CURLOPT_POST, true); curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); curl_setopt($ch, CURLOPT_POSTFIELDS, $postdata); $data = curl_exec($ch); curl_close($ch); $data = json_decode($data,true); now the problium i am having is that with the original way, listing the members indevidualy is not the way i would like it, it would be easyer if it was in .txt form so i can add top it easyer. Also i have 300 people to add and the string doesnt seem to like that many people in it.....so i made a txt file, and it all works except that when i load the page it does not display the members stats from the external link. each players stats gets posted on the page and sorted who is ranked highest. echo works for echoing the contents of the file. however thats usless to me as i want the contents of the file to work like the orininal code. sorry if its confusing.... basicaly there is data from the link for each player asociated with there name. which is seperated by a , in the code. so each time a , is reached the code repeats its self posting each player till there are no more to post. the string is to short for me or just seems to not work with 300 people. i need it to read the external .txt file or something simular and work in the same mannor. I'm trying to send a HTTPS POST request with XML data to a server using PHP. Anything sends to the server requires authentication therefore I'll use cURL. Some background info.: the XML data is to request the server to upload a file from a specific URL to its local storage. One rule of using this API is I MUST set the content type for each request to application/xml. This is what I've done but isn't working... <?php $fields = array( 'data'=>'<n1:asset xmlns:n1="http://.....com/"><title>AAAA</title><fileName>http://192.168.11.30:8080/xxx.html</fileName><description>AAAA_desc</description><fileType>HTML</fileType></n1:asset>' ); $fields_string = ""; foreach($fields as $key=>$value) { $fields_string .= $key.'='.$value.'&'; } rtrim($fields_string,'&'); $url = "https://192.168.11.41:8443/"; $ch = curl_init(); curl_setopt($ch, CURLOPT_URL, $url); curl_setopt($ch, CURLOPT_POST, count($fields)); curl_setopt($ch, CURLOPT_POSTFIELDS, $fields_string); curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false); curl_setopt($ch, CURLOPT_HEADER, true); curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); curl_setopt($ch, CURLOPT_USERPWD, "admin:12345678"); curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-type: application/xml', 'Content-length: '. strlen($fields_string)) ); $result = curl_exec( $ch ); curl_close($ch); echo $result; ?> I am expected to get an XML reply of either upload successful or upload failed. But instead I am getting this error message. HTTP/1.1 415 Unsupported Media Type Server: Apache-Coyote/1.1 Content-Type: text/xml;charset=UTF-8 Content-Length: 0 Date: Thu, 02 Dec 2010 03:02:33 GMT I'm sure the file type is correct, the XML format is correct. I've tried urlencode the fields but it didn't work. What else I might have done wrong? I'm getting data from a remote XML file using the following as a result of a form: Code: [Select] <?php header('Content-type: text/xml'); //extract data from the post extract($_POST); //set POST variables $url = 'https://www.*******'; $fields = array( 'ESERIES_FORM_ID'=>urlencode($ESERIES_FORM_ID), 'MXIN_USERNAME'=>urlencode($MXIN_USERNAME), 'MXIN_PASSWORD'=>urlencode($MXIN_PASSWORD), 'MXIN_VRM'=>urlencode($MXIN_VRM), 'MXIN_TRANSACTIONTYPE'=>urlencode($MXIN_TRANSACTIONTYPE), 'MXIN_PAYMENTCOLLECTIONTYPE'=>urlencode($MXIN_PAYMENTCOLLECTIONTYPE), 'MXIN_CAPCODE'=>urlencode($MXIN_CAPCODE) ); //url-ify the data for the POST foreach($fields as $key=>$value) { $fields_string .= $key.'='.$value.'&'; } rtrim($fields_string,'&'); //open connection $ch = curl_init(); //set the url, number of POST vars, POST data curl_setopt($ch,CURLOPT_URL,$url); curl_setopt($ch,CURLOPT_POST,count($fields)); curl_setopt($ch,CURLOPT_POSTFIELDS,$fields_string); //execute post $result = curl_exec($ch); //close connection curl_close($ch); ?> Which is displaying what was the remote XML page, but has brought it to my domain. How would I go about taking the results of the displaying XML page and displaying them on the next page rather than just displaying the XML data? Many thanks in advanced! I've looked at various cURL tutorials and the PHP manual, but I don't see what I'm doing wrong here. My goal is to be able to send data to test.php and then have that page run a MySQL query to insert the TITLE and MESSAGE into the database. Any comments? :/ post.php <?php if(isset($_GET['title']) && isset($_GET['message']) && isset($_GET['times'])) { $array = array('title' => urlencode($_GET['title']), 'message' => urlencode($_GET['message'])); foreach($array as $key => $value) { $fields = $key.'='. $value .'&'; } rtrims($felds); //set our init handle $curl_init = curl_init(); //set our URL curl_setopt($curl_init, CURLOPT_URL, 'some url'); //set the number of fields we're sending curl_setopt($curl_init ,CURLOPT_POST, count($array)); for($i = 0; $i < $_GET['times']; $i++) { //send the POST data curl_setopt($curl_init, CURLOPT_POSTFIELDS, $fields); curl_exec($curl_init); echo 'post #'.$i. ' sent<br/>'; } //complete echo '<b>COMPLETE</b>'; //close session curl_close($curl_init); } else { ?> <form action="post.php" method="GET"> <table> <tr><td>Title</td><td><input type="text" name="title" maxlength="30"></td></tr> <tr><td>Content</td><td><textarea name="message" rows="20" cols="45" maxlength="2000"></textarea></td></tr> <tr><td>Process Amount</td><td><input type="text" name="times" size="5" maxlength="3"></td></tr> <tr><td>START</td><td><input type="submit" value="Initiate Posting"></td></tr> </table> </form> <?php } ?> test.php <?php mysql_connect('host', 'user', 'pass'); mysql_select_db('somedb'); if($_POST['title'] && $_POST['message']) { mysql_insert("INSERT INTO tests VALUES (null, '{$_POST['title']}', '{$_POST['message']}')"); } echo '<hr><br/>'; //query $query = mysql_query("SELECT * FROM tests ORDER BY id DESC"); while($row = mysql_fetch_assoc($query)) { echo $row['id'].'TITLE: '. $row['title'] .'<br/>MESSAGE: '. $row['message'] .'<br/><br/>'; } ?> Friends, I want to check a site if there is any data avaliable for a given Domain Name.... For example Quote http://www.keywordspy.com/research/search.aspx?tab=domain-organic&market=uk&q=abc.co.uk In this URl the Domain is the "&q=" POST Parameter which is: abc.co.uk In the Output, Observe the Columns Keyword, Position, Volume etc.... So for any Domain we want to check if there is one or more keyword returned. Here is the Example of a Domain for which there is no Data: Quote http://www.keywordspy.com/research/search.aspx?tab=domain-organic&market=uk&q=notfoundforme.co.uk As you Observe there is no data in Keyword, Position or Volume Columns for this Domain (i.e. notfoundforme.co.uk) All i want is to Echo All the Keywords, their Position and Volumes for any domain for which data is avaliable. Code: Here is the Code i have which scrapes the Whole data with Curl, but i am not able to make getkwspydata function work to get the Keyword, Position or Volume data for the domain, i know some regex and preg_match needs to be done but its too technical for me, may someone help me with this? <?php function getkwspydata($host) { $request = "http://www.google.com/search?q=" . urlencode("site:" . $host) . "&hl=en"; $request = "http://www.keywordspy.com/research/search.aspx?tab=domain-organic&market=uk&q=". urldecode($host); $data = getPageData($request); // preg_match('/<div id=resultStats>(About )?([\d,]+) result/si', $data, $p); // $value = ($p[2]) ? $p[2] : "n/a"; // $string = "<a href=\"" . $request . "\">" . $value . "</a>"; //return $string; print_r($data); } function getDomainName($host) { $hostparts = explode('.', $host); // split host name to parts $num = count($hostparts); // get parts number if(preg_match('/^(ac|arpa|biz|co|com|edu|gov|info|int|me|mil|mobi|museum|name|net|org|pp|tv)$/i', $hostparts[$num-2])) { // for ccTLDs like .co.uk etc. $domain = $hostparts[$num-3] . '.' . $hostparts[$num-2] . '.' . $hostparts[$num-1]; } else { $domain = $hostparts[$num-2] . '.' . $hostparts[$num-1]; } return $domain; } function getPageData($url) { if(function_exists('curl_init')) { $ch = curl_init($url); // initialize curl with given url curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']); // add useragent curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // write the response to a variable if((ini_get('open_basedir') == '') && (ini_get('safe_mode') == 'Off')) { curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects if any } curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5); // max. seconds to execute curl_setopt($ch, CURLOPT_FAILONERROR, 1); // stop when it encounters an error return @curl_exec($ch); } else { return @file_get_contents($url); } } $sitehost = ($_POST['sitehost']) ? $_POST['sitehost'] : $_SERVER['HTTP_HOST']; $sitedomain = getDomainName($sitehost); ?> <html> <head> <title>SEO Report for <?=$sitedomain;?></title> </head> <body> <form method="post" action="<?$_SERVER['PHP_SELF'];?>"> <p><b>Domain/Host Name:</b> <input type="text" name="sitehost" size='30' maxlength='50' value="<?=$sitehost;?>"> <input type="submit" value="Grab Details"></p> </form> <ul> <li>Google indexed pages: <?=getkwspydata($sitehost);?></li> </ul> </body> </html> Cheers Natasha T. I have looked up tutorials on Google. I am still confused. I built a php script that receives the data from my html form then filters out bad words. I am trying to get the script to resend the data to pm.cgi. The form was originally targeting pm.cgi but it needed filtered. This is why i am using php in the first place. Please help me pass these variables from the php script to the cgi script the same way that a html form would. Thank you. Here is my code, it prints correctly. I cannot get cURL to work. <?php $action = $_GET['action'] ; $login = $_REQUEST['login'] ; $ID = $_REQUEST['ID'] ; $text = $_REQUEST['Unused10'] ; $submit = $_REQUEST['submit'] ; function filterBadWords($str){ // words to filter $badwords=array( "badword1", "badword2"); // replace filtered words with random spaces $replacements=array( "****", "***", "****" ); for($i=0;$i < sizeof($badwords);$i++){ srand((double)microtime()*1000000); $rand_key = (rand()%sizeof($replacements)); $str=eregi_replace($badwords[$i], $replacements[$rand_key], $str); } return $str; } $text = filterBadWords($text); print_r($action); print_r($login); print_r($ID); print_r($text); print_r($submit); ?> I need to send the variables that i printed above to pm.cgi. Can you point me in the right direction? Thanks again! Hi guys Can someone help me about this: The php code can be revise username and password with CURL then check database and if username & password is correct return true else false. Thanks This topic has been moved to PHP Regex. http://www.phpfreaks.com/forums/index.php?topic=334273.0 |