PHP - Recursively Fill Array By Scraping
Hello everyone. I'm very new here, but I hope you can help me with a tricky problem. I no longer know how to approach it because it's difficult to visualize the solution.
Anyway, I have a script that goes to the root of a site (with cURL) and picks up categories (links on the site) via regex. All the links are placed into one big array. I've managed to create the first layer (dimension), but the problem comes when I need my script to delve into deeper dimensions. For each link it finds, I want it to go to that page, find the subcategories there, and place them in the correct subarray of my array. If the regex returns 0 matches, it should go up one step and move on to the next node's page, until the whole big array has been exhausted. Is this possible? Please help out, guys and gals. I'll provide more info and code if requested.

Similar Tutorials

Can anyone let me know what I am doing wrong? I am sure it will (after the fact) be obvious, but I don't see it right now. I wish to remove all array elements which do not implement ValidatorCallbackInterface. Thanks

<?php
interface ValidatorCallbackInterface{}
class ValidatorCallback implements ValidatorCallbackInterface{}

function array_filter_recursive($input) {
    foreach ($input as &$value) {
        if (is_array($value)) {
            $value = array_filter_recursive($value);
        }
    }
    return array_filter($input, function($v) {
        return $v instanceOf ValidatorCallbackInterface;
    });
}

function recursive_unset(&$array) {
    foreach ($array as $key => $value) {
        if (is_array($value)) {
            recursive_unset($value);
            if(empty($value)) {
                unset($array[$key]);
            }
        } elseif(!$value instanceOf ValidatorCallbackInterface) {
            unset($array[$key]);
        }
    }
}

$validatorCallback = new ValidatorCallback();
$rules=[
    'callbackId'=>"integer",
    'info'=>[
        'arrayofobjects'=>[$validatorCallback],
        'foo1'=>'bar1'
    ],
    'foo2'=>'bar2',
    'bla'=>[
        'a'=>'aa',
        'b'=>'bb',
    ],
    'singleobject'=>$validatorCallback
];
echo('original rules'.PHP_EOL);
var_dump($rules);

$desiredrules=[
    'info'=>[
        'arrayofobjects'=>[$validatorCallback]
    ],
    'singleobject'=>$validatorCallback
];
echo('desired rules'.PHP_EOL);
var_dump($desiredrules);

echo('array_filter_recursive'.PHP_EOL);
var_dump(array_filter_recursive($rules));

echo('recursive_unset'.PHP_EOL);
recursive_unset($rules);
var_dump($rules);
original rules
array(5) {
  ["callbackId"]=>
  string(7) "integer"
  ["info"]=>
  array(2) {
    ["arrayofobjects"]=>
    array(1) {
      [0]=>
      object(ValidatorCallback)#1 (0) {
      }
    }
    ["foo1"]=>
    string(4) "bar1"
  }
  ["foo2"]=>
  string(4) "bar2"
  ["bla"]=>
  array(2) {
    ["a"]=>
    string(2) "aa"
    ["b"]=>
    string(2) "bb"
  }
  ["singleobject"]=>
  object(ValidatorCallback)#1 (0) {
  }
}
desired rules
array(2) {
  ["info"]=>
  array(1) {
    ["arrayofobjects"]=>
    array(1) {
      [0]=>
      object(ValidatorCallback)#1 (0) {
      }
    }
  }
  ["singleobject"]=>
  object(ValidatorCallback)#1 (0) {
  }
}
array_filter_recursive
array(1) {
  ["singleobject"]=>
  object(ValidatorCallback)#1 (0) {
  }
}
recursive_unset
array(2) {
  ["info"]=>
  array(2) {
    ["arrayofobjects"]=>
    array(1) {
      [0]=>
      object(ValidatorCallback)#1 (0) {
      }
    }
    ["foo1"]=>
    string(4) "bar1"
  }
  ["singleobject"]=>
  object(ValidatorCallback)#1 (0) {
  }
}
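One possible reading of the output above: in array_filter_recursive the final array_filter() call also discards every nested array, because an array is never an instance of the interface; in recursive_unset the foreach iterates by value, so the recursive call edits a copy and the nested 'foo1' entry survives. The sketch below is one way to get the desired result, not the only one. The function name filter_callbacks is hypothetical; the interface, class, and $rules array are taken from the post.

```php
<?php
interface ValidatorCallbackInterface {}
class ValidatorCallback implements ValidatorCallbackInterface {}

/**
 * Hypothetical helper: keep only ValidatorCallbackInterface instances,
 * at any depth, and drop nested arrays that end up empty.
 */
function filter_callbacks(array $input): array
{
    $result = [];
    foreach ($input as $key => $value) {
        if (is_array($value)) {
            // Filter the nested array first, then keep it only if something is left
            $filtered = filter_callbacks($value);
            if ($filtered !== []) {
                $result[$key] = $filtered;
            }
        } elseif ($value instanceof ValidatorCallbackInterface) {
            $result[$key] = $value;
        }
        // Everything else (strings, integers, ...) is skipped
    }
    return $result;
}

$validatorCallback = new ValidatorCallback();
$rules = [
    'callbackId'   => 'integer',
    'info'         => ['arrayofobjects' => [$validatorCallback], 'foo1' => 'bar1'],
    'foo2'         => 'bar2',
    'bla'          => ['a' => 'aa', 'b' => 'bb'],
    'singleobject' => $validatorCallback,
];

$filteredRules = filter_callbacks($rules);
var_dump($filteredRules);
```

Building a new array instead of unsetting in place avoids the by-value/by-reference pitfall entirely, at the cost of one extra copy of the kept entries.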
Hi, I have a multidimensional array where each array has a parent id node to create a hierarchical tree.

$hierarchy[] = array('id' => 1, 'parent_id' => 0, 'name' => 'root1');
$hierarchy[] = array('id' => 2, 'parent_id' => 0, 'name' => 'root2');
$hierarchy[] = array('id' => 3, 'parent_id' => 1, 'name' => 'root1-1');
$hierarchy[] = array('id' => 4, 'parent_id' => 1, 'name' => 'root1-2');
$hierarchy[] = array('id' => 5, 'parent_id' => 3, 'name' => 'root1-1-1');
$hierarchy[] = array('id' => 6, 'parent_id' => 2, 'name' => 'root2-1');

I'm trying to come up with a recursive key/value search routine that will return all ancestor arrays of the found item without knowing the depth of the tree. All nodes with 0 for the parent_id are root level nodes. Basically I want to search for something like "where key = name and value = xxx" and have it return all ancestors of that node.

So if I wanted to search for "key = name and value = root1-1", it should return an array like:

array[0] = array('id' => 1, 'parent_id' => 0, 'name' => 'root1'); //parent node first
array[1] = array('id' => 3, 'parent_id' => 1, 'name' => 'root1-1'); //first child after parent

If I were to search for "key = name and value = root1-1-1", it should return:

array[0] = array('id' => 1, 'parent_id' => 0, 'name' => 'root1'); //parent node first
array[1] = array('id' => 3, 'parent_id' => 1, 'name' => 'root1-1'); //first child after parent
array[2] = array('id' => 5, 'parent_id' => 3, 'name' => 'root1-1-1'); //first grandchild

So the main problem comes in the iteration and keeping track of parents. If I just want the array with the answer, I can get that node, but I can't get it with all of the ancestors attached. How would you go about this? Any good ideas out there? Thanks!
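Because the data here is a flat list with parent_id pointers rather than a nested tree, one option is to skip recursion: find the matching node, then follow parent_id upward until a root (parent_id 0) is reached. The following is a minimal sketch under that assumption; the function name find_with_ancestors is hypothetical, and it returns only the first match for the given key and value.

```php
<?php
$hierarchy = [
    ['id' => 1, 'parent_id' => 0, 'name' => 'root1'],
    ['id' => 2, 'parent_id' => 0, 'name' => 'root2'],
    ['id' => 3, 'parent_id' => 1, 'name' => 'root1-1'],
    ['id' => 4, 'parent_id' => 1, 'name' => 'root1-2'],
    ['id' => 5, 'parent_id' => 3, 'name' => 'root1-1-1'],
    ['id' => 6, 'parent_id' => 2, 'name' => 'root2-1'],
];

/**
 * Hypothetical helper: return the first node where $node[$key] === $value,
 * preceded by all of its ancestors (root first). Returns [] if nothing matches.
 */
function find_with_ancestors(array $nodes, string $key, $value): array
{
    // Index the rows by id so each parent lookup is a direct array access
    $byId = array_column($nodes, null, 'id');

    // Locate the first matching node
    $current = null;
    foreach ($nodes as $node) {
        if (isset($node[$key]) && $node[$key] === $value) {
            $current = $node;
            break;
        }
    }

    // Walk up the parent_id chain, prepending so the root ends up first.
    // $seen guards against an accidental cycle in the data.
    $path = [];
    $seen = [];
    while ($current !== null && !isset($seen[$current['id']])) {
        $seen[$current['id']] = true;
        array_unshift($path, $current);
        $current = $byId[$current['parent_id']] ?? null;
    }
    return $path;
}

print_r(array_column(find_with_ancestors($hierarchy, 'name', 'root1-1-1'), 'name'));
```

The depth of the tree never needs to be known in advance: the loop simply stops when a parent_id (such as 0) has no matching row.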
I have a form that saves the input data as such:

Quote
bass inshore offshore Array, regional, mid, 70, 40, 2 x yr, all, fellowship knowledge information learnig, one, one, three, four, one, one, two, three, three, four, TPWD round table discussions conservation biology, the guide talking about fishing the venturi, Jack Ellis and TAG fishing in the East Texas creeks., none at present
bass inshore offshore Array, regional, mid, 70, 40, 2 x yr, all, fellowship knowledge information learnig, one, one, three, four, one, one, two, three, three, four, TPWD round table discussions conservation biology, the guide talking about fishing the venturi, Jack Ellis and TAG fishing in the East Texas creeks., none at present
bass inshore offshore Array, regional, mid, 70, 40, 2 x yr, all, fellowship knowledge information learnig, one, one, three, four, one, one, two, three, three, four, TPWD round table discussions conservation biology, the guide talking about fishing the venturi, Jack Ellis and TAG fishing in the East Texas creeks., none at present

This PHP program reads it back to the screen as it is written and as you see it above.

Code:
<?php
$YourFile = "meeting.survey";
$handle = fopen($YourFile, 'r');
while (!feof($handle)) {
    $Data = fgets($handle);
    print $Data;
    print "<hr>";
}
fclose($handle);
?>

The code below fails. I feel it is because I am using 'explode' incorrectly.
Code:
<?php
$YourFile = "meeting.survey";
$handle = fopen($YourFile, 'r');
while (!feof($handle)) {
    $Data = fgets($handle);
    $answers = explode(',', $Data);
    print $answers[0];
    print $answers[1];
    print $answers[2];
    print $answers[3];
    print $answers[4];
    print $answers[5];
    print $answers[6];
    print $answers[7];
    print $answers[8];
    print $answers[9];
    print $answers[10];
    print $answers[11];
    print $answers[12];
    print $answers[13];
    print $answers[14];
    print $answers[15];
    print $answers[16];
    print $answers[17];
    print $answers[18];
    print $answers[19];
    print $answers[20];
    print $answers[21];
    print "<hr>";
}
fclose($handle);
?>

My PHP coding skills are minimal at best; any help, direction, or comments you have will be greatly appreciated. If you wish to view the survey html go here. If you wish to view the php code used to produce the data,

Code:
wget http://texasflyfishers.org/quiz/php_tutorial/survey.php

If you wish to view a readback of the data go here. Everything I've been able to find on the web tells me this is what I need to do if I want to add some text that tells what each response means.

Hello, in my script I need to get the content of another PHP file as a string, including the content of all the files which are included in it and in lower levels. Any idea how to do it? I tried output buffer + include, but it doesn't get the content of the included files. Thanks

Hi all. The title pretty much says it all. I really have no idea if this is even possible or where to start. A recent dilemma caused me to have to change a field name with some spaces in it. I have a database with ~65 tables, a majority of which were converted from Access. So I'm curious as to how many others have bad field names. Is there a way to use PHP to go through all the field names in a database and remove spaces?

Hello,
Currently my web scraper signs into the site and pulls all the HTML, which works perfectly.
What I need to do is loop over only specific information (the horses that ran).
Here is my current PHP code:
<? $url = 'site'; $postdata = array('username' => "username", 'password' => "password"); $ch = curl_init(); if($ch){ curl_setopt($ch, CURLOPT_URL, $url); curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 15); curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); curl_setopt($ch, CURLOPT_POST, 1); curl_setopt($ch, CURLOPT_POSTFIELDS, $postdata); curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookies.txt'); // set cookie file to given file curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookies.txt'); // set same file as cookie jar $content = curl_exec($ch); $headers = curl_getinfo($ch); curl_close($ch); // Debug option // print_r($headers); if($headers['http_code'] == 200){ echo $content; } } ?>here is the html im pulling <table width=100% border=1><tr><td class=instruction6 colspan=4><b>My Race Notes</b></td></tr> <tr><td width=90%><form action='races.php?id=7456132' method=post> <textarea name='comments' rows=2 cols=38>Type notes & press Add</textarea></td> <td width=5%><input type=submit class='weestatbutton' value='Add'></form></td></tr></table></td></tr></table><table width=100%><tr class=databreakdown2253><th><a href='races.php?id=7456132&sortby=1'>Place</a></th><th>Dist Bt</th><th>Stall</th> <th>Horse</th><th>Age</th><th><a href='races.php?id=7456132&sortby=3'>Weight</a></th><th>Headgear</th><th>OR</th><th>Trainer</th> <th><a href='races.php?id=7456132&sortby=2'>Odds</a></th><th>Jockey (Claim)</th></tr><tr><td class=databreakdown2253>1st</td><td class=databreakdown2253></td><td class=databreakdown2253>4</td> <td class=databreakdown2253><a href='horses.php?id=298745'>Telegraph (IRE)</a></td> <td class=databreakdown2253>3</td><td class=databreakdown2253>9-3</td><td class=databreakdown2253></td> <td class=databreakdown2253>57</td> <td class=databreakdown2253><a href='trainers.php?id=2448'>Evans, P D</a></td> <td class=databreakdown2253>28/1 </td> <td class=databreakdown2253><a href='jockeys.php?id=694'>Egan, John</a> </td></tr><tr class=databreakdown18><td colspan=12>soon led, brought field 
stands side from 3f out, headed 2f out, rallied inside final furlong, bumped and led again towards finish</td></tr><tr><td class=databreakdown2253>2nd</td><td class=databreakdown2253>0.5</td><td class=databreakdown2253>3</td> <td class=databreakdown2253><a href='horses.php?id=305855'>Ecliptic Sunrise</a></td> <td class=databreakdown2253>3</td><td class=databreakdown2253>8-12td><td class=databreakdown2253></td> <td class=databreakdown2253>52</td> <td class=databreakdown2253><a href='trainers.php?id=4516'>Donovan, D</a></td> <td class=databreakdown2253>10/1 </td> <td class=databreakdown2253><a href='jockeys.php?id=3414'>Cosgrave, Pat</a> </td></tr><tr class=databreakdown18><td colspan=12>chased leaders, challenged 2f out, led 2f out, edged right inside final furlong, rider lost whip and headed towards finish</td></tr><tr><td class=databreakdown2253>3rd</td><td class=databreakdown2253>1.5</td><td class=databreakdown2253>1</td> <td class=databreakdown2253><a href='horses.php?id=300316'>Bookmaker</a></td> <td class=databreakdown2253>4</td><td class=databreakdown2253>9-6</td><td class=databreakdown2253><a title='Blinkers worn'>Blnk</a></td> <td class=databreakdown2253>59</td> <td class=databreakdown2253><a href='trainers.php?id=933'>Bridger, J J</a></td> <td class=databreakdown2253>6/1 </td> <td class=databreakdown2253><a href='jockeys.php?id=3848'>Carson, William</a> </td></tr><tr class=databreakdown18><td colspan=12>prominent, took keen hold, led 2f out, headed over 1f out, not much room inside final furlong, stayed on same pace</td></tr><tr><td class=databreakdown2253>4th</td><td class=databreakdown2253>1</td><td class=databreakdown2253>2</td> <td class=databreakdown2253><a href='horses.php?id=261986'>Night Trade (IRE)</a></td> <td class=databreakdown2253>7</td><td class=databreakdown2253>8-8</td><td class=databreakdown2253><a title='Cheekpieces worn'>CkPc</a></td> <td class=databreakdown2253>50</td> <td class=databreakdown2253><a href='trainers.php?id=2653'>Harris, R 
A</a></td> <td class=databreakdown2253>6/1 </td> <td class=databreakdown2253><a href='jockeys.php?id=7348'>Hardie, Cameron</a> (3)</td></tr><tr class=databreakdown18><td colspan=12>prominent, ridden over 2f out, switched left inside final furlong, no extra close home</td></tr><tr><td class=databreakdown2253>5th</td><td class=databreakdown2253>1.5</td><td class=databreakdown2253>6</td> <td class=databreakdown2253><a href='horses.php?id=299296'>Trigger Park (IRE)</a></td> <td class=databreakdown2253>3</td><td class=databreakdown2253>8-10</td><td class=databreakdown2253></td> <td class=databreakdown2253>50</td> <td class=databreakdown2253><a href='trainers.php?id=2653'>Harris, R A</a></td> <td class=databreakdown2253>20/1 </td> <td class=databreakdown2253><a href='jockeys.php?id=3422'>Dobbs, Pat</a> </td></tr><tr class=databreakdown18><td colspan=12>chased leaders, ridden over 2f out, one pace over 1f out, no impression</td></tr><tr><td class=databreakdown2253>6th</td><td class=databreakdown2253>2.25</td><td class=databreakdown2253>7</td> <td class=databreakdown2253><a href='horses.php?id=300337'>Port Lairge</a></td> <td class=databreakdown2253>4</td><td class=databreakdown2253>8-11</td><td class=databreakdown2253><a title='Blinkers worn'>Blnk</a></td> <td class=databreakdown2253>50</td> <td class=databreakdown2253><a href='trainers.php?id=914'>Gallagher, J</a></td> <td class=databreakdown2253>33/1 </td> <td class=databreakdown2253><a href='jockeys.php?id=193'>Catlin, Chris</a> </td></tr><tr class=databreakdown18><td colspan=12>slowly into stride, in rear, stayed on inside final furlong, never dangerous</td></tr><tr><td class=databreakdown2253>7th</td><td class=databreakdown2253>NK</td><td class=databreakdown2253>11</td> <td class=databreakdown2253><a href='horses.php?id=289934'>Lionheart</a></td> <td class=databreakdown2253>4</td><td class=databreakdown2253>8-13</td><td class=databreakdown2253></td> <td class=databreakdown2253>59</td> <td class=databreakdown2253><a 
href='trainers.php?id=4910'>Crate, Peter</a></td> <td class=databreakdown2253>10/1 </td> <td class=databreakdown2253><a href='jockeys.php?id=7375'>Crouch, Hector</a> (7)</td></tr><tr class=databreakdown18><td colspan=12>reared start and slowly away, held up in rear, headway over 1f out, weakened inside final furlong</td></tr><tr><td class=databreakdown2253>8th</td><td class=databreakdown2253>2.75</td><td class=databreakdown2253>14</td> <td class=databreakdown2253><a href='horses.php?id=289421'>Koharu</a></td> <td class=databreakdown2253>4</td><td class=databreakdown2253>9-4</td><td class=databreakdown2253><a title='Cheekpieces worn'>CkPc</a></td> <td class=databreakdown2253>60</td> <td class=databreakdown2253><a href='trainers.php?id=2495'>Makin, P J</a></td> <td class=databreakdown2253>9/4 (Fav) </td> <td class=databreakdown2253><a href='jockeys.php?id=5952'>Bates, Mr D J</a> (3)</td></tr><tr class=databreakdown18><td colspan=12>in rear, ridden over 3f out, no impression</td></tr><tr><td class=databreakdown2253>9th</td><td class=databreakdown2253>3</td><td class=databreakdown2253>5</td> <td class=databreakdown2253><a href='horses.php?id=269827'>Saskias Dream</a></td> <td class=databreakdown2253>6</td><td class=databreakdown2253>9-6</td><td class=databreakdown2253><a title='Visor worn'>Vsor</a></td> <td class=databreakdown2253>59</td> <td class=databreakdown2253><a href='trainers.php?id=2002'>Chapple-Hyam, Jane</a></td> <td class=databreakdown2253>4/1 </td> <td class=databreakdown2253><a href='jockeys.php?id=3544'>Hughes, Richard</a> </td></tr><tr class=databreakdown18><td colspan=12>mid-division, headway and switched left over 1f out, edged left entering final furlong, soon eased</td></tr><tr><td class=databreakdown2253>10th</td><td class=databreakdown2253>1.75</td><td class=databreakdown2253>12</td> <td class=databreakdown2253><a href='horses.php?id=304248'>Crafty Business (IRE)</a></td> <td class=databreakdown2253>3</td><td class=databreakdown2253>9-2</td><td 
class=databreakdown2253><a title='Visor worn'>Vsor</a></td> <td class=databreakdown2253>59</td> <td class=databreakdown2253><a href='trainers.php?id=695'>Moore, G L</a></td> <td class=databreakdown2253>14/1 </td> <td class=databreakdown2253><a href='jockeys.php?id=6669'>Bishop, Mr C</a> (3)</td></tr><tr class=databreakdown18><td colspan=12>towards rear, pushed along over 3f out, well beaten 2f out</td></tr></table><br><hr></td></tr></table>
*Note: I'm using this for personal reasons.

Hi, I'm trying to work out a way to get the New York Lottery's Take 5 results. There are a few sites that list the winning numbers, I assume automatically, as there are a lot of lottery games on these sites. What would be the best way to get this?
http://www.myfreepost.com/lottery/index.php/us/newyorklottery/takefive/result/
http://www.elite-lottery-results.com/?action=view_game&gid=NY2

OK, I know how to screen scrape, but I don't know how to screen scrape when there is a login. I've looked this up for a while, but no luck. I'd also like to make it so the script can execute a URL while it is logged in, for example: http://site.com/data.php?id=9912&submit=1 Thanks in advance.

Okay, so I am scraping websites for their descriptions, keywords and titles. I noticed that a lot of websites use the same keywords and descriptions on every page, so my idea is to scrape the index, find all the links in there, and scrape them all. Then, after they have been scraped, I would check all of the descriptions, and if the descriptions match, pull some text unique to each page and use that. I can't seem to wrap my head around it. How would I accomplish this? I scrape with curl, then find the keywords, description and title, then find all links on the site and scrape those. So I was thinking of making an array of the descriptions and then checking and inserting into the db, but it doesn't seem like it would work. Any ideas? Oh also.. 
how would I grab just the text from each page that is different from every other page? lol, very confusing

I need to scrape pages. I only need one page at a time, and I'm only looking for 2 or 3 bits of data within each page. Can someone give me some pointers on where to start? I've searched and seen names like DOMXpath and Xpath mentioned. Do I need these? It's important that I can run the script on standard Linux hosting with nothing extra installed, like packages. I'd like to have something I can use immediately with standard PHP and its functions. I've seen plenty of tutorials and YouTube videos; I'm just looking for recommendations and pointers on recommended practices. Thanks, OM

More information on the job posting. I am looking to fetch information from daily deal websites such as tuango.ca, socialliving.com, groupon.com, etc. I want to retrieve data from different daily deal sites, and I want to retrieve all the deals of the day from each different city on the website. For example, www.tuango.ca has a deal a day in Montreal, Toronto, etc. I want to be able to retrieve data from all the different locations within the site. I want the script to fetch the data of the deals. To be more clear, I want the script to fetch:

What site the deal was on
What location it was for
The title of the deal
The price of the deal
The value of the deal
The saving in percentage of the deal
How many were sold
The minimum amount of the deal before it becomes activated
The company who did the deal
Company address
Company postal code
Company phone number
(there might be more categories; we will talk more if you pass this stage of the interview process)

Once all this data is fetched, I need it to be stored automatically in a database. Every morning at 4 am (Eastern time) I need it to run the script, because the day's deals finish at midnight and it's the only way of getting the total number of coupons sold. 
You'll usually see the final stats of the deal on the recent deals page of the website. I want to know how a site like http://onespout.com/deals/montreal did it. I'm not asking somebody to do it for me; I'm just asking someone to guide me in taking the right steps.

I need some help scraping links from a specified page. For example, if I have a page like this, http://br.4ce.info/, I want to scrape all the links on that page and show them in a WordPress widget on another blog. Can you help me with this? Don't use an iframe; I think it's better to use cURL. Thanks

I am a newbie and I am trying to do a site scraping project to obtain all the following fields: Test Year, Test Name, Grade Level, Question #, Question Type, Reporting Category, Standard #, Standard description, Example Question (with image), from a site that has a page for each question. http://www.doe.mass.edu/mcas/search/question.aspx?mcasyear=2010&QuestionSetID=1&grade=8&subjectcode=MTH&questionnumber=36 I am a newbie at PHP and would love it if you could point me in the right direction. The page uses tables, and I need to extract the data from the body of the page, as well as some of the info from the URL, and then have it inserted into a MySQL database. Thank you so much for your help.

I have a form which lets the user put in the URL of their Twitter account. When they enter their URL, I am trying to create a screen scraping script that scrapes that page to get basic information like their Twitter name and number of tweets. I'm not sure how I am going to do this. I don't think there is a Twitter API for this, so I may have to use something like cURL. I was just wondering if anyone has done this and could give me any advice about the best method? 
Thanks for any help

Hi, I have written the following code, which scrapes price info from a website:

$url = 'http://www.mydomain.com';
$html = file_get_contents($url);
$pattern = '/<span class="price">(.*?)<\/span>/';
preg_match_all($pattern, $html, $matches);
print_r($matches);

It works well; however, I need to add in the delivery cost for each array element with a different pattern:

'/<span class="delivery">(.*?)<\/span>/';

Any idea how I can do this so each array element has both the price and delivery costs in a two-dimensional array? Thanks for your advice

I'm trying to pull a stock quote's Beta from Yahoo Finance, since the Yahoo Query Language doesn't support it. My code returns an empty array. Any ideas why?

Code:
<?php
$content = file_get_contents('http://finance.yahoo.com/q?s=NFLX');
preg_match('#<tr><th width="48%" scope="row">Beta:</th><td class="yfnc_tabledata1">(.*)</td></tr>#', $content, $match);
print_array($match);
?>

Hello, I have checked out many of the scripts and tried implementing them to help me scrape one single image from a URL. Example: www.123.com/333.png Getting a script to scrape that image isn't the problem. I'm not sure how to implement a simple curl call that saves the image every 30 minutes and names it in successive order, so the files appear as 1.jpg, 2.jpg, 3.jpg. I am working with a Debian 6 server, and PHP would be the easiest way to do this that I can work with. I have searched the web endlessly and still can't produce such a thing. Any help is appreciated.

I'm looking to scrape the schedule details for any particular class at my university as part of a school project. I have been able to log a student into the university site and grab their name and course information. In order to grab the schedule for a particular class, I now have to visit a different area of the university site, the registrar. The course schedule section of the registrar is coded in ASP.NET, and I'm having trouble making HTTP requests to this area of the site. 
I understand the need to make POST requests to mimic the Viewstate, but I'm running into an issue before I even get to that part. I am able to load the page via an HTTP request almost every time, but it always takes almost exactly 2 minutes. I have tried simple GET requests, POST requests with the Viewstate, and other variations to one of a few different pages on the site. Each time it works, but each time it takes 2 minutes. Any ideas why it takes so long? Any suggestions on what I can possibly do differently?

Here is the basic site I'm using to test my code on before implementing it fully into my program: University Site
Here is my link that takes 2 minutes to load the same page: My Site
Here is the latest code I've tried:

Code:
<?php
$postdata = "__VIEWSTATE=/wEPDwULLTIwNjY2MzUzMDEPZBYCAgUPDxYCHgRUZXh0BRNNYXIgMjMgMjAxMSAgNzoxNVBNZGQYAQUeX19Db250cm9sc1JlcXVpcmVQb3N0QmFja0tleV9fFgEFDmN0bDEwJGltZ0xvZ2luaYy4H4gz+Bjb4GVdsO1ecd9c9EA=";
$postdata .= "&__EVENTVALIDATION=/wEWAgKs/IaWBAKpyP2zAXWcNEO0tMqDX53r6m+Hzo/nKHwZ";
$postdata = urlencode($postdata);
$host = 'courseschedules.njit.edu';
$path = '/index.aspx';
$fp1 = fsockopen($host,80,$errno,$errstr,30);
if(!$fp1) die($_err.$errstr.$errno);
else {
    fputs($fp1, "POST $path HTTP/1.1\r\n");
    fputs($fp1, "Host: $host\r\n");
    fputs($fp1, "User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.15) Gecko/20110303 Firefox/3.6.15 ( .NET CLR 3.5.30729)\r\n");
    fputs($fp1, "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n");
    fputs($fp1, "Accept-Language: en-us,en;q=0.5\r\n");
    fputs($fp1, "Accept-Encoding: gzip,deflate\r\n");
    fputs($fp1, "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\n");
    fputs($fp1, "Keep-Alive: 115\r\n");
    fputs($fp1, "Connection: keep-alive\r\n");
    fputs($fp1, "Content-length: ".strlen($postdata)."\r\n\r\n");
    fputs($fp1, $postdata."\r\n\r\n");
    $response = '';
    while(!feof($fp1)) $response .= fgets($fp1,2000);
    fclose($fp1);
    echo $response;
}
?>

Like I said, I've also 
tried a standard GET request, which works as well; it just takes 2 minutes.

I am writing a SQL dump file, and some of my fields have ' in them. For example, the name is "Joe's Cake Shop". How should I add a ' in front of each ' to make it look like Joe''s Cake Shop? Also, I got the idea of adding ' in front of ' from seeing another database dump. Can someone please enlighten me as to why I should do it?

My code:

Code:
<?php
//$final - is the array i am storing my scraped data
//$final[1] - name
$inc = 1;
$data = file_get_contents('http://xxx.com');
$regex = '~<td\s+colspan="2"\s+width="350"><font\s+size="2">\s+<b>\s+(.*?) <\/b><br>(.*?) <br>(.*?),\s+(.*?)\s+<br>(.*?), (.*?)\s+<BR><BR><font\s+size="2"><img\s+src="\.\.\/images\/phone1.gif"\s+align="left"\s+hspace="4"\s+alt\s+=(.*)>\s+-\s+Phone\s+#\s+(.*?)\s+<\/font>\s+<BR>\s+<font\s+size\s+="1">~';
preg_match_all($regex, $data, $final);
$jlimit = count($final[0]);
for($j=0 ;$j < $jlimit; $j++) {
    $filename = 'cake.sql';
    $somecontent = "(".$inc.", '".$final[1][$j]."', '".$final[2][$j]."', '".$final[3][$j]."', '".$final[4][$j]."', '".$final[6][$j]."', '".$final[8][$j]."'),\n";
    if (is_writable($filename)) {
        if (!$handle = fopen($filename, 'a')) {
            echo "Cannot open file ($filename)";
            exit;
        }
        if (fwrite($handle, $somecontent) === FALSE) {
            echo "Cannot write to file ($filename)";
            exit;
        }
        echo "Success, wrote ($somecontent) to file ($filename)";
        $inc = $inc + 1;
        fclose($handle);
    } else {
        echo "The file $filename is not writable";
    }
}
?>
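On the "why": inside a SQL string literal a single quote ends the string, so a literal quote has to be written as two quotes ('') for the statement to parse. A minimal sketch of doing that while building dump rows follows. The helper names sql_quote and sql_row are hypothetical, and the sketch assumes the values contain no backslashes (MySQL also treats a backslash as an escape character by default). When inserting through a live connection rather than writing a file, prepared statements are usually the safer route.

```php
<?php
/**
 * Hypothetical helper: wrap a value in single quotes for a SQL dump,
 * doubling any single quote inside it.
 * Assumption: the value contains no backslashes.
 */
function sql_quote(string $value): string
{
    return "'" . str_replace("'", "''", $value) . "'";
}

/**
 * Hypothetical helper: build one "(id, 'a', 'b', ...)" tuple for an INSERT.
 */
function sql_row(int $id, array $fields): string
{
    return '(' . $id . ', ' . implode(', ', array_map('sql_quote', $fields)) . ')';
}

echo sql_row(1, ["Joe's Cake Shop", '12 Main St']), "\n";
// prints: (1, 'Joe''s Cake Shop', '12 Main St')
```

In the loop from the post, each $final[n][$j] value would be passed through such a helper instead of being concatenated between quotes by hand.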