PHP - Extract Text From Page
Hi All,
Bit of a strange one, but I would like to be able to supply a URL to a page. This page will always contain an image and the copyright notice that goes with it, for example http://www.geograph.org.uk/photo/693325. The copyright notice lies underneath the image.

I would like some PHP code that automatically grabs the image, copies it to a directory on my site, and also takes the Creative Commons copyright notice as a string (which I will then display alongside the image when I add it to my site). How can I do this in PHP?

I know that the word "copyright" only ever appears once on the page (as part of the bit I'm trying to grab), so can I use this somehow to grab the whole string?

Basically I'm being lazy and would like to automate grabbing the image and copyright notice without having to download it to my computer first and re-upload it to my server (as I will be doing this quite a lot).

Any ideas much appreciated.
Thanks

Similar Tutorials

Hey folks, I am trying to create a small script that will retrieve content from a site, strip it of everything but human-readable words, then remove numbers, single letters, and words that I specify. I have the following code, which is live on http://salesleadhq.com/tools/crawler/meta.php?url=http://www.cooking.com. My problem is that it is not removing all of the words I specify, only some. I think I would rather have an external word list as well, if anyone can assist me with that. Thank you!
Code:
<?php
$url = (isset($_GET['url']) ? $_GET['url'] : 0);
$str = file_get_contents($url);

####################################################################3
function get_url_contents($url){
    $crl = curl_init();
    $timeout = 5;
    curl_setopt ($crl, CURLOPT_URL,$url);
    curl_setopt ($crl, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt ($crl, CURLOPT_CONNECTTIMEOUT, $timeout);
    $ret = curl_exec($crl);
    curl_close($crl);
    return $ret;
}

#--------------------------------------Strip html tag----------------------------------------------------
function StripHtmlTags( $text )
{
    // PHP's strip_tags() function will remove tags, but it
    // doesn't remove scripts, styles, and other unwanted
    // invisible text between tags. Also, as a prelude to
    // tokenizing the text, we need to ensure that when
    // block-level tags (such as <p> or <div>) are removed,
    // neighboring words aren't joined.
    $text = preg_replace(
        array(
            // Remove invisible content
            '@<head[^>]*?>.*?</head>@siu',
            '@<style[^>]*?>.*?</style>@siu',
            '@<script[^>]*?.*?</script>@siu',
            '@<object[^>]*?.*?</object>@siu',
            '@<embed[^>]*?.*?</embed>@siu',
            '@<applet[^>]*?.*?</applet>@siu',
            '@<noframes[^>]*?.*?</noframes>@siu',
            '@<noscript[^>]*?.*?</noscript>@siu',
            '@<noembed[^>]*?.*?</noembed>@siu',
            // Add line breaks before & after blocks
            '@<((br)|(hr))@iu',
            '@</?((address)|(blockquote)|(center)|(del))@iu',
            '@</?((div)|(h[1-9])|(ins)|(isindex)|(p)|(pre))@iu',
            '@</?((dir)|(dl)|(dt)|(dd)|(li)|(menu)|(ol)|(ul))@iu',
            '@</?((table)|(th)|(td)|(caption))@iu',
            '@</?((form)|(button)|(fieldset)|(legend)|(input))@iu',
            '@</?((label)|(select)|(optgroup)|(option)|(textarea))@iu',
            '@</?((frameset)|(frame)|(iframe))@iu',
        ),
        array(
            ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ',
            "\n\$0", "\n\$0", "\n\$0", "\n\$0",
            "\n\$0", "\n\$0", "\n\$0", "\n\$0",
        ),
        $text );

    // Remove all remaining tags and comments and return.
    return strtolower( $text );
}

function RemoveComments( & $string )
{
    $string = preg_replace("%(#|;|(//)).*%","",$string);
    $string = preg_replace("%/\*(?:(?!\*/).)*\*/%s","",$string); // google for negative lookahead
    return $string;
}

$html = StripHtmlTags($str);

###Remove number in html################
$html = preg_replace("/[0-9]/", " ", $html); #replace by ' '
$html = str_replace(" ", " ", $html);

######remove any words################
$remove_word = array("amp","carry","serious","for","re","looking","accessories","you","used","wright","none","selection","come","second","you","new","a","able","about","across","after","all","almost","also","am","among","an","and","any","are","as","at","be","because","been","but","by","can","cannot","could","dear","did","do","does","either","else","ever","every","for","from","get","got","had","has","have","he","her","hers","him","his","how","however","i","if","in","into","is","it","its","just","least","let","like","likely","may","me","might","most","must","my","neither","no","nor","not","of","off","often","on","only","or","other","our","own","rather","said","say","says","she","should","since","so","some","than","that","the","their","them","then","there","these","they","this","tis","to","too","twas","us","wants","was","we","were","what","when","where","which","while","who","whom","why","will","with","would","yet","you","your");
foreach($remove_word as $word) {
    $html = preg_replace("/\s". $word ."\s/", " ", $html);
}

######remove space
$html = preg_replace ('/<[^>]*>/', '', $html);
$html = preg_replace('/\s\s+/', ', ', $html);
$html = preg_replace('/[\s\W]+/',', ',$html); // Strip off spaces and non-alpha-numeric
#remove white space, Keep : . ( ) : &
//$html = preg_replace('/\s+/', ', ', $html);

###process#########################################################################
$array_loop = explode(",", $html);
$array_loop1 = $array_loop;
$arr_tem = array();
foreach($array_loop as $key=>$val) {
    if(in_array($val, $array_loop1)) {
        if(!$arr_tem[$val]) $arr_tem[$val] = 0;
        $arr_tem[$val] += 1;
        if ( ($k = array_search($val, $array_loop1) ) !== false )
            unset($array_loop1[$k]);
    }
}
arsort($arr_tem);

###echo top 20 words############################################################
echo "<h3>Top 20 words used most</h3>";
$i = 1;
foreach($arr_tem as $key=>$val) {
    if($i<=20) {
        echo $i.": ".$key." (".$val." words)<br />";
        $i++;
    }else break;
}
echo "<hr />";

###print array#####################################################################
echo (implode(", ", array_keys($arr_tem)));
?>

For example, I am using this code:
Code:
$myFile = "newuser.txt";
$fh = fopen($myFile, 'r');
$theData = fread($fh, 5);
fclose($fh);
echo $theData;
and it displays:
Code:
Bob 2
which I am reading from my newuser.txt file. It corresponds to the username Bob, who has the ID 2. Now I want to make that linkable like this:
Code:
<a href=.?act=Profile&id=$IDFROMTEXTFILE(2)>$NAMEFROMTEXTFILE(BOB)</a>
Is this possible? If so, thanks!

Hello. I have one programming problem. I have this log, from which I have to read a specific area of text:
webtopay.log
OK 123.456.7.89 [2012-03-15 09:09:59 -0400] v1.5: MIKRO to:"1398", from:"865458961", id:"13525948", sms:"MCLADM thing"
So I need the script to extract the word "thing" from that log. The script also has to check whether there are new entries in the log, and extract the text from the last one. (In other words, the script should extract the word after MCLADM. Every time it's a different word.)
P.S.
I need that script to be integrated here (it has to send the command "/manuadd (text from log)" to the server):
Code:
<?php
try{
    $HOST = "178.16.35.196"; //the ip of the bukkit server
    $password = "MCLietuva";
    //Can't touch this:
    $sock = socket_create(AF_INET, SOCK_STREAM, 0) or die("error: could not create socket\n");
    $succ = socket_connect($sock, $HOST, 4445) or die("error: could not connect to host\n");
    //Authentication
    socket_write($sock, $command = md5($password)."<Password>", strlen($command) + 1) or die("error: failed to write to socket\n");
    //Begin custom code here.
    socket_write($sock, $command = "/Command/ExecuteConsoleCommandAndReturn-SimpleBroadCast:broadcast lol;", strlen($command) + 1) //Writing text/command we want to send to the server
        or die("error: failed to write to socket\n");
    sleep(2);
    // This is example code and here has to be that script i want to make.
    //while(($returnedString = socket_read($sock,50000))!= ""){
    $returnedString = socket_read($sock,50000,PHP_NORMAL_READ);
    print($returnedString);
    //}
    print("End of script");
    socket_close($sock);
}catch(Exception $e){
    echo $e->getMessage();
}
?>
I hope I made things clear and that you will help me.
Thanks

Is it possible to write a PHP script that would extract data from an external web page? I've been working with PHP for a few years as a hobby, but I've never seen this done or had a reason to do it until now. Thanks in advance for any responses.

This topic has been moved to Third Party PHP Scripts. http://www.phpfreaks.com/forums/index.php?topic=321546.0

$text = "wow {one|two|three}fsasfa happy ness";
preg_match('/\b{*+}\b/i', $text, $matches);
print_r($matches);
Basically, I want $matches to contain "one|two|three", but all I get is an array with "}".

So I have been working for a while on my website, which is entirely PHP and MySQL based, and I am now working on the social networking part, building in functions similar to those Facebook has. I have run into a difficulty with getting information back from a link.
I've checked several sources on how this is possible; the most popular answer was titled 'Facebook Like URL data Extract Using jQuery PHP and Ajax'. I have the scripts, but all of them work with .html links only. My site's pages all have .php extensions, and when I copy and paste my site's links into these demos they do not return anything. I checked the code and all of them use file_get_contents() and parse through the HTML, so if I pass 'filename.php' it returns nothing. I suppose the PHP has not been processed yet, so the function gets the content of the PHP script, with no data of course. So my question is: how is it possible to extract data from a link with a .php extension (on Facebook it works), or how do I get the PHP file executed so that file_get_contents() gets back the HTML?
Here is the link with the code and demo I am using: http://www.sanwebe.c...-php-and-jquery
Thanks in advance.
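A minimal sketch of one likely explanation, under stated assumptions: when file_get_contents() is given a local filesystem path it returns the file's raw source, but when it is given a full http:// or https:// URL it makes an HTTP request, so the web server runs the script and returns the rendered HTML (this requires allow_url_fopen to be enabled). The helper names fetchRendered() and pageTitle() and the example.com URL are hypothetical and do not come from the linked demo.

```php
<?php
// Hypothetical helper: fetch a page over HTTP so that the remote server
// executes the PHP and returns HTML. Requires allow_url_fopen=On.
function fetchRendered(string $url): ?string
{
    // Refuse anything that is not an absolute http(s) URL; a bare
    // 'filename.php' would be read from disk as unexecuted source.
    if (!preg_match('#^https?://#i', $url)) {
        return null;
    }
    $html = @file_get_contents($url);
    return $html === false ? null : $html;
}

// Hypothetical helper: pull the <title> text out of an HTML string.
function pageTitle(string $html): ?string
{
    $dom = new DOMDocument();
    // Suppress warnings caused by imperfect real-world markup.
    @$dom->loadHTML($html);
    $titles = $dom->getElementsByTagName('title');
    return $titles->length > 0 ? trim($titles->item(0)->textContent) : null;
}

// Usage (example.com is a placeholder, not the poster's site):
// $html = fetchRendered('http://example.com/profile.php?id=1');
// if ($html !== null) { echo pageTitle($html); }
```

If the demo script is being handed a relative file name, passing the page's full URL instead may be all that is needed; that is a guess, since the poster's calling code is not shown.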
Hi people, I really hope you guys can help me out today. I'm just a newbie at PHP and I'm having real trouble. Basically, all I want to do is have a user type a company name into an HTML form. If what the user types in the form matches the company name in my PHP script, I want the user to be sent to another page on my site. If it doesn't match, I want the user to be sent to a different page, like an error page for example.
This is my HTML form:
Code:
<form id="form1" name="form1" method="post" action="form_test.php">
  <p>company name: <input type="text" name="company_name" id="company_name" /> </p>
  <p> <input type="submit" name="button" id="button" value="Submit" /> </p>
</form>
And this is the PHP code I'm trying to process the information with:
Code:
<?php
$comp_name = "abc";
if(isset ($_POST["company_name"])){
    if($_POST["company_name"] == $comp_name){
        header("Location: http://www.hotmail.com");
        exit();
    }
    else{
        header("Location: http://www.yahoo.com");
        exit();
    }
}
?>
The thing is, I'm getting this error when I test it:
Warning: Cannot modify header information - headers already sent by (output started at D:\Sites\killerphp.com\form_test.php:10) in D:\Sites\killerphp.com\form_test.php on line 17
Please can someone help me out? I'm sure this is just basic stuff but I just can't get it to work.
Cheers.

Folks, from the code below, I want to extract the value of [template_path] into a variable. The value that I want to extract is "/home/ae1df/public_html/master/proae2/gdfcart/templates/default-black/". I tried to use $this->template_path but it does not seem to work. Can anyone help please?
Here is the Code >>> Savant3_Error: Array ( [code] => ERR_TEMPLATE [info] => Array ( [template] => sidebar-left.tpl ) [level] => 256 [trace] => Array ( [0] => Array ( [file] => /home/ae1df/public_html/master/proae2/gdfcart/includes/template.php [line] => 1298 [function] => __construct [class] => Savant3_Error [object] => Savant3_Error Object ( [code] => ERR_TEMPLATE [info] => Array ( [template] => sidebar-left.tpl ) [level] => 256 [trace] => Array *RECURSION* ) [type] => -> [args] => Array ( [0] => Array ( [code] => ERR_TEMPLATE [info] => Array ( [template] => sidebar-left.tpl ) [level] => 256 [trace] => 1 ) ) ) [1] => Array ( [file] => /home/ae1df/public_html/master/proae2/gdfcart/includes/template.php [line] => 1121 [function] => error [class] => Savant3 [object] => Savant3 Object ( [__config:protected] => Array ( [b][template_path] => Array ( [0] => /home/ae1df/public_html/master/proae2/gdfcart/templates/default-black/ [/b][1] => ./ ) [resource_path] => Array ( [0] => /home/ae1df/public_html/master/proae2/gdfcart/includes/tmpl/resources/ ) [error_text] => template error, examine fetch() result [exceptions] => [autoload] => [compiler] => [filters] => Array ( ) [plugins] => Array ( ) [template] => [plugin_conf] => Array ( ) [extract] => [fetch] => /home/ae1df/public_html/master/proae2/gdfcart/templates/default-black/error.tpl [escape] => Array ( [0] => htmlspecialchars ) ) [banner] => stdClass Object ( [header] => stdClass Object ( [banner] => [count] => 0 ) [left_box] => stdClass Object ( [banner] => [count] => 0 ) [right_box] => stdClass Object ( [banner] => [count] => 0 ) [hometop] => stdClass Object ( [banner] => [count] => 0 ) [homebottom] => stdClass Object ( [banner] => [count] => 0 ) ) [template] => default-black [site] => stdClass Object ( [name] => Paintball Mall [slogan] => This is Master Installation [url] => http://gdfcartophily.co.uk/ [disclaimer] => CERTAIN CONTENT THAT APPEARS ON THIS SITE COMES FROM AMAZON EU SARL. 
THIS CONTENT IS PROVIDED "AS IS" AND IS SUBJECT TO CHANGE OR REMOVAL AT ANY() Many Thanks [/code]

So I made a column in my table called "friends". Is there a way, using MySQL, to update that column for each user? Say friend 1 adds friend 2, who has the id 25, so the query puts the id "25" into the friends column; then if friend 1 adds friend 5, who has the id 26, it puts "26" into the friends column, and so on. So the column would hold comma-separated values such as 25,26,30,31,31, and all of those represent the ids of the people that user has added to his friends list. If so, how do I accomplish that? And what if I then want to use MySQL to list all those ids, like this:
Code:
SELECT name,avatar,etc from userstable WHERE id = "25,26,3,31,31"
Is this even possible, or am I thinking too hard?

I need to extract all h1 tags and insert them into a database. Can you modify this code to make it work?
//////////////////////////////////////////////////////////////////
function getTextBetweenTags($tag, $get, $strict=0)
{
    /*** a new dom object ***/
    $dom = new domDocument;
    /*** load the html into the object ***/
    if($strict==1) {
        $dom->loadXML($get);
    } else {
        $dom->loadHTML($get);
    }
    /*** discard white space ***/
    $dom->preserveWhiteSpace = false;
    /*** the tag by its tag name ***/
    $content = $dom->getElementsByTagname($tag);
    /*** the array to return ***/
    $out = array();
    foreach ($content as $item) {
        /*** add node value to the out array ***/
        $out[] = $item->nodeValue;
    }
    /*** return the results ***/
    return $out;
}
$content = getTextBetweenTags('h1', $get);
foreach( $content as $item ) {
    $h1 = $item.'<br />';
}
$query="UPDATE sitis SET hh = '$h1' WHERE id = '$a'"; // My problem is that this only stores one h1
Regards

Hello, I seem to be having a problem.
I am trying to extract the year from a date:
Code:
2012-03-01
echo "2012";
I have tried this and it only displays 1969:
$dateorig = "2012-03-01";
$new_year = date("Y", strtotime($dateorig));
echo $new_year;

I have just noticed that I am allowed to use variables without using the extract function. For example, before:
your name is $_POST['name'];
Now I am allowed to use:
your name is $name // I'm not using extract here
What can cause this? How can I switch it off? Are there any security problems I could face?

Folks,
http://natty.com/p/bh-fitness-class-indoor-magnetic-exercise-bike-2-years-parts-/detail/b004r2wuak/fitness-spinning.html
From this URL string, I want to extract the last part, which is "fitness spinning". This URL is dynamic and can have any value in that last bit, so how do I extract whatever sits between the last forward slash and .html? Note: it cannot be extracted with GET, as that's not how it is designed.
Thanks
Natasha

Hi, I got one warning from the server. It said:
Warning: extract() expects parameter 1 to be array, null given in /home/tz005/public_html/COMP1687/edit.php on line 113
extract($row);
How do I fix the warning? Should I replace it with extract($array);?
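A hedged sketch of one way to avoid this warning. Renaming the variable to $array would not help by itself, because the warning is about the value being null, not about the name. The code in edit.php is not shown, so the cause is assumed here to be a lookup that returned no row. The helper name rowOrDefaults() and the keys 'title' and 'body' are made up for illustration.

```php
<?php
// Hypothetical helper: return the fetched row if it is an array,
// otherwise fall back to defaults so extract() never receives null.
function rowOrDefaults($row, array $defaults): array
{
    return is_array($row) ? $row : $defaults;
}

// Simulate a lookup that found nothing; mysqli_fetch_assoc(), for
// example, returns null when there are no more rows.
$row = null;

// The keys 'title' and 'body' are assumptions made for this sketch.
$fields = rowOrDefaults($row, ['title' => '', 'body' => '']);
extract($fields); // safe: $fields is always an array

echo $title === '' ? "No matching record\n" : $title . "\n";
```

Checking why $row is null (a failed query, a missing id in the request, or no matching record) is usually the more important step; the guard only keeps the page from emitting the warning.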
Okay, I now have a working extract ZIP archive script. What I am now looking to do is have a loop which checks what percentage of the extraction is complete and, at 100% (with no errors), carries out a PHP function. The code so far is below (it includes comments on where the new function would be placed):
$dir = opendir('temp');
while(false !==($file=readdir($dir))){
    if(strpos($file, '.zip',1)){
        extractupdate($file);
    }
}
function extractupdate($file){
    $zip=new ZipArchive;
    // open() returns true on success or an integer error code, so compare strictly
    if($zip->open('temp/'.$file) === true){
        // basename() drops the ".zip" suffix; rtrim() would strip any trailing ".", "z", "i" or "p" characters
        $update=basename($file, ".zip");
        $zip->extractTo($_SERVER['DOCUMENT_ROOT']."/update/temp/$update");
        $zip->close();
        echo "Extraction started.";
        // Place loop here to run until 100% extraction completed and then run function "installupdate($update);"
    } else {
        echo "Failed to start extraction.";
    }
}
function installupdate($update){
    // installupdate() will now shift the files around as necessary.
    // NB to PHPFREAKS, no assistance with code for installupdate() is required, only the loop. Cheers.
}
Many thanks in advance.

How do I extract the last row of a query and store it in a variable? Thanks for helping. This noob appreciates it.

Hello, I am creating a zip file of multiple files using PHP and downloading it. The problem is that the zip file can only be extracted by WinRAR; it cannot be extracted by the default Windows extractor or other software. Here is the code which I have written:
$file_folder = 'referral-resume/';

I can't seem to get this. I need to extract some data from the following XML file: http://api.twitter.com/1/users/show/seobpo.xml
I just want to get seobpo's tweet count. The data will be used in a WordPress blog. Thank you

http://www.abc.com/sports/more/others/Younger-people/show/7321382.xml
http://www.abc.com/news/head/Elder-people/show/7321302.xml
From the strings above, please tell me how to extract the leading portion of each path and store "sports/more/others" and "news/head" in another string.
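For the last question, a minimal sketch using parse_url(). It assumes that the final three path segments always follow the pattern name/show/id.xml, as they do in both example URLs, and that everything before them is the portion wanted. The function name sectionPath() is hypothetical.

```php
<?php
// Hypothetical helper: return the URL path minus its last three segments
// (assumed to be "<name>/show/<id>.xml", as in the two example URLs).
function sectionPath(string $url): string
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return ''; // no path component, or a malformed URL
    }
    // Split "/a/b/c" into ['a', 'b', 'c'], ignoring the outer slashes.
    $segments = explode('/', trim($path, '/'));
    // Drop the trailing three segments and join the rest back together.
    return implode('/', array_slice($segments, 0, -3));
}

echo sectionPath('http://www.abc.com/sports/more/others/Younger-people/show/7321382.xml'), "\n";
echo sectionPath('http://www.abc.com/news/head/Elder-people/show/7321302.xml'), "\n";
```

If the number of trailing segments can vary, a different marker would be needed (for example, locating the "show" segment and cutting two positions before it); that variation is not covered here.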