PHP - Extracting Text From A String
I have managed to get this to work, but it seems like a very long and messy solution. I was wondering if anyone had an idea of how this could be done better. I am new to PHP and don't know a lot.
It shows the text between the tags <h1> and </h1> from the content of a different file. Basically, I had to start the substr() at the fourth position so that the "<h1>" itself would be skipped, and because I started at the fourth position I then had to finish four places back so that the "</h1>" would be skipped too.

    <?php
    $id = $_GET['id'];
    $homepage = file_get_contents("./".$id.".php");
    $title = stristr($homepage,"<h1>");
    $titlepos = strpos($homepage,"</h1>");
    $endpos = $titlepos - 4;
    echo "Title " . substr($title,4,$endpos);
    ?>

Similar Tutorials

This topic has been moved to PHP Regex. http://www.phpfreaks.com/forums/index.php?topic=326004.0

Hey guys, I need to extract all the IPs from a string and loop over them for more operations, but for some reason I only get the first one.

    <?php
    $string = '80.37.14.13 80.37.14.14 80.37.14.15';
    preg_match("/\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}/", $string, $matches);
    foreach ($matches as $ip){
        echo "$ip<br>";
    }
    ?>

The string is not really separated by spaces; it can actually be messy and have IPs wrapped in a lot of code. The regex works, because I do get the first one. What did I miss?

I have a string:

    <p>Bollywood romcom: a pair of wouldbe suicides choose the same bridge at the same time.</p><p><a href="http://www.dailyinfo.co.uk/reviews/venue/1320/Vue_Cinema">Vue Cinema</a> (<a href="http://www.myvue.com/cinemas/index.asp?SessionID=0D86903A19334C93A7491563A77862A3&cn=1&ln=1&ci=66" target="_blank">www.myvue.com/cinemas/index.asp?SessionID=0D86903A19334C93A7491563A77862A3&cn=1&ln=1&ci=66</a>), Ozone Leisure Park, Grenoble Road (next Kassam Stadium), Oxford OX4 4XP; Tel. 08712 240 240 (10p per min from BT).<br /> </p>

Using PHP I want to extract

    http://www.myvue.com/cinemas/index.asp?SessionID=0D86903A19334C93A7491563A77862A3&cn=1&ln=1&ci=66

from the <a> tag.
I have already tried using:

    <?php
    $pattern = "/<a href=\"([^\"]*)\">(.*)<\/a>/iU";
    preg_match_all($pattern, $link, $matches);
    var_dump($matches);
    ?>

The output I get is:

    array
      0 =>
        array
          0 => string '<a href="http://www.dailyinfo.co.uk/reviews/venue/1320/Vue_Cinema">Vue Cinema</a>' (length=81)
      1 =>
        array
          0 => string 'http://www.dailyinfo.co.uk/reviews/venue/1320/Vue_Cinema' (length=56)
      2 =>
        array
          0 => string 'Vue Cinema' (length=10)

So I'm working on a system which requires me to insert anime show names into a DB for an index. I get them in this form:

    [Kira-Fansub]_HIGHSCHOOL_OF_THE_DEAD_-_04_(BD_1920x1080_x264_AAC) [F0E73009].mkv

I need to get it in this form:

    Highschool Of The Dead

Sometimes the words are separated by spaces. The main part is to extract the text between the ] and the (, by which I mean [stuff]extract this(other stuff), and sometimes it looks like this: [stuff]extract[stuff]. I have no clue how to do this; preg_match_all doesn't get me any further. Maybe in stages?

I have a string that looks like /index.php?g1=111&g2=222&g3=333. How can I obtain the value of g1 (i.e. 111)? The string does not represent the current state of the server, so I cannot just use $_GET. g1 is also not necessarily the first item. The script below appears to work; however, http://php.net/manua...n.parse-url.php states:

    This function is intended specifically for the purpose of parsing URLs and not URIs. However, to comply with PHP's backwards compatibility requirements it makes an exception for the file:// scheme where triple slashes (file:///...) are allowed. For any other scheme this is invalid.

It appears that my string is a URI and not a URL, but I might be wrong. How should this be accomplished?

    <?php
    $str='/index.php?g1=111&g2=222&g3=333';
    $array=parse_url($str);
    parse_str($array['query'],$get);
    echo("<p>{$get['g1']}</p>");
    ?>

Hi there, dear PHP friends, good evening! I want to extract some data out of a large HTML file.
I have a very large amount of data: approximately 5000 lines that follow this scheme:

    67003 Cato Bontjes Vice Versum House 1 28832 Achim
    62042 Cato Bontjes Vice Versum House 2 28832 Achim
    41798 Cato Bontjes Vice Versum House 3 37139 Adelebsen
    40034 Cato Bontjes Vice Versum House 4 21365 Adendorf
    46218 Cato Bontjes Vice Versum House 5 31855 Aerzen
    42481Cato Bontjes Vice Versum House 6 21702 Ahlerstedt
    49761 Cato Bontjes Vice Versum House 7 26197 Ahlhorn

Question: how can I extract the first five digits of each line? I already have some solutions, but I need a very robust one.

diblertone1

What I'm trying to do is take two arrays and combine them using array_combine. After they are combined, I want to take each key and value pair and use them in a string to build a MySQL query. See below for some pseudo-code =D

    <?php
    $fields = array('first_name', 'last_name'); //the fields that will be set with an UPDATE query.
    $newvals = array('Joe', 'Blow');//The values that will go into $fields
    $arr = array_combine($fields, $newvals);//Each value now has a key of the field it will update.
    /** This is where I want to return my SET string with something like 'SET ' . $arr['first_name'] ' = ' . $arr['first_name_value'] ', etc etc. */

There it is, I suppose. Are there any easier ways of doing what I'm trying to accomplish? That is, am I on the right track or just making things harder for myself? Both arrays will have dynamic values throughout my application, so I need to be able to get both the field and the value for each query. Any links to some constructs, etc.? That's all I really need, and I can post the solution when I've figured it out.

I'm normally fairly proficient with PHP, but I haven't done any coding in quite a while, so I'm a little rusty. I have an entire page of text from which I need to extract a single value.
Here is a small portion of the page in question:

    Total Rank: 128
    Total Points: 4,978
    Next Rank: 20

For instance, I need to extract the values "128", "4978" and "20" and store them in variables. These values change all the time, so I'm not sure what the best way to go about this is. Maybe a regular expression? If that's the case, I've never been too good with them, so any help would be appreciated.

Folks, I tried all my PHP skills to extract domain name strings from an RSS feed and put each domain name into an array element, but all in vain. Here is the RSS: http://bulliesatwork.co.uk/master/dev/domp/expdom/domains.php()

What I want to extract: the feed shows a list of domain names, each of which is an anchor. All I need is to extract these domain names, like "abc.co uk" (observe there is a space between .co and uk, which can be removed with str_replace()).

Here is my first try, using SimpleHTMLDomParser:

    require_once('simple_html_dom.php');
    $html = file_get_html('http://bulliesatwork.co.uk/master/dev/domp/expdom/domains.php');
    $domains = $html->find('div[class="entry"] a', 0);
    foreach($domains as $dom) {
        echo str_replace(' ', '.', $dom->plaintext);
    }
    $html->clear();
    unset($html);

Here is another try with DOM Document:

    $scrapeurl = 'http://bulliesatwork.co.uk/master/dev/domp/expdom/domains.php';
    $keywords = file_get_contents($scrapeurl);
    $keywords = json_decode($keywords);
    foreach( $keywords->responseData->results as $keyword) {
        echo str_replace("...",".",$keyword->title).'<br/>';
    }

In both cases the DOM document is created, but it seems the document has all the information except the domain names I want to extract. Please help me extract the domain names. Cheers

Hi there, in my attached PHP script I extract text between two strings in the input file and write the extracted text to an output file.
Everything seems to work fine, except I can't figure out how to include the row that says "Richland" (after the row that says "Creighton") in the extracted text. If someone could guide me on how to do this, I'd greatly appreciate it. The PHP script is attached. The input file is in htm format and I can't attach it here, so here is a link to the file I'm reading: http://www.afws.net/data/pa/savedata/109/06/2009060920.pa.htm Many thanks!!!

I have a large text file that I need to search and extract text from. I have some code that somewhat works, but it is not good enough for what I need because it only handles one line at a time. I need to be able to echo all text between two strings and continue scanning the entire document. I am attaching the TXT file that is being read by the script. Here is the script:

    <?
    $searchthis = "Problem:";
    $search="Check:";
    $matches = array();
    $handle = @fopen("1numbers.txt", "r")or die("can't open file");
    if ($handle) {
        while (!feof($handle)) {
            $buffer = fgets($handle);
            if(strpos($buffer, $searchthis) !== FALSE)
                echo "<br>". $buffer."<br>";
            if(strpos($buffer, $search) !== FALSE)
                echo "<br>". $buffer."<br>";
        }
        fclose($handle);
    }
    ?>

You can see what this script outputs by visiting this link: http://yourautofix.com/data/data.php

My problem is that it only outputs the single line of text that contains the search match. I need it to output all lines of text between two matches. For example, any text between "Problem:" and "Check:" should be echoed, and any text between "Check:" and "Likely:" should be echoed. There may be 1 line or 20 lines of text between the tags. I need to print all lines between the two search strings and then continue through the file, displaying every such match in a large file. Any thoughts on how I can get this done, or can anyone point me in the right direction?
Thanks for any input on this. Paul

Dear all, is there any library that supports text extraction from docx, doc, Excel, PDF, etc. formats, like Apache POI does in Java? Or should I port the Apache POI classes to PHP? Best regards, ethereal1m

I have HTML files that contain lines of URLs starting with http:// (plain text, not hyperlinks) without a tag. What is the simplest way to extract them?

Hi, I'm learning PHP and trying to write a script to extract registration information from a large text file. Sadly, my meagre knowledge of PHP is letting me down a bit. It's a case of knowing what you want the script to do but not having the knowledge of how to 'say it'. So I was hoping that if I posted my code here, someone could either give me a few pointers on where I am going wrong or suggest a better way. Luckily, the text file data has a recurring format as follows (for brevity I've only included one entry, which contains made-up information):

    From: bella_done@yahoo.co.uk
    Sent: 02 February 2011 22:50
    To: Jonny tum, patsy fells, dingly bongo
    Subject: Subject: Fun Run 2010
    Categories: Fun Run
    Name: Bella Donna
    Address: 14 brondle avenue
    Postcode: cd83 1rg
    Phone: 0287343510
    Email: bella_don@yahoo.co.uk
    DOB: 15/11/1945
    Half or Full: Full fun run
    How did you hear: Took part in 2010

As you can see, the data has a convenient boundary at the 'From' field and at each colon (or so it occurred to me), so I created my script as follows:

    // the string being analysed
    $the_string = "
    From: bella_done@yahoo.co.uk
    Sent: 02 February 2011 22:50
    To: Jonny tum, patsy fells, dingly bongo
    Subject: Subject: Fun Run 2010
    Categories: Fun Run
    Name: Bella Donna
    Address: 14 brondle avenue
    Postcode: cd83 1rg
    Phone: 0287343510
    Email: bella_don@yahoo.co.uk
    DOB: 15/11/1945
    Half or Full: Full fun run
    How did you hear: Took part in 2010";

    // remove all formatting to work with a clean string
    $clean_string = strip_tags($the_string);

    // remove form field entries from the data and replace with commas and a ZZZ boundary
    $remove_fields = array("Categories:" => "","Name:" => ",","Address:" => ",","Postcode:" => ",","Phone:" => ",","Email:" => ",","DOB:" => ",","Half or Full:" => ",","How did you hear:" => ",","From:" => "ZZZ","Sent:" => ",","To:" => ",", );
    $new_string = strtr("$clean_string",$remove_fields);

    // split the data at the boundary ZZZ
    $string_to_array = explode("ZZZ", $new_string);
    $new_string2 = implode("</br>",$string_to_array);
    echo $new_string2;

    $myFile = "address_list.csv";
    $fh = fopen($myFile, 'w') or die("can't open file");
    $stringData = $new_string2;
    fwrite($fh, $stringData);
    fclose($fh);

One major problem is that when I write the new data to a CSV file, the CSV contains spacing that causes it to be reproduced in column form rather than as separate fields at each comma boundary. So can anyone suggest either a) a better way of extracting the data from the text file (it doesn't need to be 100% clean and perfect), or b) how I can stop the spaces in the CSV (I thought I would have fixed this when I stripped the tags from the string at the start)? Any help would be gratefully received by a newbie PHPer. It's my first shot at anything moderately taxing, so if I've made some glaring oversights I would very much welcome your wisdom! Thank you, Drongo

Hi all, a bit like Twitter, I have a message system where, when someone receives a mail, I want their username to be shown like @user in blue. But when a message is sent, the @user is in the message itself, along with the actual message text, so I am trying to get the @ and everything after it so I can then change its color. Can anyone help with this?

I have the following variable:

    $text = "javascript:openimage('http://images.icecat.biz/img/norm/high/5966342-254.jpg',850,850)"

Now I want to put the string "http://images.icecat.biz/img/norm/high/5966342-254.jpg" into a variable called $url ( $url = "http://images.icecat.biz/img/norm/high/5966342-254.jpg" ). How do I do this?
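One possible approach to the openimage question above, offered as a minimal sketch rather than a definitive answer: capture whatever sits between the first pair of single quotes after "openimage(" with preg_match(). The function name extract_openimage_url is hypothetical, and the sketch assumes the URL is always wrapped in single quotes and contains none itself.

```php
<?php
// Hypothetical helper, not from the original post.
// Assumes the URL is the first single-quoted argument of openimage().
function extract_openimage_url(string $text): ?string
{
    // [^']+ matches everything up to the closing single quote.
    if (preg_match("/openimage\\('([^']+)'/", $text, $m) === 1) {
        return $m[1];
    }
    return null; // pattern not found
}

$text = "javascript:openimage('http://images.icecat.biz/img/norm/high/5966342-254.jpg',850,850)";
$url = extract_openimage_url($text);
echo $url, PHP_EOL;
```

Returning null when nothing matches lets the caller tell "no URL present" apart from an empty string.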
I want to validate a password input field so that it only allows numbers and letters, without special characters or spaces, and I want the same for the name input field. Since the old-fashioned ways are deprecated, how would one solve this the new way?

Hey, I need a simple bit of code to do the following: I am trying to pull a username out of a string. The username can be identified because the string will contain a * right before it. For example:

    $string = "blah blah *username blah blah blah"

The username can only contain numbers and letters, no spaces. From that string, all I want is the username, for example:

    $string = username (which will vary of course)

Hope this makes sense; I made it as clear as I possibly could. Thanks ahead!

I am trying to wrap a string with an anchor tag if it matches something in a block of text. Here is an example text block:

    CONOCO 1'10x8 VC DF TP SGN||PRINCIPAL ILLUMINATION||ENG: CO3028TP_0VPR||DWG: CO200428||TO BE: DYED DIESEL (SPEC)||

The string I would want to wrap with a link would be "CO200428". The next problem is that the drawings (what I'm searching the text for) have over 115,000 possibilities, and there are over 1300 text blocks to search. I have the drawing names stored in a simple MySQL table, but doing a foreach takes forever, and I imagine it will take even longer when looking in the text blocks. Is there an easy way to do the anchor wrap? I don't know regex very well.

Hi, I am trying to take a string from a database and replace everything within {} with code, similar to how posting in a forum works. So say I have "...Lorem ipsom {gallery:1} sit imet..."; it will take that string (from a DB) and replace "{gallery:1}" with "<?php gallery('1'); ?>". How can this be done? Or are there keywords I can search on to find the answer? Thank you in advance.
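For the {gallery:1} question, one way this might be approached is preg_replace_callback() with a whitelist of placeholder names, which avoids injecting and evaluating PHP code inside the stored text. This is only a sketch under assumptions: render_gallery and expand_placeholders are hypothetical names, and the poster's gallery() function is assumed to be adaptable so that it returns its markup instead of echoing it.

```php
<?php
// Hypothetical stand-in for the poster's gallery() function; it returns
// markup so the result can be spliced into the surrounding text.
function render_gallery(int $id): string
{
    return '<div class="gallery" data-id="' . $id . '"></div>';
}

// Hypothetical helper: replaces {name:number} placeholders in $text.
function expand_placeholders(string $text): string
{
    // Whitelist mapping placeholder names to the functions that render them.
    $handlers = ['gallery' => 'render_gallery'];

    $result = preg_replace_callback(
        '/\{(\w+):(\d+)\}/',
        function (array $m) use ($handlers): string {
            // Placeholders with an unknown name are left untouched.
            if (!isset($handlers[$m[1]])) {
                return $m[0];
            }
            return $handlers[$m[1]]((int) $m[2]);
        },
        $text
    );

    // preg_replace_callback() returns null on error; fall back to the input.
    return $result ?? $text;
}

echo expand_placeholders('...Lorem ipsom {gallery:1} sit imet...'), PHP_EOL;
```

Further placeholder types could be supported by adding entries to $handlers, without touching the pattern.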