PHP - Preg_match_all Regex
I am trying to use preg_match_all to find some information on a webpage.
Here is what I am currently using:

    <?php
    $homepage = "http://www.example.com";
    $page_contents1 = file_get_contents($homepage);
    $names1 = preg_match_all('/<span class="video_date">(.*)</span> - <a class="b" href="/(.*)/">(.*)</a><br/>\/', $page_contents1, $matches1);
    echo implode(", ", $matches1[1]);
    ?>

I am trying to match this piece of HTML:

    <span class="video_date">Oct 21</span> - <a class="b" href="/meanwhilezealand/"> Meanwhile in New Zealand...</a><br/>

Thanks for looking!

Similar Tutorials

Hello again, I have some form data, which I then search through for particular code data like so:

    $html2 = $_POST['fname'];
    preg_match_all("/<bla>(.*)<\/bla>/", $html2, $matches40);

So the above searches for all the data between <bla>XXXXXX</bla> from $_POST, which I then print to my page (only so I can see it while developing) using:

    print_r($matches40);

This displays HTML output like so:

    Array ( [0] => Array ( [0] => Hello [1] => My [2] => Name [3] => Is [4] => Tom ) [1] => Array ( [0] => Hello [1] => My [2] => Name [3] => Is [4] => Tom ) )

What I am trying to do is again use the preg_match_all function to look through the array output and find data that I want to remove. E.g. if one of the values in $matches40 is 'Tom', I want to find it and replace it with 'Ben'. I spent a day searching Google but with no success. Any help?

I want to find the text between "{:" and ":}"; there may be 1 or more instances of this. I'm using this PHP:

    $str = "hello {:first_name:} ha, this is {:awesome:} haha";
    $do = preg_match_all("/{:(.*):}/", $str, $matches);

which works if there's just one instance, but when you use more than 1 instance (like the above example) it returns:

    first_name:} ha, this is {:awesome

But I want it to return a value of first_name, AND a separate value of "awesome". Any ideas?
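For the "{:" ... ":}" question just above, the likely cause is that (.*) is greedy, so it runs on to the last ":}" in the string. A minimal sketch of one possible fix, assuming the placeholders never contain ":}" themselves, is to make the quantifier lazy with (.*?). The braces are escaped here only for readability:

```php
<?php
// Same sample string as in the post above.
$str = "hello {:first_name:} ha, this is {:awesome:} haha";

// (.*?) is lazy: it stops at the first ":}" it can, so each
// placeholder becomes its own match instead of one long match.
preg_match_all('/\{:(.*?):\}/', $str, $matches);

// $matches[1] holds the text captured by the first group.
print_r($matches[1]);
```

With this pattern, $matches[1] should contain first_name and awesome as two separate entries.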
Thanks

Hi there, I have this code:

    $str = "<i><font color="800080"> man </font></i><p><font color="9898989"> hi </font></p><p><font color="1111111"> cheers </font></p>";
    $pattern = '/<font .*?>(.*?)<\/font>/';
    if(preg_match_all($pattern, addslashes($str), $posts)){
        $i=0;
        for($i; $i < count($posts[0]); $i++){
            echo "content: " . $posts[0][$i] . "<br/>";
            echo "colour: " . $posts[1][$i] . "<br/>";
            echo "<br />";
        }
    }

and it doesn't work, apparently because of the addslashes, but it's really needed as double quotes need to be escaped; consider that I'm applying this code to a larger HTML file with hundreds of double quotes to be escaped.... The error message I get is:

    Parse error: syntax error, unexpected T_LNUMBER in

Thanks in advance.

This is rather bothering, as I know that if you use the s pattern modifier, it should ignore newlines.

    preg_match_all("%<p><b>(.*?)</b>%s", $html, $data);

returns a blank array. The page data is like so:

    <p> <b>41,910</b><br/> Total Points </p>

I've never had a problem before that I can recall, but for some reason this page is giving me issues. Maybe I'm missing something?

Hello all! So I am working on screen scraping a site for my son's rec league. I seem to be having a problem with the preg_match_all syntax.
Here is my code:

    <?php
    $url = "http://www.mywebsite.com";
    $raw = file_get_contents($url);
    $newlines = array("\t","\n","\r","\x20\x20","\0","\x0B");
    $content = str_replace($newlines, "", html_entity_decode($raw));
    $start = strpos($content,'table border="1" cellpadding="1" cellspacing="0"');
    $end = strpos($content,'</table>',$start) + 8;
    $table = substr($content,$start,$end-$start);
    preg_match_all("|<tr(.*)</tr>|U",$table,$rows);
    foreach ($rows[0] as $row){
        if ((strpos($row,'<th')===false)){
            preg_match_all("|<td(.*)</td>|U",$row,$cells);
            $game_date = strip_tags($cells[0][0]);
            $game_time = strip_tags($cells[0][1]);
            $rink = strip_tags($cells[0][2]);
            $home_team = strip_tags($cells[0][3]);
            $home_score = strip_tags($cells[0][4]);
            $visiting_team = strip_tags($cells[0][5]);
            $visiting_score = strip_tags($cells[0][6]);
            echo "{$game_date} @ {$game_time} : [{$home_team}] - {$home_score} vs. [{$visiting_team}] - {$visiting_score} <br>\n";
        }
    }
    ?>

My issue is that I am trying to get it to only display the data if the team name = x. I tried to replace

    preg_match_all("|<td(.*)</td>|U",$row,$cells);

with

    preg_match_all("|Posse|U",$row,$cells);

(Posse is one of the team names). No luck. Any input/thoughts?! Thank you!!

I have noticed that if I run the preg_match_all function and use the PREG_OFFSET_CAPTURE option to start capturing somewhere in the middle of the string, the second half of the string is searched first, returning the matching sections along with their positions, and then it goes up to the top half and returns matches from there too. Is there a way to parse only between the start point and the end of the string?

This topic has been moved to PHP Regex. http://www.phpfreaks.com/forums/index.php?topic=348635.0

This topic has been moved to PHP Regex.
http://www.phpfreaks.com/forums/index.php?topic=328802.0

I have code:

    $proname1 = preg_match_all('/<div class=("|\')agentContainer("|\')>(\n\s)<div class="strong">(\n\s)(.*?)(\n\s)<\/div>/', $html, $name1);

which is putting everything between these tags into an array, but the info contains new lines and whitespace, thus displaying empty entries in the array. How do I strip the whitespace and newlines prior to getting to the array? The data I'm getting looks like...

    <div class="agentContainer"> <div class="strong"> Blah Blah Company </div>

And Blah Blah Company isn't showing up in the array, but I know the regex is working.

Hi, I have written the following code, which scrapes price info from a website:

    $url = 'http://www.mydomain.com';
    $html = file_get_contents($url);
    $pattern = '/<span class="price">(.*?)<\/span>/';
    preg_match_all($pattern, $html, $matches);
    print_r($matches);

It works well; however, I need to add in the delivery cost to each array element with a different pattern:

    /<span class="delivery">(.*?)<\/span>/';

Any idea how I can do this so each array element has both the price and delivery costs in a two-dimensional array? Thanks for your advice.

Hello, I am trying to pull the innerHTML out of this:

    <a href="(.*?)">(.*?)</a>

Here is what I have:

    <?php
    $html = file_get_contents("http://www.businessinvestingsource.com/blcheck2.html");
    preg_match_all('/<a href="(.*?)">(.*?)<\/a>/', $html, $links, PREG_SET_ORDER);
    foreach ($links as $link) {
        $linkto = $link[1];
        $anchor = $link[0];
        echo "<b>Link:</b> ".$linkto."<br /><b>Anchor:</b> ".$anchor."<br /><br /> ";
    }
    ?>

Now this code works, but the innerHTML is coming out as a link and I want it to come out as plain text. You can view it here: http://businessinvestingsource.com/anchorcheck2.php Can anyone help? Thank you.

For example, I have the following:

    Andrew (Age 19)

How would I get the content between the brackets, Age 19, using preg_match_all or a similar function?
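For the "Andrew (Age 19)" question just above, one possible approach is to escape the parentheses so they are matched literally and capture everything up to the closing one. This is only a sketch under the assumption that the brackets are never nested:

```php
<?php
// Sample text taken from the post above.
$text = "Andrew (Age 19)";

// \( and \) match literal parentheses; [^)]* captures any run of
// characters that are not a closing parenthesis.
preg_match_all('/\(([^)]*)\)/', $text, $matches);

// $matches[1] lists the captured contents, one entry per bracket pair.
print_r($matches[1]);
```

If only the first pair is ever needed, preg_match with the same pattern would do the same job and return a flat array.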
Thanks very much

Hello All, I have been wrestling with a regex for a couple of hours now and I finally had to give in and ask for help. The weird thing is that it works if there are no new lines in the text, but it fails if one or more new lines are present. The code:

    $matches = array();
    $pattern = '~\[CUSTOM_TAG(.*?)\](.*?)\[/CUSTOM_TAG\]~';
    preg_match_all($pattern, $html, $matches);
    if (!empty($matches[0])){
        foreach($matches[0] as $code){
            $parameter = preg_replace($pattern, '$1', $code);
            $content = preg_replace($pattern, '$2', $code);//get the content between the pattern
        }//foreach($matches[0] as $code){
    }else{
        echo 'Match failed';
    }//if (!empty($matches[0])){

So with that code in mind, if the $html variable (the text to be processed) is:

    $html = '<h1>Hello, world!</h1><p style="color:#ff0000;">Some red text</p>';

a match is found. If the $html variable is:

    $html = '<h1>Hello, world!</h1> <p style="color:#ff0000;">Some red text</p>';

no match is found. Hopefully I'm just missing something simple in my regex. Thanks in advance! Twitch

    preg_match_all('/(www.DOMAIN.com\/([^"]+))\"/i', $html, $matches);

How do you make this match any URL on the domain, including URLs with ? = & type of characters?

First time post, be easy on me...
I'm using preg_match_all to return an array with all the matches. I know I'm missing something fundamental, but I either keep looking past it or am more screwy than I know.
Sample string:

    CC-BY-ND-NC

I'm using the following code:

    preg_match_all("/cc|creative commons|copyright|by|sa|nc|nd/i",$exifmeta['copyright'],$cmeta)

I would expect to see:

    Array ( [0] => Array ( [0] => CC [1] => BY [2] => ND [3] => NC ) )

What I get is:

    Array ( [0] => Array ( [0] => CC [1] => BY [2] => ND [3] => NC [4] => sa ) )

Hey guys! I'm trying to get hashtags out of a string. The function works so far, but I can't transfer the contents of the preg_match_all array into the string. A hint would be fine already. Thanks in advance, and here is some code:

    <?php
    //example string
    $strcontent = "ima string... #wat #taggy #taggytag im in your stringz, stealing your charz!";
    //find hashtags
    preg_match_all("/(#\w+)/", $strcontent, $matches);
    echo $matches; //the output is just "array" -> why?
    foreach ($matches as $match) {
        // $tempmatch=$match[1]; #####like this?
        //hiding the hashtags via span
        $strtemp="<br>ima span<br>" . $match . "<br>ima /span<br>" . $strcontent;
        $strcontent = $strtemp;
        echo $strcontent;
    }
    #echo $strcontent;
    # <span style="display:none;"></span>
    ?>

This topic has been moved to PHP Regex. http://www.phpfreaks.com/forums/index.php?topic=334273.0

I have this function I use to simplify things.

    function search_string( $needle, $haystack ) {
        if ( preg_match_all( "/$needle/im", $haystack ) || strpos( $haystack, $needle ) ) {
            return TRUE;
        }
        return FALSE;
    }

I keep getting this error in my PHP logs, and it comes in a sequence:

    [07-Nov-2020 05:34:14 America/Los_Angeles] PHP Warning: preg_match_all(): Unknown modifier 'G' in /home/baser-b/public_html/include/functions.php on line 791
    [07-Nov-2020 05:34:14 America/Los_Angeles] PHP Warning: preg_match_all(): Unknown modifier 'g' in /home/baser-b/public_html/include/functions.php on line 791

Meaning, it will come with one with the small g, then three with the big G, then one with the small g, then five with the big G, and so on.... My question is, how can I stop getting this error?
It won't show me the functions being called to arrive at this answer, as this is likely an error generated by another function calling this one. I was wondering if anyone knew what to change in the search_string function to stop getting this error, why this error is happening, or why the strange repetitive sequence. Is it someone trying to do a hack? The only variable that would be changeable by a visitor would be the $needle variable, so what could they type that has something to do with 'g' to get this? Anyway, thanks.

I am looking for a date within a larger string; let's say the date is December 4, 2010. To find it I use the pattern and function below:

    $Pattern='/[(January|February|March|April|May|June|July|August|September|October|November|December)] \d, \d\d\d\d/i';
    preg_match_all($Pattern, $String, $Matches, PREG_OFFSET_CAPTURE, $NumberPosition);

The function finds the dates within the string, but to my surprise the result I get in $Matches is:

    r 4, 2010

What I would like to get is:

    December 4, 2010

but I don't know how it should be fixed. I thought the pattern I am using would do that, but obviously that is not the case.
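On the date question just above: square brackets create a character class, so [(January|...|December)] matches a single character from that set, which would explain why only the "r" of "December" shows up. A minimal sketch of one way around it, using a non-capturing group instead of the brackets and \d{1,2} so that two-digit days also match. The $string value below is a made-up sample, not data from the post:

```php
<?php
// Hypothetical sample text; the original $String was not shown.
$string = "The meeting was moved to December 4, 2010 and then to May 12, 2011.";

// (?:...) groups the month names without creating a capture group;
// \d{1,2} allows one- or two-digit days, \d{4} a four-digit year.
$pattern = '/(?:January|February|March|April|May|June|July|August|'
         . 'September|October|November|December) \d{1,2}, \d{4}/i';

preg_match_all($pattern, $string, $matches, PREG_OFFSET_CAPTURE);

// With PREG_OFFSET_CAPTURE each match is a [text, byte offset] pair.
foreach ($matches[0] as [$date, $offset]) {
    echo "$date at offset $offset\n";
}
```

The fifth argument of preg_match_all ($NumberPosition in the post) can still be passed to start searching from a given byte offset; it is left out here to keep the sketch short.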