PHP - Regex To Extract The Integer From These Source Code
Hi, I need to extract the integers from the source code:
<span class=CatLevel1><a onclick="Javascript:ShowMeu('21');">Arts</a> (9768)</span> <span class=CatLevel1><a onclick="Javascript:ShowMeu('271');">Industrial Products</a> (9321)</span> <span class=CatLevel1><a onclick="Javascript:ShowMeu('1273');">Baby</a> (11407)</span>What are the pattern that I can use to retrieve all the integer in the bracket (9768, 9321, 11407)? This is my php code: <!DOCTYPE html> <html> <body> <?php $file_string = file_get_contents('http://www.lelong.com.my/'); preg_match('/<title>(.*)<\/title>/i', $file_string, $title); //pattern $title_out = $title[1]; echo $title_out; ?> </body> </html>Thanks Edited by Raex, 22 August 2014 - 12:49 AM. Similar TutorialsThis topic has been moved to PHP Regex. http://www.phpfreaks.com/forums/index.php?topic=350672.0 Geek Friends, I have list of URLs in an array called '$domains' 1- I want to run through each and every element of $domains in an API call. API: Quote http://lightapi.majesticseo.com/api_command.php?app_api_key=API_KEY&cmd=GetIndexItemInfo&items=1&item0=http://www.majesticseo.com (Note: In this API "http://www.majesticseo.com" will be replaced with each element of $domain array in a loop) 2- The resulting output is XML: <?xml version="1.0" encoding="utf-8"?> <Result Code="OK" ErrorMessage="" FullError=""> <GlobalVars IndexBuildDate="17/04/2010 16:43:46" MostRecentBackLinkDate="2010-03-25"/> <DataTables Count="1"> <DataTable Name="Results" RowsCount="1" Headers="ItemNum|Item|ResultCode|Status|ExtBackLinks|RefDomains|AnalysisResUnitsCost|ACRank|ItemType|IndexedURLs|GetTopBackLinksAnalysisResUnitsCost|RefIPs|RefSubNets|RefDomainsEDU|ExtBackLinksEDU|RefDomainsGOV|ExtBackLinksGOV|RefDomainsEDU_Exact|ExtBackLinksEDU_Exact|RefDomainsGOV_Exact|ExtBackLinksGOV_Exact"> <Row>0|majesticseo.com|OK|Found|28381|1229|28381|-1|1|4774|28381|1074|954|1|5|0|0|0|0|0|0</Row> </DataTable> </DataTables> </Result> I want to extract the below Elements out of this XML Output for each element of $domains in loop: Quote ACRank ExtBackLinks RefDomains AnalysisResUnitsCost RefIPs RefSubNets RefDomainsEDU ExtBackLinksEDU RefDomainsGOV ExtBackLinksGOV RefDomainsEDU_Exact ExtBackLinksEDU_Exact RefDomainsGOV_Exact ExtBackLinksGOV_Exact LastCrawlDate Title How can i achieve this with PHP? P.S.: I have a scrapper code, that will scrape data for the array $domains, it best if we can merge this code with that code, so all run in one code only. Please Letme know and i will PM you that scrapper Code, as i do not want to share that Publically. Cheers & Enjoy the Snow , Natasha T
How to edit my source code so the output prints correctly ?
looking for a way to view the PHP source code of any PHP page on the web. Looking for a way to obtain PHP source code in order to help me with a certain issue. Advise. So I am completely done with my forum after several posts here and a lot of time. But crap! I just realized that the way my avatar system works it will give away the password! I REALLY don't want to redo that system because truth is it is about 40 percent of the entire sites coding. It works by making pictures in a directory named the usersname.thepassword . Whatever filesystem. Now when I echo the path everyone can see the password and username in the source code! And thy can click it to see the picture! Is there a way to hide the paths or the source codeM thanks! Does any one have a better idea to protect PHP files so that you can distribute a 'release' without the customer being able to read the source files. There are tools on the internet which costs money, BUT your are dependant on their software and if its not open source, its not trustworthy. What I've done so far is writing an ISAPI DLL in borland cpp and installed it under iis6. Basically you call this isappi dll and it decrypts the encrypted php files and executes them respectively. It is thread safe (as is php). There are other methods available on the net that you use to encrypt your pages, BUT the decryption algorithm is found in your main php file and duh, if you can read the main file, you can easily decrypt all other files, so that is a bad idea. Any other ideas? Possibly to write a PHP extension perhaps but I have not been able to get that working on borland cpp. So I am completely done with my forum after several posts here and a lot of time. But crap! I just realized that the way my avatar system works it will give away the password! I REALLY don't want to redo that system because truth is it is about 40 percent of the entire sites coding. It works by making pictures in a directory named the usersname.thepassword . Whatever filesystem. Now when I echo the path everyone can see the password and username in the source code! And thy can click it to see the picture! Is there a way to hide the paths or the source codeM thanks! This topic has been moved to Third Party PHP Scripts. http://www.phpfreaks.com/forums/index.php?topic=321546.0 Currently testing a site thats almost built, am going to be including php on a sidebar on all pages so thought it'd be easier to just make all pages .php, however when i upload and try and view, in Firefox it just displays the source code on screen! It does it on other computers, but works fine in IE! ARRGGGHH Anyone have a clue? Kind Regards, Chris Hello all, i have a feeling im doing something wrong, but i have no idea what. being server-side code, php should not show when you 'view page source' in your browser (or rather it should display as html), correct ? why, then am i seeing php code when i view page source? see attached image $text = "wow {one|two|three}fsasfa happy ness"; preg_match('/\b{*+}\b/i', $text, $matches); print_r($matches); Basically, $matches will contain "one|two|three" - but all I got is an array with "}" well basically im trying to do that the 'subject says. ive done my homework and had around 10 examples of using curl, but none of them worked in my case. this is the final code i'm using <?php $cookiefile = '/temp/cookies.txt'; #2 ways ive tried doing #$data = array('edit[username]' => 'REMOVED', 'edit[password]' => 'REMOVED', 'edit[submit]' => 'Login'); $data = array('username] => 'REMOVED', 'password' => 'REMOVED', 'submit' => 'Login'); $ch = curl_init(); curl_setopt($ch, CURLOPT_URL, 'http://pokerrpg.com'); curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); curl_setopt($ch, CURLOPT_COOKIEFILE, $cookiefile); curl_setopt($ch, CURLOPT_COOKIEJAR, $cookiefile); curl_setopt($ch, CURLOPT_POST, true); curl_setopt($ch, CURLOPT_POSTFIELDS, $data); curl_exec($ch); curl_setopt($ch, CURLOPT_URL, 'http://pokerrpg.com/furniture_store.php'); $contents = curl_exec($ch); $headers = curl_getinfo($ch); echo $contents; curl_close($ch); unlink($cookiefile); ?> im not sure about the cookie file, but i just made a txt file to that location. and empty txt file. hope it's fine. the page i'm trying is http://pokerrpg.com, you can even look the source code that both of these fields do exist. when i run it, the output is a login page without logging in, so it does not log in. Ok this has been driving me absolutely crazy. I have a script that gets called by Prototype's Ajax.Updater() class which should then update a div. It does update the div but a portion of what gets outputted doesn't render. Code: [Select] if (is_array($PROCESSED["associated_proxy_ids"]) && count($PROCESSED["associated_proxy_ids"])) { foreach ($PROCESSED["associated_proxy_ids"] as $student) { if ((array_key_exists($student, $STUDENT_LIST)) && is_array($STUDENT_LIST[$student])) { ?> <li class="user" id="audience_student_<?php echo $STUDENT_LIST[$student]["proxy_id"]; ?>" style="cursor: move;"><?php echo $STUDENT_LIST[$student]["fullname"]; ?><img src="<?php echo ENTRADA_URL; ?>/images/action-delete.gif" onclick="removeAudience('student_<?php echo $STUDENT_LIST[$student]["proxy_id"]; ?>', 'students');" class="list-cancel-image" /></li> <?php } } } I've echoed from within the foreach and the following if and the $student variable does have a value, however I'll point out that it also doesn't render but that may be because its not a <li>, I'm not sure. Now, if I take that li element lin end output it outside of the foreach loop it renders just fine(once I substitute the $student variable for a known index of course). Inside the loop it doesn't. I can see it in the source code of the page but for some reason its not rendering. I'll put the output from the source of the page below, the top is the code output from within the loop and the bottom is from outside of the loop. The bottom renders as expected, the top doesn't. Code: [Select] <!--foreach li element:doesn't render--> <li class="user" id="audience_student_2" style="cursor: move;">Hudson, Tom<img src="http://localhost/entrada/www-root/images/action-delete.gif" onclick="removeAudience('student_2', 'students');" class="list-cancel-image" /></li> <!--non foreach li element:renders--> <li class="user" id="audience_student_2" style="cursor: move;">Hudson, Tom<img src="http://localhost/entrada/www-root/images/action-delete.gif" onclick="removeAudience('student_2', 'students');" class="list-cancel-image" /></li> They're identical. I can't figure out for the life of me why this isn't working. Any help would be great. I've been developing a php application that runs my entire company for the last 4 years. One of the things I never thought of until now is that the server guys or anyone else could copy the source code and db and be able to start up another company which brings up my question to you.... How would you protect your application? My thought is to create one small php file that is encrypted with something that is required to make the entire site run (not sure at this point what it would be that they couldn't just rebuild). Then if this file sees it's on a different domain/ip it requests data from my site which logs the info for me to look at. If I find out it's something not approved, it would then not allow the program to run and will give a error. What is your idea? Hey guys, First post here, so feel free to flame me if im violating the rules somehow. So, the issue is this: i built an ebay listing creator for a customer. it conssists of a form with several fields being posted to a page that assembles everything into a listing (text, images, radio buttons etc.). now, what i want to do is to easily allow the customer to copy the compiled source code into the clipboard (or a txt file, doesnt really matters) - in order to easily copy it into ebay. I tried it with CURL, but all i get is the source without the posted information. I must be missing something there. Any help would be appreciated, if you need links or codes iv's used, ill provide. Thanks in advance! I found the code below which checks to make sure something the user entered into a text field is an actual URL. I am looking to confirm that something a user enters is an actual link to an IMAGE, so does anyone know how I could adjust this code below to do that? Basically, this code below would accept http://monkeys.com AND http://monkeys.com/images/monkey.jpg. I want to alter the code so that http://monkeys.com would NOT be accepted since it's not an image. Essentially, I want to make sure whatever the user enters ends with .jpg, .gif or .png. Anyone know how to do this? Code: [Select] url": { "regex": /^(https?|ftp):\/\/(((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:)*@)?(((\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5]))|((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.?)(:\d*)?)(\/((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)+(\/(([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)*)*)?)?(\?((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)|[\uE000-\uF8FF]|\/|\?)*)?(\#((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)|\/|\?)*)?$/, "alertText": "* Invalid URL" } Hi, I'm trying to retrieve the integer value between the <span> tag from a HTML source code.
HTML source code:
<span> (3861822) </span>This is the php code: <!DOCTYPE html> <html> <body> <?php //use curl to get html content function getHTML($url,$timeout) { $ch = curl_init($url); // initialize curl with given url curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER["HTTP_USER_AGENT"]); // set useragent curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // write the response to a variable curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects if any curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout); // max. seconds to execute curl_setopt($ch, CURLOPT_FAILONERROR, 1); // stop when it encounters an error return @curl_exec($ch); } $html=getHTML("http://www.alibaba.com/Products",10); preg_match("/<span>(.*)<\/span>/i", $html, $match); $title = $match[1]; echo $title; ?> </body> </html>Whenever I try to run it, this error will come out: Notice: Undefined offset: 1 in C:\xampp\htdocs\myPHP\index.php on line 19. How to correct it so that it will display all the integer value within the tag name but without the bracket? Thanks Edited by Raex, 20 August 2014 - 01:38 AM. Hi guys, I am making a bot which only scrapes the source code of the site AFTER logging into the site.The script to login is : Code: [Select] <?php $username="xxx"; $password="iwonttellyou"; $url="http://internet.com/login.php"; $cookie="cookie.txt"; $postdata = "name=".$username."&password=".$password; $ch = curl_init(); curl_setopt ($ch, CURLOPT_URL, $url); curl_setopt ($ch, CURLOPT_SSL_VERIFYPEER, FALSE); curl_setopt ($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6"); curl_setopt ($ch, CURLOPT_TIMEOUT, 60); curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, 0); curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1); curl_setopt ($ch, CURLOPT_COOKIEJAR, $cookie); curl_setopt ($ch, CURLOPT_REFERER, $url); curl_setopt ($ch, CURLOPT_POSTFIELDS, $postdata); curl_setopt ($ch, CURLOPT_POST, 1); $result = curl_exec ($ch); echo $result; ?> I can see different SESSION ID's in cookie.txt everytime i compile this code, which makes me believe its working.However what next? How should i go to that site again, already logged in and scrape the data ? Some suggestions would be nice. Hey guys,
I'm facing an issue compiling the above stack from a source code inside lxc using centos 6.5 as a domain OS.
[lxc@lxc1 httpd-2.4.9]$ ./configure --with-included-apr checking for chosen layout... Apache checking for working mkdir -p... yes checking for grep that handles long lines and -e... /bin/grep checking for egrep... /bin/grep -E checking build system type... x86_64-unknown-linux-gnu checking host system type... x86_64-unknown-linux-gnu checking target system type... x86_64-unknown-linux-gnu configu configu Configuring Apache Portable Runtime library... configu configuring package in srclib/apr now checking build system type... x86_64-unknown-linux-gnu checking host system type... x86_64-unknown-linux-gnu checking target system type... x86_64-unknown-linux-gnu Configuring APR library Platform: x86_64-unknown-linux-gnu checking for working mkdir -p... yes APR Version: 1.5.1 checking for chosen layout... apr checking for gcc... gcc checking whether the C compiler works... yes checking for C compiler default output file name... a.out checking for suffix of executables... checking whether we are cross compiling... configu error: in `/home/lxc/httpd-2.4.9/srclib/apr': configu error: cannot run C compiled programs. If you meant to cross compile, use `--host'. See `config.log' for more details configure failed for srclib/apr This problem has been detected by me when I replaced my desktop machine with new one and installed a centOS again. This such a problem never happened before using my old machine with the same version of OS and libvirt. Just to be clear, a new selinux policy into a "domain machine" has been created to be able to use the "dbus daemon" to all containers and if I try to complile this stack from source using the "domain os" this problem never happens at all. All "Development tools" is installed to this particular container, in case someone asks me why I get the following error message - "configu error: cannot run C compiled programs" Any ideas? Edited by jazzman1, 08 June 2014 - 01:37 PM. So I have been working on my website for a while which all is php&mysql based, now working on the social networking part building in similar functions like Facebook has. I encountered a difficulty with getting information back from a link. I've checked several sources how it is possible, with title 'Facebook Like URL data Extract Using jQuery PHP and Ajax' was the most popular answer, I get the scripts but all of these scripts work with html links only. My site all with php extensions and copy&paste my site links into these demos do not return anything . I checked the code and all of them using file_get_contents(), parsing through the html file so if i pass 'filename.php' it returns nothing supposing that php has not processed yet and the function gets the content of the php script with no data of course. So my question is that how it is possible to extract data from a link with php extension (on Facebook it works) or how to get php file executed for file_get_contents() to get back the html?
here is the link with code&demo iamusing: http://www.sanwebe.c...-php-and-jquery
thanks in advance.
|