PHP - Sorry First Post Here, Help Request For Preg_match Page Scrape
Hi everyone,
I am making a screen scraper in PHP which scrapes the usernames from forum posts and stores them in an SQL database. I need some help with part of the preg_match code, if possible, please. The code and pseudocode I have so far are below (the pseudocode is the part I am having trouble with, but I will try to solve it myself if possible).

Edit: sorry, I edited the post as it was confusing to read. Please ask for clarification if there is anything I have failed to explain properly. Thank you.

// I will be placing the following PHP in the confirmation page people see after making a new post, so for this example let's say the referrer header says: http://www.mysite.com/showthread.php?tid=1
Code: [Select]
$threadurl=$_SERVER['HTTP_REFERER'];

// scrape the page
Code: [Select]
$content = file_get_contents($threadurl);

// find the pattern in the source which makes it easy to find the username; the only things that change are the uid and the color
Code: [Select]
if (preg_match("/\b uid=792"><span style="color:#ffcc00">fapafap</span></a>\b /i", $content)) {

// extract username from this string search
Don't know!

// copy the username (in this case 'fapafap') to the database along with the referral ID
Code: [Select]
$query_insert="INSERT INTO newpostersdatabase(username,referrerurl) VALUES('$username','$threadurl')" ;
$result=mysql_query ( $query_insert);
if(!$result){
    die(mysql_error());
}

Thank you so much for any guidance. I know that I have totally messed up the string search as well, but my brain is too small, it seems!

Similar Tutorials

Alright, so I play a browser game called Politics and War. I run an alliance that has 74 members. In that alliance we offer a bank service for all our members, but I, being the leader, am the only one who can access the bank. I have been building a site that works with the game API to gather data for members and create a dashboard. One of the features I am trying to build is allowing them to withdraw from their account instantly.
So, what I need: to be able to submit a POST request to log in to the site (specifically on this page --> https://politicsandwar.com/login) with my username and password, then keep the session active and navigate to a different page (the alliance bank page). On that page I first need to scrape a value from a hidden input (token), and then I need to submit a POST request to this same page while still being logged in.
I am not asking someone to do it for me, but rather someone to help me know how to go about this. I have never submitted post requests with PHP, but I have used PHP cURL in the past. I also have made POST requests with JS, but never PHP.
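A rough sketch of the flow described above, not a tested client for the real site. The form field names (`email`, `password`, `token`), the helper names and the `$bankUrl` argument are all assumptions; check the actual form markup before relying on any of them. The idea is one cURL handle with a cookie jar, so the login session carries over to the bank page, plus a small helper that pulls the hidden token out of the returned HTML.

```php
<?php
// Sketch only: field names and the bank URL are assumptions, not taken from the real site.

/** Return the value of <input name="$name" ...> found in $html, or null. */
function extractHiddenValue(string $html, string $name): ?string
{
    // Find the whole <input ...> tag that carries the wanted name attribute.
    $tagPattern = '/<input\b[^>]*\bname=["\']' . preg_quote($name, '/') . '["\'][^>]*>/i';
    if (!preg_match($tagPattern, $html, $tag)) {
        return null;
    }
    // Read its value attribute, whatever order the attributes come in.
    if (!preg_match('/\bvalue=["\']([^"\']*)["\']/i', $tag[0], $value)) {
        return null;
    }
    return html_entity_decode($value[1], ENT_QUOTES);
}

/** Send one request on a shared handle: GET when $fields is null, otherwise POST. */
function sendRequest($ch, string $url, ?array $fields = null): string
{
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    if ($fields === null) {
        curl_setopt($ch, CURLOPT_HTTPGET, true);
    } else {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($fields));
    }
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException(curl_error($ch));
    }
    return $body;
}

/** Log in, load the bank page, scrape the token, then POST the withdrawal. */
function withdraw(string $bankUrl, array $login, array $withdrawal): string
{
    $ch = curl_init();
    $jar = tempnam(sys_get_temp_dir(), 'cookies');
    curl_setopt($ch, CURLOPT_COOKIEFILE, $jar); // turns on cookie handling for this handle
    curl_setopt($ch, CURLOPT_COOKIEJAR, $jar);  // cookies are written here when the handle is freed

    sendRequest($ch, 'https://politicsandwar.com/login', $login);
    $token = extractHiddenValue(sendRequest($ch, $bankUrl), 'token');
    if ($token === null) {
        throw new RuntimeException('Token not found; the login probably failed.');
    }
    return sendRequest($ch, $bankUrl, $withdrawal + ['token' => $token]);
}
```

Because every request goes through the same handle, the session cookie set at login is sent along automatically. A regular expression is used here only to keep the sketch short; a real HTML parser such as DOMDocument is sturdier if the markup is irregular.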
Thank you so much to anyone who is able to help!

For my site I need to screen-scrape a page on another site. The problem is that to access the page that contains the data I need, I have to log in to my account first. I know there are ways to simulate a form submission with ASP, but my server runs Linux and can't use ASP. I'm wondering if any of you know how I could simulate a POST with something like cURL, and could possibly write an example script? Thanks in advance. (This may be in the wrong section; please move it if it is. Thanks.)

Code: [Select]
<?php
if ($_SERVER['REQUEST_METHOD'] == 'POST'){
    $db = mysql_connect("localhost", "*******" , "*****")or die("Error connecting to database: " . mysql_error());
    $db_used = mysql_select_db("pskkorg_drp1", $db)or die("Could not select database: " . mysql_error());
    $user_name = mysql_real_escape_string($_POST['username'],$db);
    $query = mysql_query("SELECT * FROM student WHERE Username = '$user_name'",$db) or die(mysql_error());
    if(mysql_num_rows($query) == 1){
        echo "Login successful, welcome back " . $user_name . "";
    }else{
        echo "Login unsuccessful, please ensure you are using the correct details";
    }
}else{
    echo "Error";
}
?>
Take a look at this code; is there anything wrong? It always produces the Error output when I test it. When I enter this URL, http://www.pskk.org/LMS/LMSscripts/FirstTimeUser10.php?Username=149090, it prints Error. It is supposed to show Login successful, since the username 149090 exists in the database.

Hi all, I'm trying to scrape the contents of a page that is behind a login screen, namely http://my.mail.ru/apps. Here's my code. It almost works, but doesn't appear to be logging in properly; I just get a login screen on the URL download. Any ideas? Thanks much.
Here's my code:
<?php
$ch=login();
$html=downloadUrl('http://my.mail.ru/apps', $ch);
echo $html;

function downloadUrl($Url, $ch){
    curl_setopt($ch, CURLOPT_URL, $Url);
    curl_setopt($ch, CURLOPT_POST, 0);
    curl_setopt($ch, CURLOPT_REFERER, "http://my.mail.ru/cgi-bin/login?noclear=1&page=http%3a%2f%2fmy.mail.ru%2fapps%2f");
    curl_setopt($ch, CURLOPT_USERAGENT, "MozillaXYZ/1.0");
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    $output = curl_exec($ch);
    return $output;
}

function login(){
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, 'http://my.mail.ru/cgi-bin/login?noclear=1&page=http%3a%2f%2fmy.mail.ru%2fapps%2f'); //login URL
    curl_setopt ($ch, CURLOPT_POST, 1);
    $postData=' page=http%3A%2F%2Fmy.mail.ru%2Fapps%2F &Login=username &Domain=mail.ru &Password=password';
    curl_setopt ($ch, CURLOPT_POSTFIELDS, $postData);
    curl_setopt ($ch, CURLOPT_COOKIEJAR, 'cookie.txt');
    curl_setopt ($ch, CURLOPT_FOLLOWLOCATION,1);
    curl_setopt ($ch, CURLOPT_MAXREDIRS, 10);
    curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
    $store = curl_exec ($ch);
    return $ch;
}
?>

I am testing a URL for an HTTP POST request. It connects to the site, then it is disconnected. In Firefox it gives the error "connection was reset". In Google Chrome it gives the error "Unable to load the webpage because the server sent no data". The script follows. I am new to cURL and also new to HTTP requests and responses. The provider of this site tells me that everything is OK and that it is sending data, but I am not getting any data or any headers from this site.
<?php
$data = '<?xml version="1.0" encoding="UTF-8"?><Request><Source ClientID="test_xml" Password="1234" /><RequestDetails Language="En"><SearchHotelPriceRequest><ServiceDestination DestinationType="city" DestinationCode="477" /><ImmediateConfirmationOnly>1</ImmediateConfirmationOnly><PeriodOfStay><CheckInDate>2012-03-01</CheckInDate><Duration>3</Duration></PeriodOfStay><Rooms> <Room NumberOfRooms="1" NumberOfAdults="2" /> </Rooms></SearchHotelPriceRequest></RequestDetails></Request>';
$ch = curl_init();
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, 'https://62.218.28.13:8443/monWebService/Request/v2');
curl_setopt($ch, CURLOPT_POST, $data);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
$data = curl_exec($ch);
curl_close ($ch);
echo $data;
?>

Hey! Can anyone help with this? Is it possible to make a POST to a remote website, so that when the user runs the script it posts some information to a remote server and then grabs some info it gets back in a GET request, or something along those lines? Hope someone can guide me to more info. Thanks!

Hey people of PHP Freaks, recently I've been having issues changing a Discord webhook [curl] POST request into a website [curl] POST request. My knowledge of PHP is very limited, so I would really appreciate some help. This is the function:
function postToDiscord($message) {
    $data = array("content" => $message, "username" => "Payouts Bot");
    $curl = curl_init("https://discordapp.com/api/webhooks/591662537293168650/eyme87xmCuKne4njucoDSqzojq78NaS9x7t4sgP9m-l5EmrvUeZegkM4wok-L-UdBDxo");
    curl_setopt($curl, CURLOPT_CUSTOMREQUEST, "POST");
    curl_setopt($curl, CURLOPT_POSTFIELDS, json_encode($data));
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    return curl_exec($curl);
}
Currently the Discord bot transfers the message into a Discord channel, and I do not want that. I want it to transfer the message onto a website page.
As I said, I'm not very keen on this, so please help me.
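One way this could look, as a sketch under assumptions rather than a drop-in fix: the same JSON payload is sent to a page on your own site, and that page reads the raw request body. The `$endpoint` argument and the function names `postToSite`, `buildJsonPayload` and `readJsonMessage` are hypothetical; only the `content` and `username` keys come from the function posted above.

```php
<?php
// Sketch only: $endpoint stands for whatever page on your site should receive the message.

/** Build the same JSON body the webhook version sends. */
function buildJsonPayload(string $message, string $username = 'Payouts Bot'): string
{
    return json_encode(['content' => $message, 'username' => $username]);
}

/** Sending side: POST the JSON to your own endpoint instead of the webhook URL. */
function postToSite(string $endpoint, string $message)
{
    $curl = curl_init($endpoint);
    curl_setopt($curl, CURLOPT_POST, true);
    curl_setopt($curl, CURLOPT_POSTFIELDS, buildJsonPayload($message));
    curl_setopt($curl, CURLOPT_HTTPHEADER, ['Content-Type: application/json']);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    return curl_exec($curl);
}

/** Receiving side: decode the raw body and return the message, or null if it is malformed. */
function readJsonMessage(string $rawBody): ?string
{
    $data = json_decode($rawBody, true);
    if (!is_array($data) || !isset($data['content']) || !is_string($data['content'])) {
        return null;
    }
    return $data['content'];
}

// On the receiving page, a JSON body does not show up in $_POST, so read it directly:
// $message = readJsonMessage(file_get_contents('php://input'));
```

The receiving page still has to store the message somewhere (a file or a database row) if it should be visible to later visitors; this sketch only covers getting it across.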
Edited July 3, 2019 by LegitSpiderman

How do I send multiple headers with this code?
Code: [Select]
<?php
$postdata = http_build_query(
    array(
        'var1' => 'some content',
        'var2' => 'doh'
    )
);
$opts = array('http' =>
    array(
        'method' => 'POST',
        'header' => 'Content-type: application/x-www-form-urlencoded',
        'content' => $postdata
    )
);
$context = stream_context_create($opts);
$result = file_get_contents('http://example.com/submit.php', false, $context);
?>
This is the header I want to add:
Code: [Select]
User-Agent: Mozilla/5.0 (iPhone; U; CPU iPhone OS 3_0 like Mac OS X; en-us) AppleWebKit/528.18 (KHTML, like Gecko) Version/4.0 Mobile/7A341 Safari/528.16

I need to make my client's site distinguish among different tabs in the same browser. (See http://www.phpfreaks.com/forums/index.php?topic=357772.0 for background.) I create a tab ID when the user first visits the site in a particular tab, and I'm passing it from page to page as a parameter. This seems to be the only way to do what I need.

The tab ID is always passed through POST so that the user won't see it. If it were visible, users could make trouble (or more likely get in trouble) by adding or removing the ID themselves. This is not simple where a page is loaded by a hyperlink. The anchor tag passes parameters via GET, period.

The solution I found is to link to a script named post.php, which takes the real link target and the tab ID as parameters, sends the browser a form that passes the tab ID to the real link target via POST, and auto-submits the form via JavaScript. This works, but it requires two round trips to the browser to load a page. It's also a pain to code.

Is there a more efficient way to do this... perhaps a way to make the server send the browser a POST response, even though it made a GET request? If it matters, the server is Apache 2.x.

This topic has been moved to Third Party PHP Scripts. http://www.phpfreaks.com/forums/index.php?topic=359184.0

This topic has been moved to PHP Regex.
http://www.phpfreaks.com/forums/index.php?topic=345218.0

Some code from my pages:

Page 1 (redirecting page)
<html>
<title>login_redirect.</title>
<body>
<form name="redirect" action="http://mysite/page2.php" method="post">
<input type="hidden" name="mac" value="$(mac)">
</form>
<script language="JavaScript">
<!--
document.redirect.submit();
//-->
</script>
</body>
</html>

Page 2 (select product)
<?php
session_start();
ini_set('display_errors',1);
error_reporting(E_ALL);
include '../lib/config.php';
include '../lib/opendb.php';

// get user MAC address from the redirect POST on page 1
$_SESSION['macid'] = $_POST['mac'];
// set $macid for other use ( maybe not needed, am learning )
$macid = $_SESSION['macid'];
// echo $macid does show the MAC address, so the variable is not empty here

if (!empty($_POST["submit"])) {
    $product_choice = $_POST['accounttype'];
    $query= "SELECT AccountIndex, AccountCost, AccountName FROM AccountTypes WHERE AccountIndex='$product_choice'";
    $result = mysql_query($query) or die('Query failed. ' . mysql_error());
    while($row = mysql_fetch_array($result)) {
        $_SESSION['AccountIndex'] = $row['AccountIndex'];
        $_SESSION['AccountCost'] = $row['AccountCost'];
        $_SESSION['AccountName'] = $row['AccountName'];
    }
    header('Location: page3.php');
}
// did leave out the other/html/form stuff here

Page 3 (show session variables)
<?php
ini_set('display_errors',1);
error_reporting(E_ALL);
session_start();
print_r($_SESSION);
?>
Now, on page 3 I do see the right session variables, only "macid" is empty. Why?

I currently have a mailing list that my boss uses. When he adds an email, there is no 'success message' to let him know it has worked.
The mailing list just redirects back to the add subscribers page. I wondered if it's possible to add a request parameter, so that when he adds an email I can redirect to this URL:
?page=addemail&message=your email was added
and then echo $message where I want it to appear.
I have added $message = $_REQUEST['message']; to my page but nothing is happening. Can I not use request like this?
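A minimal sketch of how the redirect and the read-back could fit together, assuming the message really is passed in the query string as shown above. The helper names `redirectUrl` and `messageFrom` are made up for illustration. One possible cause of "nothing is happening" is that the text contains spaces and was never URL-encoded, or that the redirect target does not include the parameter at all, so it is worth checking the address bar after the redirect.

```php
<?php
// Hypothetical helpers; only the 'page' and 'message' parameter names come from the post.

/** Build the redirect target with the message safely URL-encoded. */
function redirectUrl(string $message): string
{
    return '?' . http_build_query(['page' => 'addemail', 'message' => $message], '', '&');
}

/** Read the message from the query parameters and escape it for HTML output. */
function messageFrom(array $query): string
{
    $message = $query['message'] ?? '';
    if (!is_string($message)) {
        return '';
    }
    return htmlspecialchars($message, ENT_QUOTES, 'UTF-8');
}

// After adding the subscriber:
//     header('Location: ' . redirectUrl('your email was added'));
//     exit;
// On the page that shows the result:
//     echo messageFrom($_GET);
```

Escaping with `htmlspecialchars` matters here because anyone can type their own `message` value into the URL. Reading from `$_GET` rather than `$_REQUEST` also makes it explicit where the value is expected to come from.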
Hi, suppose the user requests a page www.foo2.php. If it is a link embedded in another page, say www.foo1.php, then JavaScript can send the server browser info before the page is served. But is it possible to let the server know which browser is being used when the user requests the page by just typing the URL http://www.foo2.php and hitting enter?

Hello, I need to execute a remote web page (on a distant server) every X seconds. How is this possible, and is there any way to make it look exactly like a normal user request through a standard browser like Firefox, Chrome, etc.? I've heard of cURL, but it seems that servers may block it, so there must somehow be a difference between a usual user request and a cURL execution. Thank you for any help! :-) Matthew

Doing something wrong, but don't see it. How should one retrieve a POST parameter? My $request->toArray()['html'] works, but I am sure it is not the "right way".
<?php
namespace App\DataPersister;

use ApiPlatform\Core\DataPersister\DataPersisterInterface;
use Symfony\Component\HttpFoundation\RequestStack;

class ArchivePersister implements DataPersisterInterface
{
    public function __construct(RequestStack $requestStack)
    {
        $request = $requestStack->getCurrentRequest();
        syslog(LOG_ERR, '$request->getMethod(): '.$request->getMethod());
        syslog(LOG_ERR, '$request->getContent(): '.$request->getContent());
        syslog(LOG_ERR, '$request->request->get(html): '.$request->request->get('html'));
        syslog(LOG_ERR, '$request->query->get(html): '.$request->query->get('html'));
        syslog(LOG_ERR, '$request->get(html): '.$request->get('html'));
        syslog(LOG_ERR, '$request->toArray(): '.json_encode($request->toArray()));
        syslog(LOG_ERR, '$request->toArray()[html]: '.$request->toArray()['html']);
    }
}
output
$request->getMethod(): POST
$request->getContent(): {"project":"/projects/1","description":"","html":"<p>{{ project_name }}</p>"}
$request->request->get(html):
$request->query->get(html):
$request->get(html):
$request->toArray(): {"project":"\/projects\/1","description":"","html":"<p>{{ project_name }}<\/p>"}
$request->toArray()[html]: <p>{{ project_name }}</p>

Hi all, I'm PHP stupid, but from what I read it's what I need. I am looking to grab just the number this page outputs, http://api.radioreference.com/audio/listeners.php?feedId=2798, and put it on a page for some tracking software. When you view the source of the page it needs to show the number and not the code for it, so JavaScript is out of the question. Can anyone help me?

Hey all, what's the most efficient way to wait until a page on your own website is done being rendered, and then parse it for something specific? The reason I'm having to scrape it rather than just generate it myself is that the part being scraped is being generated in an iframe on my site via another site, and the data inside of it is dynamic. Thanks

Hi everyone, I need help with scraping. I have a script for scraping IMDB, but I need a few more things. I don't know how to scrape Budget, Opening Weekend, and Gross as well. The rest of the information I need I scrape this way:
Code: [Select]
//code removed to discourage people from scraping IMDB
Can somebody help me to scrape Budget, Opening Weekend and Gross also?

Dear all, I wrote this code to extract the widget from this page: http://www.widgetbox.com/widget/accuwidget
The widget information is hidden under the <iframe> tag, inside the src attribute.
I tried using this code and it always shows me this error: Fatal error: Call to undefined method DOMNodeList::getAttribute()
Code: [Select]
<?php
get();
function get(){
    $url = "http://www.widgetbox.com/widget/accuwidget";
    $tidy = new tidy();
    $repaired = $tidy->repairfile($url); //The code is dirty, so it needs to be tidied
    $xml = new DOMDocument();
    $xml->loadHTML($repaired);
    $xpath = new DOMXpath($xml);
    $cloud = $xpath->query("//div[@id='preview-div']/div/iframe");
    $widget = $cloud->getAttribute("src");
    echo $widget;
}
?>
Sorry that I didn't include the source of the page I want to scrape the information from. It's just that the code is so long. Thank you all in advance
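The error message itself points at the likely cause: `DOMXPath::query()` returns a `DOMNodeList`, not a single element, so `getAttribute()` has to be called on one item of the list. Below is a small sketch of that idea run against an inline sample string rather than the real page; the sample markup and the function name `firstIframeSrc` are assumptions, and the tidy step is left out on the assumption that `loadHTML()` copes with the markup once libxml warnings are silenced.

```php
<?php
// Sketch only: the sample HTML stands in for the real page, whose markup may differ.

/** Return the src of the first node matched by $query, or null if nothing matches. */
function firstIframeSrc(string $html, string $query): ?string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // keep messy-HTML warnings quiet
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query($query); // a DOMNodeList, or false for a bad expression
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    $iframe = $nodes->item(0);      // pick the first match out of the list
    return $iframe instanceof DOMElement ? $iframe->getAttribute('src') : null;
}

$sample = '<div id="preview-div"><div><iframe src="http://example.com/widget"></iframe></div></div>';
echo firstIframeSrc($sample, "//div[@id='preview-div']/div/iframe"), "\n";
```

If the page can contain several matching iframes, looping over the list with `foreach ($nodes as $node)` works the same way, since each item is an element with its own `getAttribute()`.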