PHP - Slow Performance Function. With Large Files Memory Blows Up! How Can I Refactor?
Hi
I have a function that strips out lines from files. I'm handling with large files(more than 100Mb). I have the PHP Memory with 256MB but the function that handles with the strip out of lines blows up with a 100MB CSV File. What the function must do is this: Originally I have the CSV like: Code: [Select] Copyright (c) 2007 MaxMind LLC. All Rights Reserved. locId,country,region,city,postalCode,latitude,longitude,metroCode,areaCode 1,"O1","","","",0.0000,0.0000,, 2,"AP","","","",35.0000,105.0000,, 3,"EU","","","",47.0000,8.0000,, 4,"AD","","","",42.5000,1.5000,, 5,"AE","","","",24.0000,54.0000,, 6,"AF","","","",33.0000,65.0000,, 7,"AG","","","",17.0500,-61.8000,, 8,"AI","","","",18.2500,-63.1667,, 9,"AL","","","",41.0000,20.0000,, When I pass the CSV file to this function I got: Code: [Select] locId,country,region,city,postalCode,latitude,longitude,metroCode,areaCode 1,"O1","","","",0.0000,0.0000,, 2,"AP","","","",35.0000,105.0000,, 3,"EU","","","",47.0000,8.0000,, 4,"AD","","","",42.5000,1.5000,, 5,"AE","","","",24.0000,54.0000,, 6,"AF","","","",33.0000,65.0000,, 7,"AG","","","",17.0500,-61.8000,, 8,"AI","","","",18.2500,-63.1667,, 9,"AL","","","",41.0000,20.0000,, It only strips out the first line, nothing more. The problem is the performance of this function with large files, it blows up the memory. The function is: public function deleteLine($line_no, $csvFileName) { // this function strips a specific line from a file // if a line is stripped, functions returns True else false // // e.g. // deleteLine(-1, xyz.csv); // strip last line // deleteLine(1, xyz.csv); // strip first line // Assigna o nome do ficheiro $filename = $csvFileName; $strip_return=FALSE; $data=file($filename); $pipe=fopen($filename,'w'); $size=count($data); if($line_no==-1) $skip=$size-1; else $skip=$line_no-1; for($line=0;$line<$size;$line++) if($line!=$skip) fputs($pipe,$data[$line]); else $strip_return=TRUE; return $strip_return; } It is possible to refactor this function to not blow up with the 256MB PHP Memory? Give me some clues. Best Regards, Similar TutorialsI have the following simple code to test against collision on a primary key I am creating: Code: [Select] $machine_ids = array(); for($i = 0; $i < 100000; $i++) { //Generate machine id returns a 15 character alphanumeric string $mid = Functions::generate_machine_id(); if(in_array($mid, $machine_ids)) { die("Collision!"); } else { $machine_ids[] = $mid; } } die("Success!"); Any idea why this is taking minutes to run? Anyway to speed it up? Hi All, I have written the following to validate my dynamic includes, one question is i will be using sessions to control user access to certain pages. Obviously the session_start() has to go into my index.php file. Can anyone see any problems with this or my dynamic include validation code. My page varilable is populated using the mod_rewirte function in appache. <?PHP include('inc/settings.inc.php'); if(isset($_GET['page'])) { //remove slashes $page = stripslashes($_GET['page']); //rebuild the extension and file name $filename = 'lib/'.$page.'.php'; //Check to see if the file exists in lib if (file_exists($filename)) { //Dynamic Switch $allowed = array( array("test", "New Customers"), array("home", "Home Page"), ); $iffed = false; $get_section = $_GET['page']; //Create a dynamic switch to check for files being in my allowed array foreach($allowed as $rd) { if($get_section == $rd[0]) { $iffed = $rd; $content = $filename; foreach($rd as $value) { $page_title = $value; } } } if($iffed === false) { //File is not in my include list. die( "Page does not pass the validated inclusion list." ); } } else { //Page does not exist in my lib folder. die('Page does not exist, please contact the administrator.'); } } else { // If no page is requested then default home. $filename = 'lib/home.php'; $content = '1'; $page_title = 'Home'; } ?> Thanks in advance. Sam I need to process a large CSV file (40 MB - 300,000 rows). While I have been working with smaller files my existing code is not able to work with large files. All I need to do is - read a particular column from file and then count total number of rows and add all the values from column. My exisitng piece of code imports whole CSV file into an array (a class is used) and then using 'ForEach' loop, reads the required column and values into another array. Once the data is in this array i can simply sum or count it. While this served me well for smaller files, i am not able to use this approach to read a larger php file. I have already increased the memory allocated to php and max_execution_time but the script just keeps on running I am no php expert but usualy get around with trial and error.......your thoughts and help will be greatly appreciated Exisiting code: Once data has been processed by initial class (Class used is freely available and known as 'parsecsv', available at http://code.google.com/p/parsecsv-for-php/) After calling the class and processing the csv file: ?php ini_set('max_execution_time', 3000); $init = array(0); //Initialize a dummy array, which hold value '0' foreach ($csv->data as $key => $col): //Get value from array, 'Data' is an array processed by class and holds csv data $ColValue = $col[SALARY']; //retrieves the column you want { $SAL= $col['SALARY']; //Column that you want to process from csv array_push ($init, $SAL); // Push value into dummy array created above echo "<pre>"; } endforeach; $total_rows = (Count($init) -1); //Count total number of value, '-1' to remove the first initilaized value in array echo "Total # of rows: ". $total_rows . "\n"; echo "Total Sum: ". array_sum($init) . "\n"; ?> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- I have a flash application that talks to upload.php Say I upload a 500mb file; it will obviously take a little while to upload. Will the max_execution_time settings cause this to fail? Its set at 60 right now and the upload is obviously taking longer than 1 minute. Hey Guys, I need a solution for uploading very large files. As I found PHP has some memory limits. Is it even possible to upload files with a size of 4GB? I need to parse large XML files ranging in size from ~ 500 to ~ 1700 Mb. I use XMLReader Code: [Select] set_time_limit(0); $start_time = microtime(true); include_once 'inc/Misc.php'; include_once 'inc/Database.php'; $files = array('xml/large_file.xml'); foreach($files as $file) { echo "\n"; echo 'Filename: '.basename($file)."\n"; echo 'Filesize: '.convert(filesize($file))."\n"; echo 'Start parsing...'."\n"; echo "\n"; $reader = new XMLReader(); $reader->open($file); while ($reader->read()) { switch ($reader->nodeType) { case (XMLREADER::ELEMENT): if ($reader->localName == "element-name") { $dom = new DomDocument(); $n = $dom->importNode($reader->expand(),true); $dom->appendChild($n); $sxe = simplexml_import_dom($n); $tess->file_big->insert($sxe); echo "Insert done! "; benchmark(); } } } } Everything is fine in the beginning ... Parsed file and slowly inserted my desired data, but is gradually growing memory consumption and has run out of resources. That is, I took the file to 400 Mb and as long as it is spent parsing of 2000 Mb of RAM and all the resources ran out and the script is stopped. How to deal with large files? ~ 500 to ~ 1700 Mb. Will there XML Parser? Yes, and how to apply it to my problem? Another option could have? <!doctype html> <html> <head> <meta content="text/html; charset=utf-8" http-equiv="content-type"> <title>Untitled Document</title> </head> <body> <form action="upload_image.php" method="post" enctype="multipart/form-data"> <label>select page<input type="file" name="image"> <input type="submit" name="upload"> </label> </form> <?php if(isset($_POST['upload'])) { $image_name=$_FILES['image']['name']; //return the name ot image $image_type=$_FILES['image']['type']; //return the value of image $image_size=$_FILES['image']['size']; //return the size of image $image_tmp_name=$_FILES['image']['tmp_name'];//return the value if($image_name=='') { echo '<script type="text/javascript">alert("please select image")</script>'; exit(); } else { $ex=move_uploaded_file($image_tmp_name,"image/".$image_name); if($ex) { echo 'image upload done "<br>"'; echo $image_name.'<br>'; echo $image_size.'<br>'; echo $image_type.'<br>'; echo $image_tmp_name.'<br>'; } else { echo 'error'; } } } ?> </body> </html>I make this simple script for upload files like photo , It's work correctly ,but not upload the large files ex 8MB images 12MB images Edited by Ch0cu3r, 04 October 2014 - 09:50 AM. Modified topic title - changed download to upload I have an xml file thats about 20MB, I want to parse it all into an array. SimpleXML seems to have trouble processing it, I just get "Fatal error: Balloc() failed to allocate memory". I've googled this topic and can't seem to find a straight forward solution. There must be someone here who has had to do this before. Is there an easy way to parse this? Hi I am running debian lenny, apache2 php5. My fail upload fails for large file sizes. A 6.2 MB file uploads file, but a 10.2Mb file fails. I have set the Max file size to 50MB and max_execution to 600 etc in php.ini, but still have the same problems. I have noticed many others having similar problems. Is there a solution? Hello, I'm using php's mail function to send sms messages to my sprint phone. Sometimes I receive the text right away and other times it comes in as much as an hour or so later. My server is on GoDaddy. Does anyone have any ideas about what could be causing the bottleneck? To debug this I tried sending sms messages from my comcast email. So far those are always received very quickly. Perhaps the issue is on the GoDaddy server... I might try contacting them. By the way, it seems like the first text I send is received quickly, but then if I send other text messages within some short time frame, those don't always show up immediately. Has anyone else experienced anything similar? Can someone please tell me the right way to do this? Code: [Select] $q = "SELECT * FROM scores WHERE student_id = {$student_id} ORDER BY test_type ASC"; $r = mysql_query($q) or die(); $r2 = mysql_fetch_array($r); $data[$r2['student_id']] = array( 'last_name' => $r2['last_name'], 'first_name' => $r2['first_name'], 'grad_year' => $r2['grad_year'] ); mysql_data_seek($r, 0); while($row=mysql_fetch_assoc($r)) { $data[testName($row['test_type'])] = array( 'eng' => $row['eng'], 'math' => $row['math'], 'read' => $row['read'], 'sci' => $row['sci'], 'comp' => round(($row['eng'] + $row['math'] + $row['read'] + $row['sci'])/4 ) ); } My Database Structure - All the different test-scores are in the same table, labeled with the test type &test-id.. Code: [Select] id-student_id-last_name-first_name-eng-math-read-sci-grad_year-test_id-test_type 1-99999-smith-john-12-12-12-12-2012-1-3 I hope this makes sense! Thanks-In-Advance would Java handle it faster? can PHP be made to handle it as fast? I really need help. check out my website and search cars for sale, then any distance and then 37919 zip. it takes too long and there are only 3351 cars on there now. before, when it only had about 2200 cars it was really fast. only about a 1000 cars made it slow. what could it be? nabacar.com It was returning results in about 15 seconds or less with 2200 cars. Now it is over a minute with only 1000 cars on it. I will have 150,000 on it when I get this sorted. http://nabacar.com some java was written to speed things up, but it didn't seem to help. Is it the server or the code causing this? I have a function to read terms from an array stored in language file as Code: [Select] function _language($term) { include('fr.php'); if(!empty($lang[$term])) { $lterm=$lang[$term]; } else { $lterm = $term;} return $lterm; } The problem is that I get the language code from url as ?lang=en or ?lang=fr; but I cannot bring this string into the function. How I can tell the function to read appropriate language file (fr.php or en.php or es.php) upon choosing language in ?lang=XX Ok so here's what I'm trying to do and what I'm having to do since I can not figure out how to script this.
I create my GUI for scripting in HTML. When I script I create a full HTML page for my input forms, one full HTML page for the error code and lastly one HTML page for the receipt when the script runs correct. I then create my php scripting (basic template where I can cut and paste my HTML for the GUI output once it has been encapsulated in FULL be the php print function after having the quotation marks escaped and replaced by a backslash follow by a quotation mark).
Currently and over the past Decade. I have done this encapsulation the Hard way. Meaning I build my HTML file in FrontPage. I save the HTML file and then I open it in Microsoft Notepad. First thing I do is use the replace All function from the edit menu to Find every instance of " and replace them all with \" then I manually goto the first line of code and place a Tad then type print " I then hold control and shift to copy the ( tab print ") and begin to paste manually using Ctrl V and then I hit the down arrow on my keyboard quickly followed by the Home button to drop my cursor down a line and input the beginning of my PHP print function encapsulation one frigging line at a time. Once I've reach the very bottom, I go up the back right hand side of the HTML document with my ending quotation followed by semi colon manually with Ctrl V and hitting up arrow followed by end button.
This process of encapsulation the hard way with note pad takes up 90% of my dev time when programming. It makes stuff easy to read in my code and is perfect for outputting. What I would like to do is find a way to create a php script with a file upload form field that would take the input of any and all html files and parse the full document automating the exact process that I run on all of my HTML GUI files so that I can turn the encapsulation work into nothing and focus on the actual scripting.
You wouldn't believe how many hours of Ctrl V and up n down end and home I have pressed. Spent 8 hours last night alone. When form fields have multiple drop down selectors for country code ect. the HTML files to encapsulate manually become Ridiculously long.
I have never seen or heard of anyone doing exactly what I've explained here as I know the print function used like this is not normal. Every HTML file or snippet in any of my scripting is always encapsulated in the print function to output. And I have no intentions on changing my style. I would like to automate my process. And if it is too complex for me to build it in a decent amount of time I am willing to pay good money to get this finished for me ASAP.
Please message me back ASAP as I'm working on a personal project that has over 2,000 HTML files for me to encapsulate a the minimum.
~PJ
Im going to use a large array of arrays, each of one having a lot of values and some sub arrays. My question is... is faster to use arrays or is better to have a object to acces using methods and all? i suppose objects are slower... Also i was planing in use arrays with string keys in nearly all places, normally these are slower, but in php hashes and arrays are the same tipe so i dont know... I want to do a test while refreshing the page 100 times with 500 rows in a table, and then without 500 rows in a table and with different kind of php code, i need to do some type of testing to get results back in to show which way is faster for mysql/php. Any idea how to do this any scripts out there or a built in php/mysql function? Thanks I am very new to PHP and have tried various techniques but I am getting a 500 error when clicking on the export button to download a csv report. I'm not sure why the previous developer did it this way. Is there a better why in PHP to make this code better? Willing to understand and learn from an PHP expert. The database is MYSQL. $coursefilterid = $_GET['course']; $conn = new mysqli($host, $username, $password, $database); if ($conn->connect_error) { die("Connection failed: " . $conn->connect_error); } $sqluserenrolled = "select mdl_user.username, mdl_user_enrolments.userid as enrolleduserid, mdl_enrol.courseid from mdl_user_enrolments Inner Join mdl_enrol on mdl_enrol.id = mdl_user_enrolments.enrolid Inner Join mdl_user on mdl_user.id = mdl_user_enrolments.userid where mdl_enrol.courseid = '" . $coursefilterid . "' order by mdl_user.username "; $queryenrolleduser = mysqli_query($conn, $sqluserenrolled); ?> <html> <head> </head> <body> <form method="post" action="<?php echo "userlistssiexport.php?id=$coursefilterid"?>"> <input type="hidden" name="exportcourseid" value="<?php echo $coursefilterid;?>"> <input type="hidden" name="sessid" value="<?php echo $USER->sesskey;?>"> <input class="btn btn-primary" type="submit" name="submit" value="<?php echo "Export";?>"> </form> <?php $noteid = ""; $cmId = ""; ?> <table class="data-table"> <caption class="title">User info</caption> <thead> <tr> <th>Username</th> <th>Firstname</th> <th>Lastname</th> <th>Email</th> <th>Last login</th> <th>Createddate</th> <th>Position</th> <th>Organization</th> <th>Certificate Request Date</th> <th>Role1</th> <th>Role2</th> <th>Role3</th> </tr> </thead> <tbody> <?php while ($row = mysqli_fetch_array($queryenrolleduser)) { $enrolleduserid = $row['enrolleduserid']; $sql = "select mdl_user.username as username, mdl_user.firstname as firstname, mdl_user.lastname as lastname, mdl_user.email as email, mdl_user.lastlogin as lastaccess, mdl_user.timecreated as createddate, mdl_user_info_data.data as position from mdl_user Inner Join mdl_user_info_data on mdl_user_info_data.userid = mdl_user.id Inner Join mdl_user_info_field on mdl_user_info_field.id = mdl_user_info_data.fieldid Inner Join mdl_user_lastaccess on mdl_user_lastaccess.userid = mdl_user.id where mdl_user_info_field.id = 1 and mdl_user.deleted = 0 and mdl_user.id = '" . $enrolleduserid . "' group by mdl_user.username order by mdl_user.username "; $query = mysqli_query($conn, $sql); if (! $query) { die('SQL Error: ' . mysqli_error($conn)); } else {} ?> <?php $no = 1; $total = 0; $username = ''; $coursename = ''; $content = ''; $modulename = ''; $organization = ''; $userid = ''; $certificatedate = ''; $userrole = ''; $enrolleduserid = ''; while ($row = mysqli_fetch_array($query)) { // Do something here $username = $row['username']; $coursename = $row['coursename']; $content = $row['content']; $noteid = $row['noteid']; // $notedatetime = date("d/m/y g:i (A)", $row['notedate']); $notedatetime = date("D M j Y G:i A", $row['notedate']); $lastaccess = date("D M j Y G:i A", $row['lastaccess']); $createddate = date("D M j Y G:i A", $row['createddate']); $datafile = $username . $coursename . $content; echo '<tr> <td>' . $row['username'] . '</td> <td>' . $row['firstname'] . '</td> <td>' . $row['lastname'] . '</td> <td>' . $row['email'] . '</td> <td>' . $lastaccess . '</td> <td>' . $createddate . '</td> <td>' . $row['position'] . '</td> '; $modid = $row['contextid']; // Get module name $sqlmodule = "select mdl_user.username as username, mdl_user.firstname as firstname, mdl_user.lastname as lastname, mdl_user.email as email, FROM_UNIXTIME(mdl_user_lastaccess.timeaccess) as lastaccess, FROM_UNIXTIME(mdl_user.timecreated) as createddate, mdl_user_info_data.data as organization from mdl_user Inner Join mdl_user_info_data on mdl_user_info_data.userid = mdl_user.id Inner Join mdl_user_info_field on mdl_user_info_field.id = mdl_user_info_data.fieldid Inner Join mdl_user_lastaccess on mdl_user_lastaccess.userid = mdl_user.id where mdl_user_info_field.id = 3 and mdl_user.deleted = 0 and mdl_user.username ='" . $username . "'"; $querymodule = mysqli_query($conn, $sqlmodule); ?> <?php $modulenamelink = ""; while ($row = mysqli_fetch_array($querymodule)) { $organization = $row['organization']; } echo '<td>' . $organization . '</td>'; $sqlCertificateDateuid = "select id from mdl_user where username = '" . $username . "'"; $queryCertificateDateuid = mysqli_query($conn, $sqlCertificateDateuid); while ($row = mysqli_fetch_array($queryCertificateDateuid)) { $userid = $row['id']; } $sqlcertificatedate = "select * from mdl_certificateemail where userid = '" . $userid . "' and courseid = '" . $coursefilterid . "'"; $querycertificaterequestdate = mysqli_query($conn, $sqlcertificatedate); while ($row = mysqli_fetch_array($querycertificaterequestdate)) { $certificatedate = date("D M j Y g:i:s A", $row['unixdatetimecertificate']); } echo '<td>' . $certificatedate . '</td>'; $sqluserrole = "select mdl_role_assignments.userid, mdl_role_assignments.roleid,mdl_course_modules.course, mdl_role.shortname as rolename,FROM_UNIXTIME(mdl_role_assignments.timemodified) from mdl_role_assignments Inner Join mdl_context on mdl_context.id = mdl_role_assignments.contextid Inner Join mdl_course_modules on mdl_course_modules.instance = mdl_context.instanceid Inner Join mdl_role on mdl_role.id = mdl_role_assignments.roleid where mdl_course_modules.course = '" . $coursefilterid . "' and mdl_role_assignments.userid = '" . $userid . "' group by mdl_role_assignments.userid, mdl_role_assignments.roleid, mdl_course_modules.course, mdl_role.shortname, mdl_role_assignments.timemodified order by mdl_role_assignments.timemodified "; $userlistrole = ''; $queryuserrole = mysqli_query($conn, $sqluserrole); while ($row = mysqli_fetch_array($queryuserrole)) { $userrole = $row['rolename']; $userlistrole = array( array( $userrole ) ); // echo '<td>'.$userrole.'</td>'; } foreach ($userlistrole as $listrole) { // echo $listrole; } $teacherrole = array( 'student' ); foreach ($teacherrole as $rolename) { $role = $DB->get_record('role', array( 'shortname' => $rolename )); $context = get_context_instance(CONTEXT_COURSE, $coursefilterid); // $context = context_course::instance($cid1); $teachers = get_role_users($role->id, $context); foreach ($teachers as $teacher) { $teacherid = $teacher->id; if ($teacherid == $userid) { echo '<td>student</td>'; } } } $teacherrole = array( 'editingteacher' ); foreach ($teacherrole as $rolename) { $role = $DB->get_record('role', array( 'shortname' => $rolename )); $context = get_context_instance(CONTEXT_COURSE, $coursefilterid); // $context = context_course::instance($cid1); $teachers = get_role_users($role->id, $context); foreach ($teachers as $teacher) { $teacherid = $teacher->id; if ($teacherid == $userid) { echo '<td></td>'; echo '<td>editingteacher</td>'; } } } $teacherrole = array( 'manager' ); foreach ($teacherrole as $rolename) { $role = $DB->get_record('role', array( 'shortname' => $rolename )); $context = get_context_instance(CONTEXT_COURSE, $coursefilterid); // $context = context_course::instance($cid1); $teachers = get_role_users($role->id, $context); foreach ($teachers as $teacher) { $teacherid = $teacher->id; if ($teacherid == $userid) { echo '<td>manager</td>'; } } } echo '</tr>'; } } ?> </tbody> <tfoot> </tfoot> </table> </body> </html> <?php } } else { header("Location:/index.php"); // echo "something"; die(); } } else { header("Location:/index.php"); die(); } ?>
I am wondering since in php when you write string in " " quotes php will look if there is any variable and if it is it will read that variable and replace variable name with that value inside the string. However when i use ' ' quotes php will not look for any variables inside that string. So my question is when you write a really big application is it good to always use ' ' quotes when you can instead of " " ones. Does that have an impact on performance. Thanks A few questions that if I was more knowledgeable about MySQL (such as the query log that I've heard of). - Does mysqli::multi_query() make multiple requests to the server or just one? - If it is just splitting the queries on the semicolon and then making multiple requests, is it still faster than looping and querying in PHP? - Does it take less memory than, lets say, mysql_query() foreach query? - Is it faster than, lets say again, mysql_query() looped? The last 2 I can probably just benchmark myself but I'm hoping someone will know off had. Background (blah blah blah) I'm working on a lean rapid development framework for myself (and maybe others eventually) to use. Much like Cake and other frameworks it does a lot of queries, often times more then you need. The sacrifice of course is performance VS ease of development, but I'm trying to not make a martyr out of my framework :-) I've observed on many occasions that simple queries can take more time than a complex query on seemingly random occasions. My understanding is that this is because of the connection, not the real amount of time it took for the server to query. I figure if I avoid multiple requests to the server I will cut down on the time everything takes. My theory is based on the same concept of combining all your CSS and JS files into single files to avoid dozens of HTTP requests which slow down page load time. Alright guys, I see people recommending prepared statements (PDO/MySQLi) and saying that they are way to go these days. However upon doing a bit more research, I've found that prepared statements, PDO in particular, is lacking in terms of performance, especially using SELECT statements. Now I'm starting a new project, which is basically a text based game with lots of queries and DB interaction, so I'm really interested in knowing what's the best approach for me. I was leaning towards PDO but I don't want it crawling my server under heavy load. I appreciate any advices or first hand experiences on this. |