PHP - Moved: How To Remove .html
This topic has been moved to mod_rewrite.
http://www.phpfreaks.com/forums/index.php?topic=307587.0 Similar TutorialsThis topic has been moved to mod_rewrite. http://www.phpfreaks.com/forums/index.php?topic=325974.0 This topic has been moved to PHP Regex. http://www.phpfreaks.com/forums/index.php?topic=328353.0 This topic has been moved to PHP Regex. http://www.phpfreaks.com/forums/index.php?topic=316055.0 This topic has been moved to PHP Regex. http://www.phpfreaks.com/forums/index.php?topic=326002.0 This topic has been moved to PHP Regex. http://www.phpfreaks.com/forums/index.php?topic=306153.0 This topic has been moved to PHP Regex. http://www.phpfreaks.com/forums/index.php?topic=322184.0 This topic has been moved to MySQL Help. http://www.phpfreaks.com/forums/index.php?topic=317444.0 This topic has been moved to Editor Help (Dreamweaver, Zend, etc). http://www.phpfreaks.com/forums/index.php?topic=349934.0 This topic has been moved to PHP Regex. http://www.phpfreaks.com/forums/index.php?topic=356719.0 This topic has been moved to PHP Regex. http://www.phpfreaks.com/forums/index.php?topic=347065.0 This topic has been moved to mod_rewrite. http://www.phpfreaks.com/forums/index.php?topic=359558.0 This topic has been moved to MySQL Help. http://www.phpfreaks.com/forums/index.php?topic=349563.0 I have a function that removes html, javascript and php from a text box that users of my site can write in. The problem is when a user types say; hello. My name is Danny And I love php freaks into my text box. Its also removing the new line entry's. I would very much stop to stop it doing that so it looks like I have paragraphs. Can someone please help me. Here is how I call the function. $description = strip_word_html($_POST['description'], $allowed_tags = '<b><i><sup><sub><em><strong><u><br><br/><br />'); And the function itself with notes. //remove html java and php function strip_word_html($text, $allowed_tags = '<b><i><sup><sub><em><strong><u><br><br/><br />') { mb_regex_encoding('UTF-8'); //replace MS special characters first $search = array('/‘/u', '/’/u', '/“/u', '/”/u', '/—/u'); $replace = array('\'', '\'', '"', '"', '-'); $text = preg_replace($search, $replace, $text); //make sure _all_ html entities are converted to the plain ascii equivalents - it appears //in some MS headers, some html entities are encoded and some aren't $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8'); //try to strip out any C style comments first, since these, embedded in html comments, seem to //prevent strip_tags from removing html comments (MS Word introduced combination) if(mb_stripos($text, '/*') !== FALSE){ $text = mb_eregi_replace('#/\*.*?\*/#s', '', $text, 'm'); } //introduce a space into any arithmetic expressions that could be caught by strip_tags so that they won't be //'<1' becomes '< 1'(note: somewhat application specific) $text = preg_replace(array('/<([0-9]+)/'), array('< $1'), $text); $text = strip_tags($text, $allowed_tags); //eliminate extraneous whitespace from start and end of line, or anywhere there are two or more spaces, convert it to one $text = preg_replace(array('/^\s\s+/', '/\s\s+$/', '/\s\s+/u'), array('', '', ' '), $text); //strip out inline css and simplify style tags $search = array('#<(strong|b)[^>]*>(.*?)</(strong|b)>#isu', '#<(em|i)[^>]*>(.*?)</(em|i)>#isu', '#<u[^>]*>(.*?)</u>#isu'); $replace = array('<b>$2</b>', '<i>$2</i>', '<u>$1</u>'); $text = preg_replace($search, $replace, $text); //some MS Style Definitions - this last bit gets rid of any leftover comments */ $num_matches = preg_match_all("/\<!--/u", $text, $matches); if($num_matches){ $text = preg_replace('/\<!--(.)*--\>/isu', '', $text); } return $text; } I want to remove empty paragraphs from an HTML document using simple_html_dom.php. I know how to do it using the DOMDocument class, but, because the HTML files I work with are prepared in MS Word, the DOMDocument's loadHTMLFile() function gives this exception "Namespaces are not defined". This is the code I use with the DOMDocument object for HTML files not prepared in MS Word: <?php /* Using the DOMDocument class */ /* Create a new DOMDocument object. */ $html = new DOMDocument("1.0", "UTF-8"); /* Load HTML code from an HTML file into the DOMDocument. */ $html->loadHTMLFile("HTML File With Empty Paragraphs.html"); /* Assign all the <p> elements into the $pars DOMNodeList object. */ $pars = $html->getElementsByTagName("p"); echo "The initial number of paragraphs is " . $pars->length . ".<br />"; /* The trim() function is used to remove leading and trailing spaces as well as * newline characters. */ for ($i = 0; $i < $pars->length; $i++){ if (trim($pars->item($i)->textContent) == ""){ $pars->item($i)->parentNode->removeChild($pars->item($i)); $i--; } } echo "The final number of paragraphs is " . $pars->length . ".<br />"; // Write the HTML code back into an HTML file. $html->saveHTMLFile("HTML File WithOut Empty Paragraphs.html"); ?> This is the code I use with the simple_html_dom.php module for HTML files prepared in MS Word: <?php /* Using simple_html_dom.php */ include("simple_html_dom.php"); $html = file_get_html("HTML File With Empty Paragraphs.html"); $pars = $html->find("p"); for ($i = 0; $i < count($pars); $i++) { if (trim($pars[$i]->plaintext) == "") { unset($pars[$i]); $i--; } } $html->save("HTML File without Empty Paragraphs.html"); ?> It is almost the same, except that that the $pars variable is a DOMNodeList when using DOMDocument and an array when using simple_html_dom.php. But this code does not work. First it runs for two minutes and then reports these errors: "Undefined offset: 1" and "Trying to get property of nonobject" for this line: "if (trim($pars[$i]->plaintext == "")) {". Does anyone know how I can fix this? Thank you. I also asked on stackoverflow. First problem fixed. My second problem is if the result is 0X.XX,0 or 0X.XX,1 I would like to remove the first 0 The X.XX are numbers, but the 0 is not always there. Can anyone help please Thanks This topic has been moved to Application Design. http://www.phpfreaks.com/forums/index.php?topic=328435.0 This topic has been moved to Miscellaneous. http://www.phpfreaks.com/forums/index.php?topic=347856.0 This topic has been moved to CSS Help. http://www.phpfreaks.com/forums/index.php?topic=347457.0 This topic has been moved to JavaScript Help. http://www.phpfreaks.com/forums/index.php?topic=343133.0 |