PHP - Simple Html Parser Help
I'm using simple_html_dom.php.
I want to extract the following HTML and number the array keys so that I know the position of each <td>. I also want to extract the value of this cell:
<TD ALIGN=RIGHT NOWRAP class="ftableline1"> 3.7200 </TD>
This is what I am using:
Code: [Select]
foreach($html->find('td[class=ftableline1]') as $e) echo $e->innertext . '<br>';
Code: [Select]
<TR class="ftableline1">
<TD ALIGN=RIGHT NOWRAP class="ftableline1"> 3.7200 </TD>
<TD ALIGN=RIGHT NOWRAP class="ftableline1"> 3.5400 </TD>
<TD ALIGN=RIGHT NOWRAP class="ftableline1"> 3.6651 </TD>
<TD ALIGN=RIGHT NOWRAP class="ftableline1"> 3.5982 </TD>
<TD align="right" NOWRAP class="ftableline1"> <A HREF=_matbea=1><IMG SRC="images/tezuga_graphit.gif" WIDTH=15 HEIGHT=15 ALT="Show Graph" BORDER="0"></a><BR> </TD>
<TD ALIGN=RIGHT NOWRAP>0.01%</TD>
<TD ALIGN=right dir="rtl"> <IMG SRC="images/arrow_up.gif" WIDTH=10 HEIGHT=8 BORDER=0><BR> </TD>
<TD align="right" NOWRAP dir="rtl" class="ftableline1"> 3.6316 </TD>
<TD align="right" NOWRAP dir="rtl" class="ftableline1"> 1 </TD>
<TD ALIGN=RIGHT NOWRAP dir="rtl" class="ftableline1"> <A HREF=_matbea=1> דולר ארה"ב</A><BR> </TD>
<TD align="right" NOWRAP dir="rtl" class="ftableline1"> <A href="_matbea=1"><IMG SRC="../../meida/images/f1.gif" HEIGHT=15 WIDTH=21 border=0></A><BR> </TD>
<TD ALIGN=center NOWRAP dir="rtl"><INPUT TYPE="Checkbox" VALUE="1" NAME="check" id="check" ></TD>
</TR>
Similar Tutorials
Hi everyone, I'm trying to select either a class or an id using PHP Simple HTML DOM Parser, with absolutely no luck. My example is very simple and seems to comply with the examples given in the manual (http://simplehtmldom.sourceforge.net/manual.htm), but it just won't work; it's driving me up the wall. Here is my example: http://schulnetz.nibis.de/db/schulen/schule.php?schulnr=94468&lschb= I think the HTML is invalid: I cannot parse it. I need more examples; I have probably overlooked something. If anybody has a working example of Simple HTML DOM Parser, I would be happy.
The examples on the developer site are not very helpful. Yours, dilbertone
require_once 'phpSimpleHtmlDomClass.php';
$html = '<div> <div class="man">Name: madac</div> <div class="man">Age: 18 <div class="man">Class: 12</div> </div>';
$name=$html->find('div[class="man"]', 0)->innertext;
$age=$html->find('div[class="man"]', 1)->innertext;
$cls=$html->find('div[class="man"]', 2)->innertext;
I want to get the text from each div with class="man", but it didn't work because a closing div tag is missing on the second line of the HTML code. Please help me fix this. Thanks in advance.
Hello dear community, I have a document that I need to parse, outputting only this part of the table; see http://schulnetz.nibis.de/db/schulen/schule.php?schulnr=67003&lschb= How do I parse it, with Perl or with PHP? Note that I have the XPaths (see below). Unfortunately I cannot apply them with Simple HTML DOM Parser, since it works with CSS selectors rather than XPath. I want to get all the data within the table with class="fliess". How do I dump all the results? As for the most elegant approach, I think Perl would be the nicest, so I could try HTML::TableExtract or something similar. What do you suggest? Which way should I choose for this (very) simple task? I look forward to hearing from you!
See the XPaths:
Schule: /html/body/center/table/tbody/tr[2]/td[1]
Stasse: /html/body/center/table/tbody/tr[3]/td[1]
Ort: /html/body/center/table/tbody/tr[4]/td[1]
Tel: /html/body/center/table/tbody/tr[5]/td[1]
Schulgliederungen: /html/body/center/table/tbody/tr[6]/td[1]
Besonderheite: /html/body/center/table/tbody/tr[7]/td[1]
E-Mail: /html/body/center/table/tbody/tr[8]/td[1]
Schulnummer: /html/body/center/table/tbody/tr[9]/td[1]
Hi all, I am using the PHP Simple HTML DOM parser to connect to a financials website, parse out a company's financial information (the income statement in this case) and then insert the scraped data into a MySQL database that I can later use to run automated calculations. Here is the code I have so far:
Code: [Select]
<?php
include_once 'simple_html_dom.php';
//Connect to financial Website and Create DOM from URL
$income_statement = file_get_html('http://www.WEBSITE.com/finance?etc..etc...etc...etc...');
//PULL FINANCIAL DATA
foreach($income_statement->find('td[class]') as $lines=>$data) {
    echo $data->plaintext . "<br/>";
}
// clean up memory (the DOM object here is $income_statement, not $html)
$income_statement->clear();
unset($income_statement);
?>
So far I am able to get output that looks like this:
Code: [Select]
Revenue 336.57 331.52 324.32 319.29 320.40
Other Revenue, Total - - - - -
Total Revenue 336.57 331.52 324.32 319.29 320.40
etc.............................
But being a newbie, I do not understand how I can break each $ value and each - into its own variable and then insert them into their corresponding MySQL table fields. During the database insert I would like to skip the field headings (i.e. Revenue, Total Revenue, etc.). Any help would be absolutely amazing, as I have been reading, scripting and searching for information like crazy, but just can't seem to figure it out.
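One way to get each value into its own variable is to walk the table row by row rather than cell by cell. The sketch below uses PHP's built-in DOM extension instead of Simple HTML DOM. The sample markup, the function name parseStatementRows and the choice to store "-" as null are all assumptions, since the real page is not shown in the post. It needs PHP 7 or later because of the type declarations.

```php
<?php
// Hypothetical sample markup; the real page structure is not shown in the post.
$sample = <<<HTML
<table>
  <tr><td class="lft">Revenue</td><td class="r">336.57</td><td class="r">331.52</td></tr>
  <tr><td class="lft">Other Revenue, Total</td><td class="r">-</td><td class="r">-</td></tr>
</table>
HTML;

// Returns an array keyed by the row heading; each entry is a list of
// floats, with null wherever the cell held a bare "-".
function parseStatementRows(string $html): array
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);   // tolerate sloppy real-world HTML
    $dom->loadHTML($html);
    libxml_clear_errors();

    $rows = [];
    foreach ($dom->getElementsByTagName('tr') as $tr) {
        $cells = [];
        foreach ($tr->getElementsByTagName('td') as $td) {
            $cells[] = trim($td->textContent);
        }
        if (count($cells) < 2) {
            continue;                    // no heading + value pair in this row
        }
        $label = array_shift($cells);    // first cell is the field heading
        $rows[$label] = array_map(
            function ($v) {
                return $v === '-' ? null : (float) str_replace(',', '', $v);
            },
            $cells
        );
    }
    return $rows;
}

print_r(parseStatementRows($sample));
```

Because the heading ends up as the array key, it is easy to leave it out of the insert: each list of values can be bound to a prepared statement on its own, for example one column per reporting period.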
Hello dear community, I am currently working on an approach to parse some sites that contain data on foundations in Switzerland, with details such as goals, contact e-mail and the like. See http://www.foundationfinder.ch/, which has a dataset of 790 foundations. All the data are free to use, with no copyright limitations. I have tried it with PHP Simple HTML DOM Parser, but I have found that it is difficult to get all the data needed to get it up and running. Who wants to jump in and help create this scraper/parser? I would love to hear from you. Please help me get up to speed with this approach. Regards, Dilbertone
Hello dear friends, first of all: merry Xmas! I want to parse with the Simple HTML DOM Parser; I am pretty new to PHP and to this parser. My example: http://schulen.bildung-rp.de/gehezu/startseite/einzelanzeige.html?tx_wfqbe_pi1[uid]=60119 I want to collect the data in the block. I have investigated the source code and found that the attribute of interest should be this one: class="content"
<div class="content"><!-- TYPO3SEARCH_begin -->
Here is the code from my trials:
// include the Simple HTML DOM Parser
include_once('simple_html_dom.php');
// get the file we want to parse right now, create a DOM
$html = file_get_html('');
// simple_html_dom::find() creates a new simple_html_dom object
// that consists of the corresponding child elements
foreach($html->find('class: content ') as $h3) {
    // get the text inside a tag
    if($h3->innertext == 'Text of a H3 Tag') {
        break;
    }
}
// simple_html_dom::next_sibling() gives the next element
$table = $h3->next_sibling();
But believe me, it does not give back what I am aiming for. What have I done wrong? dbone
I'm using PHP 5.2 Server and Simple HTML DOM 1.5.
This script scrape or extract data from a football site, its fully working on PHP 5.9 Server but I need to know how I can fix it for PHP 5.2 server. Can someone give me a hint on how can I fix the error? Thanks in advance. My PHP 5.2 Server script output shows: ++++++++++++++++ Object id #599 Object id #604 Object id #609 Object id #614 Object id #619 Object id #627 Object id #632 Object id #637 Object id #642 Object id #647 Object id #655 Object id #660 Object id #665 Object id #670 Object id #675 Object id #683 Object id #688 Object id #693 Object id #698 Object id #703 Object id #711 Object id #716 Object id #721 Object id #726 Object id #731 ++++++++++++++++ while PHP 5.9 Server says ++++++++++++++++ Rk Player Team POS OPPONENT 1 Aaron Rodgers GB QB at CAR 2 Tom Brady NE QB vs. SD 3 Matt Schaub HOU QB at MIA 4 Michael Vick PHI QB at ATL ++++++++++++++++ I did applied the bug solution listed on https://sourceforge.net/tracker/index.php?func=detail&aid=3107230&group_id=218559&atid=1044037 but it is still not working. It says: ++++++++++++++++ Details: I get compiler errors in PHP 5.2 when using this as an object. The offending lines are 609 and 940, which both contain this construct: if ($this->size>0) $this->char = $this->doc[0]; This tries to get the first character of $this->doc, but PHP 5.2 sees it as trying to access it as an array. It's easily fixed by this: if ($this->size>0) $this->char = substr($this->doc, 0, 1); Or you could probably use chr(ord($this->doc)) as well. Either way solves the compile error without changing functionality. 
++++++++++++++++ Here are my codes: Code: [Select] <?php # don't forget the library include('simple_html_dom.php'); # this is the global array we fill with article information $articles = array(); $source = 'http://www.athlonsports.com/columns/winning-game-plan/fantasy-football-qb-rankings'; # passing in the first page to parse, it will crawl to the end # on its own getArticles($source); function getArticles($page) { global $articles, $descriptions; $html = new simple_html_dom(); $html->load_file($page); //$items = $html->find('div[class=preview]'); $items = $html->find('tbody tr'); foreach($items as $post) { # remember comments count as nodes /*$articles[] = array($post->children(3)->outertext, $post->children(6)->first_child()->outertext);*/ $articles[] = array($post->children(0), $post->children(1), $post->children(2), $post->children(3), $post->children(4)); } # lets see if there's a next page if($next = $html->find('a[class=nextpostslink]', 0)) { $URL = $next->href; echo "going on to $URL <<<\n"; # memory leak clean up $html->clear(); unset($html); getArticles($URL); } } ?> <html> <head> </head> <body> <? echo "Source: " . $source; ?> <table cellpadding="5" cellspacing="0" border="0"> <?php foreach($articles as $item) { echo "<tr>"; echo "<td>" . $item[0] . "</td><td>" . $item[1] . "</td><td>" . $item[2] . "</td>"; echo "<td>" . $item[3] . "</td><td>" . $item[4] . "</td>"; echo "<tr>"; } ?> </table> </body> </html> good day dear community, this is a big issue. 
I have to decide: between native PHP DOM Extension or of simple DOM html parser well i want to parse the site he http://buergerstiftungen.de/cps/rde/xchg/SID-A7DCD0D1-702CE0FA/buergerstiftungen/hs.xsl/db.htm http://buergerstiftungen.de/cps/rde/xchg/SID-A7DCD0D1-702CE0FA/buergerstiftungen/hs.xsl/db.htm I will suggest to use the native PHP "DOM" Extension instead of "simple html parser", since it will be much faster and easier What do you think about this one here...: Code: [Select] $doc = new DOMDocument @$doc->loadHTMLFile('...URL....'); // Using the @ operator to hide parse errors $contents = $doc->getElementById('content')->nodeValue; // Text contents of #content look forward to hear from you best regards db1 Im using some software called php html dom parser i wont to be able to keep the souce tidy i.e before dom parser <?php //////////////////////SEO TOOL/////////////////////////// $title = 'Green Deal Nationwide - PB Energy Solutions Ltd'; $description = 'Delivering all your environmental needs to \'green\' up your business, improve reputation, increase profitability and give a competitive advantage.'; /////////////////////////////////////////////////////// ?> <?php include('includes/settings.php'); ?> <?php include('includes/header.php'); ?> <div class="container"> <div id="large-page-img"> <img src="<?php echo URL(); ?>images/home-page-slide.jpg" width="911" height="230" /> <img src="<?php echo URL(); ?>images/home-page-slide-1.jpg" width="911" height="230" /> <img src="<?php echo URL(); ?>images/home-page-slide-2.jpg" width="911" height="230" /> </div> <div id="content-home"> <div class="iedit"> after dom parser saved to file <?php //////////////////////SEO TOOL/////////////////////////// $title = 'Green Deal Nationwide - PB Energy Solutions Ltd'; $description = 'Delivering all your environmental needs to \'green\' up your business, improve reputation, increase profitability and give a competitive advantage.'; 
/////////////////////////////////////////////////////// ?> <?php include('includes/settings.php'); ?> <?php include('includes/header.php'); ?> <div class="container"> <div id="large-page-img"> <img src="<?php echo URL(); ?>images/home-page-slide.jpg" width="911" height="230" /> <img src="<?php echo URL(); ?>images/home-page-slide-1.jpg" width="911" height="230" /> <img src="<?php echo URL(); ?>images/home-page-slide-2.jpg" width="911" height="230" /> </div> <div id="content-home"> <div class="iedit"><div class="iedit"> is there anyway i can keep it like the original fil after dom? how should i approach the following: a page with a products list+link to product page i want to build a crawler that loops through all the products in the list and goes to the product page and and parses the product page. need help with the loop hello dear Freaks
I am currently thinking about porting a Python bs4 parser to PHP, working with the simplehtmldom parser or the DOM selectors (see below). The project: for a list of meta-data of WordPress plugins, approximately 50 plugins are of interest. The challenge is that I want to fetch the meta-data of all existing plugins. After the fetch, I want to filter out the plugins that have the newest timestamp, i.e. those that were updated most recently. It is all about being up to date. https://wordpress.org/plugins/participants-database ... and so on and so forth.
https://wordpress.org/plugins/wp-job-manager
We have the following set of meta-data for each WordPress plugin:
Version: 1.9.5.12
Active installations: 10,000+
WordPress Version: 5.0 or higher
Tested up to: 5.4
PHP Version: 5.6 or higher
Tags: database, member, sign-up form, volunteer
Last updated: 19 hours ago
The project consists of two parts: the looping part (which seems to be pretty straightforward) and the parser part, where I have some issues (see below). I'm trying to loop through an array of URLs and scrape the data below from a list of WordPress plugins. See my loop below. As a base, I think a good starting point is the following target URL:
wordpress.org/plugins/browse/popular, with 99 pages of content: cf ...
The output of text_nodes:
['Version: 1.9.5.12', 'Active installations: 10,000+', 'Tested up to: 5.6 ']
But what if we want to fetch the data of all WordPress plugins and subsequently sort them to show, say, the 50 most recently updated plugins? This would be an interesting task:
First, we need to fetch the URLs.
Then we fetch the information and sort it by timestamp to find the newest entries, i.e. the plugins that were updated most recently.
Finally, we list the 50 newest items, that is, the 50 most recently updated plugins.
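The sorting step can be sketched in PHP on its own, independently of how the data is fetched. This is only a sketch under assumptions: the record layout (name, last_updated) and the function name newestPlugins are hypothetical, and it assumes the "Last updated" text is a relative phrase such as "19 hours ago" that strtotime() can interpret. It needs PHP 7.1 or later.

```php
<?php
// Sort plugin records by their "Last updated" value, newest first,
// and keep at most $limit of them. $now is injectable so the result
// is reproducible; it defaults to the current time.
function newestPlugins(array $plugins, int $limit = 50, ?int $now = null): array
{
    $now = $now ?? time();
    foreach ($plugins as &$p) {
        // strtotime() understands relative phrases such as "19 hours ago"
        $ts = strtotime($p['last_updated'], $now);
        $p['updated_ts'] = $ts === false ? 0 : $ts;  // unparseable values sort last
    }
    unset($p);

    usort($plugins, function ($a, $b) {
        return $b['updated_ts'] <=> $a['updated_ts']; // descending by timestamp
    });

    return array_slice($plugins, 0, $limit);
}

// Hypothetical records; in practice these would come from the scraper.
$plugins = [
    ['name' => 'participants-database', 'last_updated' => '19 hours ago'],
    ['name' => 'wp-job-manager',        'last_updated' => '2 weeks ago'],
];
foreach (newestPlugins($plugins, 50) as $p) {
    echo $p['name'], ' (', $p['last_updated'], ")\n";
}
```

Converting the phrase to a timestamp once and sorting on the number keeps the comparison cheap, which matters if the list really covers every plugin in the directory.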
We have the following code; see the soup here:
    soup = BeautifulSoup(r.content, 'html.parser')
    target = [item.get_text(strip=True, separator=" ") for item in soup.find(
        "h3", class_="screen-reader-text").find_next("ul").findAll("li")[:8]]
    head = [soup.find("h1", class_="plugin-title").text]
    new = [x for x in target if x.startswith(
        ("V", "Las", "Ac", "W", "T", "P"))]
    return head + new

with ThreadPoolExecutor(max_workers=50) as executor1:
    futures1 = [executor1.submit(parser, url) for url in allin]
    for future in futures1:
        print(future.result())
Background: https://stackoverflow.com/questions/61106309/fetching-multiple-urls-with-beautifulsoup-gathering-meta-data-in-wp-plugins I guess that we can do this with the Simple HTML DOM Parser; here is the selector reference: https://stackoverflow.com/questions/1390568/how-can-i-match-on-an-attribute-that-contains-a-certain-string
I look forward to any hints and help.
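A rough PHP counterpart of the BeautifulSoup lookup could look like the sketch below. It swaps in PHP's built-in DOMDocument and DOMXPath rather than the Simple HTML DOM Parser, so it is not the port the post asks for, only one possible approach. The function name parsePluginMeta and the sample markup are hypothetical; the class names plugin-title and screen-reader-text are taken from the Python code above, and the real page may differ.

```php
<?php
// Extracts the plugin title and the "Key: value" items from the first
// <ul> that follows the first h3.screen-reader-text, mirroring the
// find("h3").find_next("ul") chain of the Python version.
function parsePluginMeta(string $html): array
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);   // real pages are rarely valid HTML 4
    $dom->loadHTML($html);
    libxml_clear_errors();
    $xp = new DOMXPath($dom);

    // XPath 1.0 idiom for "has this class among its classes"
    $hasClass = function (string $c): string {
        return 'contains(concat(" ", normalize-space(@class), " "), " ' . $c . ' ")';
    };

    $title = $xp->query('//h1[' . $hasClass('plugin-title') . ']')->item(0);
    $out = ['title' => $title ? trim($title->textContent) : null];

    $items = $xp->query(
        '(//h3[' . $hasClass('screen-reader-text') . '])[1]/following::ul[1]/li'
    );
    foreach ($items as $li) {
        $text = preg_replace('/\s+/', ' ', trim($li->textContent));
        if (strpos($text, ':') === false) {
            continue;                    // skip items without a "Key: value" shape
        }
        list($key, $value) = array_map('trim', explode(':', $text, 2));
        $out[$key] = $value;
    }
    return $out;
}

// Hypothetical sample markup for illustration only.
$sample = '<h1 class="plugin-title">Participants Database</h1>'
        . '<h3 class="screen-reader-text">Meta</h3>'
        . '<ul><li>Version: <strong>1.9.5.12</strong></li>'
        . '<li>Last updated: <strong>19 hours ago</strong></li></ul>';
print_r(parsePluginMeta($sample));
```

Returning the items keyed by their label, instead of filtering on the first letters as the Python version does, makes it straightforward to pick out "Last updated" for sorting later.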
Have a great day.
Edited May 3, 2020 by dil_bert
I need a little bit of help. I have a problem in which I need to read a file using fopen, then open another file and look for a match, then take the information from both and enter it in a database, looping about 1000-2000 times. E.g.:
Code: [Select]
8;"zip";"File1.zip";"post-36-10839578260.ibf";"318";"7129";"8cc3bac5f531206b9ca0414d08430b1e";"3485"
27;"zip";"File2.zip";"post-40-10840088850.ibf";"222";"7162";"6af0e0d5798485656a2fd78d75a86e6d";"60984"
and find the matching text in another CSV file. Also, there are 1000+ lines. The easiest thing I could think of is to parse the information, put it into a table and search the table.
Hi, I have this XML:
<m time="2012-03-09T11:14:20+00:00" timestamp="1331291660">
  <ma id="1219457" xsid="0">
    <time>2012-03-09T19:30:00+00:00</time>
    <gru id="8388">Nacional</gru>
    <ht id="2325">Teste</ht>
    <at id="8919">Teste2</at>
    <results />
    <mar did="6" name="Under">
      <ofr id="95690814" n="2" ot="0" last_updated="2012-03-09T11:13:35+00:00" flags="1" bmoid="1000095485">
        <ors i="0" time="2012-03-08T18:59:22+00:00" starting_time="2012-03-09T19:30:00+00:00">
          <a1>4</a1>
          <a2>3.5</a2>
          <a3>4</a3>
          <a4>2</a4>
          <a5>8</a5>
        </ors>
      </ofr>
    </mar>
  </ma>
</m>
and this code:
$DOMDocument = new DOMDocument( '1.0' , 'utf-8' );
$DOMDocument->preserveWhiteSpace = false;
$DOMDocument->loadXML( $xml );
foreach ( $DOMDocument->getElementsByTagName( '*' ) as $Nodes ) {
    foreach ( $Nodes->getElementsByTagName( '*' ) as $Node ) {
        $Data[ $Node->parentNode->nodeName ][ $Node->nodeName ] = $Node->nodeValue;
    }
}
With this code I can load the values of a1, a2, etc., but I need to load the name of the mar (Under) and the did. How can I do this? Thanks.
I am trying to parse an XML file using SimpleXML, but I can't figure out how to display the <Category> name and each <Feature>. Here's an example of the XML file.
<Vehicle> <Description>To set an appointment callor email </Description> <Features> <Category Name="Comfort"> <Feature>Front Air Conditioning</Feature> <Feature>Front Air Conditioning Zones: Single</Feature> <Feature>Front Air Conditioning: Climate Control</Feature> </Category> </Features> </Vehicle> Hi All. I was working on this class for the last .. 2 hours or so. I was just wondering, if any of you could see anything immediatly wrong with it. I haven't really had chance to test it. If it does work. Help yourselves to it. <?php /* Made By Richard Clifford for the use and redistribution under the GPL 2 license. If you wish to use this class, please leave my name and this comment in here. */ /* Usage Example: $this->objXML->loadXMLDoc('Document.xml'); #Must start with This $this->objXML->getChildren(); $this->objXML->getChildrenByNode('NodeName'); $this->objXML->asXML(); $this->objXML->addAttribute('attrName','attrValue'); $this->objXML->addChild('ParentName','childName'); $this->objXML->getAttributesByNode('Attr'); $this->objXML->xmlToXPath(); $this->objXML->countChildren('parentNode'); */ class XML{ function __construct(){ //Calls the loadXMLDoc func $this->loadXMLDoc($docName); } /** @Brief Loads the XML doc to read from @Param docName The Name of the XML document @Param element The SimpleXMLElement, Default = null @Since Version 1. Dev @Return blOut Boolean */ public function loadXMLDoc($docName, $element = null){ //Set the Return value $blOut = false; //Load the XML Document $XML = simplexml_load_file($docName); //Call the SimpleXMLElement $SXE = new SimpleXMLElement($element); //Checks to make sure the file is loaded and the SimpleXMLElement has been called if( ($XML == TRUE) && ($SXE == TRUE)){ $blOut = true; }else{ die('Sorry, Something Went Wrong'); } //Return the Variable return $blOut; } /** @Brief Gets an XML Child Attribute @Since Version 1. 
Dev @Returns $strOut String from the Child */ public function getChildren(){ if(!$this->loadXMLDoc() ){ die('Couldn\'t Load XML Document');} $strOut = NULL; //Loads the XML Document $XML = $this->loadXMLDoc(); //Gets all of the Children in the Document $xmlChildren = $XML->children(); //Loops through the children to give each Child and its value. foreach($xmlChildren as $child){ $strOut = $child->getName() . ':' . $child . '<br />'; } return $strOut; } /** @Brief Return a well-formed XML string based on SimpleXML element @Since Version 1. Dev @Returns $strOut The Well Formed XML string */ public function asXML(){ if(!$this->loadXMLDoc() ){ die('Couldn\'t Load XML Document');} $strOut = NULL; //Loads XML DOC $strToXML = $this->loadXMLDoc(); //Calls the Element $SXE = new SimpleXMLElement($strToXML); if(!$SXE){ die('Oops, something went wrong'); }else{ //Set the Return Value as the XML $strOut = $xml->asXML(); } return $strOut; } /** @Brief Adds an attribute to the SimpleXML element @Since Version 1. Dev @Returns $blOut Boolean */ public function addAttribute($element, $attrName, $attrValue){ if(!$this->loadXMLDoc() ){ die('Couldn\'t Load XML Document');} $blOut = false; //loads the XML Doc and sets the SXE (SimpleXMLElement) $addAttr = $this->loadXMLDoc($this->docName, $element); $addNewAttr = $addAttr->addAttribute($attrName, $attrValue, $namespace = null); if(!$addNewAttr){ die('Sorry We could not add the attribute at this time.'); }else{ $blOut = true; } return $blOut; } /** @Brief Adds a child element to the XML node @Since Version 1. 
Dev @Returns $blOut Boolean */ public function addChild($childName, $childValue, $namespace = null){ if(!$this->loadXMLDoc() ){ die('Couldn\'t Load XML Document');} $blOut = false; //Loads the XML Doc $addChild = $this->loadXMLDoc(); //Adds a new child to the document $addNewChild = $addChild->addChild($childName, $childValue, $namespace); if(!$addNewChild){ die('Sorry, I couldn\'t addChild at this Time'); }else{ $blOut = true; } return $blOut; } /** @Brief Finds children of given node @Since Version 1. Dev @Param $nodeName The Name of the Node to retrieve Children @Param $prefix If is_prefix is TRUE, ns will be regarded as a prefix. If FALSE, ns will be regarded as a namespace URL. @Returns $arrOut Multi-Dimensional Array */ public function getChildrenByNode($nodeName, $prefix = true){ if(!$this->loadXMLDoc() ){ die('Couldn\'t Load XML Document');} $arrOut = array(); $xml = $this->loadXMLDoc(); $getChildren = $xml->children($nodeName); foreach($getChildren as $child){ $arrOut['child_name'] = array($child); } return $arrOut; } /** @Brief Gets an array of attributes from a node name @Since Version 1. Dev @Param $nodeName The name of a Node to get a */ public function getAttributesByNode($nodeName){ if(!$this->loadXMLDoc() ){ die('Couldn\'t Load XML Document');} $arrOut = array(); //Loads the XML Document $loadXml = $this->loadXMLDoc(); //Gets the XML code $xml = $loadXML->asXML(); //Loads the XML into a string $xmlString = simplexml_load_string($xml); //For each attribute in the XML Code string set as $attr=>$value foreach($xmlString->$nodeName->attributes() as $attr=>$value){ $arrOut = array($attr=>$value); } return $arrOut; } /** @COMPLETE AT A LATER DATE @NEED TO LOOKUP */ public function xmlToXPath(){ } /** @Brief Counts how many children are in a node @Param $nodeName The name of the node to query @Returns $strOut Counts the amount of children in a node, and adds it to a string. 
*/ public function countChildren($nodeName, $return = 'int'){ if(!$this->loadXMLDoc() ){ die('Couldn\'t Load XML Document');} //Switch the $return as the user may want an integer or a string returned. switch($return){ case 'str': //if $strOut = NULL fails then user $strOut = ''; $strOut = NULL; //Load the XML Document $xml = $this->loadXmlDoc(); //Get the Children $xmlNode = $this->getChildrenByNode($nodeName); //Set a counter $counted = 0; //Forevery child in every node foreach($xmlNode as $child){ //add one to the counter $counted++; //return a string $strOut = sprintf('The %s has %s children',$xmlNode, $child); } return $strOut; break; case 'int': $intOut = 0; $xml = $this->loadXmlDoc(); $xmlNode = $this->getChildrenByNode($nodeName); $counted = 0; foreach($xmlNode as $child){ $counted++; } //Return the counter(int) (int) $intOut = $counted; return $intOut; break; } return $return; } } ?> Best Regards, Mantyy Code: [Select] <?php $dom = new DOMDocument(); $dom->loadHTMLFile('http://en.wikipedia.org/wiki/Liverpool_F.C.'); $domxpath = new DOMXPath($dom); foreach ($domxpath->query('//span[@id="Players"]/../following-sibling::table[1]//span[@class="fn"]') as $a) {echo " <p>$a->textContent</p> "; }; ?> Hello, how can I parse an XML that includes all of the $a->textContent with a tag like <player></player>? I was wonder how I could make some BBcode for my messaging system I made for a website. I made some simple ones like just by replace [ red ] with font color etc. I want to know how to do something like this: [ url = http://google . ca] Google [ / url ] without the spaces. ~AJ I need help with this old script I found. Parse Error on line 101. 
And Line 101 is ?> <?php $file = "music.xml"; $to_print = array("Name", "Artist", "Album", "Track ID", "Year", "Play Count", "Track Number", "Track Count", "Genre", "Rating", "Date Added"); $db_host = "localhost"; $db_name = "music_library"; $db_table = "table"; $db_username = "root"; $db_password = ""; function db_connect() { global $db_host, $db_name, $db_table, $db_username, $db_password; mysql_connect($db_host, $db_username, $db_password) or die("<p style='font-color:red'>Cannot connect to mySQL server</p>"); mysql_select_db($db_name) or die("<p style='font-color:red'>Cannot connect to mySQL database</p>"); } function alter_print_arr(&$input, $key) { $input = str_replace(' ', '_', strtolower($input)); } array_walk($to_print, 'alter_print_arr'); function array_to_table($array) { global $db_table, $to_print; db_connect(); mysql_query("DELETE FROM $db_table") or die("Could not remove old records."); mysql_query("OPTIMIZE TABLE $db_table"); foreach ($array as $elem_key => $element) { if (isset($element[track_id])) { $sql = ""; foreach ($element as $k => $v) { if (in_array($k, $to_print)) { $sql .= "$k='" . mysql_real_escape_string(str_replace('=amp=', '&', $v)) . "', "; } } $sql = rtrim(ltrim($sql, "track_id='$element[track_id]', "), ", "); $sql1 = "INSERT INTO $db_table (track_id) VALUES ('$element[track_id]');"; $sql2 = "UPDATE $db_table SET $sql WHERE track_id=$element[track_id];"; mysql_query($sql1) or die(mysql_error()); // echo"$sql1<br />$sql2<br /><br />"; // For debugging. Uncomment with caution! mysql_query($sql2) or die(mysql_error()); } } echo "Done! :)"; // print_r($array); // For debugging. Uncomment with caution! 
} $xml_parser = ""; //will hold each song in a 2-d array $songs = array(); //counter, number of 'dict' elements encountered $current_key=""; $number_dicts = 0; //key for each element in second dimension of array $current_element=""; //stores xml element name //value for second dimension array elements $current_data = ""; //boolean used to help let us know if we're done with the song list $end_of_songs = false; function start_element($parser, $name, $attribs) { global $current_element, $number_dicts; if ($name == "DICT") { $number_dicts++; } if ($number_dicts > 2) { $current_element = $name; } } function end_element($parser, $name) { global $songs, $current_element, $current_data, $number_dicts, $array_key, $end_of_songs; if ($end_of_songs) { return; } if ($current_element == "KEY") { $array_key = str_replace(' ', '_', strtolower($current_data)); } else { $songs[$number_dicts][$array_key] = $current_data; } } function character_data($parser, $data) { global $number_dicts, $current_data, $end_of_songs; if ($data == "Playlists") { $end_of_songs = true; } $current_data = trim($data); } $xml_parser = xml_parser_create(); xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, 1); xml_set_element_handler($xml_parser, "start_element", "end_element"); xml_set_character_data_handler($xml_parser, "character_data"); if (!($fp = @fopen($file, "r"))) { return false; } while ($data = fread($fp, 4096)) { // xml_parser jumps over ampersands. Decode any entities then replace any ampersands. // Reverse this when building SQL statement. if (!xml_parse($xml_parser, str_replace('&', '=amp=', html_entity_decode($data)), feof($fp))) { die(sprintf("XML error: %s at line %d ", xml_error_string(xml_get_error_code($xml_parser)), xml_get_current_line_number($xml_parser))); } } xml_parser_free($xml_parser); array_to_table($songs); ?> |
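The script above relies on the mysql_* functions, which were removed in PHP 7, and on a hand-written event parser. As an alternative, and not as a fix for the reported parse error, the same key/value reading can be sketched with SimpleXML. The sketch assumes the file uses the layout the script implies, where each <key> inside a <dict> is followed by its value element. The function name dictToArray and the sample document are hypothetical.

```php
<?php
// Turns a <dict> of alternating <key>/<value> children into an array.
// Keys are lower-cased with spaces replaced by underscores, as in the
// script above; nested <dict> values are converted recursively.
function dictToArray(SimpleXMLElement $dict): array
{
    $out = [];
    $key = null;
    foreach ($dict->children() as $node) {
        if ($node->getName() === 'key') {
            $key = str_replace(' ', '_', strtolower((string) $node));
        } elseif ($key !== null) {
            $out[$key] = $node->getName() === 'dict'
                ? dictToArray($node)
                : (string) $node;
            $key = null;                 // each key is consumed by one value
        }
    }
    return $out;
}

// Hypothetical sample in the assumed layout.
$sample = '<plist><dict><key>Tracks</key><dict>'
        . '<key>101</key><dict>'
        . '<key>Track ID</key><integer>101</integer>'
        . '<key>Name</key><string>Song A</string>'
        . '</dict></dict></dict></plist>';

$library = dictToArray(simplexml_load_string($sample)->dict);
foreach ($library['tracks'] as $song) {
    echo $song['track_id'], ': ', $song['name'], "\n";
}
```

All values come back as strings, so numeric fields such as the play count would still need casting before they are written to a database.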