PHP - Php Sitemap
I'm wanting to make a sitemap for my website but should I includes ALL possible url's? For example I have a search page (where you can search a film) where the url syntax is something like "search.php?type=all&film=a+film". Should I make sure that all possible urls are included in my sitemap e.g all the films in my database? Hope that makes sense.
Thanks for any help. Similar TutorialsHi. I'm trying to create my first XML sitemap that I am going to be submitting to google. The sitemap is going to be PHP but using MOD rewrite to make it look like an XML file. The site map has been created. http://test.powtest.co.uk/sitemap.xml But, I now want to style it. However, when I add the XML style file, I get an error: Parse error: syntax error, unexpected T_STRING in /home/a/d/adele/web/public_html/sitemap.php on line 1 I'm not to sure how to get my PHP file to read in the XML style file. Any help would be much appreciated PHP sitemap (sitemap.xml) Code: [Select] <?xml-stylesheet type="text/xsl" href="sitemap.xsl"?> <?PHP require_once('includes/initialize.php'); $newAbout = Aboutme::find_by_id(1); $aboutDateMod = $newContact->dateMod; $newContact = Contact_details::find_by_id(1); $contactDateMod = $newContact->dateMod; $lastService = Services::find_last_posted(); $servicesDateMod = $lastService->dateMod; $lastBlog = Blog::find_last_posted(); $blogDateMod = $lastBlog->dateMod; // TOP NAVIGATION echo ' <url> <loc>'.SITE_DOMAIN.DS.'</loc> <lastmod>2011-01-17</lastmod> <changefreq>yearly</changefreq> <priority>0.8</priority> </url> <url> <loc>'.SITE_DOMAIN.DS.'services</loc> <lastmod>'.$servicesDateMod.'</lastmod> <changefreq>daily</changefreq> <priority>0.8</priority> </url> <url> <loc>'.SITE_DOMAIN.DS.'testimonials</loc> <lastmod>2011-01-17</lastmod> <changefreq>daily</changefreq> <priority>0.8</priority> </url> <url> <loc>'.SITE_DOMAIN.DS.'about_adele_fifer</loc> <lastmod>'.$aboutDateMod.'</lastmod> <changefreq>yearly</changefreq> <priority>0.8</priority> </url> <url> <loc>'.SITE_DOMAIN.DS.'contact</loc> <lastmod>'.$contactDateMod.'</lastmod> <changefreq>never</changefreq> <priority>0.8</priority> </url> <url> <loc>'.SITE_DOMAIN.DS.'tips</loc> <lastmod>'.$blogDateMod.'</lastmod> <changefreq>daily</changefreq> <priority>0.8</priority> </url>'; // SERVICE CAT $check = Services::count_CID(); if($check >= 1){ $serviceCat = Services_categories::find_all_order_by_exclude_nonFolder('category', 'ASC'); foreach($serviceCat as $serviceCats){ $ID = $serviceCats->id; $cat = ucwords($serviceCats->category); $servicesCatDateMod = $serviceCats->dateMod; if($checkCat = Services::if_CID_exists($ID)){ echo ' <url> <loc>'.make_cat_url('services_list', $ID, $cat).'</loc> <lastmod>'.$servicesCatDateMod.'</lastmod> <changefreq>daily</changefreq> <priority>0.8</priority> </url>'; } } } // SERVICE LIST $serviceCat = Services::find_all_order_by('title', 'ASC'); foreach($serviceCat as $serviceCats){ $ID = $serviceCats->id; $title = ucwords($serviceCats->title); $servicesDateMod = $serviceCats->dateMod; echo ' <url> <loc>'.make_url('services', $ID, $title).'</loc> <lastmod>'.$servicesDateMod.'</lastmod> <changefreq>yearly</changefreq> <priority>0.8</priority> </url>'; } // BLOG CAT $blogCat = Blog_categories::find_all_order_by('category', 'ASC'); foreach($blogCat as $blogCats){ $ID = $blogCats->id; $cat = ucwords($blogCats->category); $desc = $blogCats->smallDesc; $img = $blogCats->img; echo ' <url> <loc>'.make_cat_url('tips_list', $ID, $cat).'</loc> <lastmod>'.$servicesCatDateMod.'</lastmod> <changefreq>daily</changefreq> <priority>0.8</priority> </url>'; } // BLOG $blogCat = Blog::find_all_order_by('position', 'ASC'); foreach($blogCat as $blogCats){ $ID = $blogCats->id; $title = ucwords($blogCats->title); echo ' <url> <loc>'.make_url('tips', $ID, $title).'</loc> <lastmod>'.$servicesCatDateMod.'</lastmod> <changefreq>yearly</changefreq> <priority>0.8</priority> </url>'; } ?> XML Stylesheet (sitemap.xsl) Code: [Select] <?xml version="1.0" encoding="UTF-8"?><!-- DWXMLSource="http://test.powtest.co.uk/sitemap.xml" --> <!DOCTYPE xsl:stylesheet [ <!ENTITY nbsp " "> <!ENTITY copy "©"> <!ENTITY reg "®"> <!ENTITY trade "™"> <!ENTITY mdash "—"> <!ENTITY ldquo "“"> <!ENTITY rdquo "”"> <!ENTITY pound "£"> <!ENTITY yen "¥"> <!ENTITY euro "€"> ]> <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:output method="html" encoding="UTF-8" doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN" doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"/> <xsl:template match="/"> <html> − <head> − <title> Adele Fifer </title> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/> − <style type="text/css"> body { font-family: Arial, sans-serif; font-size:13px; } #intro { background-color:#CFEBF7; border:1px #2580B2 solid; padding:5px 13px 5px 13px; margin:10px; margin-top:40px; } #intro p { line-height: 16.8667px; } td { font-size:11px; } th { text-align:left; padding-right:30px; font-size:11px; } tr.high { background-color:whitesmoke; } #footer { padding:2px; margin:10px; font-size:8pt; color:gray; } #footer a { color:gray; } a { color:black; } #content { padding: 30px; } img, img a { border:none; } </style> </head> − <body> <h1>Adele Fifer Sitemap</h1> − <div id="content"> − <table cellpadding="5"> − <tr style="border-bottom: 1px solid black;"> <th>URL</th> <th>Priority</th> <th>Change Frequency</th> <th>LastChange</th> </tr> <xsl:variable name="lower" select="'abcdefghijklmnopqrstuvwxyz'"/> <xsl:variable name="upper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/> − <xsl:for-each select="sitemap:urlset/sitemap:url"> − <tr> − <xsl:if test="position() mod 2 != 1"> <xsl:attribute name="class">high</xsl:attribute> </xsl:if> − <td> − <xsl:variable name="itemURL"> <xsl:value-of select="sitemap:loc"/> </xsl:variable> − <a href="{$itemURL}"> <xsl:value-of select="sitemap:loc"/> </a> </td> − <td> <xsl:value-of select="concat(sitemap:priority*100,'%')"/> </td> − <td> <xsl:value-of select="concat(translate(substring(sitemap:changefreq, 1, 1),concat($lower, $upper),concat($upper, $lower)),substring(sitemap:changefreq, 2))"/> </td> − <td> <xsl:value-of select="concat(substring(sitemap:lastmod,0,11),concat(' ', substring(sitemap:lastmod,12,5)))"/> </td> </tr> </xsl:for-each> </table> </div> − <div id="footer"> <a href="http://test.powtest.co.uk/about_adele_fifer">About Adele Fifer</a> </div> </body> </html> </xsl:template> </xsl:stylesheet> First of all I have no idea how to do this. But I know it can be done. A little help would be alot for me. Thank you in advance. So I want to go through a sitemap and visit each of the link on the sitemap and save the URL on to a database. Example: Quote HP 5550N A3 Colour Laser Printer Konica Minolta PagePro 1350W A4 Mono Laser Printer HP 9050dn A3 Mono Laser Printer HP 5550DTN A3 Colour Laser Printer HP 5550HDN A3 Colour Laser Printer HP 5550DN A3 Colour Laser Printer Lets say this was on the sitemap and it just continues with other product like this. I want to write a code where it will go to the first link and saves the product URL on to the database and continues to do the same until the last link. On my example it would be HP 5550DN A3 Colour Laser Printer. Any help??? Any ideas?? I am not asking someone to write the code for me.. Just need help and good direction I'm wanting to make a sitemap for my website but as it's quite big there are a lot of url's I need to include. I was wondering if anyone know's of a way this can be done using PHP. e.g. using PHP to generate a list of URL's. I had a think myself but wasn't sure how to do it. Thanks for any help. Hello, I have found the following code which creates a sitemap from the home directory of files. Please could you point me in the right direction for changing this so that it looks up from my sql database and for each name within the database creates an entry in the sitemap (e.g. www.mysite.com/[NAME]). Thanks for your help, Stu Code: [Select] <?php /******************************************************************************\ * Author : Binny V Abraham * * Website: http://www.bin-co.com/ * * E-Mail : binnyva at gmail * * Get more PHP scripts from http://www.bin-co.com/php/ * ****************************************************************************** * Name : PHP Google Search Sitemap Generator * * Version : 1.00.A * * Date : Friday 17 November 2006 * * Page : http://www.bin-co.com/php/programs/sitemap_generator/ * * * * You can use this script to create the sitemap for your site automatically. * * The script will recursively visit all files on your site and create a * * sitemap XML file in the format needed by Google. * * * * Get more PHP scripts from http://www.bin-co.com/php/ * \******************************************************************************/ // Please edit these values before running your script. //////////////////////////////////// Options //////////////////////////////////// $url = "http://www.WEBSITE_NAME.com/"; //The Url of the site - the last '/' is needed $root_dir = ''; //Where the root of the site is with relation to this file. $file_mask = '*.php'; //Or *.html or whatever - Any pattern that can be used in the glob() php function can be used here. //The file to which the result is written to - must be writable. The file name is relative from root. $sitemap_file = 'sitemap.xml'; // Stuff to be ignored... //Ignore the file/folder if these words appear in the name $always_ignore = array( 'local_common.php','images' ); //These files will not be linked in the sitemap. $ignore_files = array( '404.php','error.php','configuration.php','include.inc' ); //The script will not enter these folders $ignore_folders = array( 'Waste','php_uploads','images','includes','lib','js','css','styles','system','stats','CVS','.svn' ); //The default priority for all pages - the priority of all pages will increase/decrease with respect to this. $starting_priority = ($_REQUEST['starting_priority']) ? $_REQUEST['starting_priority'] : 70; /////////////////////////// Stop editing now - Configurations are over //////////////////////////// /////////////////////////////////////////////////////////////////////////////////////////////////// function generateSiteMap() { global $url, $file_mask, $root_dir, $sitemap_file, $starting_priority; global $always_ignore, $ignore_files, $ignore_folders; global $total_file_count,$average, $lowest_priority_page, $lowest_priority; /////////////////////////////////////// Code //////////////////////////////////// chdir($root_dir); $all_pages = getFiles(''); $xml_string = '<?xml version="1.0" encoding="UTF-8"?> <urlset xmlns="http://www.google.com/schemas/sitemap/0.84" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.google.com/schemas/sitemap/0.84 http://www.google.com/schemas/sitemap/0.84/sitemap.xsd"> <?php # '; $modified_priority = array(); for ($i=30;$i>0;$i--) array_push($modified_priority,$i); $lowest_priority = 100; $lowest_priority_page = ""; //Process the files foreach ($all_pages as $link) { //Find the modified time. $handle = fopen($link,'r'); $info = fstat($handle); fclose($handle); $modified_at = date('Y-m-d\Th:i:s\Z',$info['mtime']); $modified_before = ceil((time() - $info['mtime']) / (60 * 60 * 24)); $priority = $starting_priority; //Starting priority //If the file was modified recently, increase the importance if($modified_before < 30) { $priority += $modified_priority[$modified_before]; } if(preg_match('/index\.\w{3,4}$/',$link)) { $link = preg_replace('/index\.\w{3,4}$/',"",$link); $priority += 20; } //These priority detectors should be different for different sites :TODO: if(strpos($link,'example')) $priority -= 30; //If the page is an example page elseif(strpos($link,'demo')) $priority -= 30; if(strpos($link,'tuorial')) $priority += 10; if(strpos($link,'script')) $priority += 5; if(strpos($link,'other') !== false) $priority -= 20; //Priority based on depth $depth = substr_count($link,'/'); if($depth < 2) $priority += 10; // Yes, I know this is flawed. if($depth > 2) $priority += $depth * 5; // But the results are better. if($priority > 100) $priority = 100; $loc = $url . $link; if(substr($loc,-1,1) == '/') $loc = substr($loc,0,-1);//Remove the last '/' char. $total_priority += $priority; if($lowest_priority > $priority) { $lowest_priority = $priority;//Find the file with the lowest priority. $lowest_priority_page = $loc; } $priority = $priority / 100; //The priority is given in decimals $xml_string .= " <url> <loc>$loc</loc> <lastmod>$modified_at</lastmod> <priority>$priority</priority> </url>\n"; } $xml_string .= "</urlset>"; if(!$hndl = fopen($sitemap_file,'w')) { //header("Content-type:text/plain"); print "Can't open sitemap file - '$sitemap_file'.\nDumping result to screen...\n<br /><br /><br />\n\n\n"; print '<textarea rows="25" cols="70" style="width:100%">'.$xml_string.'</textarea>'; } else { print '<p>Sitemap was written to <a href="' . $url.$sitemap_file .'">'. $url.$sitemap_file .'></a></p>'; fputs($hndl,$xml_string); fclose($hndl); } $total_file_count = count($all_pages); $average = round(($total_priority/$total_file_count),2); } ///////////////////////////////////////// Functions ///////////////////////////////// // File finding function. function getFiles($cd) { $links = array(); $directory = ($cd) ? $cd . '/' : '';//Add the slash only if we are in a valid folder $files = glob($directory . $GLOBALS['file_mask']); foreach($files as $link) { //Use this only if it is NOT on our ignore lists if(in_array($link,$GLOBALS['ignore_files'])) continue; if(in_array(basename($link),$GLOBALS['always_ignore'])) continue; array_push($links, $link); } //asort($links);//Sort 'em - to get the index at top. //Get All folders. $folders = glob($directory . '*',GLOB_ONLYDIR);//GLOB_ONLYDIR not avalilabe on windows. foreach($folders as $dir) { //Use this only if it is NOT on our ignore lists $name = basename($dir); if(in_array($name,$GLOBALS['always_ignore'])) continue; if(in_array($dir,$GLOBALS['ignore_folders'])) continue; $more_pages = getFiles($dir); // :RECURSION: if(count($more_pages)) $links = array_merge($links,$more_pages);//We need all thing in 1 single dimentional array. } return $links; } //////////////////////////////// Display ///////////////////////////// ?> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.1 Transitional//EN"> <html> <head> <title>Sitemap Generation Using PHP</title> <style type="text/css"> a {color:blue;text-decoration:none;} a:hover {color:red;} </style> </head> <body> <h1>PHP Google Search Sitemap Generator Script</h1> <?php if($_POST['action'] == 'Create Sitemap') { generateSiteMap(); ?> <h2>Sitemap Created...</h2> <h2>Statastics</h2> <p><strong><?php echo $total_file_count; ?></strong> files were found and indexed.<br /> Lowest priority of <strong><?php echo $lowest_priority; ?></strong> was given to <a href='<?php echo $lowest_priority_page; ?>'><?php echo $lowest_priority_page; ?></a></p> Average Priority : <strong><?php echo $average; ?></strong><br /> <h2>Redo</h2> <?php } else { ?> <p>You can use this script to create the sitemap for your site automatically. The script will recursively visit all files on your site and create a sitemap XML file in the format needed by Google. </p> <p>You can customize the result by changing the starting priorities.</p> <h2>Set Starting Priority</h2> <?php } ?> <form action="create_sitemap.php" method="post"> Starting Priority : <input type="text" name="starting_priority" size="3" value="<?php echo $starting_priority; ?>" /> <input type="submit" name="action" value="Create Sitemap" /> </form> </body> </html> Hi, I have found a nice little script to write to the xml file for google, but it appears not to be updating the file. Could someone please let me know what might be wrong as this is the first time I've tried to do this. Code: [Select] $dom = new DomDocument('1.0', 'utf-8'); $dom->load('../sitemap.xml'); // if it's a string, $dom->load($xmlFile) if is a file // create the new node element $node = $dom->createElement('url'); // build node $node->appendChild($dom->createElement('loc', $title)); $node->appendChild($dom->createElement('lastmode', date(Y-m-d))); $node->appendChild($dom->createElement('changefreq', 'monthly')); $node->appendChild($dom->createElement('priority', '2')); // append the child node to the root $dom->documentElement->appendChild($node); // get the document as an XML string $xmlStr = $dom->save(); I was wondering if it was something to do with the [urlset] tag within the xml file I'm using? The xml file is just a standard google sitemap.xml Any help on this would be great Cheers Rob I wish to create a sitemap for a shopping cart and so far it seem this needs to be in XML format? is this true ? If so are there any templates that I can use to base my sitemap on. I have a simple php script that updates my sitemap automatically. However, when you click the link to the sitemap at www.bomstudies.com/sitemap-xml.php - the last mod date is showing 1969. The php script works for files that are not in sub folders. such as /talks or /commentary. I just need to know what snippet of code i should add, change, or delete to the sitemap php side of it to show the correct last mod date for my xml sitemap. I greatly appreciate your help.
Thanks -
Here is some of my code:
The Config Code: posted as config.php
<?php From my experience, using Google's sitemap generator to generate a sitemap is not that good. It creates page that you do not want to show up. I am wondering what the best method is for creating a sitemap of a website so that only certain pages of the website show up on Google?
For the last 2 months I have purchased software and utilized free software to attempt to create a sitemap fro my websites
www.negroartist.com
www.africanafrican.com
all of them fail and do not look at all links. i.e. look at a photo album with all the htm in the photo album.
Most of the windows based software do not work and often crash. I want it to index ALL of my site (except images, pdf etc)
Is there any php code, program etc. out there to help me through my dilemna that has instructions for configurabillity? I can also try anything that uses apache etc.
i am at my whits end with this. please help.
has anyone used bing sitemap plugin? how do you install this baby?
my links will total in the hundreds of thousands.
thanks :-)
So a project I have simply to learn how it is done is to make an auto updating sitemap with a auto submitter. I have been doing a little research on sitemaps but haven't found much in the way of a tutorial or white paper or similar, anyway I was wondering if I am on the right track or not. What I was thinking for the auto-updating sitemap is since site maps are XML, when an article is published using my custom CMS/Blogging system, I make it rewrite the sitemap appending the new URL that will be generated. Then for the auto-submit, I can either make it a cron job or just manual press, but it would go though a for list of URLs, replacing the end part with my site map url. I think that is how it would be done. If you have a better idea, or can point out an article so I can understand a lot more with sitemaps, I will appreciate it, I am trying to learn as much as possible, I have gone to the offical website (sitemaps.org) but it doesn't have what I am looking for. Thanks, James |