Sitemap snippet/package that crawls?

I have a website with a large publications section, where resources link to PDF documents that have been uploaded to the server. It would be nice to find a MODX way to generate a simple XML sitemap that includes those PDF documents plus the resources that correspond to web pages, excluding the usual suspects like the 404 page, etc. I have found a couple of MODX packages (Sterc's SEO Suite and pdoSitemap), but I cannot tell whether they actually crawl the site so that the PDF documents are included, or whether they just list the resources.

Is there a script that crawls (assuming crawling is necessary to fetch the PDFs)?

Does it also update itself automatically, or does it need to be run manually by viewing the resource that calls the script?

At the moment, I am using Tristan Goossens' PHP script, which seems to work, although only manually, but it feels like a terrible kind of disloyalty:

You can have multiple sitemaps; you just have to create an additional sitemap with the snippet that lists the PDF files.
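For reference, the sitemap protocol ties multiple sitemaps together with a sitemap index file that search engines read first. A minimal sketch (the file names here are assumptions, not anything MODX generates for you):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- hypothetical sitemap built from the web-page resources -->
  <sitemap>
    <loc>https://example.com/sitemap-pages.xml</loc>
  </sitemap>
  <!-- hypothetical sitemap built from the PDF template variable -->
  <sitemap>
    <loc>https://example.com/sitemap-pdfs.xml</loc>
  </sitemap>
</sitemapindex>
```

Each `<loc>` points at one of the sitemaps you generate; you then submit only the index file to the search engines.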


Thank you for that suggestion. I did not know that a chap could have multiple XML sitemaps for one website. And I find the approval and the mechanics here:

So I guess (hesitantly) that the workflow would be to set up two resources: one with a getResources call that lists the pages drawn from the resources, and a second with a getResources call that lists the PDF URLs stored in the PDF template variable on each publication resource.
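For the second resource, a rough sketch of what the getResources call and its tpl chunk might look like. The TV name `pdf`, the chunk name `pdfUrlTpl`, and the parent ID `12` are assumptions; the resource itself would use an XML content type and an empty template:

```
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
[[!getResources?
  &parents=`12`
  &limit=`0`
  &includeTVs=`1`
  &processTVs=`1`
  &tvFilters=`pdf==%`
  &tpl=`pdfUrlTpl`
]]
</urlset>
```

And the hypothetical `pdfUrlTpl` chunk, assuming the TV holds a site-relative path:

```
<url>
  <loc>[[++site_url]][[+tv.pdf]]</loc>
  <lastmod>[[+editedon:strtotime:date=`%Y-%m-%d`]]</lastmod>
</url>
```

The `&tvFilters=`pdf==%`` parameter should restrict the output to resources whose `pdf` TV is non-empty, so publication pages without an attached PDF are skipped.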

I will try to have just one XML file from a single snippet that walks all resources and lists all PDFs, etc., reducing Tristan's three-file solution to one snippet.
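If the single-snippet route works out, it might look something like the sketch below. This assumes MODX Revolution, a TV named `pdf` holding a site-relative path, and a resource with an XML content type whose content is just the snippet call; it is a sketch, not a tested implementation. Note that no crawling is needed: the snippet reads the PDF paths straight from the TVs.

```php
<?php
// Hypothetical one-file sitemap snippet: lists published resources
// plus any PDF stored in their "pdf" TV (the TV name is an assumption).
$siteUrl = rtrim($modx->getOption('site_url'), '/');

$xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
$xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";

// Marking the 404 page (and friends) as not searchable keeps them out.
$resources = $modx->getCollection('modResource', array(
    'published'  => 1,
    'deleted'    => 0,
    'searchable' => 1,
));

foreach ($resources as $resource) {
    // The web page itself.
    $xml .= '  <url><loc>'
          . $modx->makeUrl($resource->get('id'), '', '', 'full')
          . "</loc></url>\n";

    // The attached PDF, if the TV is set on this resource.
    $pdf = $resource->getTVValue('pdf');
    if (!empty($pdf)) {
        $xml .= '  <url><loc>'
              . $siteUrl . '/' . ltrim($pdf, '/')
              . "</loc></url>\n";
    }
}

$xml .= '</urlset>';
return $xml;
```

Because the snippet is called uncached and runs on every request to the sitemap resource, the output updates itself whenever resources change, which would also answer the manual-versus-automatic question above.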