Hi, it’s been one issue after another trying to get Google to index my site properly… Hopefully I’m at the last hurdle.
I’m using friendly URLs and have installed the Canonical extra, but Google is not happy with www.mydomain.com/index.html, saying it’s being redirected. The entry in .htaccess is for index.php, though, and I cannot find anything in System Settings that might resolve this.
Can someone please assist?
Many thanks :-)
I think I may have fixed it by deleting the resource alias on the homepage resource in the manager. It is “index” by default, and clearing it changes it to “home”. Hopefully this fixes the issue?
Hey Bobray, just jumping in here. Removing the index.html file does make sense, especially if it’s conflicting with the default index.php or causing confusion with redirects. I’ve seen similar issues where leftover files like index.html end up getting indexed or prioritized by Google, even when the actual site runs off index.php or MODX’s routing. It’s worth double-checking the server’s DirectoryIndex settings too, in case it’s still trying to serve index.html by default.
The thing is, index.html is not a physical file; it’s generated from the homepage resource in the manager, with friendly URLs enabled.
I did try renaming the homepage resource alias to “home”, but then Google said the problem was “index.html - 404 not found”. Maybe I should have left it that way and given it more time?
Currently Google Search Console reports index.php as “Alternative page with proper canonical tag”.
Also, a few weeks ago I returned the homepage to “index.html” and placed the following into .htaccess after reading some forum posts: “RewriteRule ^index.html$ / [R=301,L]”. This was just a somewhat desperate attempt to fix the issue. I’m clutching at straws!
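For anyone trying the same thing, here is a minimal sketch of how that redirect might sit in .htaccess. It assumes Apache with mod_rewrite enabled, a site installed in the document root (hence RewriteBase /), and the stock MODX friendly-URL block left unchanged. The dot is escaped so the pattern only matches a literal “index.html”, and the rule sits above the catch-all so it is evaluated first.

```apache
RewriteEngine On
# Assumption: MODX is installed in the web root
RewriteBase /

# Send the old home page URL to the site root with a permanent redirect.
# This has to come before the friendly-URL catch-all below.
RewriteRule ^index\.html$ / [R=301,L]

# Friendly-URL rule as shipped in MODX's sample ht.access (assumed unchanged)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
```

If the redirect is picked up, a request for /index.html should answer with a 301 to / rather than a 404.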
Search Console also currently reports a problem for index.html, with the reason “404 - Not Found” and a validation status of “Started” since I clicked the fix button a few weeks ago. I think I should have heard from them by now?
Google currently has only indexed my domain “mydomain.com”.
I think you’re on the right track with renaming the resource to “home.”
What do the base href lines look like in your templates?
Are all links to the home page on the site in the form of link tags ([[~##]])? They should be. You may have some links with index hard-coded, which would confuse Google.
Be sure to clear the site cache after making changes.
Well, good news: I now have 10 pages indexed. In Google Search Console I used “inspect any URL”, pasted each URL in turn, and pressed “request indexing” on each one.
I did have a major error with the “breadcrumbs” extra, as it was using an obsolete schema (“data-vocabulary.org schema deprecated”). This was fixed by switching to the “breadcrumb” extra.
However, I don’t know how to configure the “breadcrumb” extra to use this feature, listed for BreadCrumb 1.4.4-pl:
“Added placeholder to support BreadcrumbList - Schema.org Type”
There are now two outstanding issues in Google Search Console:
Alternative page with proper canonical tag - /index.php (probably expected and unavoidable?)
Not found (404) - /index.html (maybe just give Google time to remove it, or maybe just keep clicking “fix issue”?)
Anyway, I’m very happy today :-)
Thank you both bobray and poldx for your input with this!
Hey Bob,
I’ve been looking into DirectoryIndex. I’m just not sure it’s necessary at the moment, as things are going well and search engines seem happy now.
poldx mentioned DirectoryIndex as well, but as both of you wrote, the main thing was to get rid of index.html, which is now sorted.
Tell you what, though: I had a long break from web design (years), and things are so different now. I used Evolution back in the day (wow, that was a long time ago), and now it’s Revolution, with containers that can hold content themselves and no need for an index.html / index.php inside a container (folder).
The only index document is now in the site root, and it is index.php, as you know.
I think I’ll see how things go for a few weeks and then reconsider using DirectoryIndex.
One issue I do have is that the W3C HTML validator returns a 500 Internal Server Error when I submit any URL from my site. That’s not good.
I could be wrong, but I think that index.html is still the default directory index for many servers. So the point of the command in .htaccess is to change that. If you only list index.php, it should stop looking for index.html.
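As a rough illustration of that command, assuming Apache with mod_dir and a host that allows this directive in .htaccess (AllowOverride Indexes or All), the line would look something like this:

```apache
# Only index.php is tried for directory requests; index.html is no longer looked for
DirectoryIndex index.php
```

Note that this only controls which file Apache serves for a directory request such as /. A direct request for /index.html would still go through the rewrite rules, so this is a complement to fixing the alias or redirect, not a replacement for it.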