Dealing with previously indexed pages (Google Search Console) that are now unpublished

I’m unfortunately getting hundreds of errors in Google Search Console for pages that no longer load. They were indexed from other sources, and they are biography pages that have been unpublished because people left the company.

Is there a way that I can fix this for non-logged-in users, namely the search bots? Something like adding a conditional to the bio template that first checks whether the resource is published and, if it isn’t, redirects to a specific page rather than returning a 404 or, in this case, a 500 error?

These are specifically throwing errors that are affecting our mobile search results.

You could make 301 redirects; this would be the most SEO-friendly way of fixing this. It is concerning to me that visitors are receiving 500 errors, which are way worse for SEO than 404s. Maybe someone else can speak to that?

Have you seen the Redirector Extra?

Also, reaching out to the websites that are directing people to your broken pages and asking them to change the links is a good idea. They would probably appreciate knowing that they are sending tons of their visitors to error pages. :yum: I would.

I don’t understand why unpublished pages would result in a 500 error. Should just be a 404… Can you give us an example of a link that is resulting in a 500 error?

Do you have your 404 page set up? And do you have a site map that is submitted to Google? You can also have Google re-index your site, once you have a site map and a 404 page in place…

I agree, the 500 part is concerning. If you could provide a server-side error log reference for the 500, it would help a lot. As @Mr_JimWest suggested, adding a plugin like Redirector that forwards on a pattern match when a 404 happens will definitely help in normal circumstances. I am not sure where the 500 is coming from though, and it may be a “Why?” to answer before a 404 plugin can do its job.

One reason you might get a 500 instead of a 404 is if you haven’t configured an error_page per context. That can cause MODX to get thrown into a loop when trying to serve an error in a certain context.

Ok, I’ll try to address everyone’s questions at once here. I apparently cannot mention more than 2 people per post either. Anytime an attorney leaves, this tends to happen. For example, here’s a link that is getting the 500 treatment: https://www.bipc.com/gregory-eck.

@lucy, I do have my 404 page set up and it does work for other issues. I’ve also submitted our sitemap to Google. However that’s something in another thread I’m having issues with. For some reason the sitemap extra hasn’t updated any content except for the homepage since February 2019: GoogleSiteMap Snippet: Sitemap not updating.

matdave & Mr_JimWest, I’ll look for some error logs that correspond to this and will let you know, but I’ll also check out that redirector extra too.

@markh, what do you mean by configuring an “error_page per context”?

Well, you don’t want to set up redirects for unpublished pages if you want the pages to be removed from Google’s index. As far as I know, the only way to have Google remove a page from its index is to let the bots hit a 404 enough times for them to be sure the page is really gone. I don’t know how long this takes; I just know that they won’t remove the page the first time they hit a 404 (in case the situation is temporary), and they won’t remove the page if they get a redirect instead of a 404. My gut tells me you have one or more redirects set up somewhere and that some part of a redirect is causing the 500s.
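
If it helps to audit what the bots actually receive, here is a minimal Python sketch (not MODX code, and not anything from this site) that requests a URL without following redirects and labels the response. The function names and the label strings are made up for illustration, and the "gone" label simply follows the reasoning above that a 404 (or 410) is what lets a page drop out of the index.

```python
from urllib.error import HTTPError, URLError
from urllib.request import HTTPRedirectHandler, Request, build_opener


class NoRedirect(HTTPRedirectHandler):
    """Refuse to follow redirects so the 3xx status itself is reported."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def classify(status):
    """Label a status code by what it likely means for an unpublished page."""
    if status in (404, 410):
        return "gone"          # what a removed page should return
    if 300 <= status < 400:
        return "redirect"      # the page will not be treated as gone
    if 500 <= status < 600:
        return "server-error"  # the situation described in this thread
    if 200 <= status < 300:
        return "still-live"
    return "other"


def fetch_status(url, timeout=10):
    """Return the HTTP status for url as an anonymous client, or None."""
    opener = build_opener(NoRedirect)
    request = Request(url, headers={"User-Agent": "status-check"})
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # Non-2xx responses (including unfollowed 3xx) arrive here.
        return err.code
    except URLError:
        # DNS failure, refused connection, timeout and similar.
        return None
```

Calling `classify(fetch_status(url))` for each URL listed in Search Console would show which ones return a 404, which redirect, and which fail with a 5xx.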

Here’s a thought I just had. Can a 500 Error happen when one resource contains a link to a now unpublished resource?

For example: https://www.bipc.com/2017-tax-act. (Broken link: Christine Boronyak Bowers https://www.bipc.com/christine-bowers 500.00 Internal Server Error)

I have come across a number of pages now that had links to many of the attorney pages coming up in the 500 Error results in Google Search Console. However this may just be a coincidence.

If a page is missing or deleted the makeUrl process just logs an error that the resource couldn’t be found. It shouldn’t result in an actual 500 error. It takes quite a while for the 500 to load, so it’s maybe a redirect loop internally? Without a corresponding server error log it is hard to tell.

You haven’t said if you’ve checked for redirects somewhere. Is there anything in the httpd.conf or .htaccess file (if on Apache) or the nginx.conf file (if on nginx)?
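
For the Apache case, a quick way to spot candidates is to scan the config text for directives that can issue redirects. The sketch below is only an assumption about what is worth flagging (the `Redirect*` directives from mod_alias and `RewriteRule` from mod_rewrite); the function name is hypothetical and it does not interpret the rules, it just lists them with line numbers.

```python
import re

# Apache directives that can issue a redirect or rewrite a request.
REDIRECT_DIRECTIVE = re.compile(
    r"^\s*(RedirectMatch|RedirectPermanent|RedirectTemp|Redirect|RewriteRule)\b",
    re.IGNORECASE,
)


def find_redirect_lines(text):
    """Return (line_number, directive) pairs for uncommented redirect rules."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # skip commented-out rules
        if REDIRECT_DIRECTIVE.match(line):
            hits.append((number, line.strip()))
    return hits
```

Reading the .htaccess file into a string and passing it to `find_redirect_lines` gives a short list to review by hand.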

Also I would be curious to know – if you make a new test page, save it, unpublish it – what happens when you try to hit it while not logged in? 404 or 500?

@matdave & @lucy, there is nothing regarding it in the server log (apache). I even had the hosting service confirm this. There are also no redirects happening in the .htaccess file. I downloaded a fresh .htaccess file from a blank installation and configured it appropriately to remove any potential issues since we did have a bunch of redirects for older resources.

I also just created a test page, https://www.bipc.com/test-service. It returned a 404 page before I published it, worked once I published it, and returned a 500 error after I unpublished it.

Ok, here’s something interesting (or maybe not, i dunno). I turned on ini_set('display_errors', '1'); under the plugin customFurls and received this error message when going to any of the 500 pages:

Fatal error: Allowed memory size of 536870912 bytes exhausted (tried to allocate 20480 bytes) in /home/user/public_html/core/model/modx/modaccessibleobject.class.php on line 245

That line is $matched = array_diff_assoc($criteria, $matches); and is a part of public function checkPolicy which is for “Determining if the current/specified user attributes satisfy the object policy.”

Alternatively, I’ve also received on other error pages this error:

Fatal error: Allowed memory size of 536870912 bytes exhausted (tried to allocate 16384 bytes) in /home/user/public_html/core/xpdo/om/xpdoobject.class.php on line 236

Does this mean anything to anybody?
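
To put the numbers in those messages into more readable units, here is a small Python sketch that parses a PHP "Allowed memory size" fatal error. The function name and the returned keys are made up for illustration. For the first message above it reports a 512 MiB limit and a 20 KiB allocation, so the limit was hit by a small request after memory had already filled up. That would be consistent with the loop suggested earlier in the thread, though that is only an inference.

```python
import re

MEMORY_ERROR = re.compile(
    r"Allowed memory size of (\d+) bytes exhausted "
    r"\(tried to allocate (\d+) bytes\) in (\S+) on line (\d+)"
)


def parse_memory_error(message):
    """Extract the limit, the failed allocation and the location, or None."""
    match = MEMORY_ERROR.search(message)
    if match is None:
        return None
    limit, requested, path, line = match.groups()
    return {
        "limit_mib": int(limit) / (1024 * 1024),  # bytes to MiB
        "requested_kib": int(requested) / 1024,   # bytes to KiB
        "file": path,
        "line": int(line),
    }
```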

Sorry to say I have never heard of a MODX customFurls plugin and can’t find any docs for it… Can you disable it to see if the issue is fixed?

Also what MODX version are you running?

I’m running 2.7.1. Also, yep! When I disabled that plugin (probably from the person that built the site) that 500 error went away. Now I just need to figure out what this was for.

So glad that is sorted!
If, after trying to figure it out, you still have questions about the plugin you can start a new thread and post the code. I’m sure folks would take a look. It’s possible (probable even) that it worked properly on older versions and needs tweaking for current MODX.
