• Resolved saumya010

    (@saumya010)


    I have a question about the indexing – I see that the plugin auto-indexes a post whenever it’s published or updated. However, there are a few posts that, even when indexed, do not appear in the search.

    Is there any way to figure out how/why these posts are missing/not appearing in the search?

Viewing 11 replies - 1 through 11 (of 11 total)
  • Plugin Contributor Michael Beckwith

    (@tw2113)

    The BenchPresser

    Hi @saumya010

    To be certain, you’ve logged in to your algolia.com dashboard and verified that these posts are listed there, and when performing a query using that UI, the posts in question are among the results? However, when making the same query from your WP install, the posts in question are NOT showing?

    Have you done any customization or perhaps filter configuration on the WordPress side that could potentially remove them from results?

    Thread Starter saumya010

    (@saumya010)

    Hi Michael, thanks for the quick response.

    Yes, I have logged into Algolia and checked the logs and I can confirm that the posts were indexed correctly. I haven’t made any changes on the WP side and no customizations either.

    The issue is that some random posts, even though indexed correctly (as per the Algolia logs), do not appear even in Algolia, as if they went missing or were deleted automatically somehow. I reached out to Algolia support as well, but they are not that responsive and are trying to pin the problem on the plugin. Also, their logging is limited to 7 days, which is not really of any help.

    I would like to understand if there is something in the plugin code that might fire the API on a post update and somehow remove the post from the index. Also, is there a way to identify the plugin’s API calls via WP logs?

    Plugin Contributor Michael Beckwith

    (@tw2113)

    The BenchPresser

    We don’t have much in the way of logging in the current version, but we do have a GitHub issue open on the topic.

    Just to be extra certain, you haven’t made use of either of these filters, correct?

    • algolia_should_index_searchable_post
    • algolia_should_index_post

    By default, as long as a given post is published and not password protected, it should be getting indexed: https://github.com/WebDevStudios/wp-search-with-algolia/blob/2.8.3/includes/indices/class-algolia-searchable-posts-index.php#L95-L109. As you’ll note, one of the filters above is applied there. For brevity I am not linking to the other one.

    As an extra test, have you edited the specific post, just clicked “update”, and then confirmed whether or not it’s now in the index again?
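    If it helps narrow things down, here is a rough diagnostic sketch, not part of the plugin. It assumes both filters pass the current decision and the post object as their two arguments, and it hooks in at a very late priority so it sees the final decision after any other callbacks. The function names and the `[algolia-debug]` log prefix are made up for illustration. It only logs and never changes the decision.

```php
<?php
/**
 * Diagnostic sketch: log why a post is being skipped for indexing.
 * Function names and the log prefix are hypothetical, not plugin code.
 */

// Pure helper so the reasoning can be exercised outside WordPress.
function my_algolia_exclusion_reason( bool $should_index, string $status, string $password ): ?string {
    if ( $should_index ) {
        return null; // Post will be indexed, nothing to report.
    }
    if ( 'publish' !== $status ) {
        return 'post status is "' . $status . '"';
    }
    if ( '' !== $password ) {
        return 'post is password protected';
    }
    // Published and not protected, so something else said no.
    return 'excluded by a filter callback';
}

// Filter callback: logs the reason and returns the decision unchanged.
function my_algolia_log_exclusion( $should_index, $post ) {
    $reason = my_algolia_exclusion_reason(
        (bool) $should_index,
        (string) $post->post_status,
        (string) $post->post_password
    );
    if ( null !== $reason ) {
        error_log( sprintf( '[algolia-debug] post %d not indexed: %s', $post->ID, $reason ) );
    }
    return $should_index;
}

// Only register the hooks when running inside WordPress.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'algolia_should_index_searchable_post', 'my_algolia_log_exclusion', PHP_INT_MAX, 2 );
    add_filter( 'algolia_should_index_post', 'my_algolia_log_exclusion', PHP_INT_MAX, 2 );
}
```

    Dropped into a small must-use plugin, this would write a line to the PHP error log each time a post is skipped, which can then be compared against the posts that went missing.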

    Thread Starter saumya010

    (@saumya010)

    Thanks for specifying the filters, I completely missed one that we use, to exclude pages and other CPTs from indexing – algolia_searchable_post_types

    For the test part, updating the same post again always works and reindexes the post and then it appears correctly in the search. But this is not a permanent fix. Is there any other way to ensure that the post is always indexed and appears correctly in Algolia and WP?

    Plugin Contributor Michael Beckwith

    (@tw2113)

    The BenchPresser

    That’s going to be a troubleshooting topic on your end with the specific install in question. The plugin isn’t coded to potentially randomly de-index content unless it’s been trashed/deleted in WordPress.

    The only way I could think of for a **potential** failure to re-index would be if there’s somehow an error during the process of indexing. As is, the plugin deletes the entire record right before re-pushing the entire new state of the record, as opposed to patching/updating just parts of it. For example, if you just renamed the post and clicked update, the plugin removes, processes, and re-indexes the whole post, instead of sending just the new title value. If the last part of that is somehow failing…I could see potential for “weird deletion”, but that’s the only way off the top of my head.
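    To illustrate that failure mode, here is a toy sketch using an in-memory stand-in for an index. `Fake_Index`, `resync_record()`, and the simulated failure are all invented for this example; none of it is the plugin’s code or Algolia’s PHP client. It only shows why a failed push after a successful delete leaves no record at all, rather than the old version.

```php
<?php
// In-memory stand-in for a search index. NOT the Algolia client.
class Fake_Index {
    public $records      = array();
    public $fail_on_save = false;

    public function delete_object( string $id ): void {
        unset( $this->records[ $id ] );
    }

    public function save_object( string $id, array $data ): void {
        if ( $this->fail_on_save ) {
            throw new RuntimeException( 'simulated save failure' );
        }
        $this->records[ $id ] = $data;
    }
}

// Delete-then-push: remove the old record, then push the fresh state.
function resync_record( Fake_Index $index, string $id, array $data ): bool {
    $index->delete_object( $id );
    try {
        $index->save_object( $id, $data );
        return true;
    } catch ( RuntimeException $e ) {
        error_log( '[algolia-debug] re-push failed for ' . $id . ': ' . $e->getMessage() );
        return false;
    }
}

$index = new Fake_Index();

// First sync succeeds, so record 42 exists.
resync_record( $index, '42', array( 'title' => 'Old title' ) );

// Second sync: the delete succeeds, but the push fails.
$index->fail_on_save = true;
$ok = resync_record( $index, '42', array( 'title' => 'New title' ) );

// Record 42 is now gone entirely, not left at "Old title".
var_dump( $ok, isset( $index->records['42'] ) );
```

    A later successful save of the same post would put the record back, which would line up with “clicking update again fixes it”.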

    Are you doing any sort of routine bulk re-indexing? Perhaps as part of a daily schedule or similar?

    Thread Starter saumya010

    (@saumya010)

    No, there is no bulk indexing being done. Indexing only occurs when a post is published/updated.

    Is there any way to capture when the indexer runs on post publish/update, in the logs? This way I will be able to narrow the issue down further.

    Plugin Contributor Michael Beckwith

    (@tw2113)

    The BenchPresser

    All of the post watching happens in https://github.com/WebDevStudios/wp-search-with-algolia/blob/main/includes/watchers/class-algolia-post-changes-watcher.php

    This is separate from bulk re-indexing and operates just at the per-post level.

    We still don’t have all that much for meaningful logging outside of spots that attempt to log thrown exceptions.

    While I know we’re getting close to a new release, we’re not there yet and it won’t be in the next week or so. I bring this up because there’s room where I would not fault you for self-modifying the plugin for troubleshooting purposes to try and track things down a bit more manually.
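    Short of editing the plugin itself, one low-risk starting point is to log WordPress’s own `save_post` action, so the timestamps can be lined up against the Algolia API logs. This is only a sketch: it records that WordPress saved a post, not what the plugin’s watcher did with it, and the function names and log prefix are made up for illustration.

```php
<?php
// Build the log line separately so it can be checked outside WordPress.
function my_algolia_format_save_log( int $post_id, string $post_type, string $status, bool $is_update ): string {
    return sprintf(
        '[algolia-debug] save_post fired: id=%d type=%s status=%s update=%s',
        $post_id,
        $post_type,
        $status,
        $is_update ? 'yes' : 'no'
    );
}

// save_post passes the post ID, the post object, and an "is update" flag.
function my_algolia_log_save( $post_id, $post, $update ) {
    error_log(
        my_algolia_format_save_log(
            (int) $post_id,
            (string) $post->post_type,
            (string) $post->post_status,
            (bool) $update
        )
    );
}

// Only register the hook when running inside WordPress.
if ( function_exists( 'add_action' ) ) {
    add_action( 'save_post', 'my_algolia_log_save', 10, 3 );
}
```

    Keep in mind that `save_post` also fires for revisions and autosaves, so expect more lines than actual publish/update clicks.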

    Thread Starter saumya010

    (@saumya010)

    Hi Michael,

    Algolia support replied with some info that the plugin is sending a batch “deleteObject” before the “updateObject” operation is fired. Is there any way to prevent that or figure out if this is deleting the posts and not updating them afterwards?

    This is the screenshot they provided.

    Plugin Contributor Michael Beckwith

    (@tw2113)

    The BenchPresser

    We do indeed run deleteObjects() (note the plural) in, say, https://github.com/WebDevStudios/wp-search-with-algolia/blob/main/includes/indices/class-algolia-searchable-posts-index.php#L478-L497 and I believe that gets mapped to deleteObject() (singular) within Algolia’s PHP client, which we ship with the plugin. This has been true since we originally forked the plugin, and it is how Algolia themselves handled this spot before us. Not pointing blame anywhere, just providing context.

    Best I can tell, this was done with the intent of just “remove the post completely, and push everything fresh”, instead of trying to determine all of the matches and the differences between the currently indexed content and the to-be-indexed content. More efficient to delete previous and push current.

    The question ultimately ends up being why some of the content isn’t making it successfully through the update_records() method, which gets invoked right after the deletion.

    Sorry for throwing some code your way, but I’m hopeful you can generally comprehend it.

    Here’s the update_records() method location:

    https://github.com/WebDevStudios/wp-search-with-algolia/blob/main/includes/indices/class-algolia-index.php#L334-L384

    Cases where it may return early, or indicate issues:

    1. If a re-index is already running elsewhere.
    2. If $this->sanitize_json_data() returns an empty dataset or throws an exception.
    3. If the $index->saveObjects() method throws an exception. This one is going to be from Algolia’s PHP client as well.

    I’m curious if we should wrap parent::update_records( $post, $records ); from update_post_records() in some sort of try/catch block, or something, to try and better indicate a failure. If error_log() is working for your server, then the logs may have details around potential exceptions.
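    As a rough idea of what that wrapping could look like, here is a generic sketch. `my_algolia_guarded_update()` is a hypothetical helper, not something in the plugin today; it runs whatever callable it is given and logs any exception or error along with the post ID instead of letting it pass silently.

```php
<?php
// Hypothetical helper: run an update step and log any failure.
function my_algolia_guarded_update( callable $update, int $post_id ): bool {
    try {
        $update();
        return true;
    } catch ( Throwable $e ) {
        error_log(
            sprintf(
                '[algolia-debug] update for post %d failed: %s: %s',
                $post_id,
                get_class( $e ),
                $e->getMessage()
            )
        );
        return false;
    }
}

// Stand-alone demonstration with stand-in callables.
$succeeded = my_algolia_guarded_update(
    function () {
        // Pretend the records were pushed without trouble.
    },
    42
);

$failed = my_algolia_guarded_update(
    function () {
        throw new RuntimeException( 'simulated saveObjects failure' );
    },
    42
);

var_dump( $succeeded, $failed );
```

    For troubleshooting on a local copy, the closure passed in could be the existing parent::update_records( $post, $records ) call. Note that this only surfaces thrown exceptions; the early-return cases listed above would still pass without a log line.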

    (@blgerber)

    Hi,

    This is marked as resolved, but seems to be the same issue I am having, so I thought I would add my results to the pool here in case it’s illuminating. New posts seem to be just fine. But posts that were added before we first added Algolia seem to be hit or miss. They will be missing both in the WP search results and in the Algolia dashboard. Going to a post and hitting update gets the post to appear, however.

    I checked the API logs and found that the initial indexing did find the missing posts, and returned 200. There are some 404s in the API logs, all ending with ?getVersion=2.

    I do have some custom logic with algolia_should_index_searchable_post (used to only index certain pages, and exclude one post type completely). I don’t believe this is causing the problem; the results seem to be exactly as intended. And the custom post types affected by this bug (?) aren’t mentioned anywhere in the code, and it’s odd that some work and others do not.

    It’s not the end of the world because we can force them into results by updating the post, but it is curious.

    Plugin Contributor Michael Beckwith

    (@tw2113)

    The BenchPresser

    @blgerber The plugin isn’t going to automatically index everything all at once after it is activated and the application details are filled in. You’ll want to make use of the “bulk index” functionality to have WP push all of your eligible content into the index.

    The indexes will self-manage: when you save a given post, that one post will re-index. However, without bulk indexing, you’d need to click to edit and save EACH post.

    Also, just to ensure it’s known, changing the logic for what should be indexed also benefits from a bulk index run, so that whatever newly qualifies gets into the index.
