Writing Query Hooks for AdvSearch

Hello all,

Trust everyone is keeping safe. I am having some trouble writing a query hook for AdvSearch. I have a multi-list tv “animal” with an output delimiter “||” that I want to search with AdvSearch. In this example, I’m using a textarea to allow the user input what they would like to search for. They can use the words “AND” or “OR” to filter what they are searching for.

I have 4 resources with tv “animal”. Provided below are the values within each resource.

Resource 1: dog,cat,bird
Resource 2: cat,dog,fish
Resource 3: cat,bird,fish,dog
Resource 4: dog,turtle

The user tries searching for cat AND dog, the results must only return resources that have “cat” AND “dog” within the animals tv. I tried using different combination of AdvSearch queries and I didn’t get any results. However, I got the “cat” OR “dog” condition to work.

Here in my setup. I’m using Formit to post my Advsearch Results Page. I wrote a hook that searches for “AND”/“OR”. If it finds “AND” in the hook->getValue(“animal”), it replaces “AND” with “,”. If it finds “OR”, it explodes every instance of “OR”. Afterwards, I set Formit “redirectParams” to the toJSON($hook->getValues) and redirect to my search results page.

Therefore, for the “AND” search, my $_GET[“animals”] is a string (e.g value = cat, dog) and my “OR” $_GET[“animals”] is an array. Like I said, the “OR” query appears to be working $andConditions[‘tv.animals:REGEXP’] = ‘animals’ but I can’t figure out the “AND” condition

I tried the following
$andConditions[‘tv.animals:REGEXP’] = ‘animals’
$andConditions[‘tv.animals:IN’] = ‘animals’
$andConditions[‘tv.animals:FIND:||’] = ‘animals’
$andConditions[‘tv.animals:LIKE’] = ‘animals’

Nothing hasn’t work. I’m hoping someone out there is familar working with AdvSearch or proficient with MYSQL. Apologies for the super long post but I’m trying to be as clear as possible. Any help will be appreciated.

Normally for AND conditions, you just add another element to the $andConditions array.

//example to search for 'cat' AND 'dog'
$andConditions = array(
    'tv.animals:FIND:||' => 'dog:string',
    'tv.animals:FIND:||' => 'cat:string'
);
//or in your case
$andConditions = array();
$ands = $_GET["animals"];
$ands = explode(',',$ands);
foreach($ands as $and){
    $andConditions['tv.animals:FIND:||'] = $and . ':string';
}

The problem here is, that all the entries in the $andConditions array have the same key, so they get overwritten. Because of this, I don’t think you can simply use the function $hook->setQueryHook(). You’ll have to be more creative.


Here is a possible solution:
Make sure your hook works for a single value (let’s say for cat).
Then log the content of $hook->queryHook['andConditions'] to the error log (after the call to setQueryHook()) to get the processed subquery for this condition.

The code should look something like this:

//if your values are separated by commas then just use 'tv.animals:FIND' instead
$andConditions = array(
    'tv.animals:FIND:||' => 'cat:string'
);

$qhDeclaration = array(
    'qhVersion' => '1.2',
	'andConditions' => $andConditions
);

$hook->setQueryHook($qhDeclaration);

$cond = $hook->queryHook['andConditions'];
foreach($cond as $key => $val){
    $modx->log( modX::LOG_LEVEL_ERROR, 'andConditions[' . $key . ']=' . $val);
}
return true;

The output in the error log like this:

EXISTS( SELECT 1 FROM `modx_site_tmplvar_contentvalues` `tvcv` JOIN `modx_site_tmplvars` tv ON `tv`.`name` = 'animals' AND `tv`.`id` = `tvcv`.`tmplvarid` AND (FIND_IN_SET( 'cat', REPLACE( `tvcv`.`value`, '||', ',' ) )) WHERE `tvcv`.`contentid` = `modResource`.`id` )

Now, to add an additional AND condition (let’s say for dog) take this EXISTS(...) string, replace the word 'cat' with 'dog' and add it to the $hook->queryHook['andConditions'] array.

$hook->queryHook['andConditions'][] = str_replace('cat','dog','EXISTS( ...   ....)');
1 Like

This is awesome. It works. I wouldn’t have thought of this. Now I will build on top of this. Thank you so much. Also, I was using

and it doesn’t work. When I changed it to ‘qhVersion’ => ‘1.2’ as in your example, it worked. Any ideas why this is happening?

The syntax changed from version 1.2 to version 1.3.

I believe in version 1.3 this and-condition

//version 1.2
$andConditions = array(
    'tv.animals:FIND:||' => 'cat:string'
);

should be like this:

// version 1.3
$andConditions = array(
    'tv.animals:FIND' => array('key' => 'cat', 'method' => 'string', 'pattern' => '||')
);

Is there any good documentation for query hooks version 1.3 somewhere? This PDF in core/components/advsearch/docs/advsearchHook120_v1.0.pdf is quite extensive but only covers version 1.2.

ahhh ok…guess I will stick with version 1.2 for now. Unfortunately I cannot find any good documentation for this.

Now I’m trying to do the following combinations. Let say the user wants “cat AND dog OR turtle”. How can I achieve this. I realised that I may have to merge my queries. Here is what I tried

//OR Condition
if (is_array($sanitizedGets['keywords'])){
    //$andConditions['tv.animal:MATCH'] = 'keywords';
    //$modx->log( modX::LOG_LEVEL_ERROR, "values: ".print_r($sanitizedGets['keywords'],true));
                    
    foreach ($sanitizedGets['keywords'] as $key => $value){
           if(stripos($value, ',') !== false){ //AND Condition
                            
                $ands = explode(',',$value);
                $needle =  $ands[0];
                $sql = "EXISTS( SELECT 1 FROM `modx_site_tmplvar_contentvalues` `tvcv` JOIN `modx_site_tmplvars` tv ON `tv`.`name` = 'animal' AND `tv`.`id` = `tvcv`.`tmplvarid` AND (FIND_IN_SET( '$needle', REPLACE( `tvcv`.`value`, '||', ',' ) )) WHERE `tvcv`.`contentid` = `modResource`.`id` ) ";
                            
                 foreach ($ands as $innerkey => $innervalue){
            
                        $hook->queryHook['andConditions'][] = str_replace($needle,trim($innervalue),$sql);
                 }
                            
            }
            else { //No AND condition
                  $value = trim($value);
                  //$modx->log( modX::LOG_LEVEL_ERROR, "values: ".gettype($value));
                  //$andConditions['tv.index:MATCH'] = $value.":string";
                  $hook->queryHook['andConditions'][] = " EXISTS( SELECT 1 FROM `modx_site_tmplvar_contentvalues` `tvcv` JOIN `modx_site_tmplvars` tv ON `tv`.`name` = 'animal' AND `tv`.`id` = `tvcv`.`tmplvarid` AND (`tvcv`.`value` REGEXP '(^|\\\\|)+$value(\\\\||$)+' ) WHERE `tvcv`.`contentid` = `modResource`.`id` )";
                            
            }
    }
}

And got back the following query results

andConditions[0]= EXISTS( SELECT 1 FROM `modx_site_tmplvar_contentvalues` `tvcv` JOIN `modx_site_tmplvars` tv ON `tv`.`name` = 'animal' AND `tv`.`id` = `tvcv`.`tmplvarid` AND (FIND_IN_SET( 'cat', REPLACE( `tvcv`.`value`, '||', ',' ) )) WHERE `tvcv`.`contentid` = `modResource`.`id` ) 

andConditions[1]= EXISTS( SELECT 1 FROM `modx_site_tmplvar_contentvalues` `tvcv` JOIN `modx_site_tmplvars` tv ON `tv`.`name` = 'animal' AND `tv`.`id` = `tvcv`.`tmplvarid` AND (FIND_IN_SET( 'dog', REPLACE( `tvcv`.`value`, '||', ',' ) )) WHERE `tvcv`.`contentid` = `modResource`.`id` ) 

andConditions[2]= EXISTS( SELECT 1 FROM `modx_site_tmplvar_contentvalues` `tvcv` JOIN `modx_site_tmplvars` tv ON `tv`.`name` = 'animal' AND `tv`.`id` = `tvcv`.`tmplvarid` AND (`tvcv`.`value` REGEXP '(^|\\|)+turtle(\\||$)+' ) WHERE `tvcv`.`contentid` = `modResource`.`id` )

How should this search be interpreted? As “cat AND (dog OR turtle)” or as “(cat AND dog) OR turtle”.

For the first case you can probably just combine two EXISTS-subqueries into 1 entry of $hook->queryHook['andConditions']:

( EXISTS( ... 'dog' ... ) OR EXISTS( ... 'turtle' ... ) )

For the second case you probably have to combine all three of them.

( EXISTS( ... 'cat' ... ) AND EXISTS( ... 'dog' ... ) OR EXISTS( ... 'turtle' ... ) )

The search should be interpreted as “(cat AND dog) OR turtle”. After messing around abit, I found out that the string (FIND_IN_SET( '$variable', REPLACE( `tvcv`.`value`, '||', ',' ) )) could be merged as one string instead of using 2 $hook->queryHook['andConditions'] lines. So here I coded it for “(cat AND dog) OR turtle”

if (is_array($sanitizedGets['keywords'])){
   foreach ($sanitizedGets['keywords'] as $key => $value){
        if(stripos($value, ',') !== false){ //AND Condition
                            
            $ands = explode(',',$value);
            $searchsql = array();
                            
            foreach ($ands as $innerkey => $innervalue){
                $needle = trim($innervalue);
                $searchsql[] = "(FIND_IN_SET( '$needle', REPLACE( `tvcv`.`value`, '||', ',' ) ))";
            }
            $final_string = implode(" AND ", $searchsql);
            $or[] = "EXISTS( SELECT 1 FROM `modx_site_tmplvar_contentvalues` `tvcv` JOIN `modx_site_tmplvars` tv ON `tv`.`name` = 'animal' AND `tv`.`id` = `tvcv`.`tmplvarid` AND $final_string WHERE `tvcv`.`contentid` = `modResource`.`id` )";
        }
        else { //No AND condition
            $value = trim($value);
            $or[] = " EXISTS( SELECT 1 FROM `modx_site_tmplvar_contentvalues` `tvcv` JOIN `modx_site_tmplvars` tv ON `tv`.`name` = 'animal' AND `tv`.`id` = `tvcv`.`tmplvarid` AND (`tvcv`.`value` REGEXP '(^|\\\\|)+$value(\\\\||$)+' ) WHERE `tvcv`.`contentid` = `modResource`.`id` )";
         }
   }
   $final_sql = implode(" OR ", $or);
   $hook->queryHook['andConditions'][] = $final_sql;
}

Which results with

EXISTS( SELECT 1 FROM `modx_site_tmplvar_contentvalues` `tvcv` JOIN `modx_site_tmplvars` tv ON `tv`.`name` = 'animal' AND `tv`.`id` = `tvcv`.`tmplvarid` AND (FIND_IN_SET( 'cat', REPLACE( `tvcv`.`value`, '||', ',' ) )) AND (FIND_IN_SET( 'dog', REPLACE( `tvcv`.`value`, '||', ',' ) )) WHERE `tvcv`.`contentid` = `modResource`.`id` ) OR  EXISTS( SELECT 1 FROM `modx_site_tmplvar_contentvalues` `tvcv` JOIN `modx_site_tmplvars` tv ON `tv`.`name` = 'animal' AND `tv`.`id` = `tvcv`.`tmplvarid` AND (`tvcv`.`value` REGEXP '(^|\\|)+turtle(\\||$)+' ) WHERE `tvcv`.`contentid` = `modResource`.`id` )

This code works for the following combinations (also in reverse order)
turtle OR cat AND dog = turtle OR (cat AND dog)
turtle OR fish OR cat AND dog = turtle OR fish OR (cat AND dog)
turtle AND fish OR cat AND dog = (turtle AND fish) OR (cat AND dog)

I believe you should wrap $final_sql in brackets ("(" . $final_sql . ")") at the end.
Otherwise the search result will contain wrong results (like resources that are unpublished or deleted for example).


The conditions (`tvcv`.`value` REGEXP '(^|\\|)+turtle(\\||$)+' ) and (FIND_IN_SET( 'turtle', REPLACE( `tvcv`.`value`, '||', ',' ) )) are interchangeable. You don’t have to use one for ANDs and the other for ORs.


You probably can simplify the code further and combine everything into one EXISTS subquery:

EXISTS( 
	SELECT 1 
	FROM `modx_site_tmplvar_contentvalues` `tvcv` 
	JOIN `modx_site_tmplvars` tv 
	ON `tv`.`name` = 'animal' 
		AND `tv`.`id` = `tvcv`.`tmplvarid` 
	WHERE `tvcv`.`contentid` = `modResource`.`id`
    AND (
		FIND_IN_SET( 'cat', REPLACE( `tvcv`.`value`, '||', ',' ) )
		AND FIND_IN_SET( 'dog', REPLACE( `tvcv`.`value`, '||', ',' ) )
		OR FIND_IN_SET( 'turtle', REPLACE( `tvcv`.`value`, '||', ',' ) )
	)
)

If you still use a FormIt-hook and a redirect (as you described above), then you can probably get rid of it and code everything you need in this query hook.

Ok. Understood and I followed your suggestion. It works.

Another question, now I am trying to search the content field for a word or phrase. For example: “the boy with the girl” or “the boy” or “boy”. I tried several combinations but I got either no result or all of the results. Here is my attempt.

$andConditions['modResource.content:REGEXP'] = 'summary';
$andConditions = array(
     'modResource.content:FIND' => 'summary',
);

Both results returned all of the pages. Why is that? And how should I search the content field for a word or phrase?

Are you sure, you need a query hook for this? By default AdvSearch should search the content field (and other fields) for the existence of the search term you provide in the REQUEST parameter search.


In the query hook you should be able to search for “boy” with this line

$andConditions['modResource.content:LIKE'] = '%boy%:string';

FIND is only for searching in a set (comma-separated list).
REGEXP somehow generates an xpdo parsing error in the modx error log.

Honestly, I am not sure if it is necessary. The current solution that I am using is ONLY searching a tv for keywords (cat, dog, bird, etc) but I’m thinking that it should also check the content field for these keywords as well. Therefore, if the tv OR content field has in the keywords, the search should return results.What I tried to do for “cat AND dog” is this:

andConditions[0]=EXISTS( SELECT 1 FROM `modx_site_tmplvar_contentvalues` `tvcv` JOIN `modx_site_tmplvars` tv ON `tv`.`name` = 'index' AND `tv`.`id` = `tvcv`.`tmplvarid` AND ( FIND_IN_SET( 'cat', REPLACE( `tvcv`.`value`, '||', ',' ) ) AND FIND_IN_SET( 'dog', REPLACE( `tvcv`.`value`, '||', ',' ) ) ) WHERE `tvcv`.`contentid` = `modResource`.`id` )
andConditions[1]=(`modResource`.`content` LIKE '%cat and dog%')

Unfortunately, it doesn’t work. So then I tried to merge the 2 lines in 1:

andConditions[0]=EXISTS( SELECT 1 FROM `modx_site_tmplvar_contentvalues` `tvcv` JOIN `modx_site_tmplvars` tv ON `tv`.`name` = 'index' AND `tv`.`id` = `tvcv`.`tmplvarid` AND ( FIND_IN_SET( 'cat', REPLACE( `tvcv`.`value`, '||', ',' ) ) AND FIND_IN_SET( 'dog', REPLACE( `tvcv`.`value`, '||', ',' ) ) OR (`modResource`.`content` LIKE '%cat and dog%')) WHERE `tvcv`.`contentid` = `modResource`.`id` )

What would you recommend to achieve this? Also, if the keywords are found within the content field, is it possible to count the number of occurrences found in the content field and to highlight at least 1 occurence?

Really appreciate your guidance with this. Hopefully, this thread will be useful to others who may need to use AdvSearch.

I believe your andCondition should look like this:

(
	EXISTS( 
		SELECT 1 
		FROM `modx_site_tmplvar_contentvalues` `tvcv` 
		JOIN `modx_site_tmplvars` tv 
		ON `tv`.`id` = `tvcv`.`tmplvarid`
		WHERE `tv`.`name` = 'index' 
		AND `tvcv`.`contentid` = `modResource`.`id`
		AND ( 
			FIND_IN_SET( 'cat', REPLACE( `tvcv`.`value`, '||', ',' ) ) 
			AND FIND_IN_SET( 'dog', REPLACE( `tvcv`.`value`, '||', ',' ) ) 
		)
	) OR (
		`modResource`.`content` LIKE '%cat%'
		AND `modResource`.`content` LIKE '%dog%'
	)
)

If your REQUEST parameter search has the value “cat dog” (search terms separated by a space), then you should get an extract (with a highlighted search term).

I don’t think there is a way to output the number of occurrences without changing the code of the extra.

In this line of the function _getExtracts() the variable $nbr contains the number of occurrences for each search term. But the value is never stored or output.