
Could you please help me optimize this MySQL query, or suggest another way of writing it? It runs, but it takes too much time.

SELECT DISTINCT
    A.item1,
    A.item2,
    A.item3
FROM tableAA AS A
INNER JOIN tableBB AS B ON A.item4 = B.item4
INNER JOIN tableCC AS C ON A.item4 = C.item4
INNER JOIN tableDD AS D ON A.item4 = D.item4
WHERE (B.item5 = '$selected1' AND
       B.item6 LIKE '%$selected2%' AND
       C.item7 = '$selected3')
   OR (A.item8 LIKE '%$selected2%' AND
       D.item5 = '$selected1' AND
       C.item7 = '$selected3')

Is there any other way of writing this query?

edit: tableAA, tableBB, and tableCC contain billions of entries; I think I have indexed the tables correctly.

edit2: I don't think the problem is in the LIKE % condition. I switched it off and ran the query again, but it is still taking a long time, around 565 seconds.

  • Please look at similar questions and provide the kind of data that other people provide. Commented Sep 18, 2013 at 14:36
  • There are many ways to rewrite a query. Please include the EXPLAIN for the query to provide information about what it is doing inefficiently. Commented Sep 18, 2013 at 14:36
  • When string matching like that, you often need to index the column you are matching on to improve performance. Commented Sep 18, 2013 at 14:37
  • Could you provide a little more detail in your SELECT list about ALL the final columns you expect out? Your joins appear to REQUIRE a match in every table (B, C, D), since item4 must exist in each respective table. Or should these be LEFT JOINs, where item4 may NOT exist in the other table(s)? Commented Sep 18, 2013 at 15:11
  • If you want us to help optimize a query, you need to show us the table and index definitions, as well as row counts for each of the tables. Maybe your tables are defined poorly. Maybe the indexes aren't created correctly. Maybe you don't have an index on that column you thought you did. Without seeing the table and index definitions, we can't tell. We also need row counts because that can affect query optimization greatly. If you know how to do an EXPLAIN or get an execution plan, put the results in the question as well. Commented Sep 18, 2013 at 20:39

1 Answer


A few obvious things to do.

1) Index item4 on tables A, B, C, and D. Generally, indexes speed up reads (WHERE filters and joins) and slow down writes.

2) Index the columns you do a lot of straight equality WHEREs on, such as B.item5. But not B.item6, where an index will change nothing for your LIKE matching (a sketch of the CREATE INDEX statements follows after point 3).

3) This last one takes more time, but it's very important if you have to search using LIKE '%$key%', because such searches are very time-consuming: pretty much every character of every string in the column is compared to your search string. Create tables that list the search keywords for B.item6 and A.item8, index them, and search through those instead. If you do, you'll make an index lookup (effectively a binary search) possible on these columns, which is an enormous change.
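
For points 1 and 2, a minimal sketch of the index statements, based only on the joins and equality filters in the query above; the index names are illustrative and the best column order depends on your data:

CREATE INDEX idx_a_item4       ON tableAA (item4);
CREATE INDEX idx_b_item4_item5 ON tableBB (item4, item5);
CREATE INDEX idx_c_item4_item7 ON tableCC (item4, item7);
CREATE INDEX idx_d_item4_item5 ON tableDD (item4, item5);

A composite index such as (item4, item5) lets MySQL satisfy both the join and the equality filter from a single index; an index on item6 alone would not help the leading-wildcard LIKE.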

======== EDIT: you say indexing is done, so more on point 3 above. ============

Suppose you have one table to search:

table A
-------
k (PK)
s (big string)

And you want to do:

select k
from A
where s like '%$keyword%'

With s being a long string, scanning every row's s for the keyword, which could be anywhere in the string, is going to be hugely time-consuming.

So we create one more table:

table s_words
-------------
word  (indexed; leading column of the key)
k     (FK to A.k)

s_words is a list of the words found in s, with the key k pointing back to the corresponding row in table A. Each word can occur many times in the table, once for every row of A whose s contains it. To populate s_words, you take every string in s, split it into words, and add each word (together with that row's k) to the table; a sketch of the definition follows below.
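
A minimal sketch of the DDL, assuming k is an integer key in A (names and types are illustrative):

-- illustrative definition; adjust the types to match table A
CREATE TABLE s_words (
    word VARCHAR(64) NOT NULL,
    k    INT         NOT NULL,   -- points back to A.k
    PRIMARY KEY (word, k)        -- word is the leading column, so WHERE word = ... can use the index
);

Populating and maintaining it is typically application-side work (split each s into words on insert or update and write the (word, k) pairs), since MySQL has no built-in word-splitting function.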

Now your query is:

select k
from s_words
where word = '$keyword'

Now word can be indexed, so there's no need for character-by-character matching of the whole string against the user's keyword. If you also need columns from A, join s_words back to A on k, as in the sketch below.
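
A sketch of that join back, assuming k is the indexed primary key of A (the '$keyword' interpolation is kept from the original for illustration; a parameterized query would be safer):

select A.k, A.s
from s_words as w
inner join A on A.k = w.k
where w.word = '$keyword'

The equality test on the indexed word column is what turns the full scan into an index lookup.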


5 Comments

Point 3 is exactly the way to go. Make a separate table with the unique values of item6 and item8. You will need to do two queries: one for the exact strings in that new table, and a second against your old table with exact string matching. Most of the work is keeping that table up to date (that maintenance will also take time with billions of entries), but it will speed up the original SELECT considerably.
Just read your edit, but the problem is not "big strings". This will also happen with small strings. The problem is the billions and billions of entries that need to be checked with LIKE %string%. No index will help there. Hence the solution: create a separate table with unique entries (probably much smaller because of the deduplication) from item6 and item8 and check those first.
@Rik, I think the complexity is O(N * (|s| - |keyword|)), so long strings and the number of rows both affect the result. But if you have billions of rows, then yes, that's where the problem is. If binary search is used, a search through billions of rows takes 40-50 operations.
But I don't think the problem is in the LIKE % condition. I switched it off and ran the query again, but it is still taking a long time, around 565 seconds.
In that case, do as the other comments suggested and provide the EXPLAIN.
