We would like to do the following:

SELECT * FROM table WHERE char_length(text) >= 15 AND char_length(text) <= 100

However, we want the length check to apply to the 'text' column with hashtags removed. That is, text = 'hello how are you' should match, but text = 'hello #how #are you' should not, because once the hashtags are filtered out, char_length('hello you') is too short.

  • I might have misunderstood you, but why don't you just add: and text not like '%#%' Commented Feb 11, 2018 at 9:11
  • I want to do char_length excluding the hashtags Commented Feb 11, 2018 at 9:17
  • @OleEHDufour Close... but # could be in the middle of a string and not be part of a hashtag. Your suggested LIKE expression would reject every row containing a pound sign. Commented Feb 11, 2018 at 9:17
  • This question does not show any research effort, and there's not even a list of unambiguous sample inputs/outputs. Commented Feb 11, 2018 at 9:21
  • @TimBiegeleisen Well, your answer isn't useful because it doesn't answer OP's (clarified) question. Am I supposed to not downvote bad answers for fear of revenge? Commented Feb 11, 2018 at 9:29

2 Answers

You can remove hashtags from the text and check that the remaining string is within your desired length bounds:

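-- Sample rows; replace the CTE with your own table.
with t as (select 'this is some text' as txt union all
           select 'this is #hashed text - loong' union all
           select 'too short #despite #many #hashtags')
select * from t
-- Strip every hashtag (and any trailing spaces), then check the remaining length.
where length(regexp_replace(txt, '#[a-z]+ *', '', 'g')) between 15 and 100;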

You may need to adjust the regexp #[a-z]+ * so that it recognizes your tags; as written, it only matches a # followed by lowercase letters a-z. Note that the 'g' flag makes regexp_replace replace all occurrences of the regexp, not just the first one; see the PostgreSQL docs.
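As one possible adjustment, the sketch below treats a # as the start of a hashtag only at the beginning of the string or after whitespace, so a # inside a word is left alone. The sample rows and the pattern (^|\s)#\w+ are illustrative assumptions, not part of the original answer; what counts as a hashtag in your data may differ.

```sql
-- Hypothetical sample rows; swap the CTE for your own table and column.
with t(txt) as (
    values ('hello how are you'),
           ('hello #how #are you'),
           ('hello how are you #yolo'),
           ('issue#42 is fixed now')
)
select txt
from t
-- (^|\s) captures the start of the string or one whitespace character,
-- #\w+ matches the tag itself, and '\1' puts the captured whitespace back.
where char_length(regexp_replace(txt, '(^|\s)#\w+', '\1', 'g')) between 15 and 100;
```

Because the whitespace before each tag is kept, the stripped string can contain runs of spaces that still count toward the length. If that matters, collapse them with a second regexp_replace before measuring.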

1 Comment

Nice approach, but be careful that your pattern does not also remove a # that is not associated with a hashtag.

Try using a POSIX regex condition that keeps only rows which do not match the pattern .*([ ]|^)#[a-z].*:

SELECT *
FROM table
WHERE
    char_length(text) >= 15 AND
    char_length(text) <= 100 AND
    text !~* '.*([ ]|^)#[a-z].*';

Demo
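To see how this predicate differs from measuring the length after stripping hashtags, the following sketch evaluates both on a few rows. The sample rows and the column aliases has_no_hashtag and stripped_length are made up for illustration.

```sql
-- Hypothetical sample rows for comparing the two approaches side by side.
with t(txt) as (
    values ('hello how are you'),
           ('hello #how #are you'),
           ('hello how are you #yolo')
)
select txt,
       -- true only when the row contains no hashtag at all
       txt !~* '.*([ ]|^)#[a-z].*' as has_no_hashtag,
       -- length of what remains after every hashtag is removed
       char_length(regexp_replace(txt, '#[a-z]+ *', '', 'g')) as stripped_length
from t;
```

For the last row, has_no_hashtag is false even though the stripped text is long enough, so the !~* filter drops a row that a length check on the stripped text would keep.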

4 Comments

I want to do char_length excluding the hashtags
@Himberjack: the WHERE text !~* condition is already excluding the hashtags. What do you mean by "do char_length excluding the hashtags"?
@Himberjack Check the demo I added. If it fails for some edge cases you have in mind, I can edit my answer.
Your code rejects 'hello how are you #yolo', which OP presumably wants to be included.
