
I have data like

http://www.linz.at/politik_verwaltung/32386.asp

stored in a text column. I thought a non-greedy extraction with

select substring(turl from '\..*?$') as ext from tdata

would give me .asp, but instead it still greedily results in

 .linz.at/politik_verwaltung/32386.asp

How can I match only against the last occurrence of the dot (.)?
Using PostgreSQL 9.3.

  • Could you provide an example of the expected output? Commented Feb 20, 2015 at 7:58
  • .asp is what you expect, right? Commented Feb 20, 2015 at 8:02
  • Sorry for being imprecise. Yes, .asp is what I expect. Commented Feb 20, 2015 at 12:51

2 Answers


\.[^.]*$ matches . followed by any number of non-dot characters followed by end-of-string:

# select substring('http://www.linz.at/politik_verwaltung/32386.asp' 
  from '\.[^.]*$');
 substring 
-----------
 .asp
(1 row)

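As for why the non-greedy quantifier does not help here: the regex engine still starts matching at the earliest possible position, which is the first dot. Non-greediness only makes the match as short as possible from that starting point, and the $ anchor forces it to run to the end of the string anyway.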
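
To make the difference concrete, the two patterns can be run side by side on the sample URL from the question. This is only a sketch: the results noted in the comments are the ones reported in the question and in this answer, and the last query assumes the table tdata and the column turl named in the question.

```sql
-- Non-greedy attempt: the match still starts at the first dot, and $
-- forces it to extend to the end of the string.
-- Result reported in the question: .linz.at/politik_verwaltung/32386.asp
select substring('http://www.linz.at/politik_verwaltung/32386.asp'
                 from '\..*?$') as ext;

-- No dots allowed after the matched dot: only the last dot qualifies.
-- Result shown above: .asp
select substring('http://www.linz.at/politik_verwaltung/32386.asp'
                 from '\.[^.]*$') as ext;

-- Applied to the table from the question (tdata.turl).
select substring(turl from '\.[^.]*$') as ext from tdata;
```

Note that substring() returns NULL for rows where the pattern does not match at all, for example a value that contains no dot.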


Try this:

\.[\w]*$

Here is how it works:

It matches a literal dot (\.) followed by any number (*) of word characters (\w) up to the end of the string ($), so the result is the final dot plus the word characters after it.

Note: the answer has been updated; the pattern now also matches strings that end with a dot (.).
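
One practical difference from the \.[^.]*$ pattern may be worth illustrating: \w only covers letters, digits and the underscore, so any other character after the last dot prevents a match. The URLs below are made-up examples, not data from the question, and the results given in the comments follow from the pattern semantics rather than from a recorded session.

```sql
-- Hypothetical URL with a query string after the extension.
-- '?' and '=' are not word characters, so \.[\w]*$ finds no match (NULL),
-- while \.[^.]*$ returns everything from the last dot on: .asp?id=1
select substring('http://example.com/page.asp?id=1' from '\.[\w]*$') as word_chars,
       substring('http://example.com/page.asp?id=1' from '\.[^.]*$') as no_dots;

-- Hypothetical string ending in a dot: * allows zero characters,
-- so both patterns return just the dot.
select substring('http://example.com/page.' from '\.[\w]*$') as word_chars,
       substring('http://example.com/page.' from '\.[^.]*$') as no_dots;
```

Which behaviour is preferable depends on the data: the \w variant is stricter about what counts as an extension, at the cost of returning NULL for URLs with trailing query strings or fragments.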

5 Comments

I don't see any good reason to wrap \w in a character class.
@mickmackusa This is a template, as one can use a more specific set of characters inside.
But do you see how an unknowing user might assume that the square braces are necessary, because you don't say that they're optional?
@mickmackusa Well, from my point of view, it's better to be more explicit in the code than to rely on implicit logic that is hidden from a beginner.
In that case, I should clarify for readers that the square braces serve no purpose in this pattern and can be safely removed. If anyone wants to extend the literal dot to a set of several characters, that set should be wrapped in square braces too, and then the escaping of the dot could be removed.
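
The point made in these comments can be checked directly. As a sketch, the three spellings below should behave identically on the URL from the question: the pattern as posted, the same pattern without the square braces around \w, and a variant with the dot placed in a bracket expression instead of being escaped, which is the form the last comment alludes to.

```sql
-- All three columns should come back as .asp for this input.
select substring('http://www.linz.at/politik_verwaltung/32386.asp'
                 from '\.[\w]*$') as bracketed,        -- pattern as posted
       substring('http://www.linz.at/politik_verwaltung/32386.asp'
                 from '\.\w*$')   as bare,             -- square braces removed
       substring('http://www.linz.at/politik_verwaltung/32386.asp'
                 from '[.]\w*$')  as dot_in_brackets;  -- dot in a bracket expression, no escape
```

The bracket expression only becomes useful once it holds more than one item, for example [\w-] to also allow hyphens after the dot.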
