
I have a table music:

author                |  music
----------------------+-------
Kevin Clein           |   a
Gucio G. Gustawo      |   b
R. R. Andrzej         |   c
John McKnight Burman  |   d

How can I split a column whose values contain two different separators (space and dot), and how can I split name and surname correctly to get a result like this:

author                | name     | surname
----------------------+----------+----------------
Kevin Clein           | Kevin    | Clein
Gucio G. Gustawo      | Gucio G. | Gustawo
R. R. Andrzej         | R. R.    | Andrzej
John McKnight Burman  | John     | McKnight Burman

I have tried something like this so far:

WITH ad AS (
  SELECT author,
         s[1] AS name,
         s[2] AS surname
    FROM (SELECT music.*,
                 regexp_split_to_array(music.author, E'\\s[.]') AS s
            FROM music) t
)
SELECT * FROM ad;
  • I see two problems with finding a pattern in your data set. First, Joahim van der Aber: no rule (that you have explained) handles the van der Aber part. Second, the line for Donald Frut is not a case for a regex. Regex is about pattern matching, so define your pattern and we can help. Also, and most important: show some effort. What have you tried? SO is not a coding service, always remember that. Commented Feb 25, 2016 at 17:58
  • How can I write a regex to get the result as it is shown now, after the edit? @JorgeCampos Commented Feb 25, 2016 at 18:05
  • What are the rules for deciding that John McKnight Burman must be John as name and McKnight Burman as surname, and not John McKnight as name and Burman as surname? You see the problem? If there are no rules that resolve the pattern, it is almost impossible to solve, unless you have a list defining specific cases like this one. Commented Feb 25, 2016 at 18:16
  • Also it is possible that you don't need regex at all, depending on your rules. Commented Feb 25, 2016 at 18:17
  • I'm new to PostgreSQL and I am learning how to split those patterns. I don't even know how to declare rules; I'm just trying to solve the problem that I have. I also had a pattern that split on spaces, but it did not work, because I lost part of the string whenever it had more than 3 parts. Is it possible to have a middle name column and treat McKnight as a middle name? Thanks Commented Feb 25, 2016 at 18:24

1 Answer

I've created a possible solution for you. Be aware that it may not cover every case, and that you will need an extra table to deal with the rules problem. By rule I mean what I said in the comments, such as:

How to decide which part is the name and which is the surname.

So, to solve your problem, I had to create another table that holds the names which should be treated as surnames.

The test case scenario:

create table surname (
  id SERIAL NOT NULL primary key,
  sample varchar(100)
);

--Test case inserts
insert into surname (sample) values ('McKnight'), ('McGregory'), ('Willian'), ('Knight');

create table music (
  id SERIAL NOT NULL primary key,
  author varchar(100)
);

insert into music (author) values
('Kevin Clein'),
('Gucio G. Gustawo'),
('R. R. Andrzej'),
('John McKnight Burman'),
('John Willian Smith'),
('John Williame Smith');

And my proposed solution:

select author,
       trim(replace(author, surname, '')) as name,
       surname
  from (
    select author,
           -- a known surname occurs in the author: split just before it
           case when position(s.sample in m.author) > 0
                then (regexp_split_to_array(m.author, '\s(?=' || s.sample || ')'))[2]::text
                -- otherwise take the last word as the surname
                else trim(substring(author from '\s\w+$'))
           end as surname
      from music m left join surname s
        on m.author like '%' || s.sample || '%'
     -- drop rows where the sample matched only inside a longer word (Knight in McKnight)
     where case when position(s.sample in m.author) > 0
                then (regexp_split_to_array(m.author, '\s(?=' || s.sample || ')'))[2]::text
                else trim(substring(author from '\s\w+$')) end is not null
       ) as x

The output will be:

   AUTHOR              NAME             SURNAME
------------------------------------------------------------   
Kevin Clein            Kevin            Clein
Gucio G. Gustawo       Gucio G.         Gustawo
R. R. Andrzej          R. R.            Andrzej
John McKnight Burman   John             McKnight Burman
John Willian Smith     John             Willian Smith
John Williame Smith    John Williame    Smith

See it working here: http://sqlfiddle.com/#!15/c583f/2

In the surname table you insert all the names that should be treated as surnames.
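
One thing to watch when filling that table: the join in the query above uses like '%...%', so an entry matches anywhere in the author string. If an entry matches only in the middle of a word (say a hypothetical entry 'van' against a hypothetical author 'Ivan Petrov'), the split has no second element and the where clause removes that author from the result. A possible variation, offered only as a sketch, is to join on a regular expression that requires whitespace before the entry, so authors without a real match still fall through to the last-word rule. It assumes the entries in surname contain no regex metacharacters.

```sql
-- Sketch: join only when the sample starts a word after whitespace,
-- so a mid-word hit can no longer remove an author from the result.
select author,
       trim(replace(author, surname, '')) as name,
       surname
  from (
    select m.author,
           case when s.sample is not null
                -- split just before the known surname, keep the right-hand part
                then (regexp_split_to_array(m.author, '\s(?=' || s.sample || ')'))[2]
                -- no known surname: fall back to the last word
                else trim(substring(m.author from '\s\w+$'))
           end as surname
      from music m
      left join surname s
        on m.author ~ ('\s' || s.sample)
       ) as x;
```

On the test data above this should split the six authors the same way. Two differences from the original query: an author consisting of a single word is kept with a null name and surname instead of being dropped, and an author that matches two entries in surname still produces two rows.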

You may want to wrap the query that evaluates the case expression in a subquery, so that the where clause can reference the resulting column instead of repeating the whole case expression.
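
A minimal sketch of that refactoring, assuming the same music and surname tables as in the test case: the case expression is evaluated once in the derived table x, and the null filter moves to the outer query, where it can use the surname column directly.

```sql
-- Sketch: the case expression appears only once; the outer query filters on its alias.
select x.author,
       trim(replace(x.author, x.surname, '')) as name,
       x.surname
  from (
    select m.author,
           case when position(s.sample in m.author) > 0
                then (regexp_split_to_array(m.author, '\s(?=' || s.sample || ')'))[2]
                else trim(substring(m.author from '\s\w+$'))
           end as surname
      from music m
      left join surname s
        on m.author like '%' || s.sample || '%'
       ) as x
 where x.surname is not null;
```

The filter is the same as before, only applied one level up, so the rows returned should match those of the original query.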


2 Comments

Thank you for the answer. I will try to implement it and check how it will work :)!
If you have any doubts, just ask here and I will edit the answer with more explanation.
