Reverse strings in SQL Server

Question

I have a table with product values as below:

apple iphone
iphone apple
samsung phone
phone samsung

I want to delete the products whose names are the exact word-for-word reverse of another product (I consider them duplicates), so that instead of 4 records my table has just 2:

apple iphone
samsung phone

I understand that there is a REVERSE function in SQL Server, but it reverses the whole string character by character, and that's not what I'm looking for.

I'd greatly appreciate any suggestions/ideas.

I understand it when you call iPhones that, since Apple has only one brand... But having had to support Galaxy S/Y/II/III/IV/Grand Duos/Grand Quattro/Win/Note/Note 2/Tab/Tab 2 7.0", I think "Samsung Phone" is calling a lot of different things by the same name...
– Geeky Guy, Commented Aug 23, 2013 at 18:03
Great! Can you please show those cases too, instead of just the simplest? When you only show the simplest scenario, people tend to solve for that, and then you have to come back and say "but it didn't work for..." - ask the whole question up front, please.
– Aaron Bertrand, Commented Aug 23, 2013 at 18:04
Are these keyed-in strings? I think you may be looking for approximate string matching algorithms, not word reversal.
– transistor1, Commented Aug 23, 2013 at 18:19

Community · Accepted Answer · 2017-05-23 12:22:17Z

Assuming that your dictionary does not include any characters that are special in XML (e.g. >, < or &), and that it is not practical to manually create a bunch of UPDATE statements for every combination of words in your table (if it is practical, then simplify your life, stop reading this answer, and use Justin's answer), you can create a function like this:

CREATE FUNCTION dbo.SplitSafeStrings
(
   @List       NVARCHAR(MAX),
   @Delimiter  NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
   RETURN 
   ( SELECT Item = LTRIM(RTRIM(y.i.value('(./text())[1]', 'nvarchar(4000)')))
     FROM ( SELECT x = CONVERT(XML, '<i>' 
          + REPLACE(@List, @Delimiter, '</i><i>') + '</i>').query('.')
      ) AS a CROSS APPLY x.nodes('i') AS y(i));
GO

(If XML is a problem, there are other, more complex alternatives, such as CLR.)

Then you can do this:

DECLARE @x TABLE(id INT IDENTITY(1,1), s VARCHAR(64));

INSERT @x(s) VALUES
  ('apple iphone'),
  ('iphone Apple'),
  ('iphone samsung hoochie blat'),
  ('samsung hoochie blat iphone');

;WITH cte1 AS 
(
  SELECT id, Item FROM @x AS x
  CROSS APPLY dbo.SplitSafeStrings(LOWER(x.s), ' ') AS y
),
cte2(id,words) AS 
(
  SELECT DISTINCT id, STUFF((SELECT ',' + orig.Item 
    FROM cte1 AS orig
    WHERE orig.id = cte1.id
    ORDER BY orig.Item
    FOR XML PATH(''), TYPE).value('.[1]','nvarchar(max)'),1,1,'')
  FROM cte1
),
cte3 AS 
(
  SELECT id, words, rn = ROW_NUMBER() OVER (PARTITION BY words ORDER BY id)
  FROM cte2
)
SELECT id, words, rn FROM cte3
-- WHERE rn = 1 -- rows to keep
-- WHERE rn > 1 -- rows to delete
;

So you could, after the three CTEs, instead of the final SELECT above, say:

DELETE t FROM @x AS t
  INNER JOIN cte3 ON cte3.id = t.id
  WHERE cte3.rn > 1;

And what should be left in @x?

SELECT id, s FROM @x;

Results:

id  s
--  ---------------------------
1   apple iphone
3   iphone samsung hoochie blat
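
Note that comparing sorted word lists treats any reordering of the same words as a duplicate, not only an exact reversal: rows 3 and 4 in the sample data are rotations of each other, not reverses. If only a strict word-order reversal should count, the comparison would need to be stricter. Here is a minimal client-side sketch of the difference, in Python rather than T-SQL; the function names are hypothetical and it assumes words are separated by whitespace and compared case-insensitively:

```python
def words(name: str) -> list:
    """Lower-case a product name and split it into words on whitespace."""
    return name.lower().split()


def is_word_reverse(a: str, b: str) -> bool:
    """True only if b is a with its word order exactly reversed."""
    return words(a) == list(reversed(words(b)))


def same_words(a: str, b: str) -> bool:
    """True if a and b contain the same words in any order (the sorted-list test)."""
    return sorted(words(a)) == sorted(words(b))


# A plain reversal satisfies both tests.
print(is_word_reverse("apple iphone", "iphone Apple"))
print(same_words("apple iphone", "iphone Apple"))

# A rotation satisfies only the looser sorted-list test.
print(is_word_reverse("iphone samsung hoochie blat", "samsung hoochie blat iphone"))
print(same_words("iphone samsung hoochie blat", "samsung hoochie blat iphone"))
```

Which test is appropriate depends on whether reordered-but-not-reversed names should also be treated as duplicates.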

Justin Pihony · Accepted Answer · 2013-08-23 18:02:04Z

5

It seems to me that you are overcomplicating this; a simple UPDATE statement would work:

UPDATE table SET productname = 'apple iphone' WHERE productname = 'iphone apple'

answered Aug 23, 2013 at 18:02


9 Comments

Aaron Bertrand Over a year ago

That assumes you know all of the possible combinations and it isn't too tedious to write all of those commands (what if there are thousands?). Also should be = 'apple iphone' - single quotes are string delimiters in T-SQL, double quotes are not. As an aside, how did you have an up-vote when your answer was exactly 3 seconds old?

Geeky Guy Over a year ago

@AaronBertrand I upvoted it. And it's been single quotes there from the beginning.

Justin Pihony Over a year ago

First, yes, it assumes that. Second, you are correct, fixed. Third, I don't know.

Aaron Bertrand Over a year ago

@Renan no, it has not. The original version had "apple iphone" but that won't show in the revision history because he fixed it during the grace period.

Justin Pihony Over a year ago

@AaronBertrand Also, a syntax error in an example should not be a reason not to upvote something, IMO. In most cases, the answer should be more of a guide to help. It can be commented on to be fixed, or even fixed directly.


shieldgenerator7 · Accepted Answer · 2013-08-23 18:06:50Z

3

I don't know how to do this in SQL, but in a language where you interface with SQL, you can do this:

You can tokenize each line into an array of words, so that "iphone apple" becomes {"iphone","apple"}. Then swap the order of the elements so that it becomes {"apple","iphone"}, and join the array back into a string to get "apple iphone".

Although the process described above isn't hard to do, finding out which rows are duplicates of each other (and so knowing which ones to flip) might be a harder problem.
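
One way to sidestep the "which ones to flip" problem is to give every row a canonical key (its words, lower-cased and sorted) and treat rows that share a key as duplicates. The following is only a sketch in Python under that assumption; the function names and the hard-coded (id, name) rows are hypothetical stand-ins for whatever your database driver returns:

```python
from collections import defaultdict


def canonical_key(name: str) -> str:
    """Lower-case the name, split it on whitespace, and sort the words."""
    return " ".join(sorted(name.lower().split()))


def find_duplicates(rows):
    """Return (ids_to_keep, ids_to_delete) for an iterable of (id, name) rows.

    Rows that share a canonical key are duplicates; the lowest id is kept.
    """
    groups = defaultdict(list)
    for row_id, name in rows:
        groups[canonical_key(name)].append(row_id)

    keep, delete = [], []
    for ids in groups.values():
        ids.sort()
        keep.append(ids[0])
        delete.extend(ids[1:])
    return sorted(keep), sorted(delete)


# Hypothetical rows, using the product names from the question.
rows = [
    (1, "apple iphone"),
    (2, "iphone apple"),
    (3, "samsung phone"),
    (4, "phone samsung"),
]
keep, delete = find_duplicates(rows)
print(keep)    # [1, 3]
print(delete)  # [2, 4]
```

The ids in `delete` could then be passed to a parameterized DELETE statement. Keep in mind that this key treats any reordering of the same words as a duplicate, not only an exact reversal.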

answered Aug 23, 2013 at 18:06


Meldor · Accepted Answer · 2013-08-23 18:41:32Z

2

Based on the sample data you've provided, you could try something like this:

If the "proper" format for productname is <brand> <product_type>, you can just delete all products with productname not like '<brand>%'.

If the above won't help: are there any product naming rules?

Since the idea above cannot be applied, create a Split function:

CREATE FUNCTION [dbo].[Split]
(
    @String NVARCHAR(4000),
    @Delimiter NCHAR(1)
)
RETURNS TABLE 
AS
RETURN 
(
    WITH Split(stpos,endpos) 
    AS(
        SELECT 0 AS stpos, CHARINDEX(@Delimiter,@String) AS endpos
        UNION ALL
        SELECT endpos+1, CHARINDEX(@Delimiter,@String,endpos+1)
            FROM Split
            WHERE endpos > 0
    )
    SELECT 'Id' = ROW_NUMBER() OVER (ORDER BY (SELECT 1)),
        'Data' = SUBSTRING(@String, stpos, COALESCE(NULLIF(endpos, 0), LEN(@String) + 1) - stpos)
    FROM Split
)

And use it in query:

select 
    (SELECT (', ' + Data) 
     FROM Split(t.textVal, ' ')
     order by [Data]
     FOR XML PATH( '' )
    )
from 
    test t

This will give you each product name with its words sorted, which makes duplicates easy to find. The second query is rough around the edges as I have to go AFK, but you should manage to smooth it out :) Good luck

edited Aug 23, 2013 at 18:41

answered Aug 23, 2013 at 18:08


1 Comment

pk188 Over a year ago

There are no product naming rules as such, some other examples can be:"online nokia lumia shop", "shop lumia nokia online"

gordy · Accepted Answer · 2013-08-23 19:02:45Z

2

Here's a solution for two or more words separated by spaces. The idea is to use a recursive CTE to split each name on spaces, and then FOR XML to put the names back together with the words sorted. You can then group by the new name column to get your deduplicated list:

with split as (
  select id,
    convert(varchar(max), left(name, charindex(' ', name + ' ') - 1)) word,
    stuff(name, 1, charindex(' ', name + ' '), '') name
  from products

  union all

  select id,
    convert(varchar(max), left(name, charindex(' ', name + ' ') - 1)) word,
    stuff(name, 1, charindex(' ', name + ' '), '') name
  from split where name > ''
),
hom as (
  select id,
    (select word + ' '
     from split where id=o.id
     order by word for xml path('')) name
  from split o
)

select name, min(id) id from hom group by name

SQLFiddle

edited Aug 23, 2013 at 19:02

answered Aug 23, 2013 at 18:19


2 Comments

Aaron Bertrand Over a year ago

Your SQLfiddle breaks down pretty quickly, if you add a 3rd word (which the OP has indicated). New SQLfiddle

gordy Over a year ago

the solution for 2 or more words will involve a table-valued function.. just a sec
