3

I'm trying to join two tables together based on an ID column. The join is not working successfully because I cannot join a varchar column on an integer column, despite using cast().

In the first table, the ID column is character varying, in the format of: XYZA-123456. In the second table, the ID column is simply the number: 123456.

-- TABLE 1
create table fake_receivers(id varchar(11));

insert into fake_receivers(id) values 
('VR2W-110528'),
('VR2W-113640'),
('VR4W-113640'),
('VR4W-110528'),
('VR2W-110154'),
('VMT2-127942'),
('VR2W-113640'),
('V16X-110528'),
('VR2W-110154'),
('VR2W-110528');

-- TABLE 2
create table fake_stations(receiver_sn integer, station varchar);

insert into fake_stations values 
('110528', 'Baff01-01'),
('113640', 'Baff02-02'),
('110154', 'Baff03-01'),
('127942', 'Baff05-01');

My solution is to split the string at the dash, take the number after the dash, and cast it as an integer, so that I may perform the join:

select cast(split_part(id, '-', 2) as integer) from fake_receivers; -- this works fine, seemingly selects integers

However, when I actually attempt to perform the join, I'm getting the following error, despite using an explicit cast:

select cast(split_part(id, '-', 2) as integer), station
        from fake_receivers 
        inner join fake_locations 
        on split_part = fake_locations.receiver_sn -- not recognizing split_part cast as integer! 

>ERROR: operator does not exist: character varying = integer
>Hint: No operator matches the given name and argument type(s). You might need to add explicit type casts.

Strangely enough, I can perform this join with my full dataset (a queried result set shows up) but I then can't manipulate it at all (e.g. sorting, filtering it) - I get an error saying ERROR: invalid input syntax for integer: "UWM". The string "UWM" appears nowhere in my dataset or in my code, but I strongly suspect it has to do with the split_part cast from varchar to integer going wrong somewhere.

-- Misc. info
select version();
>PostgreSQL 10.5 on x86_64-apple-darwin16.7.0, compiled by Apple LLVM version 9.0.0 (clang-900.0.39.2), 64-bit

EDIT: dbfiddle exhibiting behavior

4
  • NOTE to anybody reading this at the moment: minor errors in my code above which I will fix tomorrow, but Tim's answer below uses correct code that correctly reproduces & then subsequently fixes the issue. Commented Nov 20, 2018 at 3:08
  • "Smart keys"--your fake_receivers id values--are a design antipattern. Commented Nov 20, 2018 at 3:50
  • Not sure I understand, can you explain? If you're referring to the fact that there are letters & dash before the number - nothing I can do about that, that's the way the sensor hardware spits out the data; in the second table, we don't always know the letter code associated with a serial number, but always know the serial number and that the serial numbers will always line up. Not ideal but splitting the string in the first table is the only option I have. Commented Nov 20, 2018 at 17:15
  • 1
    Whatever raw sensor data we log, we can extract appropriate data into a more suitable schema. Re "smart keys". Commented Nov 21, 2018 at 8:02

1 Answer 1

5

You need to include your current logic directly in the join condition:

select *
from fake_receivers r
inner join fake_stations s
    on split_part(r.id, '-', 2)::int = s.receiver_sn;

Demo

Sign up to request clarification or add additional context in comments.

2 Comments

That definitely helps. I'm still getting the mysterious ERROR: invalid input syntax for integer: "UWM" in my full db - will update the above fiddle with more complex data to see if I can recreate it.
Then most likely your data has some problem.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.