I have a one to many relationship. In this case, it's a pipelines entity that can have many segments. The segments entity has a column to list the wells associated with this pipeline. This column is purely informational, and is only updated from a regulatory source as a comma separated list, so the data type is text.
What I want to do is to list all the pipelines and show the segment column that has the most associated wells. Each well is identified with a standardized land location (text is the same length for each well). I am also doing other aggregate functions on the segments, so my query looks something like this (I have to simplify it because it's pretty large):
SELECT pipelines.*, max(segments.associated_wells), min(segments.days_without_production), max(segments.production_water_m3)
FROM pipelines
JOIN segments ON segments.pipeline_id = pipelines.id
GROUP BY pipelines.id
This selects the associated_wells that has the highest alphabetical value, which makes sense, but is not what I want.
max(length(segments.associated_wells)) will select the record I want, but only show the length. I need to show the column value.
How can I aggregate based on the string length but show the value?
Here's an example of what I am expecting:
Segment entity:
| id | pipeline_id | associated_wells | days_without_production | production_water_m3 |
|----|-------------|--------------------------|-------------------------|---------------------|
| 1 | 1 | 'location1', 'location2' | 30 | 2.3 |
| 2 | 1 | 'location1' | 15 | 1.4 |
| 3 | 2 | 'location1' | 20 | 1.8 |
Pipeline entity:
| id | name |
|----|-------------|
| 1 | 'Pipeline1' |
| 2 | 'Pipeline2' |
| | |
Desired Query Result:
| id | name | associated_wells | days_without_production | production_water_m3 |
|----|-------------|--------------------------|-------------------------|---------------------|
| 1 | 'Pipeline1' | 'location1', 'location2' | 15 | 2.3 |
| 2 | 'Pipeline2' | 'location1' | 20 | 1.8 |
| | | | | |