Why does this code parse a null string instead of a None?

Question

I'm looking to parse a string that can be in one of the following formats:

"[a]"
"[a-b]"
"[a-b/9]"
"[a-b, b-c]"
"[a-b/9, b-c]"

In words, the part after - is optional, and if present, may in turn have an optional weight separated by /. The separator character - may change.

Here's my code:

import scala.util.matching.Regex

case class Edge[A, B](u: A, v: A, data: Option[B])

def parseEdge(
    s: String,
    sep: Char
): (List[Edge[String, String]], List[String]) =
  if !s.startsWith("[") || !s.endsWith("]")
  then throw IllegalArgumentException("string must be enclosed by '[' and ']'")
  else
    val edgePattern: Regex = raw"""^(.+?)(?:$sep(.+?)(?:\/(.+?))??)??$$""".r

    s
      .substring(1, s.length() - 1)
      .split(",")
      .map(_.trim())
      .foldRight((List.empty[Edge[String, String]], List.empty[String])) {
        case (x, (es, vs)) =>
          x match
            case edgePattern(u)       => (es, u :: vs)
            case edgePattern(u, v)    => (Edge(u, v, None) :: es, vs)
            case edgePattern(u, v, d) => (Edge(u, v, Some(d)) :: es, vs)
      }

But:

println(parseEdge("[b-c]", '-'))  // (List(Edge(b,c,Some(null))),List())
println(parseEdge("[b-c/9]", '-'))  // (List(Edge(b,c,Some(9))),List())

Why's the first string parsed with a null instead of a None?

The fourth bird · Accepted Answer · 2025-01-13 09:47:34Z

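The third case is the one that matches, because the pattern has 3 capture groups and Scala's Regex extractor always yields one value per group, whether or not that group took part in the match: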

case edgePattern(u, v, d)

You can verify the group count by printing it:

val edgePattern: Regex = raw"""^(.+?)(?:$sep(.+?)(?:/(.+?))?)?$$""".r
val m = edgePattern.pattern.matcher("[b-c]")
println(m.groupCount()) // 3

For "b-c" the third group does not take part in the match, so d is null, and wrapping it in Some(d) gives Some(null).

If you wrap it in Option(d) instead, a null value becomes None:
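As a minimal sketch of the point above, the following can be used to check that the group count belongs to the pattern rather than to a particular match, and that the extractor hands back null for groups that did not take part. The names demoPattern, groupsInPattern, extracted and arity are made up for this illustration, and the separator is hard-coded to '-' to keep the snippet self-contained.

```scala
import scala.util.matching.Regex

// Same shape as the pattern above, with '-' hard-coded as the separator (an assumption for this sketch).
val demoPattern: Regex = """^(.+?)(?:-(.+?)(?:/(.+?))?)?$""".r

// groupCount reports how many groups the pattern declares, independent of the input.
val groupsInPattern: Int = demoPattern.pattern.matcher("b-c").groupCount() // 3

// The extractor returns one element per declared group; groups that did not match are null.
val extracted: Option[List[String]] = demoPattern.unapplySeq("b-c") // Some(List(b, c, null))

// Reports which arity of case pattern matches a given input (0 means no match at all).
def arity(s: String): Int = s match {
  case demoPattern(_)       => 1
  case demoPattern(_, _)    => 2
  case demoPattern(_, _, _) => 3
  case _                    => 0
}
```

With this pattern, arity returns 3 for "a", "b-c" and "b-c/9" alike, which suggests that the one-variable and two-variable cases in the question can never be selected.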

case edgePattern(u, v, d) => (Edge(u, v, Option(d)) :: es, vs)

With that change:

println(parseEdge("[b-c]", '-'))  // (List(Edge(b,c,None)),List())
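One thing the Option(d) change alone may not cover: because every match exposes all three groups, an input such as "[a]" would still land in the three-variable case with v set to null. The sketch below is one possible way to handle that, not the original author's code. parseEdgeSketch is a hypothetical name, and quoting the separator with Pattern.quote and throwing on unparseable items are additions assumed here for robustness.

```scala
import java.util.regex.Pattern
import scala.util.matching.Regex

final case class Edge[A, B](u: A, v: A, data: Option[B])

// Hypothetical variant of parseEdge: returns the parsed edges and the stand-alone vertices.
def parseEdgeSketch(s: String, sep: Char): (List[Edge[String, String]], List[String]) = {
  require(s.startsWith("[") && s.endsWith("]"), "string must be enclosed by '[' and ']'")

  // Assumption: quote the separator so that characters like '.' or '|' are taken literally.
  val quotedSep = Pattern.quote(sep.toString)
  val edgePattern: Regex = s"^(.+?)(?:$quotedSep(.+?)(?:/(.+?))?)?$$".r

  s.substring(1, s.length - 1)
    .split(",")
    .map(_.trim)
    .foldRight((List.empty[Edge[String, String]], List.empty[String])) {
      case (x, (es, vs)) =>
        x match {
          // Second group is null: the item is a lone vertex such as "a".
          case edgePattern(u, null, _) => (es, u :: vs)
          // Option(d) turns a null weight into None and a present weight into Some(weight).
          case edgePattern(u, v, d) => (Edge(u, v, Option(d)) :: es, vs)
          // Assumption: reject items the pattern cannot parse (for example an empty item).
          case _ => throw new IllegalArgumentException(s"cannot parse item: '$x'")
        }
    }
}
```

Under these assumptions, parseEdgeSketch("[a, b-c/9]", '-') puts "a" in the vertex list and Edge(b,c,Some(9)) in the edge list.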

edited Jan 13 at 9:47

answered Jan 13 at 9:32

2 Comments

Abhijit Sarkar Jan 14 at 7:43

Upvoted, but the problem seems to be with the "once or not at all" quantifier (?), not with the number of capturing groups present, so this answer is not entirely correct. Removing the ? quantifier results in no match, no matter which case pattern is used. I've asked a question in the Scala user forum.

The fourth bird Jan 14 at 8:46

@AbhijitSarkar groupCount counts the number of groups in the regex, not the number of groups that actually captured something. So the code in the question that you posted actually returns the expected result. Also, the ? in the regex makes the groups optional; if you remove it, you change what the pattern is expected to match, so the matches change. See regex101.com/r/2XoItw/1
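The distinction in this reply can be checked with a small sketch. The names optionalGroups, mandatoryGroups and matchesFully are invented for the illustration, and '-' is again assumed as the separator. Both patterns declare three groups; only the set of strings they accept differs.

```scala
import scala.util.matching.Regex

// Pattern with the optional groups, as in the answer ('-' hard-coded for this sketch).
val optionalGroups: Regex = """^(.+?)(?:-(.+?)(?:/(.+?))?)?$""".r

// Same pattern with both ? quantifiers removed, so the target and the weight become mandatory.
val mandatoryGroups: Regex = """^(.+?)(?:-(.+?)(?:/(.+?)))$""".r

// True when the whole input matches the pattern.
def matchesFully(r: Regex, s: String): Boolean = r.pattern.matcher(s).matches()

// Number of groups declared by a pattern, independent of any input.
def declaredGroups(r: Regex): Int = r.pattern.matcher("").groupCount()
```

Here matchesFully(mandatoryGroups, "b-c") is false while matchesFully(mandatoryGroups, "b-c/9") is true, and declaredGroups returns 3 for both patterns.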
