95

I want to match an input string (contained in the variable $1) with a regex representing the date formats MM/DD/YYYY and MM-DD-YYYY.

REGEX_DATE="^\d{2}[\/\-]\d{2}[\/\-]\d{4}$"
 
echo "$1" | grep -q $REGEX_DATE
echo $?

echo $? prints exit status 1 no matter what the input string is.

5 Comments

  • This is possibly a kind of duplicate of: stackoverflow.com/questions/19737675/… Commented Mar 10, 2016 at 14:25
  • That's because $? reports on the first command in the pipe chain, which is echo - the echo will obviously succeed, so you get a 1 exit code. try grep $pattern <<< $1 instead. Commented Mar 10, 2016 at 14:26
  • See this question for one solution. Commented Mar 10, 2016 at 14:28
  • Always check your program's documentation to see what style of regular expressions are accepted. Commented Mar 10, 2016 at 14:34
  • @MarcB err, no, it's the other way around -- $? is the last exit status in the pipeline Commented Mar 10, 2016 at 14:39
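
As a quick check of which exit status $? holds after a pipeline, here is a minimal sketch. The helper name pipeline_status is hypothetical and exists only for this illustration; it assumes the pipefail option is not set.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the exit status that $? holds after a pipeline.
pipeline_status() {
  echo "$1" | grep -q "$2"
  echo "$?"
}

pipeline_status 'abc' 'b'   # prints 0: grep matched
pipeline_status 'abc' 'x'   # prints 1: grep did not match, although echo succeeded
```

In other words, a status of 1 here comes from grep, the last command in the pipeline, not from echo.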

4 Answers

142

To complement the existing helpful answers:

Using Bash's own regex-matching operator, =~, is a faster alternative in this case, given that you're only matching a single value already stored in a variable:

set -- '12-34-5678' # set $1 to sample value

kREGEX_DATE='^[0-9]{2}[-/][0-9]{2}[-/][0-9]{4}$' # note use of [0-9] to avoid \d
[[ $1 =~ $kREGEX_DATE ]]
echo $? # 0 with the sample value, i.e., a successful match

Note that =~ even allows you to define capture groups (parenthesized subexpressions) whose matches you can later access through Bash's special ${BASH_REMATCH[@]} array variable.
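
To illustrate the capture-group remark, here is a minimal sketch that splits a date into its parts. The function name split_date and the sample value are assumptions made for this example, not part of the original answer.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the month, day, and year captured from a date string.
split_date() {
  local re='^([0-9]{2})[-/]([0-9]{2})[-/]([0-9]{4})$'
  if [[ $1 =~ $re ]]; then
    # Index 0 holds the whole match; indices 1..n hold the capture groups.
    echo "month=${BASH_REMATCH[1]} day=${BASH_REMATCH[2]} year=${BASH_REMATCH[3]}"
  else
    return 1
  fi
}

split_date '12-31-2016'   # prints: month=12 day=31 year=2016
```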

Portability caveat:

  • While =~ supports EREs (extended regular expressions), it also supports the host platform's specific extensions - it's a rare case of Bash's behavior being platform-dependent; examples:

    • \d to match a digit is supported on macOS, but not on Linux - use [0-9]

    • \< / \> and \b (word-boundary assertions) are supported on Linux, but not on macOS, where you must use [[:<:]] / [[:>:]] - none of these are POSIX-compliant

    • Backreferences (e.g. \1) work on Linux, but not on macOS (per POSIX, they're only supported in basic regexes (BREs)).

      • If your scripts are designed to run on Linux only, you can use a backreference to make matching more robust: capture the first separator in a capture group ((...)) and refer to it later with \1, which ensures that the same separator is matched again (thanks, ibonyun):

        kREGEX_DATE='^[0-9]{2}([-/])[0-9]{2}\1[0-9]{4}$'
        
  • To remain portable (in the context of Bash), stick to the POSIX ERE specification.
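
If the same-separator check is wanted without relying on backreferences, one possible approach (a sketch, not something the original answer proposes) is to capture both separators and compare them afterwards. The function name is_date is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the string looks like NN-NN-NNNN or
# NN/NN/NNNN with the same separator in both positions.
is_date() {
  local re='^[0-9]{2}([-/])[0-9]{2}([-/])[0-9]{4}$'
  # The regex uses plain ERE syntax only; the separators are compared as strings.
  [[ $1 =~ $re && ${BASH_REMATCH[1]} == "${BASH_REMATCH[2]}" ]]
}

is_date '12-34-5678' && echo 'match'
is_date '12-34/5678' || echo 'no match (mixed separators)'
```

The right-hand side of == is quoted so that it is compared literally rather than treated as a glob pattern.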

Further notes:

  • $kREGEX_DATE is used unquoted, which is necessary for the regex to be recognized as such (quoted parts would be treated as literals).

  • While not always necessary, it is advisable to store the regex in a variable first, because Bash has trouble with regex literals containing \.

    • E.g., on Linux, where \< is supported to match word boundaries, [[ 3 =~ \<3 ]] && echo yes doesn't work, but re='\<3'; [[ 3 =~ $re ]] && echo yes does.
  • I've changed variable name REGEX_DATE to kREGEX_DATE (k signaling a (conceptual) constant), so as to ensure that the name isn't an all-uppercase name, because all-uppercase variable names should be avoided to prevent conflicts with special environment and shell variables.
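
The effect of quoting the right-hand side of =~ can be seen in a small sketch. The helper names are hypothetical, and the behavior shown assumes Bash 3.2 or later with default compatibility settings.

```bash
#!/usr/bin/env bash
# Hypothetical helpers contrasting an unquoted and a quoted right-hand side.
matches_unquoted() { [[ $1 =~ $2 ]]; }     # $2 is interpreted as a regex
matches_quoted()   { [[ $1 =~ "$2" ]]; }   # $2 is matched as a literal string

re='^[0-9]+$'
matches_unquoted '123' "$re"      && echo 'unquoted: regex match'
matches_quoted   '123' "$re"      || echo 'quoted: no match, pattern taken literally'
matches_quoted   '^[0-9]+$' "$re" && echo 'quoted: matches only the literal text'
```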


2 Comments

I'd use ^[0-9]{2}([-/])[0-9]{2}\1[0-9]{4}$ instead. It's the same except that the 1st delimiter is inside a capturing group ([-/]) and the 2nd delimiter is a back-reference (\1) to whatever that capturing group matched. That way you're guaranteed to only match strings where the delimiters are the same, which avoids false positives for strings like "12-34/5678" that coincidentally look similar to dates but could mean something completely different.
@ibonyun, that is preferable - but not portable (which you may or may not care about in a given use case): it works on Linux, but not on macOS, for instance. I'm trying to keep the answer portable, but I've made the pitfalls of using platform-specific extensions clearer in the answer, and I've added your regex as a Linux-alternative.
31

I think this is what you want:

REGEX_DATE='^\d{2}[/-]\d{2}[/-]\d{4}$'

echo "$1" | grep -P -q "$REGEX_DATE"
echo $?

I've used the -P switch to enable Perl-compatible regular expressions, which support \d.

2 Comments

just to clarify, -P is not guaranteed to be supported in all distros. so if portability is a concern, you'll want to avoid it.
In which case, @MikeFrysinger 's solution is preferable. This one has the slight attraction of using the original regex, give or take some escaping.
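
Since -P may be unavailable, a script could probe for it and fall back to -E. The following is only a sketch under that assumption; the function name is_date_grep is hypothetical, and the probe simply relies on grep exiting with a failure status when it rejects -P.

```bash
#!/usr/bin/env bash
# Hypothetical helper: validate a date with grep -P when the local grep
# accepts it, falling back to the ERE form with [0-9] otherwise.
is_date_grep() {
  if printf '1\n' | grep -qP '\d' 2>/dev/null; then
    printf '%s\n' "$1" | grep -qP '^\d{2}[/-]\d{2}[/-]\d{4}$'
  else
    printf '%s\n' "$1" | grep -qE '^[0-9]{2}[/-][0-9]{2}[/-][0-9]{4}$'
  fi
}

is_date_grep '12/34/5678' && echo 'match'
is_date_grep '12345678'   || echo 'no match'
```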
15

the problem is you're trying to use regex features not supported by grep. namely, your \d won't work. use this instead:

REGEX_DATE="^[[:digit:]]{2}[-/][[:digit:]]{2}[-/][[:digit:]]{4}$"
echo "$1" | grep -qE "${REGEX_DATE}"
echo $?

you need the -E flag to enable ERE so that the unescaped {n} repetition syntax works.

2 Comments

++; note that if you \ -escaped the { and } instances, this particular regex would have worked without -E, as a BRE (basic regex) as well; as a non-portable aside: BSD/OSX grep - unlike GNU grep - actually does support \d.
you are certainly correct; however i prefer -E over \ everywhere as it makes the code much more readable, and it's in POSIX.
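
For comparison, the two forms discussed in these comments might look as follows. This is a sketch with hypothetical function names; [0-9] is used in place of [[:digit:]] for brevity.

```bash
#!/usr/bin/env bash
# Hypothetical helpers: the same date check as a BRE and as an ERE.
is_date_bre() {
  # BRE (grep's default): the interval braces must be backslash-escaped.
  printf '%s\n' "$1" | grep -q '^[0-9]\{2\}[-/][0-9]\{2\}[-/][0-9]\{4\}$'
}
is_date_ere() {
  # ERE (-E): the braces are used unescaped.
  printf '%s\n' "$1" | grep -qE '^[0-9]{2}[-/][0-9]{2}[-/][0-9]{4}$'
}

is_date_bre '12/34/5678' && echo 'BRE match'
is_date_ere '12/34/5678' && echo 'ERE match'
```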
1

Below is an example shell script that we used as a pre-commit hook to enforce branch naming:

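#!/bin/bash
# Pre-commit hook: reject commits on branches whose names violate the convention.
BRANCH_NAME=$(git rev-parse --abbrev-ref HEAD)

BRANCH_REGEX='^(feature|hotfix|chore)/MYAPP-[0-9]+(_|-).*$'

if ! [[ $BRANCH_NAME =~ $BRANCH_REGEX ]]; then
   echo "Error: Invalid branch name $BRANCH_NAME." >&2
   echo "You can run the command below to rename the branch:" >&2
   echo "git branch -m old_branch new_branch" >&2
   exit 1   # a non-zero exit status makes Git abort the commit
fi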
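
To see which names the hook's regex accepts without involving Git, the pattern can be exercised against a few sample names. This is a sketch; the function name is_valid_branch and the sample branch names are made up for illustration.

```bash
#!/usr/bin/env bash
# Same pattern as in the hook; the helper and sample names are hypothetical.
branch_regex='^(feature|hotfix|chore)/MYAPP-[0-9]+(_|-).*$'
is_valid_branch() { [[ $1 =~ $branch_regex ]]; }

for name in 'feature/MYAPP-123-login' 'hotfix/MYAPP-7_fix' \
            'bugfix/MYAPP-1-x' 'feature/MYAPP-login'; do
  if is_valid_branch "$name"; then
    echo "ok:      $name"
  else
    echo "invalid: $name"
  fi
done
```

The first two names are accepted; the third uses an unlisted prefix and the fourth lacks the numeric ticket ID, so both are rejected.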
