13

As I am going through tutorials on Python 3, I came across the following:

>>> '' in 'spam'
True

My understanding is that '' contains no blank spaces.

When I try the following in the shell, I get the output shown below it:

>>> '' in ' spam '
True

Can someone please help explain what is happening?

3 Answers

18

'' is the empty string, same as "". The empty string is a substring of every other string.

When a and b are strings, the expression a in b checks that a is a substring of b. That is, the sequence of characters of a must exist in b; there must be an index i such that b[i:i+len(a)] == a. If a is empty, then any index i satisfies this condition.

This does not mean that iterating over b will ever produce a. Strings differ from other sequences here: every element produced by for a in b satisfies a in b, but a in b does not imply that a will be produced by iterating over b.

So '' in x and "" in x both return True for any string x:

>>> '' in 'spam'
True
>>> "" in 'spam'
True
>>> "" in ''
True
>>> '' in ""
True
>>> '' in ''
True
>>> '' in ' '
True
>>> "" in " "
True
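
The difference between iterating over a string and testing containment can be made concrete with a small sketch. The helper all_substrings is a hypothetical name introduced only for illustration; it is not a built-in, and it simply collects every contiguous slice of a string:

```python
def all_substrings(s):
    """Return the set of every contiguous slice of s, including the empty one."""
    # i is the start index, j the end index; i == j yields the empty slice.
    return {s[i:j] for i in range(len(s) + 1) for j in range(i, len(s) + 1)}

s = 'spam'
characters = list(s)             # what `for c in s` produces
substrings = all_substrings(s)   # what `x in s` effectively searches

print(characters)                                   # the four single characters
print('am' in s, 'am' in characters)                # substring, but not a character
print('' in s, '' in characters, '' in substrings)  # same story for ''
```

Iteration only ever yields the single characters, so neither 'am' nor '' shows up there, while both appear among the slices.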

6 Comments

If the empty string, by definition, must exist in every other string, why is it not part of the iterator set? I.e. for i in 'spam': print(i)
@Brightlights That's an interesting question. I may have phrased that incorrectly – essentially, a in b (for strings) checks that all elements of a are in b. Thus, if a is empty, every element of it (which are no elements) exist in any b. See my updated answer.
@RushyPanchal: That's not how the check works. a in b for strings checks that a is a substring of b. For the check to evaluate to True, there must be some index i such that b[i:i+len(a)] == a. (This is completely different from all the other built-in sequence types.)
@Brightlights The iterator for strings iterates over every 1-character substring (i.e. over each character). The empty string is not a character of the string. However, the in operator, the containment check, just checks whether a string a is contained as a substring within b. You can find the empty string in every zero-length substring of a string, so '' in x is true for every string x.
@Brightlights: for i in 'spam': print(i) will also not print 'am', even though 'am' is a substring of 'spam'. That's because for i in 'spam' does not iterate over all substrings, it iterates over all characters. If you somehow iterated over all substrings, it would indeed include ''.
6

The string literal '' represents the empty string. This is basically a string with a length of zero, which contains no characters.

The in operator is defined for sequences to return “True if an item of s is equal to x, else False” for an expression x in s. For general sequences, this means that one of the items in s (usually accessible using iteration) equals the tested element x. For strings, however, the in operator has substring semantics: x in s is true when x is a substring of s, that is, when the characters of x appear contiguously somewhere in s.

Formally, this means that for a substring x with a length of n, there must be an index i which satisfies the following expression: s[i:i+n] == x.

This is easily understood with an example:

>>> s = 'foobar'

>>> x = 'foo'
>>> n = len(x) # 3
>>> i = 0
>>> s[i:i+n] == x
True

>>> x = 'obar'
>>> n = len(x) # 4
>>> i = 2
>>> s[i:i+n] == x
True

Algorithmically, what the in operator (or the underlying __contains__ method) needs to do is try every possible start index i (0 <= i <= len(s) - n) and check whether the condition is true for any of them.
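
As a minimal sketch of that search, assuming a naive linear scan (the interpreter's actual implementation is more optimized), the check could look like this. The name naive_contains is hypothetical and used only for illustration. Note that the last candidate start index is len(s) - n, which is why the range runs to len(s) - n + 1:

```python
def naive_contains(s, x):
    """Return True if x occurs as a contiguous substring of s."""
    n = len(x)
    # Candidate start indices are 0 .. len(s) - n inclusive.
    # If x is longer than s, the range is empty and we fall through to False.
    for i in range(len(s) - n + 1):
        if s[i:i + n] == x:
            return True
    return False

print(naive_contains('foobar', 'obar'))  # matches at i = 2
print(naive_contains('foobar', 'bar'))   # matches at the last start index, i = 3
print(naive_contains('foobar', ''))      # s[0:0] == '' on the first iteration
print(naive_contains('', ''))            # range(1) still offers i = 0
```

With x empty, n is 0, so the very first comparison s[0:0] == '' succeeds for any s, including the empty string.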

Looking back at the empty string, it becomes clear why the '' in s check is true for every string s: n is zero, so we are checking s[i:i]; and that is the empty string itself for every valid index i:

>>> s[0:0]
''
>>> s[1:1]
''
>>> s[2:2]
''

It is even true for s being the empty string itself, because sequence slicing is defined to return an empty sequence when a range outside of the sequence is specified (that’s why you could do s[74565463:74565469] on short strings).
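
The slicing behaviour this relies on can be checked directly. The following is a small sketch using the same s = 'foobar' as above; the variable names out_of_range and empty are introduced here only for illustration:

```python
s = 'foobar'

# Indexing past the end raises an error.
try:
    s[74565463]
except IndexError as exc:
    print('IndexError:', exc)

# Slicing clamps the bounds and yields an empty string instead.
out_of_range = s[74565463:74565469]
print(repr(out_of_range))

# For the empty string, the only candidate slice is [0:0], which is '' itself.
empty = ''
print(repr(empty[0:0]), '' in empty)
```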

So that explains why the containment check with in always returns True when checking the empty string as a substring. But even if you think about it logically, you can see the reason: a substring is a part of a string that you can find in another string. The empty string, however, can be found between every two characters. Just as you can add zero to a number as many times as you like without changing it, you can add any number of empty strings to a string without actually modifying that string.

1 Comment

Great answer that deserves to be @ the top.
1

As Rushy Panchal points out, the in operator follows the set-theoretic convention and treats the empty string as a substring of any string.

You can try to persuade yourself why this makes sense by considering the following: let s be a string such that '' in s is false. Then '' in s[len(s):] had better be false too, by transitivity (or else there is a substring of s that contains '', but s does not contain '', etc). But s[len(s):] is just '', so '' in '' would be false, which isn't great either. So you cannot pick any string s with '' not in s without creating a problem.

Of course, when in doubt, simulate it:

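s = input('Enter any string you dare:\n')

print('' in '')               # True: the empty string contains itself
print(s == s + '' == '' + s)  # True: concatenating '' changes nothing
print('' in '' + s)           # True: '' is a substring of any string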
