"IN" operator with empty strings in Python 3.0 [duplicate]

Question

As I am going through tutorials on Python 3, I came across the following:

>>> '' in 'spam'
True

My understanding is that '' equals no blank spaces.

When I try the following in the shell, I get the output shown below it:

>>> '' in ' spam '
True

Can someone please help explain what is happening?

user2357112 · Accepted Answer · 2016-05-15 03:26:49Z

18

'' is the empty string, same as "". The empty string is a substring of every other string.

When a and b are strings, the expression a in b checks that a is a substring of b. That is, the sequence of characters of a must exist in b; there must be an index i such that b[i:i+len(a)] == a. If a is empty, then any index i satisfies this condition.

This does not mean that you will get a when you iterate over b. Strings differ from other sequences here: every element produced by for a in b satisfies a in b, but a in b does not imply that iterating over b will ever produce a.
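As a rough illustration of that distinction (a sketch, with variable names chosen only for this example): converting the string to a list gives exactly what iteration produces, and in on the list tests element equality rather than substring containment.

```python
s = 'spam'
chars = list(s)  # what iterating over s produces: ['s', 'p', 'a', 'm']

# Every character produced by iteration is also a substring of s.
all_chars_contained = all(c in s for c in s)

# The reverse does not hold: '' and 'am' are substrings of s,
# but iteration never produces them, so they are not list elements.
empty_in_string = '' in s
empty_in_chars = '' in chars
am_in_string = 'am' in s
am_in_chars = 'am' in chars

print(all_chars_contained)              # True
print(empty_in_string, empty_in_chars)  # True False
print(am_in_string, am_in_chars)        # True False
```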

So '' in x and "" in x both return True for any string x:

>>> '' in 'spam'
True
>>> "" in 'spam'
True
>>> "" in ''
True
>>> '' in ""
True
>>> '' in ''
True
>>> '' in ' '
True
>>> "" in " "
True

edited May 15, 2016 at 3:26

user2357112

286k32 gold badges491 silver badges571 bronze badges

answered May 15, 2016 at 1:09

Rushy Panchal

17.7k16 gold badges66 silver badges94 bronze badges


6 Comments

Brightlights Over a year ago

If the empty string, by definition, must exist in every other string, why is it not part of the iterator set? I.e. for i in 'spam': print(i)

Rushy Panchal Over a year ago

@Brightlights That's an interesting question. I may have phrased that incorrectly – essentially, a in b (for strings) checks that all elements of a are in b. Thus, if a is empty, every element of it (which are no elements) exist in any b. See my updated answer.

user2357112 Over a year ago

@RushyPanchal: That's not how the check works. a in b for strings checks that a is a substring of b. For the check to evaluate to True, there must be some index i such that b[i:i+len(a)] == a. (This is completely different from all the other built-in sequence types.)

poke Over a year ago

@Brightlights The iterator for strings iterates over every 1-character substring (i.e. over each character). The empty string is not a character of the string. However, the in operator, the containment check, just checks whether a string a is contained as a substring within b. You can find the empty string in every zero-length substring of a string, so '' in x is true for every string x.

BlueRaja - Danny Pflughoeft Over a year ago

@Brightlights: for i in 'spam': print(i) will also not print 'am', even though 'am' is a substring of 'spam'. That's because for i in 'spam' does not iterate over all substrings, it iterates over all characters. If you somehow iterated over all substrings, it would indeed include ''.


poke · Accepted Answer · 2016-05-15 02:21:59Z

The string literal '' represents the empty string. This is basically a string with a length of zero, which contains no characters.

The in operator is defined for sequences to return “True if an item of s is equal to x, else False” for an expression x in s. For general sequences, this means that one of the items in s (usually accessible using iteration) equals the tested element x. For strings, however, the in operator has subsequence semantics: x in s is true when x is a substring of s.

Formally, this means that for a substring x with a length of n, there must be an index i which satisfies the following expression: s[i:i+n] == x.

This is easily understood with an example:

>>> s = 'foobar'

>>> x = 'foo'
>>> n = len(x) # 3
>>> i = 0
>>> s[i:i+n] == x
True

>>> x = 'obar'
>>> n = len(x) # 4
>>> i = 2
>>> s[i:i+n] == x
True

Algorithmically, what the in operator (or the underlying __contains__ method) needs to do is iterate i over all possible values (0 <= i <= len(s) - n) and check whether the condition is true for any i.
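That search can be sketched as follows. This is only an illustration of the definition, not how CPython actually implements it (the real implementation uses an optimized search), and the name naive_contains is made up for this example. Note that the last start index tried is len(s) - n itself, since a slice of length n still fits there.

```python
def naive_contains(s, x):
    """Return True if x is a substring of s, by trying every start index."""
    n = len(x)
    # i runs from 0 through len(s) - n inclusive.
    # If x is longer than s, the range is empty and the result is False.
    return any(s[i:i + n] == x for i in range(len(s) - n + 1))


print(naive_contains('foobar', 'obar'))  # True
print(naive_contains('foobar', 'baz'))   # False
print(naive_contains('foobar', ''))      # True
print(naive_contains('', ''))            # True
```

With n equal to zero, every s[i:i] is the empty string, so the very first index already matches.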

Looking back at the empty string, it becomes clear why the '' in s check is true for every string s: n is zero, so we are checking s[i:i]; and that is the empty string itself for every valid index i:

>>> s[0:0]
''
>>> s[1:1]
''
>>> s[2:2]
''

It is even true for s being the empty string itself, because sequence slicing is defined to return an empty sequence when a range outside of the sequence is specified (that’s why you could do s[74565463:74565469] on short strings).
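A quick check of that slicing behaviour, as a sketch (the variable names are only for illustration): out-of-range slices quietly return an empty or shortened string, whereas plain indexing at the same position raises IndexError.

```python
empty = ''
short = 'spam'

empty_slice = empty[0:0]                # slicing the empty string itself
far_slice = short[74565463:74565469]    # entirely outside the string
partial_slice = short[2:100]            # end is clipped to len(short)

print(repr(empty_slice))    # ''
print(repr(far_slice))      # ''
print(repr(partial_slice))  # 'am'

# Indexing, unlike slicing, does not tolerate out-of-range positions.
try:
    short[74565463]
    index_failed = False
except IndexError:
    index_failed = True
print(index_failed)         # True
```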

So that explains why the containment check with in always returns True when checking the empty string as a substring. But even if you think about it logically, you can see the reason: a substring is a part of a string which you can find in another string. The empty string, however, can be found between every two characters. It’s like how you can add zero to a number as many times as you like: you can add any number of empty strings to a string without actually modifying that string.

Rushy Panchal · Accepted Answer · 2016-05-15 03:05:32Z

1

As Rushy Panchal points out, the in inclusion operator follows set-theoretic convention and treats the empty string as a substring of any string.

You can try to persuade yourself why this makes sense by considering the following: let s be a string such that ('' in s) == False. Then '' in s[len(s):] had better be False as well (or else there is a substring of s that contains '', while s itself does not contain ''). But s[len(s):] is '', so that gives ('' in '') == False, which isn't great either. So you cannot pick any string s with '' not in s without creating a problem.

Of course, when in doubt, simulate it:

s = input('Enter any string you dare:\n')

print('' in '')
print(s == s + '' == '' + s)
print('' in '' + s)

edited May 15, 2016 at 3:05

Rushy Panchal

17.7k16 gold badges66 silver badges94 bronze badges

answered May 15, 2016 at 2:50

hilberts_drinking_problem

11.6k3 gold badges25 silver badges55 bronze badges
