find() function - NumPy String Operations
The numpy.char.find() function locates the first occurrence of a substring within each string element of a NumPy array. For every element it returns the lowest index at which the substring occurs, or -1 if the substring is absent, which is helpful when searching string arrays for a fixed substring.
For example, the following code finds the position of the first "y" in each string of the array.
import numpy as np
arr = np.array(["happy", "python", "numpy"])
print("Array:", arr)
res = np.char.find(arr, "y")
print("Result:", res)
Output
Array: ['happy' 'python' 'numpy']
Result: [4 1 4]
Explanation:
- "y" is found at index 4 in "happy",
- at index 1 in "python",
- and at index 4 in "numpy".
Syntax
numpy.char.find(a, sub, start=0, end=None)
Parameters:
- a: array_like of str -> Input array of strings.
- sub: str -> Substring to search for.
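- start: int, optional -> Index at which the search begins (default 0).
- end: int, optional -> Index at which the search stops. The character at end itself is not searched (default: end of the string).
Return Value: ndarray of ints -> Position of the first occurrence of the substring in each element, or -1 where the substring is not found.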
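Because -1 signals "not found", the result can be turned into a boolean mask to separate matching elements from the rest. The following is a minimal sketch of that idea; the array contents and the variable names (names, pos, found, matches) are illustrative assumptions, not part of the original examples.

```python
import numpy as np

# Hypothetical sample data for illustration.
names = np.array(["data.csv", "notes", "plot.png"])

# Position of the first "." in each element, or -1 if there is none.
pos = np.char.find(names, ".")

# Elements where the substring was found.
found = pos != -1
matches = names[found]

print("Positions:", pos)    # Positions: [ 4 -1  4]
print("Found:", found)      # Found: [ True False  True]
print("Matches:", matches)  # Matches: ['data.csv' 'plot.png']
```

Comparing against -1 explicitly is safer than testing the result for truthiness, since a match at index 0 is a valid position.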
Examples
Example 1: This example searches for 'a' in the strings, but only from index 3 up to (and not including) index 7.
import numpy as np
arr = np.array(['aAaAaA', 'aA', 'abBABba'])
print("Array:", arr)
res = np.char.find(arr, 'a', start=3, end=7)
print("Result:", res)
Output
Array: ['aAaAaA' 'aA' 'abBABba']
Result: [ 4 -1  6]
Explanation:
- In "aAaAaA", 'a' occurs at index 4 (within range 3–7).
- "aA" has only two characters, so there is nothing to search from index 3 onward -> -1.
- In "abBABba", 'a' is found at index 6.
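The start and end arguments appear to behave like those of Python's built-in str.find(): the returned index is measured from the beginning of the full string, and end is exclusive. The sketch below checks this on the same array; the names expected and narrower are introduced here only for illustration.

```python
import numpy as np

arr = np.array(["aAaAaA", "aA", "abBABba"])

# Search each string between index 3 (inclusive) and index 7 (exclusive).
res = np.char.find(arr, "a", start=3, end=7)

# The same search done one string at a time with the built-in str.find().
expected = [s.find("a", 3, 7) for s in arr]

print("numpy:   ", res.tolist())  # numpy:    [4, -1, 6]
print("builtin: ", expected)      # builtin:  [4, -1, 6]

# With end=6, index 6 is no longer searched, so the last 'a' is missed.
narrower = np.char.find(np.array(["abBABba"]), "a", start=3, end=6)
print("end=6:   ", narrower.tolist())  # end=6:    [-1]
```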
Example 2: This example checks for the substring "num" inside each word.
import numpy as np
arr = np.array(["python", "numpy", "number", "fun"])
print("Array:", arr)
res = np.char.find(arr, "num")
print("Result:", res)
Output
Array: ['python' 'numpy' 'number' 'fun']
Result: [-1  0  0 -1]
Explanation:
- "num" is found at index 0 in "numpy" and "number".
- Not found in "python" or "fun", hence -1.
Example 3: This example demonstrates that find() is case-sensitive.
import numpy as np
arr = np.array(["Apple", "banana", "Apricot"])
print("Array:", arr)
res = np.char.find(arr, "a")
print("Result:", res)
Output
Array: ['Apple' 'banana' 'Apricot']
Result: [-1  1 -1]
Explanation:
- 'a' is not found in "Apple" (uppercase 'A' is different).
- Found at index 1 in "banana".
- "Apricot" has uppercase 'A', so lowercase 'a' is not matched.
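If a case-insensitive search is needed, one possible approach is to lowercase the array with numpy.char.lower() before searching. This is a sketch rather than a built-in option of find(); the variable name lowered is an assumption, and the positions map back to the original strings only when lowercasing does not change string length, as is the case for plain ASCII text.

```python
import numpy as np

arr = np.array(["Apple", "banana", "Apricot"])

# Normalize case first, then search for the lowercase substring.
lowered = np.char.lower(arr)
res = np.char.find(lowered, "a")

print("Lowered:", lowered)  # Lowered: ['apple' 'banana' 'apricot']
print("Result:", res)       # Result: [0 1 0]
```

The substring should be lowercased as well when it comes from user input, so that both sides of the comparison use the same case.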