0

I split the input from Read-Host into an array and figured that would let me count how many times a word from $Array appears in the sentence $a. However, $Count++ doesn't give me a total.

   function Get-Sentence($a){
                if($a -contains $array) {
                   $Count++
                }
             else {
                   return 0
                }   
        }
        Write-Host "There are $count words"

        [array]$Array = @("a", "an", "the")
        [array]$a = (Read-Host "Enter a long sentence from a story book or novel: ").split(" ")
2
  • -contains does not work that way. You're asking if the split array contains the entire array of words you specified, which obviously it does not (it contains words, not an array). You can solve that with a loop, or more effectively by storing your word list as a hash table and checking membership in that. Commented Nov 4, 2016 at 16:42
  • Thanks for your quick response, so you're saying that I should make my $Array array into a hash table? Commented Nov 4, 2016 at 17:48

4 Answers

2

Preferred approach:

Easiest way to accurately count occurrences of multiple substrings is probably:

  1. Construct regex pattern that matches on any one of the substrings
  2. Use the -split operator to split the string
  3. Count the resulting number of strings and subtract 1:

# Define the substrings and a sentence to test against
$Substrings = "a","an","the"
$Sentence   = "a long long sentence to test the -split approach, anticipating false positives"

# Construct the regex pattern
# The \b sequence ensures "word boundaries" on either side of a 
# match so that "a" won't match the a in "man", for example
$Pattern = "\b(?:{0})\b" -f ($Substrings -join '|')

# Split the string, count result and subtract 1
$Count = ($Sentence -split $Pattern).Count - 1

Outputs:

C:\> $Count
2

As you can see, it matched and split on "a" and "the", but not on the "an" in "anticipating".

I'll leave converting this into a function as an exercise for the reader.


Note: if your substrings may contain characters that have a special meaning in regular expressions (such as ".", "+" or "?"), you may want to escape them before using them in the pattern:

$Pattern = "\b(?:{0})\b" -f (($Substrings |ForEach-Object {[regex]::Escape($_)}) -join '|')
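The steps above, including the escaping, could be wrapped in a function along these lines. This is only a sketch: the name Get-SubstringCount and its parameter names are made up for illustration and are not part of the original answer. Note that -split is case-insensitive by default, so "The" would be counted as well.

```powershell
# Hypothetical helper (name and parameters are assumptions, not from the answer):
# counts whole-word occurrences of any of the given substrings in a sentence
function Get-SubstringCount {
    param(
        [string]   $Sentence,
        [string[]] $Substrings
    )

    # Escape each substring so regex metacharacters are matched literally
    $Escaped = $Substrings | ForEach-Object { [regex]::Escape($_) }

    # \b on either side restricts matches to whole words
    $Pattern = "\b(?:{0})\b" -f ($Escaped -join '|')

    # N matches split the string into N + 1 pieces, hence the subtraction
    return ($Sentence -split $Pattern).Count - 1
}

# Same test sentence as above; this returns 2
Get-SubstringCount -Sentence "a long long sentence to test the -split approach, anticipating false positives" -Substrings "a","an","the"
```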

Naive approach:

If you're uncomfortable with regular expressions, you can make the assumption that anything in between two spaces is "a word" (like in your original example), and then loop through the words in the sentence and check if the array contains the word in question (not the other way around):

$Substrings = "a","an","the"
$Sentence   = (Read-Host "Enter a long sentence from a story book or novel: ").Split(" ")

$Counter = 0

foreach($Word in $Sentence){
    if($Substrings -contains $Word){
        $Counter++
    }
}
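To see why the operand order of -contains matters, here is a small sketch you can paste into a console. The sample sentence and the variable name $Words are assumptions chosen for illustration.

```powershell
$Substrings = "a","an","the"
$Words      = "the cat sat on a mat".Split(" ")

# Collection on the left, single value on the right: is "the" one of the substrings?
$Substrings -contains "the"     # True

# Operands reversed: asks whether the single string "the" equals the whole list
"the" -contains $Substrings     # False

# Array against array (what the question's code does): no single word
# in the sentence is equal to the entire list of substrings
$Words -contains $Substrings    # False
```

This is why the loop above tests one word at a time against the list.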

As suggested by Jeroen Mostert, you could also utilize a HashTable. With this you could track occurrences of each word, instead of just a total count:

$Substrings = "a","an","the"
$Sentence   = (Read-Host "Enter a long sentence from a story book or novel: ").Split(" ")

# Create hashtable from substrings
$Dictionary = @{}
$Substrings |ForEach-Object { $Dictionary[$_] = 0 }

foreach($Word in $Sentence){
    if($Dictionary.ContainsKey($Word)){
        $Dictionary[$Word]++
    }
}

$Dictionary

3 Comments

Thank you for your quick response! I don't understand what is going on in the "\b(?:{0})\b" part. Is there any way I could use it the way my script is written?
@BartVanRooijen \b(?:sometext)\b is a Regular Expression pattern - the {0} is a placeholder. The -f operator will replace {0} with the first argument on the right-hand side. When you say "use it the way my script is written", what exactly do you mean? The -contains operator for example makes no sense in this context
Thank you so much!!
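
To make the -f explanation from the comments concrete, the pattern can be built and inspected in a console. This is just an illustrative sketch; the test strings "man" and "the man" are made up.

```powershell
$Substrings = "a","an","the"

# -join produces "a|an|the"; -f then substitutes it for the {0} placeholder
$Pattern = "\b(?:{0})\b" -f ($Substrings -join '|')

$Pattern                    # \b(?:a|an|the)\b

# \b requires a word boundary, so the "a" and "an" inside "man" do not match
"man" -match $Pattern       # False
"the man" -match $Pattern   # True
```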
1
  $Substrings = "a","an","the"
  ("a long long sentence to test the -split approach, anticipating false positives" -split " " | where {$Substrings -contains $_}).Count


0

Another way to do it is to use Enumerable.Count together with a HashSet<T>, which efficiently checks whether each token is one of the substrings, and the unary -split operator, which splits on whitespace and discards leading and trailing whitespace.

$Substrings = [System.Collections.Generic.HashSet[string]]::new(
    [string[]] ('a', 'an', 'the'),
    [System.StringComparer]::InvariantCultureIgnoreCase)

$Sentence = Read-Host 'Enter a long sentence from a story book or novel: '

[System.Linq.Enumerable]::Count(
    -split $Sentence,
    [Func[string, bool]] { $Substrings.Contains($args[0]) })


0
$mystring = "abc,123"
# Split on the comma, keep the parts in $element and the number of parts in $elements
$elements = ($element = $mystring -split ",").Count
Write-Host($elements)      # 2
Write-Host($element[0])    # abc
Write-Host($element[1])    # 123

2 Comments

Welcome to Stack Overflow! While this code may solve the question, including an explanation of how and why this solves the problem would really help to improve the quality of your post, and probably result in more up-votes. Remember that you are answering the question for readers in the future, not just the person asking now. Please edit your answer to add explanations and give an indication of what limitations and assumptions apply.
You might also want to learn about formatting here. stackoverflow.com/editing-help
