3

I have searched for a similar topic here, but most questions involved a single-character delimiter.

I have this sample of text:

Some text here,
continuing on next lineDELIMITERSecond chunk of text
which may as well continue on next lineDELIMITERFinal chunk

And the desired output is a list (extracted=()) which contains:

  1. Some text here, continuing on next line
  2. Second chunk of text which may as well continue on next line
  3. Final chunk

As can be seen from the sample, "DELIMITER" is used as the splitting delimiter.

I have tried numerous examples from SO, including awk, string replacement, etc.

2
  • Not clear; please state the expected output more clearly, and add your own attempts to the post too. Commented Jan 25, 2019 at 6:17
  • It is not clear what your requirement is. Are you saying that even if the input spans multiple lines, you want each split chunk as a single string? I.e., should "Some text here, continuing on next line" be one entry in the final array? Commented Jan 25, 2019 at 7:18

6 Answers

5

If you don't want to change the default RS value, you could try the following:

awk '{gsub("DELIMITER",ORS)} 1' Input_file

6 Comments

How do I put it into an array? I tried putting it inside parentheses ( ), but accessing elements like [0] [1] gives individual words (because of whitespace splitting).
@Hubbs try : tab=($(awk '{gsub("DELIMITER","\"")} 1' infile ));echo "${tab[1]}"
@ctac_, thanks. If you want, you could edit my answer, or I could add it to my solution; please let me know.
@RavinderSingh13 add it to your answer.
@Hubbs sorry, I was missing IFS. Try IFS='"' tab=($(awk '{gsub("DELIMITER","\"")} 1' infile ));echo "${tab[1]}"
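
The idea in the comments above (turn the multi-character delimiter into a single character, then split on that character via IFS) can also be sketched in pure bash. This is only an illustration under assumptions: the separator `$'\037'` (ASCII unit separator) is assumed never to occur in the text, and the sample text is inlined instead of being read from a file. Using `read -a` instead of an unquoted `$(...)` avoids accidental glob expansion and keeps the IFS change local to one command.

```shell
#!/usr/bin/env bash
text='Some text here,
continuing on next lineDELIMITERSecond chunk of text
which may as well continue on next lineDELIMITERFinal chunk'

sep=$'\037'                     # assumption: this byte never appears in the text
flat=${text//DELIMITER/$sep}    # multi-character delimiter -> single character

# IFS is changed for this read only; -d '' lets read consume embedded newlines.
# read returns non-zero when it reaches end of input, hence the "|| true".
IFS=$sep read -r -d '' -a tab < <(printf '%s' "$flat") || true

printf '%s\n' "${tab[1]}"
```

Each element keeps its embedded newline here; joining the lines with a space would be a separate step.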
1

With AWK please try the following:

awk -v RS='^$' -v FS='DELIMITER' '{
    n = split($0, extracted)
    for (i=1; i<=n; i++) {
        print i". "extracted[i]
    }
}' sample.txt

which yields:

1. Some text here,
continuing on next line
2. Second chunk of text
which may as well continue on next line
3. Final chunk

If you need to transfer the awk array into a bash array, a further step is needed, depending on how the array will be processed afterwards.
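
One possible way to do that transfer is sketched below. It is not the only option and rests on a few assumptions: bash 4 or later (for `mapfile`), a hypothetical helper name `split_chunks`, and the wish to join continuation lines with a space, as in the question's desired output. Because newlines inside each chunk are replaced before printing, one output line corresponds to exactly one array element. The awk part accumulates lines itself instead of relying on `RS='^$'`, so it does not depend on a particular awk implementation.

```shell
#!/usr/bin/env bash
# split_chunks FILE DELIM  (hypothetical helper, not part of the answer above)
split_chunks() {
    awk -v delim="$2" '
        { buf = buf (NR > 1 ? "\n" : "") $0 }   # collect the whole file
        END {
            n = split(buf, chunk, delim)        # delim is treated as a regex
            for (i = 1; i <= n; i++) {
                gsub(/\n/, " ", chunk[i])       # join continuation lines
                print chunk[i]
            }
        }' "$1"
}

# Recreate the sample input from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' 'Some text here,' \
    'continuing on next lineDELIMITERSecond chunk of text' \
    'which may as well continue on next lineDELIMITERFinal chunk' > "$sample"

mapfile -t extracted < <(split_chunks "$sample" 'DELIMITER')
rm -f "$sample"

printf '%s\n' "${extracted[1]}"
```

Since the delimiter is passed to `split` as a regular expression, a delimiter containing regex metacharacters would need escaping first.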


1

You can try using arrays.

#!/bin/bash
str="continuing on next lineDELIMITERSecond chunk of text
which may as well continue on next lineDELIMITERFinal chunk"

delimiter=DELIMITER
s=$str$delimiter    # trailing delimiter so the last chunk is handled like the rest

array=()
while [[ $s ]]; do
    array+=( "${s%%"$delimiter"*}" )    # text before the first delimiter
    s=${s#*"$delimiter"}                # drop that chunk and its delimiter
done
declare -p array

This splits your text on the delimiter; the result is an array holding the chunks:

declare -a array=([0]="continuing on next line" [1]=$'Second chunk of text\nwhich may as well continue on next line' [2]="Final chunk")

You can access each chunk by its array index, or print all of them using printf '%s\n' "${array[@]}"

The result will be:

continuing on next line
Second chunk of text
which may as well continue on next line
Final chunk

Once the chunks are in an array, you can process each one however you need.
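
As a variation on the loop above, here is a minimal sketch that uses the full sample from the question and also joins the lines inside each chunk with a space, which is what the desired output shows. The names `text`, `rest`, `piece` and `nl` are illustrative choices, and the text is inlined; reading it from a file with `text=$(<file)` would be one way to adapt it.

```shell
#!/usr/bin/env bash
text='Some text here,
continuing on next lineDELIMITERSecond chunk of text
which may as well continue on next lineDELIMITERFinal chunk'
delimiter='DELIMITER'
nl=$'\n'

extracted=()
rest=$text$delimiter                  # trailing delimiter closes the last chunk
while [[ -n $rest ]]; do
    piece=${rest%%"$delimiter"*}      # everything before the first delimiter
    extracted+=( "${piece//$nl/ }" )  # replace embedded newlines with spaces
    rest=${rest#*"$delimiter"}        # remove the chunk and its delimiter
done

declare -p extracted
```

The loop ends because every iteration removes at least one delimiter from `rest`; two adjacent delimiters simply yield an empty element.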


0

You can try something like:

awk 'BEGIN {RS="DELIMITER";} {print}' input_file

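Then assign the output to a variable, etc. Note that a multi-character RS is not specified by POSIX; it works in gawk and mawk, which treat it as a regular expression.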


0

I think the biggest challenge in the question is handling spaces, newlines, and DELIMITER correctly, and then putting everything into an array. If it were only about splitting the file, it would be too easy. How about this template:

#!/bin/bash
gencode(){
  echo -e "extracted=(); read -r -d '' item <<-DELIMITER"
  sed 's:DELIMITER:\n&\nextracted+=("$item"); read -r -d "" item <<-&\n:' Input_file;
  echo -e "DELIMITER\n"'extracted+=("$item")'
}
gencode|cat -n                                 # for explanation purposes only
eval "`gencode`"                               # do not remove "eval"
for (( i=0; i < ${#extracted[@]}; i++ )); do   # print results
  echo "$i: ${extracted[i]}"
done

Outputs

     1  extracted=(); read -r -d '' item <<-DELIMITER
     2  Some text here,
     3  continuing on next line
     4  DELIMITER
     5  extracted+=("$item"); read -r -d "" item <<-DELIMITER
     6  Second chunk of text
     7  which may as well continue on next line
     8  DELIMITER
     9  extracted+=("$item"); read -r -d "" item <<-DELIMITER
    10  Final chunk
    11  DELIMITER
    12  extracted+=("$item")
0: Some text here,
continuing on next line
1: Second chunk of text
which may as well continue on next line
2: Final chunk
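
The generated code relies on `read -r -d '' item` fed by a here-document. One detail worth knowing when adapting it: with an unquoted here-document word, as in the template, the shell expands `$variables` and `$(commands)` inside the body, so the template assumes the input file contains none. The sketch below only illustrates the quoted form, which keeps the body literal; the sample body is made up for the demonstration.

```shell
#!/usr/bin/env bash
# Quoted here-document word: the body is taken literally, nothing is expanded.
# read returns non-zero because it reaches end of input before a NUL byte,
# hence the "|| true".
read -r -d '' item <<'DELIMITER' || true
price: $5 and $(echo not-run)
second line
DELIMITER

printf '%s\n' "$item"
```

With a single variable name, `read` also trims the trailing newline of the here-document, so `item` ends at the last visible character.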


0

You can try Perl. With the -0777 option, perl slurps the entire file into the $_ variable. You can then split the content on DELIMITER. Check this out:

$ perl -0777 -ne '@x=split("DELIMITER");print join("\n\n",@x) ' hubbs.txt
Some text here,
continuing on next line

Second chunk of text
which may as well continue on next line

Final chunk

$

Adding array positions while printing

$ perl -0777 -ne '@x=split("DELIMITER"); for(@x) { print ++$i,". $_\n"  } ' hubbs.txt
1. Some text here,
continuing on next line
2. Second chunk of text
which may as well continue on next line
3. Final chunk


$

