
I have an XML file of the format:

<classes>

 <subject lb="Fall Sem 2020">
  <name>Operating System</name>
  <credit>3</credit>
  <type>Theory</type>
  <faculty>Prof. XYZ</faculty> 
 </subject>

 <subject lb="Spring Sem 2020">
  <name>Web Development</name>
  <credit>3</credit>
  <type>Lab</type>
 </subject>

 <subject lb="Fall Sem 2021">
  <name>Computer Network</name>
  <credit>3</credit>
  <type>Theory</type>
  <faculty>Prof. ABC</faculty> 
 </subject>

 <subject lb="Spring Sem 2021">
  <name>Software Engineering</name>
  <credit>3</credit>
  <type>Lab</type>
 </subject>

</classes>

I can get the desired result with sed: sed -En 's/.* lb="([^"]+)".*/\1/p' file

Output:

Fall Sem 2020
Spring Sem 2020
Fall Sem 2021
Spring Sem 2021

I want to store this output in an array, one line per element, e.g.

arr[0]="Fall Sem 2020"

My attempt: arr=($(sed -En 's/.* lb="([^"]+)".*/\1/p' file)) But with this, each word becomes a separate array element, e.g. arr[0]="Fall"

  • Check this thread --> stackoverflow.com/questions/24628076/… Commented Apr 15, 2020 at 12:43
  • The best answer in that Q&A is the last one using readarray, I'd say. Commented Apr 15, 2020 at 12:47

3 Answers


You could try the following (assuming the OP doesn't have XML tools and can't install them).

IFS=',';array=( $(
awk '
BEGIN{ OFS="," }
/<subject lb="/{
  match($0,/".*"/)
  val=(val?val OFS:"")substr($0,RSTART+1,RLENGTH-2)
}
END{
  print val
}' Input_file))

To print all elements of the array, use:

echo "${array[@]}"
Fall Sem 2020 Spring Sem 2020 Fall Sem 2021 Spring Sem 2021

To print a specific element, use:

echo "${array[0]}"
Fall Sem 2020
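
The same comma-joining idea can be written so that IFS is changed for a single read only, rather than for the rest of the shell session. This is a minimal sketch, assuming bash and that no lb value contains a comma; the temporary file and its trimmed-down contents are hypothetical and only there to make the example self-contained:

```shell
#!/usr/bin/env bash
# Hypothetical sample input, written to a temp file for this sketch.
input=$(mktemp)
cat > "$input" <<'EOF'
<classes>
 <subject lb="Fall Sem 2020">
  <name>Operating System</name>
 </subject>
 <subject lb="Spring Sem 2020">
  <name>Web Development</name>
 </subject>
</classes>
EOF

# Join the lb values with commas, as in the awk command above.
joined=$(awk '
/<subject lb="/ {
  match($0, /"[^"]*"/)
  val = (val ? val "," : "") substr($0, RSTART + 1, RLENGTH - 2)
}
END { print val }' "$input")
rm -f "$input"

# IFS=, applies to this read only; the shell's own IFS is left unchanged.
IFS=, read -r -a array <<< "$joined"

# One element per line, spaces inside each element preserved.
printf '%s\n' "${array[@]}"
```

Because the assignment is a prefix to read, later unquoted expansions in the same shell still split on the default whitespace.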

With bash:

# disable job control and enable lastpipe to run mapfile in current environment
set +m; shopt -s lastpipe

sed -En 's/.* lb="([^"]+)".*/\1/p' file | mapfile -t arr

declare -p arr

Output:

declare -a arr=([0]="Fall Sem 2020" [1]="Spring Sem 2020" [2]="Fall Sem 2021" [3]="Spring Sem 2021")

In a script, job control is disabled by default, so set +m is only needed in an interactive shell.
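
If changing shell options is undesirable, mapfile can instead be fed through process substitution, which also keeps it in the current shell. This is a minimal sketch, assuming bash 4 or later; the temporary file and its trimmed-down contents are hypothetical and only there to make the example self-contained:

```shell
#!/usr/bin/env bash
# Hypothetical sample input, written to a temp file for this sketch.
file=$(mktemp)
cat > "$file" <<'EOF'
<classes>
 <subject lb="Fall Sem 2020">
  <name>Operating System</name>
 </subject>
 <subject lb="Spring Sem 2020">
  <name>Web Development</name>
 </subject>
</classes>
EOF

# mapfile runs in the current shell here (no pipeline), so arr survives.
# -t strips the trailing newline from each line read.
mapfile -t arr < <(sed -En 's/.* lb="([^"]+)".*/\1/p' "$file")
rm -f "$file"

declare -p arr
```

The trade-off is mostly stylistic: the pipeline form reads left to right, while this form needs neither lastpipe nor set +m.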


You could use an XML-aware tool such as XmlStarlet to extract the attribute you want, and then use readarray with process substitution to read the output into an array:

$ readarray -t arr < <(xml sel -t -v 'classes/subject/@lb' infile.xml)
$ declare -p arr
declare -a arr=([0]="Fall Sem 2020" [1]="Spring Sem 2020" [2]="Fall Sem 2021" [3]="Spring Sem 2021")
