3

I am fairly new to this, and I need a shell script that loops through all ".xml" files in a folder and does some text replacements. So far I have come up with this:

sed "s/old_text/new_text/g" testfile.xml -i

However, I want this to run on all XML files in the current folder, not just on "testfile.xml". Furthermore, how can I make a backup of each original file?

Any input is more than welcome! Thanks a lot!

3 Answers

9

To run sed on all the XML files, just specify the wildcard:

sed "s/old_text/new_text/g" *.xml -i

To create a backup, just specify the extension after -i:

sed "s/old_text/new_text/g" *.xml -i~

Note that it's usually better to use XML-aware tools to handle XML.
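Since the question asks for a script that loops over the files, here is a minimal sketch of an explicit loop that copies each file before editing it. It assumes bash and GNU sed; the scratch directory and the sample file names are made up for the demonstration and are not from the question.

```shell
#!/usr/bin/env bash
# Scratch directory with two hypothetical sample files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf '%s\n' '<note>old_text</note>' > one.xml
printf '%s\n' '<note>old_text again</note>' > 'two words.xml'

# nullglob makes the loop run zero times when no .xml file exists.
shopt -s nullglob
for file in ./*.xml; do
    # Keep a copy of the original next to it, with a trailing "~".
    cp -p -- "$file" "$file~"
    # Edit the file in place (GNU sed; no backup suffix needed here).
    sed -i 's/old_text/new_text/g' "$file"
done

ls
```

Quoting "$file" keeps names with spaces intact, and the explicit cp makes the backup step visible instead of relying on the suffix passed to -i.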

7 Comments

...and just pray that neither old_text nor new_text contains any of $, /, \1, &, ?, *, (, ), [, ], \+, ., etc. Just be aware that sed does NOT operate on strings, it operates on regexps with a restricted character set. See stackoverflow.com/questions/29613304/… and consider using a tool that does operate on strings, e.g. awk. You can reduce the risk of breakage slightly by using single quotes instead of double around the command.
@EdMorton: True, but awk doesn't parse XML, either :)
It does if you use the XML library, see gnu.org/software/gawk/manual/html_node/gawkextlib.html, but mainly I just wanted to give the self-declared newbie a heads up that he will not be using strings with sed.
@EdMorton: Interesting, I didn't know they exist. Could you provide links?
just updated my previous comment to include a link. See also sourceforge.net/projects/gawkextlib
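To make the point about regexps concrete, here is a rough sketch of escaping both sides before handing them to sed, so that they behave like literal strings. The helper names escape_pattern and escape_replacement are hypothetical, the sample file is made up, and the sketch assumes bash, GNU sed, basic regular expressions, and single-line strings with no embedded newlines.

```shell
#!/usr/bin/env bash
# Escape characters that are special in a basic regular expression,
# plus the "/" delimiter used by the s command further down.
# "|" is used as the delimiter here so that "/" can sit inside the brackets.
escape_pattern() {
    printf '%s' "$1" | sed 's|[][\.*^$/]|\\&|g'
}

# Escape characters that are special on the replacement side: \, & and "/".
escape_replacement() {
    printf '%s' "$1" | sed 's|[\&/]|\\&|g'
}

# Scratch directory with one hypothetical sample file.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf '%s\n' '<price>$5.00 & up</price>' > sample.xml

old='$5.00 & up'
new='a/b & c'
sed -i.bak "s/$(escape_pattern "$old")/$(escape_replacement "$new")/g" sample.xml

cat sample.xml
```

Without the escaping, the "." in old would match any character and the "&" in new would be replaced by the whole matched text.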
3

For all .xml files that lie in the current directory:

sed -i.bak 's/old_text/new_text/g' *.xml

To recurse into subdirectories, combine with find:

find . -name '*.xml' -exec sed -i.bak 's/old_text/new_text/g' '{}' \;

The backup files will end in .xml.bak this way (the parameter to -i is appended to the original file name).

2 Comments

Should point out that this creates .bak versions of all .xml files, rather than just the ones which are updated.
What if I don't want to have backup files?
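Both comments can be addressed with small variations. This is a sketch that assumes GNU grep, xargs and sed; the scratch directory and the file names match.xml and nomatch.xml are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Scratch directory with two hypothetical sample files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf '%s\n' '<a>old_text</a>' > match.xml
printf '%s\n' '<a>other</a>' > nomatch.xml

# Back up only the files that actually contain old_text:
# grep -l lists matching files, -Z ends each name with a NUL byte,
# xargs -0 reads those names safely, and -r skips sed if nothing matched.
grep -lZ 'old_text' ./*.xml | xargs -0 -r sed -i.bak 's/old_text/new_text/g'

# No backup at all: GNU sed accepts -i without a suffix.
# (BSD/macOS sed wants an explicit empty suffix instead: sed -i '' ...)
sed -i 's/other/changed/' nomatch.xml

ls
```

After this runs, only match.xml has a .bak sibling, because nomatch.xml never reached the first sed call.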
2

A practical shell script, if you intend to sanitize a bunch of files with a number of measures, which gets a little impractical on a single line:

# only take files from certain subfolders and certain extensions

# be careful to not tamper with .git or .svn folders 
# - thus excluding all hidden folders as an extra precaution
# - also tampering with node_modules is a bad idea

FILES=$(find . -type f -regextype posix-extended     \
    -regex "^\./(public|source)/.*\.(scss|js)$"         \
    -not -regex ".*\/(\.|node_modules).*")

for f in $FILES
do
echo "Processing $f file..."

# all files: prune trailing spaces on each line (tabs are left alone).
sed -i 's/ *$//' "$f"

if [[ $f =~ \.js$ ]]; then
    echo "javascript file!"
    # DO stuff
fi

if [[ $f =~ \.scss$ ]]; then
    echo "scss file!"
    # \b whole word matching – stackoverflow.com/a/1032039/444255
    sed -i 's/\#000\b/black/g' "$f"
    sed -i 's/\#000000\b/black/g' "$f"
    sed -i 's/\#fff\b/white/g' "$f"
    sed -i 's/\#ffffff\b/white/g' "$f"
fi

done

Caveat: with great power comes great responsibility, and mass-replacement means great power...

1 Comment

This doesn't work if a file name contains a space.
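One way around the space problem is to pass the names NUL-terminated from find into a while/read loop instead of storing them in a variable. This is a sketch only, assuming bash, GNU find and GNU sed; the directory layout and file names are made up, and the filter is a simplified stand-in for the regex in the answer rather than an exact equivalent.

```shell
#!/usr/bin/env bash
# Scratch directory with hypothetical sample files, one with a space in its name.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkdir -p source node_modules/pkg
printf '%s\n' 'a { color: #000; }  ' > 'source/my styles.scss'
printf '%s\n' 'var x = 1;   ' > 'source/app.js'
printf '%s\n' 'b { color: #000; }' > 'node_modules/pkg/skip.scss'

# -print0 and read -d '' hand over each name NUL-terminated,
# so spaces (and even newlines) in file names survive intact.
find . -type f \( -name '*.scss' -o -name '*.js' \) \
     -not -path '*/node_modules/*' -not -path '*/.*' -print0 |
while IFS= read -r -d '' f; do
    echo "Processing $f"
    # all files: strip trailing spaces
    sed -i 's/ *$//' "$f"
    if [[ $f == *.scss ]]; then
        sed -i 's/#000\b/black/g' "$f"
    fi
done
```

The loop body runs in a subshell because of the pipe, so variables set inside it are not visible afterwards; for plain in-place edits like these that does not matter.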
