Replacing directory paths in some 100,000 files
Hello guys,
Does anyone know of a quick way to replace a directory path in 100,000 files on CentOS?
I've tried the following solutions so far, but they've all failed because of the number of files:
find . -name *.php |xargs grep -rl /home/test/public_html/test2/*/* | xargs sed -i 's|/home/admin/public_html/test/nqd/test2/hello.php|/home/test/public_html/nqd/test2/hello.php|g'
perl -e "s|/home/admin/public_html/test/nqd/test2/hello.php|/home/test/public_html/nqd/test2/hello.php|g;" -pi $(find /home/test/public_html/test2/*/* -type f)
grep -rl /home/test/public_html/test2/*/* | xargs sed -i 's|/home/admin/public_html/test/nqd/test2/hello.php|/home/test/public_html/nqd/test2/hello.php|g'
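For what it's worth, the usual cause of this kind of failure is the shell expanding the glob into one enormous argument list before grep or find ever runs. A minimal sketch (using illustrative /tmp paths, not the thread's real tree) of quoting the pattern so find does the matching, and letting xargs batch the arguments:

```shell
# Set up a tiny demo tree with the hardcoded path from the thread.
mkdir -p /tmp/pathfix-demo/a /tmp/pathfix-demo/b
printf 'include "/home/admin/public_html/test/nqd/test2/hello.php";\n' > /tmp/pathfix-demo/a/x.php
printf 'include "/home/admin/public_html/test/nqd/test2/hello.php";\n' > /tmp/pathfix-demo/b/y.php

# '*.php' is quoted so find (not the shell) expands it;
# -print0/-0 keep odd filenames safe; xargs splits the file list
# into batches that stay under the kernel's argument-size limit.
find /tmp/pathfix-demo -name '*.php' -print0 |
  xargs -0 sed -i 's|/home/admin/public_html/test/nqd/test2/hello.php|/home/test/public_html/nqd/test2/hello.php|g'
```

Because xargs decides the batch size itself, this form doesn't blow up no matter how many files find emits.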
Comments
Did you try rsync with a dry run?
Would I be able to replace hardcoded directory paths inside PHP files with rsync, though?
Sorry, I now understand your question better.
Did you try what you have above on a small set? For example, were the hardcoded paths replaced with the newer ones? If yes, why not build a list of files and then feed that list to your script or to sed?
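The list-then-feed idea might look like this as a sketch (the /tmp paths and the "old-path" marker are placeholders, not the thread's real values):

```shell
# Demo tree with a marker string to replace.
mkdir -p /tmp/listfeed-demo
printf 'old-path\n' > /tmp/listfeed-demo/one.php
printf 'old-path\n' > /tmp/listfeed-demo/two.php

# Step 1: build the list of candidate files once.
find /tmp/listfeed-demo -name '*.php' > /tmp/listfeed-demo/files.txt

# Step 2: feed the list to sed in chunks of at most 1000 files,
# so no single sed invocation gets an oversized argument list.
xargs -n 1000 sed -i 's|old-path|new-path|g' < /tmp/listfeed-demo/files.txt
```

Building the list separately also lets you inspect or re-run it if a batch fails partway through.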
I tried building a list of files, but these are probably millions of files. My estimate of 100,000 was quite off, as there are around 20 or so directories, each with that many files and subdirectories, plus more files below them.
I think there was one more way to do this without hitting the "argument list too long" errors, but right now I can't think of it, heh.
You could try doing them in chunks rather than one monolithic process.
Edit: Never mind, reading comprehension error.
You could xargs into the background, but at this scale you also run into file-handle and kernel process limits, unless those happen to be tweaked (or you're on BSD/OSX).
In similar situations (e.g. once with around 2 million NFO/text files) I usually went by the first character, executed, and moved on (i.e. 0*, then 1*, ...); in some cases I needed to split further because a chunk was still too large.
One last thing: BSD seemingly doesn't have many of these problems, as it handles descriptor management differently.
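The split-by-first-character approach described above could be scripted roughly like this (the /tmp directory names and the old/new strings are illustrative):

```shell
# Demo tree: top-level entries starting with different characters.
mkdir -p /tmp/chunk-demo/0abc /tmp/chunk-demo/1def
printf 'old\n' > /tmp/chunk-demo/0abc/a.txt
printf 'old\n' > /tmp/chunk-demo/1def/b.txt

# Process one leading character at a time, so each find/xargs run
# only sees a fraction of the tree.
for c in 0 1 2 3 4 5 6 7 8 9 a b c d e f; do
  # Skip characters with no matching top-level entries.
  ls -d /tmp/chunk-demo/"$c"* >/dev/null 2>&1 || continue
  find /tmp/chunk-demo/"$c"* -type f -name '*.txt' -print0 |
    xargs -0 sed -i 's|old|new|g'
done
```

Splitting further (e.g. by the first two characters) follows the same pattern when one chunk is still too large.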
Isn't this exactly what the find -exec {} syntax is for?
+1, get rid of those pipes.
You could add -print to have the affected files printed.
As this doesn't pipe but starts perl separately for each file, it should work much better, even if it takes a while.
I think that did the trick; it's been running for 15 minutes with no errors yet.
you're welcome ;-)
I happen to use something like that quite often; find is very powerful. Piping is very useful too, but it has some pitfalls ;-)
PS: check the syntax of your regex twice, I may have escaped too many slashes in it ^^
find . -name \*.php -exec perl -i -p -e 's_/path/tobe/replaced/_/new/path/toset/_g' {} \;
(escape the glob, use another delimiter for the regexp)
I think this is at least the third time I've thought "nice work @Falzo"
You're right, I'm just too used to doing things certain ways ;-)
thanks for pointing that out.
Thank you, just trying to help sometimes, esp. with problems I've encountered myself ^^