Replacing directory paths in some 100,000 files
Hello guys,
Does anyone know of a quick way to replace a directory path in 100,000 files on CentOS?
I've tried the following solutions so far, but they've all failed because of the number of files:
find . -name *.php |xargs grep -rl /home/test/public_html/test2/*/* | xargs sed -i 's|/home/admin/public_html/test/nqd/test2/hello.php|/home/test/public_html/nqd/test2/hello.php|g'
perl -e "s|/home/admin/public_html/test/nqd/test2/hello.php|/home/test/public_html/nqd/test2/hello.php|g;" -pi $(find /home/test/public_html/test2/*/* -type f)
grep -rl /home/test/public_html/test2/*/* | xargs sed -i 's|/home/admin/public_html/test/nqd/test2/hello.php|/home/test/public_html/nqd/test2/hello.php|g'
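For what it's worth, the usual cause of this kind of failure is the shell expanding the glob into one enormous argument list before grep or find ever runs. A minimal sketch (using illustrative /tmp paths, not the thread's real tree) of quoting the pattern so find does the matching, and letting xargs batch the arguments:

```shell
# Set up a tiny demo tree with the hardcoded path from the thread.
mkdir -p /tmp/pathfix-demo/a /tmp/pathfix-demo/b
printf 'include "/home/admin/public_html/test/nqd/test2/hello.php";\n' > /tmp/pathfix-demo/a/x.php
printf 'include "/home/admin/public_html/test/nqd/test2/hello.php";\n' > /tmp/pathfix-demo/b/y.php

# '*.php' is quoted so find (not the shell) expands it;
# -print0/-0 keep odd filenames safe; xargs splits the file list
# into batches that stay under the kernel's argument-size limit.
find /tmp/pathfix-demo -name '*.php' -print0 |
  xargs -0 sed -i 's|/home/admin/public_html/test/nqd/test2/hello.php|/home/test/public_html/nqd/test2/hello.php|g'
```

Because xargs decides the batch size itself, this form doesn't blow up no matter how many files find emits.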
Comments
Did you try rsync with a dry run?
Would I be able to replace hardcoded directory paths inside PHP files with rsync, though?
Sorry, I now understand your question better.
Did you try what you have above on a small set? For example, were the hardcoded paths replaced with the newer ones? If yes, why not build a list of files and then feed that list to your script or to sed?
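The list-then-feed idea might look like this as a sketch (the /tmp paths and the "old-path" marker are placeholders, not the thread's real values):

```shell
# Demo tree with a marker string to replace.
mkdir -p /tmp/listfeed-demo
printf 'old-path\n' > /tmp/listfeed-demo/one.php
printf 'old-path\n' > /tmp/listfeed-demo/two.php

# Step 1: build the list of candidate files once.
find /tmp/listfeed-demo -name '*.php' > /tmp/listfeed-demo/files.txt

# Step 2: feed the list to sed in chunks of at most 1000 files,
# so no single sed invocation gets an oversized argument list.
xargs -n 1000 sed -i 's|old-path|new-path|g' < /tmp/listfeed-demo/files.txt
```

Building the list separately also lets you inspect or re-run it if a batch fails partway through.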
I tried building a list of files, but these are probably millions of files. My estimate of 100,000 was quite off, as there are around 20 or so directories, each with that many files and subdirectories, plus more files below them.
I think there was one more way to do this without hitting the "argument list too long" errors, but right now I can't think of it, heh.
You could try doing them in chunks rather than one monolithic process.
Edit: Never mind, reading comprehension error.
You could xargs into the background, but at this scale you also run into file-handle and kernel process limits, unless those happen to be tweaked (or you're on BSD/OSX).
In similar situations (e.g. once with around 2 million NFO/text files) I usually went by the first character, executed, and moved on (i.e. 0*, then 1*, ...); in some cases I needed to split further because a chunk was still too large.
One last thing: BSD seemingly doesn't have many of these problems, as it handles descriptor management differently.
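The split-by-first-character approach described above could be scripted roughly like this (the /tmp directory names and the old/new strings are illustrative):

```shell
# Demo tree: top-level entries starting with different characters.
mkdir -p /tmp/chunk-demo/0abc /tmp/chunk-demo/1def
printf 'old\n' > /tmp/chunk-demo/0abc/a.txt
printf 'old\n' > /tmp/chunk-demo/1def/b.txt

# Process one leading character at a time, so each find/xargs run
# only sees a fraction of the tree.
for c in 0 1 2 3 4 5 6 7 8 9 a b c d e f; do
  # Skip characters with no matching top-level entries.
  ls -d /tmp/chunk-demo/"$c"* >/dev/null 2>&1 || continue
  find /tmp/chunk-demo/"$c"* -type f -name '*.txt' -print0 |
    xargs -0 sed -i 's|old|new|g'
done
```

Splitting further (e.g. by the first two characters) follows the same pattern when one chunk is still too large.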
Isn't this exactly what the find -exec {} syntax is for?
+1, get rid of those pipes.
You could add -print to have the affected files printed.
As this doesn't pipe but starts perl separately for each file, it should work much better, even if it takes a while.
I think that did the trick; it's been running for 15 minutes with no errors yet.
you're welcome ;-)
I happen to use something like that quite often; find is very powerful. Piping is very useful too, but it has some pitfalls ;-)
PS: check the syntax of your regex twice, I may have escaped too many slashes in it ^^
find . -name \*.php -exec perl -i -p -e 's_/path/tobe/replaced/_/new/path/toset/_g' {} \;
(escape the glob, use another delimiter for the regexp)
I think this is at least the third time I've thought "nice work @Falzo"
You're right, I'm just too used to doing things certain ways ;-)
thanks for pointing that out.
Thank you, just trying to help sometimes, esp. with problems I've encountered myself ^^