
Time: 2022-01-08 09:00:55

I've been given a directory with subdirectories containing about 300,000 text files of various kinds. They all belong to a production project, and changing its architecture isn't an option.


Some tasks require replacing specific strings everywhere they occur. Using grep and sed takes about 5 minutes for each such replacement. Using find and sed takes a lot longer...


PhpStorm, by contrast, takes some time to index all the files when it opens this directory, but after that, searching and replacing across all the files is blazing fast!


Is it possible to achieve similar behaviour while staying in a terminal emulator? That is, to somehow index all files in a given directory so that later search & replace operations are fast?


Googling around, I found tools like cscope, idutils, and seascope, but as far as I could tell they have serious limitations: search only, with no obvious way to replace, or indexing only source files for functions, keywords, etc.


What I'm looking for is a way to index all the files for fast search & replace, with an automatically updated index. Like PhpStorm, but terminal-based and open source.




1 solution



How about this:


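find <base directory> -type f -exec sed -i \
  -e 's/<pattern1>/<replacement1>/g' \
  -e 's/<pattern2>/<replacement2>/g' \
  -e 's/<patternN>/<replacementN>/g' \
  {} +

The key there is to specify all the replacements you want to do at the same time, so that you only need one pass over the file set. Ending -exec with + instead of ';' lets find pass many files to each sed invocation rather than starting one process per file, and the g flag replaces every occurrence on a line, not only the first. If most files will need at least one replacement, then I can't see how you could do much better than that.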
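As a concrete illustration of the single-pass idea, here is a minimal sketch that keeps the substitutions in a sed script file and applies them to a throwaway directory. It assumes GNU sed (where -i takes no separate suffix argument); the function name replace_tree and the strings old_host and old_port are made up for the example.

```shell
#!/bin/sh
# Sketch only: assumes GNU sed (-i without a suffix argument).
# replace_tree, old_host, and old_port are hypothetical names for illustration.

# Apply every rule in a sed script file to every regular file under a directory.
replace_tree() {
    base_dir=$1
    rules_file=$2
    # "{} +" hands many file names to each sed process instead of one per file.
    find "$base_dir" -type f -exec sed -i -f "$rules_file" {} +
}

# One substitution per line; sed applies all of them in a single pass per file.
rules_file=$(mktemp)
cat > "$rules_file" <<'EOF'
s/old_host/new_host/g
s/old_port/new_port/g
EOF

# Demo on a throwaway tree, including a directory name with a space.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub dir"
printf 'old_host:old_port via old_host\n' > "$demo_dir/a.txt"
printf 'connect to old_port\n' > "$demo_dir/sub dir/b.txt"

replace_tree "$demo_dir" "$rules_file"
cat "$demo_dir/a.txt" "$demo_dir/sub dir/b.txt"
```

Keeping the rules in a file also makes it easy to review or version the list of replacements before running it against the real tree.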


If only a few files need replacements, then you could instead do


grep -R --files-with-matches --null '<pattern1>\|<pattern2>\|...<patternN>' <base directory> \
  | xargs -0 -r sed -i \
  -e 's/<pattern1>/<replacement1>/g' \
  -e 's/<pattern2>/<replacement2>/g' \
  -e 's/<patternN>/<replacementN>/g'

Again, the key is to do all the replacements in one pass through the file list, but this version uses grep to pre-test each file for whether it needs any replacements. Pre-testing is faster than processing the whole thing with sed when there are no replacements to be made, but you have to run the file through sed anyway when replacements do need to be made. The --null and -0 options keep file names containing whitespace intact, and -r (GNU xargs) skips sed entirely when grep finds nothing.
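To make the pre-filtering effect visible, the following sketch runs the grep-then-sed pipeline on a tiny demo tree and checks that a file without matches is never rewritten. It assumes GNU grep, GNU xargs, and GNU sed; replace_matching, old_host, and old_port are hypothetical names chosen for the example.

```shell
#!/bin/sh
# Sketch only: assumes GNU grep, GNU xargs, and GNU sed.
# replace_matching, old_host, and old_port are hypothetical names.

replace_matching() {
    base_dir=$1
    # grep lists only files containing a pattern, NUL-terminated so that
    # names with spaces survive; xargs -r skips sed if the list is empty.
    grep -R --files-with-matches --null \
        -e 'old_host' -e 'old_port' "$base_dir" \
        | xargs -0 -r sed -i \
            -e 's/old_host/new_host/g' \
            -e 's/old_port/new_port/g'
}

demo_dir=$(mktemp -d)
printf 'old_host:old_port\n' > "$demo_dir/needs change.txt"
printf 'nothing to replace\n' > "$demo_dir/untouched.txt"

# sed -i writes a new file, so an unchanged inode shows sed never touched it.
inode_before=$(ls -i "$demo_dir/untouched.txt" | awk '{print $1}')
replace_matching "$demo_dir"
inode_after=$(ls -i "$demo_dir/untouched.txt" | awk '{print $1}')

cat "$demo_dir/needs change.txt"
[ "$inode_before" = "$inode_after" ] && echo "untouched.txt was skipped"
```

The inode comparison is only a demonstration device: it relies on sed -i replacing the file it edits, so a file that keeps its inode was filtered out by grep before sed ran.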


Anything fancier is likely to take you more time to make than you will end up saving.


Do note that generic tools such as grep and sed probably will not work well for you if you need to be smart about which text to replace, such as avoiding replacements in quoted strings. If you need something like that then you really should use tools that understand the format of the files.




