Recursively copy one random file per folder to another folder while keeping the folder structure

Time: 2022-02-03 16:04:13

I want to create a .bat script to copy only one random file from each folder (also subfolders, so recursively) whilst also keeping the folder structure. I've tried the following code which comes close to what I want but doesn't copy the folder structure and one file per folder.

@ECHO OFF

SETLOCAL EnableExtensions EnableDelayedExpansion
SET Destination=H:\Temp
SET FileFilter=*.ape
SET SubDirectories=/S

SET Source=%~dp1
SET FileList1Name=FileList1.%RANDOM%.txt
SET FileList1="%TEMP%\%FileList1Name%"
SET FileList2="%TEMP%\FileList2.%RANDOM%.txt"

ECHO Source: %Source%
IF /I {%SubDirectories%}=={/S} ECHO + Sub-Directories
IF NOT {"%FileFilter%"}=={""} ECHO File Filter: %FileFilter%
ECHO.
ECHO Destination: %Destination%
ECHO.
ECHO.
ECHO Building file list...

CD /D "%Source%"
DIR %FileFilter% /A:-D-H-S /B %SubDirectories% > %FileList1%

FOR /F "tokens=1,2,3 delims=:" %%A IN ('FIND /C ":" %FileList1%') DO SET TotalFiles=%%C
SET TotalFiles=%TotalFiles:~1%

ECHO The source has %TotalFiles% total files.
ECHO Enter the number of random files to copy to the destination.
SET /P FilesToCopy=
ECHO.

IF /I %TotalFiles% LSS %FilesToCopy% SET FilesToCopy=%TotalFiles%

SET Destination="%Destination%"
IF NOT EXIST %Destination% MKDIR %Destination%

SET ProgressTitle=Copying Random Files...

FOR /L %%A IN (1,1,%FilesToCopy%) DO (
    TITLE %ProgressTitle% %%A / %FilesToCopy%
    REM Pick a random file.
    SET /A RandomLine=!RANDOM! %% !TotalFiles!
    REM Go to the random file's line.
    SET Line=0
    FOR /F "usebackq tokens=*" %%F IN (%FileList1%) DO (
        IF !Line!==!RandomLine! (
            REM Found the line. Copy the file to the destination.
            XCOPY /V /Y "%%F" %Destination%
        ) ELSE (
            REM Not the random file, build the new list without this file included.
            ECHO %%F>> %FileList2%
        )
        SET /A Line=!Line! + 1
    )
    SET /A TotalFiles=!TotalFiles! - 1
    REM Update the master file list with the new list without the last file.
    DEL /F /Q %FileList1%
    RENAME %FileList2% %FileList1Name%
)

IF EXIST %FileList1% DEL /F /Q %FileList1%
IF EXIST %FileList2% DEL /F /Q %FileList2%

ENDLOCAL

The destination should be set in the .bat code like the code above. Can anybody please help me with this? Thanks in advance!

3 solutions

#1


1  

Copying a directory tree structure (folders only) is trivial with XCOPY.

Selecting a random file from a given folder is not too difficult. First you need the count of files, using DIR /B to list them and FIND /C to count them. Then use the modulo operator to select a random number in the range. Finally use DIR /B to list them again, FINDSTR /N to number them, and another FINDSTR to select the Nth file.
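
The three steps just described can be tried in isolation. The following is a minimal sketch, assuming a hypothetical folder C:\mySource; it only prints the chosen file name instead of copying it.

```batch
@echo off
setlocal
rem Hypothetical folder to sample from; change as needed.
set "folder=C:\mySource"

rem Step 1: DIR /B lists the files, FIND /C /V "" counts the lines.
set "cnt=0"
for /f %%N in ('dir /a-d /b "%folder%" 2^>nul ^| find /c /v ""') do set "cnt=%%N"
if %cnt% equ 0 (
    echo No files found.
    exit /b
)

rem Step 2: %RANDOM% is 0..32767, so the modulo maps it onto 1..cnt.
set /a N=%random% %% cnt + 1

rem Step 3: number the listing, then keep only the line that starts with "N:".
for /f "tokens=1* delims=:" %%A in (
    'dir /a-d /b "%folder%" ^| findstr /n . ^| findstr "^%N%:"'
) do echo Picked file %%A of %cnt%: %%B
```

Because a file name cannot contain a colon, splitting the numbered line on ":" is enough to separate the line number from the name.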

Perhaps the trickiest bit is dealing with relative paths. FOR /R can walk a directory tree, but it provides a full absolute path, which is great for the source, but doesn't do any good when trying to specify the destination.

There are a few things you could do. You can get the string length of the root source path, and then use substring operations to derive the relative path. See "How do you get the string length in a batch file?" for methods to compute string length.
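
As a rough sketch of the string-length idea (not the method used in the script further down), the loop below measures a hypothetical root path one character at a time and then strips that many characters from each absolute path. The names root, rest, len and full are illustrative, and the counting loop is the simplest possible one rather than a fast one.

```batch
@echo off
setlocal EnableDelayedExpansion
rem Hypothetical root folder, written without a trailing backslash.
set "root=C:\mySource"

rem Measure the root path by stripping one character per pass.
set "len=0"
set "rest=%root%"
:measure
if not defined rest goto measured
set "rest=!rest:~1!"
set /a len+=1
goto measure
:measured

rem Drop the first %len% characters of each absolute directory path.
for /r "%root%" %%D in (.) do (
    set "full=%%~fD"
    echo Relative path: [!full:~%len%!]
)
```

One caveat of this sketch: with delayed expansion enabled, folder names containing an exclamation mark are not handled correctly.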

Another option is to use FORFILES to walk the source tree and get relative paths directly, but it is extremely slow.
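
For comparison, a FORFILES walk could look like the sketch below, assuming a hypothetical source folder C:\mySource. It only prints the relative path of every subfolder.

```batch
@echo off
setlocal
rem Hypothetical source folder; change as needed.
set "source=C:\mySource"

rem /S recurses, @isdir keeps directories only, @relpath is the quoted relative path.
forfiles /p "%source%" /s /c "cmd /c if @isdir==TRUE echo @relpath"
```

Each match launches a new cmd.exe instance, which is one reason this approach tends to be slow on large trees.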

But perhaps the simplest solution is to map unused drive letters to the root of your source and destination folders. This enables you to use the absolute paths directly (after removing the drive letter). This is the option I chose. The only negative aspect of this solution is you must know two unused drive letters for your system, so the script cannot be simply copied from one system to another. I suppose you could programmatically discover unused drive letters, but I didn't bother.
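
One possible way to look for an unused drive letter is sketched below. It is an assumption-laden heuristic, not part of the script that follows: it treats a letter as free when its root does not exist, which can misjudge drives without media or disconnected network mappings. The variable name free and the tested range of letters are illustrative.

```batch
@echo off
setlocal
set "free="

rem Test letters from the end of the alphabet and keep the first one with no root.
for %%L in (Z Y X W V U T S R Q P) do (
    if not defined free if not exist %%L:\ set "free=%%L:"
)

if defined free (
    echo Candidate drive letter: %free%
) else (
    echo No candidate found in the tested range.
)
```

The result could then be passed to SUBST in place of a hard-coded letter.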

Note: It is critical that the source tree does not contain the destination.

@echo off
setlocal

:: Define source and destination
set "source=c:\mySource"
set "destination=c:\test2\myDestination"

:: Replicate empty directory structure
xcopy /s /t /e /i "%source%" "%destination%"

:: Map unused drive letters to source and destination. Change letters as needed
subst y: "%source%"
subst z: "%destination%"

:: Walk the source tree, calling :processFolder for each directory.
for /r y:\ %%D in (.) do call :processFolder "%%~fD"

:: Cleanup and exit
subst y: /d
subst z: /d
exit /b


:processFolder
:: Count the files
for /f %%N in ('dir /a-d /b %1 2^>nul^|find /c /v ""') do set "cnt=%%N"

:: Nothing to do if folder is empty
if %cnt% equ 0 exit /b

:: Select a random number within the range
set /a N=%random% %% cnt + 1

:: copy the Nth file
for /f "delims=: tokens=2" %%F in (
  'dir /a-d /b %1^|findstr /n .^|findstr "^%N%:"'
) do copy "%%D\%%F" "z:%%~pnxD" >nul

exit /b

EDIT

I fixed an obscure bug in the above code. The original COPY line read as follows:

copy "%~1\%%F" "z:%~pnx1" >nul

That version fails if any of the folders within the source tree contain %D or %F in their name. This type of problem always exists within a FOR loop if you expand a variable with %var% or expand a :subroutine parameter with %1.

The problem is easily fixed by using %%D instead of %1. It is counter-intuitive, but FOR variables are global in scope as long as any FOR loop is currently active. The %%D is inaccessible throughout most of the :processFolder routine, but it is available within the FOR loops.
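
The scoping behaviour described here can be observed with a small sketch. The label :show and the loop variable %%X are illustrative names only.

```batch
@echo off
for %%D in (outer) do call :show
exit /b

:show
rem This line is not inside a FOR loop of its own, so %%D is not expanded.
echo Outside a loop: [%%D]
rem Inside any FOR loop, the caller's %%D becomes visible again.
for %%X in (1) do echo Inside a loop: [%%D]
exit /b
```

Following the explanation above, the first ECHO is expected to show a literal %D, while the second should show the value set by the calling loop.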

#2


1  

The "natural" way to process a directory tree is via a recursive subroutine; this method minimizes the problems inherent to this process. As I said in this post: "You may write a recursive algorithm in Batch that gives you exact control of what you do in every nested subdirectory". I took the code from this answer, which duplicates a tree, and slightly modified it in order to solve this problem.

@echo off
setlocal

set "Destination=H:\Temp"
set "FileFilter=*.ape"

rem Enter to source folder and process it
cd /D "%~dp1"
call :processFolder
goto :EOF


:processFolder
setlocal EnableDelayedExpansion

rem For each folder in this level
for /D %%a in (*) do (

   rem Enter into it, process it and go back to original
   cd "%%a"
   set "Destination=%Destination%\%%a"
   if not exist "!Destination!" md "!Destination!"

   rem Get the files in this folder and copy a random one
   set "n=0"
   for %%b in (%FileFilter%) do (
      set /A n+=1
      set "file[!n!]=%%b"
   )
   if !n! gtr 0 (
      set /A "rnd=!random! %% n + 1"
      for %%i in (!rnd!) do copy "!file[%%i]!" "!Destination!"
   )

   call :processFolder
   cd ..
)
exit /B

#3


0  

Here is another approach using xcopy /L to walk through all files in the source directory. Because of /L, xcopy does not actually copy anything but returns paths relative to the source directory. For an explanation of the code, see the remarks:

@echo off
setlocal EnableExtensions DisableDelayedExpansion

rem Define source and destination directories here:
set "SOURCE=%~dp1"
set "DESTIN=H:\Temp"

rem Change to source directory:
cd /D "%SOURCE%"
rem Reset index number:
set /A "INDEX=0"
rem Walk through output of `xcopy /L`, which returns
rem all files in source directory as relative paths;
rem `find` filters out the summary line; `echo` appends one more line
rem with invalid path, just to process the last item as well:
for /F "delims=" %%F in ('
    2^> nul xcopy /L /S /I /Y "." "%TEMP%" ^
        ^| find ".\" ^
        ^& echo^(C:\^^^|\^^^|
') do (
    rem Store path to parent directory of current item:
    set "CURRPATH=%%~dpF"
    setlocal EnableDelayedExpansion
    if !INDEX! EQU 0 (
        rem First item, so build empty directory tree:
        xcopy /T /E /Y "." "%DESTIN%"
        endlocal
        rem Set index and first array element, holding
        rem all files present in the current directory:
        set /A "INDEX=1"
        set "ITEMS_1=%%F"
    ) else if "!CURRPATH!"=="!PREVPATH!" (
        rem Previous parent directory equals current one,
        rem so increment index and store current file:
        set /A "INDEX+=1"
        for %%I in (!INDEX!) do (
            endlocal
            set /A "INDEX=%%I"
            set "ITEMS_%%I=%%F"
        )
    ) else (
        rem Current parent directory is not the previous one,
        rem so generate random number from 1 to recent index
        rem to select a file in the previous parent directory,
        rem perform copying task, then reset index and store
        rem the parent directory of the current (next) item:
        set /A "INDEX=!RANDOM!%%!INDEX!+1"
        for %%I in (!INDEX!) do (
            xcopy /Y "!ITEMS_%%I!" "%DESTIN%\!ITEMS_%%I!"
            endlocal
            set /A "INDEX=1"
            set "ITEMS_1=%%F"
        )
    )
    rem Store path to parent directory of previous item:
    set "PREVPATH=%%~dpF"
)
endlocal
exit /B

For this approach the destination directory can also be located within the source directory tree.
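
The relative-path listing that this script relies on can be inspected on its own. The sketch below is meant to be run from the folder to examine; it prints the list and copies nothing.

```batch
@echo off
rem /L lists what would be copied without copying; /S recurses into subfolders.
rem "%TEMP%" is only a placeholder destination, as in the script above.
for /f "delims=" %%F in ('xcopy /L /S /I /Y "." "%TEMP%" 2^>nul ^| find ".\"') do echo %%F
```

The FIND filter keeps only lines containing ".\", which drops the summary line that xcopy prints at the end.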
