How can I compare two files in golang?

Time: 2022-10-26 12:06:22

In Python I can do the following:

equals = filecmp.cmp(file_old, file_new)

Is there a built-in function for this in Go? I googled it without success.

I could use a hash function from the hash/crc32 package, but that is more work than the Python code above.

4 solutions

#1


8  

I am not sure that function does what you think it does. From the docs:

Unless shallow is given and is false, files with identical os.stat() signatures are taken to be equal.

Your call compares only the os.stat() signature, which includes just:

  1. File type (taken from the file mode)
  2. Modified time
  3. Size

You can get all three of these in Go from the os.Stat function. Matching signatures really only suggest that the two paths are literally the same file, symlinks to the same file, or a copy of that file.

If you want to go deeper, you can open both files and compare their contents (the Python version reads 8k at a time).

You could use a CRC or MD5 to hash both files, but if the files differ near the beginning of a long file, you want to stop early. I would recommend reading some number of bytes at a time from each reader and comparing the chunks with bytes.Compare.

#2


8  

To complete @captncraig's answer: if you want to know whether two paths refer to the same file, you can use the SameFile(fi1, fi2 FileInfo) function from the os package.

SameFile reports whether fi1 and fi2 describe the same file. For example, on Unix this means that the device and inode fields of the two underlying structures are identical;

Otherwise, if you want to check the files' contents, here is a solution that compares the two files line by line without loading the entire files into memory.

First try: https://play.golang.org/p/NlQZRrW1dT

EDIT: Read in chunks of bytes, and fail fast if the files do not have the same size. https://play.golang.org/p/YyYWuCRJXV

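import (
    "bytes"
    "io"
    "log"
    "os"
)

const chunkSize = 64000

func deepCompare(file1, file2 string) bool {
    // Check file size ...

    f1, err := os.Open(file1)
    if err != nil {
        log.Fatal(err)
    }
    defer f1.Close()

    f2, err := os.Open(file2)
    if err != nil {
        log.Fatal(err)
    }
    defer f2.Close()

    // Allocate the buffers once and reuse them for every chunk.
    b1 := make([]byte, chunkSize)
    b2 := make([]byte, chunkSize)

    for {
        // io.ReadFull keeps reading until the buffer is full or the file
        // ends, so a short Read on one side cannot misalign the chunks.
        n1, err1 := io.ReadFull(f1, b1)
        n2, err2 := io.ReadFull(f2, b2)

        // io.EOF means nothing was read; io.ErrUnexpectedEOF means the
        // final chunk was shorter than the buffer.
        end1 := err1 == io.EOF || err1 == io.ErrUnexpectedEOF
        end2 := err2 == io.EOF || err2 == io.ErrUnexpectedEOF
        if (err1 != nil && !end1) || (err2 != nil && !end2) {
            log.Fatal(err1, err2)
        }

        // Compare only the bytes that were actually read.
        if !bytes.Equal(b1[:n1], b2[:n2]) {
            return false
        }

        if end1 || end2 {
            // Equal only if both files ended at the same point.
            return end1 && end2
        }
    }
}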

#3


3  

How about using bytes.Equal?

package main

import (
    "bytes"
    "fmt"
    "log"
    "os"
)

func main() {
    // Per a comment, it is better not to read an entire file into memory;
    // this is simply a trivial example.
    // os.ReadFile replaces the deprecated ioutil.ReadFile (Go 1.16+).
    f1, err1 := os.ReadFile("lines1.txt")
    if err1 != nil {
        log.Fatal(err1)
    }

    f2, err2 := os.ReadFile("lines2.txt")
    if err2 != nil {
        log.Fatal(err2)
    }

    // Per a comment, bytes.Equal is significantly more performant.
    fmt.Println(bytes.Equal(f1, f2))
}

#4


0  

You can use a package like equalfile.

Main API:

func CompareFile(path1, path2 string) (bool, error)

Godoc: https://godoc.org/github.com/udhos/equalfile

Example:

package main

import (
    "fmt"
    "os"

    "github.com/udhos/equalfile"
)

func main() {
    if len(os.Args) != 3 {
        fmt.Printf("usage: equal file1 file2\n")
        os.Exit(2)
    }

    file1 := os.Args[1]
    file2 := os.Args[2]

    equal, err := equalfile.CompareFile(file1, file2)
    if err != nil {
        fmt.Printf("equal: error: %v\n", err)
        os.Exit(3)
    }

    if equal {
        fmt.Println("equal: files match")
        os.Exit(0)
    }

    fmt.Println("equal: files differ")
    os.Exit(1)
}
