如何在不使用引用的情况下删除f#序列中的重复项?

I have a sorted sequence and want to go through it and return the unique entries in the sequence. I can do it using the following function, but it uses reference variables and I don't think it's the correct way of solving the problem.

我有一个排序过的序列，想要遍历它并返回序列中唯一的元素。我可以用下面的函数来做，但是它使用了引用变量，我认为这不是解决问题的正确方法。

    let takeFirstCell sectors = 
        let currentRNCId = ref -1
        let currentCellId = ref -1
        seq {
            for sector in sectors do
                if sector.RNCId <> !currentRNCId || sector.CellId <> !currentCellId then
                    currentRNCId := sector.RNCId
                    currentCellId := sector.CellId
                    yield sector
        }

How can I do this in a functional way?

我怎么能用函数来做呢?

6 个解决方案

#1

[1;1;1;2;2;2;3;3;3]
|> Seq.distinctBy id
|> printfn "%A"

#2

Seq.distinct (1::[1..5]) returns seq [1; 2; 3; 4; 5]. Is that what you meant?

Seq。区别(1::[1..5])返回seq [1;2;3;4;5)。你是这个意思吗?

#3

distinct and distinctBy both use Dictionary and therefore require hashing and a bit of memory for storing unique items. If your sequence is already sorted, you can use the following approach (similar to yours). It's nearly twice as fast and has constant memory use, making it usable for sequences of any size.

不同的和不同的都使用字典，因此需要散列和一些内存来存储唯一的项。如果序列已经排序，您可以使用以下方法(类似于您的方法)。它的速度几乎是它的两倍，并且具有持久的内存使用，使它可以用于任何大小的序列。

let distinctWithoutHash (items:seq<_>) =
  seq {
    use e = items.GetEnumerator()
    if e.MoveNext() then
      let prev = ref e.Current
      yield !prev
      while e.MoveNext() do
        if e.Current <> !prev then 
          yield e.Current
          prev := e.Current
  }

let items = Seq.init 1000000 (fun i -> i / 2)
let test f = items |> f |> (Seq.length >> printfn "%d")

test Seq.distinct        //Real: 00:00:01.038, CPU: 00:00:01.435, GC gen0: 47, gen1: 1, gen2: 1
test distinctWithoutHash //Real: 00:00:00.622, CPU: 00:00:00.624, GC gen0: 44, gen1: 0, gen2: 0

I couldn't figure out a way to use mutables instead of refs (short of hand-coding an enumerator), ~~which I'm sure would speed it up considerably~~ (I tried it--it makes no difference).

我想不出一种方法来使用mutables而不是refs(除了手工编写枚举数)，我确信这会大大加快它的速度(我尝试过了——这没什么区别)。

#4

Just initialize a unique collection (like a set) with the sequence like this:

只需用如下序列初始化一个唯一集合(如集合):

set [1; 2; 3; 3; 4; 5; 5];;
=> val it : Set<int> = set [1; 2; 3; 4; 5]

#5

In my case I could not use Seq.distinct because I needed to preserve order of list elements. I used solution from http://ocaml.org/learn/tutorials/99problems.html. I think it is quite short

在我的情况下，我不能使用Seq。因为我需要保持列表元素的顺序。我使用了http://ocaml.org/learn/tutorials/99problems.html中的解决方案。我觉得很短

let rec compress = function
    | a :: (b :: _ as t) -> if a = b then compress t else a :: compress t
    | smaller -> smaller

#6

The solution below, preserves order of elements and returns only the first occurance of an element in a generic list. Of course this generates a new List with the redundant items removed.

下面的解决方案保留了元素的顺序，并且只返回泛型列表中的第一个元素。当然，这会生成一个删除冗余项的新列表。

//  ****  Returns a list having subsequent redundant elements removed
let removeDuplicates(lst : 'a list) = 
    let f item acc =
        match acc with 
        | [] -> [item]
        | _ ->
            match List.exists(fun x -> x = item) acc with
            | false -> item :: acc
            | true -> acc
    lst 
    |> List.rev
    |> fun x -> List.foldBack f x []
    |> List.rev
//  **** END OF FUNCTION removeDuplicates

val removeDuplicates : 'a list -> 'a list when 'a : equality
val testList : int list = [1; 4; 3; 1; 2; 2; 1; 1; 3; 4; 3]
val tryAbove : int list = [1; 4; 3; 2]

#1

[1;1;1;2;2;2;3;3;3]
|> Seq.distinctBy id
|> printfn "%A"

#2