Topological sort using linked lists in C

I'm implementing a topological sort. The program reads from a text file of ASCII characters. Each line holds two elements separated by spaces. Suppose a line is "a b". This defines a precedence relationship between a and b, meaning "a must occur before b".

So if the file is

a b
d c
b d

the expected output is

a
b
d
c

I have created two linked lists

bigList: stores the unique elements (each with a count to keep track of its preceding elements)
smallList: stores the preceding elements

Summary of what my code does

reads the file line by line
grabs the two elements on each line
checks whether they are already present and, if not, inserts them
prints out the result based on the count

However, it actually prints out every element occurrence in the file. For the above input, my output is

a
b
b
d
d
c

I am new to C programming, so please let me know what I'm doing wrong.

1 solution

#1

I don't know a whole lot about topological sort, but I think I know enough to leave an intelligent comment. Anyone should feel free to edit this response.

I see several issues with this implementation. Some are related to C, others to the algorithm, so let me go through them one at a time.

Problem definition. Topological sort is indeed defined as a precedence ordering of the elements of a directed graph. However, this sentence alone does not define the problem completely. Specifically, topological sort is a precedence ordering of the elements of a graph starting with a specified source vertex. As an example, suppose you have the following directed graph:

a -> b
b -> c
c -> a

If you start with vertex a, your ordering should be {a, b, c}. If you start with vertex c, your ordering should be {c, a, b}. So for a graph like this one, which contains a cycle, the problem definition makes no sense without a source vertex. One choice for such a vertex could be some vertex of the graph which has no edges pointing to it, i.e. every incident edge is an outgoing edge.
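
To make the "no edges pointing to it" idea concrete, here is a minimal sketch of picking such vertices out of a list of precedence pairs. The struct edge type and the find_sources function are names I made up for illustration, and I'm assuming each element is a single character, as in the question's input.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical edge type: "from must occur before to". */
struct edge {
    char from;
    char to;
};

/*
 * Collects every vertex that has outgoing edges but no incoming edge.
 * Results go into out[] in order of first appearance; out must have
 * room for n_edges entries. Returns the number of vertices written.
 */
size_t find_sources(const struct edge *edges, size_t n_edges, char *out)
{
    size_t n_out = 0;

    for (size_t i = 0; i < n_edges; i++) {
        char candidate = edges[i].from;
        bool keep = true;

        /* A source never appears on the right-hand side of a pair. */
        for (size_t j = 0; j < n_edges; j++) {
            if (edges[j].to == candidate) {
                keep = false;
                break;
            }
        }

        /* Do not record the same source twice. */
        for (size_t k = 0; keep && k < n_out; k++) {
            if (out[k] == candidate)
                keep = false;
        }

        if (keep)
            out[n_out++] = candidate;
    }
    return n_out;
}
```

For the question's input (a b, d c, b d) the only such vertex is a. For the three-edge graph above the function finds none, because every vertex in a cycle has an incoming edge.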

Another thing to keep in mind is graph connectedness. It's not always possible to get to any vertex from any other vertex, so a search that starts from a single vertex may not visit the whole graph.

Good data structures are key to good algorithms. If you want to sort things in a directed graph, your best bet is to create a directed graph data structure, which itself would involve creating a Node data structure and an Edge data structure. I suggest looking up adjacency lists. Once you have such a data structure in place, it's a matter of running a breadth-first search on the graph, starting from the vertices with no incoming edges and visiting a vertex only after all of its predecessors have been visited, and you get your topological precedence as a neat consequence.
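
Here is a rough sketch of what that could look like, not a drop-in fix for the code in the question. It uses the breadth-first variant usually called Kahn's algorithm. All names (struct graph, graph_add_edge, graph_topo_sort and so on) are my own, and I'm assuming the elements are the lowercase letters 'a' to 'z' so that a fixed-size array can index the vertices.

```c
#include <stdbool.h>
#include <stdlib.h>

#define MAX_VERTICES 26 /* assumption: vertices are the letters 'a'..'z' */

/* One outgoing edge, kept in a singly linked adjacency list. */
struct edge_node {
    int to;
    struct edge_node *next;
};

struct graph {
    struct edge_node *adj[MAX_VERTICES]; /* outgoing edges per vertex */
    int indegree[MAX_VERTICES];          /* number of incoming edges  */
    bool present[MAX_VERTICES];          /* vertex seen in the input? */
};

void graph_init(struct graph *g)
{
    for (int v = 0; v < MAX_VERTICES; v++) {
        g->adj[v] = NULL;
        g->indegree[v] = 0;
        g->present[v] = false;
    }
}

/* Records "from must occur before to". Returns false on bad input or failed malloc. */
bool graph_add_edge(struct graph *g, char from, char to)
{
    if (from < 'a' || from > 'z' || to < 'a' || to > 'z')
        return false;

    struct edge_node *e = malloc(sizeof *e);
    if (e == NULL)
        return false;

    e->to = to - 'a';
    e->next = g->adj[from - 'a'];
    g->adj[from - 'a'] = e;
    g->indegree[to - 'a']++;
    g->present[from - 'a'] = true;
    g->present[to - 'a'] = true;
    return true;
}

/*
 * Breadth-first topological sort (Kahn's algorithm).
 * Writes the order into out[] (room for MAX_VERTICES chars) and returns
 * how many vertices were written. A result smaller than the number of
 * vertices in the graph means the input contains a cycle.
 * The in-degree counts are consumed, so call this once per graph.
 */
int graph_topo_sort(struct graph *g, char *out)
{
    int queue[MAX_VERTICES];
    int head = 0, tail = 0;

    /* Start from every vertex that nothing has to precede. */
    for (int v = 0; v < MAX_VERTICES; v++)
        if (g->present[v] && g->indegree[v] == 0)
            queue[tail++] = v;

    while (head < tail) {
        int v = queue[head];
        out[head++] = (char)('a' + v);

        /* Emitting v satisfies one prerequisite of each successor. */
        for (struct edge_node *e = g->adj[v]; e != NULL; e = e->next)
            if (--g->indegree[e->to] == 0)
                queue[tail++] = e->to;
    }
    return head;
}

void graph_free(struct graph *g)
{
    for (int v = 0; v < MAX_VERTICES; v++) {
        while (g->adj[v] != NULL) {
            struct edge_node *next = g->adj[v]->next;
            free(g->adj[v]);
            g->adj[v] = next;
        }
    }
}
```

Each vertex is written exactly once, which is what avoids the repeated b and d in the question's output. Feeding it the three pairs from the question yields a, b, d, c.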

When implementing an adjacency list, you still need to store all of your elements in one place. A linked list is generally not the best way to do so: although it takes constant time to insert into one (assuming no sorting on the data), it takes linear time to search through one. That's way suboptimal. As @David RF suggested, Red-Black trees and AVL trees would be the way to go. However, I wouldn't start with this optimization. As long as you have a sound working algorithm, you can always improve your storage data structure later. After all, the interface to linked lists and search trees is the same.
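
As a sketch of what "the same interface" could mean in practice, here is a find-or-insert pair backed by a linked list. The names (struct node, list_find, list_find_or_insert) are hypothetical and not taken from the question's code. The point is that callers only see these functions, so a search tree could replace the list later without touching them.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical element record: one node per unique element. */
struct node {
    char key;
    int count;          /* number of elements that must precede key */
    struct node *next;
};

/* Linear search: returns the node holding key, or NULL if absent. */
struct node *list_find(struct node *head, char key)
{
    for (; head != NULL; head = head->next)
        if (head->key == key)
            return head;
    return NULL;
}

/*
 * Returns the existing node for key, or prepends a new one in constant
 * time if the key is not stored yet. Returns NULL only if malloc fails.
 */
struct node *list_find_or_insert(struct node **head, char key)
{
    struct node *n = list_find(*head, key);
    if (n != NULL)
        return n;

    n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;

    n->key = key;
    n->count = 0;
    n->next = *head;
    *head = n;
    return n;
}

size_t list_length(const struct node *head)
{
    size_t len = 0;
    for (; head != NULL; head = head->next)
        len++;
    return len;
}

void list_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

Calling list_find_or_insert for each of the six element mentions in the question's input (a, b, d, c, b, d) leaves four nodes in the list, because the second b and d resolve to the nodes created earlier.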

The sort can be fast, provided that you use the right algorithm. I haven't dealt with topological sorts in practice, so I don't know every intricacy and every edge case. But if you do it with breadth-first search using conventional node-edge data structures (note that edges can be implicitly defined within nodes), the search itself should take time linear in the number of nodes and edges.

I've read through your algorithm, and I have to admit that I'm not quite grasping your concept of a big list and a small list. The ambiguous names don't really help. Perhaps it does the job with a single tiny bug hiding somewhere, but it's not too readable. Maybe someone else can comment on your current implementation.
