A Close Reading of MIT's "Deep Learning" (23)
There are two main ways of measuring the depth of a model. The first view is based on the number of sequential instructions that must be executed to evaluate the architecture. We can think of this as the length of the longest path through a flow chart that describes how to compute each of the model’s outputs given its inputs. Just as two equivalent computer programs will have different lengths depending on which language the program is written in, the same function may be drawn as a flowchart with different depths depending on which functions we allow to be used as individual steps in the flowchart. Figure 1.3 illustrates how this choice of language can give two different measurements for the same architecture.
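There are two main ways to measure the depth of a deep learning model. The first measures depth by the number of sequential instructions the architecture must execute. Picture the computation from input to output as a flowchart: the depth is the length of the longest path the data travels through it. Two programs that do the same thing will differ in length when written in different languages. Likewise, the same function drawn as a flowchart can have different depths, depending on how much work you let a single step represent. Figure 1.3 shows how, for the same architecture, describing it in different languages gives different depth measurements.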
Figure 1.3: Illustration of computational graphs mapping an input to an output where each node performs an operation. Depth is the length of the longest path from input to output but depends on the definition of what constitutes a possible computational step. The computation depicted in these graphs is the output of a logistic regression model, σ(wᵀx), where σ is the logistic sigmoid function. If we use addition, multiplication and logistic sigmoids as the elements of our computer language, then this model has depth three. If we view logistic regression as an element itself, then this model has depth one.
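Figure 1.3: Computational graphs that map an input to an output, where each node performs one operation. The depth is the length of the longest path from input to output, and it depends on what counts as a single computational step. The computation shown in these graphs is the output of a logistic regression model, σ(wᵀx), where σ is the logistic sigmoid function. If we take addition, multiplication, and the logistic sigmoid as the elements of our computational language, the model has depth three. If we treat logistic regression as a single step, the model has depth one.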
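The two depth counts in the caption can be checked with a small sketch. It is only an illustration, not code from the book: the graph encoding, the node names (`mul1`, `add`, `logistic_regression`, and so on), and the choice of a two-feature input are all assumptions made for this example. Each graph maps an operation node to the nodes it reads from, and the depth is the number of operation nodes on the longest path from a raw input to the output.

```python
from functools import lru_cache


def depth(graph, output):
    """Longest path from any raw input to `output`, counted in operation nodes.

    `graph` maps each operation node to the list of nodes it reads from.
    Nodes that never appear as keys are treated as raw inputs (depth 0).
    """

    @lru_cache(maxsize=None)
    def node_depth(node):
        if node not in graph:
            return 0  # raw input: no operation applied yet
        # One step for this operation, plus the deepest of its inputs.
        return 1 + max(node_depth(parent) for parent in graph[node])

    return node_depth(output)


# Hypothetical encoding with addition, multiplication and sigmoid as the steps.
fine_grained = {
    "mul1": ["w1", "x1"],
    "mul2": ["w2", "x2"],
    "add": ["mul1", "mul2"],
    "sigmoid": ["add"],
}

# Hypothetical encoding with logistic regression itself as a single step.
coarse = {
    "logistic_regression": ["w", "x"],
}

fine_depth = depth(fine_grained, "sigmoid")
coarse_depth = depth(coarse, "logistic_regression")
print(fine_depth, coarse_depth)  # 3 1
```

The function being computed is the same in both encodings. Only the set of operations allowed as a single step changes, and the measured depth changes with it.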