Convolutional Neural Networks in Visual Computing_A Concise Guide-CRC(2018).pdf

Posted: 2021-03-01 15:26:44

[File attributes]

File name: Convolutional Neural Networks in Visual Computing_A Concise Guide-CRC(2018).pdf

File size: 5.85 MB

File format: PDF

Updated: 2021-03-01 15:26:44

Tags: artificial intelligence, Neural, deep learning

Deep learning architectures have attained incredible popularity in recent years due to their phenomenal success in, among other applications, computer vision tasks. Particularly, convolutional neural networks (CNNs) have been a significant force contributing to state-of-the-art results. The jargon surrounding deep learning and CNNs can often lead to the opinion that it is too labyrinthine for a beginner to study and master. Having this in mind, this book covers the fundamentals of deep learning for computer vision, designing and deploying CNNs, and deep computer vision architecture. This concise book is intended to serve as a beginner's guide for engineers, undergraduate seniors, and graduate students who seek a quick start on learning and/or building deep learning systems of their own. Written in an easy-to-read, mathematically nonabstruse tone, this book aims to provide a gentle introduction to deep learning for computer vision, while still covering the basics in ample depth.

The core of this book is divided into five chapters. Chapter 1 provides a succinct introduction to image representations and some computer vision models that are contemporarily referred to as hand-carved. The chapter provides the reader with a fundamental understanding of image representations and an introduction to some linear and nonlinear feature extractors or representations, and to the properties of these representations. This chapter also demonstrates the detection of some basic image entities, such as edges, and covers some basic machine learning tasks that can be performed using these representations. The chapter concludes with a study of two popular non-neural computer vision modeling techniques.

Chapter 2 introduces the concepts of regression, learning machines, and optimization. This chapter begins with an introduction to supervised learning. The first learning machine introduced is the linear regressor.
The first solution covered is the analytical solution for least squares. This analytical solution is studied alongside its maximum-likelihood interpretation. The chapter moves on to nonlinear models through basis function expansion. The problems of overfitting and generalization, addressed through cross-validation and regularization, are introduced next. The latter part of the chapter introduces optimization through gradient descent for both convex and nonconvex error surfaces. Further expanding this study with various types of gradient descent methods and the geometries of various regularizers, some modifications to the basic gradient descent method, including second-order loss minimization techniques and learning with momentum, are also presented.

Chapters 3 and 4 are the crux of this book. Chapter 3 builds on Chapter 2 by providing an introduction to the Rosenblatt perceptron and the perceptron learning algorithm. The chapter then introduces the logistic neuron and its activation. The single-neuron model is studied in both two-class and multiclass settings. The advantages and drawbacks of this neuron are studied, and the XOR problem is introduced. The idea of a multilayer neural network is proposed as a solution to the XOR problem, and the backpropagation algorithm, introduced along with several improvements, provides some pragmatic tips that help in engineering a better, more stable implementation.

Chapter 4 introduces the convpool layer and the CNN. It studies various properties of this layer and analyzes the features that are extracted for a typical digit recognition dataset. This chapter also introduces four of the most popular contemporary CNNs, AlexNet, VGG, GoogLeNet, and ResNet, and compares their architecture and philosophy.

Chapter 5 further expands and enriches the discussion of deep architectures by studying some modern, novel, and pragmatic uses of CNNs. The chapter is broadly divided into two contiguous sections.
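As a small taste of the Chapter 2 material summarized above, the closed-form least-squares solution and its gradient-descent counterpart can be sketched in a few lines of NumPy. This is a minimal sketch, not code from the book; the synthetic data, learning rate, and iteration count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D regression data: y ~ 1 + 2x plus a little noise (illustrative).
X = np.hstack([np.ones((50, 1)), rng.uniform(-1, 1, (50, 1))])  # bias column + feature
y = X @ np.array([1.0, 2.0]) + 0.01 * rng.standard_normal(50)

# Analytical least-squares solution via the normal equations: (X^T X) w = X^T y.
w_closed = np.linalg.solve(X.T @ X, X.T @ y)

# Batch gradient descent on the same mean-squared-error loss.
w = np.zeros(2)
lr = 0.1
for _ in range(500):
    grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of the MSE loss
    w -= lr * grad

print(w_closed, w)  # both estimates land close to the true weights [1, 2]
```

On a convex quadratic loss like this one, gradient descent converges to the same minimizer the normal equations give in closed form; the later chapters' interest is in the nonconvex losses where no such closed form exists.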
The first part deals with the nifty philosophy of using downloadable, pretrained, off-the-shelf networks. Pretrained networks are essentially trained on a wholesome dataset and made available for the public at large to fine-tune for a novel task. These are studied under the scope of generality and transferability. Chapter 5 also studies the compression of these networks, and alternative methods of learning a new task given a pretrained network, in the form of mentee networks. The second part of the chapter deals with the idea of CNNs that are used not for supervised learning but as generative networks. The section briefly studies autoencoders and the newest novelty in deep computer vision: generative adversarial networks (GANs).

The book comes with a website (convolution.network), which serves as a supplement and contains code and implementations, color illustrations of some figures, errata, and additional materials. This book also led to a graduate-level course taught in the Spring of 2017 at Arizona State University; lectures and materials for that course are also available at the book website.

Figure 1 in Chapter 1 of the book is an original image (original.jpg) that I shot and for which I hold the rights. It is a picture of Monument Valley, which, as far as imagery goes, is representative of the Southwest, where ASU is located. The art in memory.png was painted in the style of Salvador Dali, particularly of his painting "The Persistence of Memory," which deals abstractly with the concept of the mind hallucinating, picturing, and processing objects in shapeless forms, much like some representations of the neural networks we study in the book. The art in memory.png was painted not by a human but by a neural network similar to the ones we discuss in the book, hence the connection. Below is the citation reference.
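The convolution operation at the heart of the convpool layers of Chapter 4, and of the hand-carved edge detectors of Chapter 1, can be previewed with a minimal NumPy sketch. The toy image, the kernel, and the helper name `conv2d` are illustrative assumptions, not material from the book.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2-D cross-correlation: the core operation of a conv layer."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Each output value is the kernel dotted with one image patch.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A toy image: left half dark, right half bright, so there is one vertical edge.
img = np.zeros((5, 6))
img[:, 3:] = 1.0

# A hand-carved horizontal-difference kernel responds only at that edge.
kernel = np.array([[-1.0, 1.0]])
response = conv2d(img, kernel)
print(response)  # nonzero only along the dark-to-bright boundary (column 2)
```

The difference between a hand-carved model and a CNN is precisely where `kernel` comes from: here it is fixed by hand, while in Chapter 4 the kernel entries are parameters learned by backpropagation.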

