File name: Information-Theoretic Aspects of Neural Networks
File size: 9.11 MB
File format: TGZ
Last updated: 2011-10-27 07:27:34
Information-Theoretic Neural Networks
Preface

Chapter 1—Introduction
1.1 Neuroinformatics
1.1.1 Neural Memory: Neural Information Storage
1.1.2 Information-Traffic in the Neurocybernetic System
1.2 Information-Theoretic Framework of Neurocybernetics
1.3 Entropy, Thermodynamics and Information Theory
1.4 Information-Theoretics and Neural Network Training
1.4.1 Cross-Entropy Based Error-Measures
1.4.2 Symmetry Aspects of Divergence Function
1.4.3 Csiszár's Generalized Error-Measures
1.4.4 Jaynes' Rationale of Maximum Entropy
1.4.5 Mutual Information Maximization
1.5 Dynamics of Neural Learning in the Information-Theoretic Plane
1.6 Neural Nonlinear Activity in the Information-Theoretic Plane
1.7 Degree of Neural Complexity and Maximum Entropy
1.8 Concluding Remarks
Bibliography
Appendix 1.1 Concepts and Definitions in Information Theory
Appendix 1.2 Functional Equations Related to Information Theory
Appendix 1.3 A Note on Generalized Information Functions

Chapter 2—Neural Complex: A Nonlinear C3I System?
2.1 Introduction
2.2 Neural Information Processing: CI Protocols
2.3 Nonlinear Neuron Activity
2.4 Bernoulli-Riccati Equations
2.5 Nonlinear Neural Activity: Practical Considerations
2.5.1 Stochastical Response of Neurons under Activation
2.5.2 Representation of a Neuron as an Input-Dictated Cybernetic Regulator with a Quadratic Cost-Function
2.5.3 Generalized Information-Theoretic Entropy Measure
2.5.4 Influence of Nonlinear Neuronal Activity on Information
2.5.5 Significance of the Parameter Q in the Generalized Bernoulli Function LQ(.) Depicting the Neuronal Input-Output Relations
2.6 Entropy/Information Flow across a Neural Nonlinear Process
2.7 Nonsigmoidal Activation Functions
2.8 Definitions and Lemmas on Certain Classes of Nonlinear Functions
2.9 Concluding Remarks
Bibliography
Appendix 2.1 Linear Approximation of Sigmoidal Functions
Appendix 2.2 Statistical Mechanics Attributes of the Neural Complex: Langevin's Theory of Dipole Polarization

Chapter 3—Nonlinear and Informatic Aspects of Fuzzy Neural Activity
3.1 Introduction
3.2 What Is Fuzzy Activity?
3.3 Crisp Sets versus Fuzzy Sets
3.3.1 Symbols and Notations
3.3.2 Concept of α-Cut
3.3.3 Height of a Fuzzy Set
3.4 Membership Attributions to a Fuzzy Set
3.5 Fuzzy Neural Activity
3.6 Fuzzy Differential Equations
3.6.1 A Theorem on the Ordinary Differential Equation Mapping a Region of Uncertainty
3.7 Membership Attributions to Fuzzy Sets via Activation Function
3.8 Neural Architecture with a Fuzzy Sigmoid
3.9 Fuzzy Considerations, Uncertainty and Information
3.10 Information-Theoretics of Crisp and Fuzzy Sets
3.10.1 Hartley-Shannon Function: A Nonspecificity Measure
3.10.2 Uncertainty: A Measure of Fuzziness
3.10.3 Shannon Entropy: An Uncertainty Measure of Discord or Conflict
3.10.4 Theorems on Fuzzy Entropy
3.10.5 Fuzzy Mutual Entropy versus Shannon Entropy
3.10.6 Real Space and Fuzzy Cubes
3.10.7 Fuzzy Chain: Definition and Concept
3.11 Fuzzy Neuroinformatics
3.12 Concluding Remarks
Bibliography

Chapter 4—Csiszár's Information-Theoretic Error-Measures for Neural Network Optimizations
4.1 Introduction
4.2 Disorganization and Entropy Considerations in Neural Networks
4.3 Information-Theoretic Error-Measures
4.3.1 Square-Error (SE) Measure in the Parametric Space
4.3.2 Relative Entropy (RE) Error-Measure
4.3.3 Kullback-Leibler Family of Error-Measures
4.3.4 Generalized Directed Divergence of Degree 1
4.3.5 Csiszár's Family of Minimum Directed Divergence Measures
4.3.6 Other Possible Cases of Csiszár's Function Φ(x)
4.4 Neural Nonlinear Response vs. Optimization Algorithms
4.5 A Multilayer Perceptron Training with Information-Theoretic Cost-Functions
4.5.1 Description of the Network and On-Line Experiments
4.5.2 Training Phase
4.6 Results on Neural Network Training with Csiszár's Error-Measures
4.6.1 Unusable Error-Measures
4.6.2 Usable Error-Metrics
4.6.3 Scope of the Information-Theoretic Error-Measures for Neural Network Optimization
4.6.4 Note on the Nonlinear Activity Function Used
4.6.5 Gradient-Descent Algorithm
4.6.6 Hidden Layer Considerations
4.7 Symmetrized Information-Theoretic Error-Metrics
4.8 One-Sided Error-Measures and Implementation of Symmetrization
4.9 Efficacy of the Error-Measures in Neural Network Training
4.9.1 Square-Error (SE) Measure
4.9.2 Relative Entropy (RE) Error-Measure
4.9.3 Kullback-Leibler Family of Error-Measures
4.9.4 Generalized Jensen Error-Measure
4.9.5 Csiszár's Family of Error-Metrics
4.9.6 Symmetrized Havrda and Charvát Error-Measure
4.9.7 Symmetrized Sharma and Mittal Error-Measure
4.9.8 Symmetrized Rényi Measure
4.9.9 Symmetrized Kapur Type 2 Error-Measure
4.10 Generalized Csiszár's Symmetrized Error-Measures
4.10.1 Error Measure #1
4.10.2 Error Measures #2 and #3
4.10.3 Symmetrized Generalized Csiszár Error-Measures #4 and #5
4.11 Concluding Remarks
Bibliography

Chapter 5—Dynamics of Neural Learning in the Information-Theoretic Plane
5.1 Introduction
5.2 Stochastical Neural Dynamics
5.3 Stochastical Dynamics of the Error-Measure (ε)
5.4 Random Walk Paradigm of ε(t) Dynamics
5.5 Evolution of ε(t): Representation via the Fokker-Planck Equation
5.6 Logistic Growth Model of ε(t)
5.7 Convergence Considerations
5.7.1 Stochastical Equilibrium
5.7.2 Definitions and Theorems
5.7.3 On-Line Simulation and Results
5.8 Further Considerations on the Dynamics of ε(t)
5.8.1 Competing Augmentative and Annihilative Information Species
5.8.2 Terminal Attractor Dynamics of Error-Metrics
5.9 Dynamics of Fuzzy Uncertainty
5.9.1 Fuzzy Uncertainty and Related Error-Metrics
5.9.2 Dynamics of the Mutual Information Error-Metric
5.10 Concluding Remarks
Bibliography

Chapter 6—Informatic Perspectives of Complexity Measure in Neural Networks
6.1 Introduction
6.2 Neural Complexity
6.3 Complexity Measure
6.4 Neural Networks: Simple and Complex
6.5 Neural Complexity versus Neural Entropy
6.6 Neural Network Training via Complexity Parameter
6.7 Calculation of and
6.8 Perceptron Training: Simulated Results
6.9 Concluding Remarks
Bibliography
Appendix 6.1 Maximum Entropy Principles
6A.1 General
6A.2 Jaynes' Maximum Entropy Considerations
6A.3 Inverse Entropy Principles
6A.4 Maximum Entropy versus Minimum Entropy in Optimization Strategies: A Summary
6A.5 Optimization Postulation in a Nutshell

Chapter 7—Information-Theoretic Aspects of Neural Stochastic Resonance
7.1 Introduction
7.1.1 The Emergence of SR
7.1.2 Characteristics of SR
7.1.3 Physical Description of SR
7.2 Inter-Event Histograms (IIH) and Stochastic Resonance
7.2.1 Biological Characteristics of IIH
7.2.2 Computer Simulations of Stochastic Resonance
7.3 A Neural Network under SR-Based Learning
7.3.1 Training Phase
7.3.2 Cost-Function
7.3.3 Gross-Features of Network Uncertainty
7.4 Simulation Results
7.5 Concluding Remarks
Bibliography

Chapter 8—Neural Informatics and Genetic Algorithms
8.1 Entropy, Thermobiodynamics and Bioinformatics
8.1.2 Informatics of DNA Molecules
8.2 Genetic Code
8.2.1 Biological Cells, Genetic Mapping and Genetic Algorithms
8.3 Search Algorithms
8.3.1 Cybernetics, Complex Systems and Optimization
8.3.2 Problem Description
8.3.3 Integer Programming
8.3.4 Simulated Annealing
8.3.5 Hybrid Method
8.3.6 Genetic Algorithms
8.3.7 Paradigm of Genetic Algorithms
8.3.8 Genetic Algorithm Based Optimization
8.4 Simple Genetic Algorithm (SGA)
8.4.1 Fitness Attribution and Reproduction
8.4.2 Crossover
8.4.3 Mutation
8.5 Genetic Algorithms and Neural Networks
8.5.1 Neural Networks: A Review
8.5.2 Neural Network Training Algorithms
8.5.3 Application of Genetic Algorithms to Artificial Neural Networks
8.6 Information-Theoretics of Genetic Selection Algorithm
8.6.1 Genetic Information
8.6.2 Use of IT Plus GA in Neural Networks
8.6.3 Information-Theoretic Aspects of the GAs for ANN Training
8.7 A Test ANN Architecture Deploying GA and IT Concepts
8.7.1 Test ANN for Simulation Studies
8.7.2 Two-Point Crossover
8.7.3 Information-Theoretic (Diversity) Check for Crossover Pair-Selection
8.7.4 Cost-Function Calculation
8.8 Description of the Algorithm
8.9 Experimental Simulations
8.9.1 SET T: Training Bit-Maps
8.9.2 SET P: Prediction Phase
8.9.3 Remarks on Test Optimization
8.10 Concluding Remarks
Bibliography

Subject Index