theano sparse_block_dot

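sparse_block_dot is a function in Theano that computes a dot product between selected blocks of a weight matrix and the matching blocks of a sparse input.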

Function:

for b in range(batch_size):

    for j in range(o.shape[1]):

        for i in range(h.shape[1]):

            o[b, j, :] += numpy.dot(h[b, i], W[iIdx[b, i], oIdx[b, j]])
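
The loop above can be run directly in NumPy. The sketch below is not Theano's implementation: the function names `block_dot_loop` and `block_dot_einsum` are hypothetical, the array sizes are arbitrary, and the bias term is left out because the loop does not mention it. It transcribes the loop and compares it against a vectorized version built from fancy indexing and `einsum`.

```python
import numpy as np


def block_dot_loop(W, h, iIdx, oIdx):
    """Direct transcription of the triple loop (no bias term)."""
    batch_size, iWin = iIdx.shape
    oWin = oIdx.shape[1]
    oSize = W.shape[3]
    o = np.zeros((batch_size, oWin, oSize))
    for b in range(batch_size):
        for j in range(oWin):
            for i in range(iWin):
                # h[b, i] has shape (iSize,); the selected block is (iSize, oSize)
                o[b, j, :] += np.dot(h[b, i], W[iIdx[b, i], oIdx[b, j]])
    return o


def block_dot_einsum(W, h, iIdx, oIdx):
    """Same computation without Python loops."""
    # W_sel[b, i, j] is the (iSize, oSize) block W[iIdx[b, i], oIdx[b, j]]
    W_sel = W[iIdx[:, :, None], oIdx[:, None, :]]
    # Sum over the input blocks i and the input features s
    return np.einsum('bis,bijso->bjo', h, W_sel)


# Arbitrary sizes, chosen only for illustration
rng = np.random.default_rng(0)
iBlocks, oBlocks, iSize, oSize = 4, 5, 3, 2
batch, iWin, oWin = 2, 3, 2
W = rng.standard_normal((iBlocks, oBlocks, iSize, oSize))
h = rng.standard_normal((batch, iWin, iSize))
iIdx = rng.integers(0, iBlocks, size=(batch, iWin))
oIdx = rng.integers(0, oBlocks, size=(batch, oWin))

o_loop = block_dot_loop(W, h, iIdx, oIdx)
o_vec = block_dot_einsum(W, h, iIdx, oIdx)
print(o_loop.shape, np.allclose(o_loop, o_vec))
```

Only the blocks of W named by the two index arrays are ever read, which is where the saving over a dense product comes from.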

Input Parameters

- W (iBlocks, oBlocks, iSize, oSize) – weight matrix

- h (batch, iWin, iSize) – input from lower layer (sparse)

- inputIdx (batch, iWin) – indexes of the input blocks (written iIdx in the loop above)

- b (oBlocks, oSize) – bias vector

- outputIdx (batch, oWin) – indexes of the output blocks (written oIdx in the loop above)

Return

- dot(W[i, j], h[i]) + b[j], summed over the input blocks i; b[j] is added only once per output block j

- shape: (batch, oWin, oSize)
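
To make the "added only once" rule concrete, here is a minimal NumPy sketch of the full operation including the bias. The name `sparse_block_dot_ref` is hypothetical and this is a reference loop written from the description above, not Theano's code. With W set to zero the result is exactly the selected bias rows, even though three input blocks are accumulated.

```python
import numpy as np


def sparse_block_dot_ref(W, h, inputIdx, b, outputIdx):
    """Reference loop: block dot products plus a bias added once per output block."""
    batch, iWin = inputIdx.shape
    oWin = outputIdx.shape[1]
    # Start from the bias of each selected output block, shape (batch, oWin, oSize).
    # Fancy indexing returns a copy, so b itself is never modified.
    o = b[outputIdx].astype(float)
    for n in range(batch):
        for j in range(oWin):
            for i in range(iWin):
                o[n, j] += np.dot(h[n, i], W[inputIdx[n, i], outputIdx[n, j]])
    return o


# Illustrative sizes: iBlocks=2, oBlocks=3, iSize=4, oSize=5, batch=1, iWin=3, oWin=2
W_zero = np.zeros((2, 3, 4, 5))
h = np.ones((1, 3, 4))
inputIdx = np.array([[0, 1, 0]])
b = np.arange(15, dtype=float).reshape(3, 5)
outputIdx = np.array([[2, 0]])

out = sparse_block_dot_ref(W_zero, h, inputIdx, b, outputIdx)
print(out.shape)                       # (batch, oWin, oSize)
print(np.allclose(out, b[outputIdx]))  # bias appears once, not iWin times
```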

Applications

Used to compute theano.tensor.nnet.h_softmax.

Code



import theano
from theano import tensor
from theano.tensor.nnet.blocksparse import sparse_block_dot


def h_softmax(x, batch_size, n_outputs, n_classes, n_outputs_per_class,
              W1, b1, W2, b2, target=None):
    """Two-level hierarchical softmax."""
    # First softmax: probability of belonging to each class
    class_probs = theano.tensor.nnet.softmax(tensor.dot(x, W1) + b1)

    if target is None:  # Compute the probabilities of all the outputs
        # Second softmax: output probabilities within each class
        activations = tensor.tensordot(x, W2, (1, 1)) + b2
        output_probs = theano.tensor.nnet.softmax(
            activations.reshape((-1, n_outputs_per_class)))
        output_probs = output_probs.reshape((batch_size, n_classes, -1))
        output_probs = class_probs.dimshuffle(0, 1, 'x') * output_probs
        output_probs = output_probs.reshape((batch_size, -1))
        # output_probs.shape[1] is n_classes * n_outputs_per_class, which might
        # be greater than n_outputs, so the irrelevant trailing outputs are
        # dropped on the next line:
        output_probs = output_probs[:, :n_outputs]

    else:  # Compute the probabilities of the outputs specified by the targets
        target = target.flatten()
        # Class to which each target belongs
        target_classes = target // n_outputs_per_class
        # Index of each target inside its class
        target_outputs_in_class = target % n_outputs_per_class
        # Second softmax: only the W2 block of each target's class is used,
        # with a single input block (index 0) and a single output block
        activations = sparse_block_dot(
            W2.dimshuffle('x', 0, 1, 2), x.dimshuffle(0, 'x', 1),
            tensor.zeros((batch_size, 1), dtype='int32'), b2,
            target_classes.dimshuffle(0, 'x'))
        output_probs = theano.tensor.nnet.softmax(activations.dimshuffle(0, 2))
        target_class_probs = class_probs[tensor.arange(batch_size),
                                         target_classes]
        output_probs = output_probs[tensor.arange(batch_size),
                                    target_outputs_in_class]
        output_probs = target_class_probs * output_probs

    return output_probs
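
To see what the two branches compute without building a Theano graph, the sketch below redoes the same arithmetic in NumPy. It is an illustration under stated assumptions, not the library's code: `softmax` and `h_softmax_np` are hypothetical helpers, the shapes (W1 as (n_features, n_classes), W2 as (n_classes, n_features, n_outputs_per_class)) are inferred from the listing above, and the sizes are arbitrary. The target branch, where Theano calls sparse_block_dot with one input block and one output block per example, should match the corresponding entries of the full branch.

```python
import numpy as np


def softmax(a):
    """Numerically stable softmax over the last axis."""
    e = np.exp(a - a.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def h_softmax_np(x, n_outputs, n_outputs_per_class, W1, b1, W2, b2, target=None):
    """NumPy sketch of the two-level hierarchical softmax."""
    batch_size = x.shape[0]
    class_probs = softmax(x @ W1 + b1)  # (batch, n_classes)
    if target is None:
        # activations[b, c, :] = x[b] @ W2[c] + b2[c] for every class c
        activations = np.einsum('bf,cfo->bco', x, W2) + b2
        output_probs = class_probs[:, :, None] * softmax(activations)
        return output_probs.reshape(batch_size, -1)[:, :n_outputs]
    target = target.ravel()
    target_classes = target // n_outputs_per_class
    target_in_class = target % n_outputs_per_class
    # Only the W2 block of each target's class is touched: this is the
    # block-sparse product with iWin = oWin = 1
    activations = np.einsum('bf,bfo->bo', x, W2[target_classes]) + b2[target_classes]
    output_probs = softmax(activations)
    rows = np.arange(batch_size)
    return class_probs[rows, target_classes] * output_probs[rows, target_in_class]


# Arbitrary sizes for illustration
rng = np.random.default_rng(0)
batch_size, n_features, n_classes, n_outputs_per_class = 4, 3, 2, 3
n_outputs = n_classes * n_outputs_per_class
x = rng.standard_normal((batch_size, n_features))
W1 = rng.standard_normal((n_features, n_classes))
b1 = rng.standard_normal(n_classes)
W2 = rng.standard_normal((n_classes, n_features, n_outputs_per_class))
b2 = rng.standard_normal((n_classes, n_outputs_per_class))
target = np.array([0, 4, 2, 3])

full = h_softmax_np(x, n_outputs, n_outputs_per_class, W1, b1, W2, b2)
picked = h_softmax_np(x, n_outputs, n_outputs_per_class, W1, b1, W2, b2, target)
print(np.allclose(picked, full[np.arange(batch_size), target]))
```

When targets are given, the second-level softmax runs over n_outputs_per_class values per example instead of all n_outputs.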
