scipy.io.loadmat nested structures (i.e. dictionaries)

Time: 2022-05-03 06:44:29

Using the given routines (how to load Matlab .mat files with scipy), I could not access deeper nested structures in order to recover them as dictionaries.

To present the problem I ran into in more detail, here is a toy example:

import scipy.io as spio
a = {'b':{'c':{'d': 3}}}
# my dictionary: a['b']['c']['d'] = 3
spio.savemat('xy.mat',a)

Now I want to read the .mat file back into Python. I tried the following:

vig=spio.loadmat('xy.mat',squeeze_me=True)

If I now want to access the fields I get:

>> vig['b']
array(((array(3),),), dtype=[('c', '|O8')])
>> vig['b']['c']
array(array((3,), dtype=[('d', '|O8')]), dtype=object)
>> vig['b']['c']['d']
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)

/<ipython console> in <module>()

ValueError: field named d not found.

However, by using the option struct_as_record=False the field could be accessed:

v=spio.loadmat('xy.mat',squeeze_me=True,struct_as_record=False)

Now the field can be accessed with:

>> v['b'].c.d
array(3)
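
For reference, the round trip above can be reproduced as a small Python 3 script. This is only a sketch: the temporary file location is an assumption made for the example, and `_fieldnames` is the attribute that the answers below rely on to list the fields of a loaded struct.

```python
import os
import tempfile

import scipy.io as spio

# Write the toy dictionary from the question to a temporary .mat file
# (the temporary directory is an assumption made for this example).
path = os.path.join(tempfile.mkdtemp(), 'xy.mat')
spio.savemat(path, {'b': {'c': {'d': 3}}})

# With struct_as_record=False every Matlab struct is loaded as a mat_struct
# object, so nested fields can be read as plain attributes.
v = spio.loadmat(path, squeeze_me=True, struct_as_record=False)

field_names = v['b']._fieldnames   # names of the fields stored on the struct
value = v['b'].c.d                 # attribute access down the nesting
print(field_names, value)
```

Attribute access and lookup through `__dict__` read the same stored objects, so either form can be used to walk the nesting.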

5 answers

#1


33  

Here are functions that reconstruct the dictionaries; just use this loadmat instead of scipy.io's loadmat:

import scipy.io as spio

def loadmat(filename):
    '''
    this function should be called instead of direct spio.loadmat
    as it cures the problem of not properly recovering python dictionaries
    from mat files. It calls the function check keys to cure all entries
    which are still mat-objects
    '''
    data = spio.loadmat(filename, struct_as_record=False, squeeze_me=True)
    return _check_keys(data)

def _check_keys(d):
    '''
    checks if entries in dictionary are mat-objects. If yes
    todict is called to change them to nested dictionaries
    '''
    # the argument is named d so that the builtin dict is not shadowed
    for key in d:
        if isinstance(d[key], spio.matlab.mio5_params.mat_struct):
            d[key] = _todict(d[key])
    return d

def _todict(matobj):
    '''
    A recursive function which constructs from matobjects nested dictionaries
    '''
    d = {}
    for strg in matobj._fieldnames:
        elem = matobj.__dict__[strg]
        if isinstance(elem, spio.matlab.mio5_params.mat_struct):
            d[strg] = _todict(elem)
        else:
            d[strg] = elem
    return d
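
On newer SciPy releases a similar nested-dict result may be available without a helper. The sketch below assumes SciPy 1.5 or later, where `loadmat` accepts `simplify_cells=True`; it is offered as an alternative to try, not as a replacement for the functions above, and the temporary file path is an assumption made for the example.

```python
import os
import tempfile

import scipy.io as spio

# Same toy dictionary as in the question, written to a temporary file.
path = os.path.join(tempfile.mkdtemp(), 'xy.mat')
spio.savemat(path, {'b': {'c': {'d': 3}}})

# simplify_cells=True (assumed available: SciPy 1.5 or later) asks loadmat
# to return Matlab structs as nested Python dicts; it implies
# struct_as_record=False and squeeze_me=True.
data = spio.loadmat(path, simplify_cells=True)
nested_value = data['b']['c']['d']
print(type(data['b']).__name__, nested_value)
```

If the installed SciPy is older than that, the helper functions above remain the way to go.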

#2


8  

Just an enhancement to mergen's answer, which unfortunately stops recursing if it reaches a cell array of objects. The following version makes lists of them instead and continues the recursion into the cell array elements where possible.

import scipy.io as spio  # the helpers below refer to the module as spio
import numpy as np


def loadmat(filename):
    '''
    this function should be called instead of direct spio.loadmat
    as it cures the problem of not properly recovering python dictionaries
    from mat files. It calls the function check keys to cure all entries
    which are still mat-objects
    '''
    def _check_keys(d):
        '''
        checks if entries in dictionary are mat-objects. If yes
        todict is called to change them to nested dictionaries
        '''
        for key in d:
            if isinstance(d[key], spio.matlab.mio5_params.mat_struct):
                d[key] = _todict(d[key])
        return d

    def _todict(matobj):
        '''
        A recursive function which constructs from matobjects nested dictionaries
        '''
        d = {}
        for strg in matobj._fieldnames:
            elem = matobj.__dict__[strg]
            if isinstance(elem, spio.matlab.mio5_params.mat_struct):
                d[strg] = _todict(elem)
            elif isinstance(elem, np.ndarray):
                d[strg] = _tolist(elem)
            else:
                d[strg] = elem
        return d

    def _tolist(ndarray):
        '''
        A recursive function which constructs lists from cellarrays
        (which are loaded as numpy ndarrays), recursing into the elements
        if they contain matobjects.
        '''
        elem_list = []
        for sub_elem in ndarray:
            if isinstance(sub_elem, spio.matlab.mio5_params.mat_struct):
                elem_list.append(_todict(sub_elem))
            elif isinstance(sub_elem, np.ndarray):
                elem_list.append(_tolist(sub_elem))
            else:
                elem_list.append(sub_elem)
        return elem_list
    data = spio.loadmat(filename, struct_as_record=False, squeeze_me=True)
    return _check_keys(data)
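
To see the cell-array case in action, here is a compact variant of the same idea applied to a struct that holds a cell array of structs. It is a sketch under a few assumptions: `to_python` and the field names `entries` and `samples` are made up for the example, the import fallback assumes that newer SciPy exposes `mat_struct` directly under `scipy.io.matlab`, and, unlike the version above, only object arrays (loaded cell arrays) are turned into lists while numeric arrays are left as they are.

```python
import os
import tempfile

import numpy as np
import scipy.io as spio

try:
    from scipy.io.matlab import mat_struct              # newer SciPy releases
except ImportError:
    from scipy.io.matlab.mio5_params import mat_struct  # location used in this thread


def to_python(obj):
    """Hypothetical helper: mat_struct -> dict, object array -> list.

    Numeric arrays and scalars are returned unchanged.
    """
    if isinstance(obj, mat_struct):
        return {name: to_python(getattr(obj, name)) for name in obj._fieldnames}
    if isinstance(obj, np.ndarray) and obj.dtype == object:
        return [to_python(item) for item in obj]
    return obj


# A struct with a cell array of two structs and a numeric vector.
cells = np.empty(2, dtype=object)
cells[0] = {'x': 1}
cells[1] = {'x': 2}
path = os.path.join(tempfile.mkdtemp(), 'cells.mat')
spio.savemat(path, {'s': {'entries': cells,
                          'samples': np.array([1.0, 2.0, 3.0])}})

raw = spio.loadmat(path, struct_as_record=False, squeeze_me=True)
result = to_python(raw['s'])
print(result['entries'], type(result['samples']).__name__)
```

Whether numeric arrays should also become lists, as in the version above, depends on what the caller wants to do with them afterwards.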

#3


2  

Found a solution: the content of the "scipy.io.matlab.mio5_params.mat_struct" object can be accessed via:

v['b'].__dict__['c'].__dict__['d']

#4


0  

I was advised on the scipy mailing list (https://mail.python.org/pipermail/scipy-user/) that there are two more ways to access this data.

This works:

import scipy.io as spio
vig=spio.loadmat('xy.mat')
print(vig['b'][0, 0]['c'][0, 0]['d'][0, 0])

Output on my machine: 3

The reason for this kind of access: "For historic reasons, in Matlab everything is at least a 2D array, even scalars." So scipy.io.loadmat mimics Matlab behavior by default.
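
The 2D wrapping can be made visible by printing the shapes at each level. A minimal sketch, assuming the toy file from the question is written to a temporary location first:

```python
import os
import tempfile

import scipy.io as spio

path = os.path.join(tempfile.mkdtemp(), 'xy.mat')
spio.savemat(path, {'b': {'c': {'d': 3}}})

# Default options: nothing is squeezed, structs come back as record arrays.
vig = spio.loadmat(path)

outer_shape = vig['b'].shape                        # the struct is a 1x1 array
inner_shape = vig['b'][0, 0]['c'][0, 0]['d'].shape  # the scalar is 1x1 as well
value = vig['b'][0, 0]['c'][0, 0]['d'][0, 0]
print(outer_shape, inner_shape, value)
```

Each `[0, 0]` simply picks the single element of one of these 1x1 arrays.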

#5


0  

Another method that works:

import scipy.io as spio
vig=spio.loadmat('xy.mat',squeeze_me=True)
print(vig['b']['c'].item()['d'])

Output:

3

I learned this method on the scipy mailing list, too. I certainly don't understand (yet) why '.item()' has to be added in, and:

print(vig['b']['c']['d'])

will throw an error instead:

IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices

but I'll be back to supplement the explanation when I know it. Explanation of numpy.ndarray.item (from the NumPy reference): Copy an element of an array to a standard Python scalar and return it.
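
A possible explanation, offered as a sketch rather than an authoritative account: with squeeze_me=True, vig['b']['c'] is a zero-dimensional array of dtype object that merely wraps the inner record array. The wrapper itself has no field named 'd', so indexing it with a string fails, while .item() hands back the wrapped object, which does have that field. The temporary file path below is an assumption made for the example.

```python
import os
import tempfile

import scipy.io as spio

path = os.path.join(tempfile.mkdtemp(), 'xy.mat')
spio.savemat(path, {'b': {'c': {'d': 3}}})
vig = spio.loadmat(path, squeeze_me=True)

outer = vig['b']['c']   # 0-d array with dtype=object: a wrapper, not a record
try:
    outer['d']
    field_lookup_failed = False
except (IndexError, ValueError):
    # The thread shows both exception types, depending on the version used.
    field_lookup_failed = True

inner = outer.item()    # unwrap the stored object: a record array with field 'd'
value = inner['d']
print(outer.dtype, inner.dtype.names, field_lookup_failed, value)
```

The printed dtype of `outer` and the field names of `inner` show which of the two objects actually carries the field.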

(Please note that this answer is basically the same as hpaulj's comment on the initial question, but I felt that the comment was not visible or understandable enough. I certainly did not notice it when I first searched for a solution some weeks ago.)
