I have data that are acquired every 3 seconds. They always start within a narrow baseline range (e.g. 100±10), but after ~30 seconds the values begin to increase.
Here's an example.
The issue is that in each experiment the baseline may start at a different point on the y-axis (e.g. 100, 250, 35) because of variations in equipment calibration.
Although the relative signal enhancement at ~30 seconds behaves the same across experiments, the curves may be offset from one another along the y-axis.
My intention is to measure the AUC (area under the curve) of these curves. Because of the offset between experiments, the AUC values are not comparable, even when the curves are identical in shape and enhancement ratio.
Therefore I need to normalize the data so that, regardless of offset, all curves share a comparable baseline value. This could be set to 0.
Can you give me any suggestions on how to do this normalization in MATLAB?
Ideally the output should be the relative signal enhancement (in percent relative to baseline).
For example, the baseline values above would hover around 0±10 (instead of the raw value of ~139), and with enhancement they would rise to ~65% (instead of the raw value of ~230).
Sample data:
index SQMean
_____ ____________
'0' '139.428574'
'1' '133.298706'
'2' '135.961044'
'3' '143.688309'
'4' '133.298706'
'5' '133.181824'
'6' '134.896103'
'7' '146.415588'
'8' '142.324677'
'9' '128.168839'
'10' '146.116882'
'11' '146.766235'
'12' '134.675323'
'13' '138.610382'
'14' '140.558441'
'15' '128.662338'
'16' '138.480515'
'17' '153.610382'
'18' '156.207794'
'19' '183.428574'
'20' '220.324677'
'21' '224.324677'
'22' '230.415588'
'23' '226.766235'
'24' '223.935059'
'25' '229.922073'
'26' '234.389618'
'27' '235.493500'
'28' '225.727280'
'29' '241.623383'
'30' '225.805191'
'31' '240.896103'
'32' '224.090912'
'33' '230.467529'
'34' '248.285721'
'35' '233.779221'
'36' '225.532471'
'37' '247.337662'
'38' '233.000000'
'39' '241.740265'
'40' '235.688309'
'41' '238.662338'
'42' '236.636368'
'43' '236.025970'
'44' '234.818176'
'45' '240.974030'
'46' '251.350647'
'47' '241.857147'
'48' '242.623383'
'49' '245.714279'
'50' '250.701294'
'51' '229.415588'
'52' '236.909088'
'53' '243.779221'
'54' '244.532471'
'55' '241.493500'
'56' '245.480515'
'57' '244.324677'
'58' '244.025970'
'59' '231.987015'
'60' '238.740265'
'61' '239.532471'
'62' '232.363632'
'63' '242.454544'
'64' '243.831161'
'65' '229.688309'
'66' '239.493500'
'67' '247.324677'
'68' '245.324677'
'69' '244.662338'
'70' '238.610382'
'71' '243.324677'
'72' '234.584412'
'73' '235.181824'
'74' '228.974030'
'75' '228.246750'
'76' '230.519485'
'77' '231.441559'
'78' '236.324677'
'79' '229.935059'
'80' '238.701294'
'81' '236.441559'
'82' '244.350647'
'83' '233.714279'
'84' '243.753250'
2 Answers
#1
2
Close to what was mentioned by Shai:
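nrSamp = 10;           % example value only: number of baseline samples, choose it for your data
blwindow = 1:nrSamp;   % indices of the baseline window
DataNorm = 100*(Data/mean(Data(blwindow))-1);   % percent change relative to the baseline mean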
Set the window to the right size; how you determine it depends on your data. The output DataNorm is in %.
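Since the goal is a comparable AUC, here is a minimal sketch of how the percent normalization could feed into an area calculation. It assumes a 3 s sampling interval (from the question), a baseline window of 10 samples (my choice, not the asker's), and a shortened stand-in series built from rounded values of samples 0-9 and 19-28 of the sample data. The names dt, t, baseline and auc are introduced here for illustration.

```matlab
% Sketch: percent enhancement relative to baseline, then AUC.
% Data is a shortened stand-in (rounded samples 0-9 and 19-28), not the full series.
Data = [139.4 133.3 136.0 143.7 133.3 133.2 134.9 146.4 142.3 128.2 ...
        183.4 220.3 224.3 230.4 226.8 223.9 229.9 234.4 235.5 225.7];
dt = 3;                          % sampling interval in seconds (from the question)
t  = (0:numel(Data)-1) * dt;     % time axis in seconds

nrSamp   = 10;                   % assumed number of baseline samples
blwindow = 1:nrSamp;             % baseline indices
baseline = mean(Data(blwindow)); % baseline level for this experiment

DataNorm = 100 * (Data / baseline - 1);  % percent change relative to baseline
auc      = trapz(t, DataNorm);           % trapezoidal area, in percent x seconds

fprintf('baseline = %.2f, peak = %.1f percent, AUC = %.1f\n', ...
        baseline, max(DataNorm), auc);
```

Because every curve is divided by its own baseline before integration, a constant calibration offset no longer dominates the area, which is what makes the AUC values comparable between experiments.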
#2
1
Usually this kind of problem requires more specific knowledge about the data you are measuring (range, noise level, whether you know when the actual signal starts, etc.) and the results you are trying to achieve. However, based only on your question and your example graph, I'd do something like this (assuming your data is in two arrays, time and data):
initialTimeMax = 25; % take first 25 s
baseSample = data(time <= initialTimeMax); % take part of the data corresponding to the first 25 s
baseSampleAverage = mean(baseSample); % take average to deal with noise
data = data - baseSampleAverage;
If you don't know when your data starts, you can apply a smoothing filter, then take a derivative, find the x-position of its maximum, and set initialTimeMax to this x-position.
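The smoothing-and-derivative idea could be sketched roughly as follows. This is only an illustration under stated assumptions: the data are synthetic (a flat baseline, a step, and a small deterministic ripple in place of noise), the filter is a 3-sample moving average, and the names win, smoothed, slope, inner, onsetIdx and dataShifted are hypothetical. One deliberate deviation from the text: the steepest slope falls in the middle of the rise, so the sketch backs off by one filter width before taking the baseline.

```matlab
% Sketch: estimate where the signal leaves baseline, then subtract the baseline mean.
dt   = 3;                                 % sampling interval in seconds
time = (0:29)' * dt;                      % 30 samples
data = [100*ones(12,1); 200*ones(18,1)];  % synthetic step between samples 12 and 13
data = data + 2*sin((1:30)');             % small deterministic ripple instead of noise

win      = 3;                                    % assumed smoothing window (samples)
smoothed = conv(data, ones(win,1)/win, 'same');  % moving-average filter
slope    = diff(smoothed) ./ diff(time);         % finite-difference derivative
inner    = win : numel(slope) - win;             % skip edges distorted by zero padding
[~, k]   = max(slope(inner));                    % steepest rise
onsetIdx = inner(k);                             % index of the steepest rise

initialTimeMax    = time(onsetIdx) - win*dt;     % back off so the rise is excluded
baseSample        = data(time <= initialTimeMax);
baseSampleAverage = mean(baseSample);            % average to deal with noise
dataShifted       = data - baseSampleAverage;    % baseline now sits around 0
```

On real, noisier data the window length and the back-off margin would need tuning, and a wider filter may be needed before the derivative is usable.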