I know a way of finding and flagging missing values for a particular variable.
For the variable avedmajor, I could do
tab avedmajor, m
Then,
gen avedmajormissing=0
replace avedmajormissing=1 if avedmajor==.
But how can I check whether my dataset has missing values in any of its variables without going through them one by one?
Thanks.
4 Solutions
#1
3
One command is:
misstable summarize
But see also:
help missing##useful
and more generally:
help missing
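As a minimal sketch of that command, here it is on the auto dataset that ships with Stata (chosen only for illustration; it is not the asker's data). By default misstable summarize lists only the variables that contain missing values, and the all option documented in help misstable lists the others too. As far as I know, misstable reports on numeric variables only, so string variables need a separate check.

```stata
* Illustration on the shipped auto dataset (an assumption; any dataset works)
sysuse auto, clear

* Default: report only the variables that contain missing values
misstable summarize

* all: also list the variables that have no missing values
misstable summarize, all

* Cross-check a single variable by hand
count if missing(rep78)
```

The final count is only a sanity check against the misstable table for one variable.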
#2
2
I'd add the mdesc command to the proposed solutions. According to its description, mdesc:
Produces a table with the number of missing values, total number of cases, and percent missing for each variable in varlist. mdesc works with both numeric and character variables.
So its advantage over the misstable solution is that it handles both numeric and string variables in one go.
sysuse auto
mdesc
This gives a nice overview of the missing values:
    Variable    |     Missing          Total     Percent Missing
----------------+-----------------------------------------------
           make |           0             74           0.00
          price |           0             74           0.00
            mpg |           0             74           0.00
          rep78 |           5             74           6.76
       headroom |           0             74           0.00
          trunk |           0             74           0.00
         weight |           0             74           0.00
         length |           0             74           0.00
           turn |           0             74           0.00
   displacement |           0             74           0.00
     gear_ratio |           0             74           0.00
        foreign |           0             74           0.00
----------------+-----------------------------------------------
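If installing a user-written command is not an option, a similar per-variable count can be sketched with official commands only. The loop below is an illustration, not part of mdesc: it relies on the missing() function, which accepts both numeric and string variables, and prints a line only for variables that have at least one missing value.

```stata
* Illustration on the shipped auto dataset (an assumption; any dataset works)
sysuse auto, clear

* Loop over every variable; missing() handles numeric and string variables
foreach v of varlist _all {
    quietly count if missing(`v')
    if r(N) > 0 {
        display as text "`v'" _col(20) as result r(N) " missing of " _N
    }
}
```

On the unmodified auto data this should list only rep78, in line with the table above.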
#3
1
Various commands help. See e.g. codebook. For one user-written command, install nmissing.
. search nmissing, historical
Search of official help files, FAQs, Examples, SJs, and STBs
FAQ     . . . . . . Can I quickly see how many missing values a variable has?
        . . . . . . . . . . . . . . . . . . UCLA Academic Technology Services
        7/08    http://www.ats.ucla.edu/stat/stata/faq/nmissing.htm
Example . . . . . . . . . . . . . . . . . . . . Useful non-UCLA Stata programs
        . . . . . . . . . . . . . . . . . . UCLA Academic Technology Services
        7/08    http://www.ats.ucla.edu/stat/ado/world/
SJ-5-4  dm67_3 . . . . . . . . . . Software update for nmissing and npresent
        (help nmissing if installed) . . . . . . . . . . . . . . . N. J. Cox
        Q4/05   SJ 5(4):607
        now produces saved results
SJ-3-4  sg67_2 . . . . . . . . . . Software update for nmissing and npresent
        (help nmissing, npresent if installed) . . . . . . . . . . N. J. Cox
        Q4/03   SJ 3(4):449
        updated to include support for by, options for checking
        string values that contain spaces or periods, documentation
        of extended missing values .a to .z, and improved output
STB-60  dm67.1 . . . . Enhancements to numbers of missing and present values
        (help nmissing if installed) . . . . . . . . . . . . . . . N. J. Cox
        3/01    pp.2--3; STB Reprints Vol 10, pp.7--9
        updated with option for reporting on observations
STB-49  dm67 . . . . . . . . . . . . . Numbers of missing and present values
        (help nmissing if installed) . . . . . . . . . . . . . . . N. J. Cox
        5/99    pp.7--8; STB Reprints Vol 9, pp.26--27
        commands to list the numbers of missing values and nonmissing
        values in each variable in varlist
Here is an example:
. webuse nlswork
(National Longitudinal Survey. Young Women 14-26 years of age in 1968)
. nmissing
age 24
msp 16
nev_mar 16
grade 2
not_smsa 8
c_city 8
south 8
ind_code 341
occ_code 121
union 9296
wks_ue 5704
tenure 433
hours 67
wks_work 703
#4
1
Another option would be misschk from the SPost site. Type findit misschk to install it. Here's an example:
sysuse auto,clear
replace price=. if (_n==1|_n==3) // additional missing values
misschk
When no varlist is specified, misschk checks all variables. Note that in the output below only the 11 numeric variables of auto appear; the string variable make is not listed.
The standard output gives you the number and the percentage of missing values for each variable.
Variables examined for missing values
   #  Variable        # Missing   % Missing
--------------------------------------------
   1  price               2          2.7
   2  mpg                 0          0.0
   3  rep78               5          6.8
   4  headroom            0          0.0
   5  trunk               0          0.0
   6  weight              0          0.0
   7  length              0          0.0
   8  turn                0          0.0
   9  displacement        0          0.0
  10  gear_ratio          0          0.0
  11  foreign             0          0.0
It also counts all the different missing patterns.
   Missing for |
         which |
    variables? |      Freq.     Percent        Cum.
---------------+-----------------------------------
 1_3__ _____ _ |          1        1.35        1.35
 1____ _____ _ |          1        1.35        2.70
 __3__ _____ _ |          4        5.41        8.11
 _____ _____ _ |         68       91.89      100.00
---------------+-----------------------------------
         Total |         74      100.00
Lastly, it summarizes the number of missing values per case.
Missing for |
   how many |
 variables? |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |         68       91.89       91.89
          1 |          5        6.76       98.65
          2 |          1        1.35      100.00
------------+-----------------------------------
      Total |         74      100.00
misschk also has a couple of other neat features, with additional options you can find out about with help misschk.
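The per-case summary can also be approximated with official commands, for anyone who prefers not to install misschk. This is only a sketch of the idea, not how misschk itself works: ds collects the numeric variables, and egen's rowmiss() function counts the missing values in each observation. The variable name nmiss is a hypothetical choice made for this example.

```stata
* Same setup as the misschk example above
sysuse auto, clear
replace price = . if inlist(_n, 1, 3)   // additional missing values

* Collect the numeric variables, then count missing values per observation
quietly ds, has(type numeric)
egen nmiss = rowmiss(`r(varlist)')      // nmiss is a hypothetical name

* Distribution of the number of missing values per observation
tabulate nmiss

* Inspect the observations that have at least one missing value
list make price rep78 if nmiss > 0
```

The tabulate output should agree with the last misschk table: 68 observations with no missing values, 5 with one, and 1 with two.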