I have an Excel sheet with 85,038 rows. How do I randomly select 10% of them?

Time: 2023-01-19 09:17:00

There are 5 columns (first name, email, userid, app name), and I want to randomly select 10% of these rows and eventually export them to a CSV while keeping the column headers I listed above. Thanks a million.

4 solutions

#1


9  

I don't know how random you need this to be, but you could add a column containing =RANDBETWEEN(1,85038), copy it down to the last row, sort by that column, and select the first 8,504 rows. That should give a fairly 'arbitrary' result.
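
For anyone who would rather do this outside Excel, the same helper-column idea can be sketched in Python. This is only a sketch under assumptions: the sheet has already been saved as a CSV, and the function names and file paths are made up for illustration. It uses a continuous random key instead of RANDBETWEEN, so ties between rows are very unlikely.

```python
import csv
import random


def sample_rows_by_random_key(rows, fraction=0.10, seed=None):
    """Attach a random key to each row, sort by it, keep the first `fraction`."""
    rng = random.Random(seed)
    keyed = [(rng.random(), row) for row in rows]  # the "helper column"
    keyed.sort(key=lambda pair: pair[0])           # sort by the helper column
    keep = round(len(rows) * fraction)             # 85038 rows -> 8504
    return [row for _, row in keyed[:keep]]


def sample_csv(src_path, dst_path, fraction=0.10, seed=None):
    """Copy the header plus a random `fraction` of the data rows to dst_path."""
    with open(src_path, newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = next(reader)   # first line holds the column headers
        rows = list(reader)
    picked = sample_rows_by_random_key(rows, fraction, seed)
    with open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        writer.writerow(header)  # keep the original headers
        writer.writerows(picked)
    return len(picked)
```

A call such as `sample_csv("users.csv", "sample.csv")` (both file names hypothetical) would write the header row followed by roughly 10% of the data rows.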

#2


0  

Are you familiar with SQL and the Microsoft Query functionality in Excel (Data ->...-> From Microsoft Query)?

If so, use this:

( SELECT "first name, email, userid, app name" )
UNION   
( SELECT TOP 8503 t.[first name] & "," & t.[email] & "," &  t.[userid] & "," & t.[app name] 
FROM [Sheet1$] AS t ORDER BY RND() )

Then copy and paste the result into an empty text file and save it as CSV.

You can also use my SQL add-in for this: http://blog.tkacprow.pl/?page_id=130

EDIT 1: I assumed that "Sheet1" is the name of your worksheet
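
The "order by a random value, then take the top N" pattern can also be tried outside Excel. The sketch below uses Python's built-in sqlite3 module, so the dialect differs from the query above (SQLite spells it `ORDER BY RANDOM() LIMIT n` rather than `TOP n ... ORDER BY RND()`). The table name, column names and helper function are assumptions for illustration, not part of the Microsoft Query approach.

```python
import sqlite3


def random_sample_sql(conn, table, n):
    """Return (header, rows) for n randomly chosen rows of `table`.

    `table` is interpolated into the SQL text, so pass trusted names only.
    """
    cur = conn.execute(
        f'SELECT * FROM "{table}" ORDER BY RANDOM() LIMIT ?', (n,)
    )
    header = [col[0] for col in cur.description]  # column names from the cursor
    return header, cur.fetchall()


# Small in-memory demo table standing in for the worksheet (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (first_name TEXT, email TEXT, userid INTEGER, app_name TEXT)"
)
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?, ?)",
    [(f"name{i}", f"user{i}@example.com", i, "demo") for i in range(100)],
)
header, sample = random_sample_sql(conn, "users", 10)  # 10% of 100 rows
```

Here the header comes back separately from the rows, so it can be written as the first CSV line without the UNION trick.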

#3


0  

Here's a possible solution using an array formula.
Suppose your data is in column A (in this example I used only 100 values).

Now in C2, type the following formula (credit to Oscar):

=IF(ROW(A1)<=0.1*COUNTA($A$2:$A$101),INDEX($A$2:$A$101, LARGE(MATCH(ROW($A$2:$A$101), ROW($A$2:$A$101))*NOT(COUNTIF($C$1:C1, $A$2:$A$101)), RANDBETWEEN(1,ROWS($A$2:$A$101)-ROW(A1)+1))),"")

Press Ctrl+Shift+Enter to enter it as an array formula; pressing Enter alone returns #N/A.
To get the rest of the values, drag the formula down.
In this example, I auto-filled down to C20.

Note: RANDBETWEEN is volatile, so the formula recalculates every time you change something. If you need to return about 8,500 values, that is a lot of recalculation and may take a while.
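
If the recalculation cost is a concern, the same idea (draw 10% of the values with no repeats) can be computed once outside the sheet. The following is a minimal Python sketch, not a translation of the formula: the function name is hypothetical, and it samples by position, whereas the formula excludes already-picked items by value.

```python
import random


def pick_ten_percent(values, seed=None):
    """Return a tenth of `values` (rounded down), chosen without replacement."""
    rng = random.Random(seed)
    k = len(values) // 10          # e.g. 100 values -> 10 picks
    return rng.sample(values, k)   # each position is picked at most once


data = list(range(1, 101))         # stand-in for the 100 cells in A2:A101
picked = pick_ten_percent(data)
```

Because the result is an ordinary list, it stays fixed until the function is called again, unlike a volatile worksheet function.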

#4


0  

I personally used a handy add-in for Microsoft Excel 2016 (64-bit) called Kutools. You can download it and use it freely via this link:

Download link (for both 32-bit and 64-bit): Kutools website

After downloading and installing it, you can select a random set of rows from the Kutools tab -> Range -> Sort Range Randomly -> Select. Then enter the number of rows you need, and that's it.

(Figures: Kutools tab; Select tab)
