There are 5 columns (first name, email, userid, app name), and I want to randomly select 10% of the rows and eventually export them to a CSV while keeping the column headers I listed above. Thanks a million.
4 Solutions
#1
9
I don't know how random you want this to be, but adding a column containing =RANDBETWEEN(1,85038), copied down to suit, then sorting on that column and selecting the first 8,504 rows should give quite an 'arbitrary' result.
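For anyone doing this outside Excel, the same helper-column idea can be sketched in Python: attach a random key to each row, sort on it, and keep the first 10%. This is only an illustration under assumptions: the function names `sample_by_random_key` and `to_csv_text` and the stand-in data are made up for the example, and it uses random floats rather than RANDBETWEEN integers.

```python
import csv
import io
import random


def sample_by_random_key(rows, fraction, rng=None):
    """Return round(fraction * len(rows)) rows chosen by sorting on a random key."""
    rng = rng or random.Random()
    # Pair every row index with a random float, the analogue of the helper column.
    keyed = [(rng.random(), index) for index in range(len(rows))]
    keyed.sort()
    keep = round(len(rows) * fraction)
    return [rows[index] for _, index in keyed[:keep]]


def to_csv_text(header, rows):
    """Serialise the header line plus the given rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


header = ["first name", "email", "userid", "app name"]
# Hypothetical stand-in data; real rows would come from the worksheet.
rows = [[f"user{i}", f"user{i}@example.com", str(i), "demo"] for i in range(1, 201)]

sample = sample_by_random_key(rows, 0.10, random.Random(0))
csv_text = to_csv_text(header, sample)
```

With 85,038 rows, `round(85038 * 0.10)` gives 8,504, the same count as above. Writing `csv_text` to a file would produce the CSV with the header row on top.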
#2
0
Are you familiar with SQL and the Microsoft Query functionality in Excel (Data ->...-> From Microsoft Query)?
If so, use this:
( SELECT "first name, email, userid, app name" )
UNION
( SELECT TOP 8503 t.[first name] & "," & t.[email] & "," & t.[userid] & "," & t.[app name]
FROM [Sheet1$] AS t ORDER BY RND() )
Then copy and paste the result into an empty text file and save it as a CSV.
You can also use my SQL add-in for this: http://blog.tkacprow.pl/?page_id=130
EDIT 1: I assumed that "Sheet1" is the name of your worksheet.
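The "order by a random value, keep the top N" pattern can also be tried outside Excel. The sketch below swaps in SQLite (via Python's `sqlite3` module) for Microsoft Query, so the syntax differs: `RANDOM()` and `LIMIT` instead of `RND()` and `TOP`. The table name `sheet1` and the sample rows are assumptions made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE sheet1 ("first name" TEXT, email TEXT, userid INTEGER, "app name" TEXT)'
)
# Hypothetical stand-in rows; real data would be imported from the worksheet.
conn.executemany(
    "INSERT INTO sheet1 VALUES (?, ?, ?, ?)",
    [(f"user{i}", f"user{i}@example.com", i, "demo") for i in range(1, 101)],
)

total = conn.execute("SELECT COUNT(*) FROM sheet1").fetchone()[0]
limit = total // 10  # 10% of the rows, rounded down

# Shuffle the rows with ORDER BY RANDOM() and keep only the first `limit`.
cursor = conn.execute(
    'SELECT "first name", email, userid, "app name" '
    "FROM sheet1 ORDER BY RANDOM() LIMIT ?",
    (limit,),
)
header = [column[0] for column in cursor.description]  # column names reported by the cursor
picked = cursor.fetchall()
conn.close()
```

Here `picked` holds the sampled rows as tuples, and `header` can be written as the first line of the CSV ahead of them.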
#3
0
Here's a possible solution using an array formula.
Suppose you have data in column A (in this example I used only 100 values).
Now in C2, type the following formula: (Credits to Oscar.)
=IF(ROW(A1)<=0.1*COUNTA($A$2:$A$101),INDEX($A$2:$A$101, LARGE(MATCH(ROW($A$2:$A$101), ROW($A$2:$A$101))*NOT(COUNTIF($C$1:C1, $A$2:$A$101)), RANDBETWEEN(1,ROWS($A$2:$A$101)-ROW(A1)+1))),"")
Use Ctrl+Shift+Enter to get the formula to work.
Using just Enter will return #N/A.
Then to get the rest of the values, just drag the formula down.
In this example, I just auto-fill up to C20.
Note: RANDBETWEEN is volatile, so recalculation happens every time you change something. If you are returning 8k rows, that is a lot of recalculation, and it may take a while.
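As I read the formula, each output cell picks a random position among the values not yet chosen, so no value is returned twice. A minimal Python sketch of that draw-without-replacement idea follows; it is not a translation of the formula, and the name `draw_without_replacement` and the sample values are assumptions for the illustration.

```python
import random


def draw_without_replacement(values, fraction, rng=None):
    """Pick int(fraction * len(values)) distinct items, one random draw at a time."""
    rng = rng or random.Random()
    remaining = list(values)  # copy, so the caller's list is left untouched
    target = int(len(remaining) * fraction)
    chosen = []
    for _ in range(target):
        # Choose a random position among the items that are still available.
        position = rng.randrange(len(remaining))
        chosen.append(remaining.pop(position))
    return chosen


# Hypothetical stand-in for the 100 values in A2:A101.
values = [f"item{i}" for i in range(1, 101)]
picked = draw_without_replacement(values, 0.10, random.Random(1))
```

In practice `random.sample(values, 10)` does the same job in one call; the loop is spelled out only to show the mechanism. Unlike the volatile worksheet formula, the result stays fixed once computed.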
#4
0
I personally used a handy plugin (or, let's say, add-on) for Microsoft Excel 2016 / 64 bit. It is called Kutools. You can freely download and use it via this link:
Download Link (for both 32 Bit & 64 Bit) Kutools Website
After downloading and installing it, you can select a random set of rows from the Kutools tab -> Range -> Sort Range Randomly -> Select; then enter the number of rows you need to select, and that's it.
[Figure: Kutools tab] [Figure: Select tab]