I'm stumped on a problem with some data that is imported for me. To be clear, I have no control over how the data comes in.
I have two columns and 107,000 rows.
Column A has an ID#, and Column B has the corresponding date.
The issue is that Column A can contain the same ID# multiple times, and the corresponding dates in Column B may be different or the same.
I'd like to add a Column C that takes the ID# in Column A, checks it against the rest of Column A to find all matches, and returns the max (most recent) date from Column B for that ID#.
2 solutions
#1
0
Please try:
=MAX(IF(A:A=A1,B:B))
entered as an array formula with Ctrl+Shift+Enter and copied down as far as needed.
I'm afraid this could be quite slow.
I did not limit the range because I assumed 107,000 rows was an approximation. However, this is slow even for 1,000 rows, so for emphasis I repeat part of @XOR LX's comment:
Even reducing the number of rows being referenced by a factor of 10 will have a significant improvement on calculation speed.
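For anyone who wants to see the logic outside of Excel, here is a minimal Python sketch of what the formula returns for each row. The function name and the sample data are made up for illustration, and this is not how Excel evaluates the formula internally. It builds the per-ID maximum in a single pass, whereas the array formula rescans the whole referenced range for every row it is copied to, which is likely why it slows down so much as the row count grows.

```python
from datetime import date


def latest_date_per_row(ids, dates):
    """For each row, return the most recent date recorded for that row's ID."""
    latest = {}
    # One pass: remember the newest date seen so far for each ID.
    for id_, d in zip(ids, dates):
        if id_ not in latest or d > latest[id_]:
            latest[id_] = d
    # Second pass: emit the per-ID maximum next to every original row (Column C).
    return [latest[id_] for id_ in ids]


# Hypothetical sample data standing in for Columns A and B.
ids = ["A1", "B2", "A1", "C3", "B2"]
dates = [
    date(2014, 1, 5),
    date(2014, 3, 1),
    date(2014, 2, 10),
    date(2014, 1, 1),
    date(2014, 3, 1),
]

column_c = latest_date_per_row(ids, dates)
```

Both rows for A1 end up with 2014-02-10, which is the same value the MAX(IF(...)) formula would show in Column C for those rows.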
#2
1
Another possible solution:
Sort Columns A and B by Column B, newest to oldest
Copy Column A (the ID#) to Column D
Remove duplicates from Column D
Use VLOOKUP in Column E: in E1 put =VLOOKUP(D1,A:B,2,FALSE) and copy down
Columns D and E will now hold the unique ID numbers and the newest date for each.
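The reason these steps work is that an exact-match VLOOKUP returns the first matching row it finds, and after sorting newest to oldest the first match for any ID is its most recent date. A rough Python sketch of the same idea follows; the sample rows and the helper name first_match are invented for illustration only.

```python
from datetime import date

# Hypothetical sample data standing in for Columns A and B.
rows = [
    ("A1", date(2014, 1, 5)),
    ("B2", date(2014, 3, 1)),
    ("A1", date(2014, 2, 10)),
    ("C3", date(2014, 1, 1)),
    ("B2", date(2014, 3, 1)),
]

# Step 1: sort by date, newest to oldest.
rows_sorted = sorted(rows, key=lambda r: r[1], reverse=True)

# Steps 2 and 3: copy the IDs and drop duplicates, keeping first occurrences.
unique_ids = list(dict.fromkeys(id_ for id_, _ in rows_sorted))


def first_match(key, table):
    """Mimic an exact-match lookup: return the date of the first row whose ID equals key."""
    for id_, d in table:
        if id_ == key:
            return d
    return None


# Step 4: look up each unique ID; the first hit is its newest date.
result = [(id_, first_match(id_, rows_sorted)) for id_ in unique_ids]
```

Note that this produces one row per ID (like Columns D and E), not a value next to every original row.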