2017 ACM-ICPC Asia Regional (Nanning Site) Online Contest: Frequent Subsets Problem

Time limit: 1000ms
Memory limit: 131072K

The frequent subset problem is defined as follows. Suppose $U=\{1, 2, \ldots, N\}$ is the universe, and $S_{1}, S_{2}, \ldots, S_{M}$ are $M$ sets over $U$. Given a positive constant $\alpha$, $0<\alpha \leq 1$, a subset $B$ ($\neq \emptyset$) is $\alpha$-frequent if it is contained in at least $\alpha M$ sets of $S_{1}, S_{2}, \ldots, S_{M}$, i.e. $\left| \{ i : B\subseteq S_{i} \} \right| \geq \alpha M$. The frequent subset problem is to find all the subsets that are $\alpha$-frequent. For example, let $U=\{1,2,3,4,5\}$, $M = 3$, $\alpha = 0.5$, and $S_{1}=\{1,5\}$, $S_{2}=\{1,2,5\}$, $S_{3}=\{1,3,4\}$. Then there are $3$ $\alpha$-frequent subsets of $U$, which are $\{1\}$, $\{5\}$ and $\{1,5\}$.
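The definition can be checked directly by brute force. The following is a minimal sketch, not the reference solution: the function names `countFrequentSubsets` and `statementExample` are made up for illustration, each set is assumed to be stored as a bitmask (bit `x-1` set when the set contains `x`), and a small epsilon is assumed to be enough to absorb rounding in $\alpha M$.

```cpp
#include <cassert>
#include <vector>

// Counts the non-empty subsets B of {1..n} that are contained in at least
// alpha * M of the given sets, where M = sets.size().
// Each set is a bitmask: bit (x-1) is set when the set contains element x.
int countFrequentSubsets(int n, double alpha, const std::vector<int>& sets) {
    assert(n >= 1 && n <= 20);
    const double need = alpha * sets.size() - 1e-9;  // epsilon guards against rounding
    int count = 0;
    for (int b = 1; b < (1 << n); ++b) {             // every non-empty subset B
        int containing = 0;
        for (int s : sets)
            if ((s & b) == b) ++containing;          // B is a subset of this set
        if (containing >= need) ++count;
    }
    return count;
}

// The example from the statement: S1={1,5}, S2={1,2,5}, S3={1,3,4}, alpha=0.5.
int statementExample() {
    std::vector<int> sets = {
        (1 << 0) | (1 << 4),
        (1 << 0) | (1 << 1) | (1 << 4),
        (1 << 0) | (1 << 2) | (1 << 3)};
    return countFrequentSubsets(5, 0.5, sets);
}
```

For the statement's example, `statementExample()` counts the subsets $\{1\}$, $\{5\}$ and $\{1,5\}$.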

Input Format

The first line contains two numbers $N$ and $\alpha$, where $N$ is a positive integer, and $\alpha$ is a floating-point number between 0 and 1. Each of the subsequent lines contains a set which consists of a sequence of positive integers separated by blanks, i.e., line $i + 1$ contains $S_{i}$, $1 \le i \le M$. Your program should be able to handle $N$ up to $20$ and $M$ up to $50$.
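Because $M$ is not given explicitly, the input has to be read line by line until end of file. One possible way to do this is sketched below; the `Input` struct and `parseInput` function are hypothetical names, and the sketch assumes that blank lines should be ignored (that is, that no set is empty).

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Parsed form of the whole input (hypothetical helper type).
struct Input {
    int n = 0;
    double alpha = 0.0;
    std::vector<std::vector<int>> sets;  // one inner vector per input line
};

Input parseInput(std::istream& in) {
    Input r;
    in >> r.n >> r.alpha;
    std::string line;
    std::getline(in, line);              // drop the rest of the first line
    while (std::getline(in, line)) {     // one set per remaining line
        std::istringstream ls(line);
        std::vector<int> s;
        int x;
        while (ls >> x) {                // blanks and a trailing '\r' are skipped
            assert(x >= 1 && x <= r.n);
            s.push_back(x);
        }
        if (!s.empty()) r.sets.push_back(s);  // assumption: blank lines are not sets
    }
    return r;
}
```

In a real submission, `parseInput(std::cin)` would be called once, and `sets.size()` then gives $M$.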

Output Format

The number of $\alpha$ -frequent subsets.

Sample Input

15 0.4
1 8 14 4 13 2
3 7 11 6
10 8 4 2
9 3 12 7 15 2
8 3 2 4 5

Sample Output

11

Source

2017 ACM-ICPC Asia Regional (Nanning Site) Online Contest

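Problem summary: you are given n, followed by at most 50 groups of data (the "big sets"), one group per line. The number of integers in each group is not known in advance; the integers within a group are distinct and none of them exceeds n.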

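The task is to count the distinct non-empty subsets whose frequency among these big sets (the fraction of big sets that contain the subset) is at least a.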
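Since the number of big sets containing a subset is an integer, the frequency test can be turned into a comparison against an integer threshold. A small sketch of that conversion follows; `minimumCount` is a hypothetical helper, and the epsilon of `1e-9` is an assumption about how much rounding error a times M can carry.

```cpp
#include <cassert>
#include <cmath>

// Smallest integer count c such that c >= alpha * m,
// tolerant of floating-point error in alpha * m.
int minimumCount(double alpha, int m) {
    assert(alpha > 0.0 && alpha <= 1.0 && m >= 0);
    return static_cast<int>(std::ceil(alpha * m - 1e-9));
}
```

With the sample input (a = 0.4 and 5 big sets) this gives a threshold of 2, so a subset counts when at least 2 big sets contain it.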

Approach: the data is small. n is at most 20 and there are at most 50 groups of data.
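A rough operation count suggests why even plain enumeration fits in the time limit. This is only an estimate, assuming one constant-time containment check per (subset, big set) pair:

$$2^{N} \cdot M \le 2^{20} \cdot 50 = 52{,}428{,}800 \approx 5.2 \times 10^{7}$$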

Membership table for the sample input (rows: elements 1..n; columns: groups):

Group no. :  1 2 3 4 5 ... (at most 50 groups)

        1 :  1 0 0 0 0
        2 :  1 0 1 1 1
        3 :  0 1 0 1 1
        4 :  1 0 1 0 1
        5 :  0 0 0 0 1
        6 :  0 1 0 0 0
        7 :  0 1 0 1 0
        8 :  1 0 1 0 1
        9 :  0 0 0 1 0
       10 :  0 0 1 0 0
       11 :  0 1 0 0 0
       12 :  0 0 0 1 0
       13 :  1 0 0 0 0
       14 :  1 0 0 0 0
       15 :  0 0 0 1 0

        n :  ..............

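Compress each row into one LL (long long) number by reading the row horizontally as a binary number. This gives n numbers, one per element.

Then enumerate every possible subset. AND (&) together the LL numbers of all elements in the subset; the number of 1 bits in the binary form of the result is the number of big sets that contain this subset.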

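Example 1: how many big sets contain the subset {2,4,8}?

(1 0 1 1 1) & (1 0 1 0 1) & (1 0 1 0 1) = (1 0 1 0 1), so big sets 1, 3 and 5 contain the subset {2,4,8}.

Example 2: how many big sets contain the subset {3,7}?

(0 1 0 1 1) & (0 1 0 1 0) = (0 1 0 1 0), so big sets 2 and 4 contain the subset {3,7}.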
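The two examples can be reproduced with a short sketch of the row-compression idea. The helper names (`buildElementMasks`, `countOnes`, `containingSets`, `sampleSets`) are made up for illustration. Note one assumption that differs from the table: here big set i (0-based) is stored in bit i, so the first group is the lowest bit rather than the leftmost one. The counts are unaffected by the bit order.

```cpp
#include <cassert>
#include <vector>

// d[j] has bit i set when big set i (0-based) contains element j.
std::vector<long long> buildElementMasks(int n, const std::vector<std::vector<int>>& sets) {
    assert(sets.size() <= 50);
    std::vector<long long> d(n + 1, 0);
    for (std::size_t i = 0; i < sets.size(); ++i)
        for (int x : sets[i]) d[x] |= 1LL << i;
    return d;
}

// Number of 1 bits in x.
int countOnes(long long x) {
    int c = 0;
    while (x) { x &= x - 1; ++c; }  // clears the lowest set bit each round
    return c;
}

// Number of big sets (out of m) that contain every element of subset.
int containingSets(const std::vector<long long>& d, const std::vector<int>& subset, int m) {
    long long w = (1LL << m) - 1;   // start with all m big sets as candidates
    for (int x : subset) w &= d[x];
    return countOnes(w);
}

// The five big sets of the sample input.
std::vector<std::vector<int>> sampleSets() {
    return {{1, 8, 14, 4, 13, 2}, {3, 7, 11, 6}, {10, 8, 4, 2},
            {9, 3, 12, 7, 15, 2}, {8, 3, 2, 4, 5}};
}
```

Calling `containingSets(buildElementMasks(15, sampleSets()), {2, 4, 8}, 5)` counts the three big sets of Example 1, and the same call with `{3, 7}` counts the two big sets of Example 2.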

Code:

#include<stdio.h>
#include<math.h>
#include<string.h>
#include<algorithm>
#define LL long long
using namespace std;
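int a[55][25];   // a[i][j]==1 when big set i contains element j
LL d[25];        // d[j]: bitmask of the big sets that contain element j
char s[100];     // one input line
int ans=0;       // number of subsets (including the empty one) that pass the test
double ci;       // threshold a*M, lowered by a small epsilon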
int LL1(LL x)
{
    int sum=0;
    while(x)
    {
        if(x&1) sum++;
        x>>=1;
    }
    return sum;
}
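// x: next element to decide; w: bitmask of the big sets that contain every chosen element
void dfs(int x,int n,LL w)
{
    if(LL1(w)<ci) return ; // prune: choosing more elements can only clear bits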
    if(x>n)
    {
        ans++;
        return ;
    }
    dfs(x+1,n,w&d[x]);
    dfs(x+1,n,w);
    return ;
}
int main()
{
    memset(a,0,sizeof(a));
    memset(d,0,sizeof(d));
    int n,k=0;
    double m;
    scanf("%d%lf",&n,&m);
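    fgets(s,sizeof(s),stdin); // skip the rest of the first line
    while(k<50&&fgets(s,sizeof(s),stdin)) // gets() was removed from the standard, so use fgets
    {
        int sum=0,cnt=0;
        for(int i=0; s[i]; i++)
        {
            if(s[i]>='0'&&s[i]<='9')
            {
                sum=sum*10+s[i]-'0';
                continue;
            }
            // any non-digit (blank, '\r', '\n') ends the current number
            if(sum>=1&&sum<=n) a[k][sum]=1,cnt++;
            sum=0;
        }
        if(sum>=1&&sum<=n) a[k][sum]=1,cnt++; // last number when the line has no '\n'
        if(cnt) k++; // skip blank lines (assumes no set is empty)
    }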
    ci=k*m-0.0000001;
    for(int j=1; j<=n; j++)
        for(int i=0; i<k; i++)
            d[j]=(d[j]<<1)+a[i][j];
    dfs(1,n,(1LL<<50)-1);// (1LL<<50)-1 is 50 one-bits in binary (initial value: every big set is a candidate)
    printf("%d\n",ans-1);// subtract 1 for the empty subset
}

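Alternative code:

This one is roughly a hashing idea: each group of data is hashed into a single number (a bitmask of its elements). The approach is similar.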

#include<iostream>
#include<algorithm>
#include<cstring>
#include<cmath>
#include<queue>
#include<cstdio>
#define ll long long
#define lz 2*u,l,mid
#define rz 2*u+1,mid+1,r
#define mset(a,x) memset(a,x,sizeof(a))

using namespace std;
const double PI=acos(-1);
const int inf=0x3f3f3f3f;
const double esp=1e-12;
const int maxn=400005;
const int mod=1e9+7;
int dir[4][2]={0,1,1,0,0,-1,-1,0};
ll gcd(ll a,ll b){return b?gcd(b,a%b):a;}
ll lcm(ll a,ll b){return a/gcd(a,b)*b;}
ll inv(ll b){if(b==1)return 1; return (mod-mod/b)*inv(mod%b)%mod;}
ll fpow(ll n,ll k){ll r=1;for(;k;k>>=1){if(k&1)r=r*n%mod;n=n*n%mod;}return r;}
int a[101];

int main()
{
    int n,x,i;
    double k;
    cin>>n>>k;
    n=(1<<n);
    mset(a,0);
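    int top=0;            // number of sets read so far; sets are stored in a[1..top]
    bool fresh=true;      // true when the next number starts a new line
    while(scanf("%d",&x)==1)
    {
        if(fresh){top++;fresh=false;}
        a[top]|=(1<<(x-1));
        int ch=getchar();
        while(ch==' '||ch=='\t'||ch=='\r') ch=getchar(); // skip trailing blanks
        if(ch=='\n') fresh=true;     // a trailing newline no longer adds an empty set
        else if(ch!=EOF) ungetc(ch,stdin);
    }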
    int ans=0;
    for(i=1;i<n;i++)
    {
        int c=0;
        for(int j=1;j<=top;j++)
        {
            if((a[j]&i)==i)
                c++;
        }
        if(1.0*c/top>=k-esp)
        ans++;
    }
    cout<<ans<<endl;
    return 0;
}
