본문 바로가기

알고리즘

[C] 백준 3978번 : Hacking

문제

 

A coach of one of the soccer world finals teams (lets call him Hugo Hacker) wants to find out secret information about an opposing team before the game. The coach of the opposing team has a website with public information about his team. Hugo suspects that also secret information is stored on the computer which hosts the website.

The website contains a form which allows to search for key words and returns a chunk of a text file which contains the key word. Hugo has found out that by entering words which cannot be found in the documents publicly available, he can exploit a bug in the search and get access to other files on the computer. He already knows the publicly available documents. However the search box has a restriction on the maximum length of a word and the characters which can be entered. Can you tell him a word which can be entered in the search box and which does not occur as a substring in the documents?

 

입력

 

The first line of the input consists of the number of test cases which are to follow. Each test case consists of two lines: in the first line there are three integers n (1 ≤ n ≤ 10 000), m (1 ≤ m ≤ 100) and k (1 ≤ k ≤ 26), where n is the length of the publicly available documents, m is the maximum allowed length of words which can be entered in the search box, and k specifies that the search box allows only the first k characters of the alphabet. The second line of each test case describes the publicly available documents and consists of n lower-case letters. 

 

출력

 

For each test case in the input, print one line in the output containing a word which does not occur as a substring in the given text. The word should have at most m lower-case characters from the first k letters in the alphabet. You may assume that for each given test case, there is always at least one such word (you may print any such word).

 

풀이

 

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
#include <stdio.h>
#include <math.h>
#define HASHKEY 1234567891
 
int arr[105];
char documents[10000];
int HashSave[10000];
int n,m,k,i,j;
 
int main(){
    int T;
    scanf("%d",&T);
    while(T--){
        scanf("%d %d %d",&n, &m, &k);
        scanf("%s",documents);
        bool saveHash = true;
        int saveHashIdx=0;
        for(i=0;i<m;i++) arr[i]=0;
        if(k==1){
            for(i=0;i<m;i++printf("a");
            printf("\n");
        }else{
            int tmp = log(n)/log(k);
            if(tmp+1<m) m = tmp+1;
            while(1){
                long long int parentHash=0,patternHash=0,mul=1;
                bool finded = false,isEnd = false,skip=false;
                if(saveHash){
                    for(i=0;i<=n-m;i++){
                        if(i==0){
                            for(j=0;j<m;j++){
                                parentHash += (documents[m-1-j]-'a')*mul;
                                parentHash %= HASHKEY;
                                if(j<m-1){
                                    mul *= k;
                                    mul %= HASHKEY;
                                }
                            }
                            HashSave[saveHashIdx++= parentHash;
                        }else{
                            parentHash = k*(parentHash - ((documents[i-1]-'a')*mul)) + documents[m-1+i]-'a';
                            patternHash %= HASHKEY;
                            HashSave[saveHashIdx++= parentHash;
                        }
                    }
                    saveHash = false;
                }
                mul = 1;
                for(j = 0 ; j < m ; j++){
                    patternHash += arr[j]*mul;
                    patternHash %= HASHKEY;
                    if(j<m-1){
                        mul *= k;
                        mul %= HASHKEY;
                    }
                }
                for(i=0;i<=n-m&&!skip;i++){
                    if(HashSave[i] == patternHash){
                        finded = false;
                        for(j = 0 ; j < m ; j++){
                            if(documents[i+j]-'a'!=arr[m-j-1]) break;
                        }
                        if(j==m)  finded = true;
                    }
 
                    if(finded) break;
                    if(i==n-m){
                        for(j=m-1;j>=0;j--printf("%c",arr[j]+'a');
                        printf("\n");
                        isEnd = true;
                    }
                    saveHash = false;
                }
                if(isEnd) break;
 
                int idx=0;
                while(1){
                    arr[idx]++;
                    if(arr[idx]>=k) arr[idx++]=0;
                    else break;
                    if(idx==m) break;
                }
                if(idx==m){
                    m++;
                    for(i=0;i<m;i++) arr[i]=0;
                }
            }
        }
    }
}
 
 
cs

 

GCPC 기출문제!

해시 알고리즘 중 라빈카프를 사용해보았다

라빈 카프 알고리즘도 처음이라 어려웠지만 이 문제를 해결하는데 가장 어려웠던 것은 예외처리였다ㅠㅠ

k가 1인 경우 a를 m만큼 출력해주지 않으면 시간초과가 난다

또한 정답 문자열을 1부터 증가시켜나가도 괜찮지만, GCPC 해설 사이트에서는 m의 초기값을 log(n)/log(k)+1로 시작하는 것을 알 수 있었는데 이부분이 아직 이해가 되지 않는다...

수학 공부가 더 필요한가보다ㅠㅠ

 

이렇게 조건을 만들어 준 다음에는 라빈 카프 알고리즘을 사용해 매번 해시값을 구하지 않으며 기존의 해시값을 이용해 새로운 해시값을 얻어 비교를 쉽게 해 줄 수 있었다

 

반복문과 break 조건 또한 너무 지저분하여서 더 깔끔한 코드를 찾아보고싶다..