This function uses quanteda::kwic() to return a concordance for a search pattern. It takes up to three corpora (the reference corpus is optional) and a search pattern, and returns a data frame in which each hit is labelled for authorship.
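The labelling idea described above can be sketched directly with quanteda. This is only an illustration under stated assumptions, not the package's actual implementation: the toy texts, the helper name label_hits(), and the labels passed to it are hypothetical, and quanteda names the match column keyword where the output on this page shows node.

```r
library(quanteda)

# Hypothetical toy texts standing in for the Q, K and reference corpora.
q_corp   <- corpus(c(q1 = "He wants to be home when he can."))
k_corp   <- corpus(c(k1 = "She wants to go to the office."))
ref_corp <- corpus(c(r1 = "Anyone that wants to help us is welcome."))

# Hypothetical helper: run kwic() on one corpus and tag every hit
# with an authorship label.
label_hits <- function(x, label, pattern, window = 5) {
  hits <- kwic(tokens(x), pattern = phrase(pattern), window = window,
               case_insensitive = TRUE)
  hits <- as.data.frame(hits)
  hits$authorship <- rep(label, nrow(hits))
  hits
}

# Stack the labelled hits from the three corpora into one data frame.
result <- rbind(
  label_hits(q_corp, "Q", "wants to"),
  label_hits(k_corp, "K", "wants to"),
  label_hits(ref_corp, "Reference", "wants to")
)
result[, c("docname", "from", "to", "pre", "keyword", "post", "authorship")]
```

Wrapping a multi-word pattern in phrase() tells kwic() to match it as a sequence of tokens rather than as alternative single-token patterns.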
Usage
concordance(
q.data,
k.data,
reference.data,
search,
token.type = "word",
window = 5,
case_insensitive = TRUE
)
Arguments
- q.data
A quanteda corpus object, such as the output of create_corpus().
- k.data
A quanteda corpus object, such as the output of create_corpus().
- reference.data
A quanteda corpus object, such as the output of create_corpus(). This is optional.
- search
A string. It can be any sequence of characters and it also accepts the use of * as a wildcard.
- token.type
Choice between "word" (default), which searches for word or punctuation mark tokens, or "character", which instead uses a single character search.
- window
The number of context items to be displayed around the keyword (a quanteda::kwic() parameter).
- case_insensitive
Logical; if TRUE, ignore case (a quanteda::kwic() parameter).
Examples
concordance(enron.sample[1], enron.sample[2], enron.sample[3:49], "wants to", token.type = "word")
#>                 docname from  to                 pre     node           post
#> 1 known [Kh Mail_1].txt    5   6             N N N N wants to be N when he V
#> 2 known [Ld Mail_5].txt  160 161          D S D . he wants to   V to V the N
#> 3 known [Lb Mail_1].txt  573 574 our N . anyone that wants to    V us is J .
#>   authorship
#> 1          Q
#> 2  Reference
#> 3  Reference
# using wildcards
concordance(enron.sample[1], enron.sample[2], enron.sample[3:49], "want * to", token.type = "word")
#>                 docname from  to                pre       node
#> 1 known [Kw Mail_2].txt  672 674 let me know if you want me to
#> 2 known [Lc Mail_5].txt  175 177       s N . if you want me to
#> 3 known [Ml Mail_5].txt  242 244  need . you do n't want me to
#>                    post authorship
#> 1       V on other N in  Reference
#> 2        , i can put on  Reference
#> 3 come work for you too  Reference
# searching character sequences with wildcards
concordance(enron.sample[1], enron.sample[2], enron.sample[3:49], "help*", token.type = "character")
#> docname from to pre node post authorship
#> 1 known [Kh Mail_1].txt 703 707 need help V it Q
#> 2 known [Kh Mail_1].txt 2014 2018 want help V it Q
#> 3 known [Kh Mail_3].txt 1797 1801 N , helpe d the K
#> 4 known [Kh Mail_4].txt 52 56 P P helpe d the Reference
#> 5 unknown [Kw Mail_3].txt 2756 2760 ding help in th Reference
#> 6 known [Kw Mail_5].txt 31 35 your help and N Reference
#> 7 known [Kw Mail_5].txt 1463 1467 need help in do Reference
#> 8 known [Lc Mail_2].txt 1600 1604 some help . why Reference
#> 9 known [Lc Mail_5].txt 1163 1167 d of help and B Reference
#> 10 known [Ld Mail_2].txt 285 289 ally help us ou Reference
#> 11 known [Lt Mail_1].txt 884 888 r be helpi ng to Reference
#> 12 known [Lt Mail_1].txt 919 923 , or help V a N Reference
#> 13 known [Lt Mail_3].txt 910 914 your help as a Reference
#> 14 known [Lt Mail_4].txt 1611 1615 ttle help from Reference
#> 15 unknown [Lk Mail_4].txt 1243 1247 N to help V N f Reference
#> 16 unknown [Lk Mail_4].txt 1272 1276 N to help V the Reference
#> 17 known [Lk Mail_1].txt 1512 1516 ease help him w Reference
#> 18 known [Lk Mail_2].txt 387 391 ight help . ple Reference
#> 19 known [Lk Mail_3].txt 994 998 ease help him a Reference
#> 20 unknown [Lb Mail_3].txt 2279 2283 N to help our N Reference
#> 21 unknown [Lb Mail_3].txt 2405 2409 and help the N Reference
#> 22 unknown [Lb Mail_3].txt 2479 2483 g to help out w Reference
#> 23 unknown [Lb Mail_3].txt 2617 2621 and help them Reference
#> 24 known [Lb Mail_1].txt 1363 1367 d to help you i Reference
#> 25 known [Lb Mail_2].txt 1652 1656 and help in V Reference
#> 26 known [Lb Mail_2].txt 1676 1680 and helpi ng ea Reference
#> 27 known [Lb Mail_4].txt 1038 1042 e to help P N a Reference
#> 28 known [Lb Mail_5].txt 1066 1070 this helps V ou Reference
#> 29 known [La Mail_2].txt 2086 2090 ould help V the Reference
#> 30 known [La Mail_2].txt 2494 2498 ould help get t Reference
#> 31 known [La Mail_4].txt 1908 1912 also help V N . Reference
#> 32 known [La Mail_5].txt 2424 2428 will help the N Reference
#> 33 known [Mf Mail_2].txt 805 809 your help . thi Reference
#> 34 known [Mf Mail_2].txt 2097 2101 any help you c Reference
#> 35 known [Mf Mail_2].txt 2458 2462 can help with Reference
#> 36 known [Ml Mail_1].txt 596 600 you help and V Reference
#> 37 known [Ml Mail_1].txt 1223 1227 your help and l Reference
#> 38 known [Ml Mail_1].txt 2492 2496 and help save Reference
#> 39 known [Ml Mail_1].txt 2598 2602 your help and V Reference
#> 40 known [Ml Mail_2].txt 12 16 your help and V Reference
#> 41 known [Ml Mail_4].txt 622 626 your help in V Reference
#> 42 known [Ml Mail_4].txt 1296 1300 ou N helpe d us Reference
#> 43 known [Ml Mail_4].txt 1589 1593 your help so th Reference
#> 44 known [Ml Mail_4].txt 1962 1966 your help and V Reference
#> 45 known [Ml Mail_5].txt 475 479 an B help you V Reference
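Because the result is an ordinary data frame with an authorship column, it can be summarised with base R. The sketch below is a minimal illustration: the hits data frame is a hand-built stand-in that copies three columns from the first example on this page, not a value returned by running the package.

```r
# Hand-built stand-in copying the three hits from the first example
# (only the columns needed for the summary are reproduced).
hits <- data.frame(
  docname = c("known [Kh Mail_1].txt",
              "known [Ld Mail_5].txt",
              "known [Lb Mail_1].txt"),
  node = rep("wants to", 3),
  authorship = c("Q", "Reference", "Reference")
)

# Number of hits per authorship label.
counts <- table(hits$authorship)

# Keep only the hits labelled Q.
q_hits <- hits[hits$authorship == "Q", ]
```

The same two steps apply unchanged to the data frame returned by concordance(), since they only rely on the authorship column.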