Powered by <TEI:TOK>
Maarten Janssen, 2014-

Corpus Search

Special characters: ~u = ũ, ~e = ẽ, ~i = ĩ, ~y = ỹ

Search - Help

Type a search query in the CQP (Corpus Query Processor) query language, referred to on this page as CQL.
Which fields can be searched depends on the corpus; the examples below use word, form, nform, lemma, pos, id and ltags.

The CQP query syntax is an intuitive system in which you define the properties of each word you are looking for, for instance:

[nform="casa"] [pos="A.*"] (all graphemic variants of the orthographic word casa followed by an ADJECTIVE)

[pos="N.*"] [lemma="ter"] [pos="V.P.*"] (all possible NOUN words, followed by forms in the paradigm of the VERB ter and by the PAST PARTICIPLE of any verb)

[word=".*au.*"] (orthographic words that contain the sequence "au")

[nform="Porém" %ci] (the orthographic word Porém, either with a capital initial or not)

[nform="que" & pos="CS"] (same token, different attributes: all orthographic words que, but only if they have the complementizer value)

@[lemma="o"] [pos="V.*"] (the lemma o, i.e. o, a, os, as, followed by any VERB form)

[pos!="DA.*"] [lemma="meu"|lemma="teu"|lemma="seu"]  (anything but an article before Portuguese possessives)

[pos="PP...D.*" & id="w.*"] [pos!="V.*"] (dative clitics in interpolation structures)

[pos="PP...A.*" & id="w.*"] [pos!="V.*"] (accusative clitics in interpolation structures)

[pos="PP.C.0.*" & id="w.*"] [pos!="V.*"] (dative/accusative clitics in interpolation structures)
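To make the logic of these patterns concrete, here is a minimal Python sketch of how a sequence of bracketed conditions can be read: each bracket constrains one token, each attribute value is a regular expression that must match the whole attribute, and consecutive brackets match consecutive tokens. The toy tokens, their tag values and the helper names are invented for illustration; this is not how the search engine itself is implemented.

```python
import re

# Hypothetical toy tokens; forms and tags are invented for illustration only.
tokens = [
    {"word": "caza", "nform": "casa", "pos": "NCFS"},
    {"word": "grande", "nform": "grande", "pos": "AQ0CS"},
    {"word": "causa", "nform": "causa", "pos": "NCFS"},
]

def token_matches(token, conditions):
    """True if every attribute regex matches the whole attribute value."""
    return all(
        re.fullmatch(regex, token.get(attribute, "")) is not None
        for attribute, regex in conditions.items()
    )

def find_sequences(tokens, pattern):
    """Return the start indices where consecutive tokens satisfy the pattern."""
    hits = []
    for start in range(len(tokens) - len(pattern) + 1):
        window = tokens[start:start + len(pattern)]
        if all(token_matches(t, c) for t, c in zip(window, pattern)):
            hits.append(start)
    return hits

# Counterpart of [nform="casa"] [pos="A.*"]
casa_adj = [{"nform": "casa"}, {"pos": "A.*"}]
# Counterpart of [word=".*au.*"]
contains_au = [{"word": ".*au.*"}]

print(find_sequences(tokens, casa_adj))
print(find_sequences(tokens, contains_au))
```

Because the regular expression has to cover the whole value, a condition such as "au" alone would only match a word that is exactly "au", which is why the example above wraps it in ".*".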

NB: to search for an individual word or a sequence of words, you can simply type the words you intend to find, without any non-alphabetic characters. In this case, the spelling must match the original document exactly. For example, if you type only “recibí una carta”, the search will not return possible original variants such as “resebi”, “Recebi”, “caRta” or “Carta”.
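If you want a phrase search that goes through an attribute instead of the exact original spelling, the bracketed query can be written out mechanically. The Python sketch below builds one; the function name phrase_to_cql is hypothetical, the words are assumed to contain letters only (no regular-expression characters are escaped), and whether the resulting nform query actually retrieves a given spelling variant depends on how the corpus was normalized.

```python
def phrase_to_cql(phrase, attribute="nform"):
    """Build one bracketed token condition per whitespace-separated word.

    Hypothetical helper: assumes the words contain letters only.
    """
    return " ".join(f'[{attribute}="{word}"]' for word in phrase.split())

print(phrase_to_cql("recibí una carta"))
# [nform="recibí"] [nform="una"] [nform="carta"]
```

Passing attribute="form" instead yields a query on the form attribute, as in the “mujer” example of the Multisearch section.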

Multisearch

The system includes an option, still under development, that runs several searches at once and compares their results. You can test the interface here. In the Searches table you can store several search expressions: the left column holds an arbitrary query name and the right column holds the query, written in correct CQL syntax. One example is the comparison between the written forms “mujer” and “muger”:

Query name | CQL Query
mujer | [form="mujer"]
muger | [form="muger"]

Another example is the comparison between the expressions “además de”, “demás de” and “a más de”:

Query name | CQL Query
además de | [nform="además"] [nform="de"]
demás de | [nform="demás"] [nform="de"]
a más de | [nform="a"] [nform="más"] [nform="de"]

A third example traces the diachronic behaviour of a feature by running the same query over successive periods:

Query name | CQL Query
ditongos e hiatos portugueses XVI-XVII | [ltags="diphthong_and_hiatus"] :: match.text_lang = "PT" & int(match.text_year) >= 1500 & int(match.text_year) <= 1650 within text
ditongos e hiatos portugueses XVII-XVIII | [ltags="diphthong_and_hiatus"] :: match.text_lang = "PT" & int(match.text_year) >= 1651 & int(match.text_year) <= 1750 within text
ditongos e hiatos portugueses XVIII-XIX | [ltags="diphthong_and_hiatus"] :: match.text_lang = "PT" & int(match.text_year) >= 1751 & int(match.text_year) <= 1840 within text
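Since the three period queries differ only in their year bounds, they can be generated from a template instead of being typed by hand. The Python sketch below reproduces the rows above; the function name period_query and the periods list are illustrative, and the output still has to be pasted into the Searches table manually.

```python
def period_query(tag, lang, first_year, last_year):
    """Build a query restricted to texts of one language and one year range."""
    return (
        f'[ltags="{tag}"] :: match.text_lang = "{lang}" '
        f'& int(match.text_year) >= {first_year} '
        f'& int(match.text_year) <= {last_year} within text'
    )

# Period labels and year bounds copied from the table above.
periods = [
    ("XVI-XVII", 1500, 1650),
    ("XVII-XVIII", 1651, 1750),
    ("XVIII-XIX", 1751, 1840),
]

rows = [
    (
        f"ditongos e hiatos portugueses {label}",
        period_query("diphthong_and_hiatus", "PT", first, last),
    )
    for label, first, last in periods
]

for name, query in rows:
    print(name, "|", query)
```

Keeping the bounds in one list makes it easy to check that the periods neither overlap nor leave a gap between them.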

If you prefer to see the multisearch results on a map, you should launch the search from here.

Search on the Syntactically Annotated Corpus

Please go to the window where you can either select predefined queries or type them using XPath syntax: Tree Search.

Raw Search

If the Search explained above doesn't serve your purposes, please try the Raw Search.

Note on the social classification of letters' authors and addressees

The social classification of participants is based on the court to which each individual could appeal in Early Modern times, according to his or her juridical condition. Nine possibilities were considered: