Here is some info that we hope will be useful to BPFK commissioners, and other people doing research on the Lojban language.
The two alternatives above will often give so many false positives in English as to be useless. The main source we have for Lojban usage is the IRC logs:
The logs are filtered line by line to exclude lines that have too many words that are not possible Lojban word-forms, so the result is a very high-quality corpus. Unfortunately, the files are separated by date, which makes it difficult to search across them.
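One way around the per-date split is to search every file in a single pass. The following is only a minimal sketch, assuming the logs have been downloaded into one local directory as UTF-8 text files with a `.txt` extension whose names sort by date. The directory layout, the extension, and the helper names `search_logs` and `word_pattern` are all assumptions made for illustration, not part of the actual archive.

```python
import re
from pathlib import Path


def word_pattern(word):
    """Build a regex matching `word` only as a whole whitespace-delimited token.

    Plain \\b is avoided because Lojban words may contain apostrophes
    or begin with a period.
    """
    return r"(?<!\S)" + re.escape(word) + r"(?!\S)"


def search_logs(log_dir, pattern):
    """Yield (file name, line number, line) for each matching line.

    Assumes (hypothetically) one .txt file per date inside log_dir.
    """
    regex = re.compile(pattern)
    # Sorting by name keeps the hits in date order if names start with the date.
    for path in sorted(Path(log_dir).glob("*.txt")):
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    yield path.name, number, line.rstrip("\n")
```

For example, `for hit in search_logs("irc-logs", word_pattern("klama")): print(hit)` would print every line containing the word klama, where `irc-logs` is a placeholder directory name. Matching on whitespace-delimited tokens is a simplification; it will miss words that are directly attached to punctuation.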
We also have some contributed texts that were uploaded to the old Twiki and are (to my knowledge) not available elsewhere: