Lojban Sentence Templates
Posted by Eppcott on Mon 10 of Nov, 2008 18:38 GMT posts: 4740
Use this thread to discuss the Lojban Sentence Templates page.
Posted by Eppcott on Mon 10 of Nov, 2008 18:38 GMT posts: 4740
A couple of years ago I recommended to Robin a type of Lojban random sentence generator that relies on templates. Take an existing utterance that parses and transform each cmavo into the upper-case symbol for its selma'o. Transform each gismu, lujvo and fu'ivla into a tag meaning < gismu >, < lujvo > and < fu'ivla > respectively. The result is a template.
To use the template to make a random sentence, replace each selma'o tag with a cmavo from that selma'o, and each gismu tag with an actual gismu. You get a sentence that parses. Let's say you have the template LE MI GISMU CU GISMU KO'A2. Fill in the slots randomly with "le do bakni cu sovda di'u" and you get something nonsensical (Your cow is an egg of the last utterance.), but it parses.
The reason I bring this up now is that I would like to find out which templates are the most common in the searchable corpus of Lojban utterances, such as IRC logs. This would suggest very useful templates for the home-game I'm building with dice, paper, and ceramics. I am considering recording audio learning courses in which I would group Lojban utterances (sensible ones) by template, to provide a variety of examples of simple valid sentence structures, and relate selma'o through substitution. Perhaps someone could intuit the most common and useful templates, if not search for them with text-file processing.
-Eppcott
To unsubscribe from this list, send mail to lojban-list-request@lojban.org with the subject unsubscribe, or go to http://www.lojban.org/lsg2/, or if you're really stuck, send mail to secretary@lojban.org for help.
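The two steps described in this post (turn an utterance into a template, then fill the template at random) can be sketched roughly as follows. This is only an illustration under stated assumptions: the tiny `LEXICON` is made up for the demo, mi, do and di'u are all tagged with the single class KOhA rather than the finer MI and KO'A2 labels used in the post, and nothing here checks that the output actually parses.

```python
import random

# Hypothetical mini-lexicon mapping each word to its class tag.
# A real tool would load the full cmavo and gismu lists instead.
LEXICON = {
    "le": "LE", "lo": "LE",
    "mi": "KOhA", "do": "KOhA", "di'u": "KOhA",
    "cu": "CU",
    "bakni": "gismu", "sovda": "gismu", "klama": "gismu",
}


def to_template(sentence):
    """Replace each word of the sentence with its class tag."""
    return [LEXICON[word] for word in sentence.split()]


def fill_template(template, rng):
    """Pick a random word of the right class for each slot."""
    by_tag = {}
    for word, tag in LEXICON.items():
        by_tag.setdefault(tag, []).append(word)
    return " ".join(rng.choice(by_tag[tag]) for tag in template)


template = to_template("le do bakni cu sovda di'u")
print(template)  # ['LE', 'KOhA', 'gismu', 'CU', 'gismu', 'KOhA']
print(fill_template(template, random.Random()))
```

Any sentence produced this way maps back to the same template, which is the property the substitution idea relies on.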
Posted by adamgarrigus on Mon 10 of Nov, 2008 18:51 GMT posts: 92
On Mon, Nov 10, 2008 at 1:37 PM, Matt Arnold <matt.mattarn@gmail.com> wrote:
> A couple of years ago I recommended to Robin a type of Lojban random
> sentence generator that relies on templates. Taking an existing
> utterance that parses, you transform each cmavo into the upper-case
> symbol for its selma'o. You transform each gismu, lujvo and fu'ivla
> into a tag meaning < gismu >, < lujvo > and < fu'ivla > respectively.
> You have a template.
>
> To use the template to make a random sentence, replace each selma'o
> tag with a cmavo from that selma'o, and each gismu with an actual
> gismu. You get a sentence that parses. Let's say you have the template
> LE MI GISMU CU GISMU KO'A2. Fill in the slots randomly with "le do
> bakni cu sovda di'u" and you get something nonsensical (Your cow is
> an egg of the last utterance.), but it parses.
>
> The reason I bring this up now is that I would like to find out which
> templates are the most common in the searchable corpus of Lojban
> utterances, such as IRC logs. This would suggest very useful templates
> for the home-game I'm building with dice, paper, and ceramics. I am
> considering recording audio learning courses in which I would group
> Lojban utterances (sensible ones) by template, to provide a variety of
> examples of simple valid sentence structures, and relate selma'o
> through substitution. Perhaps someone could intuit the most common and
> useful templates, if not search it with textfile processing.
Smallish point: Is there any reason not to merge < gismu >, < lujvo >, and < fu'ivla > into < brivla >? The distinctions between those three seem not relevant to the stated purpose. { le do .arxokuna cu barkla di'u } works just as well, right?
mu'o mi'e komfo,amonan
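The merge suggested here amounts to collapsing three tags into one before templates are compared or filled. A minimal sketch, assuming templates are represented as lists of tag strings (a representation chosen for this example, not something stated in the thread):

```python
# The three word-class tags that would be collapsed into one.
BRIVLA_TAGS = {"gismu", "lujvo", "fu'ivla"}


def merge_brivla(template):
    """Return a copy of the template with every brivla subtype tagged 'brivla'."""
    return ["brivla" if tag in BRIVLA_TAGS else tag for tag in template]


print(merge_brivla(["LE", "KOhA", "fu'ivla", "CU", "lujvo", "KOhA"]))
# ['LE', 'KOhA', 'brivla', 'CU', 'brivla', 'KOhA']
```

With this normalization, { le do bakni cu sovda di'u } and { le do .arxokuna cu barkla di'u } fall under the same template.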
Posted by Eppcott on Mon 10 of Nov, 2008 19:26 GMT posts: 4740
True!
On Mon, Nov 10, 2008 at 1:50 PM, komfo,amonan <komfoamonan@gmail.com> wrote:
> Smallish point: Is there any reason not to merge < gismu >, < lujvo >, and <
> fu'ivla > into < brivla >? The distinctions between those three seem not
> relevant to the stated purpose. { le do .arxokuna cu barkla di'u } works
> just as well, right?
>
> mu'o mi'e komfo,amonan
Posted by PierreAbbat on Mon 10 of Nov, 2008 21:30 GMT posts: 324
On Monday 10 November 2008 13:50:47 komfo,amonan wrote:
> Smallish point: Is there any reason not to merge < gismu >, < lujvo >, and
> < fu'ivla > into < brivla >? The distinctions between those three seem not
> relevant to the stated purpose. { le do .arxokuna cu barkla di'u } works
> just as well, right?
arxokuna ki'a? .i lumge'u java'i prokiono xu?
mu'o mi'e pier
Posted by adamgarrigus on Mon 10 of Nov, 2008 22:35 GMT posts: 92
On Mon, Nov 10, 2008 at 4:28 PM, Pierre Abbat <phma@phma.optus.nu> wrote:
> On Monday 10 November 2008 13:50:47 komfo,amonan wrote:
> > Smallish point: Is there any reason not to merge < gismu >, < lujvo >, and
> > < fu'ivla > into < brivla >? The distinctions between those three seem not
> > relevant to the stated purpose. { le do .arxokuna cu barkla di'u } works
> > just as well, right?
>
> arxokuna ki'a? .i lumge'u java'i prokiono xu?
go'i .i mi ca lo nu djica lo nu fanva cu pu facki fi no valsi be fi la lojban .i zvati pe'a la jbovlaste ji lo drata
mu'o mi'e komfo,amonan
Posted by pdf23ds on Mon 10 of Nov, 2008 23:51 GMT posts: 143
On Mon, Nov 10, 2008 at 12:37, Matt Arnold <matt.mattarn@gmail.com> wrote:
> A couple of years ago I recommended to Robin a type of Lojban random
> sentence generator that relies on templates. Taking an existing
> utterance that parses, you transform each cmavo into the upper-case
> symbol for its selma'o. You transform each gismu, lujvo and fu'ivla
> into a tag meaning < gismu >, < lujvo > and < fu'ivla > respectively.
> You have a template.
Lojbob has a working sentence parser, but he never did release it (that I'm aware of). I have a copy on my server, and I'll give the URL if it's alright with him.
Chris Capel
--
"What is it like to be a bat? What is it like to bat a bee? What is it like to be a bee being batted? What is it like to be a batted bee?"
-- The Mind's I (Hofstadter, Dennett)
Posted by arj on Tue 11 of Nov, 2008 07:59 GMT posts: 953
On Mon, Nov 10, 2008 at 01:37:22PM -0500, Matt Arnold wrote:
> The reason I bring this up now is that I would like to find out which
> templates are the most common in the searchable corpus of Lojban
> utterances, such as IRC logs. This would suggest very useful templates
> for the home-game I'm building with dice, paper, and ceramics. I am
> considering recording audio learning courses in which I would group
> Lojban utterances (sensible ones) by template, to provide a variety of
> examples of simple valid sentence structures, and relate selma'o
> through substitution. Perhaps someone could intuit the most common and
> useful templates, if not search it with textfile processing.
With the help of #lojban, I've cobbled together a small collection of scripts that runs through the IRC logs and outputs the sequence of selma'o/word classes used on each line. It's taking very long to run (I don't expect it to complete for days), but here are the ten most frequent templates as of now (about 10% complete):
3191 COI
2071 COI cmene
1675 UI
1290 COI PA KOhA
 564 gismu
 415 UI CAI
 363 KOhA gismu
 337 PA
 332 COI gismu
 324 GOhA
No big surprises here.
--
Arnt Richard Johansen http://arj.nvg.org/
<Nixon> Etter revolusjonen har jeg ordnet meg slik at jeg får meg statue.
<Nixon> Har avtalt dette med nøkkelpersoner på venstresiden.
<Nixon> Som takk for min innsats.
<Nixon> Det blir en 150m høy statue i havnebassenget.
<Kre> skal du ha restaurant i hodet?
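The scripts themselves are not shown in the thread, so the following is only a guess at the general shape of such a count: map every log line to its template string and tally the results. The `TOY_CLASSES` table and the `?` tag for unknown words are assumptions made for this example; a real run would need a proper Lojban word classifier or parser.

```python
from collections import Counter

# Hypothetical word-class table, just large enough for the sample below.
TOY_CLASSES = {
    "coi": "COI", "ro": "PA", "no": "PA",
    "mi": "KOhA", "do": "KOhA",
    "ui": "UI", "klama": "gismu", "go'i": "GOhA",
}


def template_of(line):
    """Map a line to its template; unknown words get the tag '?'."""
    return " ".join(TOY_CLASSES.get(word, "?") for word in line.split())


def count_templates(lines):
    """Tally how often each template occurs in the given lines."""
    return Counter(template_of(line) for line in lines)


sample = ["coi ro do", "coi", "ui", "coi ro do", "mi klama", ""]
for template, n in count_templates(sample).most_common():
    print(n, template)
```

Note that an empty line maps to the empty template and is counted like any other, so blank lines in a log would show up in the totals.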
Posted by Anonymous on Tue 11 of Nov, 2008 11:37 GMT
On Tue, Nov 11, 2008 at 4:57 AM, Arnt Richard Johansen <arj@nvg.org> wrote:
>
> It's taking very long to run (I don't expect it to complete for days), but here are the ten most frequent templates as of now (about 10% complete):
>
> 3191 COI
> 2071 COI cmene
> 1675 UI
> 1290 COI PA KOhA
> 564 gismu
> 415 UI CAI
> 363 KOhA gismu
> 337 PA
> 332 COI gismu
> 324 GOhA
>
> No big surprises here.
Single PA is a little bit surprising to me. Many answers to "how many?" questions? Could they be infiltrated English "no"s? The rest are all expected.
I'm not sure that generating anything but {coi ro do} out of COI PA KOhA makes much sense though. Probably 100% of the instances of that template are made with the same words. Similarly for some of the other templates.
mu'o mi'e xorxes
Posted by pdf23ds on Tue 11 of Nov, 2008 13:54 GMT posts: 143
On Mon, Nov 10, 2008 at 17:49, Chris Capel <pdf23ds@gmail.com> wrote:
> Lojbob has a working sentence parser, but he never did release it
> (that I'm aware of). I have a copy on my server, and I'll give the URL
> if it's alright with him.
Ack. "Sentence parser"? I meant random sentence generator.
Chris Capel
--
"What is it like to be a bat? What is it like to bat a bee? What is it like to be a bee being batted? What is it like to be a batted bee?"
-- The Mind's I (Hofstadter, Dennett)
Posted by arj on Tue 11 of Nov, 2008 18:48 GMT posts: 953
On Tue, Nov 11, 2008 at 08:33:11AM -0300, Jorge Llambías wrote:
> On Tue, Nov 11, 2008 at 4:57 AM, Arnt Richard Johansen <arj@nvg.org> wrote:
> >
> > 337 PA
>
> Single PA is a little bit surprising to me. Many answers to "how
> many?" questions? Could they be infiltrated English "no"s?
You're absolutely correct. 1364 lines in the entire corpus consist of someone saying just "no". Other single-digit utterances (from pa to so) are only at 207, combined. ("so" is probably also infiltration from English.)
--
Arnt Richard Johansen http://arj.nvg.org/
Jeg er nok verdens sydligste sengevæter. Forutsatt at ingen på basen på Sydpolen driver med slikt, da. --Erling Kagge: Alene til Sydpolen
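A check like the one reported here could look roughly like this: count the lines that consist of exactly one Lojban digit word, so that "no" and "so" can be compared with the rest. The function name and the lower-casing of each line are choices made for this sketch, not details from the thread, and the sample lines are invented.

```python
from collections import Counter

# The ten Lojban digit cmavo, 0 through 9.
DIGITS = ["no", "pa", "re", "ci", "vo", "mu", "xa", "ze", "bi", "so"]


def single_digit_counts(lines):
    """Count lines that consist of exactly one digit word."""
    counts = Counter()
    for line in lines:
        words = line.lower().split()
        if len(words) == 1 and words[0] in DIGITS:
            counts[words[0]] += 1
    return counts


sample = ["no", "No", "coi ro do", "pa", "so", "no way", "re"]
counts = single_digit_counts(sample)
others = sum(n for word, n in counts.items() if word != "no")
print(counts["no"], others)
```

A count of bare "no" lines that dwarfs all the other digits combined, as in the figures above, is what points to English rather than Lojban usage.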
Posted by lagejyspa on Wed 12 of Nov, 2008 14:41 GMT posts: 350
On Mon, Nov 10, 2008 at 5:34 PM, komfo,amonan <komfoamonan@gmail.com> wrote:
> On Mon, Nov 10, 2008 at 4:28 PM, Pierre Abbat <phma@phma.optus.nu> wrote:
>>
>> On Monday 10 November 2008 13:50:47 komfo,amonan wrote:
>> > Smallish point: Is there any reason not to merge < gismu >, < lujvo >, and
>> > < fu'ivla > into < brivla >? The distinctions between those three seem not
>> > relevant to the stated purpose. { le do .arxokuna cu barkla di'u } works
>> > just as well, right?
>>
>> arxokuna ki'a? .i lumge'u java'i prokiono xu?
>
> go'i .i mi ca lo nu djica lo nu fanva cu pu facki fi no valsi be fi la
> lojban .i zvati pe'a la jbovlaste ji lo drata
doi komfo,amonan .iki'u lo nu do finti zo arxokuna fi'o velji'i tu'a la'o net. http://laxmahispajispaji.blogspot.com/2007_02_01_archive.html net. kei do djuno lo du'u makau fo makau cu te fu'ivla .i ma joi ma du la'edi'u
--gejyspa
Posted by adamgarrigus on Wed 12 of Nov, 2008 17:31 GMT posts: 92
On Wed, Nov 12, 2008 at 9:39 AM, Michael Turniansky <mturniansky@gmail.com> wrote:
> On Mon, Nov 10, 2008 at 5:34 PM, komfo,amonan <komfoamonan@gmail.com> wrote:
> > On Mon, Nov 10, 2008 at 4:28 PM, Pierre Abbat <phma@phma.optus.nu> wrote:
> >>
> >> On Monday 10 November 2008 13:50:47 komfo,amonan wrote:
> >> > Smallish point: Is there any reason not to merge < gismu >, < lujvo >, and
> >> > < fu'ivla > into < brivla >? The distinctions between those three seem not
> >> > relevant to the stated purpose. { le do .arxokuna cu barkla di'u } works
> >> > just as well, right?
> >>
> >> arxokuna ki'a? .i lumge'u java'i prokiono xu?
> >
> > go'i .i mi ca lo nu djica lo nu fanva cu pu facki fi no valsi be fi la
> > lojban .i zvati pe'a la jbovlaste ji lo drata
>
> doi komfo,amonan .iki'u lo nu do finti zo arxokuna fi'o velji'i tu'a
> la'o net. http://laxmahispajispaji.blogspot.com/2007_02_01_archive.html
> net. kei do djuno lo du'u makau fo makau cu te fu'ivla .i ma joi ma du
> la'edi'u
zo arxokuna fu'ivla fo zoi zoi aroughcun/arethkone/arehkan zoi noi ke'a valsi fi la'o zoi Powhatan zoi zi'e noi ke'a krasi zoi zoi raccoon zoi .i ra morsi pe'a bangu .i je la'a mi fu'irgau ei fi lo valsi be fi lo jmive pe'a bangu be mu'u la'o zoi Mikasuki zoi .i
mu'o mi'e komfo,amonan
Posted by arj on Sun 16 of Nov, 2008 11:30 GMT posts: 953
On Tue, Nov 11, 2008 at 08:57:09AM +0100, Arnt Richard Johansen wrote:
> On Mon, Nov 10, 2008 at 01:37:22PM -0500, Matt Arnold wrote:
> > The reason I bring this up now is that I would like to find out which
> > templates are the most common in the searchable corpus of Lojban
> > utterances, such as IRC logs. This would suggest very useful templates
> > for the home-game I'm building with dice, paper, and ceramics. I am
> > considering recording audio learning courses in which I would group
> > Lojban utterances (sensible ones) by template, to provide a variety of
> > examples of simple valid sentence structures, and relate selma'o
> > through substitution. Perhaps someone could intuit the most common and
> > useful templates, if not search it with textfile processing.
>
> With the help of #lojban, I've cobbled together a small collection of scripts that runs through the IRC logs and outputs the sequence of selma'o/word classes that are used.
Done. The complete list is here:
http://www.lojban.org/tiki/tiki-download_wiki_attachment.php?attId=647
(As for why 2033 blank lines have been counted, your guess is as good as mine.)
--
Arnt Richard Johansen http://arj.nvg.org/
For en bedre verden og tørrere, gladere barn.
Posted by Anonymous on Sun 16 of Nov, 2008 13:01 GMT
On Sun, Nov 16, 2008 at 8:27 AM, Arnt Richard Johansen <arj@nvg.org> wrote:
>
> Done. The complete list is here:
> http://www.lojban.org/tiki/tiki-download_wiki_attachment.php?attId=647
>
> (As for why 2033 blank lines have been counted, your guess is as good as mine.)
An empty string is a valid text, so empty lines should be counted as grammatical Lojban.
mu'o mi'e xorxes