Lojban In General



posts: 4740

A couple of years ago I recommended to Robin a type of Lojban random
sentence generator that relies on templates. Taking an existing
utterance that parses, you transform each cmavo into the upper-case
symbol for its selma'o. You transform each gismu, lujvo and fu'ivla
into a tag meaning < gismu >, < lujvo > and < fu'ivla > respectively.
You have a template.
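
The transformation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny SELMAHO and GISMU tables are hypothetical stand-ins for a full word list, words are split on whitespace only, and tags are written without inner spaces (<gismu>). A real tool would take its word classes from an actual parser.

```python
# Hypothetical, hand-made lexicon; a real tool would load the full cmavo list.
SELMAHO = {
    "le": "LE", "lo": "LE",
    "mi": "KOhA", "do": "KOhA", "di'u": "KOhA",
    "cu": "CU",
}
# Hypothetical sample of gismu; lujvo and fu'ivla are not handled here.
GISMU = {"bakni", "gerku", "batci", "klama"}


def to_template(utterance):
    """Replace each word by its selma'o symbol or by a word-class tag."""
    tags = []
    for word in utterance.split():
        if word in SELMAHO:
            tags.append(SELMAHO[word])
        elif word in GISMU:
            tags.append("<gismu>")
        else:
            # Anything outside the toy lexicon is flagged, not guessed.
            tags.append("<unknown>")
    return " ".join(tags)


print(to_template("le mi gerku cu batci do"))
```

Two utterances with the same sequence of word classes yield the same template string, which is what makes templates easy to group and count.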

To use the template to make a random sentence, replace each selma'o
tag with a cmavo from that selma'o, and each gismu with an actual
gismu. You get a sentence that parses. Let's say you have the template
LE MI GISMU CU GISMU KO'A2. Fill in the slots randomly with "le do
bakni cu sovda di'u" and you get something nonsensical (Your cow is
an egg of the last utterance.), but it parses.
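
Filling a template is the reverse step. The sketch below assumes a hypothetical WORDS table that maps each tag to a few candidate words. The tag names and word lists are illustrative only and do not reproduce the MI / KO'A2 subdivision used in the example template above.

```python
import random

# Hypothetical mini word lists, keyed by template tag.
WORDS = {
    "LE": ["le", "lo"],
    "KOhA": ["mi", "do", "ko'a", "di'u"],
    "CU": ["cu"],
    "<gismu>": ["bakni", "gerku", "klama", "prami"],
}


def fill_template(template, rng=random):
    """Pick one random word for every tag in the template."""
    return " ".join(rng.choice(WORDS[tag]) for tag in template.split())


print(fill_template("LE KOhA <gismu> CU <gismu> KOhA"))
```

Passing a seeded random.Random instance as rng makes the output repeatable, which may be handy when preparing printed game material.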

The reason I bring this up now is that I would like to find out which
templates are the most common in the searchable corpus of Lojban
utterances, such as IRC logs. This would suggest very useful templates
for the home-game I'm building with dice, paper, and ceramics. I am
considering recording audio learning courses in which I would group
Lojban utterances (sensible ones) by template, to provide a variety of
examples of simple valid sentence structures, and relate selma'o
through substitution. Perhaps someone could intuit the most common and
useful templates, if not search for them with text-file processing.

-Eppcott


To unsubscribe from this list, send mail to lojban-list-request@lojban.org
with the subject unsubscribe, or go to http://www.lojban.org/lsg2/, or if
you're really stuck, send mail to secretary@lojban.org for help.

posts: 92

On Mon, Nov 10, 2008 at 1:37 PM, Matt Arnold <matt.mattarn@gmail.com> wrote:

> A couple of years ago I recommended to Robin a type of Lojban random
> sentence generator that relies on templates. Taking an existing
> utterance that parses, you transform each cmavo into the upper-case
> symbol for its selma'o. You transform each gismu, lujvo and fu'ivla
> into a tag meaning < gismu >, < lujvo > and < fu'ivla > respectively.
> You have a template.
>
> To use the template to make a random sentence, replace each selma'o
> tag with a cmavo from that selma'o, and each gismu with an actual
> gismu. You get a sentence that parses. Let's say you have the template
> LE MI GISMU CU GISMU KO'A2. Fill in the slots randomly with "le do
> bakni cu sovda di'u" and you get something nonsensical (Your cow is
> an egg of the last utterance.), but it parses.
>
> The reason I bring this up now is that I would like to find out which
> templates are the most common in the searchable corpus of Lojban
> utterances, such as IRC logs. This would suggest very useful templates
> for the home-game I'm building with dice, paper, and ceramics. I am
> considering recording audio learning courses in which I would group
> Lojban utterances (sensible ones) by template, to provide a variety of
> examples of simple valid sentence structures, and relate selma'o
> through substitution. Perhaps someone could intuit the most common and
> useful templates, if not search for them with text-file processing.


Smallish point: Is there any reason not to merge < gismu >, < lujvo >, and <
fu'ivla > into < brivla >? The distinctions between those three seem not
relevant to the stated purpose. { le do .arxokuna cu barkla di'u } works
just as well, right?
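
As a minimal sketch of that merge, assuming templates are stored as space-separated tag strings with the tags written without inner spaces (an assumption, not something the thread specifies):

```python
# Map the three word-class tags onto a single <brivla> tag.
MERGE = {
    "<gismu>": "<brivla>",
    "<lujvo>": "<brivla>",
    "<fu'ivla>": "<brivla>",
}


def merge_brivla(template):
    """Collapse gismu/lujvo/fu'ivla tags into <brivla>; keep other tags."""
    return " ".join(MERGE.get(tag, tag) for tag in template.split())


print(merge_brivla("LE KOhA <fu'ivla> CU <lujvo> KOhA"))
```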

mu'o mi'e komfo,amonan

posts: 4740

True!

On Mon, Nov 10, 2008 at 1:50 PM, komfo,amonan <komfoamonan@gmail.com> wrote:
> On Mon, Nov 10, 2008 at 1:37 PM, Matt Arnold <matt.mattarn@gmail.com> wrote:
>>
>> A couple of years ago I recommended to Robin a type of Lojban random
>> sentence generator that relies on templates. Taking an existing
>> utterance that parses, you transform each cmavo into the upper-case
>> symbol for its selma'o. You transform each gismu, lujvo and fu'ivla
>> into a tag meaning < gismu >, < lujvo > and < fu'ivla > respectively.
>> You have a template.
>>
>> To use the template to make a random sentence, replace each selma'o
>> tag with a cmavo from that selma'o, and each gismu with an actual
>> gismu. You get a sentence that parses. Let's say you have the template
>> LE MI GISMU CU GISMU KO'A2. Fill in the slots randomly with "le do
>> bakni cu sovda di'u" and you get something nonsensical (Your cow is
>> an egg of the last utterance.), but it parses.
>>
>> The reason I bring this up now is that I would like to find out which
>> templates are the most common in the searchable corpus of Lojban
>> utterances, such as IRC logs. This would suggest very useful templates
>> for the home-game I'm building with dice, paper, and ceramics. I am
>> considering recording audio learning courses in which I would group
>> Lojban utterances (sensible ones) by template, to provide a variety of
>> examples of simple valid sentence structures, and relate selma'o
>> through substitution. Perhaps someone could intuit the most common and
>> useful templates, if not search for them with text-file processing.
>
> Smallish point: Is there any reason not to merge < gismu >, < lujvo >, and <
> fu'ivla > into < brivla >? The distinctions between those three seem not
> relevant to the stated purpose. { le do .arxokuna cu barkla di'u } works
> just as well, right?
> mu'o mi'e komfo,amonan



posts: 324

On Monday 10 November 2008 13:50:47 komfo,amonan wrote:
> Smallish point: Is there any reason not to merge < gismu >, < lujvo >, and
> < fu'ivla > into < brivla >? The distinctions between those three seem not
> relevant to the stated purpose. { le do .arxokuna cu barkla di'u } works
> just as well, right?

arxokuna ki'a? .i lumge'u java'i prokiono xu?

mu'o mi'e pier



posts: 92
On Mon, Nov 10, 2008 at 4:28 PM, Pierre Abbat <phma@phma.optus.nu> wrote:


> On Monday 10 November 2008 13:50:47 komfo,amonan wrote:
> > Smallish point: Is there any reason not to merge < gismu >, < lujvo >, and
> > < fu'ivla > into < brivla >? The distinctions between those three seem not
> > relevant to the stated purpose. { le do .arxokuna cu barkla di'u } works
> > just as well, right?
>
> arxokuna ki'a? .i lumge'u java'i prokiono xu?


go'i .i mi ca lo nu djica lo nu fanva cu pu facki fi no valsi be fi la
lojban .i zvati pe'a la jbovlaste ji lo drata

mu'o mi'e komfo,amonan

posts: 143

On Mon, Nov 10, 2008 at 12:37, Matt Arnold <matt.mattarn@gmail.com> wrote:
> A couple of years ago I recommended to Robin a type of Lojban random
> sentence generator that relies on templates. Taking an existing
> utterance that parses, you transform each cmavo into the upper-case
> symbol for its selma'o. You transform each gismu, lujvo and fu'ivla
> into a tag meaning < gismu >, < lujvo > and < fu'ivla > respectively.
> You have a template.

Lojbob has a working sentence parser, but he never did release it
(that I'm aware of). I have a copy on my server, and I'll give the URL
if it's alright with him.

Chris Capel
--
"What is it like to be a bat? What is it like to bat a bee? What is it
like to be a bee being batted? What is it like to be a batted bee?"
-- The Mind's I (Hofstadter, Dennett)



posts: 953

On Mon, Nov 10, 2008 at 01:37:22PM -0500, Matt Arnold wrote:
> The reason I bring this up now is that I would like to find out which
> templates are the most common in the searchable corpus of Lojban
> utterances, such as IRC logs. This would suggest very useful templates
> for the home-game I'm building with dice, paper, and ceramics. I am
> considering recording audio learning courses in which I would group
> Lojban utterances (sensible ones) by template, to provide a variety of
> examples of simple valid sentence structures, and relate selma'o
> through substitution. Perhaps someone could intuit the most common and
> useful templates, if not search for them with text-file processing.

With the help of #lojban, I've cobbled together a small collection of scripts that runs through the IRC logs and outputs the sequence of selma'o/word classes that are used.

It's taking very long to run — I don't expect it to complete in days — but here are the ten most frequent templates as of now (about 10% complete):

3191 COI
2071 COI cmene
1675 UI
1290 COI PA KOhA
564 gismu
415 UI CAI
363 KOhA gismu
337 PA
332 COI gismu
324 GOhA

No big surprises here.
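
The scripts themselves are not shown in this thread. Purely as an illustration of the counting step, and assuming each log line has already been converted to a space-separated tag sequence, a frequency table in the same "count template" layout could be produced like this:

```python
from collections import Counter


def count_templates(templates):
    """Count identical tag sequences, most frequent first."""
    return Counter(templates).most_common()


# Hypothetical sample input: one template per log line.
sample = ["COI", "COI cmene", "UI", "COI", "COI cmene", "COI"]

for template, n in count_templates(sample):
    print(n, template)
```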

--
Arnt Richard Johansen http://arj.nvg.org/
<Nixon> After the revolution I have arranged things so that I get a statue.
<Nixon> I have agreed this with key people on the left.
<Nixon> As thanks for my efforts.
<Nixon> It will be a 150 m tall statue in the harbour basin.
<Kre> are you going to have a restaurant in the head?



On Tue, Nov 11, 2008 at 4:57 AM, Arnt Richard Johansen <arj@nvg.org> wrote:
>
> It's taking very long to run — I don't expect it to complete in days — but here are the ten most frequent templates as of now (about 10% complete):
>
> 3191 COI
> 2071 COI cmene
> 1675 UI
> 1290 COI PA KOhA
> 564 gismu
> 415 UI CAI
> 363 KOhA gismu
> 337 PA
> 332 COI gismu
> 324 GOhA
>
> No big surprises here.

Single PA is a little bit surprising to me. Many answers to "how
many?" questions? Could they be infiltrated English "no"s? The rest
are all expected.

I'm not sure that generating anything but {coi ro do} out of COI PA
KOhA makes much sense though. Probably 100% of the template is made
with the same words. Similarly for some of the other templates.

mu'o mi'e xorxes



posts: 143
On Mon, Nov 10, 2008 at 17:49, Chris Capel <pdf23ds@gmail.com> wrote:

> Lojbob has a working sentence parser, but he never did release it
> (that I'm aware of). I have a copy on my server, and I'll give the URL
> if it's alright with him.

Ack. "Sentence parser"? I meant random sentence generator.

Chris Capel
--
"What is it like to be a bat? What is it like to bat a bee? What is it
like to be a bee being batted? What is it like to be a batted bee?"
-- The Mind's I (Hofstadter, Dennett)



posts: 953

On Tue, Nov 11, 2008 at 08:33:11AM -0300, Jorge Llambías wrote:
> On Tue, Nov 11, 2008 at 4:57 AM, Arnt Richard Johansen <arj@nvg.org> wrote:
> >
> > 337 PA
>
> Single PA is a little bit surprising to me. Many answers to "how
> many?" questions? Could they be infiltrated English "no"s?

You're absolutely correct. 1364 lines in the entire corpus consist of someone saying just "no". Other single-digit utterances (from pa to so) are only at 207, combined. ("so" is probably also infiltration from English.)
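
A check of this kind can be sketched as follows. It assumes the corpus is available as a list of utterance strings and looks only at the ten digit cmavo (no through so), not at all of selma'o PA. The sample lines are made up for illustration.

```python
from collections import Counter

# The ten digit cmavo, 0 through 9.
DIGITS = {"no", "pa", "re", "ci", "vo", "mu", "xa", "ze", "bi", "so"}


def single_digit_counts(lines):
    """Count utterances that consist of exactly one digit cmavo."""
    counts = Counter()
    for line in lines:
        words = line.split()
        if len(words) == 1 and words[0] in DIGITS:
            counts[words[0]] += 1
    return counts


# Hypothetical sample lines.
sample = ["no", "coi ro do", "no", "so", "pa", "mi klama"]
counts = single_digit_counts(sample)
print(counts["no"], sum(counts.values()) - counts["no"])
```

Comparing the count for "no" with the combined count of the other digits gives a rough indication of how much of the single-PA template may come from English.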

--
Arnt Richard Johansen http://arj.nvg.org/
I am probably the world's southernmost bed-wetter. Provided that nobody at the
South Pole base does that sort of thing, that is. --Erling Kagge: Alene til Sydpolen



posts: 350

On Mon, Nov 10, 2008 at 5:34 PM, komfo,amonan <komfoamonan@gmail.com> wrote:

> On Mon, Nov 10, 2008 at 4:28 PM, Pierre Abbat <phma@phma.optus.nu> wrote:

>>
>> On Monday 10 November 2008 13:50:47 komfo,amonan wrote:
>> > Smallish point: Is there any reason not to merge < gismu >, < lujvo >,
>> > and
>> > < fu'ivla > into < brivla >? The distinctions between those three seem
>> > not
>> > relevant to the stated purpose. { le do .arxokuna cu barkla di'u } works
>> > just as well, right?
>>
>> arxokuna ki'a? .i lumge'u java'i prokiono xu?
>
> go'i .i mi ca lo nu djica lo nu fanva cu pu facki fi no valsi be fi la
> lojban .i zvati pe'a la jbovlaste ji lo drata

doi komfo,amonan .iki'u lo nu do finti zo arxokuna fi'o velji'i tu'a
la'o net. http://laxmahispajispaji.blogspot.com/2007_02_01_archive.html
net. kei do djuno lo du'u makau fo makau cu te fu'ivla .i ma joi ma du
la'edi'u

--gejyspa



posts: 92

On Wed, Nov 12, 2008 at 9:39 AM, Michael Turniansky
<mturniansky@gmail.com> wrote:

> On Mon, Nov 10, 2008 at 5:34 PM, komfo,amonan <komfoamonan@gmail.com> wrote:
> > On Mon, Nov 10, 2008 at 4:28 PM, Pierre Abbat <phma@phma.optus.nu> wrote:
> >>
> >> On Monday 10 November 2008 13:50:47 komfo,amonan wrote:
> >> > Smallish point: Is there any reason not to merge < gismu >, < lujvo >,
> >> > and < fu'ivla > into < brivla >? The distinctions between those three
> >> > seem not relevant to the stated purpose.
> >> > { le do .arxokuna cu barkla di'u } works
> >> > just as well, right?
> >>
> >> arxokuna ki'a? .i lumge'u java'i prokiono xu?
> >
> > go'i .i mi ca lo nu djica lo nu fanva cu pu facki fi no valsi be fi la
> > lojban .i zvati pe'a la jbovlaste ji lo drata
>
> doi komfo,amonan .iki'u lo nu do finti zo arxokuna fi'o velji'i tu'a
> la'o net. http://laxmahispajispaji.blogspot.com/2007_02_01_archive.html
> net. kei do djuno lo du'u makau fo makau cu te fu'ivla .i ma joi ma du
> la'edi'u


zo arxokuna fu'ivla fo zoi zoi aroughcun/arethkone/arehkan zoi noi ke'a
valsi fi la'o zoi Powhatan zoi zi'e noi ke'a krasi zoi zoi raccoon zoi .i ra
morsi pe'a bangu .i je la'a mi fu'irgau ei fi lo valsi be fi lo jmive pe'a
bangu be mu'u la'o zoi Mikasuki zoi

.i mu'o mi'e komfo,amonan

posts: 953

On Tue, Nov 11, 2008 at 08:57:09AM +0100, Arnt Richard Johansen wrote:
> On Mon, Nov 10, 2008 at 01:37:22PM -0500, Matt Arnold wrote:
> > The reason I bring this up now is that I would like to find out which
> > templates are the most common in the searchable corpus of Lojban
> > utterances, such as IRC logs. This would suggest very useful templates
> > for the home-game I'm building with dice, paper, and ceramics. I am
> > considering recording audio learning courses in which I would group
> > Lojban utterances (sensible ones) by template, to provide a variety of
> > examples of simple valid sentence structures, and relate selma'o
> > through substitution. Perhaps someone could intuit the most common and
> > useful templates, if not search for them with text-file processing.
>
> With the help of #lojban, I've cobbled together a small collection of scripts that runs through the IRC logs and outputs the sequence of selma'o/word classes that are used.

Done. The complete list is here:
http://www.lojban.org/tiki/tiki-download_wiki_attachment.php?attId=647

(As for why 2033 blank lines have been counted, your guess is as good as mine.)

--
Arnt Richard Johansen http://arj.nvg.org/
For a better world and drier, happier children.



On Sun, Nov 16, 2008 at 8:27 AM, Arnt Richard Johansen <arj@nvg.org> wrote:
>
> Done. The complete list is here:
> http://www.lojban.org/tiki/tiki-download_wiki_attachment.php?attId=647
>
> (As for why 2033 blank lines have been counted, your guess is as good as mine.)

An empty string is a valid text, so empty lines should be counted as
grammatical Lojban.

mu'o mi'e xorxes

