Lojban In General

Annotated PEG grammar

Use this thread to discuss the Annotated PEG grammar page.

http://pdf23ds.net/lojban/Annotated%20Grammar.html

Here's a very unfinished version of an annotated grammar. Anybody want
to see it finished?

Chris Capel
--
"What is it like to be a bat? What is it like to bat a bee? What is it
like to be a bee being batted? What is it like to be a batted bee?"
-- The Mind's I (Hofstadter, Dennett)


To unsubscribe from this list, send mail to lojban-list-request@lojban.org
with the subject unsubscribe, or go to http://www.lojban.org/lsg2/, or if
you're really stuck, send mail to secretary@lojban.org for help.

On Sat, Nov 15, 2008 at 5:45 PM, Chris Capel <pdf23ds@gmail.com> wrote:

> http://pdf23ds.net/lojban/Annotated%20Grammar.html
>
> Here's a very unfinished version of an annotated grammar. Anybody want
> to see it finished?

I wouldn't mind seeing the annotations finished, I think that's a
useful thing to have, but what I would really want is seeing the rules
cleaned up, including some minor adjustments to the grammar. This
grammar is not official yet so we do have some room to tinker.

For example, why is joik-jek up there in the text rule instead of
being a fragment? Presumably this dangling joik-jek is there to allow
answers to je'i questions, but the proper place for such things is the
fragments rule. For example, to a "ko'a ji ko'e ji ko'i" question one
can answer ".e .i .e", and these are two fragments, separated by ".i".
To a "broda je'i brode je'i brodi" question it is also possible to
answer "je .i je", but the parse is completely different and not very
sensical. Moving joik-jek to fragment requires a couple of adjustments
in other rules, but I don't see any reason not to do it.

BTW, the comment about free* and UI is not exactly right. In fact UI is
not an instance of free, and can occur in a few more places than
free*. For example, UI can occur between CMENE and free can't. That's
another thing that should be changed, free should be able to appear
anywhere that it doesn't cause a problem.

mu'o mi'e xorxes



On Sat, Nov 15, 2008 at 16:05, Jorge Llambías <jjllambias@gmail.com> wrote:
> I wouldn't mind seeing the annotations finished, I think that's a
> useful thing to have, but what I would really want is seeing the rules
> cleaned up, including some minor adjustments to the grammar. This
> grammar is not official yet so we do have some room to tinker.

Well, I've already made a few, as you know. But I think ideally we'd
have a few test cases for each change that shows what the change
changes.

> For example, why is joik-jek up there in the text rule instead of
> being a fragment?

Because otherwise there wouldn't be anything to do up there?

> BTW, the comment about free* and UI is not exactly right. In fact UI is
> not an instance of free, and can occur in a few more places than
> free*.

And a few less, no? Yes, you're absolutely right, and I do find it kind of odd.

Chris Capel

On Sat, Nov 15, 2008 at 7:29 PM, Chris Capel <pdf23ds@gmail.com> wrote:

> On Sat, Nov 15, 2008 at 16:05, Jorge Llambías <jjllambias@gmail.com> wrote:
>> BTW, the comment about free* and UI is not exactly right. In fact UI is
>> not an instance of free, and can occur in a few more places than
>> free*.
>
> And a few less, no?

I think UI can appear almost anywhere at all (the only exceptions
being in the middle of magic word constructs). I don't think there's a
place where free can appear and UI can't.

mu'o mi'e xorxes



On Sat, Nov 15, 2008 at 16:53, Jorge Llambías <jjllambias@gmail.com> wrote:

> On Sat, Nov 15, 2008 at 7:29 PM, Chris Capel <pdf23ds@gmail.com> wrote:

>> On Sat, Nov 15, 2008 at 16:05, Jorge Llambías <jjllambias@gmail.com> wrote:
>>> BTW, the comment about free* and UI is not exactly right. In fact UI is
>>> not an instance of free, and can occur in a few more places than
>>> free*.
>>
>> And a few less, no?
>
> I think UI can appear almost anywhere at all (the only exceptions
> being in the middle of magic word constructs). I don't think there's a
> place where free can appear and UI can't.

If that's really true, then there are a lot of extra free* floating
around the grammar.

Chris Capel

On Sat, Nov 15, 2008 at 8:11 PM, Chris Capel <pdf23ds@gmail.com> wrote:

> On Sat, Nov 15, 2008 at 16:53, Jorge Llambías <jjllambias@gmail.com> wrote:
>>
>> I think UI can appear almost anywhere at all (the only exceptions
>> being in the middle of magic word constructs). I don't think there's a
>> place where free can appear and UI can't.
>
> If that's really true, then there are a lot of extra free* floating
> around the grammar.

free* is all over the grammar instead of being part of the post-clause
rule like UI. That makes free* a little more restricted. Instead of
being allowed after (practically) any selma'o at all, like UI, it is
allowed in a lot of places, but not just anywhere at all.

mu'o mi'e xorxes



On 11/15/08, Jorge Llambías <jjllambias@gmail.com> wrote:

> On Sat, Nov 15, 2008 at 5:45 PM, Chris Capel <pdf23ds@gmail.com> wrote:

> > http://pdf23ds.net/lojban/Annotated%20Grammar.html

> > Here's a very unfinished version of an annotated grammar. Anybody want
> > to see it finished?
> I wouldn't mind seeing the annotations finished

<aol>me too</aol>

> I think that's a
> useful thing to have, but what I would really want is seeing the rules
> cleaned up, including some minor adjustments to the grammar. This
> grammar is not official yet so we do have some room to tinker.

Yes I think that maybe making sure that the grammar can handle things
like the magic words in a way most can agree on would be great.
http://www.lojban.org/tiki/tiki-index.php?page=Magic+Words&bl
I'm not sure what other proposed changes might need a tiny bit of tweaking.

Also there might be minor things that could be changed that don't
really change things much.
{paragraph <- (statement / fragment) (I !jek !joik !joik-jek free*
(statement / fragment)?)*}
could be changed to
{paragraph <- (statement / fragment) (I !joik-jek free* (statement /
fragment)?)*}
or {paragraph <- (statement / fragment) (I !jek !joik free*
(statement / fragment)?)*} .

given {joik-jek <- joik free* / jek free*} it's slightly redundant.
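
A minimal sketch of that redundancy, assuming toy one-word stand-ins for jek ("je"), joik ("joi") and free ("pa mai"). The combinator helpers are hypothetical and are not part of camxes. Because free* can match nothing, joik-jek succeeds at a position exactly when joik or jek does, so the three spellings of the restriction should block the same inputs:

```python
# Toy PEG combinators over word tokens: a parser maps (tokens, pos) to a new
# position, or None on failure. All stand-ins below are illustrative assumptions.

def lit(word):
    def parse(tokens, pos):
        return pos + 1 if pos < len(tokens) and tokens[pos] == word else None
    return parse

def seq(*parsers):
    def parse(tokens, pos):
        for p in parsers:
            pos = p(tokens, pos)
            if pos is None:
                return None
        return pos
    return parse

def choice(*parsers):
    # PEG ordered choice: the first alternative that succeeds wins.
    def parse(tokens, pos):
        for p in parsers:
            new = p(tokens, pos)
            if new is not None:
                return new
        return None
    return parse

def star(parser):
    # PEG "e*": greedy repetition, stops when e fails or makes no progress.
    def parse(tokens, pos):
        while True:
            new = parser(tokens, pos)
            if new is None or new == pos:
                return pos
            pos = new
    return parse

def neg(parser):
    # PEG "!e": succeeds, consuming nothing, only if e fails at this position.
    def parse(tokens, pos):
        return pos if parser(tokens, pos) is None else None
    return parse

jek, joik = lit("je"), lit("joi")
free = seq(lit("pa"), lit("mai"))
# joik-jek <- joik free* / jek free*
joik_jek = choice(seq(joik, star(free)), seq(jek, star(free)))

variants = {
    "!jek !joik !joik-jek": seq(neg(jek), neg(joik), neg(joik_jek)),
    "!joik-jek": neg(joik_jek),
    "!jek !joik": seq(neg(jek), neg(joik)),
}

samples = ["je broda", "joi broda", "joi pa mai broda", "broda", "pa mai je broda", ""]
results = {
    name: [p(s.split(), 0) is not None for s in samples]
    for name, p in variants.items()
}
for name, passed in results.items():
    print(name, passed)
```

This only checks a handful of sample inputs against toy terminals; it says nothing about whether the restriction belongs in the rule at all.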

>
> For example, why is joik-jek up there in the text rule instead of
> being a fragment?

I can't really speak to Jorge Llambías's suggestions about fragments.
It seems sensible to me, but I lack the experience and knowledge to
really evaluate his suggestion properly.

>
> BTW, the comment about free* and UI is not exactly right. In fact UI is
> not an instance of free, and can occur in a few more places than
> free*. For example, UI can occur between CMENE and free can't. That's
> another thing that should be changed, free should be able to appear
> anywhere that it doesn't cause a problem.

"text-part-2" , "sumti-6" , and "free" all have CMENE+ rules and there
is an indicators rule that might be appropriate or not.
indicators <- FUhE? indicator+
indicator <- ((UI / CAI) NAI? / DAhO / FUhO) !BU
If you created a rule like:
names <- CMENE+ indicators? (CMENE+ indicators?)*
and then used "names" instead of "CMENE+" inside those three rules
would that be close to what you suggest?

names <- (CMENE+ indicators / )* CMENE+
hmm you did say between, not sure if you can do a zero-width match like that.

text-part-2 <- (CMENE+ / indicators?) free*
sumti-6 <- ... / LA free* relatives? CMENE+ free* / ...
free <- ... / vocative relatives? CMENE+ free* relatives? DOhU? / ...

text-part-2 <- (names / indicators)? free*
sumti-6 <- ... / LA free* relatives? names free* / ...
free <- ... / vocative relatives? names free* relatives? DOhU? / ...

Hmm now that I look at it free also doesn't seem to list UI either.

post-clause <- spaces? si-clause? !ZEI-clause !BU-clause indicators*
that probably soaks up much of the UI usage.

CMENE-clause <- CMENE-pre CMENE-post
CMENE-pre <- pre-clause CMENE spaces?
CMENE-post <- post-clause
SPACE CMENE-no-SA-handling <- pre-clause CMENE post-clause

And it looks like the CMENE stuff already might soak up indicators. Not sure.



On Sat, Nov 15, 2008 at 8:21 PM, Stephen Pollei
<stephen.pollei@gmail.com> wrote:
>
> Yes I think that maybe making sure that the grammar can handle things
> like the magic words in a way most can agree on would be great.

Those are already handled properly by Robin's grammar:
<http://www.digitalkingdom.org/~rlpowell/hobbies/lojban/grammar/lojban.peg.txt>

(Except for SA, which is still a big mess.)

> Also there might be minor things that could be changed that don't
> really change things much.
> {paragraph <- (statement / fragment) (I !jek !joik !joik-jek free*
> (statement / fragment)?)*}
> could be changed to
> {paragraph <- (statement / fragment) (I !joik-jek free* (statement /
> fragment)?)*}
> or {paragraph <- (statement / fragment) (I !jek !joik free*
> (statement / fragment)?)*} .
>
> given {joik-jek <- joik free* / jek free*} it's slightly redundant.

Yes. In fact that !joik-jek in front of free* is extremely weird. If
it's needed at all, it has to go after free*, otherwise it looks like
the presence of a free (which itself can never begin with joik-jek)
would license a "statement" that begins with joik-jek (I suppose
something that begins with joi gi... gi ...). And that would be a
really weird function for "free".

mu'o mi'e xorxes


On 11/15/08, Jorge Llambías <jjllambias@gmail.com> wrote:
> On Sat, Nov 15, 2008 at 8:21 PM, Stephen Pollei
> <stephen.pollei@gmail.com> wrote:

> > magic words in a way most can agree on would be great.

> Those are already handled properly by Robin's grammar:
> <http://www.digitalkingdom.org/~rlpowell/hobbies/lojban/grammar/lojban.peg.txt>

> (Except for SA, which is still a big mess.)
yes and I think the sa was just for some cases like with brivla or
cmevla, and some nesting cases. Otherwise yes I think his peg
supported the magic words pretty well. I also thought he had some
nonstandard extensions as well though?

http://digitalkingdom.org/~rlpowell/hobbies/lojban/grammar/
[[ # There is possibly some weird interaction lurking with handling of
SI inside of ZOI+SA clauses. If anyone finds a sentence that behaves
weirdly in that sort of situation, please let me know.

  1. As a subset of the morphological problems, cmavo starting with a
vowel don't work without a space after them in lo'u...le'u quotes, and
probably other places.

  1. Currently does not handle "nested" SA, nor SA+BRIVLA or SA+CMENE.]]

[[Multiple sa in a row delete back to further previous instances of
that selma'o. For example, "le le broda cu brode sa sa le brodi" is
the same as "le brodi". ]]

> > {paragraph <- (statement / fragment) (I !jek !joik !joik-jek free*
> > (statement / fragment)?)*}

> Yes. In fact that !joik-jek in front of free* is extremely weird. If
> it's needed at all, it has to go after free*,

paragraph <- (statement / fragment) (I free* !jek !joik (statement /
fragment)?)*

I agree. And I think the rule I just stated is an optimization and still
accomplishes what I think the intent was. I don't know enough to
really say though.

http://pdf23ds.net/lojban/Annotated%20Grammar.html and
http://digitalkingdom.org/~rlpowell/hobbies/lojban/grammar/lojban.peg.txt
seem to differ more than I suspected at first.

I was going to suggest that just about every place where "NIhO+" was
used to be expanded into "NIhO+ I?" . I see that Chris's version chops
off a "-clause" from the end of lots of the rules concerning selma'o.
That's why I was getting confused if CMENE+ was doing the right thing
or not, the real rules use CMENE-clause+ which does.

So I guess I suggest that places that do a "NIhO-clause+" be changed
into a "NIhO-clause+ I-clause?" or a "(NIhO-clause I-clause?)+"

Mostly it's so it's a tiny bit more consistent with the implied " ni'o i
" that the rules in the magic words section seem to imply exists for
"sa i" and "su" to work.

I could be totally crazy on my "ni'o i" suggestion, take with shaker of salt.



On Sat, Nov 15, 2008 at 18:23, Stephen Pollei <stephen.pollei@gmail.com> wrote:
> http://pdf23ds.net/lojban/Annotated%20Grammar.html and
> http://digitalkingdom.org/~rlpowell/hobbies/lojban/grammar/lojban.peg.txt
> seem to differ more than I suspected at first.

I throw out all the sa and magic word stuff, and delete all of the
morphology interface, which handles UI and some other stuff. (That
involves cutting off the "-clause"s, which are only necessary because
of the morphology stuff.) And besides that, there's a handful of other
changes that I discussed with Jorge earlier on this list. I can send
you a diff of those if you're interested.

Chris Capel

On 11/15/08, Chris Capel <pdf23ds@gmail.com> wrote:

> On Sat, Nov 15, 2008 at 18:23, Stephen Pollei <stephen.pollei@gmail.com> wrote:
> > http://pdf23ds.net/lojban/Annotated%20Grammar.html and
> > http://digitalkingdom.org/~rlpowell/hobbies/lojban/grammar/lojban.peg.txt
> > seem to differ more than I suspected at first.
>
>
> I throw out all the sa and magic word stuff, and delete all of the
> morphology interface, which handles UI and some other stuff. (That
> involves cutting off the "-clause"s, which are only necessary because
> of the morphology stuff.)

Well the clause stuff is what handles the ui and ba'e stuff as well
which I don't count as a morphology interface issue at all. To me the
morph interface stuff is the different stuff prefixed by NORATS or has
&{ .. } . I also think the magic word stuff is a pretty important part
of the whole system.

I do think that the whole foo-clause stuff could use some real clean up.
A lot of it is schematic and those should likely be collected into one
place, and the things which get special handling should all be grouped
together; right now it's all mixed together by just using an alphabetic
order.

> And besides that, there's a handful of other
> changes that I discussed with Jorge earlier on this list. I can send
> you a diff of those if you're interested.

You can do that if you want, I'd also suggest that you document your
changes on your web page.

I also thank you for your efforts, I think it will be very helpful.



On Sat, Nov 15, 2008 at 21:08, Stephen Pollei <stephen.pollei@gmail.com> wrote:

> On 11/15/08, Chris Capel <pdf23ds@gmail.com> wrote:

>> I throw out all the sa and magic word stuff, and delete all of the
>> morphology interface, which handles UI and some other stuff. (That
>> involves cutting off the "-clause"s, which are only necessary because
>> of the morphology stuff.)
>
> Well the clause stuff is what handles the ui and ba'e stuff as well
> which I don't count as a morphology interface issue at all. To me the
> morph interface stuff is the different stuff prefixed by NORATS or has
> &{ .. } . I also think the magic word stuff is a pretty important part
> of the whole system.

No doubt they're important, but the goal of my document is to provide
an overview of how lojban works from the perspective of its grammar. I
think the -clause details and magic words would distract too much from
that purpose. I don't plan on filling out mesko either. (But someone
can contribute descriptions if they like.)

> I do think that the whole foo-clause stuff could use some real clean up.
> A lot of it is schematic and those should likely be collected into one
> place, and the things which get special handling should all be grouped
> together; right now it's all mixed together by just using an alphabetic
> order.

I don't think it can be done in pure PEG, alas. Perhaps a source file
that generates a peg could be created. :-) The source file could also
output a Rats! version that would be more compact and efficient than
the current version, which is quite huge dll-wise.

> I also thank you for your efforts, I think it will be very helpful.

Thanks. Most helpful as an adjunct to Lojgloss, which shows the
production names as part of its interface. (Work on Lojgloss is
progressing, by the way. I hope to have a new web version, with
parsing support, available within a few weeks.)

Chris Capel

On 11/15/08, Chris Capel <pdf23ds@gmail.com> wrote:

> On Sat, Nov 15, 2008 at 21:08, Stephen Pollei <stephen.pollei@gmail.com> wrote:

> > On 11/15/08, Chris Capel <pdf23ds@gmail.com> wrote:


> No doubt they're important, but the goal of my document is to provide
> an overview of how lojban works from the perspective of its grammar. I
> think the -clause details and magic words would distract too much from
> that purpose. I don't plan on filling out mesko either. (But someone
> can contribute descriptions if they like.)

That's fair enough

> > I do think that the whole foo-clause stuff could use some real clean up.
> > A lot of it is schematic and those should likely be collected into one
> > place, and the things which get special handling should all be grouped

> > together; right now it's all mixed together by just using an alphabetic
> > order.


> I don't think it can be done in pure PEG, alas. Perhaps a source file
> that generates a peg could be created. :-)

Well I mostly meant that a lot of it is like:
BAI-clause <- BAI-pre BAI-post
BAI-pre <- pre-clause BAI spaces?
BAI-post <- post-clause

${cmavo}-clause <- ${cmavo}-pre post-clause
${cmavo}-pre <- BAhE-clause? ${cmavo} spaces?

A little shorter as nobody uses ${cmavo}-post afaik.
All those can be grouped in alphabetical order one after another.
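
A small sketch of how such schematic rules could be generated from one template, in the spirit of the "source file that generates a peg" idea quoted above. The function name and the short selma'o list are hypothetical; only the two template lines come from this message:

```python
# Hypothetical generator: expands one template per selma'o into PEG rule text.
# The two rule shapes are the shortened ones proposed in this message.
TEMPLATE = (
    "{name}-clause <- {name}-pre post-clause\n"
    "{name}-pre <- BAhE-clause? {name} spaces?\n"
)

def schematic_rules(selmaho):
    """Return PEG text for every given selma'o name, in alphabetical order."""
    return "\n".join(TEMPLATE.format(name=name) for name in sorted(selmaho))

# Illustrative subset only; the specially handled selma'o (SI, SA, SU, ZO, ...)
# would be written by hand and kept in a separate group.
rules = schematic_rules(["BAI", "A", "CU"])
print(rules)
```

Keeping the hand-written special cases out of the generated list is the design choice here: the template covers only the clauses that follow the common pattern.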

Then there are the special ones which correspond with the magic words,
all those clause ones could be grouped together in alphabetical (or
other) order: SI SA SU ZO ZOI LOhU LEhU ZEI BU LAhO and FAhO . I think
the Y-clause is called spaces;-) NIhO, LU, TUhE, and TO might also be
different because of how su interacts with them.
BRIVLA and CMENE might also get specially written clause stuff.

> The source file could also
> output a Rats! version that would be more compact and efficient than
> the current version, which is quite huge dll-wise.

Well I don't have any ideas on how to optimize things in that way, the
peg is fairly thinish.
I have an idea for the sa thing, but that would most likely make
things bigger not smaller.


To unsubscribe from this list, send mail to lojban-list-request@lojban.org
with the subject unsubscribe, or go to http://www.lojban.org/lsg2/, or if
you're really stuck, send mail to secretary@lojban.org for help.

> http://pdf23ds.net/lojban/Annotated%20Grammar.html

>>paragraph <- (statement / fragment) (I !jek !joik !joik-jek free* (statement / fragment)?)*
>>
>>Jeks and joiks are disallowed after the I clause because to match them here would give them lower precedence than they actually have.

There are three problems with that:

(1) As Stephen points out, "!jek !joik" duplicates "!joik-jek".

(2) Putting the restriction in front of "free*" makes no sense,
because that would mean that for example ".i je broda" is blocked, but
".i pamai je broda" is allowed. That makes no sense.

(3) In fact "!jek !joik !joik-jek" is just wrong here, even if placed
after "free*". It's true that we don't want this part of the rule to
grab "i je broda" because of precedence, but there is no danger of
that happening because "statement" will already have grabbed them. On
the other hand, this rule is blocking things like "broda .i joi gi da
gi de brode" for no apparent reason. (While "broda .i pa mai joi gi
da gi de brode" is accepted!)

Conclusion: we must get rid of that "!jek !joik !joik-jek", or else
explain better what it is doing there.
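
Point (2) can be sketched with toy combinators. Everything here is an assumption made for illustration: "pa mai" stands in for free, and the stand-in for "(statement / fragment)" is allowed to begin with "je" purely so that the position of the lookahead is what decides the outcome. None of this is camxes code:

```python
# A parser is a function (tokens, pos) -> new position, or None on failure.

def lit(word):
    def parse(tokens, pos):
        return pos + 1 if pos < len(tokens) and tokens[pos] == word else None
    return parse

def seq(*parsers):
    def parse(tokens, pos):
        for p in parsers:
            pos = p(tokens, pos)
            if pos is None:
                return None
        return pos
    return parse

def opt(parser):
    # PEG "e?": use e if it matches, otherwise match nothing.
    def parse(tokens, pos):
        new = parser(tokens, pos)
        return pos if new is None else new
    return parse

def star(parser):
    # PEG "e*": greedy repetition.
    def parse(tokens, pos):
        while True:
            new = parser(tokens, pos)
            if new is None or new == pos:
                return pos
            pos = new
    return parse

def neg(parser):
    # PEG "!e": succeeds without consuming input only if e fails here.
    def parse(tokens, pos):
        return pos if parser(tokens, pos) is None else None
    return parse

i_word, jek = lit("i"), lit("je")
free = seq(lit("pa"), lit("mai"))
tail = seq(opt(jek), lit("broda"))  # assumed stand-in, see the note above

before_free = seq(i_word, neg(jek), star(free), tail)  # I !jek free* tail
after_free = seq(i_word, star(free), neg(jek), tail)   # I free* !jek tail

def accepts(parser, text):
    tokens = text.split()
    return parser(tokens, 0) == len(tokens)

for text in ["i je broda", "i pa mai je broda", "i pa mai broda"]:
    print(text, accepts(before_free, text), accepts(after_free, text))
```

With the lookahead in front of free*, only the word right after "i" is inspected, so an intervening free hides the "je". The sketch illustrates the placement problem only; it does not address point (3), that the restriction may not be needed at all.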

mu'o mi'e xorxes



On Sun, Nov 16, 2008 at 12:48, Jorge Llambías <jjllambias@gmail.com> wrote:
> (1) As Stephen points out, "!jek !joik" duplicates "!joik-jek".

The only cruft-reducing change so far I've made is to get rid of stag,
which was fairly egregious. If you want to keep a version somewhere
that makes changes you're fairly sure are sound, I'll be glad to
incorporate those into Lojgloss for further verification. And the
guide--I suppose. It's not like I'm currently annotating the
"official" version.

> (2) Putting the restriction in front of "free*" makes no sense,
> because that would mean that for example ".i je broda" is blocked,

But considering your (3), it would not be being matched in the first
place. (What's the Lojban tense for "be being" or "have been being"?)

> Conclusion: we must get rid of that "!jek !joik !joik-jek", or else
> explain better what it is doing there.

Sounds good. As for an explanation, I know that Robin tried to modify
the old grammar only minimally in converting it to PEG. Were those in
the old one, and not necessary any longer?

Chris Capel

On Sat, Nov 15, 2008 at 02:45:23PM -0600, Chris Capel wrote:
> http://pdf23ds.net/lojban/Annotated%20Grammar.html
>
> Here's a very unfinished version of an annotated grammar. Anybody want
> to see it finished?

This is a very interesting project.

I started doing something similar at http://www.lojban.org/tiki/tiki-index.php?page=Annotated+machine+grammar , but I never got very far, and haven't touched it in years.

--
Arnt Richard Johansen http://arj.nvg.org/
Jeg er nok verdens sydligste sengevæter. Forutsatt at ingen på basen på
Sydpolen driver med slikt, da. --Erling Kagge: Alene til Sydpolen



On Sun, Nov 16, 2008 at 4:07 PM, Chris Capel <pdf23ds@gmail.com> wrote:

> On Sun, Nov 16, 2008 at 12:48, Jorge Llambías <jjllambias@gmail.com> wrote:
>> (1) As Stephen points out, "!jek !joik" duplicates "!joik-jek".
>
> The only cruft-reducing change so far I've made is to get rid of stag,
> which was fairly egregious. If you want to keep a version somewhere
> that makes changes you're fairly sure are sound, I'll be glad to
> incorporate those into Lojgloss for further verification. And the
> guide--I suppose. It's not like I'm currently annotating the
> "official" version.

Hmm... I don't really want to maintain a version at this point, mainly
because I have no easy way to test it. But if you are annotating this
version, at least you can say that there seems to be no point to this
or that, if we can't see what the point is.

>> Conclusion: we must get rid of that "!jek !joik !joik-jek", or else
>> explain better what it is doing there.
>
> Sounds good. As for an explanation, I know that Robin tried to modify
> the old grammar only minimally in converting it to PEG. Were those in
> the old one, and not necessary any longer?

No, the EBNF version does not use "!" at all. (BTW, you should probably
explain "!" at the top, that was the most difficult thing for me to
understand when I started looking at PEGs.)
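
For readers new to PEGs, a minimal sketch of what "!" does, using hypothetical helper functions and a toy rule that is not taken from the Lojban grammar. "!e" tries e at the current position, succeeds only if e fails, and never consumes input:

```python
# A parser here is a function (tokens, pos) -> new position, or None on failure.

def lit(word):
    def parse(tokens, pos):
        return pos + 1 if pos < len(tokens) and tokens[pos] == word else None
    return parse

def seq(*parsers):
    def parse(tokens, pos):
        for p in parsers:
            pos = p(tokens, pos)
            if pos is None:
                return None
        return pos
    return parse

def neg(parser):
    # PEG "!e": try e at the current position; succeed only if e fails,
    # and never advance, whatever e would have consumed.
    def parse(tokens, pos):
        return pos if parser(tokens, pos) is None else None
    return parse

# Hypothetical toy rule, for illustration only:
#   x <- "i" !"je" "broda"
toy = seq(lit("i"), neg(lit("je")), lit("broda"))

tokens = "i broda".split()
print(neg(lit("je"))(tokens, 1))     # position is unchanged when "!" succeeds
print(toy(tokens, 0))                # the whole input is consumed
print(toy("i je broda".split(), 0))  # the lookahead fails, so the rule fails
```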

mu'o mi'e xorxes



> http://pdf23ds.net/lojban/Annotated%20Grammar.html

>>sentence <- (terms CU? free*)? bridi-tail / bridi-tail

The second part of this rule, as written, is unreachable. If we want
to write it as a two part rule, it should be:

sentence <- terms CU? free* bridi-tail / bridi-tail
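
A rough sketch of why the second alternative is unreachable as originally written, assuming one-word stand-ins ("mi" for terms, "cu" for CU, "broda" for bridi-tail) and leaving free* out. The helpers are hypothetical. With PEG ordered choice, the first alternative already matches a bare bridi-tail because its whole prefix is optional:

```python
# A parser is a function (tokens, pos) -> new position, or None on failure.

def lit(word):
    def parse(tokens, pos):
        return pos + 1 if pos < len(tokens) and tokens[pos] == word else None
    return parse

def seq(*parsers):
    def parse(tokens, pos):
        for p in parsers:
            pos = p(tokens, pos)
            if pos is None:
                return None
        return pos
    return parse

def opt(parser):
    # PEG "e?": use e if it matches, otherwise match nothing.
    def parse(tokens, pos):
        new = parser(tokens, pos)
        return pos if new is None else new
    return parse

def first_matching_alternative(alternatives, tokens):
    """Ordered choice: index of the first alternative that matches at 0."""
    for index, alternative in enumerate(alternatives):
        if alternative(tokens, 0) is not None:
            return index
    return None

terms, cu, bridi_tail = lit("mi"), lit("cu"), lit("broda")

# sentence <- (terms CU? free*)? bridi-tail / bridi-tail   (free* omitted)
original = [seq(opt(seq(terms, opt(cu))), bridi_tail), bridi_tail]
# sentence <- terms CU? free* bridi-tail / bridi-tail       (free* omitted)
rewritten = [seq(terms, opt(cu), bridi_tail), bridi_tail]

inputs = ["broda", "mi broda", "mi cu broda"]
original_choice = [first_matching_alternative(original, s.split()) for s in inputs]
rewritten_choice = [first_matching_alternative(rewritten, s.split()) for s in inputs]
print(original_choice)
print(rewritten_choice)
```

Both versions accept the same three inputs; the difference is only which alternative does the work for a bare "broda".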


>> subsentence <- sentence / prenex subsentence
>>
>> The subsentence is simply the sentence along with its prenex.

You may want to point out that it's called "subsentence" because it's
what appears in subordinate clauses (NOI, NU, GE...GI...) I find the
rule name a bit confusing, that a subsentence is actually more complex
than a sentence and includes it.

mu'o mi'e xorxes



On Sun, Nov 16, 2008 at 13:19, Jorge Llambías <jjllambias@gmail.com> wrote:

> On Sun, Nov 16, 2008 at 4:07 PM, Chris Capel <pdf23ds@gmail.com> wrote:

> Hmm... I don't really want to maintain a version at this point, mainly
> because I have no easy way to test it. But if you are annotating this
> version, at least you can say that there seems to be no point to this
> or that, if we can't see what the point is.

Oh, I thought you were able to recompile camxes to test changes in the
morphology. Guess not. (I know I can't.)

> (BTW, you should probably
> explain "!" at the top, that was the most difficult thing for me to
> understand when I started looking at PEGs.)

Huh. How is that not there already? I just assumed it was in that stub
explanation at the top and never actually checked the list.

Chris Capel

>> bridi-tail-1 <- bridi-tail-1 gihek !(tag? BO) !(tag? KE) free* bridi-tail-2 tail-terms / bridi-tail-2
>>
>> This production parses undecorated giheks. To ensure that no BO or KE are used, the lookahead symbol is used.

Neither a free* nor a bridi-tail-2 can start with (tag? BO), so that
restriction is doing nothing.

And the !(tag? KE) seems wrong there. Why is "broda gi'e ke ge brode
gi brodi" disallowed, while "broda gi'e xi pa ke ge brode gi brodi"
allowed by this rule?

mu'o mi'e xorxes



On Sun, Nov 16, 2008 at 13:24, Jorge Llambías <jjllambias@gmail.com> wrote:
> sentence <- terms CU? free* bridi-tail / bridi-tail

I'm going to go ahead and make this change instead of commenting on
it. I'm not sure enough about the joik-jek change in "paragraph".

Chris Capel

On Sun, Nov 16, 2008 at 13:38, Jorge Llambías <jjllambias@gmail.com> wrote:
>>> bridi-tail-1 <- bridi-tail-1 gihek !(tag? BO) !(tag? KE) free* bridi-tail-2 tail-terms / bridi-tail-2
>>>
>>> This production parses undecorated giheks. To ensure that no BO or KE are used, the lookahead symbol is used.
>
> Neither a free* nor a bridi-tail-2 can start with (tag? BO), so that
> restriction is doing nothing.

I think it is, because bridi-tail-2 requires a BO in its gihek clause.
So bridi-tail-1 will fail because of this lookahead if there's a BO
there, and will get matched by bridi-tail-2. OTOH, it might work just
fine without the !BO, because nothing in "free" parses BO, right? Or
could it?

> And the !(tag? KE) seems wrong there. Why is "broda gi'e ke ge brode
> gi brodi" disallowed, while "broda gi'e xi pa ke ge brode gi brodi"
> allowed by this rule?

Is it disallowed? It seems to me your second example would still parse
passing through bridi-tail-1 and matching the second alternative of
gek-sentence. In which case the only question is whether the !KE is
superfluous.

Chris Capel

>> gek-sentence <- gek subsentence gik subsentence tail-terms / tag? KE free* gek-sentence KEhE? free* / NA free* gek-sentence

>> This production refers to the already-defined production subsentence above. The options handle KE and NA prefixes.

Perhaps point out why KE is used here at all. It's not needed to do
any grouping, since gek already handles that by itself. I think the
only reason is to separate "tag" from a possible gek of form "tag GI"
since otherwise "tag tag" could collapse into a single tag.


>> sumti-4 <- sumti-5 / gek sumti gik sumti-4
>>
>> This references itself instead of 'sumti' so that joik-eks get associated with the whole gek-gik phrase instead of just the sumti after the gik.

A very unfortunate and confusing choice. gek .. gik ... should allow
the same things for each connectand.

mu'o mi'e xorxes



On Sun, Nov 16, 2008 at 14:12, Jorge Llambías <jjllambias@gmail.com> wrote:
>>> gek-sentence <- gek subsentence gik subsentence tail-terms / tag? KE free* gek-sentence KEhE? free* / NA free* gek-sentence
>
>>> This production refers to the already-defined production subsentence above. The options handle KE and NA prefixes.
>
> Perhaps point out why KE is used here at all. It's not needed to do
> any grouping, since gek already handles that by itself. I think the
> only reason is to separate "tag" from a possible gek of form "tag GI"
> since otherwise "tag tag" could collapse into a single tag.

I believe the other example is when you have, e.g., four subsentences
all connected with geks and you want to group the middle two somehow.

>>> sumti-4 <- sumti-5 / gek sumti gik sumti-4
>>>
>>> This references itself instead of 'sumti' so that joik-eks get associated with the whole gek-gik phrase instead of just the sumti after the gik.
>
> A very unfortunate and confusing choice. gek .. gik ... should allow
> the same things for each connectand.

Perhaps. I've taken that bit out of the commentary, preferring a more
elaborate treatment of precedence and reference to other members of a
series at some other place in the document. Do you think that
particular comment is noteworthy?

Chris Capel

On Sun, Nov 16, 2008 at 5:08 PM, Chris Capel <pdf23ds@gmail.com> wrote:

> On Sun, Nov 16, 2008 at 13:38, Jorge Llambías <jjllambias@gmail.com> wrote:
>
>>>> bridi-tail-1 <- bridi-tail-1 gihek !(tag? BO) !(tag? KE) free* bridi-tail-2 tail-terms / bridi-tail-2
>>>>
>>>> This production parses undecorated giheks. To ensure that no BO or KE are used, the lookahead symbol is used.
>>
>> Neither a free* nor a bridi-tail-2 can start with (tag? BO), so that
>> restriction is doing nothing.
>
> I think it is, because bridi-tail-2 requires a BO in its gihek clause.

But it can never _start_ with "tag? BO", which is all !(tag? BO) cares about.

>> And the !(tag? KE) seems wrong there. Why is "broda gi'e ke ge brode
>> gi brodi" disallowed, while "broda gi'e xi pa ke ge brode gi brodi"
>> allowed by this rule?
>
> Is it disallowed?

As a "bridi-tail-1", it's disallowed. (It's allowed as a "bridi-tail".)

> It seems to me your second example would still parse
> passing through bridi-tail-1 and matching the second alternative of
> gek-sentence.

"broda gi'e ke ge brode gi brodi" will match bridi-tail (but not bridi-tail-1)

"broda gi'e xi pa ke ge brode gi brodi", OTOH, will match bridi-tail-1.

That's inconsistent.
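
A rough Python sketch of why the position of the lookahead matters. It models only the prefix "gihek !(tag? KE) free*" of the rule, not the rest of bridi-tail-1. The TAGS set, the "xi plus one word" model of free, and the function names are all simplifying assumptions, not the real grammar.

```python
TAGS = {"pu", "ca", "ba"}  # stand-in for "tag"; the real rule is far richer

def starts_with_tag_ke(tokens, pos):
    # Models (tag? KE): an optional tag followed by "ke".
    if pos < len(tokens) and tokens[pos] in TAGS:
        pos += 1
    return pos < len(tokens) and tokens[pos] == "ke"

def skip_frees(tokens, pos):
    # Models free*: here a free is just "xi" plus one following word.
    while pos + 1 < len(tokens) and tokens[pos] == "xi":
        pos += 2
    return pos

def gihek_prefix(tokens, lookahead_first):
    # lookahead_first=True  models "gihek !(tag? KE) free*" (the current rule).
    # lookahead_first=False models "gihek free* !(tag? KE)".
    # Returns the position after the prefix, or None if the prefix is rejected.
    if not tokens or tokens[0] != "gi'e":
        return None
    pos = 1
    if lookahead_first:
        if starts_with_tag_ke(tokens, pos):
            return None
        return skip_frees(tokens, pos)
    pos = skip_frees(tokens, pos)
    return None if starts_with_tag_ke(tokens, pos) else pos

plain = "gi'e ke ge brode gi brodi".split()
with_free = "gi'e xi pa ke ge brode gi brodi".split()

for order in (True, False):
    print(order, gihek_prefix(plain, order), gihek_prefix(with_free, order))
```

With the lookahead before free*, the plain form is rejected while the "xi pa" form slips through. With the lookahead after free*, both are treated the same way.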

> In which case the only question is whether the !KE is
> superfluous.

I suspect it's simply wrong, in order to get the right precedences.
But in any case, if it belongs there at all, it must go after free*,
not before.

mu'o mi'e xorxes



On Sun, Nov 16, 2008 at 5:21 PM, Chris Capel <pdf23ds@gmail.com> wrote:

> On Sun, Nov 16, 2008 at 14:12, Jorge Llambías <jjllambias@gmail.com> wrote:
>>>> gek-sentence <- gek subsentence gik subsentence tail-terms / tag? KE free* gek-sentence KEhE? free* / NA free* gek-sentence
>>
>> Perhaps point out why KE is used here at all. It's not needed to do
>> any grouping, since gek already handles that by itself. I think the
>> only reason is to separate "tag" from a possible gek of form "tag GI"
>> since otherwise "tag tag" could collapse into a single tag.
>
> I believe the other example is when you have, e.g., four subsentences
> all connected with geks and you want to group the middle two somehow.

KE-KEhE is never needed with geks for grouping purposes. The case you
give could be:

ge broda gi (ga (go brode gi brodi) gi brodo)

or

ge (ga broda gi (go brode gi brodi)) gi brodo

KE-KEhE wouldn't add anything either way.
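
A small Python sketch of the same point: because every gek has its own gik, a forethought connective is self-delimiting, and nesting alone fixes the grouping. The one-line grammar in the comment is a toy stand-in for gek-sentence (single words instead of subsentences), and the function names are invented for the example.

```python
GEKS = {"ge", "ga", "go"}

def parse(tokens, pos=0):
    # expr <- gek expr "gi" expr / word   (toy stand-in for gek-sentence)
    # Returns (tree, next_pos).
    if tokens[pos] in GEKS:
        gek = tokens[pos]
        left, pos = parse(tokens, pos + 1)
        if tokens[pos] != "gi":
            raise ValueError("expected gi")
        right, pos = parse(tokens, pos + 1)
        return (gek, left, right), pos
    return tokens[pos], pos + 1

def tree(text):
    result, _ = parse(text.split())
    return result

# The two groupings from above, with no KE-KEhE anywhere:
print(tree("ge broda gi ga go brode gi brodi gi brodo"))
# ('ge', 'broda', ('ga', ('go', 'brode', 'brodi'), 'brodo'))
print(tree("ge ga broda gi go brode gi brodi gi brodo"))
# ('ge', ('ga', 'broda', ('go', 'brode', 'brodi')), 'brodo')
```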


>>>> sumti-4 <- sumti-5 / gek sumti gik sumti-4
>>
> Do you think that
> particular comment is noteworthy?

I think it's worth keeping in mind that this rule connects different
levels of things so as to "fix" that if possible before making it
official.

mu'o mi'e xorxes



On Sun, Nov 16, 2008 at 14:24, Jorge Llambías <jjllambias@gmail.com> wrote:

> On Sun, Nov 16, 2008 at 5:08 PM, Chris Capel <pdf23ds@gmail.com> wrote:

>> On Sun, Nov 16, 2008 at 13:38, Jorge Llambías <jjllambias@gmail.com> wrote:
>>>> bridi-tail-1 <- bridi-tail-1 gihek !(tag? BO) !(tag? KE) free* bridi-tail-2 tail-terms / bridi-tail-2
>>>>
>>>> This production parses undecorated giheks. To ensure that no BO or KE are used, the lookahead symbol is used.
>>>
>>> Neither a free* nor a bridi-tail-2 can start with (tag? BO), so that
>>> restriction is doing nothing.
>>
>> I think it is, because bridi-tail-2 requires a BO in its gihek clause.
>
> But it can never _start_ with "tag? BO", which is all !(tag? BO) cares about.

The BO (by which I mean tag? BO) is going to be after the gihek in any
case, not at the beginning of the production. The beginning of the
stuff inside the latter part of the gihek, you mean?

The !BO keeps "broda gi'e bo brode" from matching the first option of
bridi-tail-1, instead matching the optional part of bridi-tail-2
(instead of two bridi-tail-2's for broda and brode each).

>>> And the !(tag? KE) seems wrong there. Why is "broda gi'e ke ge brode
>>> gi brodi" disallowed, while "broda gi'e xi pa ke ge brode gi brodi"
>>> allowed by this rule?
>>
>> Is it disallowed?
>
> As a "bridi-tail-1", it's disallowed. (It's allowed as a "bridi-tail".)

Which is sort of the defining difference between the two. I think the
real issue here is the handling of the frees in bridi-tail. Both of
your examples parse, but the latter as a bridi-tail-1, and there's no
reason for it to parse differently just because of a free.

>> It seems to me your second example would still parse
>> passing through bridi-tail-1 and matching the second alternative of
>> gek-sentence.
>
> "broda gi'e ke ge brode gi brodi" will match bridi-tail (but not bridi-tail-1)
>
> "broda gi'e xi pa ke ge brode gi brodi", OTOH, will match bridi-tail-1.
>
> That's inconsistent.

Oh, I see I have duplicated your effort. Scientific method! Yay!

>> In which case the only question is whether the !KE is
>> superfluous.
>
> I suspect it's simply wrong, in order to get the right precedences.
> But in any case, if it belongs there at all, it must go after free*,
> not before.

But what would removing the !KE do? Would it fix it?

Chris Capel

On Sun, Nov 16, 2008 at 14:40, Jorge Llambías <jjllambias@gmail.com> wrote:
>>>>> sumti-4 <- sumti-5 / gek sumti gik sumti-4
>>>
>> Do you think that
>> particular comment is noteworthy?
>
> I think it's worth keeping in mind that this rule connects different
> levels of things so as to "fix" that if possible before making it
> official.

It turns out that sumti-1 is the first member of a series in the
grammar to reference a previous member of its series. So sumti-4 is
very much the rule, not the exception. The only reason that comment
was there is because it was a comment I put in my copy of the grammar
file while debugging something about the grammar parse tree. Do you
think more rules should reference earlier members of the series when
possible? That would very much change the semantics of the language,
albeit often in gray areas.

Chris Capel

On Sun, Nov 16, 2008 at 14:52, Chris Capel <pdf23ds@gmail.com> wrote:

> It turns out that sumti-1 is the first member of a series in the
> grammar to reference a previous member of its series. So sumti-4 is
> very much the rule, not the exception. The only reason that comment
> was there is because it was a comment I put in my copy of the grammar
> file while debugging something about the grammar parse tree. Do you
> think more rules should reference earlier members of the series when
> possible? That would very much change the semantics of the language,
> albeit often in gray areas.

Now that I look at it, that sumti-1 rule looks fishy. Why is it
referencing sumti instead of sumti-1? The only thing in sumti is VUhO,
which looks just wrong.

Chris Capel

On Sun, Nov 16, 2008 at 5:47 PM, Chris Capel <pdf23ds@gmail.com> wrote:


>>>>> bridi-tail-1 <- bridi-tail-1 gihek !(tag? BO) !(tag? KE) free* bridi-tail-2 tail-terms / bridi-tail-2
>
> The BO (by which I mean tag? BO) is going to be after the gihek in any
> case, not at the beginning of the production. The beginning of the
> stuff inside the latter part of the gihek, you mean?

I mean that bridi-tail-1 will not match "broda gi'e bo brode" other
than as a bridi-tail-2. The !BO does nothing.
The first part already fails because "bo brode" is not a possible
bridi-tail-2, so the !BO is unnecessary.

> The !BO keeps "broda gi'e bo brode" from matching the first option of
> bridi-tail-1, instead matching the optional part of bridi-tail-2

But it wouldn't match the first option even without !BO, that's my point.
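
A toy Python check of that claim. It reduces bridi-tail-2 to "one selbri word" and drops free* and the optional tag, so it is only a sketch under those assumptions. The SELBRI set and the function name are invented for the example.

```python
SELBRI = {"broda", "brode", "brodi"}  # stand-in for what bridi-tail-2 can start with

def first_alt(tokens, with_not_bo):
    # Models the tail of the first alternative: gihek [!BO] bridi-tail-2,
    # with bridi-tail-2 reduced to a single selbri word (an assumption).
    if len(tokens) < 2 or tokens[0] != "gi'e":
        return False
    if with_not_bo and tokens[1] == "bo":
        return False            # rejected by the lookahead
    return tokens[1] in SELBRI  # "bo" is not in SELBRI, so it fails here anyway

samples = [["gi'e", "brode"], ["gi'e", "bo", "brode"], ["gi'e", "bo"]]
for s in samples:
    # Same answer with and without the lookahead on every sample.
    print(" ".join(s), first_alt(s, True), first_alt(s, False))
```

A negative lookahead only changes the result when the expression after it could otherwise begin with the forbidden material.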

>>> In which case the only question is whether the !KE is
>>> superfluous.
>>
>> I suspect it's simply wrong, in order to get the right precedences.
>> But in any case, if it belongs there at all, it must go after free*,
>> not before.
>
> But what would removing the !KE do? Would it fix it?

I think so, yes.

mu'o mi'e xorxes



On Sun, Nov 16, 2008 at 6:00 PM, Chris Capel <pdf23ds@gmail.com> wrote:

> On Sun, Nov 16, 2008 at 14:52, Chris Capel <pdf23ds@gmail.com> wrote:

>> It turns out that sumti-1 is the first member of a series in the
>> grammar to reference a previous member of its series. So sumti-4 is
>> very much the rule, not the exception.

"gek sumti gik sumti-4" is weird because it means "ge da .a de gi da
.a di" will group as "(ge (da .a de) gi da) .a di", which is horrible.
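
A toy recursive-descent sketch in Python that reproduces this grouping. It collapses sumti-1 through sumti-3 into a single left-grouping "sumti" level and knows only "ge", "gi" and ".a", so it is an illustration of the asymmetry under those assumptions, not the real grammar.

```python
def parse_sumti(tokens, pos):
    # sumti <- sumti-4 (".a" sumti-4)*   (collapses sumti-1..sumti-3; an assumption)
    tree, pos = parse_sumti_4(tokens, pos)
    while pos < len(tokens) and tokens[pos] == ".a":
        right, pos = parse_sumti_4(tokens, pos + 1)
        tree = (tree, ".a", right)
    return tree, pos

def parse_sumti_4(tokens, pos):
    # sumti-4 <- "ge" sumti "gi" sumti-4 / word
    # The left connectand is a full sumti, the right one only a sumti-4.
    if tokens[pos] == "ge":
        left, pos = parse_sumti(tokens, pos + 1)
        if tokens[pos] != "gi":
            raise ValueError("expected gi")
        right, pos = parse_sumti_4(tokens, pos + 1)
        return ("ge", left, "gi", right), pos
    return tokens[pos], pos + 1

tree, _ = parse_sumti("ge da .a de gi da .a di".split(), 0)
print(tree)
# (('ge', ('da', '.a', 'de'), 'gi', 'da'), '.a', 'di')
```

The first ".a" is absorbed inside the gek-gik phrase, while the second one attaches to the whole phrase from outside.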

>> The only reason that comment
>> was there is because it was a comment I put in my copy of the grammar
>> file while debugging something about the grammar parse tree. Do you
>> think more rules should reference earlier members of the series when
>> possible? That would very much change the semantics of the language,
>> albeit often in gray areas.

What I think is that "ge X gi Y" should be symmetrical in X and Y, it
should always be possible to rewrite it as "ge Y gi X" without changing
the meaning.

> Now that I look at it, that sumti-1 rule looks fishy. Why is it
> referencing sumti instead of sumti-1? The only thing in sumti is VUhO,
> which looks just wrong.

It means you can use a VUhO relative clause to apply only within a
KE-KEhE grouped sumti.

mu'o mi'e xorxes

