Lojban In General

peg experiment with changing clauses to better support sa

I haven't actually used "rats" on this and I'm sure that other parts
of the peg parser file would have to be tweaked in minor ways. This is
just something to show an approach I thought might be helpful.

I am also sure the below has at least one error and/or omission.

;have the clause chew up its own sa and su junk

; http://www.lojban.org/tiki/tiki-index.php?page=Magic+Words&bl
; http://digitalkingdom.org/~rlpowell/hobbies/lojban/grammar/
; http://digitalkingdom.org/~rlpowell/hobbies/lojban/grammar/lojban.peg.txt

; the selma'o types not explicitly listed can use the same template
; sm variable being the selma'o
; maybe split up those that can have nai following and those that can't

${sm}-clause1 <- spaces? (${sm}-SA)* (spaces? SA)* spaces? !anticmavo ${sm}
${sm}-SA <- spaces? !anticmavo ${sm} (quote-clauses / !${sm} !SA !FAhO
any-word)* ${sm}-SA? SA &( (spaces? SA)* spaces? ${sm})
${sm}-clause <- BAhE-clause? ${sm}-clause1 spaces? post-clause


;anticmavo is misnamed, maybe anti valsi is better, but even that
; isn't quite right
quote-clauses <- ZOI-clause1 / LOhU-clause1 / LAhO-clause1 /
ZO-clause1 / anticmavo
anticmavo <- SI-clause / BU-clause1 / ZEI-clause1

;SI SA SU ZO ZOI LOhU LEhU ZEI BU LAhO and FAhO
; do some "magic" words by hand
; http://www.lojban.org/tiki/tiki-index.php?page=Magic+Words&bl

anti-quote <- SI / SA / SU / Y / ZEI / BU / FAhO
atomic-word <- BU-clause1 / ZEI-clause1 / ZO-clause1 / any-word
anti-SU <- NIhO / LU / TUhE / TO / SU / FAhO

SI-clause <- spaces? atomic-word SI-clause spaces? SI / spaces?
atomic-word spaces? SI
SU-clause <- (spaces? quote-clauses / spaces? !anti-SU any-word)*
spaces? !anticmavo SU

;SA-clause <- BAhE-SA / .... / ZOhU-SA / BRIVLA-SA / CMEVLA-SA
; each clause that needs SA support can directly use the one tailored
; to its needs
; so SA-clause shouldn't be needed

;erasure-clause <- SU-clause / SA-clause / SI-clause
; I don't know what order these should go in for an erasure clause
; I also don't think it will be needed

;the pre erasure clause will be needed, I don't think order matters
pre-I-SA <- spaces? (quote-clauses / !I !SA !FAhO any-word)* (spaces? SA)+ &I
pre-NIhO-SA <- spaces? (quote-clauses / !NIhO !SA !FAhO any-word)*
(spaces? SA)+ &NIhO
pre-erasure-clause <- (SU-clause / spaces? SI+ / pre-I-SA / pre-NIhO-SA)*

ZO-word <- ZEI-clause1 / BU-clause1 / any-word
ZO-clause0 <- spaces? ZO spaces? !anti-quote ZO-word
ZO-clause1 <- (ZO-SA)* (spaces? SA)* ZO-clause0
ZO-SA <- ZO-clause0 spaces? ( !ZO-clause1 quote-clauses / !ZO !SA
    !FAhO any-word)* spaces? ZO-SA? SA &( (spaces? SA)* spaces? ZO)

ZO-clause <- BAhE-clause? ZO-clause1 post-clause

; zoi and la'o can't use strict peg because it has to remember the word
; ZOI-for-${word} <- spaces? ZOI spaces? !anti-quote ${word}
;     (spaces? !${word} any-word)* ${word}

; ZOI-clause0 <- ZOI-for-a / ZOI-for-by / .... / ZOI-for-zvati
; otherwise you can use an infinite size rule set using a schema

ZOI-clause0 <- spaces? ZOI spaces? !anti-quote ${word}
    (spaces? !${word} any-word)* ${word}

ZOI-clause1 <- (ZOI-SA)* (spaces? SA)* ZOI-clause0
ZOI-SA <- ZOI-clause0 spaces? ( !ZOI-clause1 quote-clauses / !ZOI !SA
    !FAhO any-word)* spaces? SA spaces? ZOI-SA? SA
    &( (spaces? SA)* spaces? ZOI)
ZOI-clause <- BAhE-clause? ZOI-clause1 post-clause
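
The point about ZOI needing to remember its delimiter word can be illustrated outside the grammar. The following is only a rough sketch in Python, not part of the proposed PEG: it assumes the text is already split into whitespace-separated words, it ignores the !anti-quote check on the delimiter, and the function name match_zoi is made up for this example.

```python
from typing import List, Optional

def match_zoi(tokens: List[str], pos: int = 0) -> Optional[int]:
    """Return the index just past a `zoi DELIM ... DELIM` quote, or None."""
    if pos + 1 >= len(tokens) or tokens[pos] != "zoi":
        return None
    delim = tokens[pos + 1]          # the word the matcher has to remember
    for i in range(pos + 2, len(tokens)):
        if tokens[i] == delim:       # first repeat of the delimiter closes the quote
            return i + 1
    return None                      # no closing delimiter: not a complete quote

tokens = "zoi gy hello world gy mi klama".split()
end = match_zoi(tokens)
print(tokens[:end])  # ['zoi', 'gy', 'hello', 'world', 'gy']
```

A fixed PEG has no variable to hold delim, which is why the draft falls back on either one rule per possible word or a rule schema.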

LAhO-clause0 <- spaces? LAhO spaces? !anti-quote ${word}
    (spaces? !${word} any-word)* ${word}

LAhO-clause1 <- (LAhO-SA)* (spaces? SA)* LAhO-clause0
LAhO-SA <- LAhO-clause0 spaces? ( !LAhO-clause1 quote-clauses / !LAhO
    !SA !FAhO any-word)* spaces? SA spaces? LAhO-SA? SA
    &( (spaces? SA)* spaces? LAhO)
LAhO-clause <- BAhE-clause? LAhO-clause1 post-clause



LOhU-clause0 <- spaces? LOhU spaces? (spaces? !LEhU any-word)* LEhU-clause1
LOhU-clause1 <- (LOhU-SA)* (spaces? SA)* LOhU-clause0
LOhU-SA <- LOhU-clause0 spaces? ( !LOhU-clause1 quote-clauses / !LOhU
    !SA !FAhO any-word)* spaces? SA spaces? LOhU-SA? SA
    &( (spaces? SA)* spaces? LOhU)
LOhU-clause <- BAhE-clause? LOhU-clause1 post-clause

LEhU-clause0 <- spaces? LEhU
LEhU-clause1 <- (LEhU-SA)* (spaces? SA)* LEhU-clause0
LEhU-SA <- LEhU-clause0 spaces? ( !LOhU-clause1 quote-clauses / !LOhU
    !LEhU !SA !FAhO any-word)* spaces? SA spaces? LEhU-SA? SA
    &( (spaces? SA)* spaces? LEhU)
LEhU-clause <- LEhU-clause1 post-clause

; using sa into a zei-ified word is more limited
ZEI-word <- BU-clause1 / ZO-clause1 / any-word
ZEI-clause0 <- spaces? ZEI-word spaces? ZEI spaces? ZEI-word
ZEI-clause1 <- ZEI-word (ZEI-SA)* (spaces? SA)* ZEI ZEI-word
ZEI-SA <- spaces? ZEI spaces? (spaces? !quote-clauses !ZEI !SA !FAhO
any-word)* spaces? SA spaces? &ZEI
ZEI-clause <- BAhE-clause? ZEI-clause1 post-clause

; using sa into a bu-ified word is more limited
BU-word <- ZO-clause1 / ZEI-clause1 / any-word
BU-clause0 <- spaces? BU-word spaces? BU
BU-clause1 <- spaces? BU-word (BU-SA)* spaces? BU
BU-SA <- spaces? BU spaces? (spaces? !quote-clauses !BU !SA !FAhO
any-word)* spaces? SA spaces? &BU
BU-clause0 <- !BU-anticlause any-word BU (( quote-clauses / !BU any-word) SA BU)*
BU-clause <- BAhE-clause? BU-clause1 post-clause

FAhO-clause0 <- spaces? !anticmavo FAhO .*
FAhO-clause <- BAhE-clause? FAhO-clause0



;Y

;NIhO, LU, TUhE, and TO

anti-SU <- NIhO / LU / TUhE / TO

; these four are all so similar that I just did a template again that
; just adds "spaces? SU-clause?"

${sm}-clause1 <- spaces? (${sm}-SA)* (spaces? SA)* spaces? !anticmavo
${sm} spaces? SU-clause?
${sm}-SA <- spaces? !anticmavo ${sm} (quote-clauses / !${sm} !SA !FAhO
any-word)* ${sm}-SA? SA &( (spaces? SA)* spaces? ${sm})
${sm}-clause <- BAhE-clause? ${sm}-clause1 spaces? post-clause


;BRIVLA and CMENE

BRIVLA-atom <- !SI-clause !BU-clause1 (ZEI-clause1 / BRIVLA)
BRIVLA-clause0 <- spaces? BRIVLA-atom
BRIVLA-clause1 <- spaces? (BRIVLA-SA)* (spaces? SA)* BRIVLA-clause0
BRIVLA-SA <- BRIVLA-clause0 (quote-clauses / !BRIVLA-clause0 !SA !FAhO
any-word)* BRIVLA-SA? SA &( (spaces? SA)* BRIVLA-clause0)
BRIVLA-clause <- (BAhE-clause? BRIVLA-clause1 post-clause)+

${sm}-clause1 <- spaces? (${sm}-SA)* (spaces? SA)* spaces? !anticmavo ${sm}

CMEVLA-clause0 <- !CMEVLA-anticlause CMEVLA
CMEVLA-anticlause <- CMEVLA-SA / SI-clause / BU-clause
CMEVLA-SA <- CMEVLA (quote-clauses / !CMEVLA any-word)* SA &CMEVLA
CMEVLA-clause <- BAhE-clause? CMEVLA-clause0 post-clause

CMEVLA-clause0 <- spaces? !anticmavo CMEVLA
CMEVLA-clause1 <- spaces? (CMEVLA-SA)* (spaces? SA)* CMEVLA-clause0
CMEVLA-SA <- CMEVLA (quote-clauses / !CMEVLA !SA !FAhO any-word)*
CMEVLA-SA? SA &( (spaces? SA)* CMEVLA-clause0)
CMEVLA-clause <- (BAhE-clause? CMEVLA-clause1 post-clause)+

;post-clause <- spaces? SI-clause* !anticmavo indicators*
post-clause <- (spaces? SI-clause* / spaces? indicators*)*
indicator <- !anticmavo (UI-clause / CAI-clause / DAhO-clause / FUhO-clause)

;make custom clauses for each of ui, cai, nai, da'o, and fu'o


NAI-clause1 <- spaces? (NAI-SA)* (spaces? SA)* spaces? !anticmavo NAI
NAI-SA <- spaces? !anticmavo NAI (quote-clauses / !NAI !SA !FAhO
any-word)* NAI-SA? SA &( (spaces? SA)* spaces? NAI)
NAI-clause <- BAhE-clause? NAI-clause1 post-clause

UI-clause1 <- spaces? (UI-SA)* (spaces? SA)* spaces? !anticmavo UI
UI-SA <- spaces? !anticmavo UI (quote-clauses / !UI !SA !FAhO
any-word)* UI-SA? SA &( (spaces? SA)* spaces? UI)
UI-clause <- BAhE-clause? UI-clause1 NAI-clause1?

CAI-clause1 <- spaces? (CAI-SA)* (spaces? SA)* spaces? !anticmavo CAI
CAI-SA <- spaces? !anticmavo CAI (quote-clauses / !CAI !SA !FAhO
any-word)* CAI-SA? SA &( (spaces? SA)* spaces? CAI)
CAI-clause <- BAhE-clause? CAI-clause1 NAI-clause1?

DAhO-clause1 <- spaces? (DAhO-SA)* (spaces? SA)* spaces? !anticmavo DAhO
DAhO-SA <- spaces? !anticmavo DAhO (quote-clauses / !DAhO !SA !FAhO
any-word)* DAhO-SA? SA &( (spaces? SA)* spaces? DAhO)
DAhO-clause <- BAhE-clause? DAhO-clause1

FUhO-clause1 <- spaces? (FUhO-SA)* (spaces? SA)* spaces? !anticmavo FUhO
FUhO-SA <- spaces? !anticmavo FUhO (quote-clauses / !FUhO !SA !FAhO
any-word)* FUhO-SA? SA &( (spaces? SA)* spaces? FUhO)
FUhO-clause <- BAhE-clause? FUhO-clause1

; and finally ba'e
BAhE-clause1 <- spaces? (BAhE-SA)* (spaces? SA)* spaces? !anticmavo BAhE
BAhE-SA <- spaces? !anticmavo BAhE (quote-clauses / !BAhE !SA !FAhO
any-word)* BAhE-SA? SA &( (spaces? SA)* spaces? BAhE)
BAhE-clause <- BAhE-clause1+ (spaces? SI-clause* )*

; also BY-clause needs a custom one written similar in spirit to the
; brivla with zei, bu is like zei in a way.


To unsubscribe from this list, send mail to lojban-list-request@lojban.org
with the subject unsubscribe, or go to http://www.lojban.org/lsg2/, or if
you're really stuck, send mail to secretary@lojban.org for help.

On Sun, Nov 23, 2008 at 02:28, Stephen Pollei <stephen.pollei@gmail.com> wrote:
> I haven't actually used "rats" on this and I'm sure that other parts
> of the peg parser file would have to be tweaked in minor ways. This is
> just something to show an approach I thought might be helpful.

E-mail me a complete version of the changed PEG and I'll test it with
Lojgloss. Oh, and does anyone have some good test sentences?

Could you give a conceptual summary of how this differs from the
existing SA handling?

Chris Capel
--
"What is it like to be a bat? What is it like to bat a bee? What is it
like to be a bee being batted? What is it like to be a batted bee?"
-- The Mind's I (Hofstadter, Dennett)



On Sun, Nov 23, 2008 at 07:44:24AM -0600, Chris Capel wrote:
> On Sun, Nov 23, 2008 at 02:28, Stephen Pollei <stephen.pollei@gmail.com> wrote:
> > I haven't actually used "rats" on this and I'm sure that other parts
> > of the peg parser file would have to be tweaked in minor ways. This is
> > just something to show an approach I thought might be helpful.
>
> E-mail me a complete version of the changed PEG and I'll test it with
> Lojgloss. Oh, and does anyone have some good test sentences?

http://www.digitalkingdom.org/~rlpowell/hobbies/lojban/grammar/test_sentences.txt

--
Arnt Richard Johansen http://arj.nvg.org/
Yxskaftbud, ge vår wczonmö iqhjälp.



I didn't check the details, but if what you are doing is a SA that
erases everything to the previous appearance of the same selma'o,
Robin already had that working in a previous version of his PEG.

The main problem with SA is not how to write the grammar though, the
problem is deciding what exactly we want it to do. SA-selma'o is not
very pretty. SA-construct (where "construct" can be "sumti", "selbri",
and a few others) is only slightly better.

mu'o mi'e xorxes



On 11/23/08, Jorge Llambías <jjllambias@gmail.com> wrote:
> I didn't check the details, but if what you are doing is a SA that
> erases everything to the previous appearance of the same selma'o,
> Robin already had that working in a previous version of his PEG.

Yes, mostly that. I noticed in his notes that he didn't have
BRIVLA+SA or CMENE+SA working. In addition, by inspection and testing,
he doesn't have many others working either. For example, in #jbosnu,
"i sa i" worked but "ui sa ui" failed.
I didn't know about any earlier working version, and I don't know of any
URL where I can find one.

> The main problem with SA is not how to write the grammar though, the
> problem is deciding what exactly we want it to do. SA-selma'o is not
> very pretty. SA-construct (where "construct" can be "sumti", "selbri",
> and a few others) is only slightly better.

If I understand you: yes, my approach is basically SA-selma'o, except my
naming is the reverse of that: ${sm}-SA. I decomposed the problem
into about 124 easier subproblems (hopefully my selma'o count is right);
about 100 of the 124 subproblems share the same template. So I have
A-SA, BAhE-SA, BAI-SA, BE-SA, BEhO-SA, BEI-SA, ... , ... , ...,
ZEI-SA, ZI-SA, ZIhE-SA, ZO-SA, ZOhU-SA, ZOI-SA, BRIVLA-SA, and
CMEVLA-SA. I'm not sure why you would have sumti-SA and selbri-SA, so I
think we may be talking about completely different things. Also, for
me "SA-selma'o" isn't one rule, it's over a hundred. ${sm} is the
variable that you substitute for each selma'o that can use the
template.
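
As a rough illustration of that substitution, the template can be expanded mechanically. The sketch below uses Python's string.Template, whose ${name} placeholder syntax happens to match the notation used here; the function name expand is made up for this example, and the three selma'o passed in are just a sample from the list above.

```python
from string import Template

# The shared template rules, with ${sm} standing for the selma'o name.
RULES = Template(
    "${sm}-clause1 <- spaces? (${sm}-SA)* (spaces? SA)* spaces? !anticmavo ${sm}\n"
    "${sm}-SA <- spaces? !anticmavo ${sm} (quote-clauses / !${sm} !SA !FAhO any-word)*"
    " ${sm}-SA? SA &( (spaces? SA)* spaces? ${sm})\n"
    "${sm}-clause <- BAhE-clause? ${sm}-clause1 spaces? post-clause\n"
)

def expand(selmaho_names):
    """Return the PEG rules for every selma'o that fits the shared template."""
    return "\n".join(RULES.substitute(sm=name) for name in selmaho_names)

print(expand(["A", "BAI", "ZIhE"]))
```

Feeding it the roughly 100 selma'o that fit the template would produce the bulk of the rule set; the magic words would still need their hand-written rules.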

Each selma'o-SA is only used in two spots:
1) in the selma'o-clause rules
2) recursively, to allow "x ... x ... sa sa x" and friends.
If it weren't for the recursive definition, each selma'o or
pseudo-selma'o could just put its relevant selma'o-SA directly into
itself; each selma'o clause handles its own sa and su issues.
Other reasons to keep it separate are readability, and the option of
building a SA-clause by recomposing the rule from the decomposed
subproblems. In the comments I gave an example of how you could
re-form a universal "SA-clause", but noted that you never really
need the answer to the general question of whether a piece of text is
part of *any* SA clause, just one of the particular subproblems.


I also noted that most (about 100 of 124, or about 80%) of the ${sm}-clause
and ${sm}-SA rules can be formed from the same template. ${sm}-clause is
already very boilerplate in rlpowell's PEG grammar. The ones
that could not use the same template were SI, SA, SU, ZO, ZOI,
LOhU, LEhU, ZEI, BU, LAhO, FAhO, NIhO, LU, TUhE, TO, UI, CAI, NAI,
DAhO, FUhO, BY, and BAhE. Many of those are the so-called "magic"
words. That leaves around 100 selma'o that can use the same
template. Another change, slightly unrelated to the sa fixes, is that
some of the magic-word clauses that quote text now also match and
consume the quoted text.

Also, I think correctness should take priority over prettiness; at
least have a complete but ugly version and a pretty but incomplete
version. The PEG grammar is also a more exacting specification for
"deciding what exactly we want it to do". Right now there is some
hand-waving, IMHO.

PS: I noticed that the first version below should probably be
changed into the second version, so that "x ... z .... sa
z .... sa x" works. That should allow some SA nesting.

;old nonnested
${sm}-clause1 <- spaces? (${sm}-SA)* (spaces? SA)* spaces? !anticmavo ${sm}
${sm}-SA <- spaces? !anticmavo ${sm} (quote-clauses / !${sm} !SA !FAhO
any-word)* ${sm}-SA? SA &( (spaces? SA)* spaces? ${sm})
${sm}-clause <- BAhE-clause? ${sm}-clause1 spaces? post-clause

;should be nestable
;can maybe be slightly optimized for the (hopefully rare) case of
; multiple sa's in a row within an inner sa construct
; that should be skipped over
${sm}-clause1 <- spaces? (${sm}-SA)* (spaces? SA)* spaces? !anticmavo ${sm}
${sm}-SA-end <- (spaces? SA)* spaces? ${sm}
${sm}-SA <- spaces? !anticmavo ${sm} (quote-clauses / spaces?
    !${sm}-SA-end !FAhO any-word)* ${sm}-SA? SA &(${sm}-SA-end)

${sm}-clause <- BAhE-clause? ${sm}-clause1 spaces? post-clause
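
To get a feel for what the nestable rules accept, here is a hand-translated recognizer in Python. It is only a sketch under several assumptions: it works on a list of already-split words, it ignores spaces, anticmavo and quote-clauses, it reads the skip loop as "!${sm}-SA-end !FAhO any-word", and the helper names (sa_end, sm_sa, sm_clause1) plus the is_sm predicate are invented for this example. It has not been checked against rats or the real grammar.

```python
from typing import Callable, List, Optional

Pred = Callable[[str], bool]  # decides whether a word belongs to the target selma'o

def sa_end(toks: List[str], pos: int, is_sm: Pred) -> bool:
    """${sm}-SA-end as a lookahead: zero or more SA, then a word of the selma'o."""
    while pos < len(toks) and toks[pos] == "sa":
        pos += 1
    return pos < len(toks) and is_sm(toks[pos])

def sm_sa(toks: List[str], pos: int, is_sm: Pred) -> Optional[int]:
    """${sm}-SA: an erased stretch. Returns the index just past its SA, or None."""
    if pos >= len(toks) or not is_sm(toks[pos]):
        return None
    pos += 1
    # (!${sm}-SA-end !FAhO any-word)*
    while pos < len(toks) and not sa_end(toks, pos, is_sm) and toks[pos] != "fa'o":
        pos += 1
    nested = sm_sa(toks, pos, is_sm)              # ${sm}-SA?
    if nested is not None:
        pos = nested
    if pos >= len(toks) or toks[pos] != "sa":     # SA
        return None
    pos += 1
    return pos if sa_end(toks, pos, is_sm) else None   # &(${sm}-SA-end)

def sm_clause1(toks: List[str], pos: int, is_sm: Pred) -> Optional[int]:
    """${sm}-clause1: returns the index just past the surviving word, or None."""
    while True:                                   # (${sm}-SA)*
        nxt = sm_sa(toks, pos, is_sm)
        if nxt is None:
            break
        pos = nxt
    while pos < len(toks) and toks[pos] == "sa":  # (spaces? SA)*
        pos += 1
    if pos < len(toks) and is_sm(toks[pos]):
        return pos + 1
    return None

is_x = lambda w: w == "x"
print(sm_clause1("x a z b sa z c sa x".split(), 0, is_x))  # 9: the inner "sa z" is skipped
print(sm_clause1("x a x b sa sa x".split(), 0, is_x))      # 7: two x's erased by "sa sa"
```

In both examples the whole input is consumed by a single clause, and only the final x survives.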

I'll also have to fix up the other ones that don't follow the
template, and finish the BY-clause and BY-SA rules, plus do more review
to see what else I've missed.

PPS

wc cmavo_selmaho.txt
122 122 1213 cmavo_selmaho.txt

122 + the brivla pseudo selma'o and the cmevla pseudo selma'o is how I
derived the number 124. I might be off a tiny bit.



On Mon, Nov 24, 2008 at 8:14 PM, Stephen Pollei
<stephen.pollei@gmail.com> wrote:
> I didn't know about any earlier working version. I also don't know any
> url where I can find earlier working versions.

Hopefully Robin has it saved somewhere. It was something similar to
what you have, with a repeated pattern of rules for each selma'o.

> Not sure why you would have sumti-SA and selbri-SA, so I
> think we are maybe talking about completely different things.

For example, suppose you want to erase "lo broda" and replace it with "re
lo broda": "lo broda cu sa re lo broda cu brode". The idea is that
SA doesn't just look at the following word to know how far back to
delete, but at the following construct. I still think this is only
marginally better than the selmaho version though.

> Also I think correctness over prettiness, might be a priority; at
> least have a complete but ugly version and a pretty but incomplete
> version.

Yes, but first we need to know what the correct behavior is supposed to be.

SU is used when you are getting nowhere with your utterance and you
want to make a fresh start. That's a reasonable thing to have, and
easy for the human parser to handle. SI is used to replace the last
word with something else. Somewhat artificial, but at least clear
enough and not impossibly hard to follow. But what do we want SA to
do? Searching back for the last appearance of some particular selmaho
in speech is an extremely hard thing to do, and then on top of that
you have to start reprocessing from there, keeping what you had before
and continuing with something else? That may work for machines, but
for human beings? In my opinion, that's the issue we have to solve for
SA before working out the grammar in detail. Selmaho driven
replacement is not a human friendly option, in my opinion.
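
To make the three behaviors concrete, here is a small Python sketch of selmaho-driven erasure on a list of words. Everything in it is an assumption made for illustration: the tiny word-to-selmaho table, treating unknown words as brivla, letting SU wipe the whole utterance so far (the grammar draft earlier in the thread stops at NIhO, LU, TUhE or TO), and treating SA as a no-op when no earlier word of that selmaho exists.

```python
from typing import Dict, List

# Hypothetical mini-lexicon; a real tool would load the full cmavo list.
SELMAHO: Dict[str, str] = {
    "i": "I", "ui": "UI", "mi": "KOhA", "do": "KOhA",
    "lo": "LE", "le": "LE", "cu": "CU",
}

def selmaho(word: str) -> str:
    return SELMAHO.get(word, "BRIVLA")  # assumption: unknown words count as brivla

def apply_erasures(words: List[str]) -> List[str]:
    kept: List[str] = []
    i = 0
    while i < len(words):
        w = words[i]
        if w == "sa":
            n = 0                          # count a run of consecutive sa
            while i < len(words) and words[i] == "sa":
                n += 1
                i += 1
            if i < len(words):
                target = selmaho(words[i])
                for _ in range(n):         # each sa reaches one occurrence further back
                    for j in range(len(kept) - 1, -1, -1):
                        if selmaho(kept[j]) == target:
                            del kept[j:]
                            break
            continue                       # words[i], the replacement, is handled next
        if w == "si":
            if kept:
                kept.pop()                 # SI: drop the last word
        elif w == "su":
            kept.clear()                   # SU: fresh start (simplified)
        else:
            kept.append(w)
        i += 1
    return kept

print(apply_erasures("mi klama si citka".split()))      # ['mi', 'citka']
print(apply_erasures("ui mi sa ui do".split()))         # ['ui', 'do']
print(apply_erasures("lo broda cu sa lo brode".split()))  # ['lo', 'brode']
```

The backward search inside the SA branch is trivial for a program; it is the step a human listener would have to do in real time.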

mu'o mi'e xorxes

