A challenge for computer science/programming geeks: The LLG wants to give you $500! Posted by rlpowell on Tue 28 of Oct, 2008 23:01 GMT posts: 14214 Use this thread to discuss the A challenge for computer science/programming geeks: The LLG wants to give you $500! page.
Posted by rlpowell on Tue 28 of Oct, 2008 23:01 GMT posts: 14214 On Tue, Oct 28, 2008 at 05:48:35PM -0500, Chris Capel wrote: > On Tue, Oct 28, 2008 at 16:51, Robin Lee Powell > <rlpowell@digitalkingdom.org> wrote: > > Your mission, should you choose to accept it, is to create a CFG > > for Lojban. There are three options, with different monetary > > values attached, in order of what we in the LLG board would > > prefer to get. > > How exactly is PEG not context free? Wikipedia's not helping me > much here. AFAICT, PEGs are a subset of CFGs where each rule is > unique and all operators are greedy and unambiguous. No, absolutely not. CFGs are descriptions of how to *generate* a grammar. PEGs are descriptions of how to *parse* a grammar. The two formalisms look similar visually, but have almost nothing in common in terms of formal analysis. -Robin -- They say: "The first AIs will be built by the military as weapons." And I'm thinking: "Does it even occur to you to try for something other than the default outcome?" — http://shorl.com/tydruhedufogre http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/ To unsubscribe from this list, send mail to lojban-list-request@lojban.org with the subject unsubscribe, or go to http://www.lojban.org/lsg2/, or if you're really stuck, send mail to secretary@lojban.org for help.
Posted by rlpowell on Wed 29 of Oct, 2008 00:26 GMT posts: 14214 On Tue, Oct 28, 2008 at 06:06:36PM -0500, Chris Capel wrote: > On Tue, Oct 28, 2008 at 16:51, Robin Lee Powell > <rlpowell@digitalkingdom.org> wrote: > > For $100: Formally prove that encoding Lojban's elidable > > terminators is not possible in a CFG. > > Hmm. Don't get too excited, there. What's the standard of > formality, here? Computer proof level, or simply very explicit > English? I don't know what "computer proof" means. I had in mind a normal mathematical proof; something like http://en.wikipedia.org/wiki/Cook%E2%80%93Levin_theorem -Robin -- They say: "The first AIs will be built by the military as weapons." And I'm thinking: "Does it even occur to you to try for something other than the default outcome?" — http://shorl.com/tydruhedufogre http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/ To unsubscribe from this list, send mail to lojban-list-request@lojban.org with the subject unsubscribe, or go to http://www.lojban.org/lsg2/, or if you're really stuck, send mail to secretary@lojban.org for help.
Posted by rlpowell on Wed 29 of Oct, 2008 01:00 GMT posts: 14214 On Tue, Oct 28, 2008 at 09:30:51PM -0300, Jorge LlambÃas wrote: > On Tue, Oct 28, 2008 at 6:51 PM, Robin Lee Powell > <rlpowell@digitalkingdom.org> wrote: > > > > For $500: Produce a working CFG for Lojban, in any format that > > some parser generator somewhere can accept, *or* in BNF or a > > well defined extension thereof. It must support every aspect of > > the language except for ZOI (which is fully context sensitive). > > Elidable terminators are the hard part. > > You don't specify that the CFG must be unambiguous, so ambiguous > CFG's are acceptable? (We already know the language is > unambiguous, but a grammar that generates it can still be > ambiguous, allowing more than one way to generate the same valid > string.) If the grammar is ambiguous, I think you've failed to make a grammar for Lojban. > Terminals are selmaho, not phonemes, right? That was my assumption, yes. -Robin -- They say: "The first AIs will be built by the military as weapons." And I'm thinking: "Does it even occur to you to try for something other than the default outcome?" — http://shorl.com/tydruhedufogre http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/ To unsubscribe from this list, send mail to lojban-list-request@lojban.org with the subject unsubscribe, or go to http://www.lojban.org/lsg2/, or if you're really stuck, send mail to secretary@lojban.org for help.
Posted by rlpowell on Wed 29 of Oct, 2008 01:24 GMT posts: 14214 On Tue, Oct 28, 2008 at 10:20:12PM -0300, Jorge LlambÃas wrote: > On Tue, Oct 28, 2008 at 9:59 PM, Robin Lee Powell > <rlpowell@digitalkingdom.org> wrote: > > > > If the grammar is ambiguous, I think you've failed to make a > > grammar for Lojban. > > A CFG is a set of rules that generates each and only the valid > strings of the language. Unambiguity is desirable for a parsing > grammar (which we already have, the PEG), but a generating grammar > that generates all the Lojban strings is also a grammar for > Lojban, even if it has more than one way to generate the same > valid string. I don't believe, but am not certain, that that is possible. That is, I believe that if there is more than one way to generate a string, you've describe an ambiguous language, and that such a description would not, in fact, describe Lojban. > A machine that generated its output from such a grammar would > generate valid Lojban. If you want an unambiguous CFG (which is > obviously a somewhat harder problem) I think you need to specify > it. Then please, everyone, consider it so specified. Apologies for triple posting again, but if xorxes is right this is important. -Robin -- They say: "The first AIs will be built by the military as weapons." And I'm thinking: "Does it even occur to you to try for something other than the default outcome?" — http://shorl.com/tydruhedufogre http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/ To unsubscribe from this list, send mail to lojban-list-request@lojban.org with the subject unsubscribe, or go to http://www.lojban.org/lsg2/, or if you're really stuck, send mail to secretary@lojban.org for help.
Posted by rlpowell on Wed 29 of Oct, 2008 01:35 GMT posts: 14214 On Tue, Oct 28, 2008 at 10:30:16PM -0300, Jorge LlambÃas wrote: > On Tue, Oct 28, 2008 at 10:24 PM, Robin Lee Powell > <rlpowell@digitalkingdom.org> wrote: > > > > I don't believe, but am not certain, that that is possible. > > That is, I believe that if there is more than one way to > > generate a string, you've describe an ambiguous language, and > > that such a description would not, in fact, describe Lojban. > > See example 3 in > <http://en.wikipedia.org/wiki/Context-free_grammar> You win. -Robin -- They say: "The first AIs will be built by the military as weapons." And I'm thinking: "Does it even occur to you to try for something other than the default outcome?" — http://shorl.com/tydruhedufogre http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/ To unsubscribe from this list, send mail to lojban-list-request@lojban.org with the subject unsubscribe, or go to http://www.lojban.org/lsg2/, or if you're really stuck, send mail to secretary@lojban.org for help.
Posted by rlpowell on Wed 29 of Oct, 2008 08:56 GMT posts: 14214 On Wed, Oct 29, 2008 at 11:34:37AM +0300, Cyril Slobin wrote: > On 10/29/08, Robin Lee Powell <rlpowell@digitalkingdom.org> wrote: > > > > A machine that generated its output from such a grammar would > > > generate valid Lojban. If you want an unambiguous CFG (which > > > is obviously a somewhat harder problem) I think you need to > > > specify it. > > > > Then please, everyone, consider it so specified. Apologies for > > triple posting again, but if xorxes is right this is important. > > Correct me if I mislook something, but without this clarification > the task seems to be trivial, isn't it? If we don't care about > parsing tree, we can just make all elidable terminators optional > and get the solution. Look at a very tiny subset of the language: > > <bridi> ::= broda | <bridi> <bridi> | nu <bridi> | nu <bridi> kei > > The string "nu broda broda" can be generated in two ways: > > nu broda) broda) and (nu (broda broda > > , but the grammar generates all good strings and only good strings. > Therefore unambiguity is the central difficulty there. Fair warning: it's been a *long* time since I worked on this. Your example is too simple to show the problem. sentence ::= <sumti> <selbri> | <sumti> <selbri> <sumti> sumti ::= le <selbri> ku | le <selbri> selbri ::= <brivla> | <brivla> <selbri> brivla ::= klama This will generate the *incorrect* sentence "le klama klama". That's a valid utterance, but it's *not* a Lojban sentence, and a parser based on this grammar will mark it as such. The definition of an elidable terminator is that it can be dropped when no ambiguity (or significant change to the parse tree) results. In this case, the ku is not elidable, but it is not obvious how to make a CFG that refuses "le klama klama" but accepts "le klama ku klama le klama" If you think you've found a way, try adding another elidable terminator that interacts directly with ku. -Robin -- They say: "The first AIs will be built by the military as weapons." And I'm thinking: "Does it even occur to you to try for something other than the default outcome?" — http://shorl.com/tydruhedufogre http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/ To unsubscribe from this list, send mail to lojban-list-request@lojban.org with the subject unsubscribe, or go to http://www.lojban.org/lsg2/, or if you're really stuck, send mail to secretary@lojban.org for help.
Posted by rlpowell on Wed 29 of Oct, 2008 18:20 GMT posts: 14214 On Wed, Oct 29, 2008 at 09:41:57AM -0300, Jorge LlambÃas wrote: > On Wed, Oct 29, 2008 at 7:27 AM, Kevin Reid <kpreid@mac.com> wrote: > > > > An unambiguous grammar for the same set of parse trees would be > > > > SELBRI -> SBATOM | SELBRI SBATOM > > SBATOM -> BRIVLA | "ke" SELBRI "ke'e" > > And to make "ke'e" elidable: > > SELBRI -> SELBRI-closed | SELBRI-open > SELBRI-closed -> SBATOM-closed | SELBRI-closed SBATOM-closed > SELBRI-open -> SBATOM-open | SELBRI-closed SBATOM-open > SBATOM-closed -> BRIVLA | "ke" SELBRI "ke'e" > SBATOM-open -> "ke" SELBRI Again, this is too simple to actually illustrate the problem: there is no sentence that Kevin's example generates that is illegal if you remove a ke'e. > My first impresssion is that the mission truly is impossible: i.e. > it is possible to handle the elidable terminators, but not with > less than 2000 production rules, because handling each terminator > will multiply the number of productions by some factor, and there > are a lot of terminators. But that's just my first impression. I don't believe it's possible even with 100K productions; I would be very interested to be proven wrong on that. -Robin -- They say: "The first AIs will be built by the military as weapons." And I'm thinking: "Does it even occur to you to try for something other than the default outcome?" — http://shorl.com/tydruhedufogre http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/ To unsubscribe from this list, send mail to lojban-list-request@lojban.org with the subject unsubscribe, or go to http://www.lojban.org/lsg2/, or if you're really stuck, send mail to secretary@lojban.org for help.
Posted by rlpowell on Wed 29 of Oct, 2008 18:45 GMT posts: 14214 On Wed, Oct 29, 2008 at 03:40:22PM -0300, Jorge LlambÃas wrote: > >> My first impresssion is that the mission truly is impossible: > >> i.e. it is possible to handle the elidable terminators, but not > >> with less than 2000 production rules, because handling each > >> terminator will multiply the number of productions by some > >> factor, and there are a lot of terminators. But that's just my > >> first impression. > > > > I don't believe it's possible even with 100K productions; I > > would be very interested to be proven wrong on that. > > But you didn't offer any money for that proof! ;) If a big-production version is possible, anyone trying to get the small-production version will find it. I'm assuming they wouldn't be mean enough to keep it to themselves. However, I tentatively (subject to board approval, which I won't request unless it comes up, but expect to get) offer $100 for a mass-production CFG for Lojban. -Robin -- They say: "The first AIs will be built by the military as weapons." And I'm thinking: "Does it even occur to you to try for something other than the default outcome?" — http://shorl.com/tydruhedufogre http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/ To unsubscribe from this list, send mail to lojban-list-request@lojban.org with the subject unsubscribe, or go to http://www.lojban.org/lsg2/, or if you're really stuck, send mail to secretary@lojban.org for help.
Posted by JohnCowan on Wed 29 of Oct, 2008 20:38 GMT posts: 149 Robin Lee Powell scripsit: > I don't believe, but am not certain, that that is possible. That > is, I believe that if there is more than one way to generate a > string, you've describe an ambiguous language, and that such a > description would not, in fact, describe Lojban. The fact that there exists an ambiguous grammar for a language is no evidence one way or another about the existence of an unambiguous grammar for the same language. -- John Cowan cowan@ccil.org http://ccil.org/~cowan Rather than making ill-conceived suggestions for improvement based on uninformed guesses about established conventions in a field of study with which familiarity is limited, it is sometimes better to stick to merely observing the usage and listening to the explanations offered, inserting only questions as needed to fill in gaps in understanding. --Peter Constable To unsubscribe from this list, send mail to lojban-list-request@lojban.org with the subject unsubscribe, or go to http://www.lojban.org/lsg2/, or if you're really stuck, send mail to secretary@lojban.org for help.
Posted by rlpowell on Thu 30 of Oct, 2008 19:29 GMT posts: 14214 On Thu, Oct 30, 2008 at 11:21:45AM -0800, Stephen Pollei wrote: > On 10/28/08, Robin Lee Powell <rlpowell@digitalkingdom.org> wrote: > > > > Your mission, should you choose to accept it, is to create a > > CFG for Lojban. There are three options, with different > > monetary values attached, in order of what we in the LLG board > > would prefer to get. > > > For $100: Formally prove that encoding Lojban's elidable > > terminators is not possible in a CFG. > > I've been thinking about the problem as well as I think the issue > is the ambiguity. For instance I beleive that any peg grammar can > be reduced into a most likely weaker cfg grammar that has > potentialy multiple parse trees for the same valid statement in > the grammar. You're wrong, I'm afraid. http://en.wikipedia.org/wiki/Parsing_expression_grammar#Examples Look for "The following parsing expression grammar describes the classic non-context-free language". PEGs can encode things that CFGs simply cannot, ambiguously or otherwise. This is why Lojban's status is an interesting open question: we can do elidable terminators in PEG, but haven't been able to in CFGs. -Robin -- They say: "The first AIs will be built by the military as weapons." And I'm thinking: "Does it even occur to you to try for something other than the default outcome?" — http://shorl.com/tydruhedufogre http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/ To unsubscribe from this list, send mail to lojban-list-request@lojban.org with the subject unsubscribe, or go to http://www.lojban.org/lsg2/, or if you're really stuck, send mail to secretary@lojban.org for help.
Posted by rlpowell on Thu 30 of Oct, 2008 20:14 GMT posts: 14214 On Thu, Oct 30, 2008 at 12:00:35PM -0800, Stephen Pollei wrote: > On 10/30/08, Robin Lee Powell <rlpowell@digitalkingdom.org> wrote: > > On Thu, Oct 30, 2008 at 11:21:45AM -0800, Stephen Pollei wrote: > > > On 10/28/08, Robin Lee Powell <rlpowell@digitalkingdom.org> wrote: > > > > > You're wrong, I'm afraid. > > > > http://en.wikipedia.org/wiki/Parsing_expression_grammar#Examples > > > > Look for "The following parsing expression grammar describes the > > classic non-context-free language". > > S => &(A !b) a+ B !c > A => a A? b > B => b B? c > > would will my transformations become > > S => a+ B > A => a A? b > B => b B? c > > all since I specified that ?, *, and + would be non greedy than this > cfg grammar would match whatever that peg grammar matches, only > problem is that it would over match, and would possible have multiple > matches. That's not even remotely the same language. For one thing, there's no way to get to A, so it's actually: S => a+ B B => b B? c which is just ambncn, which is a completely different language. You may already understand this, and that's fine, I just want there to be no confusion: a language that is as close to Lojban as that is to anbncn is of no use to us, and we won't pay out for it. -Robin -- They say: "The first AIs will be built by the military as weapons." And I'm thinking: "Does it even occur to you to try for something other than the default outcome?" — http://shorl.com/tydruhedufogre http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/ To unsubscribe from this list, send mail to lojban-list-request@lojban.org with the subject unsubscribe, or go to http://www.lojban.org/lsg2/, or if you're really stuck, send mail to secretary@lojban.org for help.