LojbanMachineTranslation

  • Nicholas, N. 1996a. Lojban as a Machine Translation Interlanguage in the Pacific. Fourth Pacific Rim International Conference on Artificial Intelligence: Workshop on 'Future Issues for Multilingual Text Processing', Cairns, Australia, 27 August 1996. 31-39.
  • pycyn pointed out that work has been done on conversion from Logician's English to predicate logic. See the provided references.
  • And suggested Discourse Representation Theory as being relevant.
  • Bjorn Gohla pointed out KPML, a natural language text generation system.
    • Jay thinks that translating Lojban into other languages is almost purely a NL text generation problem. (Jay also feels that translating natural languages into Lojban is an uninteresting problem, FWIW.)
      • An uninteresting problem?? Well, let's get some skeleton code running then if it's so easy! Because it's a very useful project!
        • Come on, xod, we're supposed to be thinking logically, aren't we? le'i ro cinri na cmima le'i frili. The kind of reasoning you're demonstrating is the kind of thing Lojban ought to be wonderful at putting a stop to. :-) --jay
          • zo'o .i vu'e ma'i lo'i skami nabmi le sizytolcinri ca'a smuni lo nalpluja
      • What quality of translation is uninteresting? A Babelfish-quality machine translator would be useful; is that what you consider dull? A Stefan George-quality translator would be incredible (and far beyond the state of the art!).
        • Any quality is uninteresting. I don't see taking massive amounts of natlang text and moving it into Lojban as useful. What I do see as useful is writing new things (patents, manuals, etc.) in Lojban, and then being able to get quality translations of the Lojban in n other languages. (Nobody will learn Lojban to read things which are already available in a natural language, but they might learn it so that they can write things that can be translated into n languages.) --jay
          • Nobody needs to learn Lojban if converting into and out of Lojban is so easy. Sorry, but if Lojban is used as an interlingua, it will be less like a Lingua Franca, spoken by many people to each other, and more like a hidden inter-translation code that few ever care to see. As far as I am concerned, Natlang > Lojban is hard, Lojban > Natlang is easy. So, you can see my surprise at the allegation that the former is easy too! If it's all so easy, let's just do it. --xod
            • You're living in an alternate reality, because nobody has said that natlang -> lojban is easy, and asserting it more often won't make it true. --jay
              • Ignoring my direct addressing of this point doesn't help matters. Read the sentence I wrote above in Lojban. --xod
                • I did read it. That is a perfectly valid assumption to make, right until you're corrected. When you persist in holding a view which is valid in reference frame A, after it has been pointed out that you're not in reference frame A, well, see the above "alternate realities" comment. --jay
                  • The reference frame is skami nabmi. We are discussing an issue of software complexity. Where is the disconnect, and why do you think it's been explained to me even once? --xod



I'm pretty knowledgeable about artificial intelligence, though I've never worried much about natural language understanding specifically. In my opinion, the issue of whether translation to or from Lojban is "easier" is secondary. Before you can answer it, you have to ask: What quality of translation?

  • There are large chunks of the translation process which Lojban makes easier, and the only thing about Lojban which would make the process more difficult is the fact that you can't rely on natural language vagueness and iffiness to carry the day. (And really, you shouldn't let it ever do so, but people want results...) --jay


For machine translation quality similar to the current state of the art (that is, poor), I expect Lojban would be much easier to translate from and somewhat easier to translate to. Lojban provides unambiguous parses and often-unambiguous word meanings, which are basic abilities that today's machine translators have trouble with.

  • "Trouble with"? They're incapable of it. For some languages, it's provably impossible (without true understanding of the context; see Swiss German). --jay
    • "Incapable" is too strong a term. Machine translators can use statistical models to make guesses at word senses. It's not on the same planet as throwing darts. mi'e jezrax
      • I was referring to parsing the grammar, actually. Swiss German is provably context sensitive, so you'll need to understand it before you can even hope to parse it. As far as word sense goes, well, if you've got an algorithmic process for even guessing at the meaning of words in natural language, I suggest you publish. :-) Otherwise, you'll have to define for the system every word you want it to be able to translate. (Whereas in Lojban, that is limited to the so-far-little-used fu'ivla.) (The best statistical model I've seen which could be applied to determining word sense from nothing is latent semantic analysis, and that requires a very, very big corpus, and acts in odd, unpredictable ways: hot and cold are "closer" to it than are cold and cool.)
        • zo'o No need to publish; there are already enough book chapters on it. Search Amazon for "machine translation" or "natural language processing". It's a decades-old research field; people know what the problems are; none of them are solved, but there's progress on every front. One specific suggestion: "Foundations of Statistical Natural Language Processing" http://www.amazon.com/exec/obidos/ASIN/0262133601/qid=1009761097/sr=1-2/ref=sr_1_75_2/103-0198688-3778269
        • The ambiguous-parse problem occurs in, probably, all natural languages. I gather that the most popular way to deal with it is by brute force: produce all possible parses (usually a lot more than you expect), and then rank them. As far as I'm concerned, this is an easy problem from among the problems of natural language understanding--but only relatively speaking!
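
The brute-force approach mentioned above (produce every parse, then rank them) can be sketched in a few lines. This is only an illustration: the toy grammar, the lexicon, and the rule probabilities are all invented for the example and are not taken from any real translator.

```python
from functools import lru_cache

# Hypothetical toy grammar in binary form: (parent, left, right, probability).
# The probabilities are made-up numbers used only to rank the parses.
RULES = [
    ("S", "NP", "VP", 1.0),
    ("VP", "V", "NP", 0.7),
    ("VP", "VP", "PP", 0.3),
    ("NP", "Det", "N", 0.8),
    ("NP", "NP", "PP", 0.2),
    ("PP", "P", "NP", 1.0),
]
# Hypothetical lexicon: word -> categories it may belong to.
LEXICON = {
    "I": ["NP"], "saw": ["V"], "the": ["Det"],
    "man": ["N"], "telescope": ["N"], "with": ["P"],
}
PROB = {(p, l, r): w for p, l, r, w in RULES}


def all_parses(words, symbol="S"):
    """Return every parse tree of `words` rooted at `symbol`."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def parse(sym, i, j):
        # A one-word span can only be covered by a lexical entry.
        if j - i == 1:
            if sym in LEXICON.get(words[i], []):
                return ((sym, words[i]),)
            return ()
        trees = []
        # Try every rule for `sym` and every split point of the span.
        for parent, left, right, _ in RULES:
            if parent != sym:
                continue
            for k in range(i + 1, j):
                for lt in parse(left, i, k):
                    for rt in parse(right, k, j):
                        trees.append((sym, lt, rt))
        return tuple(trees)

    return list(parse(symbol, 0, len(words)))


def score(tree):
    """Product of the rule probabilities used in the tree (leaves count as 1)."""
    if len(tree) == 2:  # leaf: (category, word)
        return 1.0
    sym, lt, rt = tree
    return PROB[(sym, lt[0], rt[0])] * score(lt) * score(rt)


def rank(parses):
    """Sort parses from most to least probable."""
    return sorted(parses, key=score, reverse=True)


if __name__ == "__main__":
    for tree in rank(all_parses("I saw the man with the telescope".split())):
        print(round(score(tree), 4), tree)
```

With this grammar the sentence "I saw the man with the telescope" has two parses, one attaching "with the telescope" to the verb and one to "the man". Adding more prepositional phrases makes the count grow quickly, which is the "a lot more than you expect" effect. A Lojban parser never needs the ranking step, because the grammar yields a single parse.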


For machine translation quality similar to a quick-and-dirty human translation (moderate quality, but still much better than the current state of the art), I doubt Lojban offers much advantage. The problems are so much more difficult that merely getting the syntax and individual words correct doesn't go that far toward solving them.

  • What problems (that don't already exist in dealing with natural language)? jbofi'e already performs "quick-and-dirty" translation, and the results would be decent with smoothing and some knowledge of the destination language applied (getting subject/verb count agreement and such things to match). --jay
    • jbofi'e's translation quality is way, way worse than a quick-and-dirty human translation.
      • You'll need to define "quick and dirty", then, as I interpret it to mean dictionary lookup of each individual word, limited attempts to deal with conjugation, and simplistic reordering to match the order of subject, verb and object in the destination language. jbofi'e definitely beats that out.
        • A "quick-and-dirty human" translation is, say, one done in real time by a simultaneous translator.
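
For comparison, the baseline described in the thread above (dictionary lookup of each word plus simplistic reordering of subject, verb and object) might look like the sketch below. The three-word glossary and the assumption that the input is always exactly subject, verb, object are illustrative only; this is not how jbofi'e works.

```python
# Hypothetical three-entry glossary, Lojban -> English.
GLOSSARY = {"mi": "I", "do": "you", "prami": "love"}


def naive_translate(sentence, target_order="SVO"):
    """Gloss each word, then reorder subject/verb/object for the target language.

    Assumes (for illustration only) that the input is exactly three words
    in subject-verb-object order.
    """
    tokens = sentence.split()
    if len(tokens) != 3:
        raise ValueError("this sketch only handles three-word S-V-O input")
    # Unknown words are passed through in brackets rather than guessed at.
    glossed = [GLOSSARY.get(t, f"[{t}]") for t in tokens]
    by_role = dict(zip("SVO", glossed))
    return " ".join(by_role[role] for role in target_order)


if __name__ == "__main__":
    print(naive_translate("mi prami do"))         # I love you
    print(naive_translate("mi prami do", "SOV"))  # I you love
    print(naive_translate("do prami mi"))         # you love I
```

The last line shows why such a baseline still needs smoothing with knowledge of the destination language: the glossary has no way to choose "me" over "I".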


Obviously, the higher-quality the machine translation, the more uses it has. I doubt that a poor-quality translation would be adequate for the proposed patent application, though I could be wrong. Also, the patent application could rely on formalizing the source text according to special-purpose rules, which would make the job easier.

  • A poor translation that takes 2 seconds could be worth something compared to a good translation which would take a week or so and cost you a bit of money. (As for limiting the domain, see the METEO system used by Canada to translate weather reports, which works flawlessly.) --jay
    • Of course; it all depends on the use.


mi'e jezrax


Created by admin. Last Modification: Friday 30 of November, 2001 12:31:04 GMT by admin.