lojgloss extraneous characters

Posted by pdf23ds on Wed 18 of Jun, 2008 11:05 GMT posts: 143

How should I handle non-lojban characters? I can strip out some that
should be superfluous, and others that I can't make any sense out of,
but should I translate some characters into cmavo? I'm thinking of
parentheses, braces, and brackets here. I could translate parentheses
into to-toi, and square brackets into sei-se'u. Any other ideas?

Should I just ignore any random non-lojban characters?

Chris Capel
--
"What is it like to be a bat? What is it like to bat a bee? What is it
like to be a bee being batted? What is it like to be a batted bee?"
-- The Mind's I (Hofstadter, Dennett)

To unsubscribe from this list, send mail to lojban-list-request@lojban.org
with the subject unsubscribe, or go to http://www.lojban.org/lsg2/, or if
you're really stuck, send mail to secretary@lojban.org for help.


Posted by Anonymous on Wed 18 of Jun, 2008 13:04 GMT

On 6/18/08, Chris Capel <pdf23ds@gmail.com> wrote:

> How should I handle non-lojban characters? I can strip out some that
> should be superfluous, and others that I can't make any sense out of,
> but should I translate some characters into cmavo? I'm thinking of
> parentheses, braces, and brackets here. I could translate parentheses
> into to-toi, and square brackets into sei-se'u. Any other ideas?

Sometimes people write "(to ..toi)", which would translate to
"to to...toi toi". That's still grammatical, but probably not what
was intended. Similarly "?" is sometimes used along with question
words but not instead of them.
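One way to avoid the doubled "to to ... toi toi" is to insert the cmavo only when the word next to the bracket is not already that cmavo. The following is a minimal sketch, not Lojgloss code; the function name `translate_parens` is hypothetical, and it assumes the cmavo is written as a separate word right inside the bracket.

```python
import re

def translate_parens(text: str) -> str:
    """Replace ( and ) with to / toi unless the writer already supplied them."""
    # Tokens are either a single parenthesis or a run of other non-space characters.
    tokens = re.findall(r"[()]|[^\s()]+", text)
    out = []
    for i, tok in enumerate(tokens):
        if tok == "(":
            following = tokens[i + 1] if i + 1 < len(tokens) else ""
            if following != "to":      # "(to" already says it; add nothing
                out.append("to")
        elif tok == ")":
            preceding = tokens[i - 1] if i > 0 else ""
            if preceding != "toi":     # "toi)" already says it; add nothing
                out.append("toi")
        else:
            out.append(tok)
    return " ".join(out)
```

With this, both "mi klama (do) le zarci" and "mi klama (to do toi) le zarci" come out with a single to/toi pair.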

";" is sometimes used for "pi'e".

> Should I just ignore any random non-lojban characters?

Probably some like "_" should be taken as a space.

mu'o mi'e xorxes


Posted by skaryzgik on Wed 18 of Jun, 2008 21:31 GMT posts: 5

On Wed, Jun 18, 2008 at 6:02 AM, Chris Capel <pdf23ds@gmail.com> wrote:

> I could translate parentheses
> into to-toi, and square brackets into sei-se'u. Any other ideas?
>

If you're doing that, you could translate something like curly braces into
tu'e-tu'u.
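Taken together with the earlier suggestions, the bracket proposals amount to a small lookup table. This is only an illustrative sketch; `BRACKET_CMAVO` and `brackets_to_cmavo` are made-up names, and it does not try to detect brackets that already accompany their cmavo.

```python
# Hypothetical table of the bracket-to-cmavo pairs proposed in this thread.
BRACKET_CMAVO = {
    "(": "to", ")": "toi",
    "[": "sei", "]": "se'u",
    "{": "tu'e", "}": "tu'u",
}

def brackets_to_cmavo(text: str) -> str:
    """Replace each bracket character with its cmavo, padded by spaces."""
    pieces = []
    for ch in text:
        if ch in BRACKET_CMAVO:
            pieces.append(f" {BRACKET_CMAVO[ch]} ")
        else:
            pieces.append(ch)
    # Collapse the extra whitespace introduced by the padding.
    return " ".join("".join(pieces).split())
```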

Curly braces remind me of something. Does there exist a program that will
translate math formulae or expressions into lojban mekso and/or vice versa?

.imu'omi'e .skaryzgik.

--
.i ko tcesi'a la .diskord.
http://skaryzgik.blogspot.com
.i mi'e la poi jitro be lo jdaca'i ku'o .skaryzgik. poi raibalralju
selsi'afanva


Posted by Eimi on Fri 20 of Jun, 2008 14:30 GMT posts: 18 United States

On Wed, 18 Jun 2008, Chris Capel wrote:

> How should I handle non-lojban characters? I can strip out some that
> should be superfluous, and others that I can't make any sense out of,
> but should I translate some characters into cmavo? I'm thinking of
> parentheses, braces, and brackets here. I could translate parentheses
> into to-toi, and square brackets into sei-se'u. Any other ideas?
>
> Should I just ignore any random non-lojban characters?

I would probably translate 0..9 to qw(no pa re ci vo mu xa ze bi so) and
probably a . between digits as a pi. Others get trickier. ; in numbers is
usually pi'e. / and - in dates are pi'e, but - is also ni'u, va'a, and vu'u,
while + is ma'u and su'i. Most of the times I've seen punctuation other than
that, it's along with the lojban word in question, like "(to ... toi)" and
quotes along with lu/li'u. I think second guessing those would probably be
more problems than it's worth.
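The unambiguous part of this (digits, and "." between digits) might be sketched as below. The function names are hypothetical, and the ambiguous characters (";", "/", "-", "+") are deliberately left alone.

```python
import re

# Index n holds the cmavo for digit n.
DIGIT_CMAVO = ["no", "pa", "re", "ci", "vo", "mu", "xa", "ze", "bi", "so"]

def _number_to_cmavo(match: re.Match) -> str:
    # Spell out one matched number; "." inside it becomes "pi".
    return "".join("pi" if ch == "." else DIGIT_CMAVO[int(ch)]
                   for ch in match.group(0))

def digits_to_cmavo(text: str) -> str:
    # The pattern only consumes a "." that has digits on both sides,
    # so a sentence-final "." after a number is left untouched.
    return re.sub(r"[0-9]+(?:\.[0-9]+)*", _number_to_cmavo, text)
```

For example, "li 3.14" becomes "li cipipavo", while the trailing period in "li 42." stays a period.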
--
Adam Lopresto <adam@wustl.edu>
System Administrator
Engineering IT, Washington University


Posted by adamgarrigus on Fri 20 of Jun, 2008 14:58 GMT posts: 92

On Wed, 18 Jun 2008, Chris Capel wrote:

> How should I handle non-lojban characters? I can strip out some that
> should be superfluous, and others that I can't make any sense out of,
> but should I translate some characters into cmavo? I'm thinking of
> parentheses, braces, and brackets here. I could translate parentheses
> into to-toi, and square brackets into sei-se'u. Any other ideas?
>
> Should I just ignore any random non-lojban characters?
>

I think it's probably best to leach this shorthand out of Lojban usage. It
feels like a natlang security blanket to me, and seems to run counter to the
principle of audiovisual isomorphism. A problem with numerals that hasn't
been brought up (at least in this thread) is: don't most non-anglophone
countries use "," where anglophones use "." and a space or "." where we use
"," (i.e. 186,282.397 == 186 282,397 == 186.282,397)? Of course, the
downside is that use of English-style numerals in Lojban text is
semi-standard & well represented in Lojban text to date, including the
instructional materials.

mu'o mi'e komfo,amonan


Posted by pdf23ds on Fri 20 of Jun, 2008 18:08 GMT posts: 143

On Fri, Jun 20, 2008 at 9:56 AM, komfo,amonan <komfoamonan@gmail.com> wrote:
> On Wed, 18 Jun 2008, Chris Capel wrote:
>
>> How should I handle non-lojban characters? I can strip out some that
>> should be superfluous, and others that I can't make any sense out of,
>> but should I translate some characters into cmavo? I'm thinking of
>> parentheses, braces, and brackets here. I could translate parentheses
>> into to-toi, and square brackets into sei-se'u. Any other ideas?
>>
>> Should I just ignore any random non-lojban characters?
>
> I think it's probably best to leach this shorthand out of Lojban usage. It
> feels like a natlang security blanket to me, and seems to run counter to the
> principle of audiovisual isomorphism.

Perhaps so, but if so a parser/glosser is probably not the place to do
it. I really want Lojgloss to be beginner friendly, so that they could
paste any lojban text into the box and see what it means. So I want to
make it as permissive as possible, at least by default. For instance,
I plan to convert "\n\>*" (i.e., e-mail quotes) in the input to spaces
so you can get glosses for quoted lojban text. On the other hand,
perhaps there's something I can do as a step afterwards to encourage
proper lojban?
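As a rough illustration of the quote-marker idea (not the actual Lojgloss code), a newline and any ">" markers that follow it can be collapsed into a single space. The function name is hypothetical, and allowing spaces between markers (as in "> > ") is an assumption beyond the pattern given above.

```python
import re

def strip_email_quotes(text: str) -> str:
    """Turn each newline, plus any run of '>' quote markers after it, into one space."""
    # (?:>[ \t]*)* matches zero or more markers, so a bare newline is covered too.
    return re.sub(r"\n(?:>[ \t]*)*", " ", text)
```

A marker at the very start of the input has no newline before it, so this sketch leaves it in place.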

> A problem with numerals that hasn't
> been brought up (at least in this thread) is: don't most non-anglophone
> countries use "," where anglophones use "." and a space or "." where we use
> "," (i.e. 186,282.397 == 186 282,397 == 186.282,397)? Of course, the
> downside is that use of English-style numerals in Lojban text is
> semi-standard & well represented in Lojban text to date, including the
> instructional materials.

Hmm. Currently the morphology parses digits as PA cmavo (except in
cmene), but basically ignores "," and ".". I'm fine with leaving it
that way, actually. It's close enough for non-standard input. I don't
care about getting non-standard things *exactly right* every time, I
just want it so they at least don't break the rest of the parse, and
if it's possible to do something more useful than not with it, then
I'd like to do it.
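A permissive fallback along these lines could simply turn anything outside the Lojban character set into a space, so that stray symbols (including "_") cannot break the rest of the parse. This is a sketch under my own assumptions about which characters to keep (Lojban letters, apostrophe, period, comma, ASCII digits); it is not how the Lojgloss morphology actually works.

```python
import re

# Anything that is not a Lojban letter, apostrophe, period, comma,
# ASCII digit, or whitespace. IGNORECASE keeps capitals used for stress.
_NOT_LOJBAN = re.compile(r"[^abcdefgijklmnoprstuvxyz'.,0-9\s]", re.IGNORECASE)

def drop_unknown_characters(text: str) -> str:
    """Replace every unrecognised character with a space, then tidy whitespace."""
    return " ".join(_NOT_LOJBAN.sub(" ", text).split())
```

Note that this would also mangle foreign text inside zoi quotes, so a real glosser would need to treat those separately.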

Chris Capel
--
"What is it like to be a bat? What is it like to bat a bee? What is it
like to be a bee being batted? What is it like to be a batted bee?"
-- The Mind's I (Hofstadter, Dennett)

