lojgloss and linebreaks

posts: 143

I'm using alice as a source of stuff to translate to test my
parser/glosser. When alice starts a new HTML paragraph, the official
parser doesn't say that it's starting a new lojban paragraph. For
that, NIhO is required. Is a NIhO implied by an HTML paragraph break?
"la nicte cadzu", on the other hand, mostly starts new text paragraphs
with {ni'o}, but there are numerous exceptions which might have some
meaning. (Half a {ni'o}?)

Conventionally, casually, I think a paragraph break in text (or a
double-linebreak in plain text) does imply a new paragraph, and I'll
probably treat it as such whether or not it's technically accurate.
But I was wondering if the convention had any formal backing.

Chris Capel
--
"What is it like to be a bat? What is it like to bat a bee? What is it
like to be a bee being batted? What is it like to be a batted bee?"
-- The Mind's I (Hofstadter, Dennett)


To unsubscribe from this list, send mail to lojban-list-request@lojban.org
with the subject unsubscribe, or go to http://www.lojban.org/lsg2/, or if
you're really stuck, send mail to secretary@lojban.org for help.

Hello Chris.

2008/6/20 Chris Capel <pdf23ds@gmail.com>:

> Conventionally, casually, I think a paragraph break in text (or a
> double-linebreak in plain text) does imply a new paragraph, and I'll
> probably treat it as such whether or not it's technically accurate.
> But I was wondering if the convention had any formal backing.

Well, although casually a paragraph break in text can sometimes really
imply a new paragraph, personally I do not like the idea of embedding
such behavior in the parser.

I believe that programs must be simple and predictable. They should
do what they were told to do. Embedding various special cases,
exceptions to rules and other things is just an easy way to mess
things up a lot.

If you really like the idea of implementing such features, maybe it
would be better to make them optional?
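
To make the "optional" idea concrete, here is a minimal sketch in Python. It is not code from lojgloss or from the official parser; the function name split_paragraphs and the flag blank_line_is_niho are made up for illustration, and real input would go through the parser rather than a whitespace split. Explicit {ni'o} and {no'i} always start a new paragraph; blank lines do so only when the caller asks for it.

```python
import re

# The cmavo of selma'o NIhO; these always start a new paragraph.
NIHO_WORDS = {"ni'o", "no'i"}


def split_paragraphs(text: str, blank_line_is_niho: bool = False) -> list[list[str]]:
    """Split text into paragraphs, each a list of words.

    Hypothetical helper: explicit NIhO words are always honoured, while
    blank lines count as paragraph breaks only if blank_line_is_niho is set.
    """
    paragraphs: list[list[str]] = [[]]
    # Blocks are runs of text separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text):
        if blank_line_is_niho and paragraphs[-1]:
            paragraphs.append([])
        for word in block.split():
            # Ignore a leading pause dot when checking for a NIhO word.
            if word.lower().lstrip(".") in NIHO_WORDS and paragraphs[-1]:
                paragraphs.append([])
            paragraphs[-1].append(word)
    return [p for p in paragraphs if p]


text = "ni'o mi klama\n\n.i do cadzu\n\nni'o ti melbi"
print(len(split_paragraphs(text)))                           # 2
print(len(split_paragraphs(text, blank_line_is_niho=True)))  # 3
```

With the flag off, formatting has no effect on the result, so the default stays predictable.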

After all, a paragraph in text may be just an element of formatting, and it
is not exactly the same as {ni'o}.

Dmitry.



posts: 3588

dei li 20 pi'e 06 pi'e 2008 la'o fy. Chris Capel .fy. cusku zoi skamyxatra.
> I'm using alice as a source of stuff to translate to test my
> parser/glosser. When alice starts a new HTML paragraph, the official
> parser doesn't say that it's starting a new lojban paragraph. For
> that, NIhO is required. Is a NIhO implied by an HTML paragraph break?
> "la nicte cadzu", on the other hand, mostly starts new text paragraphs
> with {ni'o}, but there are numerous exceptions which might have some
> meaning. (Half a {ni'o}?)
>
> Conventionally, casually, I think a paragraph break in text (or a
> double-linebreak in plain text) does imply a new paragraph, and I'll
> probably treat it as such whether or not it's technically accurate.
> But I was wondering if the convention had any formal backing.
.skamyxatra

If by "formal backing" you mean a reference in the CLL, no, I'm pretty sure
there isn't one. Lojban always requires explicitness in areas where other
languages do not, e.g. paragraph breaks, pauses, and attitude, so I would say
that an empty line in formatting, which is not an actual part of the utterance, most certainly would not indicate a new paragraph.



posts: 5
On Fri, Jun 20, 2008 at 12:46 AM, Chris Capel <pdf23ds@gmail.com> wrote:


> I'm using alice as a source of stuff to translate to test my
> parser/glosser. When alice starts a new HTML paragraph, the official
> parser doesn't say that it's starting a new lojban paragraph. For
> that, NIhO is required. Is a NIhO implied by an HTML paragraph break?
> "la nicte cadzu", on the other hand, mostly starts new text paragraphs
> with {ni'o}, but there are numerous exceptions which might have some
> meaning. (Half a {ni'o}?)
>
> Conventionally, casually, I think a paragraph break in text (or a
> double-linebreak in plain text) does imply a new paragraph, and I'll
> probably treat it as such whether or not it's technically accurate.
> But I was wondering if the convention had any formal backing.
>
> Chris Capel


Sometimes

people

write

like

this.

If someone had typed that way, for whatever pretty (or not) typesetting
reasons they had, and then copy/pasted it into the parser/glosser, and that
glosser puts in a {ni'o} at every double line break, then it would all get
treated like separate sentences, even if, as in my example, it really is
intended only to be one. It should be easy to tell, from hearing something,
how to write it, and from seeing it written, how to say it; but as far
as I know no one has said that a given pronunciation should have only one
typesetting. People might, then, have reason to put words into what might
look like paragraphs by shape, when actually they are in the same sentence.
As an example, anyway. There are probably other, less extreme things that
might be more reasonable and yet still would cause problems for a
parser/glosser that inserted {ni'o} at double line breaks or at "paragraph
breaks".

.mu'omi'e .skaryzgik.


--
.i ko tcesi'a la .diskord.
http://skaryzgik.blogspot.com
.i mi'e la poi jitro be lo jdaca'i ku'o .skaryzgik. poi raibalralju
selsi'afanva

posts: 143

On Fri, Jun 20, 2008 at 8:00 AM, Minimiscience <minimiscience@gmail.com> wrote:
> If by "formal backing" you mean a reference in the CLL, no, I'm pretty sure
> there isn't one. Lojban always requires explicitness in areas where other
> languages do not,

If Lojban requires it, then things that do not follow the rule are not
valid Lojban?

> e.g. paragraph breaks, pauses, and attitude, so I would say
> that an empty line in formatting, which is not an actual part of the utterance, most certainly would not indicate a new paragraph.

Then what about alice?

Chris Capel



posts: 143

On Fri, Jun 20, 2008 at 8:18 AM, Marjorie Scherf <skaryzgik@gmail.com> wrote:
> Sometimes
>
> people
>
> write
>
> like
>
> this.

Very rarely. Usually vertically formatted things only use single line
breaks for individual stanzas/whatnots.

> If someone had typed that way, for whatever pretty (or not) typesetting
> reasons they had, and then copy/pasted it into the parser/glosser, and that
> glosser puts in a {ni'o} at every double line break, then it would all get
> treated like separate sentences, even if, as in my example, it really is
> intended only to be one.

I almost certainly would not enter any {ni'o} into the text sent to
the parser. All my manipulations, except for changing some characters
to space, happen to the parse tree, or by modifying the parser itself.
And what I'm thinking of here is detecting double line breaks only at
the end of statements, so your text above (if it were in lojban) would
still parse as a single statement.
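
A rough sketch of that idea in Python, working on plain text rather than on the parse tree. This is only an illustration under stated assumptions: the name paragraph_breaks is hypothetical, and "end of a statement" is approximated by checking whether the next chunk begins with {.i}, {ni'o} or {no'i}, which ignores compounds such as {.ije} that a real parser would handle.

```python
import re

# Assumed heuristic: a chunk begins a new statement if its first word is
# the sentence separator {.i} or a paragraph marker {ni'o} / {no'i}.
STATEMENT_START = re.compile(r"^\.?(?:i|ni'o|no'i)(?=\s|$)", re.IGNORECASE)


def paragraph_breaks(text: str) -> list[str]:
    """Group blank-line-separated chunks into paragraphs.

    A blank line counts as a paragraph break only when the chunk after it
    begins a new statement; otherwise the chunk joins the current paragraph.
    """
    chunks = [" ".join(c.split()) for c in re.split(r"\n\s*\n", text) if c.strip()]
    paragraphs: list[str] = []
    for chunk in chunks:
        if paragraphs and not STATEMENT_START.match(chunk):
            # The break falls mid-statement, so treat it as formatting only.
            paragraphs[-1] += " " + chunk
        else:
            paragraphs.append(chunk)
    return paragraphs


print(paragraph_breaks("mi\n\nklama\n\nle zarci\n\n.i do cadzu"))
# ['mi klama le zarci', '.i do cadzu']
```

The vertically spaced words stay together as one statement, and only the break before {.i} is treated as a paragraph boundary.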

Chris Capel



>> e.g. paragraph breaks, pauses, and attitude, so I would say
>> that an empty line in formatting, which is not an actual part of the utterance, most certainly would not indicate a new paragraph.
>
> Then what about alice?

Lojban is somewhat similar to programming languages, so how about this
simile: when programming in C++, it is up to you whether you indent
your code or not. Indentation does not affect the parsability of the
code; it only helps humans read it.

I think the situation in Lojban is mostly the same. Paragraphs are
defined by ni'o, and it is up to the author whether or not to reflect
the internal structure of the text by properly inserting line breaks,
paragraphs, tabulation, etc.

Or maybe the author wishes to emphasize some part of a (logical)
paragraph using newlines.

In other words, I think that the logical structure of a text, determined
by Lojban grammar, should not be mixed with its formatting.


PS Though, speaking of C++, there is u'i Python. (-;
Dmitry.



posts: 143
On Fri, Jun 20, 2008 at 12:46 AM, Chris Capel <pdf23ds@gmail.com> wrote:

> I'm using alice as a source of stuff to translate to test my
> parser/glosser. When alice starts a new HTML paragraph, the official
> parser doesn't say that it's starting a new lojban paragraph. For
> that, NIhO is required. Is a NIhO implied by an HTML paragraph break?
> "la nicte cadzu", on the other hand, mostly starts new text paragraphs
> with {ni'o}, but there are numerous exceptions which might have some
> meaning. (Half a {ni'o}?)
>
> Conventionally, casually, I think a paragraph break in text (or a
> double-linebreak in plain text) does imply a new paragraph, and I'll
> probably treat it as such whether or not it's technically accurate.
> But I was wondering if the convention had any formal backing.

I think that, really, what I'm confused about here is the meaning of
{ni'o}, which really isn't something you can blame me too much for,
because the grammar's production for statements between {ni'o} is
"paragraph", which is really quite misleading. A paragraph is a
*visual* unit in written text used to denote a group of statements on
the same topic, *or* used to visually break up a long stream of text
on one topic. So that production really shouldn't be called
"paragraph". Perhaps "topic" would work, but I'm not sure that's
totally accurate either. Then again, production names that are English
words aren't always totally accurate in their connotations.

Chris Capel

