<empty>, when child of <sequence> or <alternate>, not correctly processed #780

@sydb

See #263; despite being closed, I do not think the problem identified by @lb42 has been solved.

The <empty> element, as a member of model.contentPart, is permitted only as a child of <content>, <sequence>, or <alternate> (and, after TEIC/TEI#2538 is merged, <interleave>).

content/empty

I do not believe there is any controversy when <empty> is a child of <content> — this has been exercised by both the Guidelines and lots of people’s ODDs dozens or hundreds of times. Since <content> is required to have 1 and only 1 child, there is no sibling rivalry between <empty> and its siblings, as it has none.

alternate/empty

But when <empty> is a child of <alternate>, its sibling seems to just beat it to death:

 <alternate minOccurs="1" maxOccurs="1">
   <empty/>
   <elementRef key="add" minOccurs="1" maxOccurs="1"/>
 </alternate>

should produce an optional <add> — either ( empty | add ) or ( add )? or ( add? ) or perhaps even ( add | empty ). But what it actually produces is just ( add ), i.e. a required <add>, not an optional <add>. (Yes, I realize the correct effect can be obtained by using the much simpler <elementRef key="add" minOccurs="0" maxOccurs="1"/>, but that’s not the point.)
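To make the difference concrete, here is a minimal sketch that stands in a regular expression for each content model. It is only an illustration: the token `a` (one `<add>` child), the names `EXPECTED_ALT` and `ACTUAL_ALT`, and the helper `accepts_alt` are all assumptions of this sketch, not anything produced by the Stylesheets.

```python
import re

# Stand-in token (an assumption for illustration): "a" = one <add> child.
EXPECTED_ALT = re.compile(r"(?:|a)")  # ( empty | add ): zero or one <add>
ACTUAL_ALT = re.compile(r"a")         # ( add ): exactly one <add>


def accepts_alt(model, children):
    """Return True if the whole run of child tokens matches the model."""
    return model.fullmatch(children) is not None


# Compare the two models on no children, one <add>, and two <add>s.
for children in ["", "a", "aa"]:
    print(repr(children),
          "expected:", accepts_alt(EXPECTED_ALT, children),
          "actual:", accepts_alt(ACTUAL_ALT, children))
```

The only input on which the two models disagree is the empty one: the expected model accepts a parent with no `<add>`, and the generated model rejects it.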

sequence/empty

I have discovered at least one circumstance for which incorrect output is generated when <empty> is a child of <sequence>. Consider the following PureODD construction. While admittedly a bit off the beaten track, the intent is for a content model that allows either 0 <docDate> elements or 2 or more <docDate> elements — i.e., any number of <docDate>s except one; furthermore, if there are any <docDate>s there can also be global stuff with them.

 <content>
   <sequence minOccurs="0" maxOccurs="1">
     <empty/>
     <sequence minOccurs="2" maxOccurs="unbounded">
       <elementRef key="docDate" minOccurs="1" maxOccurs="1"/>
       <classRef key="model.global" minOccurs="0" maxOccurs="unbounded"/>
     </sequence>
   </sequence>
 </content>

However, the RELAX NG produced by these Stylesheets seems to completely lose the outer <sequence minOccurs="0" maxOccurs="1"> clause. I.e., the <empty> seems to commit not only suicide (which was expected), but parricide as well:

 (
   docDate, model.global*,
   docDate, model.global*,
   ( docDate, model.global* )*
 )

If the outer <sequence> is changed to an <alternate>, the correct model is generated.
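The same kind of sketch shows what is lost in the sequence case. Again this is illustrative only: the tokens `d` (one `<docDate>`) and `g` (one member of model.global), the names `INTENDED_SEQ` and `GENERATED_SEQ`, and the helper `accepts_seq` are assumptions made for this example.

```python
import re

# Stand-in tokens (assumptions for illustration only):
#   "d" = one <docDate>, "g" = one member of model.global
INTENDED_SEQ = re.compile(r"(?:(?:dg*){2,})?")  # outer sequence is optional
GENERATED_SEQ = re.compile(r"(?:dg*){2,}")      # outer sequence has been lost


def accepts_seq(model, children):
    """Return True if the whole run of child tokens matches the model."""
    return model.fullmatch(children) is not None


# Compare the two models on a handful of child sequences.
for children in ["", "d", "dd", "dgdgg", "g"]:
    print(repr(children),
          "intended:", accepts_seq(INTENDED_SEQ, children),
          "generated:", accepts_seq(GENERATED_SEQ, children))
```

Both models reject a single `<docDate>` and accept two or more, but only the intended model accepts zero `<docDate>` elements, which is the case the lost outer `<sequence>` was meant to allow.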

It is possible these two problems are related, although I think it unlikely. (So it may be more convenient to split this into two issues.)

I plan to post an ODD that demonstrates these situations shortly.
