Topic: BeerXML 2.0 schema proposal

beerfan

BeerXML 2.0 schema proposal
« on: May 14, 2006, 06:06:45 PM »
The original BeerXML specification has some serious limitations that hinder widespread acceptance, so I propose designing a new specification that addresses these limitations and is flexible enough to serve the widest possible audience. Below I describe the limitations and issues I see with the original specification. Readers who wish to participate in this discussion may want to familiarize themselves with the relevant XML specifications and terminology first.

XML 1.0
XML Schema Part 0: Primer Second Edition
XML Schema Part 1: Structures Second Edition
XML Schema Part 2: Datatypes Second Edition
Namespaces in XML 1.0
XML Path Language (XPath)
XSL Transformations (XSLT)

First, a major limitation of the specification is that element attributes are disallowed. This choice was apparently made so that rudimentary custom parsers could easily read and create BeerXML files. The issue is that XML without attributes is ML, not XML. Extensibility in XML is provided by schemas and namespaces, both of which are declared using attributes. The following simple example shows a schema being specified for an element.

<RECIPES xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="beerxml.xsd">
  <RECIPE>
  ...
  </RECIPE>
</RECIPES>

Aiming for a low barrier to adopting the specification is an important goal, but limiting the specification to ML features that even the most basic text parser can understand defeats the purpose of using XML at all. If custom parsers are a requirement for some stakeholders, then I respectfully suggest that the original specification be renamed BeerML to clarify its purpose and allow a real XML specification to be developed.

Second, the specification was proposed without an XML schema, or even a document type definition. This severely limits the use and interoperability of documents produced and, in effect, the possibility that BeerXML could be used as a standard. For example, it is impossible to determine if a document is valid without parsing it and possibly encountering an error in the process. It is also impossible to create a document that is verifiably compliant with the specification.
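To make the well-formed versus valid distinction concrete, here is a minimal Python sketch using only the standard library's xml.etree.ElementTree. The sample document, its BREWER_SHOE_SIZE element, and the REQUIRED set are all invented for illustration and are not taken from the spec. A non-validating parser accepts the document without complaint, so every program is left to hand-write checks like the one below.

```python
import xml.etree.ElementTree as ET

# Invented sample: well-formed XML, but with a nonsense element and no NAME.
DOC = """<RECIPES>
  <RECIPE>
    <BREWER_SHOE_SIZE>11</BREWER_SHOE_SIZE>
  </RECIPE>
</RECIPES>"""

# Hypothetical list of children an application might insist on (an assumption).
REQUIRED = {"NAME"}


def missing_children(recipe):
    """Return the required child names that the recipe element lacks."""
    present = {child.tag for child in recipe}
    return sorted(REQUIRED - present)


root = ET.fromstring(DOC)      # succeeds: the document is well-formed
recipe = root.find("RECIPE")
problems = missing_children(recipe)
print(problems)
```

With a published schema, the same judgment could be made by any validating XML tool rather than by code that each program writes differently.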

Third, the specification defines elements of dubious relevance instead of allowing vendors to extend the specification (using namespaces). For example, "STYLE" contains elements that describe nearly every statistical range defining the beer style. This information is not essential to brewing the recipe and would be better described in an extended schema. Other elements, like "DISPLAY_AMOUNT", are simply translations of data defined in other elements.

With a minimal recipe schema, a vendor could include proprietary elements by declaring its own schema namespace, for example:

<RECIPES xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://www.beerxml.com/schema/2.0/beerxml.xsd">
  <RECIPE>
    ...
    <STYLE>
      <NAME>...</NAME>
      <STYLESTATS xmlns:bs="http://www.beersmith.com/beerxml-style.xsd">
        <bs:History>
          ...
        </bs:History>
      </STYLESTATS>
    </STYLE>
  </RECIPE>
</RECIPES>
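As a rough sketch of how a consuming program might treat such extensions, the Python below separates core elements from vendor elements by namespace using the standard library's xml.etree.ElementTree. The namespace URIs, the camelCase element names, and the sample content are assumptions made up for this example and do not come from any published schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, chosen only for this example.
CORE_NS = "http://example.org/beerxml/2.0"
VENDOR_NS = "http://example.org/vendor/style-stats"

DOC = f"""<Recipes xmlns="{CORE_NS}" xmlns:v="{VENDOR_NS}">
  <Recipe>
    <Name>Example Pale Ale</Name>
    <Style>
      <Name>Pale Ale</Name>
      <v:History>Vendor-specific notes</v:History>
    </Style>
  </Recipe>
</Recipes>"""


def split_by_namespace(element):
    """Separate an element's children into core and extension elements."""
    core, extension = [], []
    for child in element:
        # ElementTree spells qualified names as "{namespace-uri}localname".
        if child.tag.startswith("{" + CORE_NS + "}"):
            core.append(child)
        else:
            extension.append(child)
    return core, extension


root = ET.fromstring(DOC)
style = root.find("b:Recipe/b:Style", {"b": CORE_NS})
core, extension = split_by_namespace(style)
core_names = [el.tag.split("}")[1] for el in core]
extension_names = [el.tag.split("}")[1] for el in extension]
print(core_names, extension_names)
```

A program that knows nothing about the vendor namespace can simply skip the extension list and still read the core recipe.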

Fourth, the specification defines element names which are duplicated at different levels and may, in some cases, contain conflicting data types. Element structure is also poorly organized, with a large number of elements placed directly under the RECIPE element instead of being grouped logically (e.g., Metadata, Ingredients, Process, ExpectedResults).

Fifth, the specification provides only limited support for localization. It mandates an XML declaration that specifies a character encoding of "ISO-8859-1", which limits documents to a subset of Latin-based characters. Mandating a specific character encoding is counter-productive, but if the specification must suggest one then "UTF-8" would be a better choice. Further, some authors may wish to provide some elements (e.g., notes, instructions) in multiple languages. Because there is no schema, it is unclear whether this is currently allowed. Many units are mandated to be metric, which may be more restrictive than necessary. Also, because attributes are disallowed, some elements contain a string that includes the units (e.g., "4 lb") instead of an integer or decimal number.
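A small Python sketch of what attributes would buy here, using only the standard library. The Hop, Amount, and Notes element names and the unit attribute are hypothetical. The xml:lang attribute is the standard XML mechanism for marking the language of an element's content.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical 2.0-style fragment; element and attribute names are assumptions.
DOC = """<Hop>
  <Amount unit="oz">1.5</Amount>
  <Notes xml:lang="en">Add at the end of the boil.</Notes>
  <Notes xml:lang="de">Am Ende des Kochens zugeben.</Notes>
</Hop>"""

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

hop = ET.fromstring(DOC)
amount_el = hop.find("Amount")
amount = Decimal(amount_el.text)   # plain number, no need to pick apart "1.5 oz"
unit = amount_el.get("unit")
notes = {n.get(XML_LANG): n.text for n in hop.findall("Notes")}
print(amount, unit, sorted(notes))
```

Because the number and the unit are stored separately, a schema could type the element content as a decimal and restrict the unit attribute to an agreed list.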

Finally, the naming convention chosen breaks with most industry recommendations. For reference the URL below lists a number of international organizations and standards bodies which have standardized on the use of camelCase for element names.

http://xml.coverpages.org/camelCase.html

Specifically, the use of UpperCamelCase is recommended for element names and lowerCamelCase for attribute names. The use of punctuation and non-alpha characters (e.g., "_") in element names is discouraged. While every schema implementor is free to choose their own conventions based on specific need, it seems counter-productive to break with industry practice.
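For a sense of what the renaming would look like in practice, here is a minimal Python sketch that maps the existing UPPER_SNAKE_CASE names to the two camelCase forms. The helper names upper_camel and lower_camel are made up for this example, and abbreviations would need hand-picked names since this naive rule turns "OG" into "Og".

```python
def upper_camel(name: str) -> str:
    """Convert an UPPER_SNAKE_CASE element name to UpperCamelCase."""
    return "".join(part.capitalize() for part in name.split("_") if part)


def lower_camel(name: str) -> str:
    """Convert an UPPER_SNAKE_CASE name to lowerCamelCase (for attributes)."""
    camel = upper_camel(name)
    return camel[:1].lower() + camel[1:]


for old in ("RECIPE", "STYLE", "DISPLAY_AMOUNT"):
    print(old, "->", upper_camel(old), "/", lower_camel(old))
```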

For these reasons, I propose to design a BeerXML schema from scratch that does not attempt compatibility with the original specification. If the community approves this proposal, I will outline some additional recommendations and seek further feedback on the requirements of the specification, on whether there are multiple audiences within the community, and on any additional concerns with the existing specification.

--
Chris Cook
« Last Edit: May 15, 2006, 11:24:04 AM by beerfan »

BeerSmith

Re: BeerXML 2.0 schema proposal
« Reply #1 on: May 14, 2006, 08:22:40 PM »
Chris,
  I agree that the original spec has a number of limitations, but I question throwing the whole thing out and starting over again.  Quite a bit of work went into defining the correct data elements that were common to the various brewing programs.  Achieving consensus was not a small thing.

  I would personally prefer we work within the current framework to address the ills you mention.  The problems you list are primarily stylistic and can be remedied without throwing the whole thing out and starting over again.

  The original goals were rather limited:
    - Allowing programs to exchange data using a common format
    - Allowing data to be displayed (thus the large number of "Display" items) using style sheets

  Examples:
   1. Attributes - can easily be added
   2. With some modifications, a DTD and schema can be built
   3. Styles were included with some forethought - in fact the BJCP released a new style guide just as we released BeerXML - it is important to know which guide and style data were used for a particular recipe.
   4. Elements could be restructured if needed...though I'm not sure what is really to be gained...
   5. Again, the localization line can easily be changed.
   6. Naming may be non-standard, but does this really justify throwing the whole thing out and starting over??

  It was a monumental effort to come up with a common set of data - an effort that included the major beer software developers at that time.  I don't think developing a schema requires that we reopen that whole discussion again.   I would prefer to use the existing standard as a starting point and then adjust as needed to make it more schema friendly.

Cheers!
Brad
BeerSmith Rocks - www.beersmith.com

tsujigiri

  • Guest
Re: BeerXML 2.0 schema proposal
« Reply #2 on: May 14, 2006, 10:20:55 PM »
Hi,
I don't see why we can't work with the existing specification. Unless there are major structural changes to be made, a lot of these points could be added or modified into what already exists.

I definitely think:
1. Attributes should be included
2. We need a schema to validate against right from the start
3. Get rid of the caps

These are easy to fix

beerfan

Re: BeerXML 2.0 schema proposal
« Reply #3 on: May 15, 2006, 12:05:02 PM »
I didn't mean to suggest that we should throw away all the work done so far and start over gathering requirements from the beginning. The existing knowledge and specification will be invaluable for moving forward.

That being said, very few of the limitations that I described above can be addressed in the existing specification without making it incompatible. Attributes and new element names could be added, but element name changes or any sort of reorganization would render it incompatible. Furthermore, backwards compatibility is more of a curse than a blessing. The need for web browsers to render every compliant and non-compliant document in every version of HTML, and the resultant chaos, is the reason that the HTML specification has been abandoned in favor of XML.

I also hear your unvoiced concern that adopting a significantly altered specification will require much work to implement. This may be the case but it is not in the best interest of the specification or the community to retard needed change or growth because some implementors have no desire to change. Use of the specification is not compulsory in any case. If change is embraced there are measures that can be taken to minimize the impact on software (e.g., abstract markup from code, use XSLT to produce input and output).
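One way to read "abstract markup from code" is sketched below in Python, using only the standard library: the rest of the program works with a plain Recipe object, and each file format gets its own small reader. The upper-case names follow the examples earlier in this thread, while the camelCase 2.0 layout is purely hypothetical. An XSLT stylesheet could do the same translation outside the program, but that is not shown here because the Python standard library has no XSLT processor.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List


@dataclass
class Recipe:
    """In-memory model that the rest of the program works with."""
    name: str


def read_v1(xml_text: str) -> List[Recipe]:
    """Reader for the upper-case element names used earlier in this thread."""
    root = ET.fromstring(xml_text)
    return [Recipe(name=r.findtext("NAME", default="")) for r in root.iter("RECIPE")]


def read_v2(xml_text: str) -> List[Recipe]:
    """Reader for a hypothetical camelCase 2.0 layout (names are assumptions)."""
    root = ET.fromstring(xml_text)
    return [Recipe(name=r.findtext("Name", default="")) for r in root.iter("Recipe")]


# Map each known root element name to the reader that understands it.
READERS = {"RECIPES": read_v1, "Recipes": read_v2}


def load(xml_text: str) -> List[Recipe]:
    """Pick a reader from the root element name and return neutral objects."""
    root_tag = ET.fromstring(xml_text).tag
    return READERS[root_tag](xml_text)


old = load("<RECIPES><RECIPE><NAME>Stout</NAME></RECIPE></RECIPES>")
new = load("<Recipes><Recipe><Name>Stout</Name></Recipe></Recipes>")
print(old == new)
```

With this split, supporting a revised format means adding one reader rather than touching the calculation or display code.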

There is a gamut of opinion on how much change is needed or wanted, and it is likely that many stakeholders have not yet voiced an opinion. For my part, I do not produce beer recipe software and so consider myself unbiased. Nevertheless, I have offered to help improve the specification, but I have little interest in adding minor tweaks to a specification which, in my opinion, needs more significant change.

Brandybuck

  • Guest
Re: BeerXML 2.0 schema proposal
« Reply #4 on: May 15, 2006, 08:05:31 PM »
1) I am in favor of a schema for the purposes of standardization. But I really don't know how to use a schema. I am a C++ developer, not a web developer, and a lot of this is a foreign language. I am using XML as a file format, nothing more. I do want a human readable document to go along with the schema. I do NOT want to have to parse a schema before I read in a recipe.

2) If we're going to break compatibility, we might as well start over from scratch. What good is a backward compatible version that isn't backward compatible?

strangebrewer

Re: BeerXML 2.0 schema proposal
« Reply #5 on: May 16, 2006, 04:47:43 PM »
Quote from: BeerSmith
  The original goals were rather limited:
    - Allowing programs to exchange data using a common format
    - Allowing data to be displayed (thus the large number of "Display" items) using style sheets

I was one of the original developers who advocated the "display" elements, because I was using them in my software.  At the time I didn't really understand XSL very well, and I now realize that the "display" part could be handled there.

Anyway, it seems to me we could agree on several modifications to the current BeerXML definition as a first step:
- allow attributes, and fold some current elements into attributes (units, for example)
- require a schema
- adopt camel case

I'd like to propose that we add an attribute to the root element and Recipe element to identify them as BeerXML.
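A minimal Python sketch of how a reader might use such a marker, standard library only. The attribute name beerXmlVersion and its value are invented here, since the thread has not settled on one. A namespace declaration on the root element would serve much the same purpose.

```python
import xml.etree.ElementTree as ET

# Hypothetical attribute name; nothing here is part of any published spec.
FORMAT_ATTR = "beerXmlVersion"


def beerxml_version(xml_text):
    """Return the declared version string, or None if the root is not marked."""
    root = ET.fromstring(xml_text)
    return root.get(FORMAT_ATTR)


marked = beerxml_version('<Recipes beerXmlVersion="2.0"><Recipe/></Recipes>')
unmarked = beerxml_version("<Recipes><Recipe/></Recipes>")
print(marked, unmarked)
```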

I'd also like to propose that BeerXML adopts the BJCP style XML for the style portion.

Drew

AntonW

Re: BeerXML 2.0 schema proposal
« Reply #6 on: May 16, 2006, 05:37:46 PM »
Hi,

I've generated some preliminary XSDs based on what I could glean from the standard as it is today.  Unfortunately I didn't see this forum and posted them here:
http://beerxml.com/forum/index.php?topic=12.0

Please take a look at them and provide feedback.

Thanks,
-Anton

AntonW

Re: BeerXML 2.0 schema proposal
« Reply #7 on: May 22, 2006, 11:28:36 PM »
The forum wouldn't store the files, so I placed them in a zip here:

http://www.speakeasy.org/~antonw/beer_xml/beer_xsds_0_01.zip

I still need to roll the updates from Brad's feedback into XSDs, so look under that same directory to find any updates.

Cheers!
-Anton

AntonW

Re: BeerXML 2.0 schema proposal
« Reply #8 on: June 16, 2006, 02:15:11 AM »
Has anybody had a chance to review the preliminary *.xsds I submitted for version #1 of the beer xml?

Any and all criticisms/critiques are welcome!


I'm also thinking that version #1 of the standard is overspecified, and am leaning toward factoring out everything that isn't needed to create and share a beer recipe.  Anybody have thoughts on this?

-Anton

beerfan

Re: BeerXML 2.0 schema proposal
« Reply #9 on: June 16, 2006, 11:32:35 AM »
Quote from: AntonW
Has anybody had a chance to review the preliminary *.xsds I submitted for version #1 of the beer xml?

Any and all criticisms/critiques are welcome!

I have not had a chance to work on this project lately so I cannot comment on your schema files. I apologize.

Quote from: AntonW
I'm also thinking that version #1 of the standard is overspecified, and am leaning toward factoring out everything that isn't needed to create and share a beer recipe.  Anybody have thoughts on this?

I quite agree with you here. For a 2.0 spec I would recommend defining a minimal set of core elements (which will be quicker and easier) and then allowing extended elements to be specified in optional schemas or in vendor-specific schemas using other namespaces. However, this route requires some extra work (and XML knowledge) from implementors.

bperetto

  • Guest
Re: BeerXML 2.0 schema proposal
« Reply #10 on: July 06, 2006, 08:59:08 AM »
First, I don't consider myself a programmer so forgive my ignorance.

Those who know me, know how much I love to play Devil's Advocate (not the pinball game) so don't take offense at my rant...

In the limitations, I see a lot of talk about schemas and specifications and elements. But, really, what ARE the REAL limitations? What can't users do? What can't programmers do?

Like I said, I'm no programmer, but I did write a recipe calculator/database with minimal help and not too many headaches and swearing. I put in my own validation code and tried editing the XML files to "break" my database.
« Last Edit: July 06, 2006, 09:07:50 AM by bperetto »

AntonW

Re: BeerXML 2.0 schema proposal
« Reply #11 on: July 06, 2006, 03:00:50 PM »
The three most apparent limitations of the current beer xml are:

The beerxml 1.0 standard was assembled in a way that makes anything beyond the most basic validation impossible, even with the beerxml XSDs that I generated.  This leads to a lot of programs doing their own validation, which is *guaranteed* to fail miserably and goes directly against the idea of standardization.  The same people who developed XML created XML Schema Definitions (XSDs for short), which let a program verify the syntax of the value of each element.  Since there is already a well-defined mechanism for handling validation of XML files, we'd like to update the beerxml standard for version 2.0 so that the new files can be more thoroughly verified by the accompanying XSDs.

beerxml 1.0 recipes carry the full set of fields defined for every one of their supporting elements.  For example, in creating a recipe a brewer can add data in the recipe hops section giving the amounts of humulene, caryophyllene, cohumulone and myrcene of the hops used in the recipe.  Why do I have the opportunity to add cruft that only 1 in 100000 (or fewer) brewers care about when the form of the hops used in the recipe isn't even required?  For brewing a recipe I generally note the type of hops, the alpha acid units, the form of the hops, the amount (mass) and when they were added to the boil, because that's all the data necessary to predict bitterness and to reproduce the recipe.  If the brewer knows the type of hops being added to the boil, then they can easily cross-reference their hops records to find the values of humulene, caryophyllene, cohumulone and myrcene for a given hop.  And it's not just the hops section: all of a recipe's supporting elements have extraneous values which are unnecessary for making a batch of beer.

beerxml 1.0 contains many duplicated elements (mostly to display quantities in non-normalized units).  Duplicated elements always lead to the question: which instance is correct?  Semantically, if the two values disagree, which value should your program use to calculate the quantity of ingredient needed in the recipe?  The easy fix is to remove the duplicated information and figure out a better way of representing the data so that duplication is unnecessary.
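The disagreement problem can be sketched in a few lines of Python, standard library only. The sample values are invented, the sketch assumes AMOUNT is expressed in kilograms, and the display string is assumed to have the simple "number unit" shape quoted earlier in the thread ("4 lb").

```python
import xml.etree.ElementTree as ET

# Conversion factors to kilograms for the units this sketch understands.
KG_PER_UNIT = {"kg": 1.0, "g": 0.001, "lb": 0.45359237, "oz": 0.028349523125}

# Invented sample in which the two representations disagree
# (0.0283 kg is about one ounce, not two).
DOC = """<HOP>
  <AMOUNT>0.0283</AMOUNT>
  <DISPLAY_AMOUNT>2 oz</DISPLAY_AMOUNT>
</HOP>"""


def display_to_kg(text):
    """Parse a string such as "2 oz" into kilograms."""
    value, unit = text.split()
    return float(value) * KG_PER_UNIT[unit]


def amounts_agree(hop, tolerance=0.01):
    """True when AMOUNT and DISPLAY_AMOUNT match within a relative tolerance."""
    amount_kg = float(hop.findtext("AMOUNT"))
    display_kg = display_to_kg(hop.findtext("DISPLAY_AMOUNT"))
    return abs(amount_kg - display_kg) <= tolerance * max(amount_kg, display_kg)


hop = ET.fromstring(DOC)
print(amounts_agree(hop))
```

Detecting the mismatch is easy. Deciding which value the brewer meant is not, which is the argument for storing each quantity once.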

You asked earlier what can't users/programmers do?

Right now users can't exchange beerxml recipes and be guaranteed that those recipes that work for your program will also work with my program.  If those recipes were validated against beerxml XSDs, there would be a greater chance our recipes would work interchangeably.

The other two recommendations are simplifying what exists in the standard and making it easier for programmers to implement the correct functionality, and easier for users to enter data.

-Anton

bperetto

  • Guest
Re: BeerXML 2.0 schema proposal
« Reply #12 on: July 06, 2006, 04:00:22 PM »
I wouldn't say validation is guaranteed to fail. I'm content with mine and I don't have a clue as to what I'm doing.  ;)  I'm not against an XSD as long as it doesn't modify the XML itself (other than the header stuff).

As for the hops, I agree that a lot of that info is usually irrelevant (or even unknown) to most users. I don't remember if they're required or not, but a zero works fine for me.  Most of the time (in my case, for one) people are going to be picking from a list of ingredients such as hops which will already have that data in there. If I had to enter the name, color, extract, yadda yadda for each grain in my recipe, I'd switch to pen and paper.

I agree with the display elements- they're useless. Like the style tags, I just ignore them in my DB. It doesn't bother me that they're in there.

If people can't consistently exchange beerxml files among different programs, I'm sorry, but I have to blame the programmer. I've had no problem importing from and exporting to various programs including BeerSmith (thanks to Brad for providing me a copy for testing just that).  The v1 standards guide that was posted seemed very specific about what could and couldn't be done, and that's what I based my imports/exports on.

Again, just my opinions...

AntonW

Re: BeerXML 2.0 schema proposal
« Reply #13 on: July 06, 2006, 05:19:11 PM »
No problem, you have legitimate concerns.  The beerxml 2.0 standard should share roughly 80% or more of its content with the beerxml 1.0 standard.  We're just trying to add a few refinements that make the new standard both easier and more consistent to use.  So far we've talked a lot about some of those refinements, but I haven't had a block of time to dig in and factor those ideas back into some beerxml 2.0 XSDs and sample XML to distribute to everybody.

If program A conforms to the standard, and program B conforms to the standard but their recipes are not compatible - who do you blame?  If tomorrow ProMash added beerxml 1.0 support and their recipes didn't work with your program/database, who's the bad programmer?  Now consider if BeerSmith and ProMash could both exchange recipes successfully, and you could exchange recipes with BeerSmith successfully...  See the headache that forward compatibility causes?  ;D

-Anton

bperetto

  • Guest
Re: BeerXML 2.0 schema proposal
« Reply #14 on: July 07, 2006, 03:50:45 PM »
Quote from: AntonW
If tomorrow ProMash added beerxml 1.0 support and their recipes didn't work with your program/database, who's the bad programmer?
« Last Edit: July 07, 2006, 03:55:24 PM by bperetto »