What does the BOMP format look like?

I want to build Ralf's German medical dictionary with terminal information (just noun/adjective). I found a GPL word list that I am planning to use. The word list is split into several files, which makes it easy to distinguish between nouns and adjectives.

Because I wanted to know what the Hadifix/BOMP format looks like, I used the simon import function to get this dictionary. Unfortunately, I couldn’t look at the source of the dictionary, because simon imported it automatically.

By the way, the BOMP dictionary (non-free license) that simon imported on my computer contains about 144k words. If you import Ralf’s German dictionary (version 0.1.4), you get more words, but without terminal information. And of course, there are errors that I am planning to correct. So the BOMP dictionary is currently probably the best dictionary.

How could I add terminal information to Ralf’s dictionary? There are two possible solutions:

1. Convert the PLS dictionary into BOMP format (with an XSLT stylesheet), then import it into simon as a Hadifix dictionary.
2. Add a <terminal> tag to the PLS dictionary. I think this wouldn’t conform to the PLS standard, and I want to use existing standards.
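
To make the first option more concrete, here is a minimal sketch in Python (instead of XSLT) that reads a PLS lexicon and writes one entry per line. It assumes a line layout of word, terminal and transcription separated by spaces; that layout, the terminal labels "NOM" and "ADJ", the function name pls_to_lines and the sample entries are all assumptions made for illustration, not the confirmed BOMP format. The phoneme strings are copied unchanged, so converting between phoneme alphabets would still be a separate step.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C PLS 1.0 recommendation.
PLS_NS = {"pls": "http://www.w3.org/2005/01/pronunciation-lexicon"}

# Made-up sample lexicon, only for illustration.
SAMPLE = """<lexicon version="1.0" alphabet="x-sampa" xml:lang="de"
    xmlns="http://www.w3.org/2005/01/pronunciation-lexicon">
  <lexeme><grapheme>Aal</grapheme><phoneme>?a:l</phoneme></lexeme>
  <lexeme><grapheme>akut</grapheme><phoneme>a"ku:t</phoneme></lexeme>
</lexicon>"""


def pls_to_lines(pls_xml, terminals, default_terminal="Unknown"):
    """Yield 'word TERMINAL transcription' lines (assumed layout).

    terminals maps a word to its terminal label, e.g. built from the
    separate noun and adjective files of a word list.
    """
    root = ET.fromstring(pls_xml)
    for lexeme in root.findall("pls:lexeme", PLS_NS):
        words = [g.text.strip()
                 for g in lexeme.findall("pls:grapheme", PLS_NS) if g.text]
        phonemes = [p.text.strip()
                    for p in lexeme.findall("pls:phoneme", PLS_NS) if p.text]
        for word in words:
            terminal = terminals.get(word, default_terminal)
            for phoneme in phonemes:
                # Transcription is copied as-is; no alphabet conversion here.
                yield f"{word} {terminal} {phoneme}"


if __name__ == "__main__":
    # Hypothetical terminal labels taken from a noun/adjective word list.
    for line in pls_to_lines(SAMPLE, {"Aal": "NOM", "akut": "ADJ"}):
        print(line)
```

Words missing from the terminal mapping get the placeholder label "Unknown", so they are easy to find and fix afterwards.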

The easiest way would be if I could take a look at the source code of the BOMP dictionary.

One Response to “What does the BOMP format look like?”

  1. Peter Grasch says:

    The BOMP version we use actually uses the HADIFIX standard. It looks like this:
    Aal NOM ‘?a:l|

    (A one-line excerpt should be covered by fair use.)

    Greetings,
    Peter