Please note that this page is no longer being updated! Ontology resources, concepts, tools and HTML of the ontology are available from the OWG home page. RDFS and DAML files are available at mged.sourceforge.net (links provided from OWG home page).
MGED Ontology Working Group - Building an Ontology

This page sets out the base concepts to be structured in the ontology and gives pointers to various editing tools for that purpose. Use cases and scenarios are also provided to motivate the ontology. The ontology, as a work in progress, is provided in a simple text representation, in HTML, in DAML, and in RDF schema. Note: RDFS and DAML files will be maintained in CVS at http://mged.sf.net. Please go there for further updates of these files. An up-to-date HTML file will be maintained at this site.


Use Cases/ Scenarios:

Concepts:

Listed are concepts to be structured in an ontology. Most are defined, and I will continue to add to the list and define the rest. Usage of terms such as "biomaterial" and "biosource" comes from the OMG MAGE-OM submission. I see our goal as extending the standard that they have made and modifying it only when absolutely necessary. I have changed "cell source and type" to "biosource provider" because of confusion about the term. MAGE (the combination of MAML and GEML) is guided by MIAME (see the MAGE and MIAME figures at the end of this section). Some of these concepts, such as "organism", "organism part", and "disease state", are references to external controlled vocabularies/ontologies. Established ontologies such as the NCBI taxonomy should be used whenever possible. Links to the NCBI taxonomy browser and to model organism databases such as FlyBase, MGD, SGD, TAIR, and WormBase are available in the ontology resources (on the main OWG page). Thanks to B. Aronow, U. Cincinnati, and M. Ashburner, Cambridge U., for suggestions.

Biomaterial: The source of the nucleic acid used to generate labelled material for the microarray experiment.

Biosource: The primary source of the nucleic acid used to generate labelled material for the microarray experiment.

Biosample: The biosource after any treatment.

Labeled Extract: The biosample after labeling for detection of the nucleic acids.

Organism: The genus and species (and subspecies) of the organism from which the biomaterial is derived.

Biosource provider: The resource (e.g., company, hospital, geographical location) used to obtain or purchase the biomaterial.

Sex: The gender of the organism or the reproductive organs present on the organism (prior to any modification) that the biomaterial is derived from.
Term     Definition
male     The organism contains only the reproductive organs that produce male gametes (spermatozoa).
female   The organism contains only the reproductive organs that produce female gametes (oocytes).
both     The organism contains both male and female reproductive organs.
none     The organism does not have reproductive organs.
unknown  The reproductive organs of the organism are unknown.

Age: The time period elapsed since an identifiable point in the life cycle of an organism. If a developmental stage is specified, the identifiable point is the beginning of that stage. Otherwise, the identifiable point (for example, planting) must be specified.
Initial time point  Definition
birth               The time point at the end of parturition.
fertilization       The time point at which gametes are joined. May also be used for post-coital measurements.
hatching            The time point at which the organism leaves the egg.
planting            The time point at which a seed is planted.
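
The Age concept pairs a measured time span with the point from which it is counted. A minimal Python sketch of that pairing follows. The names `InitialTimePoint`, `Age`, and `describe` are illustrative assumptions and do not come from MAGE or from the RDFS/DAML files.

```python
from dataclasses import dataclass
from enum import Enum


class InitialTimePoint(Enum):
    """Identifiable points from which an age may be counted."""
    BEGINNING_OF_STAGE = "beginning of stage"
    BIRTH = "birth"
    FERTILIZATION = "fertilization"
    HATCHING = "hatching"
    PLANTING = "planting"


@dataclass(frozen=True)
class Age:
    """An elapsed time plus the time point it is measured from (hypothetical layout)."""
    value: float
    unit: str
    initial_time_point: InitialTimePoint

    def describe(self) -> str:
        # ":g" drops a trailing ".0" so 8.0 is shown as 8.
        return f"{self.value:g} {self.unit} since {self.initial_time_point.value}"


example_age = Age(8, "weeks", InitialTimePoint.BIRTH)
print(example_age.describe())
```

Keeping the initial time point as a closed enumeration means an age such as "3 days" cannot be recorded without stating what the 3 days are counted from.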

Developmental stage: The developmental stage of the organism's life cycle during which the biomaterial was extracted.

Organism part: The part of the organism's anatomy from which the biomaterial was derived.

Strain or line: Animals or plants that have a single ancestral breeding pair or parent as a result of brother x sister or parent x offspring matings.

Genetic Variation: The genetic modification introduced into the organism from which the biomaterial was derived. Examples of genetic variation include specification of a transgene or of the gene that was knocked out.

Individual: Identifier or name of the individual organism from which the biomaterial was derived.

Individual genetic characteristics: The genotype of the individual organism from which the biomaterial was derived. Individual genetic characteristics include polymorphisms, disease alleles, and haplotypes.

Disease state: The name of the pathology diagnosed in the organism from which the biomaterial was derived. The disease state is normal if no disease has been diagnosed.

Targeted cell type: The target cell type is the cell of primary interest. The biomaterial may be derived from a mixed population of cells although only one cell type is of interest.

Cell line: The identifier for the immortalized cell line if one was used to derive the biomaterial.

Biomaterial preparation: A description of the state and condition of the biomaterial.

Environmental or experimental history: A description of the conditions the organism has been exposed to that are not one of the variables under study.

Treatment: The manipulation of the biomaterial for the purposes of generating one of the variables under study.

MIAME Sample Description:

The concepts are flattened into a list of attributes. An ontology would provide greater detail in a structured form that would allow computational analysis (e.g., SQL, graph comparisons) between experiments. How much more structure is needed will be driven by the use cases/ scenarios.
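
To illustrate the kind of computational comparison between experiments mentioned above, here is a minimal Python sketch that compares two sample descriptions held as concept-to-term mappings. The function name `compare_samples`, the dictionary layout, and the example values are assumptions made for illustration only.

```python
def compare_samples(a, b):
    """Split the concepts of two sample descriptions into shared and differing ones.

    Returns (same, diff): `same` maps each concept to the common term, and
    `diff` maps each concept to the pair (term in a, term in b). A concept
    missing from one description appears in `diff` with None on that side.
    """
    keys = sorted(set(a) | set(b))
    same = {k: a[k] for k in keys if k in a and k in b and a[k] == b[k]}
    diff = {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}
    return same, diff


# Hypothetical annotations for two biosources.
sample_a = {"organism": "Mus musculus", "sex": "female", "organism_part": "liver"}
sample_b = {
    "organism": "Mus musculus",
    "sex": "male",
    "organism_part": "liver",
    "disease_state": "normal",
}

same, diff = compare_samples(sample_a, sample_b)
print("shared:", same)
print("differing:", diff)
```

A comparison like this only works when both descriptions draw their terms from the same controlled vocabulary, which is the motivation for structuring the concepts rather than leaving them as free text.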

MAGE Biomaterial Class Diagram:

The MIAME concepts are structured in the class diagram of Biomaterial from MAGE. Of note is the introduction of an OntologyEntry class that allows the user to point to a database entry somewhere for a controlled vocabulary or ontology term. Treatment has been given attributes (an order, an action) and relationships (measurements of different types). The effects of treatment are the generation of a biosample or labelled extract. Compounds represent everything from culture media components to fluorescent labels.
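
The idea of an ontology entry that points to a database entry can be sketched in Python as follows. The field names loosely follow the hand-crafted outline further down this page rather than the MAGE class diagram itself. The URI is a placeholder, and `same_term` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DatabaseEntry:
    """Reference to a record in an external database (illustrative layout)."""
    accession: str
    uri: str  # placeholder URI of the external resource
    accession_version: Optional[str] = None


@dataclass(frozen=True)
class OntologyEntry:
    """A term plus a pointer to where the term is defined."""
    value: str
    database_entry: DatabaseEntry
    description: str = ""


def same_term(a: OntologyEntry, b: OntologyEntry) -> bool:
    """Treat two entries as the same term if they point to the same record."""
    return (
        a.database_entry.accession == b.database_entry.accession
        and a.database_entry.uri == b.database_entry.uri
    )


# NCBI taxonomy ID 10090 is Mus musculus; the URI below is a stand-in.
taxon = DatabaseEntry(accession="10090", uri="https://example.org/taxonomy")
organism = OntologyEntry(value="Mus musculus", database_entry=taxon)
synonym = OntologyEntry(value="house mouse", database_entry=taxon)
print(same_term(organism, synonym))
```

Comparing by database reference rather than by label lets two annotators use different names for the same term without the entries being treated as different.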

Ontology tools:

The ontology editors are either open source or licensed for free (at least for academics; let me know if I misrepresented anything). Thanks to Robert Stevens, U. Manchester, for info on Protege and OILed. Commercial products such as Rational Rose and those from Embarcadero can also be used to generate UML models (class diagrams), but they are not free.

Hand-crafted ontology:

This text represents the ontology in the RDF schema file provided in the link above. Changes were made to the prior posting of a top-level ontology:
class: BiomaterialDescription
	class: BiomaterialState 
		attribute: has_been_manipulated_by (BiomaterialManipulation)
		class: Biosource
			attribute: has_been_manipulated_by (EnvironmentalHistory)
		class: Biosample
			attribute: has_been_manipulated_by (Treatment)
		class: LabeledExtract
			attribute: has_been_manipulated_by (BiomaterialPreparation)
	class: BiosourceProperty
		class: BiosourceProvider
			attribute: biosource_type (one-of biopsy, paraffin_section)
			attribute: has_donor (Organization) 
			attribute: has_owner (Person) 
		class: Sex 
			instances: male, female, both_sexes, unknown_sex
		class: Age 
			attribute: has_measurement (Measurement)
			attribute: initial_time_point (one-of beginning_of_stage, birth, fertilization, hatching, planting)
		class: Individual 
		class: BiosourceOntologyEntry 
			superclass: OntologyEntry
			class: Organism 
				instance: NCBI_taxonomy
			class: DevelopmentalStage
			class: OrganismPart 
			class: StrainOrLine 
			class: GeneticVariation 
			class: IndividualGeneticCharacteristics 
			class: DiseaseState 
			class: TargetedCellType 
			class: CellLine 
			class: ClinicalInformation 
	class: BiomaterialManipulation
		class: BiomaterialPreparation
			attribute: has_protocol (Protocol)
			attribute: has_time_of_day (Measurement)
			attribute: pathological_staging (one-of premortem, postmortem)
			attribute: biomaterial_amount (Measurement)
			attribute: biomaterial_purity (range 0-100)
		class: EnvironmentalHistory
			class: CultureCondition
				attribute: has_measurement (Measurement)
				class: Atmosphere
				class: Humidity
				class: Temperature
				class: Light
				class: DensityRange
				class: Generations
				class: Nutrients
					attribute: nutrient_compound (Compound)
				class: ContaminantOrganisms
					attribute: is_contaminated_by (Organism)
					attribute: is_decontaminated_by (Protocol)
				class: Host
					attribute: has_host (Organism)
					attribute: has_host_part (OrganismPart)
			class: ClinicalHistory
				attribute: has_clinical_information (ClinicalInformation)
				attribute: has_lab_values (Measurement)
				class: PastMedicalHistory
				class: CurrentDiseaseHistory
				class: ClinicTreatmentHistory
			class: FamilyHistory
			class: Water
				attribute: has_additives (Compound)
				attribute: has_treatments (Protocol)	
			class: Bedding
			class: BarrierFacility
				attribute: description
				attribute: rating
			class: PathogenTests
				attribute: tested_for (Organism)
				attribute: result_summary (one-of positive, negative, inconclusive)
			class: Preservation
				attribute: preservation_type (one-of seed_dormancy, frozen_storage)
				attribute: has_protocol (Protocol)	
		class: Treatment
			attribute: has_protocol (Protocol)
			class: Modification
				attribute: modification_type (one-of addition, removal, rearrangement)
				class: SomaticModification
					attribute: part_modified (OrganismPart)
				class: GeneticModification
					attribute: gene_modified (Gene)
				class: Starvation
					attribute: starved_of (Nutrients)
			class: Infection
				attribute: infected_by (Organism)
			class: BehavioralStimulus
			class: CompoundBasedTreatment
				attribute: has_compound (Compound)
				attribute: has_protocol (Protocol)
				attribute: has_compound_measurement (Measurement)
				attribute: treatment_application (one-of in_vivo, in_vitro, in_situ)
class: Resource
	class: OntologyEntry
		attribute: value
		attribute: description
		attribute: ID
		attribute: has_database_entry (DatabaseEntry)
	class: DatabaseEntry
		attribute: accession
		attribute: accession_version
		attribute: has_URI (URI)
	class: Contact
		attribute: address
		attribute: phone
		attribute: toll_free_phone
		attribute: email
		attribute: fax
		attribute: has_URI (URI)
		class: Organization
			attribute: name
		class: Person
			attribute: last_name
			attribute: first_name
			attribute: mid_initials
			attribute: has_affiliation (Organization)
	class: BibliographicReference
		attribute: title
		attribute: authors
		attribute: publication
		attribute: publisher
		attribute: editor
		attribute: year
		attribute: volume
		attribute: issue
		attribute: pages
		attribute: has_URI (URI)
class: URI
class: Measurement
	attribute: value
	attribute: has_units (Unit)
	attribute: measurement_type (one-of absolute, change)
class: Unit
	class: TemperatureUnit
		instances: degrees_C, degrees_F, degrees_K
	class: MassUnit
		instances: kg, g, mg, ug, ng, pg, fg
	class: VolumeUnit
		instances: cc, l, dl, ml, ul, nl, pl, fl
	class: DistanceUnit
		instances: m, cm, mm, um, nm, A
	class: TimeUnit
		instances: years, months, weeks, days, hours, minutes, seconds, ms, us
	class: QuantityUnit
		instances: mol, umol, nmol, pmol, fmol, amol, molecules
	class: ConcentrationUnit
		instances: M, mM, uM, nM, pM, fM, mg/ml, ml/l, g/l, %(weight/vol), %(vol/vol), %(weight/weight)
class: Protocol
	attribute: has_hardware (Hardware)
	attribute: has_software (Software)
	attribute: name
	attribute: description
	attribute: has_citation (BibliographicReference)
class: Hardware
	attribute: make
	attribute: model
	attribute: has_manufacturer (Contact)
class: Software
	attribute: name
	attribute: version
	attribute: has_manufacturer (Contact)
class: Compound
	attribute: is_solvent (one-of yes, no)
class: Gene
	attribute: has_database_entry (DatabaseEntry)
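
The indented text form above can be read by a small script. The following Python sketch recovers the class nesting from such an outline, assuming that each nesting level is indented with exactly one tab and that nesting denotes a subclass (or at least containment) relationship. The function name `parse_outline` and the excerpt are illustrative only.

```python
def parse_outline(text):
    """Return {class_name: parent_class_name or None} for a tab-indented outline."""
    parents = {}
    stack = []  # (depth, class_name) for the classes currently open
    for raw in text.splitlines():
        if not raw.strip():
            continue
        depth = len(raw) - len(raw.lstrip("\t"))  # number of leading tabs
        line = raw.strip()
        if not line.startswith("class:"):
            continue  # attribute, instances, and superclass lines are skipped
        name = line[len("class:"):].strip()
        # Close every class that is at the same depth or deeper.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        parents[name] = stack[-1][1] if stack else None
        stack.append((depth, name))
    return parents


# Short excerpt in the same layout as the outline, with tabs written explicitly.
EXCERPT = "\n".join([
    "class: Measurement",
    "\tattribute: value",
    "class: Unit",
    "\tclass: TemperatureUnit",
    "\t\tinstances: degrees_C, degrees_F, degrees_K",
    "\tclass: MassUnit",
])

parents = parse_outline(EXCERPT)
for name, parent in parents.items():
    print(name, "->", parent)
```

This sketch keeps only one parent per class name, so a class that is both nested and given an explicit `superclass:` line would need extra handling.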

Last updated Jan. 20, 2002