Importing a Matrix from a NEXUS Format File

If you have existing matrices, created in software such as Mesquite or MacClade, that are in NEXUS format, you can import those directly into MorphoBank as a starting point for your MorphoBank matrix. To do so, click on the file browse button under "NEXUS file to add to matrix" in the new matrix form and select your data file. You can also specify text to add to the notes data of each taxon and character imported from the NEXUS file. If you have taxon or character partitions in your project, you will also see checkboxes for each partition. If checked, all taxa or characters will be added to the selected partition as they are imported.

When you are ready, click on "save". Importing a NEXUS file can take some time, so be patient. Most files take well under a minute to import, but very large data sets have been known to take five minutes or more. Once the import is complete, a statistical summary of the import process is displayed and you are shown the new matrix's basic information editing form, as shown in Figure 12.2, “Matrix basic information editing form”.

Figure 12.2. Matrix basic information editing form



NEXUS import caveats

NEXUS is a loosely defined format that has never been definitively standardized (see https://www.nescent.org/wg_phyloinformatics/Supporting_NEXUS_Documentation for an interesting discussion about NEXUS limitations). Various software applications import and export NEXUS in subtly inconsistent and incompatible ways. MorphoBank tries to accommodate as many features and oddities as possible but there are still a number of caveats to keep in mind as you gather data to import into MorphoBank.

If MorphoBank rejects your uploaded NEXUS file, first look over the list of advice below. If your file is still unusable, then contact us via the online form available from the bottom of the MorphoBank homepage for assistance. We will be happy to work with you to get your file working with MorphoBank.

Only morphological data is accepted

MorphoBank only accepts NEXUS files containing morphology data. It is possible to create NEXUS files with a mixture of morphological and genetic data, or with genetic data alone. MorphoBank rejects these files, primarily because it is not always clear what is morphological data and what is not unless the file uses non-standard extensions to the NEXUS format (such as the MIXED datatype). If you have mixed matrices, remove the genetic data before uploading the file to MorphoBank.
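
Before uploading, it can help to check which data types a file declares. The following is a minimal sketch, not part of MorphoBank: the function names and the MOLECULAR_TYPES list are hypothetical, and the check assumes the file declares its type with a DATATYPE= setting in a FORMAT command. It uses simple pattern matching rather than a full NEXUS parser, so nested comments and quoted text containing "DATATYPE" are not handled.

```python
import re

# Assumption: these DATATYPE values indicate genetic or mixed data.
MOLECULAR_TYPES = {"DNA", "RNA", "NUCLEOTIDE", "PROTEIN", "MIXED"}


def declared_datatypes(nexus_text):
    """Return the set of DATATYPE values found in the file, uppercased."""
    # Drop simple [bracketed] comments first (nested comments are not handled).
    cleaned = re.sub(r"\[[^\]]*\]", "", nexus_text)
    found = re.findall(r"DATATYPE\s*=\s*(\w+)", cleaned, flags=re.IGNORECASE)
    return {value.upper() for value in found}


def looks_morphology_only(nexus_text):
    """True if no declared DATATYPE is in the molecular/mixed list."""
    return not (declared_datatypes(nexus_text) & MOLECULAR_TYPES)
```

For example, `looks_morphology_only(open("my_matrix.nex").read())` (with a hypothetical file name) returns False for a file that declares DATATYPE=DNA. A True result only means that no molecular type was declared; it does not guarantee that MorphoBank will accept the file.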

If you need to keep genetic data with your project, for publication or convenience (or both), you can upload files containing the data to MorphoBank. MorphoBank simply puts the files in a storage location - it does not attempt to manipulate or verify uploaded non-morphological data in any way.

Supported NEXUS blocks

MorphoBank extracts data from the TAXA, CHARACTERS, DATA, ASSUMPTIONS and NOTES blocks. All other blocks are extracted as-is and placed in the matrix's NEXUS Blocks tab, where they can be viewed and the raw block data edited. When you export a MorphoBank matrix as NEXUS, these blocks are reincorporated into the output file.

This means that files with tree data (for example) - data which cannot be manipulated with current MorphoBank tools - can be uploaded. Data not usable in MorphoBank are in fact preserved, although they are of limited use while using the system.
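
To see in advance which blocks of a file MorphoBank will extract data from and which it will store as-is, you can list the block names yourself. This is a rough sketch under stated assumptions: the function names are hypothetical, the list of extracted blocks is copied from the paragraph above, and blocks are located by searching for "BEGIN name;", which can be fooled by nested comments or by quoted text that contains the word BEGIN.

```python
import re

# Blocks MorphoBank extracts data from, per the text above.
EXTRACTED_BLOCKS = {"TAXA", "CHARACTERS", "DATA", "ASSUMPTIONS", "NOTES"}


def list_blocks(nexus_text):
    """Return the block names in file order, uppercased."""
    # Drop simple [bracketed] comments first (nested comments are not handled).
    cleaned = re.sub(r"\[[^\]]*\]", "", nexus_text)
    names = re.findall(r"\bBEGIN\s+(\w+)\s*;", cleaned, flags=re.IGNORECASE)
    return [name.upper() for name in names]


def classify_blocks(nexus_text):
    """Split block names into (extracted, stored as-is)."""
    extracted, stored_as_is = [], []
    for name in list_blocks(nexus_text):
        if name in EXTRACTED_BLOCKS:
            extracted.append(name)
        else:
            stored_as_is.append(name)
    return extracted, stored_as_is
```

For a file containing TAXA, CHARACTERS and TREES blocks, `classify_blocks` returns the first two as extracted and TREES as stored as-is, which is the block you would expect to find on the NEXUS Blocks tab after import.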

Bad data

Most desktop programs are quite forgiving of NEXUS data irregularities and errors, such as having two characters (or taxa) in the same matrix with the exact same name. MorphoBank employs a relational database with a more formal data model, and is less forgiving of errors. In some cases it will reject your data outright, and in other cases it will revert to default values in an attempt to make the data work. After importing a NEXUS file, you should always give the resulting MorphoBank matrix a quick look-over to make sure everything is as it should be. In general, if it looks good, it is.

Note

If MorphoBank is rejecting a NEXUS file that works in some other program, let us know. We will work to resolve the problem and make your file compatible with MorphoBank. Send a note via the online form available from the bottom of the MorphoBank homepage with the details of the problem. Don't forget to attach a copy of the NEXUS file!

Common errors in NEXUS files that should be avoided if at all possible are:

  • Blank character or taxon data: all characters, character states and taxa should have names defined. Blank data will be filled with default data ("Character 1", "Character 2", ... for characters, for example).

  • Poorly formed NEXUS files: don't modify your NEXUS file by hand unless you really know what you're doing. It rarely works. If a NEXUS file lacks internal consistency, as many hand-modified files do, it will be rejected. Examples of inconsistency include MATRIX blocks whose character and/or taxon counts differ from the counts given in other blocks, partitioned data with mismatched lengths or names, etc.

  • Duplicate taxon or character names: in desktop programs, where each file is its own self-contained universe, using duplicate names is an error that doesn't usually bring the house down. In MorphoBank, where taxa and characters with the same name are considered the same thing, duplicates can have strange and unintended side-effects. Make sure duplicates are resolved before importing into MorphoBank.
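
Some of the errors listed above can be caught before upload with a short script. The sketch below is illustrative only and is not MorphoBank's own validation: the function names are hypothetical, it looks only at taxa (character names would need a similar check against CHARLABELS or CHARSTATELABELS), it reads the first TAXLABELS command and the first NTAX value it finds, and it compares names exactly as written. It does not handle nested comments or quoted names that contain semicolons or brackets.

```python
import re
from collections import Counter


def _strip_comments(nexus_text):
    """Drop simple [bracketed] comments (nested comments are not handled)."""
    return re.sub(r"\[[^\]]*\]", "", nexus_text)


def read_taxlabels(nexus_text):
    """Return the names in the first TAXLABELS command, or [] if there is none."""
    cleaned = _strip_comments(nexus_text)
    match = re.search(r"\bTAXLABELS\b(.*?);", cleaned, flags=re.IGNORECASE | re.DOTALL)
    if not match:
        return []
    # A token is either a 'single-quoted name' or a run of non-space characters.
    tokens = re.findall(r"'(?:[^']|'')*'|[^\s;']+", match.group(1))
    return [t[1:-1].replace("''", "'") if t.startswith("'") else t for t in tokens]


def check_taxa(nexus_text):
    """Return a list of human-readable problems found in the taxon labels."""
    problems = []
    names = read_taxlabels(nexus_text)
    if not names:
        return problems  # nothing to check (for example, a DATA-block-only file)
    for position, name in enumerate(names, start=1):
        if not name.strip():
            problems.append(f"taxon {position} has a blank name")
    for name, count in Counter(names).items():
        if name.strip() and count > 1:
            problems.append(f"taxon name {name!r} appears {count} times")
    declared = re.search(r"\bNTAX\s*=\s*(\d+)", _strip_comments(nexus_text), flags=re.IGNORECASE)
    if declared and int(declared.group(1)) != len(names):
        problems.append(f"NTAX={declared.group(1)} but {len(names)} taxon labels were found")
    return problems
```

An empty list means the script found nothing to report; it does not mean the file is guaranteed to import cleanly.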

Merging several NEXUS files into a single matrix

When creating a new matrix, MorphoBank offers the option of uploading a NEXUS file for import. You may also upload NEXUS files for import into an existing matrix at any time after the matrix is created. The contents of each uploaded file are merged with the content of the existing matrix. This provides a mechanism for consolidating multiple NEXUS format datasets into a single MorphoBank matrix.

To upload one or more NEXUS files to an existing matrix, open the matrix by clicking on the "edit info" button in the project matrix list (see the section called “Editing”), then click on the File Uploads tab. You will be presented with a screen resembling that in Figure 12.3, “The matrix File Uploads tab”.

The upload tab is divided into two sections. The upper section contains a form for uploading files. The lower section contains a list of all files previously uploaded and includes the date and time of the upload, the name of the member who added the file and a button to download the originally uploaded file for inspection.

To upload a file, use the file browse button under "NEXUS file to add to matrix" in the upload form and select your data file. You can also specify text to add to the notes data of each taxon and character imported from the NEXUS file. If you have taxon or character partitions in your project, you will also see checkboxes for each partition. If checked, all taxa or characters will be added to the selected partition as they are imported. Click the "save" button to initiate the upload. Once the import is complete, a statistical summary of the import process is displayed along with a form to perform another upload.

Figure 12.3. The matrix File Uploads tab



Handling of TREES and other unsupported NEXUS blocks

As mentioned in the section called “Supported NEXUS blocks”, unsupported NEXUS blocks are stored as-is in the matrix's "blocks" list and displayed on the NEXUS Blocks tab.

You can use the NEXUS Blocks tab to view unsupported blocks, and if necessary edit their contents. MorphoBank provides only a simple text editor interface for editing the raw data of unsupported blocks.

You can create new blocks by typing the name and content of the block into the empty "Add additional block" form. The name of the block is used as the identifier for the block in any NEXUS file output. Poorly formatted blocks can cause the NEXUS output of MorphoBank to be invalid. Only create your own blocks if you know what you are doing.
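
Because the editor accepts raw text, a quick sanity check of block content before pasting it in can catch the most obvious formatting slips. The sketch below is an assumption-laden illustration, not a MorphoBank feature: the function name is hypothetical, and it checks only three general NEXUS conventions (balanced [ ] comments, closed single-quoted tokens, and a final semicolon). Passing these checks does not make a block valid.

```python
def block_problems(content):
    """Return a list of obvious formatting problems in raw block content."""
    problems = []
    depth = 0          # nesting depth of [ ] comments
    in_quote = False   # inside a 'single-quoted' token
    for ch in content:
        if ch == "'" and depth == 0:
            # A doubled quote ('') toggles twice, so the state is unchanged.
            in_quote = not in_quote
        elif not in_quote:
            if ch == "[":
                depth += 1
            elif ch == "]":
                if depth == 0:
                    problems.append("closing ']' without a matching '['")
                else:
                    depth -= 1
    if in_quote:
        problems.append("unterminated single-quoted token")
    if depth > 0:
        problems.append("unclosed '[' comment")
    if not content.strip().endswith(";"):
        problems.append("last command does not end with ';'")
    return problems
```

For example, `block_problems("TREE t1 = (A,(B,C));")` returns an empty list, while content with an unclosed comment bracket is reported.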

Figure 12.4. The NEXUS Blocks tab
