SobekCM Technical Aspects
SobekCM METS Editor

Concepts and Preparation

Directory Naming

Before creating your first digital resource package, place all of the files for your item in a single folder. The folder can be named in any format. The folder name is automatically added to the METS file as the object identifier (OBJID).
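
As a rough illustration of the folder-name-to-OBJID relationship, the sketch below builds a bare METS root element whose OBJID attribute is taken from the folder name. This is not the METS Editor's own code: the function name `mets_root_for_folder` and the example path are hypothetical, and a real METS file contains far more than this single element. Only the METS namespace and the `OBJID` attribute on the root `mets` element come from the METS schema itself.

```python
from pathlib import Path
import xml.etree.ElementTree as ET

# Namespace defined by the METS schema.
METS_NS = "http://www.loc.gov/METS/"


def mets_root_for_folder(folder: str) -> ET.Element:
    """Return a bare METS root element whose OBJID is the folder name.

    Hypothetical helper for illustration only; the editor's actual
    output includes headers, metadata sections, and structure maps.
    """
    objid = Path(folder).name  # last path component, e.g. "UF00012345_00001"
    ET.register_namespace("METS", METS_NS)
    return ET.Element(f"{{{METS_NS}}}mets", {"OBJID": objid})


# Example path is an assumption, not a required layout.
root = mets_root_for_folder("packages/UF00012345_00001")
print(ET.tostring(root, encoding="unicode"))
```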

File Naming and File Types

Files which represent the same portion of the intellectual entity should share the same root filename and differ only in extension. For example, images of the same page in different formats should share the same root filename, so 00001.jpg, 00001.tif, and 00001.jp2 should all be images of the same page.

The following file types are considered page images:

  • TIFF images ( *.tif, *.tiff )
  • JPEG images ( *.jpg, *.jpeg )
  • JPEG2000 images ( *.jp2 )
  • GIF images ( *.gif )
  • Text files - OCR text ( *.txt )
  • PRO files - Prime Recognition OCR files ( *.pro )

Page images form their own structure map in the resulting METS. This structure map represents the structure of the actual item; for books and newspapers, it is analogous to a table of contents.

Any file which is not a page image is considered an additional resource file and is included in a second structure map. These files can still be organized hierarchically and placed into divisions. Where the page is the bottom layer of the page image hierarchy, into which page images are placed, additional resource files are organized into file groups. Again, all files with the same root name are placed in the same file group.
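
The grouping rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the editor's implementation: the function name `group_files` is hypothetical, extension matching is assumed to be case-insensitive, and the source does not say how a root name shared by a page image and a non-page file is handled (here each file is simply classified by its own extension).

```python
from collections import defaultdict
from pathlib import PurePath

# Extensions listed above as page images (case-insensitivity is an assumption).
PAGE_IMAGE_EXTENSIONS = {
    ".tif", ".tiff", ".jpg", ".jpeg", ".jp2", ".gif", ".txt", ".pro",
}


def group_files(filenames):
    """Split filenames into pages and additional-resource file groups.

    Returns two dicts keyed by root filename: one for page images and
    one for every other file. Hypothetical helper for illustration.
    """
    pages = defaultdict(list)
    resources = defaultdict(list)
    for name in sorted(filenames):
        path = PurePath(name)
        is_page = path.suffix.lower() in PAGE_IMAGE_EXTENSIONS
        target = pages if is_page else resources
        target[path.stem].append(name)  # same root name, same group
    return dict(pages), dict(resources)


pages, resources = group_files([
    "00001.tif", "00001.jpg", "00001.txt", "00002.tif",
    "audio.mp3", "audio.wav", "notes.pdf",
])
print(pages)
print(resources)
```

In this example the three 00001 files land on one page, 00002.tif on a second page, and the two audio files share one file group in the additional-resources map.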

METS ObjectID and BibID:VID

This application was originally written in support of a SobekCM digital repository. Within SobekCM digital repositories, each resource is assigned two identifiers:

  1. BibID (or Bibliographic Identifier)
    This is essentially analogous to an identifier for the title, and can be the parent of many individual volumes. As such, the BibID is usually directly related to the main bibliographic information shared by all the volumes, as with newspapers. This is not always the case, but it generally holds in practice.

    Bibliographic identifiers are ten-character alphanumeric values which start with two letters and end in four digits. The characters in between can be either letters or digits.
  2. VID (or Volume Identifier)
    This identifies an individual volume within the bibliographic unit (or BibID). This is a five-digit number, usually padded with leading zeros.

Generally the complete SobekCM Object ID or System ID is the BibID and VID separated by an underscore or colon.
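
The identifier format described above can be checked with a short Python sketch. The names `is_valid_bibid`, `format_vid`, and `object_id` are hypothetical and are not part of SobekCM; accepting both upper- and lower-case letters is an assumption, as is defaulting to the underscore separator.

```python
import re

# Two letters, four letters-or-digits, four digits: ten characters total.
# Accepting lower-case letters is an assumption.
BIBID_PATTERN = re.compile(r"[A-Za-z]{2}[A-Za-z0-9]{4}[0-9]{4}")


def is_valid_bibid(bibid: str) -> bool:
    """Return True if bibid matches the ten-character BibID layout."""
    return BIBID_PATTERN.fullmatch(bibid) is not None


def format_vid(volume: int) -> str:
    """Render a volume number as a five-digit VID with leading zeros."""
    if not 0 <= volume <= 99999:
        raise ValueError(f"volume out of range for a five-digit VID: {volume}")
    return f"{volume:05d}"


def object_id(bibid: str, volume: int, separator: str = "_") -> str:
    """Join BibID and VID with an underscore or colon."""
    if not is_valid_bibid(bibid):
        raise ValueError(f"not a valid BibID: {bibid!r}")
    if separator not in ("_", ":"):
        raise ValueError("separator must be '_' or ':'")
    return f"{bibid}{separator}{format_vid(volume)}"


print(object_id("UF00012345", 1))
print(object_id("UF00012345", 12, separator=":"))
```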

The METS Editor allows you to define any ObjectID for your METS files, although it usually matches the folder name. Some of the batch processes may require this BibID and VID numbering scheme, but it should be straightforward to change the assigned ObjectIDs after the batch process has run.

I will continue working to remove any requirements imposed by the historical use of this application.