Collaborative 3D Scenes with IIIF Manifests

Release Date: 21 February 2026

A consortium of leading cultural institutions has announced a draft specification for presenting computer-based 3D content. During the plenary session of the January 2026 online meeting of the International Image Interoperability Framework (IIIF), the draft of its Presentation 4 API was described and public links were provided. This work is an evolution of the foundational Presentation 3 API, which established a widely used framework for the computer presentation of images, audio, and video. The draft Presentation 4 emphasizes the concept of containers of different dimensionality: images are presented in a Canvas of two dimensions, 3D models in a Scene of three dimensions, and audio along a Timeline of one dimension, while combinations of containers allow presentation of video or animated 3D content. The goal of this work is to allow the community of content that IIIF has enabled for images to grow with the addition of 3D capabilities. Because the specification is built on the Web Annotation Data Model, it offers the promise of collaborative composition and presentation by creators throughout the web.

This post will concentrate on the 3D features enabled by the Presentation 4 draft. The first thing those with 3D computer graphics experience will notice on reading that specification is what is missing. The word “mesh” does not appear in the two documents that make up the Presentation 4 draft, and the word “texture” appears only twice. Instead, the specification presupposes 3D assets, each referred to as a Model, and specifies how one or more of those assets can be placed in a scene, how they are lit, and how they are viewed from different viewpoints. As of January 2026, an appropriate format for a Model is a glb file based on the glTF standard, but there is no obstacle to using older formats such as Wavefront OBJ files or extending to future formats. What the Presentation 4 API specifies for 3D visualization is an abstraction of the concept of a scenegraph. IIIF 3D viewers implement that concept in a concrete scenegraph, such as the one specified in the ISO X3D standard, those supported by libraries such as three.js or Babylon.js, or those implemented in game engines such as Godot, Unreal, or Unity.
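To make that division of labor concrete, the following Python sketch builds a minimal single-Scene manifest as a plain dictionary: one Model, one light, and one camera, each placed by an Annotation. It is an illustration under stated assumptions rather than a normative example. The class and property names (Scene, Model, AmbientLight, PerspectiveCamera, PointSelector) and the context URL reflect the draft as understood at the time of writing and may change, and every example.org URL is a hypothetical placeholder.

```python
import json

# All example.org URLs are hypothetical placeholders, not real resources.
BASE = "https://example.org/iiif"


def paint(anno_id, body, scene_id, position=None):
    """Return a painting Annotation that places `body` into a Scene.

    With `position` given as (x, y, z), the target is a SpecificResource
    carrying a PointSelector (shape assumed from the draft); otherwise the
    body is targeted at the Scene as a whole.
    """
    target = {"id": scene_id, "type": "Scene"}
    if position is not None:
        x, y, z = position
        target = {
            "type": "SpecificResource",
            "source": {"id": scene_id, "type": "Scene"},
            "selector": [{"type": "PointSelector", "x": x, "y": y, "z": z}],
        }
    return {
        "id": anno_id,
        "type": "Annotation",
        "motivation": ["painting"],
        "body": body,
        "target": target,
    }


def build_manifest():
    """Assemble a Manifest holding one Scene with a model, a light, and a camera."""
    scene_id = f"{BASE}/scene/1"
    model = {
        "id": "https://example.org/models/vase.glb",
        "type": "Model",
        "format": "model/gltf-binary",  # registered media type for .glb files
    }
    annotations = [
        paint(f"{BASE}/anno/model", model, scene_id, position=(1.0, 0.0, -2.0)),
        paint(f"{BASE}/anno/light",
              {"id": f"{BASE}/light/1", "type": "AmbientLight"}, scene_id),
        paint(f"{BASE}/anno/camera",
              {"id": f"{BASE}/camera/1", "type": "PerspectiveCamera"},
              scene_id, position=(0.0, 1.5, 5.0)),
    ]
    return {
        "@context": "http://iiif.io/api/presentation/4/context.json",  # assumed
        "id": f"{BASE}/manifest.json",
        "type": "Manifest",
        "label": {"en": ["Single-scene sketch"]},
        "items": [{
            "id": scene_id,
            "type": "Scene",
            "items": [{
                "id": f"{BASE}/page/1",
                "type": "AnnotationPage",
                "items": annotations,
            }],
        }],
    }


if __name__ == "__main__":
    print(json.dumps(build_manifest(), indent=2))
```

Note that nothing in the manifest describes geometry or materials. The glb file is referenced by URL only, and everything else is placement, lighting, and viewing.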

A number of 3D viewers have been developed to test the visualization of 3D Scenes defined within a IIIF manifest file. An example of such a single-scene manifest is presented as a supplement to this blog posting. This example demonstrates the ability of the IIIF Presentation 4 API to combine, in a single viewport, a collection of 3D assets from different cultural institutions. Just as important, the Presentation 4 API integrates 3D visualization into an existing system that already supports images, audio, and video. It is already envisioned that future viewers will support high-quality 2D Canvas elements inside a 3D Scene, or 3D Scene viewports embedded in a Canvas. For example, we can plan to present high-quality and historically significant anatomical illustrations in conjunction with 3D volumetric imaging of the same anatomy. The IIIF manifest file allows multiple Canvas, Scene, and Timeline elements to be organized in a flexible way: as a storyline allowing viewers to follow a narrative, as a hierarchy allowing exploration of assets in greater and greater detail, or as a mind-map allowing free linking among related images and models. The IIIF Presentation specifications already provide comprehensive metadata support, which allows detailed information about each asset, including important documentation of objects and collections, to be included in the manifest itself. Copyright, licensing, and appropriate usage permissions can also accompany each asset and collection, and multiple selectable languages are supported.

Both the current version 3 and the draft version 4 of the IIIF Presentation API are based on the Web Annotation Data Model, an abstract data model constructed of resources, which are URI-addressable packets. The fundamental structure is the Annotation, a resource whose purpose is to link two other resources, each of which can be any entity on the web. Such a basis for a 3D visualization API is only possible because complex data structures, such as meshes, are tightly wrapped into opaque resources. In return, this abstract structure of the IIIF Presentation API allows a great degree of open access and open composition. A 3D Scene published using the IIIF Presentation API can have annotations linked to it by any author. This is the world of Linked Data and the Semantic Web. A concrete and achievable 3D application of this would be the public presentation of a 3D visualization of interest in cultural heritage. Anyone can then publish Web Annotations that are realized as viewpoints, lighting, and visual elements that aid comprehension of the published resource.
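The open-composition idea can be sketched in a few lines of Python. The example below builds a Web Annotation in which one author comments on a Scene published by someone else, and then shows how a viewer might gather every annotation that targets a given Scene regardless of who published it. The helper names comment_on and annotations_for are hypothetical, as are all example.org and example.net URLs. The annotation properties themselves (body, target, motivation, creator, TextualBody) come from the W3C Web Annotation Data Model.

```python
import json

WEB_ANNOTATION_CONTEXT = "http://www.w3.org/ns/anno.jsonld"


def comment_on(target_id, text, author_id, anno_id):
    """Build a Web Annotation whose textual body comments on `target_id`."""
    return {
        "@context": WEB_ANNOTATION_CONTEXT,
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        "creator": author_id,
        "body": {
            "type": "TextualBody",
            "value": text,
            "format": "text/plain",
            "language": "en",
        },
        "target": target_id,
    }


def target_id_of(annotation):
    """Return the URI an annotation points at, or None if it cannot be read."""
    target = annotation.get("target")
    if isinstance(target, str):   # bare URI form
        return target
    if isinstance(target, dict):  # object form carrying an "id"
        return target.get("id")
    return None


def annotations_for(target_id, annotations):
    """Select the annotations that point at `target_id`, whoever published them."""
    return [a for a in annotations if target_id_of(a) == target_id]


if __name__ == "__main__":
    # Hypothetical Scene published by one institution...
    scene = "https://example.org/iiif/scene/1"
    # ...and a comment published independently by another author.
    note = comment_on(
        scene,
        "Note the repair visible on the handle.",
        "https://example.net/people/alice",
        "https://example.net/annotations/1",
    )
    print(json.dumps(annotations_for(scene, [note]), indent=2))
```

The Scene's publisher does not need to know about the comment in advance. The link lives entirely in the annotation, which is what lets viewpoints, lighting, or explanatory elements be contributed from anywhere on the web.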

A webpage with an example IIIF manifest containing a single 3D scene accompanies this blog posting. The visualization uses the X_ITE JavaScript library, which implements an X3D viewer. The complete manifest text is included in the text area of the webpage. It is an important feature of IIIF manifest files that they are in plain-text JSON format, an important enabler of open access. It is also readily apparent that the manifest can contain important sourcing, licensing, and permission information for each of the four Model resources in the scene. While this prototype viewer is intended to develop the 3D rendering of manifests, the text information in the manifest is readily convertible to HTML content on the webpage or to additional 3D text material in the scene itself.
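As an illustration of how little is needed to turn that text information into HTML, here is a minimal Python sketch. It assumes a resource carries the label, metadata, and rights properties in the shape used by the Presentation 3 API (language maps of string lists), and that the draft keeps that shape. The function names and the sample values are hypothetical, and this is not the code used by the accompanying webpage.

```python
from html import escape


def pick_language(language_map, preferred="en"):
    """Return the first string for `preferred`, else "none", else any language."""
    if not language_map:
        return ""
    for key in (preferred, "none"):
        if language_map.get(key):
            return language_map[key][0]
    for values in language_map.values():
        if values:
            return values[0]
    return ""


def resource_to_html(resource, preferred="en"):
    """Render a resource's label, metadata pairs, and rights as an HTML fragment."""
    rows = []
    for entry in resource.get("metadata", []):
        label = pick_language(entry.get("label"), preferred)
        value = pick_language(entry.get("value"), preferred)
        # Values are escaped as plain text here; a real viewer might instead
        # sanitize and keep the limited HTML that IIIF permits in values.
        rows.append(f"<dt>{escape(label)}</dt><dd>{escape(value)}</dd>")
    if "rights" in resource:
        rows.append(f"<dt>Rights</dt><dd>{escape(resource['rights'])}</dd>")
    title = escape(pick_language(resource.get("label"), preferred))
    return f"<section><h2>{title}</h2><dl>{''.join(rows)}</dl></section>"


if __name__ == "__main__":
    # Hypothetical Model resource with bilingual label and one metadata pair.
    model = {
        "id": "https://example.org/models/vase.glb",
        "type": "Model",
        "label": {"en": ["Vase"], "fr": ["Vase antique"]},
        "metadata": [
            {"label": {"en": ["Source"]}, "value": {"en": ["Museum A & B"]}},
        ],
        "rights": "http://creativecommons.org/licenses/by/4.0/",
    }
    print(resource_to_html(model))
    print(resource_to_html(model, preferred="fr"))
```

The language fallback is one possible policy among several. A viewer could equally follow the reader's browser preferences or offer a language selector.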
 

This is a live interactive web visualization: the text of the manifest can be modified and the results reloaded using the "Load Manifest into Viewer" control. Interesting variations of the scene can be generated by changing the values of the properties labeled x, y, or z. The implementation itself was greatly aided by exploiting the scenegraph specification of the X3D ISO standard. JavaScript code in the webpage expresses the 3D elements of the IIIF manifest as X3D nodes, and the resulting X3D Scene is passed to the X_ITE viewer. No code of the X_ITE viewer itself had to be modified, a tribute to the expressive power of the X3D standard and the capabilities of the X_ITE browser.
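The mapping from manifest to X3D can be sketched as follows. This is not the JavaScript used on the accompanying webpage, only a minimal Python illustration of the idea: each Model painted into a Scene becomes an X3D Inline node wrapped in a Transform whose translation comes from the x, y, and z values. The manifest shape (a PointSelector on the Annotation target) is an assumption based on the draft, the function names are hypothetical, and lights, cameras, rotation, and scale are left out.

```python
from xml.sax.saxutils import quoteattr

# Hypothetical one-model manifest; the URLs are placeholders.
SAMPLE_MANIFEST = {
    "type": "Manifest",
    "items": [{
        "id": "https://example.org/iiif/scene/1",
        "type": "Scene",
        "items": [{
            "type": "AnnotationPage",
            "items": [{
                "type": "Annotation",
                "motivation": ["painting"],
                "body": {"id": "https://example.org/models/vase.glb", "type": "Model"},
                "target": {
                    "type": "SpecificResource",
                    "source": {"id": "https://example.org/iiif/scene/1", "type": "Scene"},
                    "selector": [{"type": "PointSelector", "x": 1, "y": 0, "z": -2}],
                },
            }],
        }],
    }],
}


def model_placements(manifest):
    """Yield (model_url, (x, y, z)) for every Model painted into a Scene."""
    for scene in manifest.get("items", []):
        if scene.get("type") != "Scene":
            continue
        for page in scene.get("items", []):
            for anno in page.get("items", []):
                body = anno.get("body", {})
                if body.get("type") != "Model":
                    continue
                position = (0.0, 0.0, 0.0)  # default: the Scene origin
                target = anno.get("target")
                selectors = target.get("selector", []) if isinstance(target, dict) else []
                for selector in selectors:
                    if selector.get("type") == "PointSelector":
                        position = tuple(float(selector.get(axis, 0.0)) for axis in "xyz")
                yield body["id"], position


def to_x3d(manifest):
    """Return X3D XML text with one Transform/Inline pair per placed Model."""
    nodes = []
    for url, (x, y, z) in model_placements(manifest):
        # An X3D url field is a list of quoted strings, hence the inner quotes.
        url_attr = quoteattr(f'"{url}"')
        nodes.append(
            f'    <Transform translation="{x:g} {y:g} {z:g}">\n'
            f"      <Inline url={url_attr}/>\n"
            "    </Transform>"
        )
    body = "\n".join(nodes)
    return f'<X3D profile="Immersive" version="4.0">\n  <Scene>\n{body}\n  </Scene>\n</X3D>'


if __name__ == "__main__":
    print(to_x3d(SAMPLE_MANIFEST))
```

Editing the x, y, or z values in SAMPLE_MANIFEST and rerunning changes only the translation attribute, which mirrors the kind of experiment the live page invites.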

 

Resources