As described in Plans for Merging X3D AR Proposals, this page discusses and develops a merged proposal for each functional component by examining its features step by step.

1. Camera video stream image into the scene (texture and background)

Node structure

There are three options for designing the new node structure that supports a camera video stream in an X3D scene.

Option 1. Describe sensors explicitly

Define a node that represents the camera/image sensor, then route its output to other nodes (e.g. a PixelTexture node or a new Background node such as ImageBackground or MovieBackground).

All three proposals (KC1, KC2 and IR) support this model, with slightly different details.
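As a rough illustration of the explicit model, the sketch below mimics a sensor node whose output is routed to consumer nodes. All names here (CameraSensor, TextureNode, route, CameraFrame) are hypothetical stand-ins chosen for this example; they are not node definitions taken from KC1, KC2 or IR.

```typescript
// Hypothetical names throughout: CameraSensor, TextureNode and route() are
// illustrative stand-ins, not nodes or APIs defined by any of the proposals.

type CameraFrame = { width: number; height: number; pixels: Uint8Array };
type FrameListener = (frame: CameraFrame) => void;

// The sensor node only produces frames; it knows nothing about its consumers.
class CameraSensor {
  private listeners: FrameListener[] = [];

  addListener(listener: FrameListener): void {
    this.listeners.push(listener);
  }

  // A browser would call this whenever the camera delivers a new frame.
  emit(frame: CameraFrame): void {
    this.listeners.forEach((listener) => listener(frame));
  }
}

// A consumer node, standing in for a texture or background node.
class TextureNode {
  image: CameraFrame | null = null;

  setImage(frame: CameraFrame): void {
    this.image = frame;
  }
}

// Counterpart of an X3D ROUTE: connect the sensor output to a node input.
function route(from: CameraSensor, to: TextureNode): void {
  from.addListener((frame) => to.setImage(frame));
}

const cameraSensor = new CameraSensor();
const videoTexture = new TextureNode();
const videoBackground = new TextureNode();
route(cameraSensor, videoTexture);
route(cameraSensor, videoBackground); // one sensor can feed several nodes

// Simulate one 2x1 RGB frame arriving from the camera.
cameraSensor.emit({ width: 2, height: 1, pixels: new Uint8Array(2 * 1 * 3) });
```

Because the sensor is a separate node, the same output can later be routed to nodes that do not exist yet, which is the extensibility argument made for this option.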

Pros.
- Open to other uses in the future (more extensible)

Cons.
- Relatively more complicated for scene authors to write and for browsers to implement

Option 2. Describe sensors implicitly

Define a node that represents a "background" or "texture" dedicated to showing user media (either from a camera device or a user-selected file).

KC1 proposes this option as an alternative with a simpler structure for browser implementation and scene writing.
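For comparison with the explicit model, the following minimal sketch shows a single node that owns its media source, so the scene contains no sensor node and no routes. UserMediaBackground and UserMediaSource are hypothetical names used only for this illustration, not the node names proposed in KC1.

```typescript
// Hypothetical sketch: UserMediaBackground and UserMediaSource are
// illustrative names, not definitions from the KC1 proposal.

type UserMediaSource =
  | { kind: "camera"; deviceLabel: string }
  | { kind: "file"; path: string };

class UserMediaBackground {
  private source: UserMediaSource | null = null;

  // Called by the browser after the user picks a camera or a file;
  // the scene itself never names a device.
  bind(source: UserMediaSource): void {
    this.source = source;
  }

  // Report what the node is currently showing.
  describe(): string {
    const source = this.source;
    if (source === null) return "unbound";
    return source.kind === "camera"
      ? `camera:${source.deviceLabel}`
      : `file:${source.path}`;
  }
}

const userBackground = new UserMediaBackground();
userBackground.bind({ kind: "camera", deviceLabel: "Front camera" });
```

The scene author only declares the node; everything about where the media comes from is decided on the viewer's side.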

Pros.
- Simpler from the content creator's perspective
- Easier to implement and test, since there is less interaction with other nodes

Cons.
- Single-purpose node, which might not be useful for much else

Option 3. Allow both

Pros.
- Lets users choose the option that meets their needs

Cons.
- Browser developers bear the cost of implementing both

Selecting video source

Reference: Adobe Flash and HTML5 getUserMedia() API

The scene writer does not know the hardware setup on the viewer's side, and accessing the camera on the user's device could be a privacy issue. Both Adobe Flash and HTML5 deal with this by asking the user to allow the browser to use camera input. In addition, they ask which camera or video file to use.
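The decision flow described above (ask for permission first, then let the user pick a camera or a file) might be sketched as follows. The function resolveVideoSource and the types VideoInput, UserAnswer and SourceChoice are assumptions made for this example. In an HTML5 browser, the camera list would typically come from `navigator.mediaDevices.enumerateDevices()` and the permission prompt from `navigator.mediaDevices.getUserMedia({ video: true })`; both are left out here so that the logic stays independent of any particular environment.

```typescript
// Assumed shapes: VideoInput mirrors the deviceId/label fields found on the
// HTML5 MediaDeviceInfo object; every other name here is hypothetical.

interface VideoInput {
  deviceId: string;
  label: string;
}

// What the user answered in the browser's permission/selection dialog.
interface UserAnswer {
  allowed: boolean;
  cameraId?: string;
  filePath?: string;
}

type SourceChoice =
  | { kind: "denied" }
  | { kind: "camera"; deviceId: string }
  | { kind: "file"; path: string };

function resolveVideoSource(cameras: VideoInput[], answer: UserAnswer): SourceChoice {
  // Privacy first: without consent, no source is exposed to the scene.
  if (!answer.allowed) return { kind: "denied" };

  // Prefer the camera the user picked, if it is actually present.
  const picked = cameras.filter((camera) => camera.deviceId === answer.cameraId)[0];
  if (picked !== undefined) return { kind: "camera", deviceId: picked.deviceId };

  // Otherwise fall back to a video file the user selected.
  if (answer.filePath !== undefined) return { kind: "file", path: answer.filePath };

  // Consent was given but nothing usable was selected.
  return { kind: "denied" };
}

const detectedCameras: VideoInput[] = [
  { deviceId: "a", label: "Front camera" },
  { deviceId: "b", label: "Rear camera" },
];
const rearChoice = resolveVideoSource(detectedCameras, { allowed: true, cameraId: "b" });
```

The scene never sees device identifiers it did not ask the user for, which keeps the hardware details on the viewer's side.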

2. Tracking (including support for general tracking devices)

As with selecting the video source, the tracking device configuration is unknown to the scene writer, so it should be handled by the browser on the user's side. X3D nodes should therefore just provide an interface for receiving tracking results, which are essentially transform information.

To that end, a special transform node could be defined; when a browser detects this node, it should automatically map it to an available tracker or ask the user to choose which one to use.

URN classes could be developed to categorize tracking targets (e.g. hand, head, viewpoint) so that users can more easily identify which tracking device to use.
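One possible reading of this mapping step is sketched below: the browser compares the target class requested by the scene against what each available tracker reports, then binds automatically when there is a single match and asks the user otherwise. The URN strings (`urn:example:tracking:...`), the Tracker shape and bindTracker are all invented for illustration; no URN scheme has been defined yet.

```typescript
// Hypothetical sketch: the URN values, Tracker and bindTracker are made up
// for this example and are not part of any proposal.

interface Tracker {
  name: string;
  targets: string[]; // URN classes this device can track
}

type TrackerBinding =
  | { kind: "auto"; tracker: string }
  | { kind: "ask-user"; candidates: string[] }
  | { kind: "none" };

function bindTracker(targetUrn: string, trackers: Tracker[]): TrackerBinding {
  // Keep only the devices that advertise the requested target class.
  const candidates = trackers
    .filter((tracker) => tracker.targets.indexOf(targetUrn) !== -1)
    .map((tracker) => tracker.name);

  if (candidates.length === 0) return { kind: "none" };
  // Exactly one match: map automatically, no user interaction needed.
  if (candidates.length === 1) return { kind: "auto", tracker: candidates[0] };
  // Several matches: the user has to decide.
  return { kind: "ask-user", candidates: candidates };
}

const availableTrackers: Tracker[] = [
  { name: "marker tracker", targets: ["urn:example:tracking:viewpoint"] },
  { name: "depth camera", targets: ["urn:example:tracking:hand", "urn:example:tracking:head"] },
  { name: "headset", targets: ["urn:example:tracking:head"] },
];

const handBinding = bindTracker("urn:example:tracking:hand", availableTrackers);
const headBinding = bindTracker("urn:example:tracking:head", availableTrackers);
```

In this sketch the scene only states what it wants tracked; which device provides the transform remains a decision made on the user's side.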

3. Camera calibration (viewpoints)

4. Others (color-keying, depth occlusion)

Discussions for Merging X3D AR Proposals
