[x3d-public] X3D4 Sound meeting 30 SEP 2020: Web3D 2020 preparations, Gain and ChannelSelector nodes, avoiding channel indices via parent-child MFNode field relationships

Don Brutzman brutzman at nps.edu
Thu Oct 1 10:59:02 PDT 2020


Dick and I continued today.

Here is a way to avoid any need for a ChannelSelector node by using the available ChannelSplitter node:

<ChannelSplitter DEF='ChannelSelectionDemo' channelCountMode='explicit'>
     <AudioBufferSource DEF='AudioBufferSource2'/>
     <Gain DEF='IgnoreChannel_0'  containerField='outputs'/> <!-- initial output is audio channel 0 -->
     <Gain DEF='IgnoreChannel_1'  containerField='outputs'/> <!-- second  output is audio channel 1 -->
     <Gain USE='SelectGain_other' containerField='outputs'/> <!-- third   output is audio channel 2 -->
</ChannelSplitter>

Is this sufficient to avoid defining a new ChannelSelector node in the specification?  Seems so.  Examples will reveal whether it is really needed.

It looks like we have solutions that avoid the need for ROUTE connections, or index numbers, to create an audio graph.  That meets our hoped-for design goals of simplicity.

This means that ROUTE connections can be dedicated to the purpose of animating an audio graph: enable on/off, changing gain, etc.
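For example, a fade in/out could be animated via ROUTEs like this (node and field names are illustrative; the Gain node's set_gain field is an assumption, since that node is still being drafted):

	<TimeSensor DEF='FadeClock' cycleInterval='4' loop='true'/>
	<ScalarInterpolator DEF='FadeCurve' key='0 0.5 1' keyValue='0 1 0'/>
	<ROUTE fromNode='FadeClock' fromField='fraction_changed' toNode='FadeCurve' toField='set_fraction'/>
	<ROUTE fromNode='FadeCurve' fromField='value_changed' toNode='FadeGain' toField='set_gain'/>

Here FadeGain is assumed to be a Gain node defined elsewhere in the audio graph.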

X3D4 specification:

> 16.4.6 ChannelMerger
> ChannelMerger : X3DSoundChannelNode {
>   SFString [in,out] description  ""
>   SFBool   [in,out] enabled      TRUE
>   SFBool   [in,out] loop         FALSE
>   SFNode   [in,out] metadata     NULL  [X3DMetadataObject]
>   SFTime   [in,out] pauseTime    0     (-∞,∞)
>   SFTime   [in,out] resumeTime   0     (-∞,∞)
>   SFTime   [in,out] startTime    0     (-∞,∞)
>   SFTime   [in,out] stopTime     0     (-∞,∞)
>   SFTime   [out]    elapsedTime
>   SFBool   [out]    isActive
>   SFBool   [out]    isPaused
>   
>   SFInt32  [in,out] channelCount          0          [0,∞)
>   SFString [in,out] channelCountMode      "max"      ["max", "clamped-max", "explicit"]
>   SFString [in,out] channelInterpretation "speakers" ["speakers", "discrete"]
>   SFInt32  [in,out] numberOfInputs        0          [0,∞)
>   SFInt32  [in,out] numberOfOutputs       0          [0,∞)
>   # Mechanisms for parent-child input-output graph design remain under review
> }
> ChannelMerger unites different monophonic input channels into a single output channel.

Here are needed updates for ChannelMerger:

MFNode [in,out] inputs  [X3DSoundProcessingNode] # multiple inputs
SFNode [in,out] output  [X3DSoundProcessingNode] #   single output

If we only have a single output, then the numberOfOutputs field is no longer needed.
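A minimal sketch of the parent-child pattern with these fields (containerField values are assumptions, pending final spec wording):

	<ChannelMerger DEF='MergeDemo'>
	     <AudioBufferSource DEF='LeftSource'  containerField='inputs'/>
	     <AudioBufferSource DEF='RightSource' containerField='inputs'/>
	</ChannelMerger>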

> 16.4.7 ChannelSplitter
> ChannelSplitter : X3DSoundChannelNode {
>   SFString [in,out] description  ""
>   SFBool   [in,out] enabled      TRUE
>   SFBool   [in,out] loop         FALSE
>   SFNode   [in,out] metadata     NULL  [X3DMetadataObject]
>   SFTime   [in,out] pauseTime    0     (-∞,∞)
>   SFTime   [in,out] resumeTime   0     (-∞,∞)
>   SFTime   [in,out] startTime    0     (-∞,∞)
>   SFTime   [in,out] stopTime     0     (-∞,∞)
>   SFTime   [out]    elapsedTime
>   SFBool   [out]    isActive
>   SFBool   [out]    isPaused
>   
>   SFInt32  [in,out] channelCount          0          [0,∞)
>   SFString [in,out] channelCountMode      "max"      ["max", "clamped-max", "explicit"]
>   SFString [in,out] channelInterpretation "speakers" ["speakers", "discrete"]
>   SFInt32  [in,out] numberOfInputs        0          [0,∞)
>   SFInt32  [in,out] numberOfOutputs       0          [0,∞)
>   # Mechanisms for parent-child input-output graph design remain under review
> }
Needed for ChannelSplitter:

SFNode [in,out] input   [X3DSoundProcessingNode] #   single input
MFNode [in,out] outputs [X3DSoundProcessingNode] # multiple outputs

Since these field definitions can vary, they would not go in parent abstract interface X3DSoundProcessingNode.

Field numberOfInputs can be omitted, since only a single input node is allowed.

This probably means that channelCount and numberOfOutputs have accessType outputOnly [out], since their values are determined by the node's children for each field.

If agreed that ChannelSelector is superfluous, then the only remaining node that needs to be added to specification is Gain.

... but more work is needed on Sound and SpatialSound.

We need an X3D definition for "audio graph" term.  Suggested draft:

* An /audio graph/ is a collection of nodes structured to process audio inputs and outputs
   in a manner that is constrained to match the structure allowed by the Web Audio API.

We have defined all of the new nodes (beyond Sound, SpatialSound and AudioClip) to match the terms and capabilities of the Web Audio API.

This means a collection of the new nodes, working together to create and process sound, produces a result that feeds the inputs of our Sound and SpatialSound nodes.  In combination, the output is similar to a computational version of a simple AudioClip node: it is a computationally created source, whereas AudioClip is prerecorded.
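For example, a computationally created source might be authored like this (OscillatorSource as child of Gain, and Gain as a Sound source, are assumptions matching the parent-child design under discussion):

	<Sound>
	     <Gain DEF='ToneGain' gain='0.5'>
	          <OscillatorSource DEF='Tone440' frequency='440'/>
	     </Gain>
	</Sound>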

========================================
Basic stages for flow of sound, from source to destination:

a. Sources of sound (perhaps an audio file or MicrophoneSource, perhaps signal processing of channels in audio graph),

b. X3D Sound or SpatialSound node (defining location direction and characteristics of expected sound production in virtual 3D space),

c. Propagation (attenuation model, may be modified by AcousticProperties based on surrounding geometry),

d. Reception point (avatar "ears" or recordable listening point at some location and direction, that "hears" result, with left-right pan and spatialization).
========================================

Producing a figure for this partitioning would be a good idea.

Issue: should Sound or SpatialSound only receive a single channel?  Seems realistic... but note that AudioClip and MovieTexture allow stereo (2-channel) outputs.  So such a restriction on channel count doesn't seem appropriate, except perhaps a limit of two channels.  Perhaps it is best for the spec to stay silent and let implementations continue to handle the number of channels implicitly.

The Sound and SpatialSound nodes define how sound is manifested within the X3D scene. (Other nodes don't do that).

For inputs to X3D Sound and SpatialSound, the 'source' field might be changed to (a) allow multiple inputs as MFNode, and (b) allow other computational sources.  In other words, change

	SFNode   [in,out] source NULL [X3DSoundSourceNode] # and other types
to
	MFNode   [in,out] source [] [X3DSoundSourceNode,X3DSoundDestinationNode,X3DSoundProcessingNode]

or else a combination of inputs and outputs such as

	SFNode [in,out] source      NULL [X3DSoundSourceNode]
	MFNode [in,out] processing  []   [X3DSoundProcessingNode]
	SFNode [in,out] destination NULL [X3DSoundDestinationNode]
	SFNode [in,out] analysis    NULL [X3DSoundAnalysisNode]
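Under this second alternative, authoring might look like the following hypothetical sketch (containerField values chosen to match the proposed field names):

	<SpatialSound location='0 1.6 0'>
	     <AudioClip DEF='Narration' url='"narration.wav"' containerField='source'/>
	     <BiquadFilter DEF='LowPass' containerField='processing'/>
	     <Analyser DEF='Levels' containerField='analysis'/>
	</SpatialSound>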

Confirmed (as discussed yesterday) that X3DSoundProcessingNode interface needs

	SFFloat [in,out] outputGain 1.0
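That lets any processing node attenuate its own contribution, for example (BiquadFilter shown as an illustrative X3DSoundProcessingNode):

	<BiquadFilter DEF='QuietFilter' outputGain='0.25'/>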

Whew, more to go!  We'll apply these changes to the spec to simplify the discussion and advance our shared comprehension.

Efi, we are standing by for your Gain node.

all the best, Don
-- 
Don Brutzman  Naval Postgraduate School, Code USW/Br       brutzman at nps.edu
Watkins 270,  MOVES Institute, Monterey CA 93943-5000 USA   +1.831.656.2149
X3D graphics, virtual worlds, navy robotics http://faculty.nps.edu/brutzman


