@jersmi

Groups

  • Vuo Founder

jersmi commented on jersmi's Discussion, “3D audio file display”

Ok -- success if I convert the 32-bit float .wav to 16-bit integer. Attached zip file includes comp, updated subcomps and .wav files.

Here's what I did to get things working. First thing was to set up some fresh analysis using Terminal:

https://wiki.lazarus.freepascal.org/macOS_Sound_Utilities

Used afinfo on my 32-bit float file:

File type ID:   WAVE
Num Tracks:     1
----
Data format:     1 ch,  44100 Hz, 'lpcm' (0x00000009) 32-bit little-endian float
                no channel layout.
estimated duration: 0.743039 sec
audio bytes: 131072
audio packets: 32768
bit rate: 1411200 bits per second
packet size upper bound: 4
maximum packet size: 4
audio data file offset: 80
optimized
source bit depth: F32
----

Then used afconvert -f WAVE -d LEI16 in Terminal to convert it to a 16-bit .wav file. (Just cuz I was already in Terminal and learning these new tools -- I assumed any audio software could have done the conversion. EDIT: WRONG assumption, see below -- Audacity's export is different.)

New file info:

File type ID:   WAVE
Num Tracks:     1
----
Data format:     1 ch,  44100 Hz, 'lpcm' (0x0000000C) 16-bit little-endian signed integer
                no channel layout.
estimated duration: 0.743039 sec
audio bytes: 65536
audio packets: 32768
bit rate: 705600 bits per second
packet size upper bound: 2
maximum packet size: 2
audio data file offset: 4096
optimized
source bit depth: I16
----
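As a sanity check, the figures in the two afinfo readouts can be reproduced from the sample rate, channel count, bit depth and packet count. This is a minimal sketch; `lpcm_stats` is a made-up helper name, and it assumes uncompressed LPCM where one packet equals one frame.

```python
def lpcm_stats(sample_rate, channels, bits_per_sample, num_frames):
    """Derive the figures afinfo reports for uncompressed LPCM audio."""
    # For LPCM, one packet is one frame: one sample per channel.
    bytes_per_packet = channels * bits_per_sample // 8
    return {
        "packet_size": bytes_per_packet,
        "audio_bytes": num_frames * bytes_per_packet,
        "bit_rate": sample_rate * channels * bits_per_sample,
        "duration": num_frames / sample_rate,
    }

# Values taken from the two afinfo readouts (mono, 44100 Hz, 32768 packets).
float32 = lpcm_stats(44100, 1, 32, 32768)
int16 = lpcm_stats(44100, 1, 16, 32768)
```

Same packet count and duration in both files; only the bytes per sample (4 vs. 2) change, which halves the audio bytes and the bit rate.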

Important clue:

audio data file offset: 4096

Essentially this offset value helped solve the issue. I wish I could extract it in Vuo. The number 4096 here apparently relates to the Apple 'FLLR' subchunk -- listed in the Read Wave Header subcomp "File info" readout -- which (IIUC) pads the file with roughly 4k bytes of filler before the audio data "payload" starts.

So I tried offsetting the data start byte to 4097 -- works. I also noticed in the Read Wave Header "File info" readout that the "Sub-chunk 2 size" reads 4044 -- related but 52 bytes off -- 44 byte header + 8 bytes? So I tried setting the data start byte to 4045 and that also works. So I am a little confused why both work, still a lot I'm not getting about the numbers.... (Btw, also had to rejigger the Read Wave Header subcomp to properly calculate the data section size, took a minute to get that sorted.)
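One possible way to account for the 52 bytes, as a sketch only: it assumes the afconvert file is laid out as RIFF header, a plain 16-byte PCM "fmt " chunk, the "FLLR" chunk, then "data", and that the start-byte port counts from 1 (which the 44 -> 45 case further down suggests). The full chunk listing isn't shown here, so treat the layout as a guess.

```python
RIFF_HEADER = 12     # "RIFF" + 4-byte size + "WAVE"
CHUNK_HEADER = 8     # 4-byte chunk ID + 4-byte chunk size
FMT_BODY = 16        # assumed: plain PCM "fmt " chunk body
FLLR_BODY = 4044     # the "Sub-chunk 2 size" from the File info readout

# Everything before the payload that is not FLLR filler:
# RIFF header, the whole fmt chunk, the FLLR chunk header, the data chunk header.
overhead = RIFF_HEADER + (CHUNK_HEADER + FMT_BODY) + CHUNK_HEADER + CHUNK_HEADER

data_offset = overhead + FLLR_BODY        # 0-based offset, as afinfo reports it
first_byte_one_based = data_offset + 1    # same position when counting from 1
```

Under those assumptions the overhead comes out at 52 and the offset at 4096, so 4097 would be the true first payload byte. Starting at 4045 would presumably read 52 extra bytes (26 16-bit samples) ahead of the payload, a small enough shift that a display of 32768 samples may look the same.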

Finally -- success!!!

Part 2: tried a simple conversion from Audacity, since the export to .wav only exports 16-bit. (I set up a macro -- now I can easily batch convert my 32-bit files to 16-bit.) Terminal afinfo shows that the result is different from the .wav made with Apple's afconvert. The Audacity file is presumably more "universal" (i.e., Audacity does not add the "FLLR" chunk and the +4k byte padding -- why oh why would Apple do that...).

And using the data file offset to set the data start byte (to 45) works:

File type ID:   WAVE
Num Tracks:     1
----
Data format:     1 ch,  44100 Hz, 'lpcm' (0x0000000C) 16-bit little-endian signed integer
                no channel layout.
estimated duration: 0.743039 sec
audio bytes: 65536
audio packets: 32768
bit rate: 705600 bits per second
packet size upper bound: 2
maximum packet size: 2
audio data file offset: 44
optimized
source bit depth: I16
----

Yet another doc has proved helpful in all this RIFF stuff: https://code.google.com/archive/p/opentx/issues/192 (which originated here: https://stackoverflow.com/questions/6284651/avaudiorecorder-doesnt-write... ).

Notable:

Reading WAVE files properly must really begin as an exercise in locating and identifying RIFF subchunks.

And:

It is allowable to insert subchunks after the data payload.

Which gets back to Steve's point not to trust where chunks are. Case in point, learned today that "acidized" .wav's -- a common format for adding loop metadata readable by audio sampler synths -- put their loop metadata after the audio data "payload".
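To make the "locate the subchunks first" idea concrete, here is a minimal sketch of a chunk walker (outside Vuo, in Python). The function names and the synthetic demo file are made up for illustration; the demo mimics a "fmt " + "FLLR" + "data" layout with one trailing chunk whose body is dummy zeros standing in for loop metadata.

```python
import struct

def list_riff_chunks(blob):
    """Return (chunk_id, body_offset, body_size) for each top-level chunk in a WAVE file."""
    if blob[:4] != b"RIFF" or blob[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    chunks = []
    pos = 12  # first chunk header follows "RIFF", size, "WAVE"
    while pos + 8 <= len(blob):
        chunk_id, size = struct.unpack_from("<4sI", blob, pos)
        chunks.append((chunk_id.decode("ascii", "replace"), pos + 8, size))
        pos += 8 + size + (size & 1)  # chunk bodies are padded to an even length
    return chunks

def make_chunk(chunk_id, body):
    """Build one chunk: ID, little-endian size, body, optional pad byte."""
    pad = b"\x00" if len(body) & 1 else b""
    return chunk_id + struct.pack("<I", len(body)) + body + pad

# Synthetic demo file (not a real recording): 16-bit mono PCM at 44100 Hz.
fmt_body = struct.pack("<HHIIHH", 1, 1, 44100, 88200, 2, 16)
riff_body = (b"WAVE"
             + make_chunk(b"fmt ", fmt_body)
             + make_chunk(b"FLLR", bytes(4044))
             + make_chunk(b"data", bytes(65536))
             + make_chunk(b"acid", bytes(24)))  # stand-in trailing metadata
demo = b"RIFF" + struct.pack("<I", len(riff_body)) + riff_body

chunks = list_riff_chunks(demo)
data_offset = next(off for cid, off, size in chunks if cid == "data")
```

Looking up the "data" entry by ID, rather than assuming a fixed byte position, gives the payload offset whether or not filler comes before it or metadata comes after it.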

Finally...

Magneson wrote:

If you're not scared of some heavy nerding, you can also just get the bytes from the wav files via the Data nodes and convert the sample range from the file to the Y-values you need. That way you get straight to the data you want ....

::ROFL:: Well, I guess I'm learning a few things. :-0

(Ps. is drag/drop broken for adding new files to posts?)

jersmi commented on jersmi's Discussion, “3D audio file display”

Still haven't had time to dig in with this, but I can at least report that the sine wave file you posted works fine, as well as any other 16-bit single cycle file I have on hand. Apparently there's something about the 32-bit float type that needs to be sorted. Byte order? Conversion calculation?
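For comparison, here is a rough sketch of how the two sample types differ when decoded outside Vuo. `decode_samples` is a made-up name, and the sketch assumes mono, little-endian data. The point is that 32-bit float samples are IEEE 754 values, so they can't be read as integers and scaled the way 16-bit samples are. (In the "fmt " chunk, format tag 1 means integer PCM and 3 means IEEE float.)

```python
import struct

def decode_samples(payload, bits_per_sample, is_float):
    """Decode little-endian mono LPCM bytes into floats in roughly [-1, 1]."""
    if is_float and bits_per_sample == 32:
        count = len(payload) // 4
        # Each group of 4 bytes is already a float; no scaling needed.
        return list(struct.unpack("<%df" % count, payload[:count * 4]))
    if not is_float and bits_per_sample == 16:
        count = len(payload) // 2
        ints = struct.unpack("<%dh" % count, payload[:count * 2])
        # Signed 16-bit integers span -32768..32767; scale into [-1, 1).
        return [value / 32768.0 for value in ints]
    raise ValueError("unsupported sample format")

# The same three sample values, stored both ways.
as_float = decode_samples(struct.pack("<3f", 0.0, 0.5, -1.0), 32, True)
as_int16 = decode_samples(struct.pack("<3h", 0, 16384, -32768), 16, False)
```

Both calls give the same values, but only because each byte layout is interpreted with its own rule.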

jersmi commented on jersmi's Discussion, “3D audio file display”

Yeah, thanks, Magneson. Logically I would divide my data by 16 -- the test wavetable is 16 sequential sine waves. I'll see what I can do.
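A minimal sketch of that divide-by-16 step, assuming the decoded samples are already in a flat list and the table length divides evenly (the 32768-sample test file would give 2048 samples per cycle). `split_cycles` and the generated sine table are stand-ins, not the actual file.

```python
import math

def split_cycles(samples, cycle_count):
    """Split a flat sample list into cycle_count equal-length cycles."""
    if len(samples) % cycle_count:
        raise ValueError("sample count is not divisible by cycle count")
    size = len(samples) // cycle_count
    return [samples[i * size:(i + 1) * size] for i in range(cycle_count)]

# Stand-in for the test wavetable: 16 sequential sine cycles, 32768 samples.
total, cycles = 32768, 16
table = [math.sin(2 * math.pi * cycles * n / total) for n in range(total)]

rows = split_cycles(table, cycles)  # one row per cycle, e.g. one 3D line each
```

Each row could then be drawn as its own line, offset along the third axis.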

The good news is that LPCM is by far the most used, and covering a few can cover a lot of ground. I know there can also be proprietary stuff for specific hardware/software, video cameras, game consoles. Then in addition to all the stuff in the doc you sent, there's also other types of metadata. For ex., here's an info page from the US Library of Congress on efforts to standardize and specs for embedding metadata.

Collecting more links to possibly refer to:

https://www.sounddevices.com/32-bit-float-files-explained/

jersmi commented on jersmi's Discussion, “3D audio file display”

Jean Marie, thanks for the Wikipedia link, and apologies if my response seemed behind the curve here. I was just processing that users can still manage the data in .wav files with available Vuo tools, as long as they can sort out the header + data structure and know the compression type, which is indicated in the header chunks (using something like Magneson's subcomp or whatever).

And acknowledging that it would not be practical for Vuo devs to put out a "find in data" to cover all .wav compression types (cuz I have no idea what range of possible byte orders might be out there with proprietary types, etc.).

I am still stumped on getting my 32-bit float .wav to output correctly. I'm pretty sure I have the header+data sorted, I just haven't figured out how to parse the waveform data. Should look like 16 continuous sine waves, but mine still looks like the pic.

jersmi commented on jersmi's Discussion, “3D audio file display”

Yeah -- if one knows the .wav file compression format, then one can "manually" sort out the header data, correct? That is, I understand that .wav is a container for multiple compression types.
