We've documented some notes about the .sal file format in the forum post below: https://discuss.saleae.com/t/logic-2-capture-format-sal/1858
Originally, the request was to publish the .sal file format:
Example use case: receive a .sal from a colleague and be able to export and run a script on the output without having to leave my terminal. And then automating that process. Original discuss forum post is here: https://discuss.saleae.com/t/utilities-for-sal-files/725/2
However, we don't have plans to make the file format public for reasons described in the support article below: https://support.saleae.com/faq/technical-faq/is-the-.sal-file-format-documented
Instead, we would ideally want to provide a .sal-into-.bin translation through an API of the Logic 2 app. That way, we can change our internal file format to our heart's content without needing to make a public API & documentation release each time we do that.
Ideally this would be a tiny, efficient, standalone, open-source C library that opens the .sal file and gives fine-grained access to the raw data through the API (no need to waste time writing the data to some intermediate .bin file). Then it can be easily used by automation that needs to run as fast as possible.
+1 to Kor's request. I will be proceeding with the current automation APIs, but ideally it would be great if Saleae provided a standalone CLI tool which could export binary and/or csv output from a given range from a provided .sal file. It would be helpful not to require the GUI/App to perform the extraction of the data.
"Would it be possible to send me a description/struct definition of the current (2.4.0) .sal or at least a hint at the function/description of the block end/beginning transition bits that appear to follow the end of the block/beginning of the following block?"
It's now possible to automate the export of *.sal files to all of our supported formats using the new Logic 2 Automation API. More details here:
The CSV export format is pretty self-explanatory, and the binary export format is documented here (with sample parse code): support.saleae.com/faq/technical-faq/binary-export-format-logic-2
This won't expose the metadata about the capture though, like the trigger settings, voltage thresholds, sample rates, etc.
However, this does have the benefit of allowing you to export into a documented and easy-to-parse format, and it also allows exporting protocol results, which are not stored in *.sal files (protocol analyzers are re-run each time you load a *.sal file).
The rest of the metadata is in the meta.json file zipped into the *.sal file (just rename *.sal to *.zip to extract). We're happy to answer questions about this format; however, it's already changed 12 times since we started working on Logic 2, and the changes have been pretty substantial. I don't expect we'll publish an official specification (since it changes frequently with reckless abandon).
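For anyone who would rather not rename the file by hand, here is a minimal Python sketch that reads meta.json straight out of a .sal, relying only on the fact stated above that a .sal is a ZIP archive containing meta.json. The function name `read_sal_metadata` is made up for this example, and nothing is assumed about the keys inside the JSON, since the layout changes between Logic 2 versions.

```python
import json
import zipfile
from typing import Any, Dict, List, Tuple


def read_sal_metadata(sal_path: str) -> Tuple[List[str], Dict[str, Any]]:
    """Return (archive member names, parsed meta.json) for a .sal capture.

    No rename to .zip is needed: zipfile looks at the content, not the extension.
    """
    with zipfile.ZipFile(sal_path) as archive:
        names = archive.namelist()
        # "meta.json" is the member name mentioned in this thread.
        with archive.open("meta.json") as fh:
            meta = json.load(fh)
    return names, meta


# Example use (the path is a placeholder):
#   names, meta = read_sal_metadata("capture.sal")
#   print(names)               # meta.json plus the per-channel data files
#   print(sorted(meta.keys())) # inspect the top-level keys for your version
```

Printing the top-level keys first is the safer approach, because the structure is undocumented and may differ from one release to the next.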
To add further context (as a user) on why this is a hard format to keep documented: the actual encoding type isn't, as far as I can tell, the same throughout the file/capture. For periods of time where the time between state transitions is low, relative to the sample rate, there is a block format that contains (among other things) the sample rate and the length of the data portion of the block. The transitions are then encoded as the number of _additional_ clocks after the next clock (i.e. n = t-1) before the next transition. This is written as a variable number of bytes, with 0x40 of the first byte flagging 'more bytes', and 0x80 of each following byte doing the same, until a byte without 0x80 set. Why not use 0x80 of the first byte as the flag? Good question! And this is the easy format to reverse engineer.
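A rough Python sketch of the variable-length scheme described above, for anyone following along. The bit layout is an assumption: the post only says which bits act as 'more bytes' flags (0x40 in the first byte, read here as 0x80 in each following byte). This sketch guesses that the low 6 bits of the first byte and the low 7 bits of each following byte carry the value, least-significant group first, and it ignores 0x80 of the first byte entirely. `decode_gap` is a hypothetical name, and none of this is confirmed by Saleae.

```python
from typing import Tuple


def decode_gap(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one stored value starting at buf[pos].

    Returns (clocks_until_next_transition, position_after_value).
    """
    first = buf[pos]
    pos += 1
    value = first & 0x3F          # ASSUMPTION: low 6 bits are payload
    shift = 6
    more = bool(first & 0x40)     # 0x40 of the first byte flags "more bytes"
    while more:
        b = buf[pos]
        pos += 1
        value |= (b & 0x7F) << shift  # ASSUMPTION: 7 payload bits, LSB group first
        shift += 7
        more = bool(b & 0x80)     # 0x80 of following bytes flags "more bytes"
    # The stored number is n = t - 1, so the real gap is t = n + 1 clocks.
    return value + 1, pos
```

Under these assumptions a single byte `0x05` decodes to a gap of 6 clocks, and the two bytes `0x41 0x01` decode to 66 clocks. If the real format orders the groups differently, only the shift logic would need to change.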
Where I'm stuck currently is deciphering the block format of what appears to be the raw transitions. Whereas the 'time to next transition' block had a fair bit of header information, the raw block I can't make heads or tails of. I *think* it's NRZ(S) (or perhaps NRZ(M) depending on context) encoded, to generate as many consecutive '00' bytes as possible, to facilitate better compression later on. That said, I have no idea.
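To illustrate why a mark-on-change encoding (NRZ-M style: a 1 bit means the level toggled, a 0 bit means it held) would produce long runs of zeros, here is a small Python sketch. It only demonstrates the guess in the post above. Nothing here is confirmed to match what the .sal raw blocks contain, and the function names are made up for the example.

```python
from typing import List


def nrzm_encode(levels: List[int], initial: int = 0) -> List[int]:
    """Turn sampled levels (0/1) into marks: 1 where the level changed."""
    marks = []
    prev = initial
    for level in levels:
        marks.append(level ^ prev)
        prev = level
    return marks


def nrzm_decode(marks: List[int], initial: int = 0) -> List[int]:
    """Rebuild sampled levels from marks by toggling on every 1."""
    levels = []
    level = initial
    for mark in marks:
        level ^= mark
        levels.append(level)
    return levels


# A slow square wave: 16 samples high, then 16 samples low.
samples = [1] * 16 + [0] * 16
marks = nrzm_encode(samples)
# Only the two edges become 1 bits; the other 30 bits are 0.
```

Packed into bytes, a stream like `marks` is almost entirely 0x00 for slowly changing signals, which is the property that would help a general-purpose compressor.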
But, yeah, if they're doing what I think they're doing inside of there, this isn't something that can be documented in a clear way outside of reading the source itself, and even then, it'd be quite difficult.
All that said, Hey Guys! Wanna explain what those 7 uint64_ts in the period type block are other than Sample Rate and DataLen? Or, perhaps, the block of inscrutable bits in the file header after 'type' and the raw block method? ;)
In all fairness, further assistance on this would be appreciated, but also understood if it's not forthcoming.
Oh, and why am I doing this in the first place? Well, NVMe drives die a rapid death when you need to write multiple TB of capture binaries to them per day, because the App OOMs due to the capture length and you need to process them externally instead... If I can read from the .sal directly, that's ~3% of the disk wear (not to mention I/O bandwidth) of the alternative.
Ok, cool. That second block isn't actually transition data (at least it doesn't appear needed to reconstruct). Found the count for them and I can index between blocks. Still don't quite understand the relation between that metadata array entry count and the transitions/transitions count, but I don't think I need to currently.
I guess I'll be sticking with 2.4.0 for a while, but I think I understand a bit more of how some of y'all think, so maybe it won't be too difficult to propagate to later versions.
Also, I'm guessing part of the reason for the blocks is that somewhere you're creating a separate block for each transfer up from the USB driver and you're just serializing the list of blocks. Simple enough approach.
Btw: Great tool, great software. Keep up the good work and don't mind the fool over here tilting at windmills.
Hi Dan, it sounds like you have everything working (nice work!), but I answered a few of your questions here: discuss.saleae.com/t/logic-2-capture-format-sal/1858
I'm happy to try and answer other questions you might have as well.
- The internal format is conveniently all packaged together in a ZIP file containing one data file per channel, and a metadata JSON file. I like that. The export just dumps a bunch of separate files in a directory w/o metadata (e.g. settings).
- For example, the metadata in internal format has the channel names configured by the user, and other channel settings, such as the voltage threshold. Export just throws that all away.
- At the moment the most awkward thing about not having the metadata is not having the channel names configured by the user. Just having export write the metadata JSON file into the same directory as the channel data files would be fine.
- For what it's worth, I'm willing to accept sketchy documentation that says in BIG PRINT at the top, "THIS INFORMATION WILL BE OUTDATED VERY SOON. WE RECOMMEND USING EXPORT INSTEAD."
- I can already parse the information out of the JSON in the .sal, so it's fine with me if you just include that as-is in the export, even though it might contain information otherwise not relevant to export.
- Also, maybe you could have a checkbox to bundle up the export into a single ZIP file, like the .sal file. I'm not sure whether there are any other things you might group into "advanced options" for export, but if there are, the checkbox for single ZIP file could be there.
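Until export does something like this itself, the requests in the list above can be approximated from the outside. The Python sketch below copies meta.json out of the .sal into an existing export directory and can optionally bundle that directory into one ZIP. It assumes only that the .sal is a ZIP containing meta.json, as described in this thread. `add_metadata_to_export` and its parameters are hypothetical names, not part of any Saleae tool.

```python
import shutil
import zipfile
from pathlib import Path


def add_metadata_to_export(sal_path: str, export_dir: str, bundle: bool = False) -> Path:
    """Copy meta.json from a .sal into export_dir; optionally zip the directory.

    Returns the export directory, or the path of the ZIP when bundle=True.
    """
    out_dir = Path(export_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(sal_path) as archive:
        # Written as-is, including any fields not relevant to the export.
        (out_dir / "meta.json").write_bytes(archive.read("meta.json"))
    if not bundle:
        return out_dir
    # make_archive appends ".zip" to the base name, so the archive lands
    # next to the export directory rather than inside it.
    return Path(shutil.make_archive(str(out_dir), "zip", root_dir=str(out_dir)))
```

Calling it with `bundle=True` after an export gives a single file holding the channel data and the user-configured channel names together, which is close to the checkbox suggested above.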
"Generic interface for the file output: we use software from CERN (root.cern.ch) to do statistical data analysis. It would be really great to attach an individual file output (you could provide a DLL containing the byte-conversion stuff and the user can implement their own methods, similar to the user code in the analysers).
It would be really great if one could use the output from the analyzers in the output, too"
"1st priority wish: Provide a DLL with the file I/O, so that I can load a .sal file and access it channel by channel for digital and analog. (Similar to what you are doing for the analysers). I can also offer my help here!"
I would like to know the format so that I can convert recordings captured with my own device to .sal and analyze the waveforms more conveniently. I would love to use Logic to record, but it's unreliable in my case (weeks-long capture), because the program closes every few days, sometimes the computer hangs (it's the computer's fault, not Logic's), and sometimes there is a Read Timeout and the capture stops. So I thought I would code a capture feature into my own code, which would run for a long time without maintenance, but I need some way to analyze the data afterwards.
I have a use case really similar to the one described here. I'd like to convert binary files back into the .sal binary format in order to re-open the capture in Logic 2. Having a bit of detail on how the .sal bin file is actually formatted would be really helpful for that.
The point here is actually to stay in the very nice Saleae environment to do my work and not have to use other tools. The work I'm doing involves comparing captures taken with my Saleae 16 pro with captures taken with other hardware. I'd love to import external binaries to use the power of the Saleae analysers on them too.
Another use case: convert data to *.sal format in order to analyze it in Logic 2. The original data is acquired from different hardware (due to voltage isolation requirements), but I want to use Logic 2 to analyze the protocol.
Is this idea closer to what you need?
No, I want to convert data that was captured on an oscilloscope to the *.sal format and then load it into Logic 2 for analysis.