
Offer of optimisation of sprite and sound files


timegrinder


Hey all,

TL;DR: If you're interested, some of the game files could be made smaller with assistance to keep the process ongoing for new files. Cut the current sprite source size by 27% (12.8MB), audio source by ~38% (~30.7MB), paradise.rsc lowered ~28% (~28MB) with sound optimisation, another ~12% (~12MB) with sprite optimisation (~100MB -> ~66MB total)

Too long; read anyway: (thanks for reading :) )

This isn't a code thing or feature to be added, but I know someone involved in the server and through them took an interest in the game data. I have experience in image / web site data optimisation and used those eyes to look over the git repo. After I grabbed a copy of it I saw that there was some data being used up that didn't need to be in the game resources. I'm not intending to step on any toes, so any issues / rejections aren't a problem. It's just something I managed to do relatively easily (taught myself a few things) and figured I'd offer it up in case it was helpful to the people running the server, as it can be integrated into the media workflow or simply done on a semi-regular basis after a number of updates.

Basic explanation:
Sprites and sound files can be optimised to keep 100% of their quality (or 99.99% in the case of sound) with, in some cases, significantly smaller file sizes.

What does this accomplish?
Aside from the game taking up less space (yes, it's already extremely small by today's standards), it means that the server updating clients takes less time: the resource file can be pushed out in a shorter time at a smaller size, consuming less bandwidth for the server infrastructure etc. The savings get multiplied for every player that connects for the first time, or with a broken resource cache, or after an update etc.

How is this done?
The sprites can be batch optimised. I've already done a complete test to crush them as far as they can possibly go with 100% exact pixel accuracy to their originals. I've written a batch script for this, but I want to add some final safety measures so that I can package it and provide it in an 'anyone can use this without necessarily knowing how it works, or risking data damage in the process' form, with statistics and warnings if issues arise (also without overwriting files if there's an error etc). The final step is just a safety check to confirm that the DMI metadata (the bit that tells BYOND how to render the directions and animations of the sprite sheet) is completely unaltered during the process, so that it can warn if an issue arises that may need manual intervention. If a user repeats the optimisation on already processed data there will be no change and no damage, just wasted CPU time.

The audio optimisations are a lot more complex and I do not have a way to automate them (and might not ever without potentially destroying quality or losing space efficiency), but someone with enough knowledge / practice and a Digital Audio Workstation (even a free one) can do some manual analysis and alterations (spectrum analysis, lowering the sample rate if it doesn't need to be so high based on frequencies used etc) to the sound files before encoding them, and test which encoding setting provides the most accurate data for the lowest file size (basically encoding several examples at different settings and then just comparing them to the original, because 'the highest setting' is not always the best. A file might only need to be 1MB in order to be 'perceptibly perfect', but higher settings may make that same file significantly larger). These kinds of optimisations must be done on the highest quality source available (not the existing ogg files unless absolutely necessary) and should only ever be done once per new sound file added.

As I've gone I've also been writing the basic outline of a tutorial / best (ish) practice to guide others through the process, depending on demand that may be turned into a full fledged tutorial on the process.

After the sprite optimisation I ran up a private server and had a poke around, but not being a player I was only able to run around and do basic interactions, the sprites seemed fine as far as I tested. The sound optimisation I'm not currently 100% sure will work as I have not tested that and there may be edge cases with the alterations I've made that cause some files to fail, though I am not able to actively test this myself, I have simply stuck to the limitations of the FMODex sound system that BYOND uses, so 'on paper' they should work. In reality however, I do not know and would appreciate testers if people are interested.

I've forked the repo and added a testing branch to my own for this purpose until I know my alterations are sound, it can be found at:
https://github.com/timegrinder/SS13-Paradise/tree/testing

Currently it only has the audio optimisations as I want to thoroughly check the DMI metadata on the sprites to make sure they are all intact before uploading them.

The process can also be backported to the other stations / code bases if people care, whether just them finding out your data is in a better envelope and pulling it, or them processing their own in some way.

Hope you survived the rant! Any responses or comments / questions etc are welcome, I might not respond terribly quickly or sometimes forget to check back.


Thanks a bunch for this man, I've passed it on to the people who do the technical stuff and I'll ping them here so they remember.

 

@Allfd when you get time can you look at all this or pass it to someone else who can?


For sure,

The DMI issue has been identified, so if that turns out fine, I think this is entirely good to go on the icon front.  

The metadata itself should be fine as long as you respect the property for it.  As an example, the bees.dmi icons for bees has

 

Quote

  Properties:
    date:create: 2019-01-17T15:30:48-05:00
    date:modify: 2018-04-04T20:13:16-04:00
    Description:

# BEGIN DMI
version = 4.0
    width = 32
    height = 32
state = "bee_base"
    dirs = 1
    frames = 6
    delay = 0.5,0.5,0.5,0.5,0.5,0.5
state = "bee_grey"
    dirs = 1
    frames = 6
    delay = 0.5,0.5,0.5,0.5,0.5,0.5
state = "bee_wings"
    dirs = 1
    frames = 6
    delay = 0.5,0.5,0.5,0.5,0.5,0.5
state = "queen_base"
    dirs = 1
    frames = 8
    delay = 0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5
state = "queen_grey"
    dirs = 1
    frames = 8
    delay = 0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5
state = "queen_wings"
    dirs = 1
    frames = 8
    delay = 0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5
state = "queen_item"
    dirs = 1
    frames = 9
    delay = 0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5
# END DMI

This is an ordered list starting at pixel (0,0) and moving across with each frame being 32x32
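To make the "ordered list" idea concrete, here is a minimal Python sketch that reads a metadata block like the one quoted above and counts how many frames the sheet should contain. The function names `parse_dmi` and `total_frames` are made up for illustration, and the assumption that each state occupies `dirs * frames` cells is the usual reading of the format rather than something stated in this thread.

```python
def parse_dmi(text):
    """Split DMI metadata text into a header dict and an ordered list of states."""
    header = {}
    states = []
    current = header
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip the BEGIN/END marker lines and blanks
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key == "state":
            # Every "state" line starts a new entry; order matters.
            current = {"name": value.strip('"')}
            states.append(current)
        else:
            current[key] = value
    return header, states


def total_frames(states):
    """Assumed cell count: each state uses dirs * frames cells, in list order."""
    return sum(int(s.get("dirs", 1)) * int(s.get("frames", 1)) for s in states)
```

Under that assumption the bees.dmi block above works out to 6 + 6 + 6 + 8 + 8 + 8 + 9 = 51 frames of 32x32 pixels.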

It is 100% safe to copy this metadata out, do whatever you want to the dmi as a png file, including rewriting all the metadata.  Then just re-insert this property.

Audio is difficult as you described.

But I would say, yeah, lets do this.

#Edit#

OK, so I should probably add: the property I am talking about is the compressed text metadata (zTXt). You can copy that directly and it will all just work, assuming the 32x32 (or whatever size) frames were not moved in some way.

Edited by Allfd

Hey there,

Great! I'd already worked all of that out before I even got it to the stage of bringing it to you on the forum; the last 'safeguard' was just to do a dump of the zTXt DMI chunk from the source and make sure the output had the same chunk in the same format. I eventually worked it out just by decoding and dumping the contents of all the chunks, running it through sed to select between # BEGIN DMI <> # END DMI inclusive, and md5'ing the results; provided they match, it's considered a pass. The script is now completed: I worked on it last night and have just been fine tuning the optimisation parameters, so at some point I'll clean it up and work out the licenses or just give a list of 'go get these programs from these places to use this', but for now you can basically chuck any sprite assets at me and I can run them through. The entire paradise-master repo takes 8 minutes on the wall and ~1300 cpu seconds on my i7-4970K. Most of the time's wasted in bullshit Windows/Batch performance issues but it takes so little time anyway.
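For anyone who wants to reproduce that safeguard without sed, the same idea can be sketched in Python using only the standard library. This is not the actual batch script, just an illustration: the function names are hypothetical, and it assumes the DMI block lives in a zTXt chunk using the standard zlib compression method.

```python
import hashlib
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_chunks(png_bytes):
    """Yield (chunk_type, payload) for every chunk in a PNG byte string."""
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(png_bytes):
        length, ctype = struct.unpack(">I4s", png_bytes[pos:pos + 8])
        yield ctype, png_bytes[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + payload + 4 CRC


def dmi_block(png_bytes):
    """Return the '# BEGIN DMI' .. '# END DMI' text (inclusive), or None."""
    for ctype, payload in iter_chunks(png_bytes):
        if ctype != b"zTXt":
            continue
        _keyword, _, rest = payload.partition(b"\x00")
        # rest[0] is the compression method byte; the zlib stream follows it.
        text = zlib.decompress(rest[1:]).decode("latin-1")
        start = text.find("# BEGIN DMI")
        end = text.find("# END DMI")
        if start != -1 and end != -1:
            return text[start:end + len("# END DMI")]
    return None


def metadata_unchanged(original_bytes, optimised_bytes):
    """True when both files carry a DMI block with the same md5."""
    a, b = dmi_block(original_bytes), dmi_block(optimised_bytes)
    if a is None or b is None:
        return False
    digest = lambda s: hashlib.md5(s.encode("latin-1")).hexdigest()
    return digest(a) == digest(b)
```

Calling `metadata_unchanged(open(src, "rb").read(), open(dst, "rb").read())` before overwriting anything would give the same pass/fail signal as the md5 comparison described above.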

EDIT: Same optimisations go for any website / wiki stuff too, it's simple to just grab them all in their directory structure, run them through the same optimiser, and copy them back, or re-add them to the wiki via SimpleBatchUpload

The method I was using to optimise the files didn't actually touch the metadata chunk, it just repackaged the PNG IDAT's losslessly (or at least lossless to the content, so if it was <256 colours it would optimise the palette order, or keep it true colour etc), and discarded any chunk that WASN'T that zTXt. It was just a matter of me putting in that 'check' so the finished script would be portable and usable by basically anyone.
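The "discard every chunk that isn't needed" step can be illustrated with a small Python sketch. It is not the optimiser used here (it does no recompression of the IDATs at all), and keeping `tRNS` alongside `zTXt` is an assumption on my part, since palette transparency lives in that chunk and dropping it would change how the pixels render.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Ancillary chunks to keep: the DMI metadata and palette transparency (assumed).
KEEP_ANCILLARY = {b"zTXt", b"tRNS"}


def strip_extra_chunks(png_bytes):
    """Copy a PNG, dropping ancillary chunks that are not in KEEP_ANCILLARY."""
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    out = [PNG_SIGNATURE]
    pos = len(PNG_SIGNATURE)
    while pos + 12 <= len(png_bytes):
        (length,) = struct.unpack(">I", png_bytes[pos:pos + 4])
        ctype = png_bytes[pos + 4:pos + 8]
        end = pos + 12 + length
        # Critical chunks (IHDR, PLTE, IDAT, IEND) have an uppercase first letter.
        is_critical = not (ctype[0] & 0x20)
        if is_critical or ctype in KEEP_ANCILLARY:
            out.append(png_bytes[pos:end])  # copied verbatim, CRC included
        pos = end
    return b"".join(out)
```

Because kept chunks are copied byte for byte, running it twice gives the same output as running it once, which matches the "no change and no damage" behaviour described for repeat runs.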

The images aren't resized or reshaped in any way, though I was looking at one of the older DMIs from another server that was just one big long 32 pixel high row, so I checked into the DMI stuff. I found the shape of the image (long or square) isn't important as long as the left to right order of the frames remains the same, as it just reads left to right in x pixel squares until it hits the end of the row and goes to the next one. Some rearrangement of frame groups might improve encoding efficiency but I don't think that's worth touching at all. It's also something we don't have to worry about, since the programs don't treat the image as a 'sprite sheet'; they just encode it as a flat image more efficiently.
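The left-to-right, row-by-row reading order can be written down as a tiny Python helper. `frame_origin` is a hypothetical name, and the 32x32 default is only taken from the bees.dmi example earlier in the thread.

```python
def frame_origin(index, sheet_width, frame_w=32, frame_h=32):
    """Top-left pixel of frame number `index` in a sheet read row by row."""
    per_row = sheet_width // frame_w
    if per_row < 1:
        raise ValueError("sheet is narrower than one frame")
    row, col = divmod(index, per_row)
    return col * frame_w, row * frame_h
```

For a 256 pixel wide sheet, frame 9 sits at (32, 32); for one long 1632 pixel row, frame 50 sits at (1600, 0). The frame order is the same in both layouts, which is why the sheet shape does not matter.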

The audio I already tweaked and put up I went back over because I realised I was carving the upper frequencies unintentionally, so some of the file size savings are smaller than I originally posted about, however that will come down to a preference: Do you want to retain 100% of the original sound/frequencies up to the range of human hearing (ie 22KHz ish, many of the sound files don't get anywhere near there anyway, so some go way above and can genuinely be cut without losing anything) or lower it to a standard frequency. Many computer systems can't even play above 16KHz for instance, so that level is an option for 'decent but not perfect' sound quality.

There is also the odd boon we could hit if we could find the original sources of some of the tracks, as they're quite obviously 'tracker' or sequenced tracks that have just been exported to an MP3 or OGG, and the original track would consume less space and likely be playable (or easily transferable to a tracker format) by FMODex. I did manage to track one down that was so freaking old I'm surprised I ever found it.

I also had another look over the encoding parameters and worked out more optimal size vs quality settings regarding sample rates, so that's looking better again. Unfortunately it's all manual spectrum matching and I highly doubt anything I can come across that can be packaged into a command line script would be able to do as good of a job as someone going through it by hand. There MAY be a 'near enough' with a large enough buffer above it to cut some of the file sizes down in an automated way without being 'optimal' just 'near enough to'.

 

Edited by timegrinder

Assuming I haven't severely overlooked boning something up, all the optimisations that have been done so far are now up on

https://github.com/timegrinder/SS13-Paradise/tree/testing/

The repo's a bit of a mess, but I figure you can just download it and grab whatever you want and integrate it etc.


Frequency is an interesting question.   I assume you want to lower the sample rate.  I don't actually know..  It could be tried across all the files as a lowpass filter and we could see if anyone noticed the lower frequency limit.

 


I've worked out that lowering the sample rate alone isn't the best option. Some files with extremely low sample rates actually benefit from encoding efficiency if the sample rate is increased before it's encoded because of the way Vorbis handles its encoding. By default lower 'quality' levels will shave frequencies off the top and bottom (mostly the top) based on the current sample rate and output a file, meaning that a higher quality level (and as a result, a higher file size) is required to encode something containing say, only 8KHz sample rate than if the file was to be saved at 16KHz-44KHz (without upsampling the audio itself) and then encoded at a lower setting.

As far as audible frequencies in the file are concerned it would be similar to / the same as lowering the sample rate, without actually affecting the container itself, which is in effect applying the filter, yeah. Some even have a lot of frequency data that is encoded but is so quiet (-80dB to -90dB) that it could be outright removed to save bits.
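To put those levels in perspective, a decibel figure relative to full scale converts to a linear amplitude with the standard 10^(dB/20) relation. The sketch below is only that arithmetic; the 16-bit comparison is my own illustration and not something measured on these files.

```python
def db_to_amplitude(db):
    """Linear amplitude ratio for a level in dB relative to full scale."""
    return 10 ** (db / 20)


def db_to_int16_peak(db):
    """Roughly how many 16-bit sample steps a full-scale-relative level spans."""
    return 32767 * db_to_amplitude(db)
```

At -80dB the peak is about three steps of a 16-bit sample, and at -90dB it is about one step, which is why content at that level can be treated as effectively silent.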

There's oddities with rates/frequencies and encoding though so some of it still needs further testing or tinkering. A large number of small files could just be trimmed in an automated way, but larger files could be hand tuned.

Example for the sample rate / frequency encoding:

main.ogg


Whoops, premature enteration and left it too long to edit the previous.

main.ogg - Original: 884K - Contains audible sound only below ~8-10KHz, this is after removing the frequencies too quiet to hear via filter and encoding accordingly to match the original audio
Sample rate -- Quality -- Size
44.1KHz -- Q 1 -- 421KB
22.0KHz -- Q 6 -- 511KB

song_main.ogg - Original: 1,948K - after noise removal. This one is an example of how lowering the sample rate doesn't affect the sound while lowering the file size
22.0KHz -- Q -2 -- 400KB
32.0KHz -- Q -2 -- 506KB
44.1KHz -- Q -2 -- 560KB
They're all relatively the same spectrum wise, just the 22/32 ones are very slightly quieter as they have a lower range
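As a quick way to compare those rows, the reduction against the original can be computed in a couple of lines of Python. This assumes the "K" and "KB" figures above are the same unit; `reduction_percent` is just an illustrative name.

```python
def reduction_percent(original_kb, new_kb):
    """Percentage saved relative to the original size, to one decimal place."""
    return round(100 * (1 - new_kb / original_kb), 1)


# Sizes copied from the two examples above: (original, {label: encoded size}).
EXAMPLES = {
    "main.ogg": (884, {"44.1KHz Q1": 421, "22.0KHz Q6": 511}),
    "song_main.ogg": (1948, {"22.0KHz Q-2": 400, "32.0KHz Q-2": 506, "44.1KHz Q-2": 560}),
}

for name, (original, variants) in EXAMPLES.items():
    for label, size in variants.items():
        print(f"{name} {label}: {reduction_percent(original, size)}% smaller")
```

For main.ogg the higher sample rate at a lower quality setting saves roughly 52% against roughly 42% for the lower sample rate at a higher quality setting, which is the point being made about sample rate versus quality level.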

There's also the benefit of lower encoding quality levels on higher quality / higher sample rate sources being able to carve out frequencies that are closer together without damaging the sound (ie the spectrum would look like /\/\/\/\/\/\) because the closer some frequencies are together the harder it is for the listener to be able to discern them from each other, and the sound test would end up sounding the same for the largest population of players. 

Ogg Vorbis itself has a low pass filter built into it, so keeping the sample rate high ends up with better results as the filter settings carve out the upper frequencies based on sample rate and drop the max frequency dramatically as quality setting and sample rate go down, though I'd prefer to do the filtering manually.

A side effect of going through the audio and processing is I've been collecting proper attributions for anything I've been able to rebuild from a higher quality source, so that can actually be kept somewhere properly (I didn't see a list, only one single entry, though I didn't look horribly hard either).

If I could output a list of 'used resources' instead of having to look them up in DM it'd be useful for targeting file optimisation as there's a lot of audio to optimise and some of those files actually aren't even used.
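One rough way to get such a list is to scan the .dm source for single-quoted file references and compare them against what is on disk. The Python sketch below assumes sounds are referenced as literals like 'sound/effects/foo.ogg' and live under a `sound` directory; both are assumptions, and anything built from strings at runtime would be missed, so the output is a list of candidates rather than proof a file is unused.

```python
import os
import re

# Single-quoted resource literal ending in a common audio extension (assumed list).
SOUND_REF = re.compile(r"'([^'\n]+\.(?:ogg|wav|it|xm|s3m|mod|mid))'", re.IGNORECASE)


def referenced_sounds(code_root):
    """Collect every quoted sound path found in .dm files under code_root."""
    found = set()
    for folder, _dirs, files in os.walk(code_root):
        for name in files:
            if not name.endswith(".dm"):
                continue
            path = os.path.join(folder, name)
            with open(path, encoding="utf-8", errors="replace") as fh:
                for ref in SOUND_REF.findall(fh.read()):
                    found.add(ref.replace("\\", "/").lower())
    return found


def unreferenced_sounds(repo_root, sound_dir="sound"):
    """List files under sound_dir that no .dm file mentions by literal path."""
    used = referenced_sounds(repo_root)
    unused = []
    for folder, _dirs, files in os.walk(os.path.join(repo_root, sound_dir)):
        for name in files:
            rel = os.path.relpath(os.path.join(folder, name), repo_root)
            rel = rel.replace(os.sep, "/").lower()
            if rel not in used:
                unused.append(rel)
    return sorted(unused)
```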

Edited by timegrinder

Managed to realise that tracker / module sound files were the way to go, found some of the older sources that were produced and then exported to WAV/MP3/OGG and found their way into the game. Going to see what others I can find, or even make some of them myself if I damn well have to. They provide a higher quality music file in a smaller file size due to being sample and instrument based. (Like a midi, but all of the 'sounds/instruments' are packed into the file itself instead of the OS/other package, so it's cross platform unless something breaks horribly). The module format I'm using is even able to handle additional FMODEx special effects being applied and stores the samples in FLAC instead of raw WAV

I've also been using the spare CPU time while I've been faffing about musically to do some more exhaustive optimisations on the sprites to squeeze every last bit out of them. Opening and re-saving them in DreamMaker will instantly undo a large chunk of the optimisations though, so I've also contacted the BYOND developers and asked if they can add an 'Optimise PNG with external program when saving DMI' option to DreamMaker which (if added) will let us just get the spritemongers to 'Save Optimised' when they save a DMI, which with a single command will give us about 97%+ of the space savings.

With my 'less harsh' redo of some of the audio, replacing some of the files entirely with a better format (The code lines calling the audio files have been patched on my copy too) and the tiny extra savings on the sprites the resource file size is down to 69.9MB. They'll get pushed to my repo at some point in the next couple of days.

There are also some audio files I believe might be duplicated in the file structure (which I understand for readability's sake, but de-duplicating some of these and redirecting the coded sound file lookups may be more efficient; we'd just keep a sound file list that people can look up for 'Where is that sound stored' or something).

Another consideration which will be more complex but potentially cut down on sound files is making use of the 'speed' settings when playing a sound file in the code, as some of the files aren't directly duplicated, but there appears to be a version of them at 1.0x speed, and another say, at 0.4x or 1.5x speed and so on. Along with that, maybe some code loops to replay a sound several times (since looping x times isn't coded yet) to cut down on sound effects that are just files with one sound repeating through it. Not holding my breath, but I also added my two cents to one of the BYOND Feature Requests to see if they'll add some of the functionality that already exists in FMODEx to DM, or at least add a 'play x times' feature without it being a code loop.


Very nice,

 

We are aware of the duplicate resources as @Fethas has been discovering them while cleaning up the codebase. You may want to make sure you are working with a recent resource commit, as she has been fixing everything related to some of our broken resource issues.

But yeah, we shouldn't have duplicate resources, that's a quality control oversight and is not intentionally done as far as I know.

 


I'm constantly updating from the github repo during my work while I've been running optimisation tests to find the best reduction for time spent for the script for whoever else to use.

Some of the duplications I'm picking up are complex, either because it isn't a simple 'two copies of the same file' case (some sounds aren't copy/pasted, they're encoded several times from the source etc, so you get more than one file with the 'same content' as far as we're concerned but different binary data), or because some of the files can be removed entirely in favour of a slight alteration to how the sound is played.
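The simple 'two copies of the same file' case, at least, is easy to find automatically by hashing file contents. The Python sketch below only catches byte-identical copies; the re-encoded duplicates described above have different binary data and would still need a listen or a spectrum comparison. `find_identical_files` is a hypothetical helper name.

```python
import hashlib
import os
from collections import defaultdict


def find_identical_files(root):
    """Group files under root whose contents are byte-for-byte identical."""
    by_hash = defaultdict(list)
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            with open(path, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            by_hash[digest].append(path)
    # Only groups with more than one path are duplicates.
    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
```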

I'll keep updating the resource base as I go just in case and take a look at what changes Fethas is making along the way so I'm more aware of them. Might even contact them and check at some point.


While I've been going through these I've also been making notes / rebuilding some of the sound files so that they're a higher quality for the same or lower file size. During the process I set up a script to tell me which files were mono stereo (ie, a single channel encoded as two) because they play like mono regardless. Then I can dump one of the channels and save some of the file size. This saves, in testing, 10-20% of the file size (if the file was encoded properly to begin with, otherwise more). One of the upsides of this is that if people are making mono sound files they can fit a higher quality into the same or smaller file size if they wish (at the loss of the 'stereo' feel of the two channels having minor variation added to them).
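A check along those lines can be sketched with Python's standard `wave` module. This is not the script mentioned above: it assumes the audio has already been decoded to 16-bit stereo WAV by some external tool, and the `tolerance` parameter is an invention of mine to allow for tiny differences introduced by lossy encoding.

```python
import array
import sys
import wave


def is_dual_mono(wav_file, tolerance=0):
    """True if the left and right channels of a 16-bit stereo WAV match."""
    with wave.open(wav_file, "rb") as wav:
        if wav.getnchannels() != 2 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo PCM")
        samples = array.array("h")
        samples.frombytes(wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    # Stereo frames are interleaved: L, R, L, R, ...
    left, right = samples[0::2], samples[1::2]
    return all(abs(l - r) <= tolerance for l, r in zip(left, right))
```

`wav_file` can be a path or an open binary file object. Files that come back `True` are candidates for dropping one channel.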

I have set up a copy of the voice synthesizer script / software used previously in SpaceStation13 so we can generate anything we like, including new / different sounding speech, though I haven't tuned it yet to output the 'AI' speech. The default is just what comes out for vox_fem words.

I've also got a subjective frequency test file that people can listen to and see what they can personally hear up to (or what their sound card / speakers / headphones can output to) to get a gauge of the user base if people want to participate and respond, this will tell us if there is an upper limit that is relatively standard across the average user that isn't the 22KHz frequency limit of files.
It emits a sine wave that moves from 0Hz to 22KHz over 44 seconds, so users can listen and work out the timestamp of where they stop being able to hear anything and give feedback to help us if they like.

https://goo.gl/forms/44fJ92GZPsq4H0Np1
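For reference, a sweep like the one described (0Hz to 22KHz over 44 seconds) rises by 500Hz every second, so a timestamp converts directly to a frequency. The Python sketch below shows that conversion plus one way to generate a similar linear sweep; it is not the file used for the survey, and the 44.1KHz sample rate and half-scale amplitude are my assumptions.

```python
import math
import struct
import wave


def sweep_frequency(t, duration=44.0, f_start=0.0, f_end=22000.0):
    """Frequency in Hz that a linear sweep has reached at time t (seconds)."""
    return f_start + (f_end - f_start) * t / duration


def sweep_samples(duration=44.0, f_start=0.0, f_end=22000.0, rate=44100, amplitude=0.5):
    """Yield 16-bit samples of a linear sine sweep."""
    k = (f_end - f_start) / duration  # sweep speed in Hz per second
    for n in range(int(duration * rate)):
        t = n / rate
        # Phase is the integral of the instantaneous frequency.
        phase = 2 * math.pi * (f_start * t + 0.5 * k * t * t)
        yield int(amplitude * 32767 * math.sin(phase))


def write_sweep(path, duration=44.0, f_start=0.0, f_end=22000.0, rate=44100):
    """Write the sweep to a mono 16-bit WAV file."""
    frames = b"".join(
        struct.pack("<h", s)
        for s in sweep_samples(duration, f_start, f_end, rate)
    )
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(frames)
```

So a listener who loses the tone at the 32 second mark has an upper limit of about `sweep_frequency(32)`, i.e. 16KHz.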


What I'm most concerned about is the audio. Two separate codebases have lowered audio quality to make things smaller--one of them had objectively bad audio (and still does) because of it. The other attempted it, but ended up reverting it because despite the fact that it was supposed to be a lossless conversion, people still picked up on the change.

 

I admit, I'm a bit of an audiophile and hypersensitive to sound and pressure changes, so subtle differences in audio quality I can pick up on quite easily.


If the audio is to be compressed (not opposed to), then I would want to be sure it's virtually indistinguishable from the source.

(same for pictures, but I realize that that's a bit easier to do).


Pictures are definitely always 100% lossless (unless something went wrong in the process, but in my hours of runs at it it's come out 100% accurate).
The audio I'm adjusting to keep the same spectrum, range, etc, the only things that are being done are removing bitrate overhead. If people can 'hear' a difference on most of these I'm going to be surprised, but then the point is also only to remove unneeded data use, so if people can legitimately hear a difference, I'll just make the change less severe until SOME data is being saved but nobody can hear it.

Given I'm running these files through a spectrum analyser when I work on them because it's significantly more accurate than my (or generally anyone's) hearing tests, I'm hoping they come out objectively similar. If they don't, I'll just redo them from the originals until they are, or there's no data saving and the original is used. I'm hoping people don't fall into the recurring trap of 'I know the data is different and thus I know the sound is worse', but at the end of the day, this is a game run by people and played by people, and thus any works done are for their benefit, so if they don't like the change then it gets fixed until they do or it doesn't get kept.

Unlike one of those code bases (I read through some of the horror shows of what people tried to do) I haven't drastically altered the audio format or content. Channels remain the same, detailed spectrum and gain are the same (There's a few minor differences in some files, I've trimmed out content that is literally inaudible because it was -90db or lower, but was still consuming data, one or two I've removed noise that was again so quiet it was inaudible but still wasting bits). When I get around to it some of the files in mono stereo can be undone which doesn't change how they sound, it just saves 10-20% of the file size from the testing I've done.

There's also the odd file that should actually be better quality than it was previously, even while being at a lower or similar file size, so hopefully people don't have an issue with the change, as even an objective increase in quality can feel and sound like a subjective loss of quality depending on what the listener is used to / expecting. While I've tested that the .it files work and play properly, they do play very slightly quieter because of the sound engine's default volume for audio module files, so that may pose an issue even though the size vs quality gains are massive.

That audio test I put up is also specifically so that people who ARE able to hear better / are more discerning can give me the feedback to what their upper frequency limits are in case there's room to actually start trimming into the content, but we'll see what people think if people are willing to do their own listening tests.

After I finally cleaned up my repo and worked out several ways of how not to use git there should be a commit that is a large chunk of the data savings that specifically targeted files which were obviously oversized for their content (and a couple of random ones I came across).

https://github.com/timegrinder/SS13-Paradise/tree/optim-snd

I'll post the optimised images to the optim-img tree in the next day, I held off committing them because I got sucked into doing a deep dive on the statistics to do some short circuiting of optimisations that aren't useful based on sprite content / layout - I was hoping to push out a script to do it at the same time as the images but that's less important, I'll just put up the sprites and tweak the script and put it up later.


On 2/10/2019 at 7:34 AM, timegrinder said:

The audio I already tweaked and put up I went back over because I realised I was carving the upper frequencies unintentionally, so some of the file size savings are smaller than I originally posted about, however that will come down to a preference: Do you want to retain 100% of the original sound/frequencies up to the range of human hearing (ie 22KHz ish, many of the sound files don't get anywhere near there anyway, so some go way above and can genuinely be cut without losing anything) or lower it to a standard frequency. Many computer systems can't even play above 16KHz for instance, so that level is an option for 'decent but not perfect' sound quality.

Big boys in sound say that the sample rate should be double the frequencies present. If we checked a flow of sound at a few ticks, it draws us a parabola, with (x) being the sample rate and (y) being the bit depth. If we had audio with frequencies from 200 Hz to 22 kHz, packed to 16 bit at 44 kHz, nerfing it to 16 bit 22 kHz would actually remove relevant stuff from both ends, since the ends of the parabola close in on zero. If the zero points of the parabola (y=0) are x=0 kHz and x=22 kHz, the 21 kHz band would be, in a rough example, played at value 1, compared with y=0 at x=0 kHz and x=44 kHz, in which case x=21 would be like 2. I didn't actually calculate these things, so the function isn't correct, but it should give the clue. Thus, if we made the sample rate too close to the actual noticeable signal, we would lose something more or less audibly. Moreover, the change might be audible if we packed something very high-quality like 24 bit 192 kHz to 16 bit 44 kHz, because of this parabola thing: the dynamics will change even in that case, in which no actual audio data is lost. That could be worked around using a compressor, in order to make the remaining low and top ends play louder; that's what they do in radio. Concerning the mono-stereo stuff, the dynamic range is a thing there too: if we had the perfectly same track twice in a spot, it will play louder, but that is not the most data-efficient way to achieve that effect.

Too bad I've got no good input on how to actually pack audio, with these circumstances given. I guess you need to apply high and low pass filters to the inaudible frequencies, then choose the sample rate and bit depth keeping in mind that "quality" there is actually a compromise between bit depth, which primarily affects the dynamic range of the track, and sample rate, which foremost affects the audible sounds (but secondarily the dynamics, as said). Everything surely has possibilities to be packed, so this is neat!

Edited by Regular Joe

That's right, yes, sample rate is always max audible frequency (KHz) * 2, so the sample rates are only being lowered (or were when I was originally testing, but are not any more unless the sound is extremely low frequency) down to above audible KHz * 2.

44KHz being the general standard (nowadays it's 48KHz) for general digital audio, for up to 22-24KHz of audible frequencies.

The optimisations regarding sample rate were strictly if, say, we had a sound effect that was stored in 44KHz+ sample rate, but the sound itself was only 2KHz, meaning taking it down to 8KHz sample rate would still give it plenty of headroom.
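The "frequency * 2" rule and the headroom example can be written out as a short Python sketch. The list of common sample rates and the name `lowest_standard_rate` are my own choices for illustration, and this shows only the arithmetic, not a recommendation to resample.

```python
# Commonly used sample rates in Hz (an assumed, non-exhaustive list).
STANDARD_RATES = (8000, 11025, 16000, 22050, 32000, 44100, 48000)


def lowest_standard_rate(max_freq_hz, rates=STANDARD_RATES):
    """Smallest listed rate strictly above twice the highest frequency present."""
    needed = 2 * max_freq_hz
    for rate in sorted(rates):
        if rate > needed:
            return rate
    raise ValueError("no listed sample rate is high enough")
```

A 2KHz sound needs more than 4KHz, so `lowest_standard_rate(2000)` picks 8000, matching the example above, while content reaching 22KHz needs the full 44.1KHz.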

However I've done away with those changes where possible and left everything at 44KHz and let the encoder handle it, because the encoder handles higher sample rates more efficiently for lower frequencies in use. If I lower the sample rate because the frequencies used are lower, the encoder begins to damage the audio and start notching (band stop) frequencies that are too close together for humans to generally be able to differentiate, which regardless of whether I can hear the difference (or others) I can see the difference when monitoring the spectrum plot and spectrogram, both of which are objective comparisons vs a subjective listening test.

The concept of lowering the sample rates is kind of done away with for the most part since I continued work on the optimisations because of issues / efficiency losses I encountered when handling them that way, as the Vorbis encoder encodes more efficiently (higher quality for lower data size) if I leave the sample rate at 44-48 (22-24KHz audible) even if the content is lower, and just let the encoder do the low pass filter itself based on quality level where appropriate or manually do the low pass / delete above a frequency if nothing audible is actually there.

In these cases through my testing and findings while playing with it, the sample rate changes aren't necessary at all until you hit extremely low audible frequencies, if we find there is an upper cap to audible frequencies that we can remove data with, anything that gets cut off the top will be done by flat deleting the content (if it's far beyond audible gain levels and there's nothing audible near it) or with a low pass filter without changing the sample rate.
Bit depth should never be touched, the masters should always be kept in whatever the max bit depth they were in is as lossy encoding doesn't have a fixed bit depth, it just encodes based on the content and storing the master in a lower bit depth will give the encoder less accurate audio to handle.

Overall, most of these changes are simply:
- Someone has a piece of audio containing 0-10KHz audible (by volume) frequencies + a bunch of frequencies that are too quiet to be heard ever, encoded in 48-90KHz sample rate (24KHz-45KHz audible) at a data rate in some cases of 500kbit/s - 1000kbit/s.
- The encoder encodes the entire frequency range even if a large portion of it is too quiet to ever be heard above SNR, and provides more bits than the audio actually contains because it was told to (because 'quality / VBR' settings are fixed between a lower and upper bound, even if that lower bound is still much higher than the actual upper bound of the audio).
- The audio itself only contains 20KHz sample rate worth of actual useful audio data, and 100kbit/s of actual data when encoded optimally
- We remove the overhead that is genuinely wasted space, saving a large portion of file size
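A minimal sketch of how the points above could be checked on a short decoded clip, assuming the samples are already available as a list of floats. The function names and the -60 dB floor are illustrative assumptions, not anything from the actual optimisation scripts, and the naive DFT is only practical for a few hundred samples.

```python
import cmath
import math


def nyquist_khz(sample_rate_hz):
    """Highest frequency (in KHz) that a given sample rate can represent."""
    return sample_rate_hz / 2 / 1000


def highest_loud_frequency(samples, sample_rate_hz, floor_db=-60.0):
    """Return the highest frequency (Hz) whose level is within floor_db of the loudest bin.

    Uses a naive DFT, so keep `samples` short (a few hundred values).
    The floor_db default is an arbitrary assumption for illustration.
    """
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(samples))
        magnitudes.append(abs(acc))
    peak = max(magnitudes)
    if peak == 0:
        return 0.0  # pure silence, nothing to keep
    threshold = peak * 10 ** (floor_db / 20)
    top_bin = max(k for k, m in enumerate(magnitudes) if m >= threshold)
    return top_bin * sample_rate_hz / n
```

If the returned frequency sits far below `nyquist_khz(sample_rate_hz) * 1000`, the file is carrying range that holds nothing loud enough to matter, which is the overhead the list describes.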

For proper mono-stereo (a stereo file with identical left and right channels) the output should be the same in both ears as it would be from a mono file. As I go through removing those, I delete one channel instead of mixing them down, then change the gain of the new mono track to match the perceptual gain level of the mono-stereo version, so the output should be 'the same' to within enough accuracy that the listener can't tell the difference. The problem with most of the mono-stereo tracks is that they're already so small (and there are so many of them) that I haven't undertaken the work, because I feel the data savings are too small for the effort (we're talking, at a guess, maybe 500KB saved to process 181 effects if I can't do it automatically). Hell, some of these mono-stereo effects could even be upgraded with some phase variance so that they sound fuller, as at their current size it won't be a huge increase in data, but that depends entirely on the effect and how long it plays for.
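A rough sketch of how mono-stereo files might be detected automatically, assuming the audio has already been decoded to interleaved integer samples (`[L0, R0, L1, R1, ...]`). The helper names are hypothetical, and the gain matching step described above is deliberately left out because it depends on how the player handles mono files.

```python
def split_channels(interleaved):
    """Split interleaved stereo samples [L0, R0, L1, R1, ...] into (left, right)."""
    return interleaved[0::2], interleaved[1::2]


def is_dual_mono(interleaved, tolerance=0):
    """True if both channels carry the same samples (within `tolerance`)."""
    left, right = split_channels(interleaved)
    if len(left) != len(right):
        return False  # odd sample count, not valid interleaved stereo
    return all(abs(l - r) <= tolerance for l, r in zip(left, right))


def to_mono(interleaved):
    """Keep only the left channel (delete one channel rather than mixing down)."""
    left, _ = split_channels(interleaved)
    return list(left)
```

Running `is_dual_mono` over every effect would at least give a list of candidates without having to check 181 files by ear.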

My intention is to never touch the audio you actually hear, just remove the overhead. There's no 'compression' or other processing being applied to the decoded audio; we're simply altering and optimising how that audio is packed into the file.

A lot of these changes are also meant to establish a 'best practice' (assuming people are interested in following one) for adding new audio to the game: start from the best quality master available and optimise it before it's added. That way there's no double handling, and no risk of players getting 'used to a sound' and then having it changed later.

Edited by timegrinder

2 hours ago, timegrinder said:

Overall, most of these changes are simply:
- Someone has a piece of audio containing 0-10KHz audible (by volume) frequencies plus a bunch of frequencies that are too quiet to ever be heard, encoded at a 48-90KHz sample rate (24-45KHz of representable range) at a data rate of, in some cases, 500-1000kbit/s.
- The encoder encodes the entire frequency range even if a large portion of it is too quiet to ever be heard above the noise floor, and provides more bits than the audio actually contains because it was told to (because 'quality / VBR' settings are fixed between a lower and upper bound, even if that lower bound is still much higher than what the audio actually needs).
- The audio itself only contains about a 20KHz sample rate's worth of actual useful audio data, and about 100kbit/s of actual data when encoded optimally.
- We remove the overhead that is genuinely wasted space, saving a large portion of file size

 

I asked so that it would be explained for bassists like me, and this is neat to see as a technical explanation. I can tell this is a very good way to do it, and it's something I should remember myself. I haven't done much cleaning of tracks to strip out the unnecessary. It could even improve the overall sound, on good equipment at least, since removing truly inaudible data may still add to the headroom.

2 hours ago, timegrinder said:


44KHz being the general standard (nowadays it's 48KHz) for general digital audio, for up to 22-24KHz of audible frequencies.

I'm getting old, honk

Edited by Regular Joe

Try the frequency test. On my system, which is rated up to 20KHz, I worked out that I can't hear above 16KHz (though I can still feel the sound up to about 18KHz). I got someone else to try it and they can't hear above about 15KHz but can feel it up to 16KHz. The audio might be stored with a higher range, but it's still assumed humans can't hear above 22KHz at the upper extreme. This is why I went back over the changes I'd made when I first started and re-did them to include all frequencies, until there's enough feedback (or an executive decision) to remove everything above frequency x. There are still a lot of data savings without cutting into the actual content wherever possible.

After some of the cleanup on the source media, some files have ended up with more clarity at similar (or lower) file sizes, as the silence is now actually closer to silence (or the noise floor at least) instead of having a faint 'washy' sound in the background.

As a side note, it's amusing to hide pictures in the audio / in the frequencies above 22KHz. Not that I'd do it here.


Current state of optimisations

Sprites up at https://github.com/timegrinder/SS13-Paradise/tree/optim-img
Sounds up at https://github.com/timegrinder/SS13-Paradise/tree/optim-snd

Sprites: Resource file size reduced to 99.022MB (11.477MB less than current master)
The files have also been tested for 100% match using the following methods:
- PSNR (peak signal-to-noise ratio: a number describing how closely the decoded image matches the original, where higher is closer and infinite means every pixel is identical) - They all come up infinite/100%
- DMI metadata hash (extract the DMI chunk as text and run it through MD5 to get a hash, then do the same to the optimised version; if the two hashes match, and neither is the 'empty string' hash you get when nothing was extracted, the metadata is unchanged)
As I'm not an expert or even a regular player of BYOND games, there may be unforeseen issues with transparencies, how objects are applied on mobs, or how effects are applied. I've loaded a server using the optimised files and run around the station for a while though, and the basics all look normal compared to the hosted server.
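For reference, a minimal sketch of the PSNR check, assuming both images have already been decoded to flat lists of 8-bit channel values in the same order. This is an illustration of the measure itself, not the tool that was actually used.

```python
import math


def psnr(original, optimised, max_value=255):
    """PSNR in dB between two equal-length sequences of channel values.

    Returns math.inf when every value is identical (a lossless result).
    """
    if len(original) != len(optimised):
        raise ValueError("images differ in size")
    if not original:
        raise ValueError("no pixel data to compare")
    mse = sum((a - b) ** 2 for a, b in zip(original, optimised)) / len(original)
    if mse == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / mse)
```

Anything other than `math.inf` means at least one pixel value changed, so for a lossless sprite pass it works as a simple pass/fail test.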
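The metadata hash check could look something like the sketch below. It assumes the DMI metadata lives in a PNG `zTXt` chunk with the keyword `Description`, which is how the format is commonly documented; the function names are hypothetical and this is not the original script.

```python
import hashlib
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_chunks(data):
    """Yield (chunk_type, chunk_body) pairs from raw PNG/DMI bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG/DMI file")
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        yield ctype, data[pos + 8:pos + 8 + length]
        pos += 12 + length  # length field + type + body + CRC


def dmi_metadata(data):
    """Return the decompressed metadata text, or b"" if none is found."""
    for ctype, body in iter_chunks(data):
        if ctype == b"zTXt":
            keyword, _, rest = body.partition(b"\x00")
            if keyword == b"Description":  # assumed keyword for DMI metadata
                return zlib.decompress(rest[1:])  # rest[0] is the compression method byte
    return b""


def metadata_matches(original, optimised):
    """True only if both files have metadata and the MD5 hashes agree."""
    a = dmi_metadata(original)
    b = dmi_metadata(optimised)
    if not a or not b:
        return False  # nothing extracted, so an 'empty string' match must not pass
    return hashlib.md5(a).hexdigest() == hashlib.md5(b).hexdigest()
```

The explicit empty check matters: an optimiser that strips the chunk entirely would otherwise produce two matching hashes of nothing.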

Sounds: Resource file size reduced to 85.694MB (24.805MB less than current master)
We'll see how these turn out based on opinion; they may end up larger if people find they sound different from the originals. Some audio files are actually higher quality, as I processed the original sources (and not the versions available in the master repo or other SS13 forks).
E1M1 will sound pretty different as it's the full range original source material, not the cut down version that ended up in the game (where the upper frequencies are cut drastically). I can however replicate that same frequency cut in the smaller file if that's preferred.

New resource file size (for both optimisations): 74.216MB (36.283MB less than current master)
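As a quick sanity check on the figures above, the sketch below derives the master resource size from the reported numbers and turns each saving into a percentage. Only the MB values come from the post; the variable and function names are made up for illustration.

```python
# Reported figures in MB, taken from the post above.
sprites_after, sprites_saved = 99.022, 11.477
sounds_after, sounds_saved = 85.694, 24.805
both_after, both_saved = 74.216, 36.283

# Each pair should add back up to the same master size.
master_mb = sprites_after + sprites_saved


def saving_percent(saved_mb, baseline_mb):
    """Saving as a percentage of the baseline size."""
    return 100 * saved_mb / baseline_mb


for label, saved in (("sprites", sprites_saved), ("sounds", sounds_saved), ("both", both_saved)):
    print(f"{label}: {saving_percent(saved, master_mb):.1f}% smaller than master")
```

All three pairs imply a master of about 110.5MB, and the savings work out to roughly 10.4%, 22.4% and 32.8% respectively.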

See how we go from here!

Edited by timegrinder

  • 1 month later...

Howdy,

Is this still something that's of interest?

If it is, I can put some time in to polish up the sprite optimiser script, instructions, and finish / share some of the documents I was writing to go with it and the sound stuff.


  • 3 months later...

A few months later I did the sine wave test. I could hear it up to the 32-33 second mark and then it dropped off quickly. I listened with neutral headphones (DT 770, 80 ohm) but through the average sound card of my old laptop.

How did the project go, did it go on hiatus?

Edited by Regular Joe

  • 3 months later...

Hey,

The stage the project was at required some testing outside of my own to make sure that the image optimisations weren't going to cause some issues I hadn't foreseen or come across in my own testing. Sound ones too, though they're more opinion based, with a couple of files needing specific checking (because I'd changed formats on some to improve quality/size ratios).

I left it there at the time as I hadn't heard anything back on it and the next steps were either:
1) Alter my methods to improve the results to user liking
or
2) If the results were sufficient, work out an end-user script to automate the process as well as provide a system to prevent previously processed files being re-processed unnecessarily. (This is because the scripts I wrote for testing purposes are exhaustive, take massive amounts of time, and don't have any checks in place for 'This file has already been done, don't do it until it changes' sorts of things).
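One possible shape for the 'already done, skip it until it changes' check in option 2 is a manifest of content hashes. This is only a sketch under assumptions: the manifest file, its JSON layout and all function names are hypothetical, and nothing like this exists in the testing scripts yet.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path):
    """SHA-256 of the file contents, used to detect changes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def load_manifest(manifest_path):
    """Load the {path: digest} manifest, or start empty if it doesn't exist."""
    p = Path(manifest_path)
    return json.loads(p.read_text()) if p.exists() else {}


def needs_processing(path, manifest):
    """True if the file is new or has changed since it was last optimised."""
    return manifest.get(str(path)) != file_digest(path)


def mark_processed(path, manifest):
    """Record the digest of the file as it is now (after optimisation)."""
    manifest[str(path)] = file_digest(path)


def save_manifest(manifest, manifest_path):
    """Write the manifest back out as JSON."""
    Path(manifest_path).write_text(json.dumps(manifest, indent=2, sort_keys=True))
```

The idea is to call `mark_processed` only after a file has been optimised and verified, so a later run skips it unless someone commits a new version of that sprite or sound.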

You can check out the results I'd gotten up to (though significantly outdated now) on the GitHub links provided above.

Edited by timegrinder
