next_split_s() could potentially commit out-of-bounds indexing if the
amount of source data was bigger than the destination data.
This is because the size of the source data passed in includes the null
terminator, so if 1 byte is not subtracted from it, then after it passes
through the VVV_min(), it will index 1 past the end of the passed buffer
when null-terminating it.
In contrast, the other argument of the VVV_min() does not need 1
subtracted from it, because that length does not include a null
terminator (next_split() returns the length of the substring, after all;
not the length of the substring plus 1).
(The VVV_min() here also shortens the range of values to the size of an
int, but we'll probably make size_t versions anyway; plus who really
cares about supporting massively-sized buffers bigger than 2 billion
bytes in length? That just doesn't make sense.)
If PHYSFS_enumerate() isn't successful, we now print that it wasn't
successful, and print the PhysFS error message. (We should get that
logging thing going sometime...)
Note that level dir listing still uses plenty of STL (including the end
product - the `LevelMetaData` struct - which, for the purposes of 2.3,
is okay enough (2.4 should remove STL usage entirely)); it's just that
the initial act of iterating over the levels directory no longer takes
four or SIX(!!!) heap allocations (not counting reallocations and other
heap allocations this patch does not remove), and no longer does any
data marshalling.
Like text splitting, and binary blob extra indice grabbing, the current
approach that FILESYSTEM_getLevelDirFileNames() uses is a temporary
std::vector of std::strings as a middleman to store all the filenames,
and the game iterates over that std::vector to grab each level metadata.
Except, it's even worse in this case, because PHYSFS_enumerateFiles()
ALREADY does a heap allocation. Oh, and
FILESYSTEM_getLevelDirFileNames() gets called two or three times. Yeah,
let me explain:
1. FILESYSTEM_getLevelDirFileNames() calls PHYSFS_enumerateFiles().
2. PHYSFS_enumerateFiles() allocates an array of pointers to arrays of
chars on the heap. For each filename, it will:
a. Allocate an array of chars for the filename.
b. Reallocate the array of pointers to add the pointer to the above
char array.
(In this step, it also inserts the filename in alphabetically -
without any further allocations, as far as I know - but this is a
COMPLETELY unnecessary step, because we are going to sort the list
of levels by ourselves via the metadata title in the end anyways.)
3. FILESYSTEM_getLevelDirFileNames() iterates over the PhysFS list, and
allocates an std::vector on the heap to shove the list into. Then,
for each filename, it will:
a. Allocate an std::string, initialized to "levels/".
b. Append the filename to the std::string above. This will most
likely require a re-allocation.
c. Duplicate the std::string - which requires allocating more memory
again - to put it into the std::vector.
(Compared to the PhysFS list above, the std::vector does less
reallocations; it however will still end up reallocating a certain
amount of times in the end.)
4. FILESYSTEM_getLevelDirFileNames() will free the PhysFS list.
5. Then to get the std::vector<std::string> back to the caller, we end
up having to reallocate the std::vector again - reallocating every
single std::string inside it, too - to give it back to the caller.
And to top it all off, FILESYSTEM_getLevelDirFileNames() is guaranteed
to either be called two times, or three times. This is because
editorclass::getDirectoryData() will call editorclass::loadZips(), which
will unconditionally call FILESYSTEM_getLevelDirFileNames(), then call
it AGAIN if a zip was found. Then once the function returns,
getDirectoryData() will still unconditionally call
FILESYSTEM_getLevelDirFileNames(). This smells like someone bolting
something on without regard for the whole picture of the system, but
whatever; I can clean up their mess just fine.
So, what do I do about this? Well, just like I did with text splitting
and binary blob extras, make the final for-loop - the one that does the
actual metadata parsing - more immediate.
So how do I do that? Well, PhysFS has a function named
PHYSFS_enumerate(). PHYSFS_enumerateFiles(), in fact, uses this function
internally, and is basically just a wrapper with some allocation and
alphabetization.
PHYSFS_enumerate() takes in a pointer to a function, which it will call
for every single entry that it iterates over. It also lets you pass in
another arbitrary pointer that it leaves alone, which I use to pass
through a function pointer that is the actual callback.
So to clarify, there are two callbacks - one callback is passed through
into another callback that gets passed through to PHYSFS_enumerate().
The callback that gets passed to PHYSFS_enumerate() is always the same,
but the callback that gets passed through the callback can be different
(if you look at the calling code, you can see that one caller passes
through a normal level metadata callback; the other passes through a zip
file callback).
Furthermore, I've also cleaned it up so that if editorclass::loadZips()
finds a zip file, it won't iterate over all the files in the levels
directory a third time. Instead, the level directory only gets iterated
over twice - once to check for zips, and another to load every level
plus all zips; the second time is when all the heap allocations happen.
And with that, level list loading now uses less STL templated stuff and
much less heap allocations.
Also, ed.directoryList basically has no reason to exist other than being
a temporary std::vector, so I've removed it. This further decreases
memory usage, depending on how many levels you have in your levels
folder (I know that I usually have a lot and don't really ever clean it
up, lol).
Lastly, in the callback passed to PhysFS, `builtLocation` is actually no
longer hardcoded to just the `levels` directory, since instead we now
use the `origdir` variable that PhysFS passes us. So that's good, too.
If PHYSFS_mountHandle() failed to mount a zip file, we would print
PhysFS's error message straight, without any surrounding context. This
seems a little weird, and doesn't maximize understanding for readers;
I've made it so now the error message is "Could not mount <zip file>:
<PhysFS error>".
When Ethan added PhysFS to the game, he put in a hardcoded check (marked
with a FIXME) that explicitly removed all filenames that were "data"
returned by PHYSFS_enumerateFiles(). Apparently this was due to a weird
bug with the function putting in "data" strings in its output in PhysFS
2.0.3; however, the game now uses PhysFS 3.0.2, and I could not
reproduce this bug on my system. (I also tested, and this also
straight-up ignores legitimate level filenames that just happen to be
"data" (without the .vvvvvv extension).)
After talking with Ethan in Discord DMs, I asked if we could remove this
check, and he said that we could. So I'm doing it now.
Just like I refactored text splitting to no longer use std::vectors,
std::strings, or temporary heap allocations, decreasing memory usage and
improving performance; there's no reason to use a temporary
heap-allocated std::vector to grab all extra binary blob indices, when
instead the iteration can just be more immediate.
Instead, what I've done is replaced binaryBlob::getExtra() with
binaryBlob::nextExtra(), which takes in a pointer to an index variable,
and will increment the index variable until it reaches an extra track.
After the caller processes the extra track, it is the caller's
responsibility to increment the variable again before passing it back to
getExtra().
This avoids all heap allocations and brings down the memory usage of
processing extra tracks.
If you configure the build with -DBUNDLE_DEPENDENCIES=OFF, then VVVVVV
will dynamically link with TinyXML-2 and PhysicsFS instead of using the
bundled source code in third_party/ and statically linking with them.
Unfortunately, it doesn't seem like distros package LodePNG, and LodePNG
isn't intended to be packaged, so we can't dynamically link with it, nor
can we use some system LodePNG header files somewhere else because those
don't exist.
UTF8-CPP is a special case, because no matter what, it's going to be
statically linked with the binary (it doesn't come as a shared object
file in any way). So with -DBUNDLE_DEPENDENCIES=OFF, we will use the
system UTF8-CPP header files instead of the bundled ones, but it will
still be statically linked in with the binary.
The main motivation for doing this is so if VVVVVV ever gets packaged in
distros, distro maintainers would be more likely to accept it if there
was an option to compile the game without bundled dependencies. Also, it
discourages modifying the third-party dependencies we have, because it's
always possible for someone to compile those dependencies without our
changes, with this CMake option.
This patch restores some 2.2 behavior, fixing a regression caused by the
refactor of properly using std::vectors.
In 2.2, the game allocated 200 items in obj.entities, but used a system
where each entity had an `active` attribute to signify if the entity
actually existed or not. When dealing with entities, you would have to
check this `active` flag, or else you'd be dealing with an entity that
didn't actually exist. (By the way, what I'm saying applies to blocks
and obj.blocks as well, except for some small differing details like the
game allocating 500 block slots versus obj.entities's 200.)
As a consequence, the game had to use a separate tracking variable,
obj.nentity, because obj.entities.size() would just report 200, instead
of the actual amount of entities. Needless to say, having to check for
`active` and use `obj.nentity` is a bit error-prone, and it's messier
than simply using the std::vector the way it was intended. Also, this
resulted in a hard limit of 200 entities, which custom level makers ran
into surprisingly quite often.
2.3 comes along, and removes the whole system. Now, std::vectors are
properly being used, and obj.entities.size() reports the actual number
of entities in the vector; you no longer have to check for `active` when
dealing with entities of any sort.
But there was one previous behavior of 2.2 that this system kind of
forgets about - namely, the ability to have holes in between entities.
You see, when an entity got disabled in 2.2 (which just meant turning
its `active` off), the indices of all other entities stayed the same;
the indice of the entity that got disabled stays there as a hole in the
array. But when an entity gets removed in 2.3 (previous to this patch),
the indices of every entity afterwards in the array get shifted down by
one. std::vector isn't really meant to be able to contain holes.
Do the indices of entities and blocks matter? Yes; they determine the
order in which entities and blocks get evaluated (the highest indice
gets evaluated first), and I had to fix some block evaluation order
stuff in previous PRs.
And in the case of entities, they matter hugely when using the
recently-discovered Arbitrary Entity Manipulation glitch (where crewmate
script commands are used on arbitrary entities by setting the `i`
attribute of `scriptclass` and passing invalid crewmate identifiers to
the commands). If you use Arbitrary Entity Manipulation after destroying
some entities, there is a chance that your script won't work between 2.2
and 2.3.
The indices also still determine the rendering order of entities
(highest indice gets drawn first, which means lowest indice gets drawn
in front of other entities). As an example: let's say we have the player
at 0, a gravity line at 1, and a checkpoint at 2; then we destroy the
gravity line and create a crewmate (let's do Violet).
If we're able to have holes, then after removing the gravity line, none
of the other indices shift. Then Violet will be created at indice 1, and
will be drawn in front of the checkpoint.
But if we can't have holes, then removing the gravity line results in
the indice of the checkpoint shifting down to indice 1. Then Violet is
created at indice 2, and gets drawn behind the checkpoint! This is a
clear illustration of changing the behavior that existed in 2.2.
However, I also don't want to go back to the `active` system of having
to check an attribute before operating on an entity. So... what do we
do to restore the holes?
Well, we don't need to have an `active` attribute, or modify any
existing code that operates on entities. Instead, we can just set the
attributes of the entities so that they naturally get ignored by
everything that comes into contact with it. For entities, we set their
invis to true, and their size, type, and rule to -1 (the game never uses
a size, type, or rule of -1 anywhere); for blocks, we set their type to
-1, and their width and height to 0.
obj.entities.size() will no longer necessarily equal the amount of
entities in the room; rather, it will be the amount of entity SLOTS that
have been allocated. But nothing that uses obj.entities.size() needs to
actually know the amount of entities; it's mostly used for iterating
over every entity in the vector.
Excess entity slots get cleaned up upon every call of
mapclass::gotoroom(), which will now deallocate entity slots starting
from the end until it hits a player, at which point it will switch to
disabling entity slots instead of removing them entirely.
The entclass::clear() and blockclass::clear() functions have been
restored because we need to call their initialization functions when
reusing a block/entity slot; it's possible to create an entity with an
invalid type number (it creates a glitchy Viridian), and without calling
the initialization function again, it would simply not create anything.
After this patch is applied, entity and block indices will be restored
to how they behaved in 2.2.
Just like is_positive_num(), an empty string is not a number.
I've also decided to unroll iteration 0 of the loop here so readability
is improved; this happens to also knock out the whole "accepting empty
string" thing, too.
To account for empty strings, we simply have to special-case them.
Simple as that.
This was also a problem with the previous std::string implementation of
this function; regardless, this is fixed now.
The current way "arrays" from XML files are loaded (before this commit
is applied) goes something like this:
1. Read the buffer of the contents of the tag using TinyXML-2.
2. Allocate a buffer on the heap of the same size, and copy the
existing buffer to it. (This is what the statement `std::string
TextString = pText;` does.)
3. For each delimiter in the heap-allocated buffer...
a. Allocate another buffer on the heap, and copy the characters from
the previous delimiter to the delimiter you just hit.
b. Then allocate the buffer AGAIN, to copy it into an std::vector.
4. Then re-allocate every single buffer YET AGAIN, because you need to
make a copy of the std::vector in split() to return it to the caller.
As you can see, the existing way uses a lot of memory allocations and
data marshalling, just to split some text.
The problem here is mostly making a temporary std::vector of split text,
before doing any actual useful work (most likely, putting it into an
array or ANOTHER std::vector - if the latter, then that's yet another
memory allocation on top of the memory allocation you already did; this
memory allocation is unavoidable, unlike the ones mentioned earlier,
which should be removed).
So I noticed that since we're iterating over the entire string once
(just to shove its contents into a temporary std::vector), and then
basically iterating over it again - why can't the whole thing just be
more immediate, and just be iterated over once?
So that's what I've done here. I've axed the split() function (both of
them, actually), and made next_split() and next_split_s().
next_split() will take an existing string and a starting index, and it
will find the next occurrence of the given delimiter in the string. Once
it does so, it will return the length from the previous starting index,
and modify your starting index as well. The price for immediateness is
that you're supposed to handle keeping the index of the previous
starting index around in order to be able to use the function; updating
it after each iteration is also your responsibility.
(By the way, next_split() doesn't use SDL_strchr(), because we can't get
the length of the substring for the last substring. We could handle this
special case specifically, but it'd be uglier; it also introduces
iterating over the last substring twice, when we only need to do it
once.)
next_split_s() does the same thing as next_split(), except it will copy
the resulting substring into a buffer that you provide (along with its
size). Useful if you don't particularly care about the length of the
substring.
All callers have been updated accordingly. This new system does not make
ANY heap allocations at all; at worst, it allocates a temporary buffer
on the stack, but that's only if you use next_split_s(); plus, it'd be a
fixed-size buffer, and stack allocations are negligible anyway.
This improves performance when loading any sort of XML file, especially
loading custom levels - which, on my system at least, I can noticeably
tell (there's less of a freeze when I load in to a custom level with
lots of scripts). It also decreases memory usage, because the heap isn't
being used just to iterate over some delimiters when XML files are
loaded.
These comments were probably remnants of some late-night coding session
or something. Anyway, they're not needed; there's nothing to do with SDL
here, and the "Init" is obvious because the function is a constructor.
Contents and scripts should be reset in editorclass::reset(); there's no
reason to reset them again right before you load them from an XML file
in editorclass::load().
Additionally, the resets now consistently use SDL_zeroa() (for contents)
and scriptclass::clearcustom() (for scripts).
I'm partial to slash-asterisk-style comments, so I'll use those here.
Also, having a space after the start of comments is good. I've also
removed the "Add the script if we have a preceding header" comments
since it can be inferred by reading the surrounding code.
Instead of checking the length() of an std::string, just check if
pText[0] is equal to '\0'.
This will have to be done anyway, because I'm going to get rid of the
std::string allocation here, and I noticed this inefficiency in the
indentation, so I'm going to remove it.
The actual unindent will be done in the next commit.
This now means every XML array loading is done with common,
re-duplicated code. The only exceptions to this are special cases other
than the the majority of cases; the majority being a simple matter of
reading an array of integers and putting it into another array.
Seems like the only reason I hadn't caught the <customlevelscore> tag
until now was because I was focused on de-duplicating all the array
loads in Game::loadstats() and below, forgetting about
Game::loadcustomlevelstats().
In order to be able to use the LOAD_ARRAY() and LOAD_ARRAY_RENAME()
macros in Game::loadcustomlevelstats(), they have to be moved to earlier
in the file.
Even if split() didn't use the STL, using this function here is a bit
unnecessary, because a simple SDL_strchr() suffices. Refactoring split()
to not use the STL will break this caller anyway, so I might as well
just refactor this to not use split() in the first place.
This refactor also properly checks if the inputs are valid integers. And
since split() is no longer used, it also rejects inputs ending with a
trailing comma as being invalid, too; this didn't happen previously.
It's intentional that I used is_number() here instead of
is_positive_num(), thus accepting negative numbers; in the future it
might be possible to have negative room coordinates.
Valgrind reported this.
The error here is that the buffer here is only guaranteed to be
initialized up until (and including) the null-terminator, by
SDL_snprintf(). Iterating over the entire allocated buffer is bad and I
should feel bad as the girl who wrote this code; doing that reads
uninitialized memory and passes it to SDL_tolower().
As a bonus, the iterator increment is now a preincrement instead of a
postincrement.
This fixes memory leaking every single time a file gets loaded(!) when
the list of custom levels gets loaded(!!!), which Valgrind reports. This
memory leak is completely my bad; 2.2 properly frees the loaded file,
and VCE uses an std::unique_ptr - which I decided to ignore and not
think about why it would be there.
It's safe to do this free after uMem gets copied into std::string;
although, in the future, I *am* thinking about refactoring this function
(and the tag finder function) to not use std::strings, and I'll have to
be careful to make sure that the memory management with the file is
correct when I do so.
This makes the freesrc argument of Mix_LoadMUS_RW() 1 instead of 0. If
the argument is nonzero, then the passed SDL_RWops will be automatically
freed when m_music is freed, too.
I don't know why this was 0 before. Setting it to 1 fixes a memory leak
that Valgrind reports (which turns into an actual leak every time custom
assets are mounted or unmounted).
This adds a check that the pointer passed to
FILESYSTEM_loadFileToMemory() isn't NULL, and if it is, just returns
early in the function, instead of continuing later and producing a
different, slightly-misleading error message.
Previously, it was guarded behind a check for the length, which is... I
guess still perfectly fine behavior, but there's no reason to have a
length check here; FILESYSTEM_freeMemory() uses SDL_free(), which does a
check that the pointer passed is non-NULL (the pointer that is passed
here, despite not being initialized upon declaration, is guaranteed to
be initialized by FILESYSTEM_loadFileToMemory() anyway, so).
Following Ethan's example of bailing (calling VVV_exit()) if
binaryBlob::unPackBinary() couldn't allocate memory, I've searched
through and found every SDL_malloc(), then made sure that if it returned
NULL, the caller would bail (because you can't do much when you're out
of memory).
There should probably be an error message printed when the process is
out of memory, but unPackBinary() doesn't print an error message for
being out of memory, so this can probably be added later. (Also we don't
really have a logging system, I'd like to have something like that added
in first before adding more messages.)
Also, this doesn't account for any allocators used by STL stuff, but
we're working on removing the STL, and allocation failure just results
in an abort anyway, so there's not really a problem there.
Wow, there are a lot of these. All of these exit paths now use
VVV_exit() instead, which attempts to save unlock.vvv and settings.vvv,
and also frees all resources so Valgrind is happy. This is a good thing,
because previously unlock.vvv/settings.vvv wouldn't be written to if we
decided to bail for a given reason.
It should be between the include of the corresponding header file for
the source file (Script.h) and the includes of other local header files
(the files that are specific to this codebase only); this is the
convention that includes in all other source files follow.
However, it seems like I misplaced this, so I'm fixing it now.
This is just a function that calls the cleanup() in main.cpp, as well as
calls exit().
I would have liked to use SDL_ExitProcess() here, because that function
has ifdefs for different runtime environments. But alas, it's an
internal function and isn't exported. Ah well; exit() seems to be fine
anyway.
If there's a resource that doesn't otherwise need to be cleaned up and
is still alive upon program shutdown, then it should go in cleanup().
This cleans up Screen, GraphicsResources, Graphics buffers, Graphics
tiles, and musicclass audio upon program shutdown.
Even we technically don't NEED to clean these resources up ourselves
(the kernel is going to get rid of all of it anyway, else it'd be a
security problem), I'm doing this because otherwise Valgrind will
complain about these, and then it'd be difficult to see which memory
leaks are real and which are just "well this isn't really a leak but you
haven't freed this thing when the process exited, and that's technically
what a memory leak is".
These are all resources whose cleanup functions can be safely called
even if they haven't initialized anything yet.
This isn't a memory leak (not even Valgrind complains), because it gets
properly cleaned up in GraphicsResources::destroy(). Still, it's memory
that is just laying around not being used, and in the name of
deallocating things as soon as you no longer need them, we should
deallocate the base tilesheet images after we split all of them into
tiles.
This reduces the memory cost of all tilesheet images by half, since we
were essentially keeping around duplicates for nothing; this doesn't
really have much of an impact with conventional tilesheet sizes, since
they're usually small enough, but since 2.3 allowed for tilesheet images
of any size, this is a pretty big deal for really big tilesheet images.
It's okay to do this, even though they also get freed in
GraphicsResources::destroy(), because SDL_FreeSurface() does a NULL
check on the pointer passed to it, and we set the pointer to NULL after
freeing the surfaces.
A quick glance at PhysFS source code will show that PhysFS will bail if
PHYSFS_deinit() is called if it's not initialized.
"Bail" here just means setting an error code and returning early, so
it's not that bad. Still, it's the principle of the thing, and I just
want to ensure that FILESYSTEM_deinit() can be safely called no matter
if the filesystem hasn't initialized yet; having an error set by PhysFS
kind of taints the whole safety thing, even if it does nothing wrong,
no?
(although, speaking of which, we should be handling all errors by
PhysFS, but that's for later...)
These FIXME comments are still correct about code duplication, but
they're incorrect about where exactly the original code is after the
original code got moved around. So I've fixed them to refer to the
correct locations.
We really should get around to de-duplicating the code mentioned in
these comments...
Since musicWriteBlob is a temporary object that gets destroyed at the
end of musicclass::init(), in order to make sure we don't leak memory
and lose all the pointers to the blocks we just allocated in
musicWriteBlob, we need to call its clear() method after writing
BinaryMusic.vvv.