Believe it or not, there are still some remnants of the ActionScript
coding standards in the codebase! And one of them sometimes pops up
whenever an integer division happens.
As it so happens, it seems like division in ActionScript automatically
produces a decimal number. So to prevent that, the game sometimes
subtracts off the remainder of the number to be divided before
performing the division on it.
Thus, we get statements that look like
(a - (a % b)) / b
And probably more parentheses surrounding it too, since it would be
copy-pasted into yet another larger expression, because of course it
would.
`(a % b)` here is subtracting the remainder of `a` divided by `b`, using
the modulo operator, before it gets divided by `b`. Thus, the number
will always be divisible by `b`, so dividing it will mathematically not
produce a decimal number.
Needless to say, this is unnecessary, and very unreadable. In fact, when
I saw these for the first time, I thought they were overcomplicated
_modulos_, _not_ integer division! In C and C++, dividing an integer by
an integer will always result in an integer, so there's no need to do
all this runaround just to divide two integers.
To find all of these, I used the command
rg --pcre2 '(.+?).+?-.+?(?=\1).+?%.+?([\d]+?).+?\/.+?(?=\2)'
which basically matches expressions of the form 'a - a % b / b', where
'a' and 'b' are identical and there could be any characters in the
spaces.
There's really no reason for this simple multiplication plus division to
be in a lookup table. The compiler will optimize it faster than putting
it in a lookup table will, I'm sure.
FILESYSTEM_mountAssets() has a big comment describing the magic numbers
needed to grab FILENAME from a string that looks like
"levels/FILENAME.vvvvvv".
Instead of doing that (and having to write a comment every time the
similar happens), I've written a macro (and helper function) instead
that does the same thing, but clearly conveys the intent.
I mean, just look at the diff. Using VVV_between() is much better than
having to read that comment, and the corresponding SDL_strlcpy().
This function is simple - it takes a given buffer and its size, fills it
with a certain character, and null-terminates it. It's meant to be used
with freshly-created buffers, so we don't copy-paste code.
There's not really any reason for this function to use heap-allocated
strings. So I've refactored it to not do that.
I would've used SDL_strrstr(), if it existed. It does not appear to
exist. But that's okay.
Apparently in C, if you have `void test();`, it's completely okay to do
`test(2);`. The function will take in the argument, but just discard it
and throw it away. It's like a trash can, and a rude one at that. If you
declare it like `void test(void);`, this is prevented.
This is not a problem in C++ - doing `void test();` and `test(2);` is
guaranteed to result in a compile error (this also means that right now,
at least in all `.cpp` files, nobody is ever calling a void parameter
function with arguments and having their arguments be thrown away).
However, we may not be using C++ in the future, so I just want to lay
down the precedent that if a function takes in no arguments, you must
explicitly declare it as such.
I would've added `-Wstrict-prototypes`, but it produces an annoying
warning message saying it doesn't work in C++ mode if you're compiling
in C++ mode. So it can be added later.
next_split_s() could potentially commit out-of-bounds indexing if the
amount of source data was bigger than the destination data.
This is because the size of the source data passed in includes the null
terminator, so if 1 byte is not subtracted from it, then after it passes
through the VVV_min(), it will index 1 past the end of the passed buffer
when null-terminating it.
In contrast, the other argument of the VVV_min() does not need 1
subtracted from it, because that length does not include a null
terminator (next_split() returns the length of the substring, after all;
not the length of the substring plus 1).
(The VVV_min() here also shortens the range of values to the size of an
int, but we'll probably make size_t versions anyway; plus who really
cares about supporting massively-sized buffers bigger than 2 billion
bytes in length? That just doesn't make sense.)
Just like is_positive_num(), an empty string is not a number.
I've also decided to unroll iteration 0 of the loop here so readability
is improved; this happens to also knock out the whole "accepting empty
string" thing, too.
To account for empty strings, we simply have to special-case them.
Simple as that.
This was also a problem with the previous std::string implementation of
this function; regardless, this is fixed now.
The current way "arrays" from XML files are loaded (before this commit
is applied) goes something like this:
1. Read the buffer of the contents of the tag using TinyXML-2.
2. Allocate a buffer on the heap of the same size, and copy the
existing buffer to it. (This is what the statement `std::string
TextString = pText;` does.)
3. For each delimiter in the heap-allocated buffer...
a. Allocate another buffer on the heap, and copy the characters from
the previous delimiter to the delimiter you just hit.
b. Then allocate the buffer AGAIN, to copy it into an std::vector.
4. Then re-allocate every single buffer YET AGAIN, because you need to
make a copy of the std::vector in split() to return it to the caller.
As you can see, the existing way uses a lot of memory allocations and
data marshalling, just to split some text.
The problem here is mostly making a temporary std::vector of split text,
before doing any actual useful work (most likely, putting it into an
array or ANOTHER std::vector - if the latter, then that's yet another
memory allocation on top of the memory allocation you already did; this
memory allocation is unavoidable, unlike the ones mentioned earlier,
which should be removed).
So I noticed that since we're iterating over the entire string once
(just to shove its contents into a temporary std::vector), and then
basically iterating over it again - why can't the whole thing just be
more immediate, and just be iterated over once?
So that's what I've done here. I've axed the split() function (both of
them, actually), and made next_split() and next_split_s().
next_split() will take an existing string and a starting index, and it
will find the next occurrence of the given delimiter in the string. Once
it does so, it will return the length from the previous starting index,
and modify your starting index as well. The price for immediateness is
that you're supposed to handle keeping the index of the previous
starting index around in order to be able to use the function; updating
it after each iteration is also your responsibility.
(By the way, next_split() doesn't use SDL_strchr(), because we can't get
the length of the substring for the last substring. We could handle this
special case specifically, but it'd be uglier; it also introduces
iterating over the last substring twice, when we only need to do it
once.)
next_split_s() does the same thing as next_split(), except it will copy
the resulting substring into a buffer that you provide (along with its
size). Useful if you don't particularly care about the length of the
substring.
All callers have been updated accordingly. This new system does not make
ANY heap allocations at all; at worst, it allocates a temporary buffer
on the stack, but that's only if you use next_split_s(); plus, it'd be a
fixed-size buffer, and stack allocations are negligible anyway.
This improves performance when loading any sort of XML file, especially
loading custom levels - which, on my system at least, I can noticeably
tell (there's less of a freeze when I load in to a custom level with
lots of scripts). It also decreases memory usage, because the heap isn't
being used just to iterate over some delimiters when XML files are
loaded.
This fixes a bug where "12" gets properly evaluated as 12, but "148"
gets evaluated as 1408. It's because `place` gets multiplied by `radix`
again, so `retval` gets multipled by 100 instead of 10.
There's no reason to have a `place` variable, so I've removed it
entirely. This simplifies the function a little bit.
This avoids an unnecessary copy of the input std::vector, since we don't
need to modify it for anything. This cuts down on unnecessary memory
operations.
Apart from the std::string, this function no longer uses the STL.
ss_toi() is a simple function - it converts the input into an int,
taking as many digits as possible until it reaches a non-digit
character, at which point it stops. It's trivial to implement this
without the STL.
I could've used Int() here, but that would've required copying the
string to a temporary buffer to insert a null-terminator (we can't just
use a pointer-and-length data type either, the string functions don't
operate like that - one disadvantage of C strings!). Instead, I decided
to implement my own conversion to int here, because I don't think the
way we humans write our Arabic numerals is going to change anytime soon.
Also, the std::string input is now passed by const reference, instead of
making a copy - cutting down on unnecessary memory operations.
I personally like putting the asterisk with the type, because despite
the language parsing the asterisk as a part of the name, the pointer
part is clearly a part of the return type of the function. Also,
put constness here, to indicate that the input won't be modified inside
the function.
This comment indicates that the function is used by
UtilityClass::GCString(). Which is unnecessary, because the reader can
trivially search for usages of GCChar in the file itself (the 'static'
preceding the function should be a good enough hint) - and if there
aren't any, then the reader will know the function is unused, whereas if
they read the comment, they would have been under the assumption that it
wasn't used. (There might also a compiler warning about it being unused,
which would be more confusing if the comment was still there.)
Point is, comments can get outdated, so removing the comment here makes
the code more self-documenting.
During 2.3 development, there's been a gradual shift to using SDL stdlib
functions instead of libc functions, but there are still some libc
functions (or the same libc function but from the STL) in the code.
Well, this patch replaces all the rest of them in one fell swoop.
SDL's stdlib can replace most of these, but its SDL_min() and SDL_max()
are inadequate - they aren't really functions, they're more like macros
with a nasty penchant for double-evaluation. So I just made my own
VVV_min() and VVV_max() functions and placed them in Maths.h instead,
then replaced all the previous usages of min(), max(), std::min(),
std::max(), SDL_min(), and SDL_max() with VVV_min() and VVV_max().
Additionally, there's no SDL_isxdigit(), so I just implemented my own
VVV_isxdigit().
SDL has SDL_malloc() and SDL_free(), but they have some refcounting
built in to them, so in order to use them with LodePNG, I have to
replace the malloc() and free() that LodePNG uses. Which isn't too hard,
I did it in a new file called ThirdPartyDeps.c, and LodePNG is now
compiled with the LODEPNG_NO_COMPILE_ALLOCATORS definition.
Lastly, I also refactored the awful strcpy() and strcat() usages in
PLATFORM_migrateSaveData() to use SDL_snprintf() instead. I know save
migration is getting axed in 2.4, but it still bothers me to have
something like that in the codebase otherwise.
Without further ado, here is the full list of functions that the
codebase now uses:
- SDL_strlcpy() instead of strcpy()
- SDL_strlcat() instead of strcat()
- SDL_snprintf() instead of sprintf(), strcpy(), or strcat() (see above)
- VVV_min() instead of min(), std::min(), or SDL_min()
- VVV_max() instead of max(), std::max(), or SDL_max()
- VVV_isxdigit() instead of isxdigit()
- SDL_strcmp() instead of strcmp()
- SDL_strcasecmp() instead of strcasecmp() or Win32 strcmpi()
- SDL_strstr() instead of strstr()
- SDL_strlen() instead of strlen()
- SDL_sscanf() instead of sscanf()
- SDL_getenv() instead of getenv()
- SDL_malloc() instead of malloc() (replacing in LodePNG as well)
- SDL_free() instead of free() (replacing in LodePNG as well)
This patch cleans up unnecessary exports from header files (there were
only a few), as well as adds the static keyword to all symbols that
aren't exported and are specific to a file. This helps the linker out in
not doing any unnecessary work, speeding it up and avoiding silent
symbol conflicts (otherwise two symbols with the same name (and
type/signature in C++) would quietly resolve as okay by the linker).
By "unnecessary qualifiers to self", I mean something like using the
'game.' qualifier for a variable on the Game class when you're inside a
function on the Game class itself. This patch is to enforce consistency
as most of the code doesn't have these unnecessary qualifiers.
To prevent further unnecessary qualifiers to self, I made it so the
extern in each header file can be omitted by using a define. That way,
if someone writes something referring to 'game.' on a Game function,
there will be a compile error.
However, if you really need to have a reference to the global name, and
you're within the same .cpp file as the implementation of that object,
you can just do the extern at the function-level. A good example of this
is editorinput()/editorrender()/editorlogic() in editor.cpp. In my
opinion, they should probably be split off into their own separate file
because editor.cpp is getting way too big, but this will do for now.
It's possible that SDL_atoi() could call the libc atoi(), and if a
string is provided that's too large to fit into an integer, then that
would result in undefined behavior. To avoid this, use SDL_strtol()
instead.
If necessary, the caller can provide a fallback to be returned in case
the input given isn't a valid integer, instead of having to duplicate
the is_number() check.
This is simply a wrapper function around SDL_atoi(), because SDL_atoi()
could call libc atoi(), and whether or not invalid input passed into the
libc atoi() is undefined behavior depends on the given libc. So it's
safer to just add a wrapper function that checks that the string given
isn't bogus.
std::isdigit() can be replaced with SDL_isdigit(), but there's no
equivalent for std::isxdigit() in the SDL standard library. So we'll
just use libc isxdigit() instead, it's fine.
Okay, so basically here's the include layout that this game now
consistently uses:
[The "main" header file, if any (e.g. Graphics.h for Graphics.cpp)]
[blank line]
[All system includes, such as tinyxml2/physfs/utfcpp/SDL]
[blank line]
[All project includes, such as Game.h/Entity.h/etc.]
And if applicable, another blank line, and then some special-case
include screwy stuff (take a look at editor.cpp or FileSystemUtils.cpp,
for example, they have ifdefs and defines with their includes).
That's how it should be done, because the SDL headers aren't going to be
installed in this repository. The game was a bit inconsistent before but
now it isn't anymore.
This ensures that endsWith() can be used outside of editor.cpp.
When leo60228 originally wrote endsWith(), it was static, but I asked
him on Discord just now and he more-or-less confirmed that it's fine if
it's not static. If it was static, it would be confined to
UtilityClass.cpp now instead!
Apparently it results in Undefined Behavior if the argument given isn't
representable as an unsigned char. This means that (potentially) plain
char and signed chars could be unsafe to use as well.
It's rare that this could happen in practice, though. std::isdigit() is
only used by is_positive_num() which is only used by find_tag(), so
someone would have to deliberately put something crazy after an `&#` in
a custom level file in order for this to happen. Still, better to be
safe than sorry and all.
This fixes a compile error that could happen where the compiler doesn't
know what std::isdigit() is, but I'm puzzled as to why this wasn't
happening earlier, 'cause I've only been reported that it happens by
only one person.
Unlike, say, the scriptclass i/j/k stuff, these tempstrings are only
purely visual, and there's no known glitches to manipulate them anyway.
Thus I think it's safe to make this cleanup.