VVVVVV

mpo/VVVVVV

mirror of https://github.com/TerryCavanagh/VVVVVV.git synced 2024-11-05 18:59:41 +01:00

Commit graph

Author	SHA1	Message	Date
Dav999-v

3ce4735d50

Add UTF8.c

This is a small library I wrote to handle UTF-8.

Usage is meant to be as simple as possible - see for example decoding
a UTF-8 string:

  const char* str = "asdf";
  uint32_t codepoint;
  while ((codepoint = UTF8_next(&str)))
  {
      // you have a codepoint congrats
  }
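For readers curious how a decoder with this calling convention might work, here is a minimal sketch. It is not the code in UTF8.c: the name sketch_next and the choice to return U+FFFD on malformed input are assumptions made for illustration, and it does not reject overlong forms or surrogates.

```cpp
#include <cstdint>

// Hypothetical stand-in for a UTF8_next-style decoder (NOT the real UTF8.c).
// Reads one codepoint from *p and advances *p past it.
// Returns 0 at the terminating NUL, without advancing.
// Assumption: malformed input yields U+FFFD and skips the bad bytes.
uint32_t sketch_next(const char** p)
{
    const unsigned char* s = reinterpret_cast<const unsigned char*>(*p);
    uint32_t cp;
    int extra; // number of continuation bytes expected after the lead byte

    if (s[0] == 0)                  { return 0; }
    else if (s[0] < 0x80)           { cp = s[0];        extra = 0; }
    else if ((s[0] & 0xE0) == 0xC0) { cp = s[0] & 0x1F; extra = 1; }
    else if ((s[0] & 0xF0) == 0xE0) { cp = s[0] & 0x0F; extra = 2; }
    else if ((s[0] & 0xF8) == 0xF0) { cp = s[0] & 0x07; extra = 3; }
    else
    {
        // Stray continuation byte or invalid lead byte: skip one byte.
        *p += 1;
        return 0xFFFD;
    }

    for (int i = 1; i <= extra; i++)
    {
        if ((s[i] & 0xC0) != 0x80)
        {
            // Sequence cut short (possibly by the NUL): stop before s[i]
            // so the next call re-examines it and nothing is overread.
            *p += i;
            return 0xFFFD;
        }
        cp = (cp << 6) | (s[i] & 0x3F);
    }

    *p += extra + 1;
    return cp;
}
```

Because 0 doubles as the end-of-string signal, the while loop shown above terminates exactly when the NUL terminator is reached.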

Or encoding a single codepoint to add it to a string:

  std::string result;
  result.append(UTF8_encode(0x1234).bytes);
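The .bytes member suggests the encoder returns a small struct by value that holds a NUL-terminated buffer. A sketch of that idea follows; the struct layout, the names SketchEncoding and sketch_encode, and the fallback to U+FFFD for invalid codepoints are all assumptions, not the actual definitions in UTF8.c.

```cpp
#include <cstdint>
#include <string>

// Hypothetical result type (NOT the real one): up to 4 encoded bytes plus a
// NUL terminator, so .bytes can be handed to std::string::append directly.
struct SketchEncoding
{
    char bytes[5];
    int length; // number of encoded bytes, excluding the terminator
};

// Encodes one codepoint as UTF-8.
// Assumption: surrogates and values above U+10FFFF become U+FFFD.
SketchEncoding sketch_encode(uint32_t cp)
{
    SketchEncoding e = {}; // zero-initialised, so bytes is always terminated

    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
    {
        cp = 0xFFFD;
    }

    if (cp < 0x80)
    {
        e.bytes[0] = static_cast<char>(cp);
        e.length = 1;
    }
    else if (cp < 0x800)
    {
        e.bytes[0] = static_cast<char>(0xC0 | (cp >> 6));
        e.bytes[1] = static_cast<char>(0x80 | (cp & 0x3F));
        e.length = 2;
    }
    else if (cp < 0x10000)
    {
        e.bytes[0] = static_cast<char>(0xE0 | (cp >> 12));
        e.bytes[1] = static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        e.bytes[2] = static_cast<char>(0x80 | (cp & 0x3F));
        e.length = 3;
    }
    else
    {
        e.bytes[0] = static_cast<char>(0xF0 | (cp >> 18));
        e.bytes[1] = static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        e.bytes[2] = static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        e.bytes[3] = static_cast<char>(0x80 | (cp & 0x3F));
        e.length = 4;
    }
    return e;
}
```

Returning the buffer by value keeps the call site to a single expression and avoids any heap allocation or output iterator.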

There are some other functions (UTF8_total_codepoints() to get the
total number of codepoints in a string, UTF8_backspace() to get the
length of a string after backspacing one character, and
UTF8_peek_next() as a slightly less fancy version of UTF8_next()), but
more functions could always be added if we need them.
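Both counting and backspacing can be built on one observation: UTF-8 continuation bytes always have the bit pattern 10xxxxxx. The sketch below shows that approach under stated assumptions; the names and signatures are hypothetical (in particular, passing the byte length explicitly is a guess), and the real functions in UTF8.c may work differently.

```cpp
#include <cstddef>

// True if the byte is a UTF-8 continuation byte (10xxxxxx).
static bool sketch_is_continuation(char c)
{
    return (static_cast<unsigned char>(c) & 0xC0) == 0x80;
}

// Hypothetical counterpart to a total-codepoints function: counts every byte
// that starts a codepoint. Assumes the input is valid, NUL-terminated UTF-8.
size_t sketch_total_codepoints(const char* s)
{
    size_t n = 0;
    for (; *s != '\0'; s++)
    {
        if (!sketch_is_continuation(*s))
        {
            n++;
        }
    }
    return n;
}

// Hypothetical counterpart to a backspace function: returns the byte length
// the string would have after deleting its last codepoint.
size_t sketch_backspace(const char* s, size_t len)
{
    if (len == 0)
    {
        return 0;
    }
    len--; // drop the final byte
    while (len > 0 && sketch_is_continuation(s[len]))
    {
        len--; // keep walking back until we sit on a lead byte
    }
    return len;
}
```

With a std::string, the backspace result could be applied as text.resize(sketch_backspace(text.c_str(), text.size())).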

This will allow us to replace utfcpp (utf8::unchecked) and also fix
some less-than-ideal code:

- Some places have to resort to ignoring UTF-8 (next_wrap) or using
  UCS-4→UTF-8 functions (VFormat had to use PHYSFS ones, and one other
  place has four lines of code including a std::back_inserter just for
  one character)

- The iterator stuff is kinda confusing and verbose anyway

2023-02-27 23:00:41 -08:00

1 commit