Time Nick Message 10:47 sfan5 man all that character conversion shit is a mess 10:48 sfan5 where even is the difference between narrow_to_wide and utf8_to_wide? 10:50 rubenwardy oof 10:51 rubenwardy imo it should be "utf8_to_utf16" to be precise 10:51 rubenwardy assuming wide is utf16 10:51 sfan5 wchar_t is 4 bytes on linux 10:52 rubenwardy what why 10:53 rubenwardy you need to be able to predict the encodings of strings, having them be different on different platforms is a PITA 10:53 rubenwardy *having the same type be different 10:53 sfan5 now that you mention that I see the issue 10:53 sfan5 our network code assumes wchar_t is 2 bytes 10:57 sfan5 the only reason I can think of for narrow_to_wide/wide_to_narrow is interfacing with the filesystem on windows, but I don't see it used for that 11:00 sfan5 one way I see out of this mess is to turn utf-32 into utf-16 + surrogates when writing to network 11:00 celeron55 windows uses utf16 and linux uses utf32, right? 11:01 celeron55 and irrlicht uses wchar for some of its UI stuff, right? 11:01 sfan5 but I don't know if an old server wouldn't mangle those anyway 11:01 sfan5 yes, yes and no idea 11:02 sfan5 >//! Returns caption of this element. 11:02 sfan5 > virtual const wchar_t* getText() const 11:02 sfan5 looks like a yes 11:02 celeron55 i recall the only reason i had to deal with anything other than 8-bit character sets was that irrlicht required themm 11:02 celeron55 -m 11:03 celeron55 of course when dealing with filesystems it would be important to retain the exact original bytes instead of trying to translate them to utf-8 or any other charset, because filenames aren't necessary valid in any charset 11:03 celeron55 necessarily* 11:04 celeron55 but many serious programs fail at that, trying to convert them to utf-8 11:10 rubenwardy Yeah, we should use utf-8 everywhere and then convert to another format for specific APIs 11:10 rubenwardy or for manip 11:11 MTDiscord ASCII 11:13 rubenwardy no 11:16 MTDiscord :( 11:22 celeron55 yeah, always store as utf-8 11:23 celeron55 also, network serialization should always be utf- 11:23 celeron55 utf-8* 11:23 celeron55 the only exception is file names and paths, which should be stored raw and only converted for displaying 11:25 rubenwardy does the network use wide / utf-16 currently? 11:26 sfan5 yes, for chat messages and a few other things 11:27 celeron55 those are only because i didn't know better at the time 11:28 rubenwardy it shouldn't be too hard to add compat based on protocol versions for that 11:28 sfan5 ...and because nobody looked into this pre-5.0 when we had all chance to change the protocol 11:28 rubenwardy yeah :'( 11:28 rubenwardy > the only exception is file names and paths, which should be stored raw and only converted for displaying 11:28 rubenwardy What about paths stored in config files? 11:29 sfan5 really someone should have looked at every serialized packet at the time and looked which changes should be done 11:29 rubenwardy I think the same rule should apply - store in utf8, convert for APIs. Where store means in common classes, rather than platform-specific local variables 11:29 sfan5 (another thing that's been annoying me is serializing ten or more u8's when a flag field would be enough) 11:29 rubenwardy > really someone should have looked at every serialized packet at the time and looked which changes should be done 11:29 rubenwardy Yeah, this was a really massive oversight. 5.0 was poorly planned 11:31 rubenwardy I don't think we should ever break network compat again, but instead remove backwards compat after a few releases and raise minimum version. Obviously, this doesn't help for things that are very hard to fix and keep compat 11:35 rubenwardy Something that's been annoying me is the fact that player attachments positions don't take BS into account - they're 10x larger than expected 12:15 sfan5 technically wide_to_narrow can be useful to support system locales that aren't utf-8, but I don't think that worked before either 12:16 sfan5 so nobody will miss it if that disappears 12:29 sfan5 oh and wide strings do make sense when you know you're working on unicode string, stuff like isspace() won't work on utf-8 12:34 MTDiscord Wait, never break network compat again? That's not going to fly 12:34 MTDiscord Oh, you mean just have a sliding window of support 12:34 MTDiscord That makes more sense 12:36 MTDiscord So if we change how a block is sent, we might have three functions in play max, and select the right one based on reported client version 12:37 MTDiscord For new stuff, like a "play midi" packet, we would just not send it to clients that don't know what to do with it 12:37 sfan5 a sliding window of support is what we had before 12:37 sfan5 with the exception that 5.0 was a planned protocol break 12:38 MTDiscord Gotcha, I've only been around since 4.17 and only learned about the engine after 5.0 12:39 MTDiscord We still have a lot of work ahead, haha 15:03 sfan5 just wasted half an hour because Server::handleCommand_ChatMessage does its own serialization instead of using the function provided by NetworkPacket 15:03 sfan5 wtf 15:11 sfan5 #10879 15:11 ShadowBot https://github.com/minetest/minetest/issues/10879 -- [NO SQUASH] Fix narrow/wide and UTF-16 mess by sfan5 16:22 sfan5 merging #10859, #10819, #10799, #10661 in 10m 16:22 ShadowBot https://github.com/minetest/minetest/issues/10859 -- Handle changes caused by CMake minimum version bump by sfan5 16:22 ShadowBot https://github.com/minetest/minetest/issues/10819 -- [DONT SQUASH] Rework use_texture_alpha by sfan5 16:22 ShadowBot https://github.com/minetest/minetest/issues/10799 -- Clarify key_value_swap's edge case by Grissess 16:22 ShadowBot https://github.com/minetest/minetest/issues/10661 -- [MANUAL SQUASH] Settings: Proper priority hierarchy by SmallJoker 16:24 Krock_ oh boi 16:29 rubenwardy very nice 16:46 sfan5 zsh: segmentation fault (core dumped) ./bin/minetest 16:46 sfan5 now this isn't a good sign 16:47 sfan5 0x00005555556e5547 in PlayerDatabaseFiles::serialize(RemotePlayer*, std::ostream&) () 16:47 sfan5 ^ Krock 16:47 Krock checking 16:52 Krock reproduced when exiting the world 16:54 sfan5 happening after ~10s in-game here 16:55 Krock on quit, PlayerSAO is NULL 16:55 Krock same after a few seconds in-game. interesting. 17:00 Krock okay, it's a typo 17:00 Krock - serialize(ss, player); 17:00 Krock + serialize(&testplayer, ss); 17:03 Krock tested, works. will push https://krock-works.uk.to/u/patches/0001-PlayerDatabaseFiles-Fix-segfault-while-saving-a-play.patch in 10 minutes 17:13 Krock pushing 17:13 Krock 9a177f009 17:56 sfan5 thanks 19:31 sfan5 #10881 19:31 ShadowBot https://github.com/minetest/minetest/issues/10881 -- Set UTF-8 codepage in Windows manifest by sfan5 19:31 sfan5 thoughts? 19:37 MTDiscord is it possible that https://github.com/minetest/minetest/pull/10731 could be merged before the feature freeze starting tomorrow? 19:39 Krock can't test that but it looks good 19:40 sfan5 I tested it 19:41 sfan5 @Jonathon I'll add it to the milestone, if we have consensus we can make an exception for it 19:41 MTDiscord thank you 19:43 MTDiscord also, one more (mtg this time) https://github.com/minetest/minetest_game/pull/2808 since addressed the old concerns, and probably will useful for servers 19:47 rubenwardy nice find re: #10881 19:48 ShadowBot https://github.com/minetest/minetest/issues/10881 -- Set UTF-8 codepage in Windows manifest by sfan5 21:36 sfan5 merging #10881 in 5m 21:38 ShadowBot https://github.com/minetest/minetest/issues/10881 -- Set UTF-8 codepage in Windows manifest by sfan5