
IRC log for #minetest-dev, 2024-09-25


All times shown according to UTC.

Time Nick Message
00:03 SFENCE joined #minetest-dev
00:21 SFENCE joined #minetest-dev
00:38 SFENCE joined #minetest-dev
00:55 SFENCE joined #minetest-dev
01:04 SFENCE joined #minetest-dev
01:15 MTDiscord <greenxenith> Note to devs/Zughy for discussion in the near future for the rename: It may be a good idea to officialize the core namespace rather than add another namespace.
01:37 SFENCE joined #minetest-dev
01:38 v-rob joined #minetest-dev
01:39 v-rob I agree. The core namespace already exists, and it also has the benefit of being fork-agnostic. No need to have three namespaces--core, minetest, and newname
01:41 v-rob (It will also make it so I don't have to configure my brain in and out of builtin mode--I frequently use minetest by accident when I should use core)
01:46 MTDiscord <warr1024> Especially since "the former game and aspiring engine formerly known as Minetest" would just be too long to type out each time I wanted to make an API call.
01:46 YuGiOhJCJ joined #minetest-dev
01:55 SFENCE joined #minetest-dev
02:06 SFENCE joined #minetest-dev
02:23 SFENCE joined #minetest-dev
02:38 SFENCE joined #minetest-dev
03:53 v-rob joined #minetest-dev
03:56 MTDiscord <jordan4ibanez> Call it nodecore
03:57 MTDiscord <jordan4ibanez> https://tenor.com/view/beluga-the-cat-hakosh1307-hakosh-beluga-cat-hug-gif-22532913
04:00 MTDiscord joined #minetest-dev
04:41 SFENCE_ joined #minetest-dev
05:18 v-rob joined #minetest-dev
05:22 v-rob_ joined #minetest-dev
05:32 v-rob_ joined #minetest-dev
11:08 pgimeno joined #minetest-dev
11:32 MTDiscord <andrey2470t> Core devs, I would like to know whether the idea itself of having a texture atlas in MT (which my #15061 does) is generally approved by you or not. If it is not, I would like to know the reasons. It is also tagged now as "on the roadmap". I'm asking because the uncertainty distracts other reviewers from reviewing the PR, since they don't know what state it is in, as Desour pointed out in the PR.
11:32 ShadowBot https://github.com/minetest/minetest/issues/15061 -- Texture atlas for mapblocks meshes by Andrey2470T
12:00 celeron55 added a comment
12:00 celeron55 it's an "if" for me. in short, it must be clearly better than the previous one which was removed
12:00 celeron55 if it is, then it's a "yes"
12:27 [MatrxMT] <Zughy> have we changed globalsteps somehow with 5.9.X? If I play locally everything works smoothly, but if I play online things go slower
12:27 [MatrxMT] <Zughy> Like, Block League stamina bar stutters and Block League submachine gun shoots more slowly. Try it on AES
12:28 [MatrxMT] <Zughy> those are both operations that should happen every 0.1s
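
The "every 0.1s" behaviour Zughy describes is typically driven by a dtime accumulator inside a globalstep, which is why a longer server step directly slows the action down. A minimal sketch, assuming the standard core.* Lua API; the per-tick work itself is a hypothetical placeholder:

    -- Run an action roughly every 0.1 s of server time.
    -- If dtime grows past the interval (server lag), the action can only
    -- fire once per step and therefore slows down with the server.
    local INTERVAL = 0.1
    local timer = 0

    core.register_globalstep(function(dtime)
        timer = timer + dtime
        if timer >= INTERVAL then
            timer = timer - INTERVAL
            -- hypothetical per-tick work, e.g. advancing a stamina bar
            -- or firing the next submachine gun shot
        end
    end)
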
13:47 hwpplayer1 joined #minetest-dev
13:57 SFENCE joined #minetest-dev
13:59 MTDiscord <herowl> Would be great if somebody could take a look at #15176
13:59 ShadowBot https://github.com/minetest/minetest/issues/15176 -- Decouple Ambient Occlusion from Shadows
15:16 [MatrxMT] <Zughy> "fun" bug, I don't know if Minetest related: I had left the client open for a while, online. I come back to find out the server crashed more than hour ago and when I reconnect, the mouse starts lagging everywhere on my PC. Like, it moves smoothly but clicks weren't always working. I close the client and it starts behaving correctly
15:16 [MatrxMT] <Zughy> *an hour ago
15:44 [MatrxMT] <Zughy> well #15193
15:44 ShadowBot https://github.com/minetest/minetest/issues/15193 -- Different globalstep compared to <5.9?
15:49 [MatrxMT] <Zughy> just to be clear: this thing completely breaks my minigame, so if any core dev could look into it, I'd really appreciate it
16:55 SpaceMan1ac joined #minetest-dev
17:09 SpaceManiac joined #minetest-dev
17:41 sfan5 since 5.9.1 or 5.9.0?
17:43 sfan5 okay that's answered in there
17:44 sfan5 have you tried measuring if the delta time is actually bigger than 0.1?
18:29 pgimeno_ joined #minetest-dev
18:32 [MatrxMT] <Zughy> sfan5: we'll run a few tests in a couple of hours. Best way to do that?
18:40 pgimeno joined #minetest-dev
18:42 Sokomine joined #minetest-dev
18:49 MTDiscord <landarvargan> Possibly by checking register_globalstep()'s dtime var
18:49 hwpplayer1 joined #minetest-dev
18:50 sfan5 core.get_time_us() can measure the actually elapsed time without relying on the server step
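
A minimal sketch of that measurement, assuming the standard core.* Lua API: log the wall-clock delta from core.get_time_us() next to the dtime the engine reports, and flag steps where they exceed the expected 0.1 s.

    -- Compare the reported dtime against the wall-clock time actually
    -- elapsed between two consecutive globalsteps.
    local last_us = core.get_time_us()

    core.register_globalstep(function(dtime)
        local now_us = core.get_time_us()
        local real_dt = (now_us - last_us) / 1e6  -- microseconds -> seconds
        last_us = now_us
        if real_dt > 0.11 then  -- threshold chosen arbitrarily for this sketch
            core.log("action", string.format(
                "globalstep: dtime=%.3f, wall-clock=%.3f", dtime, real_dt))
        end
    end)
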
19:14 Krock node timers too, I think
19:14 sfan5 planning to merge #15151 later. if there are any concerns please scream
19:14 ShadowBot https://github.com/minetest/minetest/issues/15151 -- [manual squash] Divorce map database locking from env lock by sfan5
19:17 Krock AAAAAAAAAAAAAAAAAAAAAAAAAAA
19:19 Krock I did have a look at the PR a few times - the concept makes sense overall, although I do not fully understand by how much the locking situation is now improved.
19:21 sfan5 previously while waiting for the database to return the block data into an std::string, we'd be holding the env lock
19:21 sfan5 that is now no longer the case
19:23 sfan5 it would be a simple but massive improvement if it weren't for the lock contention that starts being an issue on highly loaded servers
19:23 sfan5 (not a new issue per se, but more severe now because the emergethread releases the lock for a longer time)
19:27 Krock I see. Thanks for the explanation. This is also largely documented in yieldToOtherThreads, which is nice to see.
19:28 Krock no concerns. the code looks good. feel free to merge later
19:33 MTDiscord <warr1024> That PR increases lag on servers, apparently by design, i.e. we're essentially trying to fix starvation of the emerge thread by starving the main thread.  It seems to mitigate the problem when disk performance is bad, but creates a problem when it isn't and MT is CPU-limited.
19:35 MTDiscord <warr1024> If we decide to merge it, it may help some people and hurt others.  In my case, I've already basically had to mitigate the disk issue by changing backup processes to avoid making the disk too busy, and testing so far showed that trade-off to be worse for me.
19:35 MTDiscord <warr1024> I think we're going to want a more permanent solution really soon, with like proper queues or something.
19:36 SFENCE joined #minetest-dev
19:38 MTDiscord <warr1024> As of right now, server performance is a big obstacle for me, and this may affect my ability to adopt 5.10 when it's released, if it remains in the state it's in now.  If it'll be fixed by 5.11 or 5.10.1 or something then that's fine, I can wait a bit ... but if it ends up trapping me on 5.9, that's something to consider I suppose.
19:41 MTDiscord <warr1024> The whole thing arose from the fact that MT fails to achieve single-threaded performance, and that even fully saturated it might only be using ~0.85 cores.  It's weird that we somehow reached the conclusion that reducing MT's utilization of available resources solves this.
19:41 Sharpman joined #minetest-dev
19:43 sfan5 I can't tell if this is supposed to be an insult toward my work
19:44 MTDiscord <greenxenith> That's what you got from that?
19:44 MTDiscord <greenxenith> Not the "I have many concerns about broken performance due to this PR"?
19:46 sfan5 it's "the situation is already bad and you are intentionally making it worse" paraphrased
19:47 Krock why is less locking worse?
19:47 MTDiscord <greenxenith> I see
19:47 sfan5 i think the sensitive point is the fix-lock-contention-by-sleeping workaround
19:47 Krock if the overall server performance is an issue, then the emergethread will have to be paused until it catches up
19:48 Krock (as mapblock loading and generating does come with additional callbacks and traffic)
19:50 MTDiscord <warr1024> I'm not criticizing your work, I'm criticizing the suggestion that we're ready to merge it when discussion is still ongoing about consequences.
19:50 MTDiscord <warr1024> I was asked to scream, I think I did so quite politely.
19:50 Krock hmm. isn't the 1 ms sleep far faster than what the emerge thread can process? is the loop executed more than once?
19:50 sfan5 to be clear I am not happy with that either, it's awful. in my (admittedly artificial) tests it performed okay. I don't want to bloat up the PR into a total emerge code rewrite, so I'm willing to merge it now and see if it turns out to be an unfortunate problem specific to Warr1024's setup, or a flood of complaints about 5.10.0-dev shows up next week
19:51 sfan5 not implying your setup does not matter, but that would make the problem lower priority
19:51 sfan5 Krock: with 50ms of server thread lag, as few as 4ms of sleep were enough to get map loading up to speed
19:53 MTDiscord <warr1024> I'd actually really like to see progress made on this problem, whether it's a "proper" solution, a workaround, or just some experiments to gather data.  I've just had a lot of trouble reproducing this "in the lab" and want to make sure that if I'm using "production" servers to get field data, I can mitigate risks.
19:53 sfan5 wouldn't it be possible to capture a lag profile on a prod server and replicate that in a lab using sleeps?
19:55 sfan5 this idea seems so simple that it shouldn't work
19:55 MTDiscord <warr1024> I've actually had pretty lousy reliability simulating lag with sleeps.  The problem is I don't have a good way to "sleep until the current step takes X time" because I don't actually know how long the current step has been running for by the time my code hits.  I also don't understand well enough the various things happening in what order within the step loop.
19:56 MTDiscord <warr1024> My szutil_lag mod can't even trust dtime, it basically just busy-waits until the current time is at least the time it reached the previous point in the step loop, plus whatever lag time is configured.
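
A rough sketch of that busy-wait approach (not the actual szutil_lag code; the lag_time value and variable names are hypothetical):

    -- Stretch every server step to at least `lag_time` seconds, measured
    -- from the moment this globalstep last ran, by burning CPU on the
    -- server thread. dtime is deliberately not trusted.
    local lag_time = 0.5  -- hypothetical configured lag, in seconds
    local last_us = core.get_time_us()

    core.register_globalstep(function(dtime)
        local target = last_us + lag_time * 1e6
        while core.get_time_us() < target do
            -- busy-wait
        end
        last_us = core.get_time_us()
    end)
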
19:57 MTDiscord <warr1024> What really kills the experience isn't the average step time but large spikes, and if those lead to large extra sleep times in proportion, that exacerbates the problem ... but the emerge thread should only need a certain amount of time to acquire a lock, and waiting longer may not necessarily help it.
19:58 sfan5 it needs to acquire the lock once or twice *per* emerged block
19:59 sfan5 this isn't any different from before either, but it holds it for a much shorter time. which somehow leads to this
19:59 MTDiscord <warr1024> Yeah, that's ... unfortunate.  If it could just acquire the lock once, and hang onto it for all the blocks it's got queued up, that'd be so much better ... but I guess "queued up" is sort of the root of the issue here, and what we don't have time right now to make happen.
19:59 Krock sfan5: random question. do you happen to know whether the emerge manager is smart enough to cancel out duplicate requests, or ones that are outdated (i.e. block already loaded)?
20:00 sfan5 short answer: yes
20:00 Krock okay good
20:00 MTDiscord <warr1024> haha, that's a shame, because if it weren't, that sounds like it'd be an easy win 👤
20:00 MTDiscord <warr1024> 😄
20:02 sfan5 perhaps I could cap the sleep time so that if max_simultaneous_block_sends_per_client (40) blocks are emerged it stops
20:02 sfan5 that could reduce the lag impact and still provide reasonable map loading for users
20:02 sfan5 but could also do nothing
20:03 MTDiscord <warr1024> you could also sprinkle verbose logging all over the place to look for whether it WOULD have an impact, or whether things are already not peaking past 40.
20:05 MTDiscord <warr1024> tbh when dealing with "lag spike" problems, all the steps that AREN'T lag spikes really mess up the statistics.  I got some value out of making a version of the jitprofiler that instead of writing to a file, records samples to an array, and only flushes that array to a file if the previous step dtime is > 0.25.  That showed me immediately that "lag spikes" have a different performance profile than "normal" steps, and at least some things that stood out on the unfiltered graph were minuscule when problems were happening.
20:05 MTDiscord <warr1024> Sadly it hasn't yet pointed me to what IS the problem, just what ISN'T that looked like it might have been.
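
A minimal sketch of that spike-filtered sampling idea, recording plain dtime values rather than full profiler samples (hypothetical; assumes core.get_worldpath() for the output location and that mod security permits writing inside the world directory):

    -- Buffer recent per-step samples in memory and only write them out
    -- when a step qualifies as a lag spike, so "normal" steps don't
    -- drown out the interesting data.
    local SPIKE_THRESHOLD = 0.25  -- seconds, as mentioned above
    local WINDOW = 200            -- arbitrary number of samples to keep
    local samples = {}
    local logpath = core.get_worldpath() .. "/lagspikes.txt"

    core.register_globalstep(function(dtime)
        samples[#samples + 1] = string.format("%.4f", dtime)
        if #samples > WINDOW then
            table.remove(samples, 1)
        end
        if dtime > SPIKE_THRESHOLD then
            local f = io.open(logpath, "a")
            if f then
                f:write(table.concat(samples, "\n"), "\n---\n")
                f:close()
            end
            samples = {}
        end
    end)
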
20:06 Krock yieldToOtherThreads is currently written such that sleeps are proportional to dtime. But this dtime originates from the server thread that you're blocking
20:06 sfan5 to be exact it is proportional if the emerge queue is reasonably filled and it makes progress
20:07 sfan5 I guess an upper limit of 10 would really make sense before merging
20:07 MTDiscord <warr1024> Hmm, I didn't know that there was actually progress being made.
20:07 Krock indeed
20:08 sfan5 well we can't judge which progress is good so that can still cause problems
20:09 sfan5 it could go like 150 300 301 302 303 304 306 310 312 315 and we should have stopped after the second one
20:09 MTDiscord <warr1024> It does kind of suck to have 1ms per mapblock though ... it's not hard to queue up hundreds of mapblocks by traveling, and having the main thread sleep hundreds of ms is noticeable when the server is already running into lag spike issues ... which it will, since saves are also still happening; they're a can of worms we haven't opened yet, and will now be contending with all the load activity.
20:09 sfan5 it's not 1ms per block
20:10 sfan5 1ms for the emerge thread to do as many blocks as it can
20:10 sfan5 limited by the response time of your db, of course
20:10 sfan5 this workaround doesn't work when both the CPU and db are slow
20:10 SFENCE joined #minetest-dev
20:10 MTDiscord <warr1024> oic, so it just doesn't try to reacquire the lock during that time, but the emerge thread can repeatedly release and reacquire it?
20:10 Krock I'd expect it to process at most 2 per step
20:10 Krock (step = 1 ms)
20:11 sfan5 yes, the lock is simply left unlocked for that time
20:12 MTDiscord <warr1024> So the emerge thread has to acquire some lock to check the queue and pull at least one item from it, then it does some loading work in background until it has a "ready to use" mapblock, and then just needs to acquire the lock to connect that mapblock to the env?  Or is there something more complex it's doing with that lock?
20:12 MTDiscord <warr1024> er I mean while holding that lock?
20:12 sfan5 no, that's a good explanation of what happens
20:13 Krock and then there's the mapgen part, which makes the code execution somewhat unpredictable in time
20:14 MTDiscord <warr1024> Yeah, though in both my lab and my field test cases, mapgen shouldn't have been a factor as all the mapblocks in question were existing and saved.
20:15 MTDiscord <warr1024> I suppose to be thorough we should test mapgen scenarios too, just to be certain they are no worse too.  Maybe I'll need to work on a more extensive lab setup for this.
20:15 sfan5 otoh mapgen still holds the lock for longer (due to on_generated) so maybe it just happens to not be affected
20:16 MTDiscord <warr1024> I'm starting to think that I should just run more non-trivial but repeatable setups in my lab tests, like snapshots of a developed world and complex game, and then just trust the lab results then.
20:17 sfan5 that would be worthwhile
20:17 MTDiscord <warr1024> I ran my tests in #15125 against a vanilla MTG world, where it was basically all just loading and saving mapblocks, because I was looking for a "minimal repro" setup, but now that the problem has been "proven" it's probably more useful to be looking at "real-world-like" scenarios so we can get a good idea of the impact of changes.
20:17 ShadowBot https://github.com/minetest/minetest/issues/15125 -- Busy Disks Cause Lag
20:18 sfan5 fwiw all my tests are with either minetest_classic or minetest_game, both very careful with performance
20:18 sfan5 I expect nodecore to be quite stressful for the engine due to all the entities alone
20:19 MTDiscord <warr1024> I might just have to start recording every single dtime to a huge array, and then flush it out to disk every minute or so, and start crunching the statistics on every server step.  Looking at averages or other window-limited aggregates can be deceptive if something happens in the background that throws a huge outlier into the data.
20:20 MTDiscord <warr1024> When it comes to server performance, from what I can tell, NodeCore's biggest impact is ABM saturation.  From what I can tell, entities have minimal impact; I see no measurable impact on dtime from having like 8000 of them loaded, and the only real consequence of it seems to be client-side rendering, if they're within visible range.
20:21 MTDiscord <warr1024> If entities are not physical, they don't seem to process collision, and if they don't have on_step callbacks, they don't impact the lua tick.
20:21 sfan5 you don't use get_objects_in_range often then?
20:21 MTDiscord <warr1024> I've completely eliminated that function, actually recently, though I was never using it very heavily.
20:22 MTDiscord <warr1024> most of the entities are tied to a specific node position, so I can have a node position "key" I can index them by, and I can just loop over luaentities each tick.
20:23 sfan5 good :)
20:23 sfan5 it would be another bottleneck otherwise
20:24 MTDiscord <warr1024> Since virtually none of my entities are static_save either, it's kind of necessary to loop over them externally instead of using on_step, because they can get deleted pretty much any time.  I used to have serious problems with the player wield attachments disappearing but I'm starting to think I solved them by switching to the "inside out" globalstep approach.
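
A small sketch of that "loop over luaentities from one globalstep" pattern (hypothetical entity name and fields; assumes core.luaentities and core.hash_node_position from the standard API):

    -- Instead of get_objects_inside_radius() plus per-entity on_step,
    -- drive all entities of one type from a single globalstep, indexed
    -- by the node position each one is tied to.
    core.register_entity("mymod:marker", {  -- hypothetical entity
        initial_properties = {physical = false, static_save = false},
        -- self.nodepos would be set by whatever spawns the entity
    })

    core.register_globalstep(function(dtime)
        local by_pos = {}  -- node position hash -> luaentity, rebuilt each tick
        for _, ent in pairs(core.luaentities) do
            if ent.name == "mymod:marker" and ent.nodepos then
                by_pos[core.hash_node_position(ent.nodepos)] = ent
                -- per-tick work for this entity goes here; it may vanish
                -- at any time since it is not static_save
            end
        end
    end)
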
20:33 sfan5 2024-09-25 22:32:06: [Main]:   Server::AsyncRunStep() [ms] _________________   52x  207.634
20:33 sfan5 2024-09-25 22:32:06: [Main]:   Server::SendBlocks(): Collect list [ms] _____    1x      12
20:33 sfan5 2024-09-25 22:32:06: [Main]:   Server::SendBlocks(): Send to clients [ms] __    1x      21
20:33 sfan5 this is with 200ms of server thread lag
20:33 sfan5 whoops
20:33 sfan5 2024-09-25 22:32:06: [Main]:   Server::yieldTo...() progress [#] ___________   36x  50.972
20:33 sfan5 2024-09-25 22:32:06: [Main]:   Server::yieldTo...() sleep [ms] _____________   36x   7.833
20:34 sfan5 these two
20:34 sfan5 takeaways: the sleep workaround wasn't activated every server step; sleep was 7ms on average *when* activated
20:34 sfan5 and for every ms slept the emerge thread managed to process 6.5 blocks
20:37 MTDiscord <warr1024> that sounds pretty good
20:37 MTDiscord <warr1024> I wonder if we can set an upper bound, like "if the server step spikes past 1000ms, don't do the sleep on this step" or something.
20:38 MTDiscord <warr1024> If a 600ms step turns into 610, that's no problem.  If it turns into 700, that could be a problem.  If a 1200ms step turns into like 1400 then that's pretty bad.
20:38 sfan5 even with 800ms of server lag the map loads very slowly, but not unplayably so. the yield workaround always sleeps 10ms, but this shouldn't be so bad
20:39 sfan5 Warr1024: I think the server could benefit from an upper limit in general, where, upon reaching it, it defers non-essential things to the next server step
20:40 MTDiscord <warr1024> Yeah, I've actually got a number of mechanics that do that already, but it's not something that we could do at like the engine level, it requires each game/mod to make that call itself, and there's no clean way to preempt lua.
20:40 MTDiscord <warr1024> The thing is, at this point, it's basically just not my lua code that's running most of the time, unless there's some new serious bug.
20:41 MTDiscord <warr1024> The biggest performance issues I have are in the C++ code with things like loading/saving data and running ABMs, and in some cases it can be really non-obvious how decisions made on the lua side impact behaviour on the C++ side.
20:41 fluxionary joined #minetest-dev
20:42 hwpplayer1 joined #minetest-dev
20:42 MTDiscord <warr1024> Like, Lars just discovered that if you delete a field that was marked private, it marks the block as dirty, which triggers a mapblock send packet later.  I haven't even asked yet about whether that would affect whether the block is considered dirty for save-to-disk purposes.
20:42 sfan5 "considered dirty for save-to-disk purposes" <- obviously, yes
20:43 MTDiscord <warr1024> There are a lot of things in the engine where if something has a value X, and you set it to value X, it's still marked as "dirty" and sent somewhere, which may have nontrivial cost ... but then sometimes it's impractical to ask the engine whether it was already set to X, or there's a cost associated with that itself.
20:43 MTDiscord <warr1024> For some things, I can maintain a cache, but for others it's totally impractical ... like, I can't cache every potential node metadata, for example.
20:44 sfan5 nodemeta is unaffected, setting a value that is already present is a no-op
20:44 MTDiscord <warr1024> In my case I'm more concerned about clearing a value that may already be not-present.
20:45 MTDiscord <warr1024> Though I guess this conversation already gave me like 2 or 3 ideas for things I can do to try to reduce marking blocks as dirty.
20:46 MTDiscord <warr1024> It'd be nice if there were some way to tell why a mapblock got marked as dirty, like, did a param0 get written, or a param2, or a value in meta...
20:46 sfan5 also applies to set_string(k, "") to clear something
20:46 sfan5 does *not* apply to mark_as_private
20:47 MTDiscord <warr1024> So if I want to set_string on a node meta, am I better off doing a get_string first and checking to make sure the value isn't already equal?
20:47 sfan5 no, that's wasted time
20:47 MTDiscord <warr1024> Oh, wait, I got it backwards, okay that makes sense
20:48 SFENCE joined #minetest-dev
20:48 MTDiscord <warr1024> so I should blindly set the value and trust that the engine won't mark something dirty, but I SHOULDN'T blindly mark as private
20:48 sfan5 yes
20:48 sfan5 this could just be fixed for mark_as_private too
20:48 MTDiscord <warr1024> heh, that'd be nice
20:49 MTDiscord <warr1024> I was wondering if I need to somehow check if it's already marked ... but I don't know if there's a getter for that.
20:49 MTDiscord <warr1024> One thing I can do though at least is not mark as private if I'm clearing a meta field, since that's redundant and I'm pretty sure that trips the dirty flag based on what Lars said.
20:56 MTDiscord <warr1024> does set_float(key, 0) do the same thing as set_string(key, "") or does it actually write a 0 or something? 🤔
20:57 sfan5 puts a "0"
20:59 MTDiscord <warr1024> Thanks.  So that means set_string(key, "") is the only way to "delete" a field?  I've been doing stuff like if value ~= 0 then meta:set_float(key, value) else meta:set_string(key, "") end to try to keep meta "compact" and avoid extra keys.  get_float(key) returns 0 whether the field is "0" or absent, it seems...
21:02 MTDiscord <warr1024> 'puts a "0"' ... feels kinda like a bug 😅 unless there's some really good reason why it shouldn't remove the field that I just don't understand.
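
The pattern described above, as a small helper sketch (hypothetical names): set_string(key, "") is what actually deletes a field, writes that don't change the stored value are no-ops, and mark_as_private is only applied when a value is actually stored, since it is the one call not confirmed to be a no-op when redundant.

    -- Store a float in node meta, deleting the field entirely when the
    -- value is 0 so the meta stays compact.
    local function set_compact_float(meta, key, value, private)
        if value ~= 0 then
            meta:set_float(key, value)
            if private then
                -- only mark when a value is actually stored; marking a
                -- cleared field would just be redundant work
                meta:mark_as_private(key)
            end
        else
            meta:set_string(key, "")  -- removes the field
        end
    end

    -- hypothetical usage at some node position `pos`:
    -- set_compact_float(core.get_meta(pos), "charge", 0.75, true)
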
21:05 SFENCE joined #minetest-dev
21:20 SFENCE joined #minetest-dev
21:47 [MatrxMT] <Zughy> sfan5: dtime goes between 0.09 and 0.1
21:47 [MatrxMT] <Zughy> a user tested on their server, same issue
21:48 sfan5 isn't 0.1 what you want to see?
21:50 [MatrxMT] <Zughy> yes, but it's definitely not 0.1 when you shoot
21:50 [MatrxMT] <Zughy> it is locally
21:50 [MatrxMT] <Zughy> I'd say it's around 0.25
21:51 [MatrxMT] <Zughy> even weapons requiring 0.2s are a little bit slower
21:51 sfan5 you mean the measured value is not 0.1 when shooting or it doesn't "feel" like that?
21:52 [MatrxMT] <Zughy> the value is 0.1 but when shooting it doesn't feel like that
21:52 sfan5 sounds like it might be network then
21:52 sfan5 can you attach a verbose log to the issue?
21:53 [MatrxMT] <Zughy> anything in particular? Like, when shooting, when the server starts..?
21:54 SFENCE joined #minetest-dev
21:54 sfan5 my suggestion is to start the server, join, shoot for 5 seconds, wait for 5 seconds, stop server
21:55 sfan5 no hurry btw, tomorrow suffices
22:12 SFENCE joined #minetest-dev
22:16 [MatrxMT] <Zughy> sfan5: published
22:21 hwpplayer1 joined #minetest-dev
22:22 SFENCE joined #minetest-dev
22:33 [MatrxMT] <Zughy> if it's not enough, we'll try to include the start and shutdown as well tomorrow
22:34 panwolfram joined #minetest-dev
22:58 SFENCE joined #minetest-dev
23:05 Eragon joined #minetest-dev
23:17 SFENCE joined #minetest-dev
23:50 YuGiOhJCJ joined #minetest-dev
23:52 Mantar joined #minetest-dev
23:53 SFENCE joined #minetest-dev
