Time Nick Message
01:15 MTDiscord Note to devs/Zughy for discussion in the near future for the rename: It may be a good idea to officialize the core namespace rather than add another namespace.
01:39 v-rob I agree. The core namespace already exists, and it also has the benefit of being fork-agnostic. No need to have three namespaces--core, minetest, and newname
01:41 v-rob (It will also make it so I don't have to configure my brain in and out of builtin mode--I frequently use minetest by accident when I should use core)
01:46 MTDiscord Especially since "the former game and aspiring engine formerly known as Minetest" would just be too long to type out each time I wanted to make an API call.
03:56 MTDiscord Call it nodecore
03:57 MTDiscord https://tenor.com/view/beluga-the-cat-hakosh1307-hakosh-beluga-cat-hug-gif-22532913
11:32 MTDiscord Core devs, I would like to know whether the idea itself of having the texture atlas in MT (what my #15061 does) is generally approved by you or not? If it is not, I would like to know what the reasons for that could be. It is also tagged now as "on the roadmap". I ask because it distracts other reviewers from reviewing the PR, since they don't know what state it is in, as Desour found out in the PR.
11:32 ShadowBot https://github.com/minetest/minetest/issues/15061 -- Texture atlas for mapblocks meshes by Andrey2470T
12:00 celeron55 added a comment
12:00 celeron55 it's an "if" for me. in short, it must be clearly better than the previous one which was removed
12:00 celeron55 if it is, then it's a "yes"
12:27 [MatrxMT] have we changed globalsteps somehow with 5.9.X? If I play locally everything works smoothly, but if I play online things go slower
12:27 [MatrxMT] Like, Block League stamina bar stutters and Block League submachine gun shoots more slowly. Try it on AES
12:28 [MatrxMT] those are both operation that should happen every 0.1s
12:28 [MatrxMT] *operations
13:59 MTDiscord Would be great if somebody could take a look at #15176
13:59 ShadowBot https://github.com/minetest/minetest/issues/15176 -- Decouple Ambient Occlusion from Shadows
15:16 [MatrxMT] "fun" bug, I don't know if Minetest related: I had left the client open for a while, online. I come back to find out the server crashed more than hour ago and when I reconnect, the mouse starts lagging everywhere on my PC. Like, it moves smoothly but clicks weren't always working. I close the client and it starts behaving correctly
15:16 [MatrxMT] *an hour ago
15:44 [MatrxMT] well #15193
15:44 ShadowBot https://github.com/minetest/minetest/issues/15193 -- Different globalstep compared to <5.9?
15:49 [MatrxMT] just to be clear: this thing completely breaks my minigame, so if any core dev could look into it, I'd really appreciate it
17:41 sfan5 since 5.9.1 or 5.9.0?
17:43 sfan5 okay that's answered in there
17:44 sfan5 have you tried measuring if the delta time is actually bigger than 0.1?
18:32 [MatrxMT] sfan5: we'll run a few tests in a couple of hours. Best way to do that?
18:49 MTDiscord Possibly by checking register_globalstep()'s dtime var
18:50 sfan5 core.get_time_us() can measure the actually elapsed time without relying on the server step
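For reference, a minimal sketch of the kind of measurement sfan5 suggests here, assuming a small standalone diagnostic mod; the 0.01 s tolerance and the log format are arbitrary choices, not anything from the engine:

    -- Log how much wall-clock time actually passes between globalsteps,
    -- independent of the dtime value the server hands to the callback.
    local last_us = core.get_time_us()
    core.register_globalstep(function(dtime)
        local now_us = core.get_time_us()
        local real_dtime = (now_us - last_us) / 1e6
        last_us = now_us
        if real_dtime > dtime + 0.01 then
            core.log("action", string.format(
                "globalstep: dtime=%.3f, real elapsed=%.3f", dtime, real_dtime))
        end
    end)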
19:14 Krock node timers too, I think
19:14 sfan5 planning to merge #15151 later. if there are any concerns please scream
19:14 ShadowBot https://github.com/minetest/minetest/issues/15151 -- [manual squash] Divorce map database locking from env lock by sfan5
19:17 Krock AAAAAAAAAAAAAAAAAAAAAAAAAAA
19:19 Krock I did have a look at the PR a few times - the concept makes sense overall, although I do not fully understand by how much the locking situation is now improved.
19:21 sfan5 previously while waiting for the database to return the block data into an std::string, we'd be holding the env lock
19:21 sfan5 that is now no longer the case
19:23 sfan5 it would be a simple but massive improvement if it weren't for the lock contention that starts being an issue on highly loaded servers
19:23 sfan5 (not a new issue per se but more severe now because the emergethread releases the lock for a longer time)
19:27 Krock I see. Thanks for the explanation. This is also largely documented in yieldToOtherThreads, which is nice to see.
19:28 Krock no concerns. the code looks good. feel free to merge later
19:33 MTDiscord That PR increases lag on servers, apparently by design, i.e. we're essentially trying to fix starvation of the emerge thread by starving the main thread. It seems to mitigate the problem when disk performance is bad, but creates a problem when it isn't and MT is CPU-limited.
19:35 MTDiscord If we decide to merge it, it may help some people, and may hurt others. In my case, I've already basically had to mitigate the disk issue by changing backup processes to avoid making the disk too busy, and testing so far showed that trade-off to be worse for me.
19:35 MTDiscord I think we're going to want a more permanent solution really soon, with like proper queues or something.
19:38 MTDiscord As of right now, server performance is a big obstacle for me, and this may affect my ability to adopt 5.10 when it's released, if it remains in the state it's in now. If it'll be fixed by 5.11 or 5.10.1 or something then that's fine, I can wait a bit ... but if it ends up trapping me on 5.9, that's something to consider I suppose.
19:41 MTDiscord The whole thing arose from the fact that MT fails to achieve single-threaded performance, and that even fully saturated it might only be using ~0.85 cores. It's weird that we reached the conclusion that reducing MT's utilization of available resources somehow solves this.
19:43 sfan5 I can't tell if this is supposed to be an insult toward my work
19:44 MTDiscord That's what you got from that?
19:44 MTDiscord Not the "I have many concerns about broken performance due to this PR"?
19:46 sfan5 it's "the situation is already bad and you are intentionally making it worse" paraphrased
19:47 Krock why is less locking worse?
19:47 MTDiscord I see
19:47 sfan5 i think the sensitive point is the fix-lock-contention-by-sleeping workaround
19:47 Krock if the overall server performance is an issue, then the emergethread will have to be paused until it catches up
19:48 Krock (as mapblock loading and generating does come with additional callbacks and traffic)
19:50 MTDiscord I'm not criticizing your work, I'm criticizing the suggestion that we're ready to merge it when discussion is still ongoing about consequences.
19:50 MTDiscord I was asked to scream, I think I did so quite politely.
19:50 Krock hmm. isn't the 1 ms sleep far faster than what the emerge thread can process? is the loop executed more than once?
19:50 sfan5 to be clear I am not happy with that either, it's awful. in my (admittedly artificial) tests it performed okay. I don't want to bloat up the PR into a total emerge code rewrite, so I'm willing to merge it now and see whether it turns out to be an unfortunate problem specific to Warr1024's setup or a flood of complaints about 5.10.0-dev shows up next week
19:51 sfan5 not implying your setup does not matter, but that would make the problem lower priority
19:51 sfan5 Krock: with 50ms of server thread lag, as few as 4ms of sleep were enough to get map loading up to speed
19:53 MTDiscord I'd actually really like to see progress made on this problem, whether it's a "proper" solution, a workaround, or just some experiments to gather data. I've just had a lot of trouble reproducing this "in the lab" and want to make sure that if I'm using "production" servers to get field data, I can mitigate risks.
19:53 sfan5 wouldn't it be possible to capture a lag profile on a prod server and replicate that in a lab using sleeps?
19:55 sfan5 this idea seems so simple that it shouldn't work
19:55 MTDiscord I've actually had pretty lousy reliability simulating lag with sleeps. The problem is I don't have a good way to "sleep until the current step takes X time" because I don't actually know how long the current step has been running for by the time my code hits. I also don't understand well enough the various things happening in what order within the step loop.
19:56 MTDiscord My szutil_lag mod can't even trust dtime, it basically just busy-waits until the current time is at least the time it reached the previous point in the step loop, plus whatever lag time is configured.
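Not the actual szutil_lag code, just a rough sketch of the busy-wait approach described above; target_lag is a stand-in for whatever setting the real mod exposes:

    -- Artificially stretch each server step: busy-wait until at least
    -- target_lag seconds have passed since this point in the previous step.
    local target_lag = 0.5  -- seconds of simulated lag (illustrative value)
    local last_us = core.get_time_us()
    core.register_globalstep(function(dtime)
        local deadline = last_us + target_lag * 1e6
        while core.get_time_us() < deadline do
            -- spin: there is no sleep available to mods, so burn CPU instead
        end
        last_us = core.get_time_us()
    end)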
19:57 MTDiscord What really kills the experience isn't the average step time but large spikes, and if those lead to large extra sleep times in proportion, that exacerbates the problem ... but the emerge thread should only need a certain amount of time to acquire a lock, and waiting longer may not necessarily help it.
19:58 sfan5 it needs to acquire the lock once or twice *per* emerged block
19:59 sfan5 this isn't any different from before either, but it holds it for much shorter. which somehow leads to this
19:59 MTDiscord Yeah, that's ... unfortunate. If it could just acquire the lock once, and hang onto it for all the blocks it's got queued up, that'd be so much better ... but I guess "queued up" is sort of the root of the issue here, and what we don't have time right now to make happen.
19:59 Krock sfan5: random question. do you happen to know whether the emerge manager is smart enough to cancel out duplicate requests, or ones that are outdated (i.e. block already loaded)?
20:00 sfan5 short answer: yes
20:00 Krock okay good
20:00 MTDiscord haha, that's a shame, because if it weren't, that sounds like it'd be an easy win 👤
20:00 MTDiscord 😄
20:02 sfan5 perhaps I could cap the sleep time so that if max_simultaneous_block_sends_per_client (40) blocks are emerged it stops
20:02 sfan5 that could reduce the lag impact and still provide reasonable map loading for users
20:02 sfan5 but could also do nothing
20:03 MTDiscord you could also sprinkle verbose logging all over the place to look for whether it WOULD have an impact, or whether things are already not peaking past 40.
20:05 MTDiscord tbh when dealing with "lag spike" problems, all the steps that AREN'T lag spikes really mess up the statistics. I got some value out of making a version of the jitprofiler that instead of writing to a file, records samples to an array, and only flushes that array to a file if the previous step dtime is > 0.25. That showed me immediately that "lag spikes" have a different performance profile than "normal" steps, and at least some
20:05 MTDiscord things that stood out on the unfiltered graph were minuscule when problems were happening.
20:05 MTDiscord Sadly it hasn't yet pointed me to what IS the problem, just what ISN'T that looked like it might have been.
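A toy version of that filtering idea, recording plain dtime values rather than real profiler samples; the file name, buffer cap, and 0.25 s threshold are illustrative only:

    -- Buffer per-step timings and only dump them when a lag spike occurs,
    -- so the output isn't dominated by the uninteresting "normal" steps.
    local samples = {}
    local SPIKE = 0.25  -- seconds; dump the buffer when dtime exceeds this
    core.register_globalstep(function(dtime)
        samples[#samples + 1] = string.format("%.4f", dtime)
        if dtime > SPIKE then
            local f = io.open(core.get_worldpath() .. "/lagspikes.txt", "a")
            if f then
                f:write(table.concat(samples, "\n") .. "\n---\n")
                f:close()
            end
            samples = {}
        elseif #samples > 10000 then
            samples = {}  -- no spike for a long time; discard to bound memory
        end
    end)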
20:06 Krock yieldToOtherThreads is currently written such that sleeps are proportional to dtime. But this dtime originates from the server thread that you're blocking
20:06 sfan5 to be exact it is proportional if the emerge queue is reasonably filled and it makes progress
20:07 sfan5 I guess an upper limit of 10 would really make sense before merging
20:07 MTDiscord Hmm, I didn't know that there was actually progress being made.
20:07 Krock indeed
20:08 sfan5 well we can't judge which progress is good so that can still cause problems
20:09 sfan5 it could go like 150 300 301 302 303 304 306 310 312 315 and we should have stopped after the second one
20:09 MTDiscord It does kind of suck to have 1ms per mapblock though ... it's not hard to queue up hundreds of mapblocks by traveling, and having the main thread sleep hundreds of ms is noticeable when the server is already running into lag spike issues ... which it will, since saves are also still happening, are a can of worms we haven't opened yet, and will now be contending with all the load activity.
20:09 sfan5 it's not 1ms per block
20:10 sfan5 1ms for the emerge thread to do as many blocks as it can
20:10 sfan5 limited by the response time of your db, of course
20:10 sfan5 this workaround doesn't work when both the CPU and db are slow
20:10 MTDiscord oic, so it just doesn't try to reacquire the lock during that time, but the emerge thread can repeatedly release and reacquire it?
20:10 Krock I'd expect it to process at most 2 per step
20:10 Krock (step = 1 ms)
20:11 sfan5 yes, the lock is simply left unlocked for that time
20:12 MTDiscord So the emerge thread has to acquire some lock to check the queue and pull at least one item from it, then it does some loading work in background until it has a "ready to use" mapblock, and then just needs to acquire the lock to connect that mapblock to the env? Or is there something more complex it's doing with that lock?
20:12 MTDiscord er I mean while holding that lock?
20:12 sfan5 no, that's a good explanation of what happens
20:13 Krock and then there's the mapgen part which makes the code execution somewhat non-predictable in time
20:14 MTDiscord Yeah, though in both my lab and my field test cases, mapgen shouldn't have been a factor as all the mapblocks in question were existing and saved.
20:15 MTDiscord I suppose to be thorough we should test mapgen scenarios too, just to be certain they are no worse.
20:15 sfan5 otoh mapgen still holds the lock for longer (due to on_generated) so maybe it just happens to not be affected
20:16 MTDiscord I'm starting to think that I should just run more non-trivial but repeatable setups in my lab tests, like snapshots of a developed world and complex game, and then just trust the lab results then.
20:17 sfan5 that would be worthwhile
20:17 MTDiscord I ran my tests in #15125 against a vanilla MTG world, where it was basically all just loading and saving mapblocks, because I was looking for a "minimal repro" setup, but now that the problem has been "proven" it's probably more useful to be looking at "real-world-like" scenarios so we can get a good idea of the impact of changes.
20:17 ShadowBot https://github.com/minetest/minetest/issues/15125 -- Busy Disks Cause Lag
20:18 sfan5 fwiw all my tests are with either minetest_classic or minetest_game, both very careful with performance
20:18 sfan5 I expect nodecore to be quite stressful for the engine due to all the entities alone
20:19 MTDiscord I might just have to start recording every single dtime to a huge array, and then flush it out to disk every minute or so, and start crunching the statistics on every server step. Looking at averages or other window-limited aggregates can be deceptive if something happens in the background that throws a huge outlier into the data.
20:20 MTDiscord When it comes to server performance, from what I can tell, NodeCore's biggest impact is ABM saturation. Entities, on the other hand, have minimal impact; I see no measurable impact on dtime from having like 8000 of them loaded, and the only real consequence of it seems to be client-side rendering, if they're within visible range.
20:21 MTDiscord If entities are not physical, they don't seem to process collision, and if they don't have on_step callbacks, they don't impact the lua tick.
20:21 sfan5 you don't use get_objects_in_range often then?
20:21 MTDiscord I've completely eliminated that function, actually recently, though I was never using it very heavily.
20:22 MTDiscord most of the entities are tied to a specific node position, so I can have a node position "key" I can index them by, and I can just loop over luaentities each tick.
20:23 sfan5 good :)
20:23 sfan5 it would be another bottleneck otherwise
20:24 MTDiscord Since virtually none of my entities are static_save either, it's kind of necessary to loop over them externally instead of using on_step, because they can get deleted pretty much any time. I used to have serious problems with the player wield attachments disappearing but I'm starting to think I solved them by switching to the "inside out" globalstep approach.
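Roughly what that pattern can look like, sketched here rather than taken from NodeCore; the entity name "mymod:anchor" and the node_key helper are invented for illustration:

    -- Index entities by the node position they belong to, and drive them from
    -- one globalstep instead of per-entity on_step or get_objects_in_range().
    local by_pos = {}
    local function node_key(pos)
        return core.hash_node_position(vector.round(pos))
    end
    core.register_globalstep(function(dtime)
        by_pos = {}
        for _, ent in pairs(core.luaentities) do
            if ent.name == "mymod:anchor" then
                local pos = ent.object:get_pos()
                if pos then
                    by_pos[node_key(pos)] = ent
                    -- per-entity work runs here, "inside out", not in on_step
                end
            end
        end
    end)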
20:33 sfan5 2024-09-25 22:32:06: [Main]: Server::AsyncRunStep() [ms] _________________ 52x 207.634
20:33 sfan5 2024-09-25 22:32:06: [Main]: Server::SendBlocks(): Collect list [ms] _____ 1x 12
20:33 sfan5 2024-09-25 22:32:06: [Main]: Server::SendBlocks(): Send to clients [ms] __ 1x 21
20:33 sfan5 this is with 200ms of server thread lag
20:33 sfan5 whoops
20:33 sfan5 2024-09-25 22:32:06: [Main]: Server::yieldTo...() progress [#] ___________ 36x 50.972
20:33 sfan5 2024-09-25 22:32:06: [Main]: Server::yieldTo...() sleep [ms] _____________ 36x 7.833
20:34 sfan5 these two
20:34 sfan5 takeaways: the sleep workaround wasn't activated every server step; sleep was 7ms on average *when* activated
20:34 sfan5 and for every ms slept the emerge thread managed to process 6.5 blocks
20:37 MTDiscord that sounds pretty good
20:37 MTDiscord I wonder if we can set an upper bound, like "if the server step spikes past 1000ms, don't do the sleep on this step" or something.
20:38 MTDiscord If a 600ms step turns into 610, that's no problem. If it turns into 700, that could be a problem. If a 1200ms step turns into like 1400 then that's pretty bad.
20:38 sfan5 even with 800ms of server lag the map loads very slowly, but not unplayably so. the yield workaround always sleeps 10ms, but this shouldn't be so bad
20:39 sfan5 Warr1024: I think the server could benefit from an upper limit in general, where upon reaching it, it defers non-essential things to the next server step
20:40 MTDiscord Yeah, I've actually got a number of mechanics that do that already, but it's not something that we could do at like the engine level, it requires each game/mod to make that call itself, and there's no clean way to preempt lua.
20:40 MTDiscord The thing is, at this point, it's basically just not my lua code that's running most of the time, unless there's some new serious bug.
20:41 MTDiscord The biggest performance issues I have are in the C++ code with things like loading/saving data and running ABMs, and in some cases it can be really non-obvious how decisions made on the lua side impact behaviour on the C++ side.
20:42 MTDiscord Like, Lars just discovered that if you delete a field that was marked private, it marks the block as dirty, which triggers a mapblock send packet later. I haven't even asked yet about whether that would affect whether the block is considered dirty for save-to-disk purposes.
20:42 sfan5 "considered dirty for save-to-disk purposes" <- obviously, yes
20:43 MTDiscord There are a lot of things in the engine where if something has a value X, and you set it to value X, it's still marked as "dirty" and sent somewhere, which may have nontrivial cost ... but then sometimes it's impractical to ask the engine whether it was already set to X, or there's a cost associated with that itself.
20:43 MTDiscord For some things, I can maintain a cache, but for others it's totally impractical ... like, I can't cache every potential node metadata, for example.
20:44 sfan5 nodemeta is unaffected, setting a value that is already present is a no-op
20:44 MTDiscord In my case I'm more concerned about clearing a value that may already be not-present.
20:45 MTDiscord Though I guess this conversation already gave me like 2 or 3 ideas for things I can do to try to reduce marking blocks as dirty.
20:46 MTDiscord It'd be nice if there were some way to tell why a mapblock got marked as dirty, like, did a param0 get written, or a param2, or a value in meta...
20:46 sfan5 also applies to set_string(k, "") to clear something
20:46 sfan5 does *not* apply to mark_as_private
20:47 MTDiscord So if I want to set_string on a node meta, am I better off doing a get_string first and checking to make sure the value isn't already equal?
20:47 sfan5 no, that's wasted time
20:47 MTDiscord Oh, wait, I got it backwards, okay that makes sense
20:48 MTDiscord so I should blindly set the value and trust that the engine won't mark something dirty, but I SHOULDN'T blindly mark as private
20:48 sfan5 yes
20:48 sfan5 this could just be fixed for mark_as_private too
20:48 MTDiscord heh, that'd be nice
20:49 MTDiscord I was wondering if I need to somehow check if it's already marked ... but I don't know if there's a getter for that.
20:49 MTDiscord One thing I can do though at least is not mark as private if I'm clearing a meta field, since that's redundant and I'm pretty sure that trips the dirty flag based on what Lars said.
20:56 MTDiscord does set_float(key, 0) do the same thing as set_string(key, "") or does it actually write a 0 or something? 🤔
20:57 sfan5 puts a "0"
20:59 MTDiscord Thanks. So that means set_string(key, "") is the only way to "delete" a field? I've been doing stuff like if value ~= 0 then meta:set_float(key, value) else meta:set_string(key, "") end to try to keep meta "compact" and avoid extra keys. get_float(key) returns 0 whether the field is "0" or absent, it seems...
21:02 MTDiscord 'puts a "0"' ... feels kinda like a bug 😅 unless there's some really good reason why it shouldn't remove the field that I just don't understand.
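A small helper along those lines, folding in what sfan5 confirmed above (setting an unchanged value is a no-op, set_string(key, "") removes the field, mark_as_private is not deduplicated); set_compact_float is just an illustrative name:

    -- Write a numeric meta field, deleting it instead of storing "0",
    -- and only mark it private when a value is actually kept.
    local function set_compact_float(meta, key, value, private)
        if value ~= 0 then
            meta:set_float(key, value)   -- no-op if the stored value is unchanged
            if private then
                -- note: per the discussion, this currently dirties the block even
                -- if the key was already private, so avoid calling it redundantly
                meta:mark_as_private(key)
            end
        else
            meta:set_string(key, "")     -- the only way to actually remove the field
        end
    end

    -- usage: set_compact_float(core.get_meta(pos), "charge", charge, true)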
21:47 [MatrxMT] sfan5: dtime goes between 0.09 and 0.1
21:47 [MatrxMT] a user tested on their server, same issue
21:48 sfan5 isn't 0.1 what you want to see?
21:50 [MatrxMT] yes, but it's definitely not 0.1 when you shoot
21:50 [MatrxMT] it is locally
21:50 [MatrxMT] I'd say it's around 0.25
21:51 [MatrxMT] even weapons requiring 0.2s are a little bit slower
21:51 sfan5 you mean the measured value is not 0.1 when shooting or it doesn't "feel" like that?
21:52 [MatrxMT] the value is 0.1 but when shooting it doesn't feel like that
21:52 sfan5 sounds like it might be network then
21:52 sfan5 can you attach a verbose log to the issue?
21:53 [MatrxMT] anything in particular? Like, when shooting, when the server starts..?
21:54 sfan5 my suggestion is to start the server, join, shoot for 5 seconds, wait for 5 seconds, stop server
21:55 sfan5 no hurry btw, tomorrow suffices
22:16 [MatrxMT] sfan5: published
22:33 [MatrxMT] if it's not enough, we'll try to include the start and shutdown as well tomorrow
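One way to check whether the firing really happens every 0.1 s on the server side, independent of how it feels on the client; shoot() here is a stand-in for whatever Block League's actual fire handler is called:

    -- Log the real interval between shots server-side, to separate a step/timer
    -- problem from network or client-side perception.
    local last_shot_us = nil
    local function shoot(player)  -- hypothetical fire handler
        local now_us = core.get_time_us()
        if last_shot_us then
            core.log("action", string.format("shot interval: %.3f s",
                (now_us - last_shot_us) / 1e6))
        end
        last_shot_us = now_us
        -- ... actual shooting logic ...
    end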