IRC log for #minetest-dev, 2014-10-14

All times shown according to UTC.

Time Nick Message
00:04 alexxss joined #minetest-dev
00:06 rickmcfarley joined #minetest-dev
00:08 kahrl does anyone else get weird crosses above all dry shrubs when fsaa >= 2? http://i.imgur.com/zvhhjwf.png
00:08 kahrl (this screenshot is with fsaa = 16, but it happens with anything but fsaa = 1 or 0)
00:09 kahrl strangely enough it doesn't seem to happen with other plantlike nodes
00:13 kahrl this is with an ATI Radeon HD 6850 using the opensource ati driver on gentoo
00:14 VanessaE I saw that effect once before with something RBA was working on
00:14 VanessaE at the time I think it was the texture "wrapping through" the bottom of the mesh and back to the top
00:14 ShadowNinja kahrl: I can reproduce, also a line above the hand.
00:15 kahrl VanessaE: seems plausible
00:15 kahrl maybe it happens with other plantlike nodes too but I can't see it because it's too short
00:17 kahrl ShadowNinja: not getting that here
00:17 ShadowNinja kahrl: Got it with fsaa=64 or so.
00:18 kahrl ah, let me try
00:18 kahrl at fsaa=16 the extruded wield meshes are already all broken up through (lines between the pixels)
00:19 kahrl not happening with fsaa=64 either, but I think my card can't handle that setting
00:24 OldCoder I'd really appreciate help with the RE-SEND RELIABLE bug. This is a lockup that seems to be a fundamental issue.
00:24 OldCoder All obvious patches including one by Zeno` are failing
00:26 kahrl OldCoder: only sapier (and maybe c55) are likely to be able to help with that, I think
00:27 OldCoder It is regrettable; this bug is completely fatal. There is no workaround at all.
00:27 OldCoder I will need to shut down and do not wish to do so. I am looking at the code again.
00:27 OldCoder It should be a simple fix.
00:27 OldCoder But it escapes me.
00:28 kahrl OldCoder: issue #?
00:28 eeew joined #minetest-dev
00:28 OldCoder It has been reported in the wild about 4 times but I don't know if it has an issue number yet
00:28 OldCoder Miner_48 did the research
00:28 OldCoder He forwarded a number of web pages which showed that other people have the issue
00:29 OldCoder Zeno` knows the code and may help when he wakes up
00:29 OldCoder He offered a patch yesterday but it had no effect
00:29 OldCoder I am experimenting with other patches now
00:29 OldCoder Essentially, connection.cpp locks up in the code that says RE-SEND RELIABLE
00:30 OldCoder Millions and millions of the lines are printed
00:30 kahrl writing an issue with all the relevant information might help sapier to get up to speed
00:30 OldCoder You are correct
00:30 OldCoder If Zeno` returns I will seek his advice regarding what to say; as he has now worked on the same code
00:32 VanessaE *looks at clock*  Zeno should be here within the next hour or so
00:32 OldCoder Yes, we will see
00:32 OldCoder Now that I have started some new worlds I wish to keep them up; but am now on time IRL. I will speak again with him.
00:33 OldCoder Lockups are the worst type of issue. With crashes, at least you can restart.
00:37 kahrl I can think of something worse: map corruption :P
00:40 OldCoder It is correct
00:40 ShadowNinja OldCoder: You can write a script that checks the logs for those re-send messages, and restarts the server if it finds too many of them.
00:40 OldCoder ShadowNinja, the ultimate kludge :-)
00:40 ShadowNinja You'll have to pipe stdout/stderr to it.
00:40 OldCoder No
00:41 OldCoder The messages appear in debug.txt
00:41 OldCoder A daemon could watch there
00:41 OldCoder But the problem occurs often sometimes
00:41 OldCoder Restarting every few minutes is not viable
00:41 ShadowNinja OldCoder: I didn't say it wasn't hacky.  ;-)
00:41 OldCoder Never mind hacky
00:41 OldCoder Is a game that shuts down every few minutes going to work?
00:41 OldCoder It appears that I have attracted a lot of mobile devices with one world
00:42 OldCoder The mobiles are confusing the code
00:42 OldCoder In short, Kindle equals Minedeath
00:42 ShadowNinja OldCoder: Yes, that works too, but I'd use the pipe so the high-verbosity messages are just kept in memory (debug logs can balloon to GBs in size with high enough log levels).
00:42 OldCoder Indeed. Not viable either way due to frequency of lockups.
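The watchdog idea discussed above (count the re-send messages and restart when there are too many) could be sketched roughly as follows. This is only an illustration: the class name, the threshold, and the window length are hypothetical, and the only thing taken from the log is the "RE-SEND RELIABLE" marker text. The lines could come from a pipe or from tailing debug.txt; restarting the server is left to the caller.

```python
from collections import deque

MARKER = "RE-SEND RELIABLE"  # message text quoted in the discussion


class ResendWatchdog:
    """Flags when too many marker lines arrive within a sliding time window.

    Hypothetical helper; thresholds are assumptions, not values from Minetest.
    """

    def __init__(self, max_hits, window_seconds):
        self.max_hits = max_hits
        self.window_seconds = window_seconds
        self.hits = deque()  # timestamps of recent marker lines

    def feed(self, line, now):
        """Record one log line seen at time `now` (in seconds).

        Returns True when more than `max_hits` marker lines fall in the window.
        """
        if MARKER in line:
            self.hits.append(now)
        # Forget marker lines that have fallen out of the window.
        while self.hits and now - self.hits[0] > self.window_seconds:
            self.hits.popleft()
        return len(self.hits) > self.max_hits
```

Only timestamps are kept in memory, so the high-verbosity output never has to be written to disk. As noted in the conversation, this does nothing for a server that locks up every few minutes.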
00:43 OldCoder Sometimes hours, but sometimes every few minutes. This has been happening for about four days now. I have just made another experimental patch.
00:44 OldCoder This is probably a high-value issue to address, with the rise of mobiles in popularity
00:45 * kahrl tries https://gist.github.com/kahrl/9f28eca10f3d62c9bd63 right now to reproduce the problem maybe
00:47 ShadowNinja OldCoder: Is this on all of your servers?
00:48 OldCoder On two worlds so far. Ones that have recently become popular.
00:48 OldCoder You are familiar with one of them.
00:48 * OldCoder reviews the gist
00:49 OldCoder kahrl, that is interesting, what does the random disconnect do?
00:49 OldCoder random drops packets? clever
00:49 kahrl ok, doesn't seem to be enough to get it to lock up
00:49 OldCoder So it simply doesn't send them and they pile up. Will they be classified in the RE-SEND RELIABLE group?
00:50 OldCoder What constitutes a RELIABLE packet?
00:50 kahrl yeah it resends them but since the likelihood is still high that the resent ones arrive, it won't lock up
00:50 OldCoder I'm wondering if there is a bug elsewhere. If it helps, sequence numbers sometimes jump from 500 to 65000
00:50 OldCoder Would this be normal?
00:50 kahrl a reliable packet is one that one can't afford to drop
00:51 kahrl e.g. position updates can be dropped (they will be resent soon anyway), but chat messages can't
00:51 OldCoder Not RELIABLE then but ESSENTIAL ?
00:51 kahrl reliable = essential
00:51 OldCoder Got it, thanks
00:51 OldCoder So what might trigger millions of resends?
00:51 OldCoder If we look at it differently
00:52 kahrl a peer suddenly disappearing?
00:52 OldCoder Hm
00:52 OldCoder In this case
00:52 kahrl although it wouldn't be literally millions
00:52 OldCoder It *is*
00:52 OldCoder This is what has caused the lockups
00:52 OldCoder Millions of lines (literally) of RE-SEND RELIABLE
00:52 OldCoder Is this a clue?
00:52 kahrl over what time period?
00:53 OldCoder 1062624 of those lines today
00:53 OldCoder And that is *with* patches intended to limit them
00:53 OldCoder I suspect it was 5 million yesterday
00:55 kahrl perhaps the server fails to time out such peers for some odd reason
00:55 OldCoder kahrl, I'm guessing that something corrupts sequence numbers or queues
00:56 OldCoder The default code drops the packets, reliable or not, if the resend count exceeds 5
00:56 OldCoder Yet millions of lines are printed
00:56 OldCoder What might this imply?
00:56 OldCoder I think the resend count might be ending up as -50000 or something
00:56 OldCoder I'm adding a kludge to address this if so
00:57 * OldCoder rests briefly and thanks you
00:57 ShadowNinja OldCoder: Or reliable messages are continually being added to the queue.
00:57 OldCoder Hm
00:57 kahrl could you make a histogram of how many resends there are each minute?
00:57 OldCoder But what would add so many of them?
00:57 OldCoder Yes. If the problem occurs after the latest patch.
00:58 OldCoder Resetting the log file now
00:58 kahrl I wonder if it stays at about 12 re-sends per second all the time or if it suddenly piles up
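A per-minute histogram like the one kahrl asks for could be produced with a few lines of Python. This sketch assumes each log line starts with a "YYYY-MM-DD HH:MM:SS" timestamp; that format is an assumption, and the slice would need adjusting if the actual debug.txt lines look different. The function name is hypothetical.

```python
from collections import Counter


def resends_per_minute(lines, marker="RE-SEND RELIABLE"):
    """Count marker lines per minute.

    Assumes every line begins with "YYYY-MM-DD HH:MM:SS" (an assumption
    about the log format), so the first 16 characters identify the minute.
    """
    counts = Counter()
    for line in lines:
        if marker in line:
            counts[line[:16]] += 1
    # Sort by minute so the result reads chronologically.
    return dict(sorted(counts.items()))
```

A flat count per minute would suggest a steady resend rate; a sudden spike would suggest that resends pile up at a particular moment.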
00:58 OldCoder Let me look at yesterday's
00:58 rickmcfarley joined #minetest-dev
00:59 OldCoder http://minetest.org/lockup.txt
00:59 OldCoder kahrl, if you are curious, kindly glance at this file ^
00:59 ShadowNinja OldCoder: is your `if (k->resend_count > 5) break;` patch before the print line?  Also, does breaking cause the packet to be removed from the queue?
01:00 ShadowNinja OldCoder: You can try to add a DisconnectPeer line there instead.
01:00 OldCoder It was before the print line.  And I don't think it caused removal. Zeno` produced a patch that was to prevent this but it also did not work. I tried the DisconnectPeer an hour ago.
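ShadowNinja's question matters because a packet that is skipped but left in the queue will time out again on the next pass. The toy model below illustrates the difference; it is not the code in connection.cpp. The function name and the dictionary layout are hypothetical, the limit of 5 is the value mentioned in the log, and the negative-count check mirrors OldCoder's guess about a corrupted counter.

```python
MAX_RESENDS = 5  # the limit mentioned in the discussion


def process_timed_out(queue):
    """Handle one pass over timed-out reliable packets (toy model).

    `queue` is a list of dicts with 'seqnum' and 'resend_count'.
    Packets over the limit, or with a nonsensical negative count, are
    removed from the queue instead of merely being skipped.
    Returns (seqnums to resend, seqnums dropped).
    """
    to_resend, dropped, keep = [], [], []
    for pkt in queue:
        count = pkt["resend_count"]
        if count < 0 or count > MAX_RESENDS:
            dropped.append(pkt["seqnum"])
            continue  # not added to `keep`, so it leaves the queue
        pkt["resend_count"] = count + 1
        to_resend.append(pkt["seqnum"])
        keep.append(pkt)
    queue[:] = keep  # update the caller's list in place
    return to_resend, dropped
```

With a plain `break` and no removal, the over-limit packet would still be in the queue on the next pass and would trigger the same check again.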
01:00 OldCoder I think something else is corrupted. Well, it will become clear in time.
01:01 OldCoder Thank you for remarks. I might go over all of the patches tried for more ideas.
01:01 * OldCoder rests briefly
01:01 OldCoder Zzz
01:01 proller yes, connection.cpp corrupted
01:01 OldCoder Hm? Code is believed to need work?
01:02 OldCoder I will fiddle with it further in a few minutes after a nap
01:02 kahrl that was a very brief rest :P
01:02 ShadowNinja OldCoder: proller is a troller, ignore him.
01:02 OldCoder proller troller?
01:02 OldCoder "The wheels on the troll go round and round..."
01:02 ShadowNinja OldCoder: Yep, in fact "troller" is his alt nick.
01:02 OldCoder Um, O.K.
01:03 proller yes, but sometimes I can say true things too
01:03 OldCoder I thought C55 disliked, um, trolling in this channel
01:03 * OldCoder must rest for 5 to 10 minutes
01:04 proller reverting connection.cpp to year ago state can help
01:04 VanessaE /kick proller stop trolling
01:04 VanessaE :P
01:04 ShadowNinja OC: I told him, he agreed to quiet him but said that he wouldn't bother doing it himself.
01:04 proller but i'm trying to help
01:05 VanessaE proller: reverting to code that's 10x slower and far less reliable is no solution and you KNOW it.
01:05 VanessaE strike that, more like 100x slower
01:05 proller old code was tunable to speeds higher than now
01:05 proller and 100x more stable
01:06 VanessaE then how about you offer up a patch that actually fixes the problems you perceive.
01:06 ShadowNinja Seems like no ops are around now, other than possibly kwolekr.
01:07 ShadowNinja I've PMd sfan though.
01:07 proller ShadowNinja, you need to change some spaces around
01:12 OldCoder J,
01:12 OldCoder Hm
01:12 OldCoder What if I put
01:12 OldCoder a few milliseconds delay in the loop?
01:13 OldCoder Maybe the resends are too fast
01:13 kahrl it'll probably lock up even faster
01:14 kahrl but I guess you could try
01:18 OldCoder Hm
01:18 OldCoder Even faster?
01:18 * OldCoder does not think that sounds desirable
01:22 OldCoder kahrl, it appears to be 100s of resends per second if that is what you were curious about
01:22 OldCoder But never on every packet
01:23 OldCoder Perhaps dozens instead of 100s
01:32 OldCoder ShadowNinja, disconnectPeer, forceTimeout, or both?
01:34 ShadowNinja OldCoder: forceTimeout sounds like the right function to use.
01:36 OldCoder Experimenting
01:37 OldCoder Didn't work before but still playing with it
01:37 OldCoder Added a 20ms delay on resend
01:37 OldCoder We'll see if that makes it worse
01:38 kaeza joined #minetest-dev
01:47 OldCoder Hm. The 20ms delay may have helped or may be coincidence.
02:04 zat joined #minetest-dev
02:43 NakedFury joined #minetest-dev
02:45 mos_basik__ joined #minetest-dev
02:49 MikeFair_ joined #minetest-dev
02:52 monte joined #minetest-dev
03:13 GrimKriegor joined #minetest-dev
03:21 rmilan joined #minetest-dev
03:21 Robby joined #minetest-dev
04:02 kaeza joined #minetest-dev
04:31 sol_invictus joined #minetest-dev
04:41 MikeFair joined #minetest-dev
04:47 Miner_48er joined #minetest-dev
05:00 kaeza joined #minetest-dev
05:17 werwerwer joined #minetest-dev
05:46 kaeza joined #minetest-dev
05:46 HLuaBot joined #minetest-dev
05:46 harrison joined #minetest-dev
05:46 rickmcfarley joined #minetest-dev
05:53 Hunterz joined #minetest-dev
06:09 mos_basik joined #minetest-dev
06:27 darkrose joined #minetest-dev
07:17 ninnghazad joined #minetest-dev
08:04 shmanceloticus joined #minetest-dev
09:11 PenguinDad joined #minetest-dev
09:53 chchjesus joined #minetest-dev
09:56 Amaz joined #minetest-dev
10:13 FR^2 joined #minetest-dev
10:28 ImQ009 joined #minetest-dev
12:53 Hunterz joined #minetest-dev
13:03 ImQ009 joined #minetest-dev
13:15 ImQ009 joined #minetest-dev
13:22 iqualfragile joined #minetest-dev
14:00 VanessaE joined #minetest-dev
14:07 AnotherBrick joined #minetest-dev
14:33 rickmcfarley joined #minetest-dev
14:45 dhasenan joined #minetest-dev
15:49 NakedFury joined #minetest-dev
15:56 zat joined #minetest-dev
15:57 Calinou joined #minetest-dev
16:01 Sokomine joined #minetest-dev
16:01 OldCoder A problem world stayed up overnight and so did my client. I may have a patch for the RE-SEND RELIABLE problem.
16:01 OldCoder Sokomine, VanessaE, sfan5, ShadowBot, celeron55 ^
16:02 sfan5 I'd suggest pasting the patch somewhere :)
16:02 OldCoder Of course. But testing is required and I'll also ask you or others 1 or 2 key questions.
16:03 OldCoder Wished to indicate progress on an intractable problem.
16:03 OldCoder Will tweak it further and then post.
16:03 proller joined #minetest-dev
16:04 OldCoder sfan5, just 1 question for now. What are negative consequences of increasing minimum resend timeout from 0.1 to 0.5 ?
16:04 OldCoder A question for anybody else as well ^
16:04 RealBadAngel joined #minetest-dev
16:05 sfan5 OldCoder: a client will experience even more delay if a packet gets lost
16:05 sfan5 s/client/socket/
16:05 OldCoder sfan5, without the increase, the game locks up
16:06 OldCoder Not the client, but the entire game, it appears
16:06 sfan5 like not being able to look around?
16:06 OldCoder The world goes dead entirely; a lockup
16:07 OldCoder For everybody
16:07 sfan5 hm
16:07 OldCoder I have spent about 5 days on this
16:07 sfan5 someone should debug that
16:07 OldCoder <- did
16:07 OldCoder But no explanation
16:07 sfan5 ..by looking at the traces of all threads when the server is locked up
16:07 OldCoder It continued to run
16:07 OldCoder But was busy with millions of RE-SEND RELIABLEs
16:08 OldCoder Literally, millions of them
16:08 OldCoder My guess is corruption somewhere. Zeno` gave me a patch to force timeouts but it was insufficient.
16:08 sfan5 maybe someone can mistake the error in your patch
16:08 sfan5 wat
16:08 OldCoder He didn't make a mistake
16:08 sfan5 maybe someone can find the mistake in the patch*
16:08 OldCoder I said, millions
16:09 OldCoder There was no mistake in the patch; it was simply incomplete
16:09 OldCoder connection.cpp as it stands is not functional
16:09 OldCoder Worlds can easily fall into a state where millions of RE-SEND RELIABLEs occur
16:09 OldCoder When I say, millions, I refer to 10 to the sixth power
16:09 OldCoder That is a lot of zeroes!
16:10 OldCoder sfan5, review if you wish: http://minetest.org/lockup.txt
16:11 OldCoder ^ That file was produced by the unpatched server (i.e., upstream as it was)
16:11 sfan5 "WARNING: ACKed packet not in outgoing queue" o.o
16:11 OldCoder Indeed. Theories?
16:11 sfan5 lemme look at connection.cpp
16:11 OldCoder I was supposed to speak with Sapier but he has not been here for days
16:12 ShadowNinja [NickServ] Last seen  : Oct 13 23:16:21 2014 (18 hours, 55 minutes, 59 seconds ago)
16:12 OldCoder I have missed him since this started
16:12 sfan5 OldCoder: is 70KB/s the expected download speed for minetest.org?
16:12 ShadowNinja OldCoder: /monitor + sapier and wait for him to come online.
16:13 OldCoder Hardly
16:13 OldCoder ShadowNinja, thank you!
16:13 Calinou is it Web download or from-server download?
16:14 OldCoder Zeno` added a forced timeout. Did not work. I added a 50ms delay on resends, code to handle corruption in resend counter field, and increased timeout periods
16:14 OldCoder These changes seem to have improved the situation
16:14 OldCoder Web download from the server can be quite fast. In fact, we are going to gigabit.
16:16 sfan5 ow
16:16 sfan5 the code style in connection.cpp does not follow the guidelines
16:18 sfan5 and why is dynamic_cast used everywhere?
16:21 OldCoder sfan5, proller and VanessaE seem to have a debate about this
16:21 rubenwardy joined #minetest-dev
16:21 OldCoder But the ACKed packet not in outgoing queue error concerns me
16:21 OldCoder As do the millions of resends
16:21 VanessaE I've no real opinion, except that I seem to have no trouble at all with the network code
16:21 OldCoder Usually millions of resends are not needed
16:21 OldCoder VanessaE, that is a clue as your worlds are busy. Yet, see the log file that I posted.
16:22 VanessaE I saw that
16:22 OldCoder My copy of 0.4.10 is probably a few weeks old. Perhaps a temporary issue?
16:22 VanessaE I went looking in my logs for such events and couldn't find anything similar in the busiest of my servers
16:23 OldCoder So, it is a combination of circumstances
16:23 VanessaE though mine aren't all *that* busy these days
16:23 OldCoder Very well
16:23 OldCoder I had a very busy few days
16:26 ImQ009 joined #minetest-dev
16:32 sfan5 OldCoder: speculation: it got an ACK for a packet it doesn't remember because the outgoing queue is probably cleaned, "UpdatePacketTooLateCounter()" seems to suggest this is not critical
16:32 OldCoder All right
16:32 sfan5 OldCoder: more speculation: the outgoing packet queue is cleared before it receives the ack for that packet and then it resends it, receives the ack "too late" .. (cycle continues)
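sfan5's speculation can be made concrete with a small calculation: if the resend timeout is shorter than the round-trip time, copies are resent before the first ACK can arrive. The sketch below assumes a fixed timeout with no backoff and a constant round-trip time, which may not match the real code; the function name is hypothetical, and the 100 ms and 500 ms figures correspond to the 0.1 and 0.5 values mentioned earlier, assuming they are in seconds.

```python
def resends_before_ack(rtt_ms, timeout_ms):
    """Count resends fired before the ACK for the first copy arrives.

    Assumes a fixed resend timeout (no backoff) and a constant
    round-trip time; `timeout_ms` must be positive.
    """
    resends = 0
    elapsed = timeout_ms  # time at which the first resend would fire
    while elapsed < rtt_ms:
        resends += 1
        elapsed += timeout_ms
    return resends
```

Under these assumptions, a peer with a 450 ms round trip would see several resends per packet at a 100 ms timeout and none at 500 ms, which is at least consistent with the longer timeout helping slow mobile clients. It does not by itself explain millions of resends.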
16:32 OldCoder But the millions of RE-SENDs?
16:39 rubenwardy joined #minetest-dev
16:41 GrimKriegor joined #minetest-dev
16:42 proller joined #minetest-dev
16:56 SudoAptGetPlay joined #minetest-dev
16:56 kilbith joined #minetest-dev
16:56 SudoAptGetPlay left #minetest-dev
17:14 Krock joined #minetest-dev
17:36 rickmcfarley joined #minetest-dev
17:46 Miner_48er joined #minetest-dev
17:49 chchjesus joined #minetest-dev
18:14 Du_Draig joined #minetest-dev
18:19 kaeza joined #minetest-dev
18:33 kahrl joined #minetest-dev
18:33 kaeza joined #minetest-dev
18:48 asl joined #minetest-dev
19:44 DuDraig joined #minetest-dev
20:04 ninnghazad|2 joined #minetest-dev
20:06 ninnghazad|2 so i tried to do a pull request, but travis will not compile my patched version, complaining about a missing function in irrlicht, while it compiles and works fine for me here. which irr-version does he use?
20:07 werwerwer joined #minetest-dev
20:08 Amaz joined #minetest-dev
20:15 AnotherBrick joined #minetest-dev
20:26 werwerwer joined #minetest-dev
20:32 kilbith joined #minetest-dev
20:38 PilzAdam joined #minetest-dev
20:41 AnotherBrick joined #minetest-dev
20:58 ShadowNinja ninnghazad|2: Probably 1.7 or 1.8, whichever one you're not using.
21:02 ninnghazad|2 1.7 it must be, already got it compiling
21:37 kaeza joined #minetest-dev
21:39 Fritigern joined #minetest-dev
21:47 diemartin joined #minetest-dev
22:01 diemartin joined #minetest-dev
22:42 proller joined #minetest-dev
22:52 twoelk joined #minetest-dev
23:05 proller joined #minetest-dev
23:20 mos_basik joined #minetest-dev
23:41 exio4 joined #minetest-dev
23:54 twoelk left #minetest-dev