Time Nick Message
12:33 erle about the worst thing in improving the build is that to get a correct incremental build i need to rebuild fully many times.
12:34 erle btw, i suspect that a large amount of the complexity reduction in builds if you do them top-down instead of bottom-up is that cmake tries to figure out where the libraries are, e.g. /usr/lib/i386-linux-gnu/libGL.so
12:35 erle if you need the full dependency tree before the build (e.g. to toposort it), you absolutely need to know that
12:35 erle but a recursive top-down build can just give “-lGL” to the toolchain and figure out later what path was used for that
12:36 erle HuguesRoss which leads me to the question: for the huge brittle builds where you nail every dependency down, do you have special tooling or do you instrument some compiler stage?
12:40 erle i have looked at the parts of both gcc and clang where they figure out the dependency information for -MD and the code is messy and can't easily be extended to do what i need, so i wonder if there exists tooling that can do it better
12:45 HuguesRoss As far as header dependencies go, that's covered by our proprietary static analyzer. On projects that use it we generate PCH by usage statistics and let the system automatically adjust include directives in files. On other projects it's just left up to the programmers to figure that shit out, no special solutions there.
12:45 HuguesRoss For library linking, iirc we either build from source as part of the process or point to the compiled binary which is tracked in source control. Package managers have gradually snuck in, but they're only connected to internally-maintained package repos and also they break a lot soooooo
12:50 HuguesRoss The system was originally made for a project with very long compile times, iirc it was a tough sell but the prod's architect felt it was worth investing in a solution to automatically suggest compile speed optims in the code rather than make devs go through it all. This was the first project I contributed to at the company though, so it's been like half a decade. My memory of the specifics is getting hazy at this point
12:52 HuguesRoss To its credit, I recall it working very well. But it's a pain in the ass to set up on an existing codebase, so only a few productions still use it
13:02 erle i am *really* curious about your opinion on my proof-of-concept once i get it done
13:03 erle right now i have hamstrung myself a bit, as every single change in the build rules rebuilds all affected targets, even if nothing would change
13:03 erle this does, of course, mean my build rules are not fine-grained enough
13:03 erle i.e. a build rule to build all object files may lack a header include path only used by one particular thing
13:04 erle but when i change the general rule, of course everything needs to be recompiled
13:04 erle the solution is, of course, to not make overbroad build rules, which is trivial to do once it is working
13:04 erle since every target has a non-existence dependency on possibly more specific build rules
13:04 erle i.e. once a more specific rule for a target appears, it is out of date, even if the generic rule did not change
13:05 erle this is a major reason why i want to handle ne deps
13:05 erle much more important than the header file case
13:07 erle HuguesRoss btw regarding defines and union builds, how do people solve it? do they use some kind of guard that just aborts the compile if you want to define a thing already defined or so?
13:07 erle i mean, i'm not doing the union build thing
13:07 HuguesRoss We don't :)
13:07 HuguesRoss It's hell
13:07 erle but c++ compilation seems to take a *really* long time compared to how long it could take
13:08 erle like, in general, not only for minetest
13:08 HuguesRoss Definitely
13:09 erle btw, whoever managed to sneak a single c file into the source code that messes up my initial clean model of ”for every .o i need a .cpp” should probably make *really* sure that it is a polyglot
13:10 erle and given that some libraries look like c libraries, i wonder, how to do that
13:10 erle HuguesRoss do industry people mix C and C++ arbitrarily too?
13:10 HuguesRoss no
13:10 HuguesRoss But I've seen in-house compiled langs which uhhhhh
13:10 HuguesRoss Sure is a decision you can make
13:11 HuguesRoss One thing I would warn about this project of yours--and I'm sure you've already considered it but I'm saying it just to be 100% certain--we want to answer that our code is not harder to compile than before, and we want to be sure that things won't fall apart due to lack of knowledge/maintenance if you leave later. I don't think it's impossible to meet those requirements, but they are extremely important for the project's health.
13:11 HuguesRoss s/answer/be sure
13:12 HuguesRoss dictation error :)
13:12 erle HuguesRoss i'm with prof. regehr on this “surely you can add INT_MAX+1, but that's like saying ‘someone told me that in basketball you can't hold the ball and run and look i'm still doing it’.”
13:12 erle i.e. that person is definitely not playing basketball anymore
13:14 erle HuguesRoss literally 100% of my solution is just shell scripting. the code is well-documented, has been working since 2014 (and other implementations were made before) and someone once wrote a phd thesis on the approach (but my code is better).
13:14 erle i think it was a phd thesis. it was a thesis.
13:14 erle i can look it up some time
13:15 HuguesRoss Yeah, I'm not saying it as an attack or anything. I just wanted to state the obvious as a 'just in case' deal, if your solution meets our needs in this regard then that's fine imo
13:15 erle nah, i get it
13:16 erle the approach i am using was invented by daniel j bernstein
13:16 erle i only implemented it since djb never released his stuff to the public, only documented it
13:17 erle he's the kind of guy who will not respond to questions about licensing of his source code for years
13:17 erle and then at some point will say “fine, it's public domain, leave me alone” but by that point others have implemented it from his lectures
13:17 erle or his notes
13:19 erle HuguesRoss, the biggest two tricks are two programs. one that essentially does “make sure the files given on the command line are up to date, i.e. rebuild them if necessary and rebuild the current target if you rebuilt anything” and the other ”make sure the files given on the command line do not exist, i.e. abort if they exist and rebuild the current target in future builds if they do”
13:20 erle it's a bit weird and i had my difficulties understanding why it works, but essentially, the first thing is what declaring a dependency ultimately means in imperative terms.
13:20 erle so basically, you have shell scripts with two additional commands
13:21 erle unfortunately, as useful as “make sure that the stuff is up to date” is as a command, the way the toposort bottom-up build systems are structured, it is impossible to implement this kind of thing in them.
13:22 erle so basically djb figured out that build systems can be much simpler if you traverse the graph of build targets from a different direction
13:23 erle wouldn't be the first time when people are wrong for a long time. i mean, postel's principle (be liberal in what you accept) and the shotgun parser pattern (validate during processing at the latest possible point) are both 100% wrong, yet they persisted for what, 40 years or so?
13:24 erle yet, everyone who is shown examples of why they are wrong immediately gets it, even if they can't actually program very well
13:24 erle well, maybe not everyone immediately. but many people take less than 20 minutes.
13:26 erle it's similar to the argument that the world is round based on that ships' masts appear on the horizon first or so
13:26 erle you may not figure it out yourself, but it's kinda obvious
13:26 erle round, as in, curved
13:32 erle HuguesRoss i value scepticism and figuring out if stuff fits constraints a lot. if i review code, i try to find every reason why the code is not good – so if i can't find any or the coder convinces me my reasons are not important, then i approve it. so do not worry about me taking it as an attack. what i don't like is “i do not understand it, yet i am against it and i will not engage with the topic enough to be ever convinced”, because i think that it is counterproductive to have strong opinions in that case.
13:33 HuguesRoss Sure
13:34 HuguesRoss Though I think lack of understanding can also be useful feedback--an indication that documentation or clarity adjustments are needed to ensure a solution remains maintainable in the long term
13:34 HuguesRoss It shouldn't prevent a solution entirely though
13:34 erle that is true. i have written extensive documentation since 2014 exactly for that reason. and at work, i prefer to let people review my docs that are outside of a project.
13:35 erle because everyone who has worked alongside me for months will take something for granted.
13:35 HuguesRoss Good! I've been on teams at work on both sides of the documentation spectrum, and the side with more docs is usually a lot better to work with
13:35 erle my experience is that when you add a new member to a team, you can immediately figure out what has become institutional knowledge that was never written down.
13:35 HuguesRoss Yes
13:35 erle just force them to, e.g. set up a project themselves and you realize that everyone forgot to mention that one library or so
13:36 HuguesRoss It's also a good way to discover problems in your development pipeline that everyone worked around
13:37 erle well, there is the normalization of deviance thing
13:37 HuguesRoss It's why at work my PC is configured with an absolutely batshit insane locale that doesn't exist--otherwise, invariably some float will be written to a file incorrectly
13:37 erle nice
13:37 erle where you work around a thing so long that you take it for granted, but ultimately, it's bonkers
13:37 HuguesRoss In a multi-language office it's extremely important to avoid
13:37 erle but wouldn't a bogus locale mean that it falls back to C or so?
13:38 erle or is that the intention?
13:38 HuguesRoss Not exactly, this is Windows so I've basically made my own that provides bogus formatting and similar rules
13:38 HuguesRoss basically, anything locale-sensitive will explode when it makes contact
13:38 erle my friend recently installed debian with dutch keyboard layout and uk english language. first surprise: there was no trash bin anymore! there was, however, a wastebasket. ;)
13:39 HuguesRoss hah
13:39 erle also color vs colour and so on
13:40 erle btw, i also value software that is just *done*. at work i am the person for the fire-and-forget-code (i.e. you can not ever change it after the deadline, because it has been deployed on some device or the behaviour must never change or so).
13:40 erle i have met a lot of people that claim stuff can not ever be done, but i aggressively reduce scoee instead of quality.
13:41 erle scope
13:41 erle a hilarious consequence of this is that two of my programs are used internally in several projects and i only learned about it recently, because no one ever had any complaints (apparently i cut the scope small enough and tested them well enough)
13:42 erle this is, of course, not feasible for big programs unless you use erlang or so
13:43 erle HuguesRoss i am curious, how would you rate this build rule to build arbitrary .o files? https://github.com/linleyh/liberation-circuit/blob/master/src/default.o.do
13:44 erle like, on understandability and maintainability
13:44 erle heh, it seems to trip up github syntax highlighting
13:46 HuguesRoss I don't think it should be an issue for someone familiar with shell scripting and the tools in use
13:47 erle technically you don't have to use shell scripts btw
13:47 HuguesRoss I assume the precompile check at the top is for PCH?
13:47 erle yes, and it was both not added by me (it was added by avery pennarun) and is a bit too broad. it does not belong in a general build rule IMO
13:47 erle it belongs in a specific build rule for precompile.o
13:48 erle HuguesRoss so given *your* understanding of shell scripts, is there anything that seems unclear at first and second glance, besides “what do the redo- commands mean?”
13:49 HuguesRoss yeah, I think I'd agree. It did strike me as a bit too specific a case, but I wasn't sure if it would be easy to extract. This is reminding me just how rusty I've gotten over the years lol
13:49 erle it's very easy to extract. you would just put the rule in precompile.o.do
13:49 erle basically default.o.do is a shell script building all .o files, but any other prefix builds the one with that name (and has precedence)
13:50 erle the single drawback is that you'll find it hard to produce a specific rule for default.o, but so far i have never gotten that complaint (neither seems to have any other implementer)
13:50 HuguesRoss Yeah aside from the redo commands, I'd be interested in a more detailed explanation of what strace is looking for. From the discussion and explanation I can guess, but with only the comment it could use a little more clarification
13:51 erle basically, compilers can output stuff they found (dependencies), but not stuff they did not find, but that may dirty the build in the future (non-existence dependencies)
13:51 HuguesRoss yeah
13:52 HuguesRoss It's basically getting a list of all headers being referenced, yeah?
13:52 erle not exactly
13:52 erle it's a list of all the possible headers and precompiled headers and i think some other stuff that might influence a future build, but did not exist at the time
13:52 erle think “i create a header file in /usr/local” or so
13:53 HuguesRoss I see
13:53 erle it's the same type of dependency as on a not-yet-existing more specific build rule
13:54 erle e.g. if you build precompile.o with this rule, it has a dependency on default.o.do, but a non-existence dependency on precompile.o.do
13:54 erle if default.o.do is modified, the target is being rebuilt. but if precompile.o.do is created, the target is also being rebuilt!
13:54 erle in basically all build systems i have encountered the latter case is quietly swept under the rug
13:54 erle but it does matter if you, e.g. bisect
13:55 erle because then build rules may change all the time
13:55 HuguesRoss Gotcha
13:56 HuguesRoss Expanding the explanation or having a separate file that gives a more detailed explanation of non-existence deps could help then, I think
13:56 erle anyways, i hope it is obvious why this approach needs an order of magnitude less code than the ”try to get all dependencies and toposort them” approach which e.g. make uses
13:56 HuguesRoss because your explanation here was clear, but I didn't get the same from the code
13:57 erle true, but the reason for that is probably because most build systems do not handle non-existence dependencies
13:57 erle so it's not like you can build on previously-existing knowledge there
13:58 HuguesRoss That's why it's gotta be explained though, it
13:58 HuguesRoss *it'll be new territory for most
13:58 erle in the build rule? i am not sure about that. the cmake build rules are also not documented inline, it is documented in cmake docs.
13:58 HuguesRoss That's why I mentioned it could be in a separate file
13:58 erle i mean i can do it, but this concept is already explained in the man page of ”redo-ifcreate”
13:59 erle i see
13:59 HuguesRoss In that case, you could also do a "see redo-ifcreate" or something just in case
13:59 HuguesRoss Some readers will just shrug and move on, even if they shouldn't
14:00 HuguesRoss err, not 'also' but 'instead' is what I meant there
14:00 HuguesRoss if you know that there's a good explanation in one of the relevant manpages, that's useful for new readers
14:02 HuguesRoss Anyhow, I *do* also recommend checking with at least one other coredev on this. I can give my opinions and suggestions, but at the end of the day you need buy-in from two people minimum
14:02 HuguesRoss I don't wanna work you up only to discover no-one else agrees with me!
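[editor's sketch] The rule lookup erle describes (a specific `precompile.o.do` takes precedence over `default.o.do`, and a missing specific rule becomes a non-existence dependency) could be sketched roughly as follows. `find_rule` and its `ifchange`/`ifcreate` output lines are hypothetical names made up for this illustration and are not taken from any redo implementation; the sketch also assumes the target sits in the current directory and has a file extension.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of any redo implementation.
# For a target in the current directory it prints:
#   "ifchange <rule>"  for the rule file that would build the target
#   "ifcreate <rule>"  for a more specific rule that was looked for but is
#                      missing (a non-existence dependency)
find_rule() {
    target=$1
    specific="$target.do"              # e.g. precompile.o.do
    ext=${target##*.}                  # e.g. o (assumes the target has an extension)
    generic="default.$ext.do"          # e.g. default.o.do
    if [ -e "$specific" ]; then
        echo "ifchange $specific"
    elif [ -e "$generic" ]; then
        echo "ifcreate $specific"      # creating this file later dirties the target
        echo "ifchange $generic"
    else
        echo "no rule for $target" >&2
        return 1
    fi
}

# Demo in a scratch directory: first only the generic rule exists,
# then a specific rule appears.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
: > default.o.do
before=$(find_rule precompile.o)
: > precompile.o.do
after=$(find_rule precompile.o)
printf '%s\n---\n%s\n' "$before" "$after"
```

In the first lookup the target depends on `default.o.do` and on the absence of `precompile.o.do`; once the specific rule is created, the recorded absence no longer holds, which is the case erle says most build systems sweep under the rug.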
14:07 erle HuguesRoss i *wrote* the man pages years ago
14:07 erle so if my explanations here are good, ig the man pages are good too
14:10 erle HuguesRoss i'm pretty sure that if i manage to indeed make something that is useful for users it will eventually make devs care. given that stuff that is way more complex and arguably less useful has been added too.
14:21 erle HuguesRoss, btw here is some of the notes of djb that i used for my implementation: https://cr.yp.to/redo/honest-nonfile.html https://cr.yp.to/redo/honest-script.html
15:00 erle okay, it seems i have cut it down to a single linker error, i forgot zstd
15:00 erle so if that was the last thing, i'll post the dependency graph soon-ish
15:02 erle also, ”faster” means “faster for the same workload”. when my approach discovers a shit ton of dependencies more, it spends more time checking dependencies. but less time actually rebuilding. i.e. you can replace a heavily CPU-bound full rebuild into a heavily IO-bound incremental rebuild.
15:03 erle s/into/with/
15:12 erle okay, it seems it works. took a bit more than 80 lines of code, but it's a dirty solution. it can probably be done in less.
15:24 erle ; ./bin/minetest
15:24 erle ./bin/minetest: error while loading shared libraries: libIrrlichtMt.so.1.9.0.7: cannot open shared object file: No such file or directory
15:24 erle great, can someone please tell me what the correct way to do this is?
15:24 Krock LD_PRELOAD="/absolute/path/to/libIrrlicht.so"
15:24 Krock or install it
15:24 erle can i just use libIrrlichtMt.a instead?
15:24 erle when linking
15:25 Krock then you'd need to link it somehow statically
15:25 erle is that a problem? it's not like anyone else is going to use it
15:25 Krock especially because nobody is going to use it, I wish you good luck in trying to get it to work
15:25 erle like, i have a dozen programs or so here that use irrlicht, but only one, minetest, uses irrlichtmt
15:25 erle uh, no one uses libIrrlichtMt.a?
15:26 erle i mean it got generated
15:30 erle > 2022-07-24 17:30:00: WARNING[Main]: Irrlicht: Warning: The library version of the Irrlicht Engine (1.9.0mt6) does not match the version the application was compiled with (1.9.0mt7). This may cause problems.
15:30 erle lol okay
15:30 erle i'm not going to overhaul the irrlichtmt build now (it's more than double the cmake bullshit than minetest proper has), but my proof-of-concept works.
15:31 erle also there is no need to do so
15:32 erle now i generate the dependency graph
15:35 erle oh wow, it works :)
15:36 erle i doubt anyone who will actually take the time to comprehend it will doubt that it's a better approach (at least on linux).
15:47 erle uh-oh
15:48 erle i have ~97k edges in the dependency graph and redo-dot is not yet done dumping it.
15:50 erle uh, the full dependency information i gathered is 26mb. no wonder it's next to impossible to figure it out before the build.
15:52 erle HuguesRoss i believe i am very deep in “i can not plot this” territory
15:55 Goobax[m]1 Salut
15:55 Goobax[m]1 Hello
16:17 erle HuguesRoss celeron55 this is the set of dependencies and non-existence dependencies my ~85-lines-of-shell-script-build figured out for src/mapgen/mapgen.o https://mister-muffin.de/p/b82a.jpg – the dashed lines represent relationships that cmake can not represent. the solid lines represent relationships that cmake could represent, but likely does not. as far as i can see, the graph is only missing dependencies on the compilation toolchain binaries and environment variables. adding that is trivial, but requires a full rebuild. what do you think? the irrlichtmt dependencies may seem weird, but they rely on what the compiler reported, so whatever happened there is the fault of whoever wrote minetest or gcc, but it can't have been me.
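[editor's sketch] The solid/dashed distinction in the graph erle describes could be produced along these lines. This is not how redo-dot works internally; the input format (`dep` and `nedep` lines) and the function name `deps_to_dot` are assumptions invented for this illustration, and the two sample edges are illustrative rather than taken from the real graph.

```shell
#!/bin/sh
# Hypothetical input format (NOT redo's on-disk format): one edge per line,
#   "<kind> <target> <dependency>"
# where kind is "dep" (file existed and was used) or "nedep" (file was looked
# for but did not exist). Assumes names without spaces.
deps_to_dot() {
    echo 'digraph deps {'
    while read -r kind target dep; do
        case $kind in
            dep)   echo "  \"$target\" -> \"$dep\";" ;;
            nedep) echo "  \"$target\" -> \"$dep\" [style=dashed];" ;;
        esac
    done
    echo '}'
}

# Illustrative edges only.
dot_output=$(printf '%s\n' \
    'dep mapgen.o mapgen.cpp' \
    'nedep mapgen.o mapgen.o.do' | deps_to_dot)
echo "$dot_output"
```

The output is Graphviz DOT text in which ordinary dependencies are drawn as solid edges and non-existence dependencies as dashed ones.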
16:18 erle if you have any issue about miscompilations or can remember a file where it happened, i'd be interested, so i can investigate
16:19 erle oh yeah, also by design this can not result in the same state that cmake can be in, where you have to delete files to make it work
16:23 erle this also means that any of the dashed lines offers a constructive proof of how to cause a miscompile
16:28 erle oh, i was mistaken, the graph is indeed incomplete.
19:57 erle HuguesRoss celeron55 here is my WIP PR for reliable builds. the text is very long, but you can skip to “How to test” to see it in action https://github.com/minetest/minetest/pull/12592
19:57 erle in fact, i think the PR description is longer than the actual code hehe
19:58 erle also please reopen #11749 as i have shown that it is possible to solve this issue in <100 lines of shell
19:58 ShadowBot https://github.com/minetest/minetest/issues/11749 -- CMake does not capture all dependencies, causing erroneous incremental builds & build failures
20:06 sfan5 that PR is a laughable piece of unportable hacks
20:06 sfan5 I know you didn't ask for my honest opinion but here is it anyway
20:10 schwarzwald[m] Are we getting anything more detailed or is that it?
20:10 erle you asked for a proof of concept
20:12 erle sfan5 you are entirely correct that this is a pile of quick hacks. i merely wanted to prove that it is *possible* right now, because several people asked me to and also because i wanted to dispel your intuition that solving these issues is way too hard to be attempted or maintained.
20:13 erle if there is any positive feedback about it, i will of course work on it until people stop wondering how i find the motivation to pile shit up that high
20:13 sfan5 I have a feeling the part I consider a hack is pretty central to your solution
20:13 sfan5 but don't let this prevent you from collecting feedback
20:13 erle the strace thing?
20:13 erle that's incredibly hacky
20:14 erle the strace thing mostly serves to demonstrate that you will only know the full set of dependencies after the build, regardless of how much you try, unless you never have fallback paths and freeze the filesystem state (neither of which minetest is doing).
20:15 rubenwardy we have so many bigger issues than this
20:15 erle could be, but i happen to have a solution and am committed to work on it until it's good enough™
20:15 erle also you probably can rebuild in 3 minutes, i can't
20:17 erle you don't have to like my code, but the approach to build top-down and using hashes instead of bottom-up and using timestamps is objectively superior in every case where you are not limited by speed of hashing (i.e. if you build a 16GB sd card image, a hashing approach is likely garbage tier).
20:17 schwarzwald[m] I think decreasing the time between compiles on slower machines is super valuable long term.
20:17 erle schwarzwald[m] you only say that because you have a slow machine
20:17 erle which ties into my motive of why i made this, because i have a slower machine
20:17 erle schwarzwald[m], please try it then
20:19 schwarzwald[m] schwarzwald[m]: Well, if all potential developers have fast machines then never mind.
20:20 erle hehe
20:20 erle schwarzwald[m], pls come to #minetest
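[editor's sketch] The hashes-versus-timestamps point can be illustrated with a small change check that compares file content instead of mtime. This is only a sketch of the general idea, not erle's implementation: `hash_of`, `dep_changed` and the `.sha256` stamp-file naming are hypothetical, and it assumes `sha256sum` from GNU coreutils is available.

```shell
#!/bin/sh
# Hypothetical sketch: decide whether a dependency changed by content hash
# rather than by timestamp. Assumes sha256sum (GNU coreutils).
hash_of() {
    sha256sum < "$1" | cut -d ' ' -f 1
}

# Returns 0 ("changed") if no hash was recorded yet or the content differs,
# and records the new hash; returns 1 ("unchanged") otherwise.
dep_changed() {
    dep=$1
    stamp="$dep.sha256"                # hypothetical stamp-file naming
    new=$(hash_of "$dep")
    if [ -e "$stamp" ] && [ "$(cat "$stamp")" = "$new" ]; then
        return 1
    fi
    echo "$new" > "$stamp"
    return 0
}

# Demo in a scratch directory.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
echo 'int x;' > a.h
dep_changed a.h && first=changed || first=unchanged
touch a.h                              # newer timestamp, identical content
dep_changed a.h && second=changed || second=unchanged
echo 'int y;' > a.h                    # content actually changes
dep_changed a.h && third=changed || third=unchanged
echo "$first $second $third"
```

A timestamp-based check would treat the `touch` as a change and trigger a rebuild; the hash-based check does not, at the cost of reading every dependency, which is why erle notes it stops paying off for very large inputs such as a 16GB image.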