The Alert Reader.

The following is a conversation I recently had with an alert and inquisitive reader. It has been redacted for brevity; handles/names/irrelevant things have been removed. Some text has been turned into links to relevant material.

Anyone still reading this page for reasons related to its original purpose is encouraged to read this; it may help to answer some questions about the long-term direction of my work and how it differs from that of other OS crackpots.


reader: I have some questions about _Unfeature V_ and macros in Loper Lisp. dunno if you have a minute?

me: fire away

reader: okay. so _Law IV_ states that “Compilation is to be considered a form of caching, and thus shall happen solely behind the scenes, like all other forms of caching”, but traditional Lisps do macro expansion at compile-time. So I’m wondering when and how macros are expanded in Loper Lisp, and how they relate to self-modifying code in the manner of Dr Massalin’s Synthesis OS.

me: spot-on! i favour macro as 1st class data type, as in ‘f-expr.’

reader: ah, like in Kernel! So according to Shutt’s thesis, fexprs are a bitch to compile if they have access to the dynamically scoped environment. how do you plan to get around that? if at all?

me: Quite easily, actually. Are you familiar with the concept of ‘dataflow CPU’ ?

reader: I thought about this quite a lot, and it amazes me that so few people seem to realise that call-by-name can be decoupled from the macro mechanism. Yes, but only minimally. treat me as an idiot, if it doesn’t waste your time

me: well

reader: unfortunately it’s very difficult to find resources about non-Von Neumann archs

me: the closest analogy familiar to most computer users is: a spreadsheet!

reader: mmm. I noticed that there was some parallel with reactive programming

me: when something changes, everything it depends on is updated automagically. so, in a dataflow machine, memory is split into cells forming dependency graphs. the cpus, if you can still call them that, merely walk the graphs.
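(For the spreadsheet-minded, the propagation just described can be sketched in a few lines of Python. The `Cell` class and its names are purely illustrative, mine and not Loper's; a real dataflow machine would do this in fabric, not in software walking pointers:)

```python
# A toy spreadsheet-style dependency graph: each cell either holds a value
# or computes one from the cells it depends on. Writing an input cell
# re-evaluates everything downstream -- no caller, no program counter.

class Cell:
    def __init__(self, value=None, fn=None, deps=()):
        self.fn, self.deps = fn, list(deps)
        self.value = value
        self.dependents = []            # cells that must be recomputed after us
        for d in self.deps:
            d.dependents.append(self)

    def set(self, value):               # only input cells are ever written
        self.value = value
        self._propagate()

    def _propagate(self):               # walk the dependency graph downstream
        for cell in self.dependents:
            cell.value = cell.fn(*(d.value for d in cell.deps))
            cell._propagate()

# total = a + b;  doubled = total * 2
a = Cell(2)
b = Cell(3)
total = Cell(fn=lambda x, y: x + y, deps=(a, b))
doubled = Cell(fn=lambda t: t * 2, deps=(total,))
a.set(2)                # kick an initial propagation through the graph
print(doubled.value)    # 10
a.set(10)               # change one input; everything downstream updates
print(doubled.value)    # 26
```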

reader: oh weird, I hadn’t even managed to piece that together. so how does that relate to fexprs? I can’t make the cognitive leap

me: well. the difficulties of compiling run-time anything vanish if you switch to a dataflow machine. the concept of ‘compile’ can also be made to vanish. as does the von neumann bottleneck.

reader: oh, because everything at runtime is (in effect) concurrent?

me: correct.

reader: gah, thanks. I’ve been struggling with this for weeks

me: a dataflow machine should be thought of as a collection of circuits rather than procedures. think ‘fpga’ but of arbitrary size. (in practice, fpga with swappable bitstream pages.)

reader: yeah. now I’m thinking S40 (Moore) but for lisp instead of forth

me: in order to picture this concept correctly, you really have to let go of the concept of cpu in the usual sense.

me: so, for instance, think of the terminal you are now typing into.
on a true dataflow box, it would simply be the end of a circuit (screen memory) directly linked to a circuit with the keyboard matrix decoder on the other end of it.
you thus also lose the concept of ‘interrupts’ or ‘processes’.
a dataflow box doesn’t need a scheduler,
or an interrupt controller.

reader: huh. how do you manage when multiple inputs are trying to write to the same output?

me: inputs cannot try anything

reader: also, I can see already how this helps with _Law I_!

me: outputs can depend on one or more inputs.
with the logic function (e.g. AND, OR) specified.
inputs can only sit there and be depended on (or not.)
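(A sketch of this discipline, again purely illustrative Python of my own: inputs hold bits and cannot be driven by anything; an output is a pure function of its declared inputs, so "two inputs fighting over one output" cannot even be expressed:)

```python
# Inputs sit there and are depended on (or not). Outputs declare which
# inputs they depend on and the logic function that combines them.
# Nothing ever "writes to" an output -- its value is a consequence.

AND = lambda bits: all(bits)
OR  = lambda bits: any(bits)

class Input:
    def __init__(self, bit=False):
        self.bit = bit                  # may be depended on, never driven

class Output:
    def __init__(self, fn, inputs):
        self.fn, self.inputs = fn, inputs
    @property
    def bit(self):                      # recomputed from the inputs on demand
        return self.fn([i.bit for i in self.inputs])

a, b = Input(True), Input(False)
y = Output(OR, [a, b])
print(y.bit)     # True: OR(1, 0)
b.bit = True
z = Output(AND, [a, b])
print(z.bit)     # True: AND(1, 1)
```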

reader: this is going to take a while for me to get used to, I apologise. do you have any references on this? I apologise if I missed some on your site—I downloaded the whole site and consolidated a single page version for repeated reading, so it’s not for a lack of effort…

me: there are not terribly many references.
look for the post on my site labelled ‘the book’.

reader: Kogge?

me: right.

reader: oh, Kogge has some stuff about this?

me: quite a bit.

reader: blast. it’s so hard to pick it up. going to have to hit up my library by the looks of it

me: and some bibliographic paths to follow.
i must warn you, you won’t find any satisfactory references.
because it never really went anywhere.
i ended up having to derive mostly from first principles.

reader: right. I read a little about Muller C-gates at least

me: afaik nobody’s built a dataflow machine on C-gates. but it strikes me as the loudly, abundantly obvious thing to do.
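(Behaviourally, a C-element is almost trivially simple; the following Python model is mine and says nothing about the actual circuit, only the rule: the output follows the inputs when they agree, and holds otherwise. That hysteresis is what lets a clockless dataflow node wait for all of its operands:)

```python
# Behavioural model of a Muller C-element, the basic building block of
# clockless (asynchronous) logic: output copies the inputs only when
# they agree; while they disagree it holds its last value.

class CElement:
    def __init__(self):
        self.out = 0
    def step(self, a, b):
        if a == b:
            self.out = a     # inputs agree: follow them
        return self.out      # inputs disagree: hold previous state

c = CElement()
print(c.step(0, 0))  # 0
print(c.step(1, 0))  # 0  (disagree: hold)
print(c.step(1, 1))  # 1  (agree: switch)
print(c.step(0, 1))  # 1  (disagree: hold)
```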

reader: do you mind if I pass on your comment about fexprs in Loper to interested friends? it’ll be comments in public though.
I’ve been trying to convince them about fexprs for a while now. one of them is on the Scheme R7RS committee.
I tried to get him to add fexprs to Scheme. he told me to go fuck myself, haha

me: do what you like, but let it be known that i hate R7 (and R6) with a passion.

reader: hehe. will do!

me: and can’t stand the systems where ‘f-expr’ is presently found. i’m not particularly interested in implementing lisps on von neumann cpus these days.

reader: yeah, very much understood. I tried to write several alternatives myself…

me: and don’t particularly care how people do it.

reader: but I never really understood the important of dataflow until now.
if you do write any Loper-oriented posts in future, I’d be interested to hear more about the Loper Lisp. you said in 2010 that “I still work on my own Lisp dialect and underlying system, but in the meantime I have been productive in Common Lisp”

me: it is almost entirely absent from the literature, unless you have access to a large university library

reader: in the old days you said Common Lisp is an abomination

me: it’s an abomination, but less so than the supposed replacements.

reader: and then after imbibing more Dr Naggum and following his prescription, I think, you ended up saying you wanted to be a Common Lisp virtuoso and that CL is a Stradivarius compared to the usual C-machine kazoo based shit

me: more or less. it’s a poorly kept stradivarius, in very marginal condition

reader: but what I’m wondering is what the divergence is in your view of a (dataflow) lisp compared to Common Lisp

me: but remains a stradivarius.

me: if you want to see what dataflow feels like on your own skin, pick up an fpga and program in verilog for a bit

reader: I think, though, a huge missing piece of my understanding was just how radical are the changes that dataflow causes

me: it isn’t a general-purpose language by any means (the only data type is: the bit) but will give you a sense of the flavour.

reader: fpga and verilog: sounds better than writing spreadsheets at least… thanks for the tip

me: a terrible language, but will give you a feel for what it’s like to write a program where every line ‘executes’ simultaneously.

me: (because it is a circuit rather than a procedure)

me: the proprietary scripting language ‘mathematica’ supports dataflow. but, unsurprisingly, dog-slow.

reader: I always sort of thought that about CPU designs in general. whilst software designers say “how many cycles is this going to take?”, a hardware designer knows how many cycles something is going to take on a chip: one. or else

me: dataflow never caught on because you absolutely need dedicated hardware.
if you want real efficiency, you can’t even use standard RAM. so it gets dismissed as nuttery, on the rare occasions it comes up.

reader: not even DDR4?

me: nope. the memory bus becomes the bottleneck.

reader: ah, the Von Neumann bottleneck

me: because you can only address one machine word (or, if you pay through the nose for ‘dual port’, two) at a time.

reader: right, I see. so it’s like computing a new generation of Conway’s Life

me: precisely.
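(The Life analogy is exact enough to show in code. In this toy Python generation function of mine, every cell's next state depends only on the *current* states of its neighbours, so in principle all cells could update at once; a von Neumann machine fakes this by reading the entire old grid before writing any of the new one, one word at a time through the memory bus -- which is exactly the bottleneck:)

```python
# One generation of Conway's Life, computed the dataflow way: the next
# grid is a pure function of the old grid. A cellular fabric would
# evaluate every cell simultaneously; here we merely pretend.

def life_step(grid):
    rows, cols = len(grid), len(grid[0])
    def neighbours(r, c):               # live cells adjacent to (r, c)
        return sum(grid[rr][cc]
                   for rr in range(max(r - 1, 0), min(r + 2, rows))
                   for cc in range(max(c - 1, 0), min(c + 2, cols))
                   if (rr, cc) != (r, c))
    return [[1 if neighbours(r, c) == 3
                  or (grid[r][c] and neighbours(r, c) == 2)
             else 0
             for c in range(cols)]
            for r in range(rows)]

# a "blinker": three live cells oscillate between a row and a column
blinker = [[0, 0, 0],
           [1, 1, 1],
           [0, 0, 0]]
print(life_step(blinker))  # [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
```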

reader: this is a little depressing, I must admit. I’ve been thinking obsessively about how a Loper device (which I define as one adhering to the Laws + the Lisp Unfeatures) could cut as many corners as possible, to get something to ship; to get people to understand; to get people to use it. I know you’ve said it will take astronomical time, and that you only expect it to have a ripple effect at best.
I thought if you were going to do it the proper way, I might be able to hack something together the ghetto way.

me: if you want ghetto, visit ‘Urbit’.

reader: but increasingly, I think that any equivocation over the bedrock is wrong, wrong, wrong.

reader: yeah, I’ve been through Urbit… and TempleOS for that matter. love ‘em. but yeah…

me: the latter is far closer to a decent example than urbit.

reader: I’ve run TempleOS on QEMU. can’t say the same for Urbit of course

me: mr. m gave me a ‘dukedom’ (’DYS’) and i still think urbit is fundamentally retarded. because it builds on a turd.

reader: I managed to get Movitz, the bare metal Common Lisp, working too. tried emulating every Lisp machine that I know of, too. success rate: bloody terrible. (only got one working. you own a real one)

me: x86 is simply not a very habitable place.

reader: went through the T3 Explorer code, at least. some funny stuff in there

me: for ab-initio os work

reader: yeah. just coercing an x86-64 chip into a 64-bit friendly state is… hell

me: http://herpolhode.com/rob/utah2000.pdf (Rob Pike on Systems Research)

reader: it’s a fucking 64-bit chip! why does everything have to be backwards compatible with the pyramids!

me: see ’standards’ section. do you know just what is involved with getting… a usb mouse working?

me: not even talking about a thumb drive. the standards document is book-length.

reader: yeah, I looked at USB drivers. hell. absolute hell.

me: just for usb mass storage. now do video (pick a chip… then come back in a year or two with a working 2d raster display. maybe.)

reader: one of the ramshackle approaches I was thinking of was to rip drivers out of an existing OS, just use those, bury the C-machine below the Loper tide mark. I am slowly unlearning my idiocy

me: been there, done that.

reader: circa 2000, there used to be an OSKit, I think it was developed by the University of Utah, that was modularised for this. that died, nobody maintained it. people are leaving the bedrock behind. they don’t even care these days, just a decade later

me: it’s a fundamentally brain-damaged idea. see ‘bedrock complexity.’

me: the pieces into which something breaks, when it breaks, are inescapably relevant.

reader: yeah, the LEGO blocks

reader: instruction set isomorphic to a high level programming language. that’s why I asked you today about the Loper Lisp

reader: very interested in the LEGO blocks you have in mind

me: if an interrupt could stop your cpu between, say, an ADD instruction executing, and the results being copied from the ALU to memory, there would be chaos.

me: (there were machines where this could actually happen.)

reader: heh!

reader: one of my meme-mottos: “NOTHING IS THREAD SAFE”

reader: (I bring it out when somebody writes thread-safe code and then experiences the inevitable threading bug that takes them by surprise)

me: this is then papered over by locking.

reader: or STM

me: resulting in a single-cpu machine that wedges routinely.

me: (see python’s global interpreter lock, etc)

reader: (as you know, there is no such thing as broken software. it either works, or it doesn’t work. nice parallel to the two speeds, now that I think about it)

reader: ah yes, the GIL…

reader: I also love when a dual core machine runs everything on one core because its kernel is too stupid to realise what can be shunted off to the other core

reader: (and this is “just works” Appleware)

me: the entire procedural paradigm is incurably stupid

me: i’m not particularly interested in attempts to improve it.

me: the correct number of cpus is… zero. instead, dataflow fabric that one could buy ‘by the kilo.’

me: so, in a dataflow machine, memory is split into cells forming dependency graphs.

me: performing the functions of both storage (nonvolatile) and computation.

reader: gold dust like this is what my mind is going to be running on until I can pick up Kogge

me: pick up a book on fpga and see the diagram of how the macrocells and block RAM are connected.

reader: er… CPU merges into RAM, essentially?

me: correct.

me: you end up with LUTs for the logic, programmable crossbar interconnects for the routing, and small giblets of memory scattered throughout.
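(A k-input LUT is nothing but 2^k bits of configuration memory; "programming" the cell means loading a truth table. A toy Python model, mine and illustrative only, configured here first as XOR and then as NAND:)

```python
# An FPGA logic cell in miniature: a lookup table indexed by its inputs.
# Any boolean function of k inputs fits in the same 2**k-bit cell;
# only the loaded bits differ.

class LUT:
    def __init__(self, truth_table):    # truth_table: 2**k output bits
        self.table = truth_table
    def __call__(self, *inputs):
        index = 0
        for bit in inputs:              # pack input bits into a table address
            index = (index << 1) | bit
        return self.table[index]

xor  = LUT([0, 1, 1, 0])                # rows for (0,0), (0,1), (1,0), (1,1)
nand = LUT([1, 1, 1, 0])
print(xor(1, 0))   # 1
print(nand(1, 1))  # 0
```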

me: one could use conventional DRAM, and even magnetic disk, but these would have to be used as transparent swaps

reader: have you fabbed any working fpga dataflow designs yet? or are they all in verilog?

me: i have quite a bit of experimental code.

me: presently stuck on reverse-engineering the xilinx toolchain.

me: you can’t actually do what i have in mind unless you have access to either 1) silicon fab 2) fpga with entirely known design

me: lacking the $1B for (1), i went with (2)

This entry was written by Stanislav, posted on Sunday December 15 2013, filed under Computation, Hardware, Hot Air, Lisp, LoperOS, Philosophy, Reactions.

9 Responses to “The Alert Reader.”

  • Phil Jones says:

    Clearly an alert reader.

    BTW, did you watch this? http://www.youtube.com/watch?v=dkIuIIp6bl0

  • threebuttonmouselifeform says:

    Hi Stanislav,

    1. In the above exchange when you were asked to explain the difference between Common Lisp and a hypothetical “Dataflow Lisp” you answered: “If you want to see what dataflow feels like … pick up an FPGA and program in Verilog … [Verilog] isn’t a general purpose language … a terrible language, but will give you a feel for what it’s like to write a program where every line executes simultaneously”

    2. Your conversation partner used Loper Lisp and Dataflow Lisp (seemingly) interchangeably and there was no objection or disambiguation from your part.

    3. In the past you have also mentioned that a “proper computer” would have an “instruction set isomorphic to a modern high level programming language”, and your conversation partner mentioned this here.

    4. You also stated “the procedural paradigm is incurably stupid”.

    Now,

    1. I have an intuitive wishy washy fuzzy understanding of that Sussman and Steele paper which established (at least some kind of) equivalence between alphas (actors (i.e. concurrent asynchronous)) and lambdas.

    2. I have an intuitive fuzzy wishy washy understanding of FPGAs

    3. I have more concrete understanding of (synchronous!) data flow programming (with some hands on experience).

    4. Finally, I have an understanding of how a compiler, when it is given (or derives) some knowledge such as “this function is pure” or “this function is associative” or “this function is idempotent” can transform a program that executes sequentially to a program that (also) executes concurrently.

    So, I (perhaps mistakenly) conclude that it’s possible (and even desirable) to have a programming language (not unlike a Maclisp (e.g. Common Lisp)) with some notions of “procedures” (i.e. a sequence of expressions whose purpose is purely operational (i.e. not their value but their “side effect”)) compile down to an (even asynchronous) data flow graph. I also trust, but have no understanding on quite how, this graph could then become a physical circuit on a physical machine.

    So, what I am not clear about is whether Loper’s “bedrock abstraction” ISA will be a traditional Lisp machine-like ISA (i.e. procedural) but with pervasive implicit concurrent execution, or whether it will be something more (semantically) like Verilog which meets your five “basic qualifications” of Lispiness?

    P.S. I appreciate your efforts and your writings greatly. I hope you find the material comfort needed to devote more of your time and energy to this worthwhile pursuit.

  • St Gregory of Nyssa says:

    Couldn’t you get an FPGA of completely known design more easily by writing a simulator, as opposed to using one in real life?

    • Stanislav says:

      Dear St Gregory of Nyssa,

      Certainly, there is always ‘Verilator’ and the like.
      But try actually using it for something non-trivial.
      Simulating parallel logic on a sequential machine, even purely for design purposes, is an utter misery.

      Yours,
      -Stanislav

      • St Gregory of Nyssa says:

        After reading your answer I feel sorry for asking such a dumb question. Thanks for explaining.

        You know, however, going with Option #1 won’t require a billion dollars. You could hire an existing semiconductor foundry and get by with just a million dollars.

  • Colt Wilkens says:

    A million dollars is still extremely cost-prohibitive for a single person. Even with access to a service like MOSIS, having a single, small chip (around 0.9 mm x 0.9 mm) fabricated costs upwards of $25,000 on older manufacturing processes.

  • St Gregory of Nyssa says:

    If I may, I would like to question your idea of a bedrock that one can read and write in directly.

    I do not doubt that we can do better than assembly or even Scheme as the lowest foundation. I’m familiar with Dr Kogge’s book front to back. But as a program gets more complicated, it starts to rely on or plead for more and more language constructs.

    I believe, and feel free to disagree, that Scheme R4RS was orthogonal and clean. Yet, a person trying to write something that isn’t trivial would quickly reach the point of diminishing returns, and progress on his project gets slower and slower. A third party reading his code will also find it less and less intelligible. The CLOS in Common Lisp, on the other hand, holds off this threshold for another half mile. But with huge complexity of implementation.

    Do you have proof that a language with an elegant specification can handle non-toy examples and programs of industrial strength? When your machine actually gets made this will be even more of an issue, since it will raise the standard of program complexity.

    PS. My previous question was dumb because I realized that one would hit the speed ceiling really fast, if he simulated parallel with sequential. It’s still possible, however, to produce a far superior version of Verilator using Common Lisp.

  • Observer says:

    Speaking of merging CPU and RAM, you might take a look at the work discussed here:
    https://www.schneier.com/blog/archives/2013/12/friday_squid_bl_403.html#c2821863

  • “Redesigning computer architecture” is something well beyond my pay grade, though I made a limp wristed stab at a wildly impractical attempt at finding ways of generalizing analog computing models when I was loafing around waiting for people to sign my dissertation. I do recall quite a few interesting ideas in the “progress in computing” archives. Some of them even resembled what I think you’re getting at with data flow architecture (Hava Siegelman’s is the only one whose name I can recall from memory; she has the great advantage of being alive and working on such things, including hardware, I think, at Umass).

    Perhaps equally unhelpful is a recent observation that Arthur Whitney seems to be working on something that … at least in principle … could end up doing urbit-like things. I had wondered why he’d want to put the K language on a cell phone, and figured it was for some dumb reason like, “it’s easier to write programs on a tablet when you can write complex things in one line of code.” I now suspect he’s writing this OS to solve urbit like problems. No sign of a security model yet, but Arthur is one to watch
    http://archive.vector.org.uk/art10501320
