"Pest" v. 0xFE.
This is a continuation of "Pest" v. 0xFF.
Edit: A note for the innocent -- this series of articles concerns an algorithm! It is not, at the time of writing, fully implemented anywhere (though there is now a prototype; thank you, thimbronion!), or even entirely complete.
Edit: There is a working draft of v. 0xFD. (Note: it may change without warning!)
Edit: Various problems and discussions.
Edit: billymg made a mirror of this page.
The document (very much a work in progress!) is available as a vtree. You will need:
- A V-tron. If you have no idea what V is, start here.
- pest_spec.kv.vpatch
- pest_spec.kv.vpatch.asciilifeform.sig
- pest_spec_FE.kv.vpatch
- pest_spec_FE.kv.vpatch.asciilifeform.sig
Add the above vpatches and seals to your V-set, and press to pest_spec_FE.kv.vpatch.
To "compile" the document to HTML, run make (this requires the "markdown2" utility).
Please submit any proposed changes to this spec in vpatch form.
The full text is reproduced below, and any reader able to spare the time for a careful reading is invited to comment! 
This version is obsolete! Please read v. 0xFD!
I've waited a while to read this, and I'm not disappointed. I did dislike the focus on IRC regarding the Operator Console; any implementation of Pest I build will use a unique interface that spares me from that terrible protocol. For quite a while I've entertained the idea of an implementation which accepts my messages from one end, and sends all deduplicated incoming messages from another; this idea will require some changes, but nothing that shatters the base idea; I like the idea of a tool which converts the conversation stored in a record file into a doubly-linked list for perusal, with additional linking for the message fields, perhaps similar to Ted Nelson's Zig Zag.
I'd wondered how initial setup would be handled, and this out-of-band method is probably the only decent way to do this. I like the command terminology and its patterns. I seem to recall encryption wasn't part of the older design, but it doesn't place undue burden on implementation; I should be able to implement the Serpent cipher in one day. Surely little-endian isn't used purely because of the Serpent cipher using that? While it's minor, and transparent at the message level anyway, why was this chosen over big-endian, which is also the Internet order?
I'm interested in a client which never dynamically allocates, and so there need be suggested limits on such things as aliases, hearsay buffers, and whatnot. A good default for how frequently to send Ignore messages may be each second, with substitution when the transmission queue isn't empty. Should the fifteen minute time boundary be set in stone, rather than merely be the suggested default? And why has there been a shift back to epoch time, which I've read referred to as "political time", over individual station time? I'll certainly have more ideas later.
A Pest user is a Pester.
Dear Verisimilitude,
Thanks for the detailed read, and do stay tuned for the following drafts!
The IRC compatibility is for the obvious reason: I intend to use the existing logotron; and, along with probably every other current Dulapnet resident, also intend to use my existing IRC client for a rather long time, and certainly have no desire to wait for someone to write an entirely new one before switching to P2Pism.
Initial keying is to be handled out of band (how else?) via GPGgram or in person.
Little-endianism is specified because I am sick and tired of catering to the "network order" inherited from the glory days of Sun Micro etc. but existing today on, AFAIK, no recent iron whatsoever.
Re: allocation -- in Ada you can allocate variably-dimensioned arrays on stack frames, this is amply illustrated in FFA, and ought to suffice.
Re: Ignore messages (and likewise rekeys, though these have not yet made it into the draft) -- I deliberately left the frequency of their occurrence unspecified. Things on which two peers do not actually need to agree in order to interoperate, do not need to be nailed down in the spec IMHO.
Re: epoch time -- is simply individual station time. It is impossible to prevent replays in the general case if there is no agreement on time. And I am entirely uninterested in any variant of the protocol where replays cannot be prevented in the general case.
IMHO it is possible for all operators to get within +/-15min agreement without resorting to centralized NTPism.
Yours,
-S
I think big-endian is best, but it's clear only a war will solve the question forever; it's of minor importance. I referred to dynamically allocating after any program prologue, and with only predictable stack allocations. Ideally, space for so many peers and whatnot, not necessarily used then, could be allocated at the start and then left alone.
That rekey idea is interesting; ideally, nowhere near the limit will be purposed, but there are at least a few interesting commands to add. Lastly, the time limit only prevents replays outside of the half hour window, right; I suppose the deduplication is expected to handle what remains.
Dear Verisimilitude,
> ...the time limit only prevents replays outside of the half hour window, right; I suppose the deduplication is expected to handle what remains
Correct.
Yours,
-S
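For illustration, the staleness test agreed on in this exchange might be sketched as follows. This is a minimal sketch, not the spec's wording: the names `MAX_SKEW_SECONDS` and `is_stale` are hypothetical, timestamps are assumed to be integer seconds of epoch time, and treating a skew of exactly 15 minutes as acceptable is an assumption.

```python
import time

# The +/-15 minute window discussed above, in seconds.
MAX_SKEW_SECONDS = 15 * 60


def is_stale(packet_timestamp, now=None):
    """Return True if the timestamp lies outside the accepted window.

    The window is symmetric: a packet may be up to 15 minutes old or up to
    15 minutes "in the future" relative to the local station clock.
    """
    if now is None:
        now = int(time.time())
    return abs(now - packet_timestamp) > MAX_SKEW_SECONDS
```

A replay of a packet whose timestamp is still inside the window passes this test, which is why deduplication is needed for what remains.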
Still reading but I think the deduplication buffer needs to be increased to line speed * 1 hour.
Consider a busy net with more than 257 kB of broadcast traffic within the 1 hour window. I could continuously replay this data at a single node for at least 15 mins, as by the end of the recorded data it will start bumping the earliest packets out of the deduplication queue. Since they are flood routed the node will happily forward that to all connected nodes, overwhelming their deduplication buffer, so on and so forth.
Dear Adam,
Why would replayed data affect the contents of the victim's deduplication queue? You can't "bump" a message out of the queue by sending in the same message, it is already there. And sending in a message which got bumped on account of age will do nothing -- it will be rejected as stale.
And recall that it's 256kB or 1 hour, whichever weighs more, rather than merely 256kB.
The amount of storage required for the queue is in fact exactly linespeed * hour, as specified.
The 256kB figure is simply an optimization for speeding up deduping in a low-traffic net. One could even make a case against it ("all packets should be processed in worst-case time!".)
Yours,
-S
Dear Adam,
On second thought, it isn't clear to me that there is any point in storing messages older than 15 minutes!
Yours,
-S
Ah I see, I somehow misread the "whichever represents a greater number of messages."
Yes, 1 hour is more than you need, but I believe you need to store the last 30 minutes of messages to account for allowing up to 15 minutes of forward timestamp skew. A message received with a timestamp 15 minutes in the future needs to be held in the deduplication queue for at least 30 minutes before its timestamp becomes stale enough that it won't make it to deduplication.
I do think the specification would be simpler if it only specified holding the last 30 minutes of messages, and dropped the bit about the 256 kB, as storing old messages serves no purpose; they will be rejected due to stale timestamp anyway.
Dear Adam,
> the last 30 minutes... ...allowing up to 15 minutes of forward timestamp skew
Good point.
Yours,
-S
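The 30-minute figure can be checked with a small sketch. A packet received at time T with a timestamp of T + 15 min stays inside the acceptance window until T + 30 min, so the queue must remember it for that long. The class below is only an illustration under stated assumptions: `DedupQueue` and `RETENTION_SECONDS` are hypothetical names, packets are keyed by their SHA-256 digest (the spec may key them differently), and `now` is assumed never to go backwards.

```python
import hashlib
from collections import OrderedDict

MAX_SKEW_SECONDS = 15 * 60                 # accepted skew in either direction
RETENTION_SECONDS = 2 * MAX_SKEW_SECONDS   # 30 minutes, as argued above


class DedupQueue:
    """Remembers every packet received during the last RETENTION_SECONDS."""

    def __init__(self):
        # digest -> time of receipt; insertion order is oldest first.
        self._seen = OrderedDict()

    def _expire(self, now):
        # Drop entries received more than RETENTION_SECONDS ago.
        while self._seen:
            received_at = next(iter(self._seen.values()))
            if now - received_at <= RETENTION_SECONDS:
                break
            self._seen.popitem(last=False)

    def is_duplicate(self, packet, now):
        """Return True if the packet was already seen; otherwise record it."""
        self._expire(now)
        digest = hashlib.sha256(packet).digest()
        if digest in self._seen:
            return True
        self._seen[digest] = now
        return False
```

By the time an entry is expired here, the corresponding packet can no longer pass a +/-15 minute timestamp check, so forgetting it does not reopen a replay.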
> Little-endianism is specified because I am sick and tired of catering to the "network order" inherited from the glory days of Sun Micro etc. but existing today on, AFAIK, no recent iron whatsoever
Network byte order (big endian) is still here, and hton*() / ntoh*() are available to translate as needed. What problems does forcing little endian solve?
The thing about the argument from triviality, is that it's symmetric. What problems does using big endian solve, or avoid? "Network byte order" will continue to be used in IP and UDP headers forever, but there's no particular reason anything within that needs to have it, just because it's going over a wire, as long as the endianness is explicitly specified (which it should be in any case).
I'd also point out that even for C projects, hton and ntoh depend on an unsafe form of type punning that equivocates uint types with their byte stream representations. Better to treat uints as uints, and byte sequences as byte sequences, converting explicitly as needed. This also applies to reading and writing files.
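As a sketch of the explicit conversion argued for here, an unsigned integer can be turned into a little-endian byte sequence and back without any type punning. The 8-byte field width and the function names are assumptions made for the example, not taken from the spec.

```python
def u64_to_le_bytes(value):
    """Serialize an unsigned 64-bit integer as 8 little-endian bytes.

    Raises OverflowError if value is negative or does not fit in 64 bits.
    """
    return value.to_bytes(8, byteorder="little", signed=False)


def u64_from_le_bytes(data):
    """Parse exactly 8 little-endian bytes into an unsigned integer."""
    if len(data) != 8:
        raise ValueError("expected exactly 8 bytes")
    return int.from_bytes(data, byteorder="little", signed=False)
```

The result is the same on any host, since the byte order is stated in the code rather than inherited from the machine.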
What's the negative implication if I set my station to accept packets with any timestamp? A case that occurs to me where a stale but valid packet could be received is if a relaying peer receives a message and then loses internet connectivity for more than 15 minutes immediately afterwards. If they get it out eventually, and the signature is still valid, why would I want to not see it?
Dear DangerNorm,
If you process stale packets, your station can be flooded with replays of captured legitimate packets, i.e. it is DDoSable, and in fact will serve as a DDoS-amplifier against your peers, as it will relay the stale rubbish (if it happens to be a broadcast) to them.
In a flood-routed network, this would be catastrophic.
Yours,
-S
A single unsolicited UDP packet containing garbage bytes can cause a node to perform O(WOT size) HMAC checks; this seems like a DDoS vector, not a source of resistance.
Dear Matt,
IMHO if your machine is not specced to process worst-case (O(WOT)) verification at GB/s (or whatever your line rate is), your machine is underspecced (or your WOT is far too big, and you ought to split your station into N independent substations.)
Yours,
-S
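The worst case described in this exchange, where a garbage packet forces one HMAC check per peer, might look like the sketch below. It assumes HMAC-SHA512 as the MAC, a tag appended to the end of the packet, and hypothetical names (`find_sender`, `seal`, `peer_keys`); none of these layout details are taken from the spec.

```python
import hashlib
import hmac

TAG_LEN = 64  # size of a SHA-512 digest in bytes (assumed tag length)


def seal(body, key):
    """Hypothetical helper: append an HMAC-SHA512 tag to the body."""
    return body + hmac.new(key, body, hashlib.sha512).digest()


def find_sender(packet, peer_keys):
    """Return the handle of the peer whose key authenticates the packet.

    peer_keys maps handle -> key bytes. A packet that matches no key costs
    one HMAC computation per peer, i.e. O(WOT size) work, before rejection.
    """
    if len(packet) <= TAG_LEN:
        return None
    body, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    for handle, key in peer_keys.items():
        expected = hmac.new(key, body, hashlib.sha512).digest()
        # Constant-time comparison, to avoid leaking how many bytes matched.
        if hmac.compare_digest(expected, tag):
            return handle
    return None
```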
1. No retransmission? What if a station briefly loses connectivity while a conversation is in progress? The station will then spam the operator with "xyz forked" messages per your spec.
2. Is HMAC-512 supposed to be HMAC-SHA512?
3. Using old logger only means it has to be compatible with IRC for messages. Specifying the station control interface as IRC-like is unnecessary.
Dear apeloyee,
Welcome back! (Please consider visiting dulapnet!)
Re: 1 -- There's a retransmission mechanism in the draft of 0xFD.
Re: 2 -- Correct.
Re: 3 -- The intent is for the station apparatus to be controlled entirely via a (machine-local) IRC client, to avoid the chore of writing GUIware that is largely identical to existing softs.
Yours,
-S
Since packets are ACK'd by echoing, a MitM can memory hole messages in a way undetectable by either sender or recipient. In considering this, I also realized that, with the small size of messages and the limits of human communication rate, this seems like a protocol that could practically use channel saturation as a means of thwarting traffic analysis, in which IGNORE messages are sent to all peers on a fixed schedule, and substituted with real ones when a message enters the transmission queue (or a message needs to be ACK'd). This doubles as a liveness check, and at least makes memory holing conspicuous as a loss of contact.
What's the motivation for having ACKing by echoing, rather than with, say, the hash of the plaintext?
Dear DangerNorm,
> Since packets are ACK'd by echoing...
ACKs are slated to go away entirely in the 0xFD draft: they are quite unnecessary after we introduce getdata.
> a MitM can memory hole messages in a way undetectable by either sender or recipient...
Your (or your peer's) ISP certainly could decide to (or simply through malfunction) throw away every other packet; or even proclaim that the only packets that will be routed are to be to or from e.g. Facebook. IMHO it is not correct to call this a "MITM", as the enemy is unable to discriminate among the packets other than by source or destination, or to usefully substitute his own in place of the original.
Neither is such an action "undetectable", it is entirely detectable via SelfChain and NetChain -- while there is any connectivity at all available to the pair of peers in question.
If anything, Pest may be (AFAIK) the first attempt at a 100% IP-agnostic protocol (i.e. a peer can make use of as many IP addresses as he has control of, and for so long as some of them work some of the time he will be in business) -- and a step on the road to "regaining the original Internet", where "censorship is damage and we route around it automatically."
> a protocol that could practically use channel saturation as a means of thwarting traffic analysis, in which IGNORE messages are sent to all peers on a fixed schedule, and substituted with real ones when a message enters the transmission queue...
This is entirely doable, and even discussed previously in the logs -- but would be rather costly to apply after the protocol is no longer used strictly for chat, but also for e.g. warez.
Yours,
-S
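The fixed-schedule scheme discussed in this exchange can be sketched as a per-tick choice between a queued message and padding. This is only an illustration: `IGNORE` is a placeholder for whatever an Ignore packet looks like on the wire, `next_outgoing` and `run_ticks` are hypothetical names, and the sketch assumes that all packets have the same length and are indistinguishable once encrypted.

```python
from collections import deque

IGNORE = b"IGNORE"  # placeholder for a padding (Ignore) packet


def next_outgoing(queue):
    """Called once per tick: a real message if one is pending, else padding."""
    if queue:
        return queue.popleft()
    return IGNORE


def run_ticks(queue, ticks):
    """Return what would be sent to one peer over a number of ticks."""
    return [next_outgoing(queue) for _ in range(ticks)]
```

Exactly one packet leaves per tick whether or not the operator is typing, so an observer sees a constant rate. The cost is that throughput is capped at one packet per tick, which is the objection raised above for bulk transfers.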