Saturday, November 7, 2020

How they handle multi-episode stories in a TV series

Related to: Expecting Short Inferential Distances (yes, linking LW is still a thing).

Below are some remarks I've made in scattered form in several forums, but I thought I'd collect them here.

Problem: A TV series will run for many episodes, and the writers will want to build up a storyline over many of them, which is necessary for a satisfying payoff. But viewers don't necessarily watch it all at once and keep the whole thing in their heads.

I call this the "narrative equivalent" of expecting short inferential distances (see the link above) -- just as people are used to an explanation not requiring a lot of steps, they don't expect a given scene or episode to need a lot of previous viewing to understand.

So, how do writers solve this problem? Here are the four general ways, with examples (feel free to offer more TV series for a category!):

A) Don't bother #1: Each episode is self-contained.

This is known as the "episodic" approach, as opposed to serial. Each episode can be understood without knowing anything about the preceding episodes, so there's no need to worry about this problem at all.

Examples: Star Trek (at least the original or TNG), South Park, The Simpsons (earlier seasons)

Downside: It's hard to feel investment when you know nothing will matter, that things will just return to where they were at the end. It also limits how much build-up (and corresponding payoff) the story can have.

B) Don't bother #2: It's the viewers' job to keep up.

Each episode just assumes you have the entire previous history in your head, maybe with some small reminders of relevant earlier events for assistance.

Example(s): Game of Thrones

Downside: It only works for really devoted fans who will binge the whole thing and try on their own to stay up to speed.

C) Formal recaps

You've seen them: the narrator says, "Previously, on [this TV series]...", and then you get enough short clips to establish the relevant context for the current episode.

Examples: 24 (after season 1), Battlestar Galactica, Burn Notice

Downside: A lot of people don't like them and think they're cheesy. (I've never understood this mentality, but there it is.) It also may force you to reveal what things from previous episodes will be relevant, taking away their surprise when revealed. There is a slight break in immersion since you have to go "out of the world" for them.

D) In-world recaps

Same as the previous, except you're not taken "out of the world" to do it; instead, there is a scene, within the story, that doubles as exposition of the things a formal recap would cover.

Examples: Breaking Bad, Bojack Horseman

Downside: It heavily constrains how you write the story and forces in pointless scenes that shatter immersion because they're retelling things -- and more slowly! -- the characters should know.

Well, there you have it. That about summarizes the different ways writers handle (or avoid handling) long storylines!

Wednesday, April 1, 2020

An HTTP status code to say "you messed up but I'll handle your request anyway"

So apparently, the Internet Engineering Task Force is going to introduce a new HTTP status code. Just like there's the 404 for "File not found", we're soon going to have "397 Tolerating", similar to a redirect.

The way it would work is, if you send a request that violates some standard, but the server can identify the probable intent of your request, it will reply with a "397 Tolerating" to say, "oh, you messed up, and here's how, but I'm going to reply to what I think you meant".

This is much better than the options we had before, which were either a) unnecessarily reject the request, or b) silently reply to the intended meaning but with no notice that was happening. This lets you tell the client you're tolerating their garbage!
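As a rough illustration of that behavior, here is a minimal Python sketch of a server-side check on the request line. The "397 Tolerating" status is taken from the description above; the function name, the particular mistakes it tolerates, and the shape of the return value are all hypothetical choices made for this example, not anything from the draft.

```python
def handle_request_line(line):
    """Return (status, note, method, path) for a raw HTTP/1.1 request line.

    Hypothetical policy: tolerate stray whitespace and a lowercase method
    or version; reject anything whose intent can't be guessed.
    """
    parts = line.split()
    if len(parts) != 3:
        return ("400 Bad Request", "cannot guess intent", None, None)

    method, path, version = parts
    problems = []

    if line != " ".join(parts):
        problems.append("extra whitespace in request line")
    if method != method.upper():
        # HTTP method names are case-sensitive, so "get" is technically wrong.
        problems.append("method should be uppercase")
        method = method.upper()
    if version != "HTTP/1.1":
        if version.upper() != "HTTP/1.1":
            return ("400 Bad Request", "cannot guess intent", None, None)
        problems.append("version should be uppercase")

    if problems:
        # Serve the probable intent, but say what was wrong with the request.
        return ("397 Tolerating", "; ".join(problems), method, path)
    return ("200 OK", "", method, path)


print(handle_request_line("get  /index.html HTTP/1.1"))
```

A clean request line comes back as "200 OK" with an empty note, while the sloppy one above is still served but gets the 397 status plus a list of what it got wrong.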

My contact at IETF sent me an early draft of the RFC, which you can access at the link below.

RFC 8969: HTTP Status Code 397: Tolerating

Pastebin Link

Wednesday, March 18, 2020

My presentation on using SAT solvers for constraint and optimization problems

Because of the virus we had to hold this meetup virtually, and I was slated to present there for the Evening of Python Coding. Since we made a recording of it, I can now share. Enjoy my not-ready-for-prime-time voice! (Yes, I need to update my profile picture ... badly.)

Friday, May 31, 2019

Solving my first (Ghidra) reverse engineering challenge

I was pointed to this challenge by this article, and had heard about the NSA's new Ghidra reverse engineering tool. I was able to solve it without reading much of the article! Here's what I did.

Setup: They give you a compiled ELF binary that runs a program that prompts you for a password. It will tell you if you guessed correctly.

I was going to use this as a chance to learn the Ghidra tool, with help from the article.

First problem: It's compiled to run on Linux Debian x86. I was doing it on a MacBook. So, it wouldn't run.

I noticed that Ghidra offers you the option to export the binary as C code (which makes sense, as part of Ghidra's reverse engineering is to convert a binary into assembly and slightly-more-readable C-like code). So, I figured I could just compile that to run on my Mac.

It didn't work, though: Ghidra exports an incomplete version of the code that uses some invalid types and doesn't define all of its labels properly. Fixing that by hand looked increasingly like it would take too long.

So I figured I'd just run the binary in a Docker container. I found the Debian i386 version and pulled it down. (I had a long side diversion here getting a setup so I could edit files on my machine that would appear as I want in the Docker container, but the details are uninteresting besides this clever hack for getting the "copy file" command into your docker one-liner.)

So, I was able to run the binary.

Second problem: Somehow, it thought I was using a "debuguer" -- which I would be, at some point, but is strange, because I wasn't using one yet:

Don't use a debuguer !

I looked through the decompiled code to see where it might be doing this and found:

lVar1 = ptrace(PTRACE_TRACEME,0,1,0);
if (lVar1 < 0) {
    puts("Don\'t use a debuguer !");
    /* WARNING: Subroutine does not return */

Hm, okay, well that "if" statement maps to this part of the assembly:

0804868c 79 11 JNS LAB_0804869f

That is, according to this handy reference, "Jump if not sign." Well, whatever "sign" is, I want it to do the opposite -- change "JNS" to "JS", so it "jumps if sign" and skips that block -- the one that gives the mean message and exits the program.

I haven't figured out a good way to format that line, but 0804868c is the (hexadecimal) location within the binary, 79 11 is the hex version of the binary command itself, JNS the assembly term (just a mnemonic device) for that command, and LAB_0804869f is the argument passed to the command -- in this case, a label for where to jump to in the program.

According to that spec, the JNS command maps to 79, and if I want it to be JS, I need to replace it with 78. But...

Third problem: I don't have an easy way to edit binaries.

A convenient way to look at them is as a hex(adecimal) dump in a "hex editor", but I hadn't done that for a few months. Hex editors are useful because they show you the raw hex, the offset where it appears in the file, and, off to the side, the ASCII/text equivalent of that hex. Fortunately, I found this comment, showing how to repurpose my text editor, vim, to double as an easy hex editor. I can just open it up, edit the hex values, save, and it updates everything.

You can then run a utility, "gobjdump", to see the code as assembly and verify that e.g. JNS changed to JS as expected.
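The one-byte edit itself can also be sketched in Python instead of a hex editor. This is a minimal sketch under stated assumptions: the function name is made up, the demo bytes are a stand-in rather than anything copied from the challenge binary, and the offset you pass in must be the instruction's position within the file, which is generally not the same number as the 0804868c address Ghidra displays.

```python
JNS_OPCODE = 0x79  # short-form "jump if not sign"
JS_OPCODE = 0x78   # short-form "jump if sign"


def flip_jns_to_js(data: bytes, file_offset: int) -> bytes:
    """Return a copy of data with the JNS opcode at file_offset changed to JS."""
    found = data[file_offset]
    if found != JNS_OPCODE:
        # Refuse to patch if the byte isn't what we expect to see there.
        raise ValueError(
            f"expected 0x79 at offset {file_offset:#x}, found {found:#04x}"
        )
    patched = bytearray(data)
    patched[file_offset] = JS_OPCODE
    return bytes(patched)


# Made-up stand-in bytes: two filler bytes, then "79 11", then one more filler.
sample = bytes([0x90, 0x90, 0x79, 0x11, 0x90])
patched = flip_jns_to_js(sample, 2)
print(patched.hex())  # 9090781190
```

Only the opcode byte changes; the 11 that follows it (the jump distance) is left alone, so the patched instruction jumps to the same label under the opposite condition.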

Fortunately, when I ran that edited binary, it did what I wanted: not accuse me of using a debugger, and proceed with the rest of the program.

The rest of the hurdles are similar: there's some if-test that can send the execution into a block that you don't want it to. Fortunately, this binary is structured ... stupidly, from a security perspective. It has that same kind of if-block for validating whether you entered the right password. Right afterward, if the check succeeded, it prints the password. Not your input, no -- it prints the correct password, which you can later validate on an unmodified run.

So, it's just a matter of tweaking the code so that you always enter the block that outputs the password. (They were smart enough not to have the password itself appear as an obvious string in the binary, at least.)

Once I saw the output, I could submit it at the challenge site and verify it was correct.

Now for something harder!

Tuesday, February 13, 2018

NAND to Tetris: Nothing short of awesome

Been a while since I posted, huh?

Well, here's an interesting development in my life: I finally started the NAND to Tetris course (part 2) on Coursera, which is based on the instructors' site.

I can say, whole-heartedly, that the course is awesome, and the first thing in a long time that has gotten me into a flow state.

The idea behind the course is this: they take you through (virtually) building a computer from a very basic level, showing all the abstraction layers that produce a computer's behavior.

It starts with logic gates -- specifically, the NAND gate (hence the name), since it's universal. Your first project is to use a hardware simulator to build several other logic gates from NAND. (NAND is just an AND gate with its output inverted, so that true and true yield false, and everything else yields true.) The second project is to build the arithmetic logic unit (ALU) in a CPU out of the components, so you basically have a configurable circuit: based on six control bits, you perform one of several possible functions on two 16-bit inputs.
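To make the "universal" claim concrete, here is a small sketch that models the gates as Python functions rather than in the course's hardware simulator. The function names are just illustrative, and these are the standard textbook constructions, not necessarily the ones the course expects.

```python
def nand(a: bool, b: bool) -> bool:
    """The only primitive: false when both inputs are true, true otherwise."""
    return not (a and b)


def not_(a: bool) -> bool:
    # Feeding the same signal to both NAND inputs inverts it.
    return nand(a, a)


def and_(a: bool, b: bool) -> bool:
    # AND is NAND with the output inverted back.
    return not_(nand(a, b))


def or_(a: bool, b: bool) -> bool:
    # De Morgan: a OR b == NOT(NOT a AND NOT b).
    return nand(not_(a), not_(b))


def xor(a: bool, b: bool) -> bool:
    # Classic four-NAND construction.
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))


# Print the truth table for every derived gate.
for a in (False, True):
    for b in (False, True):
        print(a, b, and_(a, b), or_(a, b), xor(a, b))
```

Every derived gate is defined only in terms of `nand` (directly or through another derived gate), which is the whole point of the first project.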

The next project incorporates "flip flops", which allow you to repeatedly execute that circuit while also writing to and reading from some persisted memory. Your inputs to the circuit then function as a kind of machine code.

The project after that then has you implement functions in that machine code, writing in an assembly language the authors created for the course that has a precise mapping to the machine code inputs in the CPU from the previous lesson. I can honestly say it was a really fun project to implement multiplication directly in assembly language!

Later on you write a compiler from a high level language into assembly (which then converts simply into machine code), in a way that's broken into two steps: a compiler from the high level language into a virtual machine that works on a stack with memory blocks, and a compiler from the virtual machine commands to assembly. I just recently finished that latter part (which comes first).

Those layers then build up to making a game that runs on your CPU, then an OS, and some other stuff I haven't delved into.

In the course of the projects, you use a hardware simulator, an assembler (for converting the assembly language into machine code), a CPU simulator, and a virtual machine simulator.

In order to catch up with the class and have enough leeway for a weekend trip, I went through much of the course over the weekend, and enjoyed every minute of it! I especially like how they break the projects into manageable pieces. For example, in the VM-to-assembly part, it first has you implement simple push/pop/add operations on a stack and run a test to verify that you can compile those commands. Further test suites give you a manageable set of operations to add.
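For a feel of what those first push/pop/add commands mean, here is a minimal Python sketch that interprets them directly instead of compiling them to assembly. The command syntax ("push 7", "pop", "add") is a simplified stand-in invented for this example, not the course's actual VM language, and the function name is hypothetical.

```python
def run_stack_program(commands):
    """Execute simplified stack commands and return the final stack."""
    stack = []
    for command in commands:
        parts = command.split()
        if parts[0] == "push":
            # Put a constant on top of the stack.
            stack.append(int(parts[1]))
        elif parts[0] == "pop":
            # Discard the top of the stack.
            stack.pop()
        elif parts[0] == "add":
            # Replace the top two values with their sum.
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        else:
            raise ValueError(f"unknown command: {command}")
    return stack


print(run_stack_program(["push 7", "push 8", "add"]))  # [15]
```

The real project emits assembly that has the same effect on a stack held in memory, which is why starting with just these three operations makes for a manageable first test.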

I've written an actual compiler now! (albeit of limited use...)

(Considering how fun this is, and how naturally it comes to me, maybe I picked the wrong field.)

One interesting challenge is that the CPU only has two registers, called A and D, and only the A register is used when accessing memory, to know which word of memory to look at. It took me a while to figure out how you could say "look at the value in the memory location referred to in the memory location zero." Before I saw how you could do it, I implemented one project by having two separate code branches for the two possible values at the memory location!
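The general shape of that indirect lookup can be modeled in a few lines of Python. This is a sketch of the idea only, not the course's assembly: the function name and memory contents are made up, and plain variables stand in for the A and D registers.

```python
def load_indirect(memory, pointer_address):
    """Return the value stored at the address that memory[pointer_address] holds."""
    a = pointer_address  # A selects which memory word we can see
    a = memory[a]        # overwrite A with the pointer stored in that word
    d = memory[a]        # A now selects the target word; copy it into D
    return d


# Made-up memory: location zero holds the address 5, and location 5 holds 42.
memory = [0] * 16
memory[0] = 5
memory[5] = 42
print(load_indirect(memory, 0))  # 42
```

The key step is the middle one: because A is the only way to address memory, the pointer has to be loaded into A itself before the final read.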

Would love to link my code for the projects, but they discourage that to leave the challenge for others to solve.

Monday, November 21, 2016

Another Slashdot memory: Ah, the brick-and-mortar analogies...

So, I remembered another funny Slashdot exchange (again, no link).

Story: Some online retailer got in trouble for filtering their "customer reviews" so that only the positive remarks (about listed products) stayed and everything else was deleted.

A: "Wow, that's pretty scummy. They don't have the right to just clip out negative reviews."

B: "Don't they? I mean, it's their site; they have the right to set whatever editorial standards they want."

C: "Sure, but there's still an issue of consumer fraud and deception. For a brick-and-mortar analogy, imagine that Barnes and Noble started hosting 'book discussion nights' at their stores and promoted it as such. But you quickly notice that whenever someone says something negative about a book sold by B&N -- and only those books -- that person gets a tap on the shoulder from security, pulled aside, and asked to leave.

"In that case, it would indeed be correct to say they can expel whoever they want, but it's still fundamentally fraudulent to represent that event as being for 'book discussion' rather than 'book promotion'."

In other news, I finally found one of the ones that I thought I couldn't! The Armadillo rocket failure mentioned in this post was actually this conversation. The actual (but truncated) exchange went like this (note the links to original comments):

A: "And to think, they want us all to ride in these things commercially...."

B: "John and his team have an excellent track record thus far, and have continued to make safety a main issue. I'm sure that this experience will teach them even more, helping to make the next flight even safer."

C: "You mean even safer than a huge orange fireball?

"I don't know, that's a pretty high bar."

Saturday, February 27, 2016

Some of my geeky tech jokes -- with explanations!

I know the line: explaining a joke is like dissecting a frog; you understand it better, but it dies. Still, not everyone will get these, and I figure I might as well have a place where you at least get a chance. So here are some of my own creations, explained.

Girl, you make me feel like a fraudulent prover in a stochastic interactive zero-knowledge proof protocol ... because I really wish I had access to your random private bits!

Explanation: In a stochastic zero-knowledge proof protocol, there is a prover and a verifier, where the former wants to convince the latter of something. But for the proof to work, the verifier must give the prover unpredictable challenges. Think of it like a quiz in school -- it's not much of a quiz if you know the exact questions that will be on it.

The information to predict the challenges is known as the verifier's private random bits. Those with a legit proof don't need this, but a fraudulent prover does. Thus, a fraudulent prover in a stochastic interactive zero-knowledge proof protocol wants access to the verifier's "random private bits".
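For the curious, here is a toy Python sketch of why knowing the challenge in advance is enough to cheat. It uses a Schnorr-style identification scheme picked purely for illustration (the joke doesn't name a protocol), with tiny, insecure parameters; all function names are made up for this example.

```python
import random

# Tiny demonstration group: G generates a subgroup of prime order Q modulo P.
P = 23
Q = 11
G = 2  # 2**11 % 23 == 1, so G has order 11


def verify(public_key, commitment, challenge, response):
    """Verifier's check: G^response == commitment * public_key^challenge (mod P)."""
    return pow(G, response, P) == (commitment * pow(public_key, challenge, P)) % P


def honest_prover(secret, rng):
    """Knows the secret exponent, so it can answer whatever challenge arrives."""
    nonce = rng.randrange(Q)
    commitment = pow(G, nonce, P)
    return commitment, lambda challenge: (nonce + challenge * secret) % Q


def fraudulent_prover(public_key, predicted_challenge, rng):
    """Doesn't know the secret; works backwards from a challenge it predicted."""
    response = rng.randrange(Q)
    # commitment = G^response * public_key^(-predicted_challenge) mod P,
    # using public_key^Q == 1 to turn the negative exponent into a positive one.
    undo = pow(public_key, (Q - predicted_challenge) % Q, P)
    commitment = (pow(G, response, P) * undo) % P
    return commitment, lambda challenge: response


rng = random.Random(0)
secret = 7
public_key = pow(G, secret, P)

challenge = 4  # what the verifier will actually ask
commitment, respond = fraudulent_prover(public_key, challenge, rng)
print(verify(public_key, commitment, challenge, respond(challenge)))  # True

commitment, respond = fraudulent_prover(public_key, 5, rng)  # wrong guess
print(verify(public_key, commitment, challenge, respond(challenge)))  # False
```

The fraudulent prover passes only when its prediction matches the real challenge, which is exactly why it wishes it could see the verifier's random bits.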

A historian, a geologist, and a cryptographer are searching for buried treasure. The historian brings expertise on practices used by treasure hiders, the geologist brings expertise on ideal digging places, and the cryptographer brings expertise on hidden messages.

Shortly after they start working together, the cryptographer announces, "I've found it!!"

The others are delighted: 'Where is it?'

The cryptographer says, "It's underground."

'Okay, but where underground?'

"It's somewhere underground!"

'But where specifically?'

"I don't know, but I know it's underground!"

'Slow down there. If all you know is that it's underground, then in what sense did you "find" anything? We're scarcely better off than when we started!'

"Give me a break! I just gave you an efficiently-computable distinguishing attack that separates the location of the treasure from the output of a random oracle. What more could you want?"

Explanation: In cryptography, an encryption scheme is considered broken if an attacker can find some pattern to the encrypted message -- i.e. they can identify telltale signs that it wasn't generated by a perfect random number generator, a "random oracle". Such a flaw would be called a "distinguishing attack". So in the cryptography world, they don't care if the attack actually allows you to decrypt the message; they stop as soon as they find non-randomness to the encrypted data. Applied to a treasure hunt, this means they would give up as soon as they conclude that the treasure location is non-random, which the cryptographer here thinks s/he's done simply by concluding that it's "underground".

So, 16-year-old Johnnie walked into an Amazon Web Services-run bar...

"Welcome," said the bartender. "What are you drinking?"

Johnnie replied, 'What've you got?'

"Well, we have a selection of wines and the beers you see right here on tap. But if you prefer, we also have club soda and some juices."

Johnnie thought, Wait a second. Why is he telling me about the wines and beers? Does he even realize ... ?

'Okay, I'll take the Guinness.'

"Bottle or draft?"


"Alright, and how will you be paying?"

Johnnie only had large bills from his summer job and gave the bartender a C-note.

"Sorry, but I gotta check to make sure this is real." The bartender took out a pen and marked it, then counted out the change. Johnnie reached for the beer.

"Hold on a second! Make sure to use a coaster!" The bartender slipped one under the glass. "Okay, now enjoy!"

Johnnie lifted up the glass to drink. Before he was able to sip, the bartender swatted it out of his hand.

"WHAT ARE YOU THINKING!?! Don't you know 16-year-olds can't drink!"

Explanation: On the AWS site, they will gladly let you click on the "Launch server" button and go through numerous screens and last-minute checks to configure it, and only at the very last stage does it say, "oops, turns out you don't have permission to do that" -- so it's like a bartender that takes you through an entire transaction, even verifying irrelevant things (like whether the money is real), while knowing the whole time he can't sell to you.

How is a Mongo replica set like an Iowa voter?

In primary elections, they only vote for candidates they think are electable!

Explanation: Databases can have "replica sets" where there are multiple servers that try to have the same data; secondary servers depend on an agreed-upon "primary" to be the "real" source of data. Oftentimes, the primary server goes down, so the others have to decide on a new primary, in what's known as a "primary election". But there are some restrictions on who they will vote for: if a member has reason to believe that, e.g., a server can't be seen by other members, it will regard that server as unelectable. So you can get funny messages about "server42 won't vote for server45 in primary election because it doesn't think it's electable".