Friday, May 31, 2019

Solving my first (Ghidra) reverse engineering challenge

I was pointed to this challenge by this article, and had heard about the NSA's new Ghidra reverse engineering tool. I was able to solve it without reading much of the article! Here's what I did.

Setup: They give you a compiled ELF binary that runs a program that prompts you for a password. It will tell you if you guessed correctly.

I was going to use this as a change to learn the Ghidra tool, with help from the article.

First problem: It's compiled to run on Linux Debian x86. I was doing it on a MacBook. So, it wouldn't run.

I noticed that Ghidra offers you the option to export the binary as C code (which makes sense, as part of Ghidra's reverse engineering is to convert a binary into assembly and slightly-more-readable C-like code). So, I figured I could just compile that to run on my Mac.

It didn't work though, since Ghidra exports an incomplete version of the code that uses some invalid types, and doesn't define all of its labels properly, which required a lot of manual work that was increasingly appearing like it would take too long.

So I figured I'd just run the binary in a Docker container. I found the Debian i386 version and pulled it down. (I had a long side diversion here getting a setup so I could edit files on my machine that would appear as I want in the Docker container, but the details are uninteresting besides this clever hack for getting the "copy file" command into your docker one-liner.)

So, I was able to run the binary.

Second problem: Somehow, it thought I was using a "debuguer" -- which I would be, at some point, but is strange, because I wasn't using one yet:

Don't use a debuguer !

I looked through the decompiled code to see where it might be doing this and found:

lVar1 = ptrace(PTRACE_TRACEME,0,1,0);
if (lVar1 < 0) {
    puts("Don\'t use a debuguer !");
    /* WARNING: Subroutine does not return */

Hm, okay, well that "if" statement maps to this part of the assembly:

0804868c 79 11 JNS LAB_0804869f

That is, according to this handy reference, "Jump if not sign." Well, whatever "sign" is, I want it to do the opposite -- change "JNS" to "JS", so it "jumps if sign" and skips that block -- the one that gives the mean message and exits the program.

I haven't figured out a good way to format that line, but 0804868c is the (hexadecimal) location within the binary, 79 11 is the hex version of the binary command itself, JNS the assembly term (just a mnemonic device) for that command, and LAB_0804869f is the argument passed to the command -- in this case, a label for where to jump to in the program.

According to that spec, the JNS command maps to 79, and if I want it to be JS, I need to replace it with 78. But...

Third problem: I don't have an easy way to edit binaries.

A convenient way to look at them is as a hex(adecimal) dump in a "hex editor", but I hadn't done that for a few months. Hex editors are useful because they show you the raw hex, the offset where it appears in the file, and, off to the side, the ASCII/text equivalent of that hex. Fortunately, I found this comment, showing how to repurpose my text editor, vim, to double as an easy hex editor. I can just open it up, edit the hex values, save, and it updates everything.

You can then run a utility, "gobjdump", to see the code as assembly and verify that e.g. JNS changed to JS as expected.

Fortunately, when I ran that edited binary, it did what I wanted: not accuse me of using a debugger, and proceed with the rest of the program.

The rest of the hurdles are similar: there's some if-test that can send the execution into a block that you don't want it to. Fortunately, this binary is structured ... stupidly, from a security perspective. It has that same kind of if-block for validating whether you entered the right password. Right afterward, if the check succeeded, it prints the password. Not your input, no -- it prints the correct password, which you can later validate on an unmodified run.

So, it's just a matter of tweaking the code so that the you always enter the block that outputs the password. (They were smart enough not to have the password itself appear as an obvious string in the binary, at least.)

Once I saw the output, I could submit it at the challenge site and very it was correct.

Now for something harder!

Tuesday, February 13, 2018

NAND to Tetris: Nothing short of awesome

Been a while since I posted, huh?

Well, here's an interesting development in my life: I finally started the NAND to Tetris course (part 2) on Coursera, which is based on the instructors' site.

I can say, whole-heartedly, that the course is awesome, and the first thing in a long time that has gotten me into a flow state.

The idea behind the course is this: they take you through (virtually) building a computer from a very basic level, showing all the abstraction layers that produce a computer's behavior.

It starts with logic gates -- specifically, the NAND gate (hence the name), since it's universal. Your first project is to use a hardware simulator to build several other logical gates from NAND. (NAND is just an AND gate with the outputs inverted, so that true and true yield false, everything else yields true.) The second project is to to build the arithmetic logic unit (ALU) in a CPU out of the components, so you basically have a configurable circuit: based on six control bits, you perform one of several possible functions on two 16-bit inputs.

The next project incorporates "flip flops", which allow you to repeatedly execute that circuit while also writing to and reading from some some persisted memory. Your inputs to the circuit then function as a kind of machine code.

The project after that then has you implement functions in that machine code, writing in an assembly language the authors created for the course that has a precise mapping to the machine code inputs in the CPU from the previous lesson. I can honestly say it was a really fun project to implement multiplication directly in assembly language!

Later on you write a compiler from a high level language into assembly (which then converts simply into machine code), in a way that's broken into two steps: a compiler from the high level language into a virtual machine that works on a stack with memory blocks, and a compiler from the virtual machine commands to assembly. I just recently finished that latter part (which comes first).

Those layers then build up to making a game that runs on your CPU, then an OS, and some other stuff I haven't delved into.

In the course of the projects, you use a hardware simulator, an assembler (for converting the assembly language into machine code), a CPU simulator, and a virtual machine simulator.

In order to catch up with the class an have enough leeway for a weekend trip, I went through much of the course over the weekend, and enjoyed every minute of it! I especially like how the break the projects into manageable pieces. For example, in the VM-to-assembly part, it first has you implement simple push/pop/add operations on a stack and run a test to verify that you can compile those commands. Further test suites give you a manageable set of operations to add.

I've written an actual compiler now! (albeit limited use...)

(Consider how fun this is, and how naturally it comes to me, maybe I picked the wrong field.)

One interesting challenge is that the CPU only has two registers call A and D, and only the A register is used when accessing memory, to know which word of memory to look at. It took me a while to figure out how you could say "look at the value in the memory location refered to in the memory location zero." Before I saw how you could do it, I implemented one project by having two separate code branches for the two possible values at the memory location!

Would love to link my code for the projects, but they discourage that to leave the challenge for others to solve.

Monday, November 21, 2016

Another Slashdot memory: Ah, the brick-and-mortar analogies...

So, I remembered another funny Slashdot exchange (again, no link).

Story: Some online retailer got in trouble for filtering their "customer reviews" so that only the positive remarks (about listed products) stayed and everything else was deleted.

A: "Wow, that's pretty scummy. They don't have the right to just clip out negative reviews."

B: "Don't they? I mean, it's their site; they have the right to set whatever editorial standards they want."

C: "Sure, but there's still an issue of consumer fraud and deception. For a brick-and-mortar analogy, imagine that Barnes and Noble started hosing 'book discussion nights' at their stores and promoted it as such. But you quickly notice that whenever someone says something negative about a book sold by B&N -- and only those books -- that person gets a tap on the shoulder from security, pulled aside, and asked to leave.

"In that case, it would indeed be correct to say they can expel whoever they want, but it's still fundamentally fraudulent to represent that event as being for 'book discussion' rather than 'book promotion'."

In other news, I finally found one of the ones that I thought I couldn't! The Armadillo rocket failure mentioned in this post was actually this conversation. The actual (but truncated) exchange went like this (note the links to original comments):

A: "And to think, they want us all to ride in these things commercially...."

B: "John and his team have an excellent track record thus far, and have continued to make safety a main issue. I'm sure that this experience will teach them even more, helping to make the next flight even safer."

C: "You mean even safer than a huge orange fireball?

"I don't know, that's a pretty high bar."

Saturday, February 27, 2016

Some of my geeky tech jokes -- with explanations!

I know the line: explaining a joke is like dissecting a frog; you understand it better, but it dies. Still, not everyone will get these, and I figure I might as well have a place where you at least get a chance. So here are some of my own creations, explained.

Girl, you make me feel like a fraudulent prover in a stochastic interactive zero-knowledge proof protocol ... because I really wish I had access to your random private bits!

Explanation: In a stochastic zero-knowledge proof protocol, there is a prover and a verifier, where the former wants to convince the latter of something. But for proof to work, the verifier must give the prover unpredictable challenges. Think of it like a quiz in school -- it's not much of a quiz if you know the exact questions that will be on it.

The information to predict the challenges is known as the verifier's private random bits Those with a legit proof don't need this, but a fraudulent prover does. Thus, a fraudulent prover in a stochastic interactive zero-knolwedge proof protocol wants access to the verifier's "random private bits".

A historian, a geologist, and a cryptographer are searching for buried treasure. The historian brings expertise on practices used by treasure hiders, the geologist brings expertise on ideal digging places, and the cryptographer brings expertise on hidden messages.

Shortly after they start working together, the cryptographer announces, "I've found it!!"

The others are delighted: 'Where is it?'

The cryptographer says, "It's underground."

'Okay, but where underground?'

"It's somewhere underground!"

'But where specifically?'

"I don't know, but I know it's underground!"

'Slow down there. If all you know is that it's underground, then in what sense did you "find" anything? We're scarcely better off than when we started!'

"Give me a break! I just gave you an efficiently-computable distinguishing attack that separates the location of the treasure from the output of a random oracle. What more could you want?"

Explanation: In cryptography, an encryption scheme is considered broken if an attacker can find some pattern to the encrypted message -- i.e. they can identify telltale signs that it wasn't generated by a perfect random number generator, a "random oracle". Such a flaw would be called a "distinguishing attack". So in the cryptography world, they don't care if the attack actually allows you to decrypt the message; they stop as soon as they find non-randomness to the encrypted data. Applied to a treasure hunt, this means they would give up as soon as they conclude that the treasure location is non-random, which the cryptographer here things s/he's done simply by concluding that it's "underground".

So, 16-year-old Johnnie walked into an Amazon Web Services-run bar...

"Welcome," said the bartender. "What are you drinking?"

Johnnie replied, 'What've you got?'

"Well, we have a selection of wines and the beers you see right here on tap. But if you prefer, we also have club soda and some juices."

Johnnie thought, Wait a second. Why is he telling me about the wines and beers? Does he even realize ... ?

'Okay, I'll take the Guinness.'

"Bottle or draft?"


"Alright, and how will you be paying?"

Johnnie only had large bills from his summer job and gave the bartender a C-note.

"Sorry, but I gotta check to make sure this is real." The bartender took out a pen and marked it, then counted out the change. Johnnie reached for the beer.

"Hold on a second! Make sure to use a coaster!" The bartender slipped one under the glass. "Okay, now enjoy!"

Johnnie lifted up the glass to drink. Before he was able to sip, the bartender swatted it out of his hand.

"WHAT ARE YOU THINKING!?! Don't you know 16-year-olds can't drink!"

Explanation: On the AWS site, they will gladly let you click on the "Launch server" button and go through numerous screens and last-minute checks to configure it, and only at the very last stage does it say, "oops, turns out you don't have permission to do that" -- so it's like a bartender that takes you through a entire transaction, even verifying irrelevant things (like whether the money is real), while knowing the whole time he can't sell to you.

How is a Mongo replica set like an Iowa voter?

In primary elections, they only vote for candidates they think are electable!

Explanation: Databases can have "replica sets" where there are multiple servers that try to have the same data; secondary servers depend on an agreed-upon "primary" to be the "real" source of data. Often times, the primary server goes down, so they have to decide on a new primary, known as a "primary election". But there are some restrictions on who they will vote for -- if they e.g. have reason to believe that a server can't be seen by other members, and in those cases it will regard that server as unelectable. So you can get funny messages about "server42 won't vote for server45 in primary election because it doesn't think it's electable".

Saturday, February 20, 2016

More funny and insightful Slashdot posts I remember

If you liked the last post on this here are some more you might like. This time, I think they're more insightful than funny, but often times, insight is funny!

Also, I thought I'd point out the value of forums: in all of these exchanges, it seems to take three people to get the "aha" moment, not just one or two.

Topic: some new superstrong carbon nanotube material is discovered.

A: So it seems like they're planning to use this stuff for armor. But what about weapons? It seems that anything strong enough to make good armor would also have value as a weapon.

B: That doesn't follow at all! They're completely different use-cases. For example, leather is known to make good armor, but you never see a leather sword!

C: Sure you do -- it's called a "whip" and we dig them up all the time!

Topic: Some dating site is blocked from emailing University of Texas students (at their school email) because of too much spam.

A: Well, their defense is that they're complying with the CAN-SPAM Act, in that they have an unsubscribe link, use their real name, etc., so UT can't just block them wholesale.

B: What? That doesn't matter; that's just the *minimum* requirement for sending such emails. Obviously, any domain owner can impose whatever extra restrictions they want!

C: Yeah, it's like saying, "I have a valid driver's license. I am the legal owner of this vehicle. I have paid the appropriate taxes and registered it. I hold the required liability insurance, and the vehicle is in good working order. I have complied with all applicable traffic laws and operated it in a safe manner. Therefore, I have the right to park on your lawn."

I actually use C's last example when explaining to people the difference between authentication (proving who you are) and authorization (what that identity is allowed to do).

If the minimum wage prohibitions are so easily circumvented ...

The recent Talia Jane story just made me realize we have a possible inconsistently in policy. To get you up to speed, Jane took a low-wage job in the San Francisco Bay Area, hoping to work her way up to her passion of being a social media manager for a major company. But because of rental prices, she paid 85% of per post-tax pay just for rent (!), complained about her employer paying so little, and then was fired.

But as for the inconsistency:

Illegal: paying someone below $X/hour.

Legal: paying someone ($X + $Y)/hour (Y positive) to work in a place where their discretionary income would place them in extreme poverty (e.g. 85% of post-tax on rent).

And yes, that's just an (arguably trivial) corollary of "minimum wage (and tax brackets for that matter) is not automatically cost-of-living-adjusted". But if the goal is to stop people from being taken advantage of with low job offers that hold them in poverty, that seems like a pretty big loophole.

And it's not just that -- let's say someone moves farther out to be able to afford to live there. Then they're traveling an extra N hours just to make each shift which should rightly count against their effective hourly wage.

So, food for thought: what are we really trying to optimize for here? What would the law have to look like to not just avoid these loopholes, but "carve reality at the joints" such that it's fundamentally impossible to scalably circumvent such a law?

If you keep raising the minimum wage for a locality, and people keep commuting greater distances to get that income, what have you accomplished?

Thursday, January 7, 2016

Funny Slashdot exchanges, before they're lost to time

In the time that I was a regular reader of Slashdot, I saw a few exchanges that stayed in my mind. I later went back to find them, but was never able to. So that they're not lost to time, I figured I'd post all the ones I remember. What follows is from memory, and prettied up a bit. (Not trying to plagiarize, if you can find the original post for any of these, let me know.)


[Story: Armadillo Aerospace has a failed rocket launch.]

A: Well, I think we can close the books on Carmack's little project.
B: Come on, now. Private space travel is still in its infancy. There are growing pains. Not everything works the first time. But what's important, is that we're learning from these events. Armadillo is learning. They'll adapt. And the next voyage will be better and safer!
C: You mean, even safer than a big orange fireball?

A: [long rant] So that's the problem with this ban on incandescent light bulbs.
B: Whoa whoa whoa, slow down. There is no "ban" on incandescent light bulbs. It's just that the government passed new efficiency standards, and incandescents don't meet them.
C: Oh, that's clever! I should try that some time: "See, I'm not breaking up with you! I'm just raising my standards to the point where you no longer qualify."

[Story: a pedophile was caught because he took pictures of his acts and tried to blur out the victims' faces, but police analysts were able to unblur them.]

A: Hah! What an amateur! Everyone knows you have to do a true Gaussian blur to destroy the information content of the picture!
B: Yeah, or entropize it by blacking out the whole face.
C: Right. Or, you know, you could just ... not molest children.

(IIRC, C was heavily voted down and criticized for assuming guilt.)

[Story: police used "big data" analytics techniques and discovered that most robberies occur on paydays near check-cashing places, which allowed them to ramp up arrests.]

A: I don't know, this seems kind of big-brothery...
B: Not at all! This is the kind of police work we should applaud! Working only off publicly available, non-private data, they found real, actionable correlations. It wasn't just some bigoted cop working off his gut: "Oh, this must be where the thugs go ..." No, they based it on real data. What's more, it let them avoid the trap of guessing the wrong paydays, which can actually vary! Some people get paid weekly, some biweekly, some of the 1st and 15th. For example, I get paid on the 7th and 21st.
C: So, uh ... where do you cash your checks, by chance?

Wednesday, December 30, 2015

What every programmer *really* should know about ...

So there's a common kind of article that comes up a lot, and annoys me. It's the "What every programmer should know" or "needs to know" about this or that. About time, or names, or memory, or solid state drives or floating point arithmetic, or file systems, or databases.

Here's a good sampling on Hacker News of the kinds of things someone out there is insisting you absolutely have to know before you can knock out "Hello, World", or otherwise meet some higher standard for being a "real programmer".

I hate these articles.

For any one of them, you can find about 99% who don't already know something from the article, and yet they manage to somehow be productive enough that someone is paying to produce a product of value. To me, they come off as, "I want to be important, so I want to make it seem like my esoteric knowledge is much more important than it really is." If taken seriously, these articles would heavily skew your priorities.

So here's my standard for when you're justified in saying "what every programmer needs to know about X":

Think about the tenacious, eager kid from your neighborhood. He picks up on things really quickly. Once he learned to program, he was showing off all kinds of impressive projects. But he also makes a lot of rookie mistakes that don't necessarily affect the output but would be a nightmare if he tried to scale up the app or extend its functionality. Things you, in your experience, with your hard-earned theoretical knowledge, would never do. When you explain it to him, he understands what you mean quickly enough, and (just barely) patches that up in future versions.

What are those shortcomings? What are those insights, that this kid is lacking? What of this kid's mistakes would you want to give advice about to head off as soon as possible? That's what every programmer needs to know.

The rest? Well, it can wait until your problem needs it.