Learn about binary hardening with hardening-check

In my last post, I wrote about using Radare2 to measure cyclomatic complexity, giving us a metric for the number of “moving parts” in a binary. From a risk-quantification point of view, this is important because we assume that a more complicated program is easier to coax into misbehaving.

But there are a lot of binaries in a Linux install. Let’s run find against / and see for ourselves. Again, I find myself on my MacBook, so let’s do this from a Docker container, because it’ll be quicker than a VM. This assumes you already have Docker installed and configured.

mb:~ jason$ docker run --rm -ti fedora bash
[root@0600e63539d7 /]# dnf install -q -y file findutils
[root@0600e63539d7 /]# find / -exec file {} \; | grep -i elf | wc -l
1113

Over a thousand, even in a small docker image! Now imagine you’re a security researcher, or maybe a 1337 hax0r, and you want to find a new vulnerability in one of those binaries. Which ones do you try to attack?

In this scenario, we need to be selective about which binaries we want to target, because fuzzing is time consuming. So which ones will give us the best bang for our buck, so to speak? We want to look for low-hanging fruit, and in order to do that we need to identify binaries that are, basically, poorly built.

This process — looking at binaries, scoring them in terms of how much effort it would take to successfully fuzz them and find a new 0-day — is what the Cyber-ITL is all about. And this is what we want to duplicate in the open source world with the Fedora Cyber Test Lab.

There are a number of mechanisms that have been built into Linux over the years to frustrate this process. I find a good way to learn about those mechanisms is to look at the code for a tool called hardening-check.

hardening-check was written by Kees Cook, a kernel security engineer at Google who, from the look of his Twitter profile, is a bit of a Stand Alone Complex fan. The tool came out around 2009, when Kees was working as an Ubuntu security engineer. Since then, hardening-check has been picked up by other distros, and it’s just super-handy.

(In the next several posts, we’ll be referring to the hardening-check Perl source code, which is version controlled here.)

First, let’s install and run the tool to get a feel for it. We’ll also install gcc for the next example.

mb:~ jason$ docker run --rm -ti fedora bash
[root@59c1a1a14181 /]# dnf install -q -y hardening-check gcc
[root@59c1a1a14181 /]# hardening-check /bin/ls
/bin/ls:
 Position Independent Executable: yes
 Stack protected: yes
 Fortify Source functions: yes (some protected functions found)
 Read-only relocations: yes
 Immediate binding: yes

Alright, that looks pretty good. Now let’s write another hello world program, but this time we’re going to deliberately use a notoriously unsafe function.

#include <stdio.h>

int main() {
  char str1[12] = "hello world";
  char str2[12];
  sprintf(str2, "%s", str1);
  return 0;
}

For now, we compile it without any extra flags, then run hardening-check.

[root@59c1a1a14181 /]# gcc hello_world.c
[root@59c1a1a14181 /]# hardening-check a.out
./a.out:
 Position Independent Executable: no, normal executable!
 Stack protected: no, not found!
 Fortify Source functions: no, only unprotected functions found!
 Read-only relocations: yes
 Immediate binding: no, not found!

Not so great. But we can ask gcc to help us out.

[root@59c1a1a14181 /]# gcc -D_FORTIFY_SOURCE=2 -O2 -fpic -pie -z now -fstack-protector-all hello_world.c
[root@59c1a1a14181 /]# hardening-check a.out
./a.out:
 Position Independent Executable: yes
 Stack protected: yes
 Fortify Source functions: yes
 Read-only relocations: yes
 Immediate binding: yes

That’s better. Now our silly little program is much harder to leverage in an attack.

This brings up a lot of questions, though. Why isn’t every binary compiled with those flags? Why don’t we just tweak gcc so it always applies those protections by default? For that matter, what do all those checks mean? Can they be defeated by attackers?
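We’ll dig into each check properly later, but to give a flavor of what they look at: the PIE check ultimately comes down to inspecting the ELF header. Here’s a minimal sketch in Python — my own approximation for illustration, not hardening-check’s actual Perl logic — that assumes a little-endian ELF and simply tests whether e_type is ET_DYN, which is how PIE executables are typically marked (shared libraries are ET_DYN too, so this is only a rough heuristic):

```python
import struct

ET_EXEC = 2  # fixed-address executable
ET_DYN = 3   # position-independent executable (or shared object)

def looks_pie(path):
    """Rough PIE heuristic: check whether the ELF e_type field is ET_DYN."""
    with open(path, "rb") as f:
        header = f.read(18)
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_type is a 16-bit field at offset 16; little-endian assumed here
    (e_type,) = struct.unpack_from("<H", header, 16)
    return e_type == ET_DYN
```

Running this against /bin/ls on a modern Fedora should agree with hardening-check’s PIE verdict, though the real tool parses readelf output and handles far more corner cases.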

Stay tuned as we dig into each of those questions, and explore how they apply to our Cyber Test Lab.

Measuring cyclomatic complexity with Radare2

Cyclomatic complexity is a metric that’s used to measure the complexity of a program. It’s one of many binary metrics tracked by the Cyber-ITL, and calculating it is a good first step in repeating their results.

Radare2 makes this super easy, and with r2pipe we can start building out our Python SDK for the Fedora Cyber Test Lab’s open source implementation of the Cyber-ITL’s approach.

macOS Environment Setup

It probably makes more sense to do this work on my Fedora box, but I happen to be on my MacBook. So we’ll use macOS as our environment for this post, which is, honestly, way more difficult.

First, let’s set up Homebrew. Note that in order to do so cleanly, we first want to give my user full control over /usr/local. I prefer to use facls for this rather than changing ownership, which seems clumsy. Be sure to replace “jason” with your username.

sudo chmod -R +a "jason allow list,add_file,search,delete,add_subdirectory,delete_child,readattr,writeattr,readextattr,writeextattr,readsecurity,writesecurity,chown,file_inherit,directory_inherit" /usr/local

/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

I also like to use the Homebrew version of Python, so we’ll install that as well.

brew install python virtualenv radare2
virtualenv -p /usr/local/bin/python ENV
source ENV/bin/activate
pip install r2pipe

We’re setting up a virtual environment here so we can isolate our installed libraries, but also to ensure we don’t run into PYTHONPATH issues, which is easy to do on macOS with multiple Python interpreters installed.

Simple analysis with r2

Radare2 (pronounced like “radar”) is a fantastic tool and, to be honest, I’ve only scratched its surface. Watching an experienced reverse engineer use r2 is like watching Michelangelo paint. Ok, I’ve never seen that, but I assume it was hella cool.

So let’s write a simple program and analyze it with r2. This assumes you have the Xcode Command Line Tools installed, which is where we’re getting gcc.

#include <stdio.h>

int main() {
  printf("hello world\n");
  return 0;
}

That should look familiar to everybody. Now let’s compile it, analyze it, and ask r2 to calculate its cyclomatic complexity.

(ENV) mb:~ jason$ gcc hello_world.c
(ENV) mb:~ jason$ r2 a.out
syntax error: error in error handling
syntax error: error in error handling
syntax error: error in error handling
 -- For a full list of commands see `strings /dev/urandom`
[0x100000f60]> aaa
[x] Analyze all flags starting with sym. and entry0 (aa)
[x] Analyze len bytes of instructions for references (aar)
[x] Analyze function calls (aac)
[ ] [*] Use -AA or aaaa to perform additional experimental analysis.
[x] Constructing a function name for fcn.* and sym.func.* functions (aan))
[0x100000f60]> afcc
1
[0x100000f60]> q

Not super-interesting: there’s just a single path the code can take, so when we use the afcc command, or “analyze function cyclomatic complexity,” we get back 1.

Using the formula from the Wikipedia article, we can sanity-check that result.

M = E − N + 2P,
where
E = the number of edges of the graph.
N = the number of nodes of the graph.
P = the number of connected components.

We get

M = 0 – 1 + 2 * 1
M = 1

For our purposes, P (the number of connected components) will almost always be 1, so we can treat the 2P term as the constant 2.
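To keep ourselves honest as the graphs grow, we can encode McCabe’s formula directly. A throwaway sketch (the function name is my own):

```python
def cyclomatic_complexity(edges, nodes, components=1):
    """McCabe's formula: M = E - N + 2P."""
    return edges - nodes + 2 * components

# The hello-world graph above: no edges, one node, one component.
print(cyclomatic_complexity(0, 1))  # 1
```

We’ll reuse this to double-check r2’s afcc output on the next couple of examples.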

Ok, let’s add some more complexity.

#include <stdio.h>

int main() {
	int a = 1;
	if( a == 1 ){
		printf("hello world\n");
	}
	return 0;
}

Compile it and analyze it.

(ENV) mb:~ jason$ gcc hello_world2.c
(ENV) mb:~ jason$ r2 a.out
syntax error: error in error handling
syntax error: error in error handling
syntax error: error in error handling
 -- Rename a function using the 'afr <newname> @ <offset>' command.
[0x100000f50]> aaa
[x] Analyze all flags starting with sym. and entry0 (aa)
[x] Analyze len bytes of instructions for references (aar)
[x] Analyze function calls (aac)
[ ] [*] Use -AA or aaaa to perform additional experimental analysis.
[x] Constructing a function name for fcn.* and sym.func.* functions (aan))
[0x100000f50]> afcc
2
[0x100000f50]> agv
[0x100000f50]> q

Ok, with a conditional in our code, the complexity went up. But let’s use the command “agv”, or “analyze graph web/png”, to get a nice graphical representation of the function graph.

Now let’s use our formula, M = E – N + 2. Three edges, three nodes.

M = 3 – 3 + 2
M = 2

So that tracks. Now once more with an extra conditional statement.

#include <stdio.h>

int main() {
	int a = 1;
	if( a == 1 ){
		printf("hello world\n");
	}
	if(a == 0){
		printf("goodbye world\n");
	}
	return 0; 
}

Lather, rinse, repeat.

(ENV) mb:~ jason$ gcc hello_world3.c
(ENV) mb:~ jason$ r2 a.out
syntax error: error in error handling
syntax error: error in error handling
syntax error: error in error handling
 -- Use '-e bin.strings=false' to disable automatic string search when loading the binary.
[0x100000f20]> aaa
[x] Analyze all flags starting with sym. and entry0 (aa)
[x] Analyze len bytes of instructions for references (aar)
[x] Analyze function calls (aac)
[ ] [*] Use -AA or aaaa to perform additional experimental analysis.
[x] Constructing a function name for fcn.* and sym.func.* functions (aan))
[0x100000f20]> afcc
3
[0x100000f20]> agv
[0x100000f20]> q

Verifying once again, six edges, five nodes:

M = 6 – 5 + 2
M = 3

You get the idea. These are really trivial examples, and when you ask r2 to analyze complex binaries, it can take a really long time. For fun, try the docker runtime, and see what a pain it is to deal with statically linked Go binaries.

Now let’s use r2pipe to do this from Python.

#!/usr/bin/env python

import sys

import r2pipe

r2 = r2pipe.open(sys.argv[1])
r2.cmd("aaa")           # run radare2's standard analysis pass
cc = r2.cmdj("afcc")    # cyclomatic complexity of the current function
print(cc)

And now we can do it from Python!

(ENV) mb:~ jason$ python r2cc.py a.out
syntax error: error in error handling
syntax error: error in error handling
syntax error: error in error handling
3

BTW, those errors are a known issue and you can ignore them on macOS for now.
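Once the per-binary call works, batching it up is straightforward. Here’s a sketch of where our Python SDK is headed (my own illustrative helpers, assuming r2pipe is installed and radare2 is on your PATH; the ranking helper is pure Python):

```python
def r2cc(path):
    """Cyclomatic complexity of a binary's entry function, per radare2."""
    import r2pipe  # deferred so rank_by_complexity() works without radare2
    r2 = r2pipe.open(path)
    r2.cmd("aaa")           # standard analysis pass
    cc = r2.cmdj("afcc")    # analyze function cyclomatic complexity
    r2.quit()
    return cc

def rank_by_complexity(scores):
    """Sort a {path: complexity} mapping, most complex first."""
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Feed r2cc() a list of binaries, collect the scores into a dict, and rank_by_complexity() gives you a rough target list for the next section’s question: which binaries have the most moving parts?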

Just for fun, let’s look at some macOS binaries.

(ENV) mb:~ jason$ python r2cc.py /bin/ls
syntax error: error in error handling
syntax error: error in error handling
syntax error: error in error handling
36
(ENV) mb:~ jason$ python r2cc.py /usr/sbin/chown
syntax error: error in error handling
syntax error: error in error handling
syntax error: error in error handling
31
(ENV) mb:~ jason$ python r2cc.py /usr/sbin/postfix
syntax error: error in error handling
syntax error: error in error handling
syntax error: error in error handling
24
(ENV) mb:~ jason$ python r2cc.py /bin/bash
syntax error: error in error handling
syntax error: error in error handling
syntax error: error in error handling
177

Crazy that ls and chown are more complex than Postfix! And Bash is far more complex than either. Think of this value as a representation of the number of moving parts in a machine. The more complex it is, the more likely it is to break.

But what about the quality of those parts in the machine? A complex but well-engineered machine will work better than a simple piece of junk. And so far, we’re only examining the machine while it’s turned off. That’s static analysis. We also need to watch the machine work, which is dynamic analysis. The Cyber-ITL has been doing both.

Stay tuned as we follow them down the quantitative analysis rabbit hole.

Defense in Depth is cool, and you should attend on October 3rd

For five years now, Red Hat and Intel have hosted a small, but very cool security conference called “Defense in Depth (DiD)” in Tyson’s Corner, VA. Its popularity has been increasing, and this year is a sort of watershed event in the show’s history. DiD has, in the past, been very focused on the security of Red Hat’s products; this year we’re casting a wider net around the security of many open source communities.

We even have an infosec A-lister keynoting — none other than David Kennedy of DerbyCon. In case you haven’t been watching CNN, Fox News, or other high-profile media outlets’ reporting on infosec, Dave’s kind of a big deal. His security roots run deep, and I’m super pumped to hear his keynote, “The Changing Tactics of Hackers,” which will talk about how only the first T in TTP tends to change, which can be handy for developing a counter-strategy.

The whole agenda looks cool, but here are some of the talks I’m particularly stoked about:

There are still conference passes available, so if you get geeked on open source security, register here.

Introducing the Fedora Red Team

Last week my colleagues and I started a new Special Interest Group under the Fedora Project: the Fedora Red Team.

The Fedora Red Team’s goal is to become Red Hat’s upstream cybersecurity community. But what does that actually mean?

“Cyber” is a fascinating term with a colorful history. I’ve written about it before, but the punchline is that we owe its ubiquity to William Gibson’s Neuromancer and an obscure US Government paper from the late 90s, referred to as The Marsh Report.

The context of The Marsh Report seems to have stuck — cyber has come to refer to the high-stakes interplay of offensive and defensive infosec, often at nation-state scale. If you want to get ontological, you could say “cyber” includes the three sub-domains of Computer Network Operations (CNO) — Computer Network Defense, Computer Network Exploitation, and Computer Network Attack.

So why a Fedora Red Team? My colleagues and I needed a place to work on offensive tooling, exploit curation, standards, and reference architectures — but we wanted to do it “the open source way.” A Fedora SIG gives us a community place to fail-fast these projects, a few of which I’ll mention here.

The Enterprise Linux Exploit Mapper (ELEM): as Mike Bursell wrote in his blog, many system administrators find themselves unable to patch. The CVE scoring system helps admins decide when to really push for patching, but many CVE descriptions contain language like “this vulnerability may allow an attacker to execute arbitrary code.” And there’s a reason for that — many vulnerabilities don’t have workable POCs. But what about the ones that do? ELEM makes it easy to map vulnerabilities on a local system to known exploits out in the wild. From a defensive perspective it creates a sort of super-criticality for admins so they can say to their management, “somebody can download this exploit and root our box right this minute.” A tool like this has good offensive value as well, and could save pentesters time doing these mappings manually.

The Fedora Cyber Test Lab (FCTL): I’ve written before about the Cyber-ITL and how important it is to the future of infosec. My only complaint is that their TTPs are not open source, and are not repeatable. So let’s fix that. This project will be an open source, automated, fully repeatable project for quantifying risk at a per-binary level. It will focus at first on RHEL, CentOS, and Fedora, but we’d love help from the community with adding other OS targets. I have a rudimentary first version ready to push to GitHub, which I’ll be blogging about in the coming days.

Penetration Testing Execution Standard (PTES): I’ve written before about how much I love PTES. In my mind, it’s a really important component of the art of pentesting. How can you tell a client that you’ve tested their security according to industry best practices without a yardstick by which to measure those practices? Without such a standard, pentesting relies too much on personal bona fides and flashy marketing. A standard like PTES can fix that. The only problem is that it hasn’t really been updated since 2014. Last summer, I rejiggered the wiki markup into reStructuredText and put it on Read the Docs, which makes it easier to build community participation. The project is ready to be resurrected, and I hope that we’ll be able to work with the original team. But worst-case, we can fork it and go from there. This isn’t necessarily bad, and the PTES founders may have their reasons for wanting to let the project stay as it is. The Red Team SIG should know by early October which direction we’ll be taking.

I’m excited about this SIG, and will be haunting #fedora-security on Freenode IRC, as well as the security@lists.fedoraproject.org list. Every first Friday of the month we’ll be having meetings in IRC at 1400 UTC. Please join the conversation, or send us your pull requests for our GitHub projects.

Also, if you’re in Northern Virginia on 3 October 2017, we’ll be presenting ELEM at Defense in Depth. Register here and geek it up with us face-to-face!

Disclaimer: while I am talking about work stuff, this is my personal blog, and these views are my own and not my employer’s.

Save your AWS budget with Python and boto

My team and I lean heavily on AWS services for prototypes, demos, and training. The challenge we’ve encountered is that it’s easy to forget about the resources you’ve spun up. So I wrote a quick little utility that shuts down unnecessary EC2 instances at night.

The Python library, boto, provides an AWS SDK. It’s very easy to use, and many good tutorials exist. Instructions can be found in the README, but here’s a quick overview.

First we import the boto and yaml libraries. (We’re using YAML for our config file markup.) Then we read in that config file.

import boto.ec2
import yaml

config_file = '/etc/nightly_shutdown.yml'
with open(config_file) as f:
    config = yaml.safe_load(f)

In that config file, we’ve got our region, access and secret keys, and a whitelist of instance IDs we’d like to opt out of the nightly shutdown. This last bit is important if you have instances doing long-running jobs, like repo syncing.

---
region: us-east-1
access_key: eggseggseggseggs
secret_key: spamspamspamspam
whitelist:
  - i-abcdefgh
  - i-ijklmnop

Now we connect to the AWS API and get a list of reservations. This itself is interesting, as it gives us a little insight into the guts of EC2. As I understand it, a reservation must exist before an instance can be launched.

conn = boto.ec2.connect_to_region(config['region'],
                                  aws_access_key_id=config['access_key'],
                                  aws_secret_access_key=config['secret_key'])

reservations = conn.get_all_reservations()

Now it’s simply a matter of iterating over those reservations, getting the instance IDs, and filtering out the white-listed IDs.

running_instances = []
for r in reservations:
    for i in r.instances:
        if i.state == "running":
            if i.id not in config['whitelist']:
                running_instances.append(i.id)

Finally, we make the API call to stop the instances. Before doing so, we check that there are actually instances to stop, as this call will throw an exception if the instance ID list is empty.

if len(running_instances) > 0:
    conn.stop_instances(instance_ids=running_instances)

Now you just have to add this to your daily cronjobs and you’ll save a little budget.
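For example, a crontab entry along these lines (the install path and shutdown time are just illustrative) would stop the stragglers at 10pm every night:

```
# m h dom mon dow command
0 22 * * * /usr/local/bin/nightly_shutdown.py
```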

Cyber-ITL Round up

I was talking to a friend about the Cyber-ITL. His reaction was, “Wat?” So in case you missed it, an important thing is happening. EDIT: the BlackHat video was DMCAed. Here’s the Def Con version instead, which is better anyway.

Mudge and his wife, Sarah, presented this at BlackHat and Def Con this year.

If you watch only one video in November, make it this one. This is extremely important, and plays a big part in things to come.

Related:

The Cyber-ITL site itself is a little sparse; Mudge has been slowed down a bit by health problems. But there are a few good articles to read: