MIT Lincoln Laboratory: DARPA Intrusion Detection Evaluation

Lincoln Laboratory Scenario (DDoS) 1.0

The content and labeling of datasets rely significantly on reports and feedback from consumers of this data. Please send feedback on this dataset to Joshua W. Haines so that your ideas can be incorporated into future datasets.

Overview

This is the first attack scenario example data set to be created for DARPA as a part of this effort. It includes a distributed denial of service attack run by a novice attacker. Future versions of this and other example scenarios will contain more stealthy attack versions.

This attack scenario is carried out over multiple network and audit sessions. These sessions have been grouped into five attack phases, over the course of which the adversary probes, breaks in, installs trojan mstream DDoS software, and launches a DDoS attack against an off-site server.

ADVERSARY: Novice
ADVERSARY_GOAL: Install components for, and carry out, a DDOS attack
DEFENDER: Naive

Description

The premise of the attack is that a relatively novice adversary seeks to show his/her prowess by using a scripted attack to break into a variety of hosts around the Internet, install the components necessary to run a Distributed Denial of Service attack, and then launch a DDOS at a US government site. As a part of the attack, the adversary uses the Solaris sadmind exploit, a well-known Remote-To-Root attack, to gain root access to three Solaris hosts at Eyrie Air Force Base. These attacks succeed because of the relatively poor security model applied at the AFB: many services, including the dangerous "sunrpc" service, are proxied through the base's firewall from outside to inside. The attacker uses the Mstream DDOS tool, one of the less sophisticated DDOS tools. It does not make use of encryption and does not offer as wide a range of attack options as other tools, such as TribeFloodNetwork or Trinoo. An Mstream "server", the software that actually generates and sends the DDOS attack packets, is installed on each of the three victim hosts, while an Mstream "master", the software that provides a user interface and controls the "servers", is installed on one of the victims.

The five phases of the attack scenario are:

IPsweep of the AFB from a remote site
Probe of live IPs to look for the sadmind daemon running on Solaris hosts
Break-ins via the sadmind vulnerability, both successful and unsuccessful, on those hosts
Installation of the trojan mstream DDoS software on three hosts at the AFB
Launching the DDoS

LLS DDoS Scenario Service Plot

The service plot, shown above, illustrates the attack as seen by the "tcpdump_inside" sensor (the sniffer on the "inside" network). Only attack sessions are plotted here, and each phase is denoted in red. The X-axis is time of day in Eastern Standard Time and the Y-axis is the TCP or UDP service. The hash marks indicate the presence of a network session at that time and for that service.

Downloading Offline Sensor Data

The data files can be downloaded by clicking on the links below.

Inside Tcpdump file [ gzipped ]
DMZ Tcpdump file [ gzipped ]
Solaris (2.7) BSM Audit Data from mill [ gzipped ]
Solaris (2.5) BSM Audit Data from pascal [ gzipped ]

Downloading Labeling Data

Mid-Level Labeling, based on Solaris host BSM sensors, for the entire scenario. Each file is a list of XML alerts. Each XML alert represents a BSM "exec" record that is part of the attack.

Mid-Level Labeling, based on Network Tcpdump sensors, by phase of the scenario. Each file is a list of XML alerts. Each XML alert represents a network session that is part of the attack.

Download a tar'd gzip'd file of this entire dataset here

Download Labeling Data by Scenario Phase

PHASE 1: The adversary performs a scripted IPsweep of multiple class C subnets on the Air Force Base. The following networks are swept from address 1 to 254: 172.16.115.0/24, 172.16.114.0/24, 172.16.113.0/24, 172.16.112.0/24. The attacker sends ICMP echo-requests in this sweep and listens for ICMP echo-replies to determine which hosts are "up".
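The sweep range described above can be enumerated with a short sketch, for example to build the set of expected Phase 1 destination addresses when checking labels. This is only an illustration using Python's standard ipaddress module; the names SWEPT_NETWORKS and swept_addresses are hypothetical and are not part of the dataset tooling.

```python
import ipaddress

# Networks swept in Phase 1, as listed in the scenario description.
SWEPT_NETWORKS = [
    "172.16.115.0/24",
    "172.16.114.0/24",
    "172.16.113.0/24",
    "172.16.112.0/24",
]


def swept_addresses(networks=SWEPT_NETWORKS):
    """Return every host address from .1 to .254 in each swept /24."""
    addresses = []
    for cidr in networks:
        # hosts() skips the network (.0) and broadcast (.255) addresses.
        addresses.extend(ipaddress.ip_network(cidr).hosts())
    return addresses


targets = swept_addresses()
print(len(targets))              # 4 networks x 254 hosts = 1016
print(targets[0], targets[-1])   # first and last address in sweep order
```

Any ICMP echo-request in the Phase 1 window whose destination falls in this set is a candidate sweep packet; a destination outside it is likely background traffic.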

Low-Level Labeling

TCPdump_Inside Labeling: Binary TCPDump File, IDMEF Alerts in XML, List of Sessions (1999 Listfile format)
TCPdump_DMZ Labeling: Binary TCPDump File, IDMEF Alerts in XML, List of Sessions (1999 Listfile format)
BSM Audit Labeling for Mill: Binary BSM File, Segmented by time only, Text Praudit -l Output
BSM Audit Labeling for Pascal: Binary BSM File, Segmented by time only, Text Praudit -l Output

PHASE 2: The hosts discovered in the previous phase are probed to determine which of them are running the "sadmind" remote administration tool. This tells the attacker which hosts might be vulnerable to the exploit that he/she has. Each host is probed, by the script, using the "ping" option of the sadmind exploit program, as provided on the Internet by "Cheez Whiz". The ping option makes an RPC request to the host in question, asks what TCP port number to connect to for the sadmind service, and then connects to the port number supplied to test whether the daemon is listening.
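For readers labeling Phase 2 traffic, the port lookup described above can be recognized in a packet payload. The following is a minimal sketch, assuming the lookup is a standard ONC RPC portmapper GETPORT call (program 100000, version 2, procedure 3) and that the payload starts at the RPC header (for RPC over TCP, the 4-byte record marker would have to be stripped first). The value 100232 is the program number commonly registered for sadmind and is stated here as an assumption, as is the version number used in the sample; the function name is hypothetical.

```python
import struct

PMAP_PROG, PMAP_VERS, PMAPPROC_GETPORT = 100000, 2, 3
SADMIND_PROG = 100232  # assumption: usual sadmind RPC program number


def parse_getport_call(payload):
    """Return (program, version, protocol) requested in a portmapper
    GETPORT call, or None if the payload is not such a call."""
    try:
        _xid, msg_type, rpc_vers, prog, vers, proc = struct.unpack_from(
            ">6I", payload, 0)
        expected = (0, 2, PMAP_PROG, PMAP_VERS, PMAPPROC_GETPORT)
        if (msg_type, rpc_vers, prog, vers, proc) != expected:
            return None
        offset = 24
        # Skip credential and verifier: flavor, length, body padded to 4 bytes.
        for _ in range(2):
            _flavor, length = struct.unpack_from(">2I", payload, offset)
            offset += 8 + ((length + 3) & ~3)
        q_prog, q_vers, q_proto, _port = struct.unpack_from(
            ">4I", payload, offset)
    except struct.error:
        # Payload too short to hold a complete call.
        return None
    return q_prog, q_vers, q_proto


# Hand-built sample call with AUTH_NONE credential and verifier.
# Version 10 and protocol 6 (TCP) are illustrative values only.
sample = (
    struct.pack(">6I", 1, 0, 2, PMAP_PROG, PMAP_VERS, PMAPPROC_GETPORT)
    + struct.pack(">4I", 0, 0, 0, 0)
    + struct.pack(">4I", SADMIND_PROG, 10, 6, 0)
)
print(parse_getport_call(sample))
```

A session could then be flagged as a probable sadmind probe when the first element of the returned tuple equals SADMIND_PROG.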

Low-Level Labeling

TCPdump_Inside Labeling: Binary TCPDump File, IDMEF Alerts in XML, List of Sessions (1999 Listfile format)
TCPdump_DMZ Labeling: Binary TCPDump File, IDMEF Alerts in XML, List of Sessions (1999 Listfile format)
BSM Audit Labeling for Mill: Binary BSM File, Segmented by time only, Text Praudit -l Output
BSM Audit Labeling for Pascal: Binary BSM File, Segmented by time only, Text Praudit -l Output

PHASE 3: The attacker then tries to break into the hosts found to be running the sadmind service in the previous phase. The attack script attempts the sadmind Remote-to-Root exploit several times against each host, each time with different parameters. Since this is a remote buffer-overflow attack, the exploit code cannot easily determine the appropriate stack pointer value as it could in a local buffer-overflow. Thus the adversary must try several different stack pointer values, each of which he/she has validated to work on some test machines. There are three stack pointer values attempted on each potential victim. With each attempt, the exploit tries to execute one command, as root, on the remote system. The attacker needs to execute two commands, however: one to "cat" an entry onto the victim's /etc/passwd file and one to "cat" an entry onto the victim's /etc/shadow file. The new root user's name is 'hacker2' and hacker2's home directory is set to be /tmp. Thus, there are 6 exploit attempts on each potential victim host. To test whether or not a break-in was successful, the attack script attempts a login, via telnet, as hacker2, after each set of two break-in attempts. When successful, the attacker's script moves on to the next potential victim.
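The attempt count above (three stack pointer values times two commands) can be laid out as a per-host schedule, which may help when matching labeled sessions to the script's actions. This sketch only enumerates the sequence of events as described; the stack pointer labels are placeholders, the function name is hypothetical, and no exploit logic is involved. Because the script moves on after a successful login, the full schedule is an upper bound for any one host.

```python
# Placeholder labels; the actual stack pointer values are not given in the text.
STACK_POINTERS = ["sp_1", "sp_2", "sp_3"]
# One exploit attempt appends an entry to each of these files.
TARGET_FILES = ["/etc/passwd", "/etc/shadow"]


def exploit_schedule():
    """List the maximum per-host sequence of Phase 3 actions."""
    steps = []
    for sp in STACK_POINTERS:
        for path in TARGET_FILES:
            steps.append(("exploit", sp, path))
        # After each pair of attempts the script tries a telnet login.
        steps.append(("telnet_check", "hacker2", None))
    return steps


steps = exploit_schedule()
exploits = [s for s in steps if s[0] == "exploit"]
checks = [s for s in steps if s[0] == "telnet_check"]
print(len(exploits), len(checks))   # 6 exploit attempts, 3 login checks
```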

Low-Level Labeling

TCPdump_Inside Labeling: Binary TCPDump File, IDMEF Alerts in XML, List of Sessions (1999 Listfile format)
TCPdump_DMZ Labeling: Binary TCPDump File, IDMEF Alerts in XML, List of Sessions (1999 Listfile format)
BSM Audit Labeling for Mill: Binary BSM File, Segmented by time only, Text Praudit -l Output
BSM Audit Labeling for Pascal: Binary BSM File, Segmented by time only, Text Praudit -l Output

PHASE 4: Entering this phase, the attack script has built a list of those hosts on which it has successfully installed the 'hacker2' user. These are mill (172.16.115.20), pascal (172.16.112.50), and locke (172.16.112.10). For each host on this list, the script performs a telnet login, makes a directory on the victim called "/tmp/.mstream/" and uses rcp to copy the server-sol binary into the new directory. This is the mstream server software. The attacker also installs a ".rhosts" file for themselves in /tmp, so that they can rsh in to start up the binary programs. On the first victim on the list, the attacker also installs the "master-sol" software, which is the mstream master. After installing the software on each host, the attacker uses rsh to start up first the master, and then the servers. As they come up, each server "registers" with the master that it is alive. The master writes out a database of live servers to a file called "/tmp/.sr".
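The file and directory names mentioned above can serve as simple search indicators when scanning text renderings of the host audit data. The sketch below does plain substring matching and makes no assumption about the praudit record layout; the function name and the sample lines are hypothetical and are not taken from the dataset. The ".rhosts" indicator is likely to be noisy, since legitimate users may also have such files.

```python
# Names taken from the Phase 4 description.
INDICATORS = ("/tmp/.mstream", "server-sol", "master-sol", "/tmp/.sr", ".rhosts")


def find_indicator_lines(lines, indicators=INDICATORS):
    """Yield (line_number, matched_indicators, line) for matching lines."""
    for number, line in enumerate(lines, start=1):
        hits = [token for token in indicators if token in line]
        if hits:
            yield number, hits, line.rstrip("\n")


# Hypothetical input lines, for illustration only.
sample_lines = [
    "mkdir /tmp/.mstream/\n",
    "ls -l /export/home\n",
    "rcp somehost:server-sol /tmp/.mstream/server-sol\n",
]
for number, hits, text in find_indicator_lines(sample_lines):
    print(number, hits, text)
```

In practice the input would be the lines of a text audit file opened with open(path), and each hit would still need to be checked against the labeled alerts.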

Low-Level Labeling

TCPdump_Inside Labeling: Binary TCPDump File, IDMEF Alerts in XML, List of Sessions (1999 Listfile format)
TCPdump_DMZ Labeling: Binary TCPDump File, IDMEF Alerts in XML, List of Sessions (1999 Listfile format)
BSM Audit Labeling for Mill: Binary BSM File, Segmented by time only, Text Praudit -l Output
BSM Audit Labeling for Pascal: Binary BSM File, Segmented by time only, Text Praudit -l Output

PHASE 5: In the final phase, the attacker manually launches the DDOS. This is performed via a telnet login to the victim on which the master is running, and then, from the victim, a "telnet" to port 6723 of the localhost. Port 6723/TCP is the port on which the master listens for connections to its user interface. After entering a password for the user interface, the attacker is given a prompt at which he/she enters two commands. The command "servers" causes the UI to list the mstream servers which have registered with it and are ready to attack. The command "mstream 131.84.1.31 5" causes a DDOS attack, of 5 seconds' duration, against the given IP address to be launched by all three servers simultaneously. The mstream DDOS consists of a very large number of connection requests to a variety of ports on the victim. All packets have a spoofed, random source IP address. The attacker then logs out. The tiny duration was chosen so that the tcpdump and audit logs of these events would not be too large to distribute easily. In real life, one might expect a DDOS of longer duration, several hours or more.

In the case of this scenario, however, it should be noted that the DDoS does not exactly succeed. The Mstream DDoS software attempts to flood the victim with ACK packets that go to many random TCP ports on the victim host. The Air Force Base firewall, the Sidewinder firewall, is not configured to pass traffic on all these ports, so the only mstream packets that make it through the firewall are those on well-known ports. All other mstream packets result in a TCP reset being sent to the spoofed source address. Thus, in the DMZ dump file, one sees many resets apparently coming from "www.af.mil" going to the many spoofed source addresses. These are actually created by the firewall upon receipt of TCP packets that it is configured not to proxy.
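The reset pattern described above, one apparent source sending TCP resets to many unrelated addresses, can be summarized with a simple fan-out count. This is a heuristic sketch over already-decoded packet records, not a parser for the tcpdump files; the (source, destination, flags) tuple layout, the function name, the threshold, and the sample addresses (drawn from documentation ranges) are all assumptions made for illustration. The source 131.84.1.31 is the attack target named in the scenario and is assumed here to be the address the resets appear to come from.

```python
from collections import defaultdict


def rst_fanout(packets, min_destinations=3):
    """Map each sender of TCP resets to the set of distinct destinations
    it reset, keeping only senders with at least min_destinations."""
    destinations = defaultdict(set)
    for src, dst, flags in packets:
        if "R" in flags:          # flags given as a string such as "R" or "RA"
            destinations[src].add(dst)
    return {src: dsts for src, dsts in destinations.items()
            if len(dsts) >= min_destinations}


# Hypothetical decoded records: (source, destination, TCP flags).
sample_packets = [
    ("131.84.1.31", "192.0.2.17", "R"),
    ("131.84.1.31", "198.51.100.4", "R"),
    ("131.84.1.31", "203.0.113.99", "RA"),
    ("172.16.112.50", "131.84.1.31", "A"),
    ("172.16.115.20", "172.16.112.10", "R"),
]
for src, dsts in rst_fanout(sample_packets).items():
    print(src, len(dsts))
```

A real analysis would use a much higher threshold and a short time window, since ordinary hosts also send occasional resets.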

Low-Level Labeling

TCPdump_Inside Labeling: Binary TCPDump File, IDMEF Alerts in XML, List of Sessions (1999 Listfile format)
TCPdump_DMZ Labeling: Binary TCPDump File, IDMEF Alerts in XML, List of Sessions (1999 Listfile format)
BSM Audit Labeling for Mill: Binary BSM File, Segmented by time only, Text Praudit -l Output
BSM Audit Labeling for Pascal: Binary BSM File, Segmented by time only, Text Praudit -l Output

Notes

These data files were collected over a span of approximately 3 hours on Tuesday, 7 March 2000, from 9:25 AM to 12:35 PM, Eastern Standard Time.
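When aligning packet or audit timestamps with this collection window, it can help to express the window as epoch seconds. The sketch below assumes the stated times are exact to the minute and that Eastern Standard Time is UTC-5 on that date; both are assumptions, and the names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EST = timezone(timedelta(hours=-5), "EST")   # fixed UTC-5 offset

# Collection window as stated in the notes (minute precision assumed).
CAPTURE_START = datetime(2000, 3, 7, 9, 25, tzinfo=EST)
CAPTURE_END = datetime(2000, 3, 7, 12, 35, tzinfo=EST)


def in_capture_window(epoch_seconds):
    """True if a Unix timestamp falls inside the stated collection window."""
    ts = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return CAPTURE_START <= ts <= CAPTURE_END


print(CAPTURE_END - CAPTURE_START)        # 3:10:00
print(int(CAPTURE_START.timestamp()))     # window start as epoch seconds
print(int(CAPTURE_END.timestamp()))       # window end as epoch seconds
```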

The background traffic is of virtually the same content and frequency as was used in the 1999 datasets; therefore, the 1999 datasets may be used to train for this data.

Changes in the background traffic from 1999 to the current year include moving 4 hosts from the 114.0 subnet to the 113.0 subnet and adding the host mill (172.16.115.20) to be the Eyrie AFB Domain Name Service (DNS) server as well as a recipient of telnet traffic.

Background traffic generation was started at 9:00 AM that day to allow spurious network startup traffic to subside.

The Windows NT server hume.eyrie.af.mil had a misconfiguration in the mail server, so there are numerous DNS queries from hume that can be ignored. Hume played no role in the attack scenario except that it was probed in phases one and two.

Additional information is provided on the documentation page.
