| description | Evidence Formats Reference |
|---|
This document lists every evidence type EvidenceForge can generate, where to find it in the output, and any known limitations.
One generation run emits one output target. The tree below shows default, SOF-ELK®, and Splunk target-specific files where they differ; they are not emitted together.
    output/
      GROUND_TRUTH.json  # Canonical machine-readable ground-truth document
      GROUND_TRUTH.md  # Human-readable answer key rendered from the JSON document
      OBSERVATION_MANIFEST.json  # Source-observation manifest for eval
      OUTPUT_TARGET.txt  # "default", "sof-elk", or "splunk"; missing legacy marker means default
      ENVIRONMENT.md  # Optional student-facing environment description
      data/  # Generated logs for every output target
        <hostname.domain>/  # Per-host directories (FQDN)
          windows_event_security.xml  # Windows Security XML document, or splunk XML event stream
          windows_event_sysmon.xml  # Sysmon XML document, or splunk XML event stream
          ecar.json  # Simulated EDR telemetry in eCAR format (NDJSON)
          syslog.log  # Linux syslog (default/splunk target; RFC5424)
          bash_history/<username>.bash_history  # Per-user bash history (Linux only)
          web_access.log  # Web server access log on web_server hosts
          proxy_access.log  # Forward proxy access log on forward_proxy hosts
          <year>/windows_event_security_snare.log  # Windows Security Snare/RFC3164 (sof-elk target)
          <year>/windows_event_sysmon_snare.log  # Sysmon Snare/RFC3164 (sof-elk target)
          <year>/syslog.log  # Linux syslog (sof-elk target; RFC3164)
        <sensor-name>/  # Per-sensor directories (network)
          conn.json  # Zeek conn.log (NDJSON)
          dns.json  # Zeek dns.log
          http.json  # Zeek http.log
          ssl.json  # Zeek ssl.log
          files.json  # Zeek files.log
          ...  # Other Zeek logs
        <ids-sensor-name>/  # Per-IDS-sensor directories
          snort_alert.log  # Snort/Suricata IDS alerts
        <fw-hostname>/  # Per-firewall-sensor directories
          cisco_asa.log  # Cisco ASA firewall syslog (default/splunk target)
          <year>/cisco_asa.log  # Cisco ASA firewall syslog (sof-elk target)
eforge generate --target default|sof-elk|splunk selects the on-disk rendering and
layout inside the generated data/ directory for tools that expect different
formats. Scenario YAML and --formats remain canonical: request
windows_event_security, windows_event_sysmon, syslog, cisco_asa, and so
on, then choose the target at generation time.
When OUTPUT_TARGET.txt is missing, eforge eval treats the dataset as
legacy/default output.
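A downstream script can apply the same fallback when it inspects a dataset. The following is a minimal sketch, not the logic inside eforge eval; the function name `read_output_target` and the choice to raise on an unknown value are assumptions made for illustration.

```python
from pathlib import Path

# The three values this document lists for OUTPUT_TARGET.txt.
VALID_TARGETS = {"default", "sof-elk", "splunk"}


def read_output_target(output_dir):
    """Return the output target recorded for a generated dataset."""
    marker = Path(output_dir) / "OUTPUT_TARGET.txt"
    if not marker.exists():
        # A missing marker is treated as legacy/default output.
        return "default"
    target = marker.read_text(encoding="utf-8").strip()
    if target not in VALID_TARGETS:
        # Assumption: a reader script prefers to fail loudly on unknown values.
        raise ValueError(f"unexpected output target: {target!r}")
    return target
```

A caller can branch on the returned string to pick the matching parser for each file under data/.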
For practical ingestion and validation guidance by target, see
Output Target Ingest Guides.
Target-specific behavior in V1:
| Canonical format | default target | sof-elk target | splunk target |
|---|---|---|---|
| windows_event_security | <host>/windows_event_security.xml rooted XML document | <host>/<year>/windows_event_security_snare.log | <host>/windows_event_security.xml as one <Event> per line |
| windows_event_sysmon | <host>/windows_event_sysmon.xml rooted XML document | <host>/<year>/windows_event_sysmon_snare.log | <host>/windows_event_sysmon.xml as one <Event> per line |
| syslog | <host>/syslog.log as RFC5424 | <host>/<year>/syslog.log as RFC3164/BSD | <host>/syslog.log as RFC5424 |
| cisco_asa | <firewall>/cisco_asa.log | <firewall>/<year>/cisco_asa.log | <firewall>/cisco_asa.log |
| Zeek | <sensor>/<logtype>.json only when Zeek sensors are configured | Unchanged | Unchanged |
| Proxy, web access, IDS, eCAR, bash history | Unchanged | Unchanged | Unchanged |
Default target file: <hostname.domain>/windows_event_security.xml
Default target format: XML (<Events><Event>...</Event></Events>)
SOF-ELK target file: <hostname.domain>/<year>/windows_event_security_snare.log
SOF-ELK target format: Snare-style Windows Event Log fields inside an RFC3164 syslog envelope
Splunk target file: <hostname.domain>/windows_event_security.xml
Splunk target format: XML event stream (one complete <Event>...</Event> per line)
Provider: Microsoft-Windows-Security-Auditing (except 1102)
Channel: Security
The default target emits one rooted XML document. The sof-elk target emits
Snare syslog only so SOF-ELK and other syslog/Snare-aware tools can parse the
same canonical Windows Security events without requiring binary EVTX files. The
splunk target reuses the same XML event content as default output but removes
the global <Events> wrapper so Splunk file monitoring can ingest each Windows
event as a separate record on Linux.
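Because the default and splunk renderings carry the same event content, one reader can handle both. The sketch below uses only the Python standard library; the helper names `iter_events` and `event_id` are hypothetical, and the namespace handling is defensive because this document does not state whether the generated XML declares the Windows event namespace.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}Event' -> 'Event'."""
    return tag.rsplit("}", 1)[-1]


def iter_events(text):
    """Yield <Event> elements from a rooted document or a per-line stream."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        # Several top-level elements: treat the text as one <Event> per line.
        for line in text.splitlines():
            if line.strip():
                yield ET.fromstring(line)
        return
    if local_name(root.tag) == "Events":
        # default target: children of the global <Events> wrapper
        yield from (child for child in root if local_name(child.tag) == "Event")
    else:
        # A stream that happens to contain a single event
        yield root


def event_id(event):
    """Return the integer EventID of an <Event>, or None if absent."""
    for element in event.iter():
        if local_name(element.tag) == "EventID":
            return int(element.text)
    return None
```

Reading the file as text and passing it to `iter_events` gives the same sequence of events for either target.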
| Event ID | Name | Category | Notes |
|---|---|---|---|
| 1102 | Security Log Cleared | Defense Evasion | Different provider (Microsoft-Windows-Eventlog). Uses <UserData> instead of <EventData>. Level=4, Keywords=0x4020. |
| 4624 | Successful Logon | Authentication | Version 2 format. Includes ImpersonationLevel, VirtualAccount, ElevatedToken, TargetLinkedLogonId. LogonTypes: 2 (interactive), 3 (network), 5 (service), 7 (unlock), 10 (RDP), 11 (cached). IPv4 rendered as ::ffff:x.x.x.x. |
| 4625 | Failed Logon | Authentication | Version 0. Keywords=0x8010 (Audit Failure). Includes Status/SubStatus failure codes. Remote failed-auth attempts use established/reset-after-payload network evidence rather than SYN-only probes. |
| 4634 | Logoff | Authentication | Paired with 4624 via matching TargetLogonId. Generated for interactive sessions (type 2/10) at work-day end and for type 3 network logons (including machine account logons on DCs) after short delays. |
| 4648 | Explicit Credentials | Lateral Movement | Fires when RunAs, PsExec, WMIC, or scheduled tasks use alternate credentials. Emitted on the source system. |
| 4672 | Special Privileges Assigned | Privilege Use | Auto-emitted alongside the target-host 4624 for elevated accounts. Privilege lists are selected from data-driven service/admin/UAC profiles in windows_auth_realism.yaml. |
| 4688 | Process Created | Execution | Version 2. Includes CommandLine, ParentProcessName, MandatoryLabel. TokenElevationType indicates UAC status. |
| 4689 | Process Exited | Execution | Paired with 4688. Status always 0x0. |
| 4697 | Service Installed | Persistence | ServiceFileName can contain full command lines. ServiceType 0x10=Own Process. |
| 4698 | Scheduled Task Created | Persistence | TaskContent contains HTML-escaped XML task definition. |
| 4699 | Scheduled Task Deleted | Persistence | Same field structure as 4698. |
| 4700 | Scheduled Task Enabled | Persistence | Same field structure as 4698. No sample data verification (MS docs only). |
| 4701 | Scheduled Task Disabled | Persistence | Same field structure as 4698. No sample data verification (MS docs only). |
| 4720 | User Account Created | Account Management | Full account property fields (25+). Most default to "-". |
| 4723 | Password Change Attempt | Account Management | User changing own password. Can be Audit Failure (0x8010) if policy rejects. |
| 4724 | Password Reset Attempt | Account Management | Admin resetting another user's password. Minimal fields. |
| 4726 | User Account Deleted | Account Management | Minimal fields (Subject + Target + PrivilegeList). |
| 4728 | Member Added to Global Group | Privilege Escalation | e.g., adding user to Domain Admins. |
| 4729 | Member Removed from Global Group | Privilege Escalation | No sample data verification (identical structure to 4728). |
| 4732 | Member Added to Local Group | Privilege Escalation | e.g., adding user to local Administrators. |
| 4733 | Member Removed from Local Group | Privilege Escalation | |
| 4738 | User Account Changed | Account Management | Has unique leading Dummy field (always "-"). Full account property fields. |
| 4756 | Member Added to Universal Group | Privilege Escalation | e.g., Enterprise Admins. |
| 4757 | Member Removed from Universal Group | Privilege Escalation | No sample data verification (identical structure to 4756). |
| 4768 | Kerberos TGT Request | Authentication | Keywords reflect success/failure based on Status field. Successful TGTs use data-driven PreAuthType/TicketOptions/encryption distributions; PKINIT (PreAuthType=15) populates CertIssuerName/CertSerialNumber/CertThumbprint. |
| 4769 | Kerberos Service Ticket | Authentication | TargetUserName includes @DOMAIN suffix. Keywords reflect success/failure. |
| 4770 | Kerberos TGT Renewal | Authentication | Always success. |
| 4771 | Kerberos Pre-Auth Failed | Credential Access | Keywords always 0x8010 (Audit Failure). Key indicator for password spraying. |
| 4776 | NTLM Credential Validation | Authentication | Field names: TargetUserName (not LogonAccount), Workstation (not SourceWorkstation). Status reflects validation success or failure. |
| 5156 | WFP Connection Permitted | Network | Application path uses device format (\device\harddiskvolume1\...). Direction: %%14592=Inbound, %%14593=Outbound. |
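Two details from the table are easy to exercise in analysis code: 4624/4634 pairing through TargetLogonId, and the ::ffff:x.x.x.x rendering of IPv4 addresses. The sketch below assumes events have already been flattened into dictionaries keyed by field name (that shape, and the function names, are assumptions for illustration).

```python
import ipaddress


def normalize_ip(value):
    """Return the IPv4 form of an IPv4-mapped IPv6 address, else the input."""
    try:
        address = ipaddress.ip_address(value)
    except ValueError:
        return value  # e.g. "-" when no source address is recorded
    mapped = getattr(address, "ipv4_mapped", None)
    return str(mapped) if mapped is not None else str(address)


def pair_sessions(events):
    """Pair 4624 logons with 4634 logoffs that share a TargetLogonId.

    Returns (pairs, still_open), where pairs is a list of (logon, logoff).
    """
    open_logons = {}
    pairs = []
    for event in events:
        logon_id = event.get("TargetLogonId")
        if event.get("EventID") == 4624:
            open_logons[logon_id] = event
        elif event.get("EventID") == 4634 and logon_id in open_logons:
            pairs.append((open_logons.pop(logon_id), event))
    return pairs, list(open_logons.values())
```

Logons left in `still_open` are sessions with no logoff inside the analyzed window.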
Known Limitations:
- EventRecordIDs use probabilistic gaps (15% chance +2-8, 3% chance +20-200) rather than correlating with unlogged events
- Execution ProcessID for auth events uses the lsass.exe PID; for process/WFP events uses the System process (PID 4, now properly registered)
- Account management events (4720-4738) and group membership events (4728-4757) require storyline triggers; they are not generated in baseline activity
- SubjectDomainName correctly uses "NT AUTHORITY" for SYSTEM, NETWORK SERVICE, and LOCAL SERVICE accounts
- 4648 (explicit credentials) fires in baseline for scheduled task execution with randomized counts (2-5/hour) plus storyline lateral movement
- Successful logons, failed logons, logoffs, service logons, machine-account logons, anonymous logons, NTLM validation, and workstation lock/unlock evidence route through the internal auth/session bundles so Windows Security, Linux syslog, EDR/eCAR, DC validation, lock state, and companion network evidence share session IDs, source endpoints, and lifecycle ordering.
- DC-side Kerberos 4768/4769/4770/4771 evidence routes through the internal Kerberos/DC bundle so ticket timing, source IP/port, TGT cache behavior, service-principal identity, and companion KDC network evidence stay aligned.
- Windows audit/account-management events route through the internal Windows audit bundle so subject LogonID/session ownership, target account/group identity, scheduled-task XML, log-clear subject identity, and Sysmon/eCAR thread/process-access context stay aligned.
- Canonical connections route through the internal network-connection bundle so Zeek, EDR/eCAR FLOW, proxy/firewall/IDS companions, DNS/TLS/HTTP/file metadata, endpoint process ownership, and Windows WFP rows share one tuple, source port, hostname, UID/state, and visibility decision.
- Domain controllers receive admin-only baseline activity: type 3 logons from RSAT sessions (mmc.exe runs on the admin workstation, not the DC), type 10 RDP for direct admin access, and no user desktop sessions (no browsers, Office, or user profile artifacts)
- RSAT sessions produce correlated cross-host events: mmc.exe + DLL loads on the workstation, LDAP/RPC connections from workstation to DC, and a type 3 logon on the DC — all within seconds
Default target file: <hostname.domain>/windows_event_sysmon.xml
Default target format: XML (<Events><Event>...</Event></Events>)
SOF-ELK target file: <hostname.domain>/<year>/windows_event_sysmon_snare.log
SOF-ELK target format: Snare-style Windows Event Log fields inside an RFC3164 syslog envelope
Splunk target file: <hostname.domain>/windows_event_sysmon.xml
Splunk target format: XML event stream (one complete <Event>...</Event> per line)
Provider: Microsoft-Windows-Sysmon
Channel: Microsoft-Windows-Sysmon/Operational
The default target emits one rooted XML document. The sof-elk target emits
Snare syslog only and eforge eval maps both variants back to the canonical
windows_event_sysmon format bucket.
| Event ID | Name | Category | Notes |
|---|---|---|---|
| 1 | ProcessCreate | Execution | Version 5. Enriches 4688 with file hashes (SHA1/MD5/SHA256/IMPHASH), FileVersion, Description, Product, Company, OriginalFileName, ParentCommandLine. Hashes are deterministic fakes seeded from image path + hostname. ParentCommandLine is populated from the parent process's actual command line in StateManager (e.g., powershell.exe, cmd.exe /k, Code.exe --folder-uri ...). ParentImage reflects realistic parent-child relationships driven by spawn_rules.yaml — CLI tools parent from shells, GUI apps from explorer.exe, system services from services.exe/svchost.exe. |
| 5 | ProcessTerminate | Execution | Version 3. Emitted alongside Security 4689 and eCAR PROCESS/TERMINATE for the same process exit. Storyline processes terminate with realistic delays based on command type (recon: 0.3-5s, attack tools: 5-30s, persistent/C2: no termination). Fields: ProcessGuid, ProcessId, Image, User. |
| 8 | CreateRemoteThread | Defense Evasion | Version 2. Detects process injection. Source and target process GUIDs, thread start address, StartModule, and StartFunction. Baseline generates benign noise (1-3/hr) from Defender, CSRSS, svchost. Correlated with eCAR THREAD/REMOTE_CREATE. |
| 10 | ProcessAccess | Credential Access | Version 3. Detects credential dumping (e.g., mimikatz accessing lsass.exe). Includes GrantedAccess mask, CallTrace. Baseline generates benign noise (3-8/hr) from Defender, CSRSS, Services.exe. Correlated with eCAR PROCESS/OPEN. |
Known Limitations:
- ProcessGuid is deterministic from (hostname, PID, process creation time), so Events 1/3/5/7/8/10/11/12/13/22 agree for the same known process. The rendered shape follows Sysmon-style machine/time/token morphology rather than RFC UUID version bits.
- File hashes are fake but consistent (same binary on same host always produces same hash)
- Sysmon Event 1 is emitted alongside Security 4688 for the same process creation; both emitters handle process_create events
- Process create/terminate lifecycle and process-owned file/module/registry/network side effects are coordinated through the internal process-execution bundle so endpoint sources share parent/session identity and source-visible ordering.
- Implemented events focus on the project evidence model: 1, 3, 5, 7, 8, 10, 11, 12, 13, and 22.
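Since Events 1 and 5 agree on ProcessGuid for the same process, a consumer can derive process lifetimes by grouping on that field. This is a sketch under assumptions: events are pre-parsed dictionaries, and each carries the standard Sysmon UtcTime field in `YYYY-MM-DD HH:MM:SS.fff` form, which this document does not explicitly confirm for generated output.

```python
from datetime import datetime

UTC_TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"  # assumed Sysmon UtcTime layout


def process_lifetimes(events):
    """Map ProcessGuid -> seconds between Event 1 and Event 5.

    The value is None for processes with no ProcessTerminate event.
    """
    started = {}
    lifetimes = {}
    for event in events:
        guid = event["ProcessGuid"]
        when = datetime.strptime(event["UtcTime"], UTC_TIME_FORMAT)
        if event["EventID"] == 1:
            started[guid] = when
            lifetimes[guid] = None
        elif event["EventID"] == 5 and guid in started:
            lifetimes[guid] = (when - started[guid]).total_seconds()
    return lifetimes
```

Processes that keep a value of None match the persistent/C2 case that the Event 5 notes describe as having no termination.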
File: <sensor-name>/<logtype>.json
Format: NDJSON (one JSON object per line)
Zeek logs are per-sensor. Which connections appear depends on sensor placement (SPAN/TAP), monitored segments, and direction. All Zeek logs for the same connection share a common UID. If no Zeek sensors are configured, EvidenceForge does not emit Zeek logs.
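The shared UID makes cross-log joins straightforward. A minimal sketch, assuming the standard Zeek JSON field name `uid`; the helper names are hypothetical.

```python
import json
from collections import defaultdict


def load_ndjson(text):
    """Parse NDJSON text into a list of dictionaries, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def join_by_uid(conn_rows, other_rows):
    """Attach rows from another Zeek log to each conn row with the same uid."""
    by_uid = defaultdict(list)
    for row in other_rows:
        by_uid[row["uid"]].append(row)
    return [(conn, by_uid.get(conn["uid"], [])) for conn in conn_rows]
```

For example, joining conn.json with http.json from the same sensor directory lists every HTTP transaction carried on each connection. Rows from different sensors should be joined per sensor, since visibility depends on placement.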
| Log Type | File | Description | Notes |
|---|---|---|---|
| conn.log | conn.json | Connection metadata | TCP, UDP, ICMP. Includes duration, bytes, packets, conn_state, history. |
| dns.log | dns.json | DNS queries/responses | A, AAAA, PTR, SRV, TXT, MX, NS, and SOA query types. Automatic connection-prerequisite lookups route through the internal DNS lookup bundle so resolver choice, cache behavior, TTL observations, Zeek DNS/conn fan-out, Sysmon DNS visibility, and companion resolver questions stay consistent with connection hostnames. MX generation avoids CDN-style hostnames; TXT covers SPF/DKIM/DMARC-style background lookups. NXDOMAIN for suffix search. AA flag for internal zones. |
| http.log | http.json | HTTP transactions | Method, URI, status code, user-agent, response body length, and Zeek trans_depth. Only for port 80 TCP connections. Browser/page-load sessions can reuse one UID for multiple same-flow transactions; file-analyzed responses include resp_fuids/resp_mime_types vectors linked to files.log. |
| ssl.log | ssl.json | TLS handshakes | TLS version, cipher suite, SNI server_name, and cert_chain_fuids linking to x509 certificates. Generated for port 443 connections. Certificate-chain depth is driven by tls_realism.yaml. |
| files.log | files.json | File transfers | Extracted from HTTP responses, OCSP responses, and substantial SMB transfers. Uses Zeek-native tx_hosts, rx_hosts, and conn_uids arrays plus fuid, optional filename for SMB, MIME type, byte counts, and md5/sha1/sha256 when the matching analyzer ran. Transfer metadata is built through the internal file-transfer bundle path so FUIDs, hashes, filenames, direction, byte counts, and optional PE analysis stay coordinated. Large/download-scale HTTP responses attach this metadata deterministically; smaller eligible HTTP bodies remain sampled. SMB thresholds, filename templates, and MIME/analyzer mix are driven by smb_file_transfers.yaml. |
| dhcp.log | dhcp.json | DHCP transactions | Client address, MAC (diversified OUI from network_params.yaml), hostname. Acquisition and renewal route through the internal DHCP lease bundle so Zeek DHCP/conn rows and Linux dhclient syslog companions share one lease identity. DHCP broadcast is treated as link-local: visible to SPAN sensors on the client segment, not routed through unrelated TAP/firewall segments. |
| ntp.log | ntp.json | NTP synchronization | Server-response records with version, mode 4, stratum, poll interval, and timing fields. NTP rows are emitted only when the matching UDP/123 conn row is response-bearing, so Zeek UID, conn_state/history, bytes, packets, duration, and parser timing agree. Version and poll are stable per client/server association, while stratum, ref-id, precision, root delay, and root dispersion are owned by the responding server. Scenario-defined internal/domain NTP servers are preferred; public fallback servers come from network_params.yaml. |
| x509.log | x509.json | X.509 certificates | Leaf and intermediate certificate id/fingerprint, subject/issuer, validity (issuer-aware from tls_issuers.yaml), key info, and CA constraints. Intermediate CA certificate profiles are reused by subject/issuer so the same CA does not appear as many different certificates in one dataset. |
| weird.log | weird.json | Protocol anomalies | Unusual network behavior. Automatic weird generation is currently disabled pending a data-driven Zeek weird compatibility model; explicitly supplied WeirdContext events still render. |
| pe.log | pe.json | Portable Executable | Windows binary metadata over network. |
| ocsp.log | ocsp.json | OCSP responses | Certificate revocation responses whose id joins to files.log fuid, matching Zeek file-analysis semantics. |
| packet_filter.log | packet_filter.json | BPF filter changes | Zeek packet filter status. |
| reporter.log | reporter.json | Zeek internal messages | Zeek operational status. |
Known Limitations:
- No SMB-specific Zeek log (smb_files.log, smb_mapping.log) — SMB traffic appears in conn.log, substantial transfers can appear in files.log, and file-server activity can also produce host-side eCAR FILE records
- No SMTP log — email traffic appears in conn.log only
- http.log only for port 80; HTTPS content is not decrypted (as expected)
- missed_bytes is probabilistic (~3% of long TCP connections) rather than derived from actual packet capture
- All timestamps use 6-digit microsecond precision
File: ecar.json
Format: NDJSON
Simulated EDR telemetry rendered in MITRE CAR-based eCAR format. Represents what an EDR agent would observe.
Record structure: Every eCAR record contains pid and tid as always-present top-level integers (-1 = unavailable). ppid appears on PROCESS events only. The properties map contains event-specific key-value pairs where all values are strings (including ports).
Entity correlation (objectID/actorID graph): Each record carries a persistent objectID (UUID) that identifies the entity being acted upon. Entity lifecycle events share the same objectID — e.g., a PROCESS/CREATE and PROCESS/TERMINATE for the same process, or a USER_SESSION/LOGIN and USER_SESSION/LOGOUT for the same session. The optional actorID field links to the objectID of the entity that performed the action — e.g., a PROCESS/CREATE's actorID points to its parent process's objectID, and a FILE/CREATE's actorID points to the process that created it.
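The objectID/actorID links form a graph that can be walked to recover a process tree. A minimal sketch, assuming each record is a dictionary and that the object type and action are stored under the keys `object` and `action` (those two key names are assumptions; objectID and actorID are as described above).

```python
def process_ancestry(records, object_id):
    """Return [process, parent, grandparent, ...] as a list of objectIDs.

    Only processes with a PROCESS/CREATE record in `records` are included.
    """
    parent_of = {
        record["objectID"]: record.get("actorID")
        for record in records
        if record.get("object") == "PROCESS" and record.get("action") == "CREATE"
    }
    chain = []
    seen = set()
    current = object_id
    # Stop at the first ancestor without a CREATE record, or on a cycle.
    while current in parent_of and current not in seen:
        seen.add(current)
        chain.append(current)
        current = parent_of[current]
    return chain
```

The same lookup pattern applies to FILE/CREATE records, whose actorID points at the creating process.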
| Object Type | Actions | Notes |
|---|---|---|
| PROCESS | CREATE, TERMINATE, OPEN | CREATE/TERMINATE include pid, ppid, image_path, parent_image_path, command_line, user. Correlated with syslog for CRON jobs and systemd service start/stop on Linux. OPEN maps to Sysmon Event 10 (ProcessAccess) — includes granted_access, target_pid, target_image_path, and target_process_uuid in properties. |
| THREAD | REMOTE_CREATE | Maps to Sysmon Event 8 (CreateRemoteThread). Properties include src_pid, target_pid, target_process_uuid, start_address, and stack addresses matching OpTC eCAR format. Thread ID, target PID, and start address are generated once in RemoteThreadContext and rendered consistently across Sysmon and eCAR. |
| FILE | READ, CREATE, WRITE, DELETE | Generated alongside process activity, baseline SMB file-server access, and modeled transfer receiver evidence such as SCP target-side file creation. |
| FLOW | CONNECT | Network connections from host perspective. Includes src/dst IP, port, protocol. |
| REGISTRY | MODIFY | Windows registry operations. |
| MODULE | LOAD | DLL loads for Windows processes using the same process-aware DLL profile data as Sysmon ImageLoaded events. |
| USER_SESSION | LOGIN, LOGOUT | Logon/logoff events. LOGIN includes outcome (success or failure); Windows successful logons include logon_type, while non-Windows sessions use OS-native session_type values such as ssh, remote, local, or service. Failed attempts include failure_reason/status fields and do not imply an established session. |
| SERVICE | CREATE | Service installation. Correlated with Windows 4697. Includes service_name, image_path (binary path), service_account in properties. |
Known Limitations:
- eCAR format represents an optional EDR layer — not all systems may have it enabled
- FLOW events carry the initiating system process pid when endpoint attribution is available (svchost for DNS/NTP, lsass for Kerberos/LDAP, System PID 4 for SMB, mstsc.exe for RDP); pid/tid fields are omitted when unavailable instead of rendering placeholder IDs
- Limited EDR object diversity on Linux (mainly PROCESS + USER_SESSION)
- File paths cycle through a small set of templates
Default target file: <hostname.domain>/syslog.log
Default target format: RFC5424 syslog with full timestamp year
SOF-ELK target file: <hostname.domain>/<year>/syslog.log
SOF-ELK target format: RFC3164/BSD syslog with PRI
Splunk target file: <hostname.domain>/syslog.log
Splunk target format: RFC5424 syslog with full timestamp year
Authentication and system logs from Linux hosts. The default target emits
flat per-host RFC5424 syslog for SIEM-neutral output. The splunk target keeps
that RFC5424 shape. The sof-elk target emits a BSD/RFC3164 envelope
(<PRI>MMM DD HH:MM:SS HOST APP[PID]: MESSAGE) and
partitions files by event year so SOF-ELK can recover the timestamp year from
the archive path. eforge eval accepts both current target variants plus older
legacy RFC5424 and flat BSD/RFC3164 files. All generated syslog entries are
rendered from SyslogContext on SecurityEvent — the emitter doesn't derive
messages from other contexts. Multi-phase activities such as SSH sessions are
coordinated by action-bundle semantics above individual SecurityEvents: the
bundle owns lifecycle, ordering, source timing, and shared identities, while each
syslog row remains a distinct canonical occurrence. Remote Linux sshd
failed-password rows reuse the same source port as the companion Zeek SSH
connection tuple.
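For the sof-elk rendering, a reader has to decode the PRI value and take the year from the directory name, since the BSD timestamp carries none. The sketch below is one way to do that with the standard library; the regular expression and function names are illustrative assumptions, not the parser used by eforge eval or SOF-ELK.

```python
import re
from datetime import datetime
from pathlib import Path

# <PRI>MMM DD HH:MM:SS HOST APP[PID]: MESSAGE  (PID is optional)
BSD_LINE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<ts>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<app>[^\[:\s]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<msg>.*)$"
)


def year_from_path(path):
    """Read the event year from a <host>/<year>/syslog.log style path."""
    return int(Path(path).parent.name)


def parse_bsd_syslog(line, year):
    """Parse one RFC3164-style line; return None if it does not match."""
    match = BSD_LINE.match(line)
    if match is None:
        return None
    pri = int(match["pri"])
    timestamp = datetime.strptime(f"{year} {match['ts']}", "%Y %b %d %H:%M:%S")
    return {
        "facility": pri // 8,  # PRI = facility * 8 + severity
        "severity": pri % 8,
        "timestamp": timestamp,
        "host": match["host"],
        "app": match["app"],
        "pid": int(match["pid"]) if match["pid"] else None,
        "message": match["msg"],
    }
```

Passing `year_from_path(path)` as the `year` argument restores full timestamps for each partitioned file.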
| Program | Description | Notes |
|---|---|---|
| sshd | SSH authentication | Accepted/Failed password, session opened/closed, pam_unix messages. |
| systemd | Service management | Started/stopped service units. |
| systemd-logind | Login sessions | New session, removed session. |
| CRON | Scheduled tasks | cron job execution. |
| kernel | Kernel messages | UFW firewall blocks, uptime, hardware. |
| sudo | Privilege escalation | Command execution via sudo. |
| su | User switching | Switch user events. |
| systemd-timesyncd | NTP sync | Time synchronization status. |
| snapd | Snap packages | Ubuntu snap daemon messages. |
Known Limitations:
- Limited program variety (~9 programs vs 30+ on real servers)
- No application-specific logs (nginx, postfix, mysql, etc.) even when services are declared
- No SSH protocol negotiation messages (key exchange, cipher selection) before auth
- Bash history may be sparse relative to SSH session duration
File: <hostname.domain>/bash_history/<username>.bash_history
Format: Timestamped bash history (#<epoch>\n<command>)
Per-user command history for Linux systems. Baseline SSH sessions to Linux servers generate organic admin commands (ls, df, ps, systemctl, etc.) for realistic admin users (sysadmin, help_desk, developer, security_analyst personas), creating per-user history files on all Linux hosts. Storyline process events inject 0-3 organic noise commands around each attack command for realistic interleaving. Bash-history timing and optional foreground process telemetry are coordinated by the internal Linux shell-command bundle so command text, source-visible timing, and endpoint process evidence stay aligned.
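The `#<epoch>` then `<command>` layout can be read back into timestamped entries with a few lines of Python. This is a minimal sketch with a hypothetical function name; it treats each non-comment line as one command, so a multi-line command would surface as several entries.

```python
from datetime import datetime, timezone


def parse_bash_history(text):
    """Return a list of (timestamp, command) tuples.

    The timestamp is a timezone-aware UTC datetime, or None when a command
    line is not preceded by a '#<epoch>' marker.
    """
    entries = []
    pending = None
    for line in text.splitlines():
        if line.startswith("#") and line[1:].isdigit():
            pending = datetime.fromtimestamp(int(line[1:]), tz=timezone.utc)
        elif line:
            entries.append((pending, line))
            pending = None
    return entries
```

Sorting the combined entries from several users by timestamp gives a per-host command timeline that can be lined up with sshd session rows in syslog.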
Known Limitations:
- No command typos, tab-completion artifacts, or repeated commands
- No command output or error messages
File: snort_alert.log
Format: Snort fast alert format
Network intrusion detection alerts. Baseline generates false-positive alerts (e.g., ICMP PING, SSH scan, policy violations) correlated with Zeek conn records via canonical SecurityEvent dispatch. Storyline generates true-positive alerts for malicious connections. IDS signature-to-context construction is owned by the internal IDS alert action bundle so Snort/Suricata rows render canonical network/DNS/HTTP evidence rather than independently inventing alert payloads.
Web scan events (web_scan storyline type) generate three layers of IDS alerts:
- Scanner UA detection — identifies the scanning tool by user-agent (non-TLS only)
- Per-path content alerts — curated SID mappings for specific probe paths (non-TLS only)
- Connection-rate threshold — generic scan-rate alerts (both TLS and non-TLS)
Alert format: [gid:sid:rev] where gid defaults to 1, sid identifies the rule, and rev reflects real ET/Community ruleset revision numbers sourced from sample_data/snort/. Each (gid, sid) pair has stable rule identity and carries a rev field.
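The rule identity can be pulled out of each alert line with a single pattern. The sketch below extracts only the `[gid:sid:rev]` triple and leaves the rest of the fast-alert line untouched; the function names are hypothetical.

```python
import re
from collections import Counter

RULE_ID = re.compile(r"\[(\d+):(\d+):(\d+)\]")


def rule_identity(line):
    """Return {'gid', 'sid', 'rev'} as integers, or None if no triple is found."""
    match = RULE_ID.search(line)
    if match is None:
        return None
    gid, sid, rev = (int(part) for part in match.groups())
    return {"gid": gid, "sid": sid, "rev": rev}


def count_by_rule(lines):
    """Count alerts per (gid, sid) pair, ignoring lines without a triple."""
    counts = Counter()
    for line in lines:
        identity = rule_identity(line)
        if identity is not None:
            counts[(identity["gid"], identity["sid"])] += 1
    return counts
```

Counting by (gid, sid) rather than by message text keeps the grouping stable even if two rules share similar wording.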
Known Limitations:
- IDS alert variety is limited to curated SID pools (not full ruleset simulation)
Default target file: <fw-hostname>/cisco_asa.log
SOF-ELK target file: <fw-hostname>/<year>/cisco_asa.log
Splunk target file: <fw-hostname>/cisco_asa.log
Format: Cisco ASA syslog (RFC 3164 BSD syslog with ASA message IDs)
Cisco ASA firewall logs for permitted and denied connections. Produced by firewall-type network sensors with cisco_asa in their log_formats. Each permitted connection generates a Built + Teardown pair; denied connections generate a single Deny record.
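The Built + Teardown pairing can be verified by joining records on the connection ID. The sketch below covers the TCP and UDP message shapes shown in this section's example records; the regular expressions and the function name are illustrative assumptions, and other message IDs are simply skipped.

```python
import re

ASA = re.compile(r"%ASA-(?P<sev>\d)-(?P<msg_id>\d{6}): (?P<text>.*)$")
BUILT = re.compile(
    r"Built (?:inbound|outbound) (?:TCP|UDP) connection (?P<conn>\d+) "
)
TEARDOWN = re.compile(
    r"Teardown (?:TCP|UDP) connection (?P<conn>\d+) .*?"
    r"duration (?P<h>\d+):(?P<m>\d{2}):(?P<s>\d{2}) bytes (?P<bytes>\d+)"
)


def summarize_connections(lines):
    """Map connection ID -> {'closed', and when closed: 'duration_s', 'bytes'}."""
    summary = {}
    for line in lines:
        head = ASA.search(line)
        if head is None:
            continue
        text = head["text"]
        built = BUILT.match(text)
        if built:
            summary[int(built["conn"])] = {"closed": False}
            continue
        down = TEARDOWN.match(text)
        if down and int(down["conn"]) in summary:
            seconds = int(down["h"]) * 3600 + int(down["m"]) * 60 + int(down["s"])
            summary[int(down["conn"])] = {
                "closed": True,
                "duration_s": seconds,
                "bytes": int(down["bytes"]),
            }
    return summary
```

Connections that remain `closed: False` have a Built record without a matching Teardown in the analyzed file.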
| Message ID | Severity | Protocol | Description |
|---|---|---|---|
| 302013 | 6 (info) | TCP | Built inbound/outbound TCP connection |
| 302014 | 6 (info) | TCP | Teardown TCP connection (with duration, bytes, reason) |
| 302015 | 6 (info) | UDP | Built inbound/outbound UDP connection |
| 302016 | 6 (info) | UDP | Teardown UDP connection |
| 302020 | 6 (info) | ICMP | Built inbound/outbound ICMP connection |
| 302021 | 6 (info) | ICMP | Teardown ICMP connection |
| 106023 | 4 (warn) | any | Deny by access-group |
| 305011 | 6 (info) | any | Built dynamic/static NAT translation |
| 305012 | 6 (info) | any | Teardown dynamic/static NAT translation |
| 733100 | 4 (warn) | — | Threat detection scanning alert (automatic, rate-based) |
Example records:
<166>Jun 15 14:23:05 fw01 %ASA-6-302013: Built outbound TCP connection 100042 for inside:10.0.10.50/54321 (10.0.10.50/54321) to outside:45.83.221.50/443 (45.83.221.50/443)
<166>Jun 15 14:24:28 fw01 %ASA-6-302014: Teardown TCP connection 100042 for inside:10.0.10.50/54321 to outside:45.83.221.50/443 duration 0:01:23 bytes 5120 TCP FINs
<164>Jun 15 14:23:10 fw01 %ASA-4-106023: Deny tcp src outside:104.248.71.33/44231 dst inside:10.0.10.50/445 by access-group "outside_access_in" [0x0, 0x0]
<164>Jun 15 14:23:15 fw01 %ASA-4-733100: [Scanning] drop rate-1 exceeded. Current burst rate is 87 per second, max configured rate is 10; Current average rate is 45 per second, max configured rate is 5; Cumulative total count is 2340
Threat detection (733100): The ASA emitter automatically tracks per-source-IP deny rates. When both the burst rate (default 10 drops/sec over 20s) and the average rate (default 5 drops/sec over 60s) are exceeded, a 733100 alert fires. The alert can re-fire after a 20-second cooldown if rates remain elevated. The behavior is configurable via threat_detection_rate on the firewall sensor (set to 0 to disable).
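The two-threshold rule can be illustrated with a small simulation over deny timestamps for one source IP. This is a sketch of the rule as described here, not the emitter's code: the function name is hypothetical, and computing each rate as "denies in the trailing window divided by the window length" is an assumption.

```python
from bisect import bisect_right


def scanning_alert_times(deny_times, burst_rate=10.0, burst_window=20.0,
                         avg_rate=5.0, avg_window=60.0, cooldown=20.0):
    """Return the timestamps (seconds) at which a 733100-style alert would fire."""
    times = sorted(deny_times)
    alerts = []
    for index, now in enumerate(times):
        if alerts and now - alerts[-1] < cooldown:
            continue  # still inside the cooldown after the previous alert
        # Denies in the trailing window (now - window, now], up to this event.
        in_burst = index + 1 - bisect_right(times, now - burst_window)
        in_avg = index + 1 - bisect_right(times, now - avg_window)
        burst_exceeded = in_burst / burst_window > burst_rate
        avg_exceeded = in_avg / avg_window > avg_rate
        if burst_exceeded and avg_exceeded:
            alerts.append(now)
    return alerts
```

Under these assumptions a short, fast scan has to accumulate enough denies to lift the 60-second average before any alert fires, which is why slow scans can stay below the threshold.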
NAT translation (305011/305012): When nat_rules are configured on the firewall sensor, permitted connections that cross the NAT boundary produce 305011 (Built) and 305012 (Teardown) translation records alongside the normal 302013/302014 connection records. Built messages show post-NAT mapped addresses in parentheses. Outside Zeek sensors see post-NAT IPs; inside Zeek sensors see real IPs.
Baseline deny generation: When deny_ratio > 0 on the firewall sensor, the baseline generates denied connection attempts proportional to allowed traffic. Patterns include external scanning (60%), cross-segment blocked (20%), outbound blocked (10%), and ICMP noise (10%).
Storyline event types: port_scan generates bulk 106023 denies for reconnaissance/scanning. beacon with action: deny generates periodic 106023 denies for blocked malware beaconing. Both produce correlated Zeek conn.log entries on sensors that can see the source-side traffic. Port scans with sufficient rate automatically trigger 733100 threat detection alerts.
Source-only visibility: Denied connections are only visible to sensors on the source side of the firewall. Sensors on the destination side do not see blocked traffic.
Known Limitations:
- Simplified message format — omits IDFW user, internal port numbers, rx_ring metadata
File: web_access.log
Format: Apache/Nginx combined log format
HTTP access logs for web server systems.
Entries use Apache/Nginx combined syntax:
client-ip - username [dd/Mon/yyyy:HH:MM:SS zone] "METHOD path HTTP/version" status bytes "Referer" "User-Agent"
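A line in this syntax can be parsed with one regular expression. The sketch below is a generic combined-log parser written for illustration; the pattern and function name are not taken from EvidenceForge.

```python
import re
from datetime import datetime

COMBINED = re.compile(
    r'^(?P<client>\S+) \S+ (?P<user>\S+) \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<version>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def parse_access_line(line):
    """Parse one combined-format line into a dictionary, or return None."""
    match = COMBINED.match(line)
    if match is None:
        return None
    row = match.groupdict()
    # dd/Mon/yyyy:HH:MM:SS zone, e.g. 15/Jun/2024:14:23:05 +0000
    row["timestamp"] = datetime.strptime(row.pop("ts"), "%d/%b/%Y:%H:%M:%S %z")
    row["status"] = int(row["status"])
    row["bytes"] = 0 if row["bytes"] == "-" else int(row["bytes"])
    return row
```

The parsed `referer` and `agent` values are the fields most useful for separating browser sessions from scanner traffic.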
Referer field: Browser-originated traffic carries a realistic Referer distribution — roughly 55% blank (direct/bookmark), 20% search engine (Google/Bing), 20% same-origin, 5% social/news. Bot user-agents (Googlebot, bingbot, AhrefsBot) always have blank Referer. Scanner traffic (web_scan events) follows per-preset rules grounded in real scanner behavior: Nikto sends same-origin Referer on ~30% of requests (partial-crawl mode); gobuster, sqlmap, dirb, and nmap_http send no Referer. This means the Referer field is useful for distinguishing human browsing from automated scans in training exercises.
Known Limitations:
- Only generated for systems with web server role
File: <proxy-hostname.domain>/proxy_access.log
Format: W3C Extended Log Format
Forward proxy access logs for systems with the forward_proxy role. Outbound HTTP/HTTPS traffic is routed through the proxy system. In environment.proxy.mode: transparent, network sensors can still show direct-looking client-to-origin traffic. In mode: explicit, the generator emits client-to-proxy and proxy-to-origin network legs; each Zeek/IDS/firewall sensor sees only the leg its topology can observe. If the proxy denies a request, the transaction stops at the proxy and no proxy-to-origin Zeek, IDS, or firewall evidence is emitted. HTTP/S storyline beacon events from proxied hosts use the same explicit proxy routing, including proxy-side denied CONNECT/GET evidence for action: deny.
The proxy log uses a W3C Extended-style #Fields header:
#Fields: date time c-ip cs-username cs-method cs-uri cs-version sc-status sc-bytes cs-bytes time-taken cs-host cs(User-Agent) cs(Referer) rs(Content-Type) s-cache-result x-proxy-action
Fields are whitespace-delimited; values with spaces, such as User-Agent strings, are rendered with + separators. Missing values are -.
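Because the field order is declared in the #Fields header, a reader can build rows from the header rather than hard-coding positions. A minimal sketch with a hypothetical function name; it restores spaces only in cs(User-Agent), on the assumption that a literal + elsewhere (for example in a URI) should be left alone.

```python
def parse_proxy_log(text):
    """Parse W3C Extended-style text into a list of dictionaries."""
    fields = []
    rows = []
    for line in text.splitlines():
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
        elif line.strip() and not line.startswith("#"):
            # "-" marks a missing value
            row = {
                name: (None if value == "-" else value)
                for name, value in zip(fields, line.split())
            }
            agent = row.get("cs(User-Agent)")
            if agent is not None:
                row["cs(User-Agent)"] = agent.replace("+", " ")
            rows.append(row)
    return rows
```

Rows can then be filtered on x-proxy-action or grouped by c-ip and cs-host.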
Referer field: The W3C Extended format output includes a cs(Referer) field, linking subresource requests back to the page that triggered them.
Proxy action field: The x-proxy-action field disambiguates source-native proxy behavior: tunnel-setup for CONNECT setup rows, ssl-inspect for decrypted HTTPS request rows, forward for ordinary forwarded HTTP, and deny/auth-required/gateway-error for proxy-side terminal failures.
CONNECT tunnel behavior: HTTPS traffic generates one CONNECT entry per unique (client_ip, host) pair per session, with a 5-minute idle timeout. Subsequent HTTPS requests to the same host within the timeout reuse the existing tunnel without emitting another CONNECT. The current proxy model assumes TLS interception, so inspected HTTPS requests can also appear as W3C Extended request rows such as GET https://host/path HTTP/1.1.
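The tunnel-reuse rule can be sketched as a small state tracker keyed on (client_ip, host). This is an illustration of the described behavior, not the generator's implementation: the function name is hypothetical, session boundaries are not modeled, and treating a gap of exactly five minutes as still reusable is an assumption.

```python
IDLE_TIMEOUT_SECONDS = 300.0  # 5-minute idle timeout


def connect_rows_needed(requests, idle_timeout=IDLE_TIMEOUT_SECONDS):
    """Decide which HTTPS requests need a new CONNECT row.

    `requests` is an iterable of (timestamp_seconds, client_ip, host).
    Returns a list of (request, needs_connect) in timestamp order.
    """
    last_seen = {}
    decisions = []
    for request in sorted(requests):
        timestamp, client_ip, host = request
        key = (client_ip, host)
        previous = last_seen.get(key)
        # New tunnel if this pair was never seen or has been idle too long.
        needs_connect = previous is None or timestamp - previous > idle_timeout
        decisions.append((request, needs_connect))
        last_seen[key] = timestamp
    return decisions
```

In this model every request refreshes the idle timer, so a steadily used tunnel never produces a second CONNECT row.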
Status and byte semantics: For explicit proxy mode, client-side Zeek HTTP records describe the client-to-proxy exchange. Plain HTTP denials therefore show the proxy's status code and proxy response size, not the origin's status/body. For intercepted HTTPS, the CONNECT setup status is tracked separately from the inspected request status, so a successful tunnel setup can coexist with a denied inspected GET.
Source-native HTTP semantics: Domain/path planning is resolved before proxy and Zeek HTTP rows are rendered. Public browser-like domains default to HTTPS-first behavior, so plaintext port-80 requests redirect instead of serving login pages; internal hosts and service/update endpoints can keep plaintext source-native behavior. Browser requests also follow no-referrer-when-downgrade semantics, service/update endpoints keep source-compatible User-Agents, and executable/download paths use binary content types with download-scale body sizes.
Session depth: Persona HTTP traffic and inbound web_server human visitors generate multi-request browsing sessions with subresource cascades. Each page load triggers follow-on requests for JS, CSS, images, fonts, and same-origin API calls, producing realistic request clusters in proxy and web access logs. Persona browsing depth is controlled by browsing_intensity; inbound web visitor classes, tool/API requests, and User-Agent pools are controlled by web_session_profiles.yaml.
Known Limitations:
- Only generated for systems with the forward_proxy role declared
- Non-intercepting tunnel-only HTTPS proxy behavior is not yet modeled
- Cache hit/miss status is probabilistic, with stable web-route status generated upstream
- Limited to HTTP and HTTPS traffic