File & Directory Scanning

The sd scan command is the primary way to check files and directories for malware. It runs every file through the multi-layer detection pipeline (hash matching, YARA rules, and heuristic analysis) and reports a verdict for each file.

Basic Usage

Scan a single file:

bash
sd scan /path/to/file

Scan a directory (non-recursive by default):

bash
sd scan /home/user/downloads

Scan a directory and all subdirectories:

bash
sd scan /home --recursive

Command Options

| Option | Short | Default | Description |
|---|---|---|---|
| --recursive | -r | off | Recurse into subdirectories |
| --json | -j | off | Output results in JSON format |
| --threads | -t | CPU cores | Number of parallel scan threads |
| --auto-quarantine | -q | off | Automatically quarantine detected threats |
| --remediate | | off | Attempt automatic remediation (delete/quarantine based on policy) |
| --exclude | -e | none | Glob pattern to exclude files or directories |
| --report | | none | Write scan report to a file path |
| --max-size-mb | | 100 | Skip files larger than this size in megabytes |
| --no-yara | | off | Skip YARA rule scanning |
| --no-heuristics | | off | Skip heuristic analysis |
| --min-severity | | suspicious | Minimum severity to report (suspicious or malicious) |

Detection Flow

When sd scan processes a file, it passes through the detection pipeline in order:

File → Magic Number Detection → Determine File Type

  ├─ Layer 1: SHA-256 Hash Lookup (LMDB)
  │   Hit → MALICIOUS (instant, ~1μs per file)

  ├─ Layer 2: YARA-X Rule Scan (38,800+ rules)
  │   Hit → MALICIOUS with rule name

  ├─ Layer 3: Heuristic Analysis (file-type-aware)
  │   Score ≥ 60 → MALICIOUS
  │   Score 30-59 → SUSPICIOUS
  │   Score < 30 → CLEAN

  └─ Result Aggregation → highest severity wins

The pipeline short-circuits: if a hash match is found, YARA and heuristic analysis are skipped for that file. This makes scanning large directories fast, because most clean files are resolved at the hash layer in microseconds.
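
The verdict logic in the flow above can be modeled in a few lines of Python. This is an illustrative sketch, not PRX-SD's implementation: the names `verdict_from_score` and `classify`, and the idea of passing in precomputed layer results, are assumptions made for the example. Only the thresholds (60 and 30), the hash short-circuit, and the "highest severity wins" rule come from the description above.

```python
from typing import List, Optional, Tuple

# Severity ranking used for result aggregation (higher wins).
SEVERITY = {"CLEAN": 0, "SUSPICIOUS": 1, "MALICIOUS": 2}


def verdict_from_score(score: int) -> str:
    """Map a heuristic score to a verdict using the documented thresholds."""
    if score >= 60:
        return "MALICIOUS"
    if score >= 30:
        return "SUSPICIOUS"
    return "CLEAN"


def classify(hash_hit: bool, yara_rule: Optional[str], heuristic_score: int) -> Tuple[str, str]:
    """Return (verdict, layer) for one file, given hypothetical per-layer results."""
    # Layer 1: a hash match short-circuits the remaining layers.
    if hash_hit:
        return ("MALICIOUS", "hash")

    results: List[Tuple[str, str]] = []
    # Layer 2: any matching YARA rule is reported as malicious.
    if yara_rule is not None:
        results.append(("MALICIOUS", "yara"))
    # Layer 3: the heuristic score is bucketed by threshold.
    results.append((verdict_from_score(heuristic_score), "heuristic"))

    # Aggregation: highest severity wins. max() returns the first maximal
    # item, so on a tie the earlier layer is reported (an assumption here).
    return max(results, key=lambda r: SEVERITY[r[0]])
```

For instance, a file with no hash or YARA hit and a heuristic score of 42 is classified as `("SUSPICIOUS", "heuristic")`, matching the updater.bin case in the sample report.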

Output Formats

Human-Readable (default)

bash
sd scan /home/user/downloads --recursive
PRX-SD Scan Report
==================
Scanned: 3,421 files (1.2 GB)
Skipped: 14 files (exceeded max size)
Threats: 3 (2 malicious, 1 suspicious)

  [MALICIOUS] /home/user/downloads/invoice.exe
    Layer:   Hash match (SHA-256)
    Source:  MalwareBazaar
    Family:  Emotet
    SHA-256: e3b0c44298fc1c149afbf4c8996fb924...

  [MALICIOUS] /home/user/downloads/patch.scr
    Layer:   YARA rule
    Rule:    win_ransomware_lockbit3
    Source:  ReversingLabs

  [SUSPICIOUS] /home/user/downloads/updater.bin
    Layer:   Heuristic analysis
    Score:   42/100
    Findings:
      - High section entropy: 7.91 (packed)
      - Suspicious API imports: VirtualAllocEx, WriteProcessMemory
      - Non-standard PE timestamp

Duration: 5.8s (589 files/s)

JSON Output

bash
sd scan /path --recursive --json
json
{
  "scan_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "timestamp": "2026-03-21T14:30:00Z",
  "files_scanned": 3421,
  "files_skipped": 14,
  "total_bytes": 1288490188,
  "threats": [
    {
      "path": "/home/user/downloads/invoice.exe",
      "verdict": "malicious",
      "layer": "hash",
      "source": "MalwareBazaar",
      "family": "Emotet",
      "sha256": "e3b0c44298fc1c149afbf4c8996fb924...",
      "md5": "d41d8cd98f00b204e9800998ecf8427e"
    }
  ],
  "duration_ms": 5800,
  "throughput_files_per_sec": 589
}
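
The JSON form is convenient for post-processing. The following Python sketch summarizes a report using only field names that appear in the example above; the function name `summarize` and the embedded `SAMPLE` string (a trimmed copy of that example) are hypothetical, and a real report may contain fields not shown here.

```python
import json

# Trimmed copy of the example report above, used only for illustration.
SAMPLE = """
{
  "scan_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "files_scanned": 3421,
  "files_skipped": 14,
  "threats": [
    {
      "path": "/home/user/downloads/invoice.exe",
      "verdict": "malicious",
      "layer": "hash",
      "family": "Emotet"
    }
  ],
  "duration_ms": 5800
}
"""


def summarize(report_text: str) -> dict:
    """Reduce a scan report to counts, threat paths, and throughput."""
    report = json.loads(report_text)
    threats = report.get("threats", [])

    # Count threats per verdict ("malicious", "suspicious").
    by_verdict = {}
    for threat in threats:
        by_verdict[threat["verdict"]] = by_verdict.get(threat["verdict"], 0) + 1

    # Throughput is files scanned divided by duration in seconds.
    duration_s = report["duration_ms"] / 1000
    files_per_sec = int(report["files_scanned"] / duration_s) if duration_s else 0

    return {
        "files_scanned": report["files_scanned"],
        "threat_paths": [t["path"] for t in threats],
        "by_verdict": by_verdict,
        "files_per_sec": files_per_sec,
    }
```

Calling `summarize(SAMPLE)` yields a throughput of 589 files per second (3,421 files over 5.8 seconds, truncated), which agrees with the `throughput_files_per_sec` value in the example.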

Report File

Write results to a file for archival:

bash
sd scan /srv/web --recursive --report /var/log/prx-sd/scan-report.json

Exclusion Patterns

Use --exclude to skip files or directories matching glob patterns. Multiple patterns can be specified:

bash
sd scan /home --recursive \
  --exclude "*.log" \
  --exclude "node_modules/**" \
  --exclude ".git/**" \
  --exclude "/home/user/VMs/**"
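
When the scan is launched from a script rather than typed by hand, the repeated --exclude flags can be generated from a list. The helper below is a sketch: `build_scan_command` is a hypothetical name, and it uses only the flags documented in the options table. Because the result is an argument list rather than a shell string, the patterns reach sd unexpanded and need no quoting.

```python
from typing import List, Sequence


def build_scan_command(
    paths: Sequence[str],
    excludes: Sequence[str] = (),
    recursive: bool = False,
    json_output: bool = False,
) -> List[str]:
    """Assemble an argv list for `sd scan` with one --exclude per pattern."""
    cmd = ["sd", "scan", *paths]
    if recursive:
        cmd.append("--recursive")
    if json_output:
        cmd.append("--json")
    # Each pattern gets its own --exclude flag, as in the shell example.
    for pattern in excludes:
        cmd += ["--exclude", pattern]
    return cmd
```

The returned list can be passed directly to `subprocess.run`.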

Performance

Excluding large directories like node_modules, .git, and virtual machine images significantly improves scan speed.

Auto-Quarantine

The --auto-quarantine flag moves detected threats to the quarantine vault during the scan:

bash
sd scan /tmp --recursive --auto-quarantine
[MALICIOUS] /tmp/dropper.exe → Quarantined (QR-20260321-007)

Quarantined files are encrypted with AES-256 and stored in ~/.local/share/prx-sd/quarantine/. They cannot be accidentally executed. See the Quarantine documentation for details.

Example Scenarios

CI/CD Pipeline Scan

Scan build artifacts before deployment:

bash
sd scan ./dist --recursive --json --min-severity suspicious

Use the exit code for automation: 0 = clean, 1 = threats found, 2 = scan error.
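
A pipeline step can act on these exit codes as follows. This is a minimal sketch: the names `interpret_exit_code` and `run_scan` are hypothetical, the three code meanings are taken from the sentence above, and any other code is treated as unexpected rather than guessed at.

```python
import subprocess
from typing import List

# Exit code meanings as documented above.
EXIT_MEANINGS = {0: "clean", 1: "threats found", 2: "scan error"}


def interpret_exit_code(code: int) -> str:
    """Translate an sd scan exit code into a short status string."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_scan(argv: List[str]) -> str:
    """Run the given command and return the interpreted status.

    check=False (the default) is intentional: a non-zero exit code is an
    expected outcome here, not an exception.
    """
    completed = subprocess.run(argv, capture_output=True, text=True)
    return interpret_exit_code(completed.returncode)
```

A CI job might call `run_scan(["sd", "scan", "./dist", "--recursive", "--json"])` and fail the build on any status other than "clean", so that scan errors block deployment as well as detections.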

Web Server Daily Scan

Schedule a nightly scan of web-accessible directories:

bash
sd scan /var/www /srv/uploads --recursive \
  --auto-quarantine \
  --report /var/log/prx-sd/daily-$(date +%Y%m%d).json \
  --exclude "*.log"

Forensic Investigation

Scan an evidence disk mounted read-only:

bash
sudo mount -o ro /dev/sdb1 /mnt/evidence
sd scan /mnt/evidence --recursive --json --threads 1 --max-size-mb 500

Large Scans

When scanning millions of files, use --threads to control resource usage and --max-size-mb to skip oversized files that may slow the scan.

Home Directory Quick Check

Fast scan of common threat locations:

bash
sd scan ~/Downloads ~/Desktop /tmp --recursive

Performance Tuning

| Files | Approximate Time | Notes |
|---|---|---|
| 1,000 | < 1 second | Hash layer resolves most files |
| 10,000 | 2-5 seconds | YARA rules add ~0.3 ms per file |
| 100,000 | 20-60 seconds | Depends on file sizes and types |
| 1,000,000+ | 5-15 minutes | Use --threads and --exclude |

Factors affecting scan speed:

  • Disk I/O: SSD is 5-10x faster than HDD for random reads
  • File size distribution: Many small files are faster than few large files
  • Detection layers: Hash-only scans (--no-yara --no-heuristics) are fastest
  • Thread count: More threads help on multi-core systems with fast storage

Released under the Apache-2.0 License.