Zero-Trust Document Sanitization v2.0

Aegis-CDR deconstructs malicious PDF and DOCX files into safe atomic components, surgically stripping threats using Groq-powered LLaMA 3.3 intelligence.

Scroll to Decrypt

The CDR Philosophy

Traditional antivirus asks: "Is this file bad?"
Aegis asks: "What part of this is active content?"

Beyond Detection

Instead of relying on signatures, we assume every file is hostile and rebuild it from mathematically safe primitives.

Pixel-Perfect Safety

Text, images, and formatting remain identical. Only the executable danger (Macros, JS, OLE) is erased.

System Comparative Analysis
[ TRADITIONAL AV ]

File → Signature Scan

✗ Fails on Obfuscation

✗ Reactive (Zero-Day Vulnerable)

~99% Effective Rate

[ AEGIS CDR ]

File → Atomic Decomposition

✓ Strips ALL Active Content

✓ Mathematically Clean Rebuild

100% Structural Safety

Hardened 4-Layer Security

The Sanitization Workflow

01

Fingerprinting

Magic byte validation. Detects MZ (PE) headers disguised as PDF. Blocks extension spoofing at the binary level.

02

Decomposition

Unpacks OPC packages (DOCX) or iterates xref objects (PDF). Maps every potential threat vector in the file tree.

03

Disarming

Surgical removal of /JavaScript, /OpenAction, VBA Macros, and DDE objects. Neutralizes external phishing URIs.

04

AI Synthesis

Groq LLaMA 3.3-70B generates a natural language narrative of found threats and executes a final risk score.

Live_Analysis_Report_1092.json
Critical Threat Detected
100 Risk Score

Status: CRITICAL

14 High-severity threats detected including VBA Macros and PowerShell Cradles.

Scripts & JS Detected
VBA Macros Detected
AI Narrative Generated by LLaMA-3.3-70B

"The document 'malicious_test.docx' contains severe auto-execute scripts designed to spawn shell processes. Aegis has effectively isolated the vbaProject.bin and neutralized 2 encrypted DDE payloads. Reconstruction confirmed safe at 100% visual fidelity."

Python 3.10+
FastAPI
Groq LLM
PyMuPDF