AI-powered HIPAA compliance, PHI/PII detection, and healthcare data security skills for Claude, Cursor, Windsurf, and AI agents.
HIPAA Guardian is a specialized skills collection designed for healthcare professionals, developers, and organizations building HIPAA-compliant systems. It provides automated tools for:
- 🔍 Detecting Protected Health Information (PHI) - All 18 HIPAA identifiers
- ✅ Validating Healthcare Formats - HL7 FHIR, HL7 v2, CDA, X12 EDI
- 📋 Audit Logging - Immutable compliance audit trails per 45 CFR §164.312(b)
- 🛡️ Risk Assessment - Breach risk scoring and remediation guidance
- 🔐 Compliance Mapping - HIPAA, NIST CSF 2.0, HITRUST alignment
- ✅ HIPAA-Ready: Designed for HIPAA BAA (Business Associate Agreement) environments
- ✅ Audit Trail: Supports immutable logging per 45 CFR §164.312(b)
- ✅ Standards Integration: HL7 FHIR R5, NIST CSF 2.0, HITRUST CSF alignment
- ✅ Open Source: MIT License, security-first code review process
Note: This skill collection is designed to support HIPAA compliance efforts; it does not guarantee compliance. For production environments, consult your legal and compliance team and execute a Business Associate Agreement (BAA) with any service provider.
Input File/Code
↓
Pattern Matching (18 HIPAA Identifiers)
↓
Confidence Scoring (0-100%)
↓
Risk Assessment
↓
HIPAA Rule Mapping
↓
Report Generation + Remediation
| Identifier | Examples | Risk |
|---|---|---|
| Names | Patient, provider, relatives | HIGH |
| SSN | Social Security Numbers | CRITICAL |
| MRN | Medical Record Numbers | CRITICAL |
| DOB | Date of birth, admission date | HIGH |
| Phone/Fax | All formats detected | MEDIUM |
| Email | Healthcare email addresses | MEDIUM |
| Address | Streets, cities, ZIP codes | MEDIUM |
| Health Plan ID | Insurance, policy numbers | HIGH |
| Biometric | Photos, fingerprints, voice | CRITICAL |
| Device IDs | Serial numbers, UDI codes | MEDIUM |
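
The pipeline and risk table above can be sketched as a small pattern scanner. This is a minimal illustration, not the skill's actual implementation: the `PATTERNS` table, the `scanText` helper, the regexes, and the confidence values are all hypothetical. Only the risk levels come from the table.

```javascript
// Hypothetical pattern table. Regexes and base confidences are illustrative
// assumptions; risk levels follow the identifier table above.
const PATTERNS = [
  { type: "ssn", risk: "CRITICAL", confidence: 90, regex: /\b\d{3}-\d{2}-\d{4}\b/g },
  { type: "phone", risk: "MEDIUM", confidence: 60, regex: /\(\d{3}\) ?\d{3}-\d{4}/g },
  { type: "email", risk: "MEDIUM", confidence: 80, regex: /\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b/g },
];

// Scan text line by line and return one finding per pattern match.
function scanText(text) {
  const findings = [];
  text.split("\n").forEach((line, index) => {
    for (const { type, risk, confidence, regex } of PATTERNS) {
      for (const match of line.matchAll(regex)) {
        findings.push({ line: index + 1, type, risk, confidence, match: match[0] });
      }
    }
  });
  return findings;
}

const sample = 'const ssn = "123-45-6789";\nconst contact = "jane@example.org";';
console.log(scanText(sample));
```

A real scanner would adjust confidence using surrounding context (for example a nearby `ssn` key), which this sketch leaves out.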
# Install all HIPAA Guardian skills
npx skills add 1Mangesh1/hipaa-guardian
# Install specific skill
npx skills add 1Mangesh1/hipaa-guardian --skill hipaa-guardian

# Ask Claude/Copilot to scan your codebase
"Scan our backend code for hardcoded PHI like patient names, SSNs, or MRNs"
# Output: Detailed finding report with:
# - File locations
# - Line numbers
# - Risk scores
# - Remediation steps

Example Finding:
{
  "file": "database/seeders/PatientSeeder.js",
  "line": 42,
  "finding": "SSN detected: 123-45-6789",
  "identifier_type": "ssn",
  "risk_score": 95,
  "severity": "CRITICAL",
  "remediation": [
    "Remove hardcoded SSN from seeder",
    "Use faker.js or test data library",
    "Use environment variables for test credentials"
  ]
}

# Validate FHIR Patient resource
"Check this FHIR resource for PHI exposure and compliance issues"
{
  "resourceType": "Patient",
  "id": "pat-123",
  "identifier": [{"value": "MRN-2024-001"}],
  "name": [{"given": ["John"], "family": "Doe"}],
  "birthDate": "1985-01-15"
}
# Returns: ✓ Valid FHIR R5 structure, ✓ All PHI properly identified

# Log healthcare data access for HIPAA compliance
"Create an audit log for a user viewing patient medical records"
# Generates compliant entry with:
# - Unique audit ID
# - Exact timestamp
# - User identification
# - Action taken
# - Resource accessed
# - Success/failure status

# Full HIPAA compliance assessment
"Audit our codebase for HIPAA compliance and generate a report"
# Creates comprehensive report:
# - Executive summary
# - Findings breakdown by severity
# - HIPAA rule mappings
# - Risk assessment
# - Remediation playbook

| Skill | Purpose | Activation Triggers | Version |
|---|---|---|---|
| hipaa-guardian | PHI/PII detection, healthcare format validation, audit logging | "scan for PHI", "HIPAA compliance", "detect PII", "healthcare data security" | 1.2.0 |
| fhir-hl7-validator | HL7 FHIR R5 & HL7 v2 validation | "validate FHIR", "check HL7 message", "healthcare format" | 1.0.0 |
| healthcare-audit-logger | HIPAA-compliant audit trail logging | "audit log", "compliance logging", "track healthcare access" | 1.0.0 |
hipaa-guardian Skill - Core Features
- 18 HIPAA Safe Harbor Identifiers: Names, SSN, MRN, DOB, phone, email, address, IP, biometric, etc.
- Confidence Scoring: 0-100% match confidence with pattern analysis
- Risk Assessment: Automated risk scoring based on sensitivity & exposure
- File Type Support: JSON, CSV, XML, SQL, Python, JavaScript, YAML, FHIR, HL7, CDA
- Smart Patterns: Entropy detection, format validation, cross-field analysis
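
Entropy detection usually means flagging strings whose characters look random, such as tokens or generated identifiers. The README does not say how the skill computes it, so the following is only a sketch of the standard Shannon entropy measure; the `shannonEntropy` name is hypothetical.

```javascript
// Shannon entropy in bits per character: H = -sum(p * log2(p)).
function shannonEntropy(value) {
  const chars = Array.from(value);
  if (chars.length === 0) return 0;
  const counts = new Map();
  for (const ch of chars) counts.set(ch, (counts.get(ch) || 0) + 1);
  let entropy = 0;
  for (const count of counts.values()) {
    const p = count / chars.length;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

console.log(shannonEntropy("aaaaaaaa")); // 0: a single repeated character
console.log(shannonEntropy("abcdabcd")); // 2: four equally likely characters
```

A scanner might treat values above some cut-off as candidates for review; any such cut-off would need tuning against real data.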
HL7 FHIR R5 → Patient, Condition, Observation, MedicationRequest
HL7 v2.x → MSH, PID, DG1, OBX, RXO segments
CDA/C-CDA → Clinical documents, patientRole elements
X12 EDI → Healthcare claims (837, 835 formats)
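
To illustrate the HL7 v2.x entry above: a v2 message is a series of segments separated by carriage returns, each made of pipe-delimited fields. The sketch below pulls the PHI-bearing fields out of a PID segment. The `extractPidFields` helper and the sample message are hypothetical, and real messages need handling for repetitions, components, and escape sequences that is omitted here.

```javascript
// Split an HL7 v2 message into segments and fields, then read the PID segment.
function extractPidFields(message) {
  const segments = message
    .split(/\r\n?|\n/)
    .filter(Boolean)
    .map((segment) => segment.split("|"));
  const pid = segments.find((fields) => fields[0] === "PID");
  if (!pid) return null;
  // For segments other than MSH, fields[n] holds field n (PID-3 is fields[3]).
  return {
    patientId: pid[3] || "", // PID-3: patient identifier list (often the MRN)
    name: pid[5] || "",      // PID-5: patient name (family^given)
    birthDate: pid[7] || "", // PID-7: date of birth
    ssn: pid[19] || "",      // PID-19: SSN
  };
}

// Synthetic message for illustration only.
const message = [
  "MSH|^~\\&|SENDER|FACILITY|RECEIVER|FACILITY|20240207101533||ADT^A01|MSG0001|P|2.5",
  "PID|1||MRN-2024-001||DOE^JOHN||19850115|M",
].join("\r");

console.log(extractPidFields(message));
```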
✓ Source code (all languages)
✓ Comments and documentation
✓ Test fixtures and mock data
✓ Configuration files (.env, secrets)
✓ Database seeds and migrations
✓ API response samples
- HIPAA Rule Mapping: Each finding linked to specific regulatory sections
- Breach Risk Scoring: 0-100 risk score with severity levels (CRITICAL→LOW)
- De-identification Validation: Verify data removal meets HIPAA standards
- Audit Trail Generation: 45 CFR §164.312(b) compliant logging
- Remediation Guidance: Step-by-step fix instructions with code examples
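
The 0-100 breach risk score maps onto the four severity levels. The README does not document the cut-offs, so the thresholds in this sketch are assumptions chosen only to be consistent with the example finding (score 95, severity CRITICAL); `SEVERITY_THRESHOLDS` and `severityFor` are hypothetical names.

```javascript
// Assumed cut-offs; the skill's real thresholds are not documented here.
const SEVERITY_THRESHOLDS = [
  { min: 90, severity: "CRITICAL" },
  { min: 70, severity: "HIGH" },
  { min: 40, severity: "MEDIUM" },
  { min: 0, severity: "LOW" },
];

// Map a 0-100 risk score to the first severity band whose minimum it meets.
function severityFor(riskScore) {
  if (!Number.isFinite(riskScore) || riskScore < 0 || riskScore > 100) {
    throw new RangeError(`risk score must be between 0 and 100, got ${riskScore}`);
  }
  return SEVERITY_THRESHOLDS.find(({ min }) => riskScore >= min).severity;
}

console.log(severityFor(95)); // CRITICAL
```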
- Claude/Copilot/Windsurf: Prompt activation with skill triggers
- GitHub Actions: CI/CD pipeline integration
- Pre-commit Hooks: Automatic scanning before code commits
- VS Code Extension: Real-time PHI detection while coding
- OpenAPI 3.1: REST API for third-party integration
Validates healthcare data against HL7 standards:
- FHIR R5 Schema Validation - Patient, Condition, Observation resources
- HL7 v2 Message Parsing - Complete v2.x segment validation
- CDA Document Structure - Clinical Document Architecture compliance
- Custom Validation Rules - Domain-specific constraints
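
As a rough illustration of the kind of check involved, the sketch below tests two basic properties of a FHIR Patient resource and lists which PHI-bearing elements are present. It is far from full schema validation; the `checkPatient` helper and the `PATIENT_PHI_ELEMENTS` list (taken from the Patient examples in this README) are hypothetical.

```javascript
// Patient elements that carry PHI, based on the examples in this README.
const PATIENT_PHI_ELEMENTS = ["identifier", "name", "telecom", "birthDate", "address"];

function checkPatient(resource) {
  const errors = [];
  if (resource.resourceType !== "Patient") {
    errors.push(`expected resourceType "Patient", got "${resource.resourceType}"`);
  }
  // The FHIR date type allows YYYY, YYYY-MM, or YYYY-MM-DD.
  if (resource.birthDate !== undefined && !/^\d{4}(-\d{2}(-\d{2})?)?$/.test(resource.birthDate)) {
    errors.push("birthDate must be YYYY, YYYY-MM, or YYYY-MM-DD");
  }
  const phiElements = PATIENT_PHI_ELEMENTS.filter((key) => resource[key] !== undefined);
  return { valid: errors.length === 0, errors, phiElements };
}

const patient = {
  resourceType: "Patient",
  id: "pat-123",
  identifier: [{ value: "MRN-2024-001" }],
  name: [{ given: ["John"], family: "Doe" }],
  birthDate: "1985-01-15",
};

console.log(checkPatient(patient));
```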
HIPAA-compliant audit trail logging:
- Immutable Logs - Tamper-evident audit trail
- Complete Context - User, action, resource, timestamp, outcome
- Access Control Logging - Who accessed what and when
- Event Classification - CREATE, READ, UPDATE, DELETE, EXPORT events
- Retention Management - Configurable log retention policies
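
One common way to make an audit trail tamper-evident is to chain entries by hash, so that editing any earlier entry invalidates every later one. The sketch below shows that idea with the fields listed above. It is an assumption about one possible design, not the skill's documented implementation, and `hashEntry`, `appendAuditEvent`, and `verifyChain` are hypothetical names.

```javascript
const crypto = require("crypto");

// Hash an entry together with the previous record's hash.
function hashEntry(entry, previousHash) {
  return crypto.createHash("sha256").update(previousHash + JSON.stringify(entry)).digest("hex");
}

// Append an event (user, action, resource, outcome) with an ID and timestamp.
function appendAuditEvent(log, { userId, action, resource, outcome }) {
  const previousHash = log.length > 0 ? log[log.length - 1].hash : "";
  const entry = {
    id: crypto.randomUUID(),
    timestamp: new Date().toISOString(),
    userId,
    action,
    resource,
    outcome,
  };
  log.push({ ...entry, hash: hashEntry(entry, previousHash) });
  return log;
}

// Recompute every hash; any edited record breaks the chain.
function verifyChain(log) {
  return log.every((record, i) => {
    const { hash, ...entry } = record;
    const previousHash = i > 0 ? log[i - 1].hash : "";
    return hash === hashEntry(entry, previousHash);
  });
}

const log = [];
appendAuditEvent(log, { userId: "user-42", action: "READ", resource: "Patient/pat-123", outcome: "success" });
appendAuditEvent(log, { userId: "user-42", action: "EXPORT", resource: "Patient/pat-123", outcome: "failure" });
console.log(verifyChain(log)); // true
log[0].action = "DELETE";
console.log(verifyChain(log)); // false
```

An in-memory array is only tamper-evident, not immutable; a production system would also need write-once storage and access controls.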
Scenario: Healthcare startup building patient portal backend
# During PR review, scan code for accidental PHI commits
"Review this PR for any hardcoded patient data"
# Findings:
// ❌ CRITICAL: database/seeders/PatientSeeder.js:42
const mockPatient = {
  name: "John Doe",    // HIGH: Patient name
  ssn: "123-45-6789",  // CRITICAL: SSN
  mrn: "MRN-2024-001", // CRITICAL: MRN
  dob: "01/15/1985",   // HIGH: Date of birth
};

// ✅ Remediation: replace the object above with generated values
// (assumes the @faker-js/faker package, v8 or later)
const { faker } = require('@faker-js/faker');
const mockPatient = {
  name: faker.person.fullName(),
  // SSNs with area number 000 are never issued, so this cannot match a real person
  ssn: `000-${faker.string.numeric(2)}-${faker.string.numeric(4)}`,
  mrn: `MRN-${faker.string.uuid()}`,
  dob: faker.date.birthdate(),
};

Scenario: Hospital discovering potential data leak through logs
# Scan application logs for PHI exposure
"Check our logs for patient data that shouldn't be there"
# Findings:
❌ application.log:2024-02-07T10:15:33Z
ERROR: Query failed for patient John Doe (SSN: 123-45-6789)
✅ Remediation:
1. Remove specific identifiers from logs
2. Hash or mask sensitive data
3. Use patient ID instead of names
4. Implement log filtering policy
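
Steps 2 and 3 above can be sketched as two small helpers: one masks SSN-shaped substrings before a message is written, the other derives a stable pseudonym from an identifier with a keyed hash. Both `maskSsn` and `pseudonymize` are hypothetical names, and the regex deliberately over-matches (any nine-digit run is masked).

```javascript
const crypto = require("crypto");

// Mask SSN-shaped substrings, with or without dashes.
function maskSsn(message) {
  return message.replace(/\b\d{3}-?\d{2}-?\d{4}\b/g, "***-**-****");
}

// Derive a stable pseudonym with HMAC-SHA256; the secret key must be kept out of logs.
function pseudonymize(value, secretKey) {
  return crypto.createHmac("sha256", secretKey).update(value).digest("hex").slice(0, 16);
}

console.log(maskSsn("Query failed for patient John Doe (SSN: 123-45-6789)"));
// Query failed for patient John Doe (SSN: ***-**-****)
```

Masking by pattern does not catch names, which is why logging an internal patient ID instead of free-text details remains the safer default.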
// Good logging pattern:
logger.error(`Query failed for patient: ${patient.id}`);
// Never log: name, SSN, DOB, MRN

Scenario: Building HL7 FHIR-compliant patient API
# Validate FHIR resources before returning to clients
"Check this patient response for PHI safety and FHIR compliance"
// API response - auto-checked:
{
  "resourceType": "Patient",
  "id": "pat-12345",            // ✓ Safe ID only
  "identifier": [...],          // ✓ Medical Record Numbers
  "name": [{"given": [...]}],   // ✓ Patient names (expected)
  "telecom": [...],             // ✓ Phone/email
  "birthDate": "1985-01-15",    // ✓ DOB (expected in healthcare)
  "address": [...]              // ✓ Address data
}
✓ Valid FHIR R5 Patient
✓ All fields appropriate for healthcare context
✓ PHI is expected and necessary
✓ Safe to transmit to authorized clients

Scenario: Annual HIPAA audit for healthcare SaaS company
# Generate full compliance report
"Run a comprehensive HIPAA compliance check on our entire codebase"
# Generates Report:
├── Executive Summary
│ ├── Overall Risk: MEDIUM
│ ├── Critical Findings: 3
│ ├── High Findings: 12
│ └── Remediation Time: ~16 hours
├── Findings by Category
│ ├── Code Security (12 issues)
│ ├── Configuration (5 issues)
│ ├── Test Data (8 issues)
│ └── Documentation (3 issues)
├── HIPAA Rule Mappings
│ ├── 45 CFR §164.308 (Admin safeguards)
│ ├── 45 CFR §164.312 (Technical safeguards)
│ └── 45 CFR §164.504 (BA requirements)
└── Remediation Playbook
    ├── Priority 1: Critical fixes (3 items)
    ├── Priority 2: High-risk items (12 items)
    └── Timeline and owner assignment

- Privacy Rule (§164.500+): Patient rights, use & disclosure, PHI protections
- Security Rule (§164.300+): Administrative, physical, and technical safeguards
- Breach Notification Rule (§164.400+): Notification requirements & documentation
- HL7 FHIR R5: International healthcare data exchange standard
- NIST Cybersecurity Framework 2.0: Governance, risk management, detect/respond functions
- NIST SP 800-66: Implementing the HIPAA Security Rule
- NIST SP 800-188: De-identifying government datasets
- HHS HIPAA Guidance: Official HIPAA compliance portal
- OCR Enforcement: HIPAA violations, corrective action plans
- HITRUST CSF: Certifiable security framework that maps to HIPAA requirements
We welcome contributions! Areas for improvement:
- New healthcare data format support
- Additional HIPAA rule mappings
- Pre-commit hook enhancements
- Language-specific pattern improvements
See skill-specific documentation in ./skills/*/ directories for contribution guidelines.
- references/HIPAA-OVERVIEW.md - Complete HIPAA rule reference
- references/HL7-FHIR-R5.md - FHIR resource specifications
- references/NIST-CSF-2.0.md - Cybersecurity framework
- references/HEALTHCARE-DATA-TYPES.md - Healthcare formats
- skills/hipaa-guardian/ - Core PHI detection skill
- skills/fhir-hl7-validator/ - Healthcare format validation
- skills/healthcare-audit-logger/ - Audit logging
Use these phrases to activate HIPAA Guardian skills in Claude, Cursor, or Windsurf:
- "Scan for PHI" / "Detect PII"
- "HIPAA compliance check" / "HIPAA audit"
- "Healthcare data security" / "Check code for PHI leakage"
- "Scan logs for PHI" / "Check authentication on PHI endpoints"
- "Generate HIPAA audit report" / "Find sensitive healthcare data"
- "Validate FHIR resource" / "Check HL7 message"
- "Healthcare format validation"
- "Validate FHIR R5" / "Check HL7 v2"
- "Create audit log" / "Compliance logging"
- "Track healthcare access" / "Audit trail"
Q: Can I use this in production? A: These skills are designed to support compliance efforts. Always conduct your own security review, consult your legal/compliance team, and execute a Business Associate Agreement (BAA) with any external providers.
Q: Does it detect all PHI? A: It detects the 18 HIPAA Safe Harbor identifiers with high confidence. However, some context-dependent PHI may require manual review. Always combine automated detection with human review.
Q: What about false positives? A: Confidence scores (0-100%) are provided for each finding. Low-confidence findings may be false positives. Always review findings in context.
Q: Can I integrate with my CI/CD? A: Yes! Check skills/hipaa-guardian/ for GitHub Actions, pre-commit hook, and custom integration examples.
Q: How do I report security issues? A: Report security vulnerabilities privately by email. Do not create public GitHub issues for security problems.
- Check confidence threshold - May be filtering low-confidence matches
- Verify pattern matches - Some formats vary (e.g., SSN: 123-45-6789 vs 123456789)
- Context matters - Test data may be intentionally de-identified
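
The first two checks above can be made concrete with a short sketch: normalize SSN variants before comparing them, and make the confidence threshold explicit when filtering findings. `normalizeSsn` and `filterByConfidence` are hypothetical helpers, and the finding shape (`confidence` field) is assumed.

```javascript
// Normalize SSN-like strings so 123-45-6789 and 123456789 compare equal.
function normalizeSsn(value) {
  const digits = value.replace(/[\s-]/g, "");
  return /^\d{9}$/.test(digits) ? digits : null;
}

// Keep findings at or above a confidence threshold (0-100).
function filterByConfidence(findings, minConfidence) {
  return findings.filter((finding) => finding.confidence >= minConfidence);
}

console.log(normalizeSsn("123-45-6789") === normalizeSsn("123456789")); // true
console.log(filterByConfidence([{ confidence: 40 }, { confidence: 90 }], 70).length); // 1
```

If expected findings are missing, lowering `minConfidence` temporarily is a quick way to see whether the threshold is what hides them.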
- Exclude unnecessary directories - .git, node_modules, dist, build
- Filter by file type - Focus on relevant files (code, config, not binaries)
- Use batch scanning - Process directories in smaller chunks
- Review concrete examples - Check the skill examples for best practices
- Consult references - references/ has detailed guidance
- Reach out - Create an issue with specific use case
- 📖 Documentation: ./references/ - Detailed guides
- 🐛 Issues/Features: GitHub Issues
- 🔒 Security Reports: Report vulnerabilities responsibly (do not create public issues)
- 💬 Discussions: GitHub Discussions
MIT License - See LICENSE.txt
Permissions: ✓ Commercial use | ✓ Modification | ✓ Distribution | ✓ Private use
Conditions: License and copyright notice must be included
Limitations: ✗ No warranty | ✗ No liability
Repository: 1Mangesh1/hipaa-guardian
Last Updated: February 2026
Status: ✅ Active Development
Latest Version: 1.2.0
License: MIT