https://youtu.be/u8KJYQgUOs8 AI-powered paralegal assistant that extracts, summarizes, and analyzes legal documents. Upload PDF/DOCX contracts to get instant summaries, extract key clauses, and perform intelligent document analysis using OpenAI.

AbdulRehmanMehar/AI-Legal-Document-Analyzer-POC

VisualCare - Paralegal Assistant POC

An intelligent document analysis tool built with Next.js that helps legal professionals analyze contracts and legal documents using AI. Upload PDF, DOCX, or TXT files to get instant summaries, extract key clauses, and analyze legal documents with the power of OpenAI.

✨ Features

  • Document Upload & Text Extraction

    • Drag & drop interface for easy file uploads
    • Supports PDF, DOCX, TXT, and Markdown files
    • Client-side text extraction (no server uploads required)
    • Advanced PDF parsing with PDF.js including timeout protection
    • DOCX parsing with headers/footers extraction using JSZip + Mammoth
  • AI-Powered Analysis (powered by OpenAI)

    • Summarize: Get concise summaries with key parties, terms, obligations, and risks
    • Key Clauses: Automated extraction and checklist of critical contract clauses
    • Q&A: Ask questions about specific document content (Coming Soon)
    • Draft: Generate new clauses or sections based on context (Coming Soon)
  • Privacy-First Design

    • All file parsing happens in the browser
    • Original files are never uploaded to the application server
    • Only the extracted text is sent to the OpenAI API for analysis

🚀 Quick Start

Prerequisites

  • Node.js 20+
  • npm, yarn, pnpm, or bun
  • OpenAI API key

Installation

  1. Clone the repository

    git clone <repository-url>
    cd visualcare
  2. Install dependencies

    npm install
  3. Set up environment variables

    Create a .env.local file in the root directory:

    OPENAI_API_KEY=your_openai_api_key_here
    OPENAI_MODEL=gpt-4o-mini  # Optional, defaults to gpt-4o-mini
  4. Run the development server

    npm run dev
  5. Open the application

    Navigate to http://localhost:3000 in your browser.

📖 Usage

  1. Upload a Document

    • Drag and drop a legal document (PDF, DOCX, TXT, or MD)
    • Or click "Choose file" to browse and select
    • The text will be automatically extracted
  2. Select an Analysis Task

    • Summarize: Get a comprehensive overview of the document
    • Key Clauses: Extract and review critical contract clauses with status indicators
  3. Run Analysis

    • Click the "Run" button to process the document
    • Wait for AI analysis to complete
    • Review the formatted output with syntax highlighting

🏗️ Project Structure

visualcare/
├── app/
│   ├── api/
│   │   └── analyze/
│   │       └── route.ts          # OpenAI API integration
│   ├── globals.css               # Global styles
│   ├── layout.tsx                # Root layout
│   └── page.tsx                  # Main application UI
├── lib/
│   └── extractText.ts            # Document text extraction utilities
├── public/                       # Static assets
├── next.config.ts                # Next.js configuration
├── package.json                  # Dependencies and scripts
├── tailwind.config.ts            # Tailwind CSS configuration
└── tsconfig.json                 # TypeScript configuration

🔧 Technical Stack

Frontend

  • Next.js 15.5.6 - React framework with App Router
  • React 19.1.0 - UI library
  • TypeScript 5 - Type safety
  • Tailwind CSS 4 - Utility-first styling
  • React Markdown - Markdown rendering with syntax highlighting

Document Processing

  • PDF.js (pdfjs-dist) - PDF text extraction
  • Mammoth - DOCX to text conversion
  • JSZip - DOCX headers/footers extraction

AI Integration

  • OpenAI API - Natural language processing and analysis

⚙️ Configuration

Environment Variables

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| OPENAI_API_KEY | Yes | - | Your OpenAI API key |
| OPENAI_MODEL | No | gpt-4o-mini | OpenAI model to use |
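
As an illustration of how these two variables might be resolved on the server, here is a minimal sketch. The helper name `resolveOpenAIConfig` and the `OpenAIConfig` type are hypothetical and not taken from the repository; only the variable names and the `gpt-4o-mini` default come from the configuration documented here.

```typescript
// Hypothetical helper: names are illustrative, not from the repository.
interface OpenAIConfig {
  apiKey: string;
  model: string;
}

// Default model as documented for OPENAI_MODEL.
const DEFAULT_MODEL = "gpt-4o-mini";

function resolveOpenAIConfig(
  env: Record<string, string | undefined>,
): OpenAIConfig {
  const apiKey = env.OPENAI_API_KEY?.trim();
  if (!apiKey) {
    // Fail early with a clear message instead of a late 401 from the API.
    throw new Error("OPENAI_API_KEY is required but not set");
  }
  // An unset or blank OPENAI_MODEL falls back to the default.
  const model = env.OPENAI_MODEL?.trim() || DEFAULT_MODEL;
  return { apiKey, model };
}
```

In a Node.js route handler this could be called as `resolveOpenAIConfig(process.env)`. Taking the environment as a parameter keeps the function easy to test.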

Supported Models

  • gpt-4o-mini (default, cost-effective)
  • gpt-4o (more capable)
  • gpt-4-turbo
  • Any other OpenAI chat completion model

📝 Scripts

| Command | Description |
| --- | --- |
| npm run dev | Start development server with Turbopack |
| npm run build | Build production bundle with Turbopack |
| npm start | Start production server |

🔒 Privacy & Security

  • Client-side extraction: All document parsing happens in your browser
  • No file storage: Documents are not stored on any server
  • API only: Extracted text is sent only to the OpenAI API for analysis
  • Secure transmission: All API calls use HTTPS

📄 Document Processing Details

PDF Extraction

  • Uses PDF.js with multiple CDN fallbacks (unpkg, cloudflare)
  • Robust error handling with page-level timeouts
  • Supports system fonts and embedded fonts
  • Maximum image size: 1 MB per image
  • Timeout protection: 30s for loading, 10s per page
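
The timeout protection described above can be sketched with `Promise.race`. This is a minimal illustration and not the code in `lib/extractText.ts`. The names `withTimeout`, `extractPages`, and `readPage` are hypothetical, and skipping a page that times out is an assumption about the error handling. Only the 30s and 10s limits come from this README.

```typescript
// Limits documented above; all function names below are hypothetical.
const LOAD_TIMEOUT_MS = 30_000; // would wrap the document-loading promise
const PAGE_TIMEOUT_MS = 10_000; // applied to each page below

// Reject if `work` does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms,
    );
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}

// Read pages one by one; `readPage` stands in for the real PDF.js page reader.
async function extractPages(
  pageCount: number,
  readPage: (pageNumber: number) => Promise<string>,
  pageTimeoutMs: number = PAGE_TIMEOUT_MS,
): Promise<string[]> {
  const pages: string[] = [];
  for (let n = 1; n <= pageCount; n++) {
    try {
      pages.push(await withTimeout(readPage(n), pageTimeoutMs, `page ${n}`));
    } catch {
      // Assumption: a page that fails or times out is skipped, not fatal.
      pages.push("");
    }
  }
  return pages;
}
```

Note that `Promise.race` only stops waiting. It does not cancel the underlying work, so a stuck page may keep running in the background.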

DOCX Extraction

  • Primary method: ZIP parsing for headers, body, and footers
  • Fallback: Mammoth.js for body text
  • Preserves formatting markers (bullets, lists)
  • Normalizes whitespace for consistent output
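
To illustrate the ZIP-based path, the sketch below pulls paragraph text out of WordprocessingML parts once they have been read from the archive as strings. It is an assumption-laden illustration, not the repository's implementation. The function names are hypothetical, the part names (`word/header1.xml`, `word/document.xml`, `word/footer1.xml`) are the conventional ones rather than guaranteed ones, and the regex approach ignores edge cases such as text boxes nested inside paragraphs.

```typescript
// Decode the five predefined XML entities.
function decodeXmlEntities(s: string): string {
  return s
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, "&"); // last, so "&amp;lt;" decodes to "&lt;", not "<"
}

// Collect the text of each <w:p> paragraph by joining its <w:t> runs.
function paragraphsFromWordXml(xml: string): string[] {
  const paragraphs: string[] = [];
  const paraRe = /<w:p[ >][\s\S]*?<\/w:p>/g; // does not match <w:pPr>
  const textRe = /<w:t(?:\s[^>]*)?>([\s\S]*?)<\/w:t>/g; // does not match <w:tab/>
  for (const para of xml.match(paraRe) ?? []) {
    let text = "";
    for (const m of para.matchAll(textRe)) {
      text += decodeXmlEntities(m[1] ?? "");
    }
    if (text.trim()) paragraphs.push(text);
  }
  return paragraphs;
}

// `parts` maps ZIP entry names to their XML content (hypothetical shape).
function extractDocxText(parts: Record<string, string>): string {
  const names = Object.keys(parts).sort();
  const headers = names.filter((n) => /^word\/header\d*\.xml$/.test(n));
  const footers = names.filter((n) => /^word\/footer\d*\.xml$/.test(n));
  // Output order: headers, then body, then footers.
  const ordered = [...headers, "word/document.xml", ...footers];
  return ordered
    .flatMap((n) => paragraphsFromWordXml(parts[n] ?? ""))
    .join("\n");
}
```

If this path yields no text, a caller could fall back to Mammoth for the body text, as described in the list above.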

Text Normalization

  • Removes control characters
  • Normalizes spaces and line breaks
  • Preserves paragraph structure
  • Truncates the text to 90,000 characters to fit the API context
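
The normalization steps listed above could look roughly like the following. This is a sketch under assumptions. The function name `normalizeText`, the exact regular expressions, and the choice to keep at most one blank line between paragraphs are illustrative, and only the 90,000-character limit is taken from this README.

```typescript
const MAX_CHARS = 90_000; // limit documented above

// Hypothetical normalizer; the real rules in the project may differ.
function normalizeText(raw: string, maxChars: number = MAX_CHARS): string {
  const cleaned = raw
    // Unify Windows and old Mac line endings.
    .replace(/\r\n?/g, "\n")
    // Remove control characters but keep tab and newline.
    .replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F]/g, "")
    // Collapse runs of spaces, tabs, and non-breaking spaces.
    .replace(/[ \t\u00A0]+/g, " ")
    // Drop spaces around line breaks.
    .replace(/ *\n */g, "\n")
    // Keep paragraph breaks, but at most one blank line.
    .replace(/\n{3,}/g, "\n\n")
    .trim();
  // Truncate so the text fits the model context.
  return cleaned.length > maxChars ? cleaned.slice(0, maxChars) : cleaned;
}
```

Truncating after cleanup means the character budget is spent on content and not on whitespace that would have been removed anyway.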

🤖 AI Analysis Modes

Summarize

Generates comprehensive summaries including:

  • Parties and dates
  • Purpose and scope
  • Key obligations and deliverables
  • Payment, terms, and termination conditions
  • Liability, indemnity, and warranties
  • Confidentiality and IP rights
  • Identified risks and red flags
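
One way to steer a chat model toward these sections is to list them in the system message. The sketch below only builds the message array in the chat-completions `role`/`content` format and makes no API call. The prompt wording and the names `SUMMARY_SECTIONS` and `buildSummarizeMessages` are hypothetical. The actual prompt used by `app/api/analyze/route.ts` is not shown in this README.

```typescript
type ChatMessage = { role: "system" | "user"; content: string };

// Section titles copied from the list above.
const SUMMARY_SECTIONS = [
  "Parties and dates",
  "Purpose and scope",
  "Key obligations and deliverables",
  "Payment, terms, and termination conditions",
  "Liability, indemnity, and warranties",
  "Confidentiality and IP rights",
  "Identified risks and red flags",
] as const;

// Hypothetical prompt builder; the wording is an assumption.
function buildSummarizeMessages(documentText: string): ChatMessage[] {
  const outline = SUMMARY_SECTIONS.map((s, i) => `${i + 1}. ${s}`).join("\n");
  const system =
    "You are a paralegal assistant. Summarize only what the document states. " +
    'Write "Not stated" for any section the document does not cover.\n\n' +
    "Cover these sections in order:\n" +
    outline;
  return [
    { role: "system", content: system },
    { role: "user", content: documentText },
  ];
}
```

Keeping the section list in one constant makes it straightforward to reuse the same titles when rendering the result.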

Key Clauses

Extracts and analyzes critical clauses:

  • Parties & Definitions
  • Payment & Fees
  • Term & Termination
  • Confidentiality
  • IP Ownership/Licensing
  • Warranties/Disclaimers
  • Indemnification
  • Limitation of Liability
  • Assignment & Subcontracting
  • Governing Law & Dispute Resolution
  • Compliance (privacy, export, anti-bribery)
  • Change Control & Notices
  • Non-compete & Non-solicit

Each clause includes:

  • Status (Present/Missing/Ambiguous)
  • Summary
  • Exact quote from document
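
The per-clause fields above map naturally onto a small TypeScript type. The sketch below also adds a check that a returned quote really occurs in the document, which is a useful guard against model hallucination. The type, the field names, and the `verifyFinding` helper are hypothetical, and the README does not say that the project performs this check.

```typescript
type ClauseStatus = "Present" | "Missing" | "Ambiguous";

// Hypothetical shape for one clause result.
interface ClauseFinding {
  clause: string; // e.g. "Term & Termination"
  status: ClauseStatus;
  summary: string;
  quote: string | null; // null when the clause is missing
}

// Collapse whitespace so line wrapping does not break the comparison.
const collapse = (s: string): string => s.replace(/\s+/g, " ").trim();

// Return a list of problems; an empty list means the finding looks consistent.
function verifyFinding(f: ClauseFinding, documentText: string): string[] {
  const problems: string[] = [];
  if (f.status === "Missing") {
    if (f.quote) problems.push("a Missing clause should not carry a quote");
  } else if (!f.quote) {
    problems.push(`a ${f.status} clause needs a supporting quote`);
  } else if (!collapse(documentText).includes(collapse(f.quote))) {
    problems.push("quote was not found verbatim in the document");
  }
  return problems;
}
```

A finding that fails the check could be shown with a warning instead of being trusted as an exact quote.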

🚧 Limitations

  • Documents are truncated to 90,000 characters to fit model context
  • Image-based PDFs (scanned documents) cannot be processed
  • Password-protected documents are not supported
  • Very large files may experience slower processing times
  • API usage is subject to OpenAI rate limits and pricing

🛠️ Development

Built with:

  • Turbopack for fast development and builds
  • TypeScript for type safety across the codebase
  • ESLint for code quality
  • PostCSS for CSS processing

📦 Deployment

Deploy on Vercel

The easiest deployment option:

  1. Push your code to GitHub
  2. Import project in Vercel
  3. Add OPENAI_API_KEY environment variable
  4. Deploy

Other Platforms

This is a standard Next.js application and can be deployed to:

  • AWS Amplify
  • Netlify
  • Railway
  • Render
  • Any Node.js hosting platform

See Next.js deployment documentation for details.

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📄 License

This project is a POC (Proof of Concept) for demonstration purposes.

🙏 Acknowledgments


Note: This is a proof-of-concept project. For production use, consider adding authentication, rate limiting, error monitoring, and additional security measures.
