From 0139ea7fc91e535b9b3f8cbc0e5c3982811d486a Mon Sep 17 00:00:00 2001
From: Utakata
Date: Mon, 6 Jan 2025 19:07:03 +0900
Subject: [PATCH 1/5] Add google-generativeai dependency for Gemini support

---
 requirements.txt | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/requirements.txt b/requirements.txt
index e7c0d27..f9fcc7a 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,5 +1,5 @@
-pydantic
-openai
-pymupdf
-termcolor
-
+PyPDF2
+google-generativeai
+python-dotenv
+requests
+tqdm
\ No newline at end of file

From 9cac8edf4e564a69da0a7eea3a7f7c4c176a2169 Mon Sep 17 00:00:00 2001
From: Utakata
Date: Mon, 6 Jan 2025 19:07:46 +0900
Subject: [PATCH 2/5] Add Gemini API support

---
 .env.example         |   2 +
 README.md            | 186 ++++++++++++------------------------------
 pdf_reader_gemini.py | 102 ++++++++++++++++++++++++
 3 files changed, 154 insertions(+), 136 deletions(-)
 create mode 100644 .env.example
 create mode 100644 pdf_reader_gemini.py

diff --git a/.env.example b/.env.example
new file mode 100644
index 0000000..11d016d
--- /dev/null
+++ b/.env.example
@@ -0,0 +1,2 @@
+# Gemini API Key
+GEMINI_API_KEY=your_api_key_here
diff --git a/README.md b/README.md
index 81d20b2..cf7bc42 100644
--- a/README.md
+++ b/README.md
@@ -1,136 +1,50 @@
-# 📚 AI reads books: Page-by-Page PDF Knowledge Extractor & Summarizer
-
-The `read_books.py` script performs an intelligent page-by-page analysis of PDF books, methodically extracting knowledge points and generating progressive summaries at specified intervals. It processes each page individually, allowing for detailed content understanding while maintaining the contextual flow of the book. Below is a detailed explanation of how the script works:
-
-### Features
-
-- 📚 Automated PDF book analysis and knowledge extraction
-- 🤖 AI-powered content understanding and summarization
-- 📊 Interval-based progress summaries
-- 💾 Persistent knowledge base storage
-- 📝 Markdown-formatted summaries
-- 🎨 Color-coded terminal output for better visibility
-- 🔄 Resume capability with existing knowledge base
-- ⚙️ Configurable analysis intervals and test modes
-- 🚫 Smart content filtering (skips TOC, index pages, etc.)
-- 📂 Organized directory structure for outputs
-
-## ❤️ Support & Get 400+ AI Projects
-
-This is one of 400+ fascinating projects in my collection! [Support me on Patreon](https://www.patreon.com/c/echohive42/membership) to get:
-
-- 🎯 Access to 400+ AI projects (and growing daily!)
-  - Including advanced projects like [2 Agent Real-time voice template with turn taking](https://www.patreon.com/posts/2-agent-real-you-118330397)
-- 📥 Full source code & detailed explanations
-- 📚 1000x Cursor Course
-- 🎓 Live coding sessions & AMAs
-- 💬 1-on-1 consultations (higher tiers)
-- 🎁 Exclusive discounts on AI tools & platforms (up to $180 value)
-
-## How to Use
-
-1. **Setup**
-   ```bash
-   # Clone the repository
-   git clone [repository-url]
-   cd [repository-name]
-
-   # Install requirements
-   pip install -r requirements.txt
-   ```
-
-2. **Configure**
-   - Place your PDF file in the project root directory
-   - Open `read_books.py` and update the `PDF_NAME` constant with your PDF filename
-   - (Optional) Adjust other constants like `ANALYSIS_INTERVAL` or `TEST_PAGES`
-
-3. **Run**
-   ```bash
-   python read_books.py
-   ```
-
-4. **Output**
-   The script will generate:
-   - `book_analysis/knowledge_bases/`: JSON files containing extracted knowledge
-   - `book_analysis/summaries/`: Markdown files with interval and final summaries
-   - `book_analysis/pdfs/`: Copy of your PDF file
-
-5. **Customization Options**
-   - Set `ANALYSIS_INTERVAL = None` to skip interval summaries
-   - Set `TEST_PAGES = None` to process entire book
-   - Adjust `MODEL` and `ANALYSIS_MODEL` for different AI models
-
-### Configuration Constants
-
-- `PDF_NAME`: The name of the PDF file to be analyzed.
-- `BASE_DIR`: The base directory for the analysis.
-- `PDF_DIR`: Directory where the PDF file is stored.
-- `KNOWLEDGE_DIR`: Directory where the knowledge base will be saved.
-- `SUMMARIES_DIR`: Directory where the summaries will be saved.
-- `PDF_PATH`: Full path to the PDF file.
-- `OUTPUT_PATH`: Path to the knowledge base JSON file.
-- `ANALYSIS_INTERVAL`: Number of pages after which an interval analysis is generated. Set to `None` to skip interval analyses.
-- `MODEL`: The model used for processing pages.
-- `ANALYSIS_MODEL`: The model used for generating analyses.
-- `TEST_PAGES`: Number of pages to process for testing. Set to `None` to process the entire book.
-
-### Classes and Functions
-
-#### `PageContent` Class
-
-A Pydantic model that represents the structure of the response from the OpenAI API for page content analysis. It has two fields:
-
-- `has_content`: A boolean indicating if the page has relevant content.
-- `knowledge`: A list of knowledge points extracted from the page.
-
-#### `load_or_create_knowledge_base() -> Dict[str, Any]`
-
-Loads the existing knowledge base from the JSON file if it exists. If not, it returns an empty dictionary.
-
-#### `save_knowledge_base(knowledge_base: list[str])`
-
-Saves the knowledge base to a JSON file. It prints a message indicating the number of items saved.
-
-#### `process_page(client: OpenAI, page_text: str, current_knowledge: list[str], page_num: int) -> list[str]`
-
-Processes a single page of the PDF. It sends the page text to the OpenAI API for analysis and updates the knowledge base with the extracted knowledge points. It also saves the updated knowledge base to a JSON file.
-
-#### `load_existing_knowledge() -> list[str]`
-
-Loads the existing knowledge base from the JSON file if it exists. If not, it returns an empty list.
-
-#### `analyze_knowledge_base(client: OpenAI, knowledge_base: list[str]) -> str`
-
-Generates a comprehensive summary of the entire knowledge base using the OpenAI API. It returns the summary in markdown format.
-
-#### `setup_directories()`
-
-Sets up the necessary directories for the analysis. It clears any previously generated files and ensures the PDF file is in the correct location.
-
-#### `save_summary(summary: str, is_final: bool = False)`
-
-Saves the generated summary to a markdown file. It creates a file with a proper naming convention based on whether it is a final or interval summary.
-
-#### `print_instructions()`
-
-Prints instructions for using the script. It explains the configuration options and how to run the script.
-
-#### `main()`
-
-The main function that orchestrates the entire process. It sets up directories, loads the knowledge base, processes each page of the PDF, generates interval and final summaries, and saves them.
-
-### How It Works
-
-1. **Setup**: The script sets up the necessary directories and ensures the PDF file is in the correct location.
-2. **Load Knowledge Base**: It loads the existing knowledge base if it exists.
-3. **Process Pages**: It processes each page of the PDF, extracting knowledge points and updating the knowledge base.
-4. **Generate Summaries**: It generates interval summaries based on the `ANALYSIS_INTERVAL` and a final summary after processing all pages.
-5. **Save Results**: It saves the knowledge base and summaries to their respective files.
-
-### Running the Script
-
-1. Place your PDF in the same directory as the script.
-2. Update the `PDF_NAME` constant with your PDF filename.
-3. Run the script. It will process the book, extract knowledge points, and generate summaries.
-
-### Example Usage
+# AI Reads Books: Page-by-Page PDF Knowledge Extractor & Summarizer
+
+This project provides an intelligent way to analyze PDF books page by page, extracting key knowledge points and generating progressive summaries at specified intervals. The script now supports both OpenAI's GPT models and Google's Gemini Pro model.
+
+## Features
+
+- Page-by-page PDF text extraction
+- Detailed analysis of each page's content
+- Progressive summaries at specified intervals
+- Support for both OpenAI GPT and Google Gemini Pro
+- Results saved in JSON format for easy processing
+
+## Setup
+
+1. Clone the repository
+2. Install dependencies:
+   ```bash
+   pip install -r requirements.txt
+   ```
+3. Create a `.env` file based on `.env.example` and add your API key:
+   - For OpenAI GPT: Use `pdf_reader.py` and set `OPENAI_API_KEY`
+   - For Gemini Pro: Use `pdf_reader_gemini.py` and set `GEMINI_API_KEY`
+
+## Usage
+
+Using Gemini Pro:
+```bash
+python pdf_reader_gemini.py path/to/your/book.pdf [--output results.json] [--summary-interval 10]
+```
+
+Using OpenAI GPT:
+```bash
+python pdf_reader.py path/to/your/book.pdf [--output results.json] [--summary-interval 10]
+```
+
+### Parameters
+
+- `pdf_path`: Path to your PDF file (required)
+- `--output`: Path for the output JSON file (default: analysis_results.json)
+- `--summary-interval`: Number of pages after which to generate a progressive summary (default: 10)
+
+## Output
+
+The script generates a JSON file containing:
+- Individual page analyses
+- Progressive summaries at specified intervals
+
+## License
+
+MIT
diff --git a/pdf_reader_gemini.py b/pdf_reader_gemini.py
new file mode 100644
index 0000000..5fc113d
--- /dev/null
+++ b/pdf_reader_gemini.py
@@ -0,0 +1,102 @@
+import os
+import PyPDF2
+import google.generativeai as genai
+from dotenv import load_dotenv
+from tqdm import tqdm
+import json
+
+# Load environment variables
+load_dotenv()
+
+# Configure Gemini API
+GEMINI_API_KEY = os.getenv('GEMINI_API_KEY')
+if not GEMINI_API_KEY:
+    raise ValueError('GEMINI_API_KEY environment variable is not set')
+
+genai.configure(api_key=GEMINI_API_KEY)
+model = genai.GenerativeModel('gemini-pro')
+
+def extract_text_from_pdf(pdf_path):
+    with open(pdf_path, 'rb') as file:
+        pdf_reader = PyPDF2.PdfReader(file)
+        total_pages = len(pdf_reader.pages)
+        print(f'Total pages in PDF: {total_pages}')
+
+        text_content = []
+        for page in range(total_pages):
+            text = pdf_reader.pages[page].extract_text()
+            text_content.append(text)
+
+    return text_content
+
+def analyze_page(page_text, page_number):
+    prompt = f"""Analyze the following page {page_number} from a book and extract key points and insights:
+
+{page_text}
+
+Provide a concise summary of the main points and any important concepts discussed on this page."""
+
+    try:
+        response = model.generate_content(prompt)
+        return response.text
+    except Exception as e:
+        print(f'Error analyzing page {page_number}: {str(e)}')
+        return f'Error analyzing page {page_number}'
+
+def main(pdf_path, output_path='analysis_results.json', summary_interval=10):
+    # Extract text from PDF
+    pages_content = extract_text_from_pdf(pdf_path)
+
+    # Analyze each page
+    analysis_results = []
+    progressive_summaries = []
+
+    print('\nAnalyzing pages...')
+    for i, page_text in enumerate(tqdm(pages_content)):
+        # Analyze individual page
+        page_analysis = analyze_page(page_text, i + 1)
+        analysis_results.append({
+            'page_number': i + 1,
+            'analysis': page_analysis
+        })
+
+        # Generate progressive summary at intervals
+        if (i + 1) % summary_interval == 0:
+            pages_to_summarize = pages_content[i - summary_interval + 1:i + 1]
+            combined_text = '\n'.join(pages_to_summarize)
+
+            summary_prompt = f"""Provide a comprehensive summary of the following section (pages {i - summary_interval + 2}-{i + 1}):
+
+{combined_text}
+
+Provide a concise summary that captures the main themes, concepts, and developments in this section."""
+
+            try:
+                summary_response = model.generate_content(summary_prompt)
+                progressive_summaries.append({
+                    'pages': f'{i - summary_interval + 2}-{i + 1}',
+                    'summary': summary_response.text
+                })
+            except Exception as e:
+                print(f'Error generating summary for pages {i - summary_interval + 2}-{i + 1}: {str(e)}')
+
+    # Save results
+    results = {
+        'page_analyses': analysis_results,
+        'progressive_summaries': progressive_summaries
+    }
+
+    with open(output_path, 'w', encoding='utf-8') as f:
+        json.dump(results, f, indent=2, ensure_ascii=False)
+
+    print(f'\nAnalysis complete! Results saved to {output_path}')
+
+if __name__ == '__main__':
+    import argparse
+    parser = argparse.ArgumentParser(description='PDF Book Analyzer using Gemini API')
+    parser.add_argument('pdf_path', help='Path to the PDF file')
+    parser.add_argument('--output', default='analysis_results.json', help='Output JSON file path')
+    parser.add_argument('--summary-interval', type=int, default=10, help='Page interval for progressive summaries')
+
+    args = parser.parse_args()
+    main(args.pdf_path, args.output, args.summary_interval)

From 9d03a59c7b0fc4f3c5efaeed5fbe1bfc484b9c52 Mon Sep 17 00:00:00 2001
From: Utakata
Date: Mon, 6 Jan 2025 19:11:18 +0900
Subject: [PATCH 3/5] Update README to preserve original content and add Gemini API support

---
 README.md | 210 +++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 160 insertions(+), 50 deletions(-)

diff --git a/README.md b/README.md
index cf7bc42..fc47717 100644
--- a/README.md
+++ b/README.md
@@ -1,50 +1,160 @@
-# AI Reads Books: Page-by-Page PDF Knowledge Extractor & Summarizer
-
-This project provides an intelligent way to analyze PDF books page by page, extracting key knowledge points and generating progressive summaries at specified intervals. The script now supports both OpenAI's GPT models and Google's Gemini Pro model.
-
-## Features
-
-- Page-by-page PDF text extraction
-- Detailed analysis of each page's content
-- Progressive summaries at specified intervals
-- Support for both OpenAI GPT and Google Gemini Pro
-- Results saved in JSON format for easy processing
-
-## Setup
-
-1. Clone the repository
-2. Install dependencies:
-   ```bash
-   pip install -r requirements.txt
-   ```
-3. Create a `.env` file based on `.env.example` and add your API key:
-   - For OpenAI GPT: Use `pdf_reader.py` and set `OPENAI_API_KEY`
-   - For Gemini Pro: Use `pdf_reader_gemini.py` and set `GEMINI_API_KEY`
-
-## Usage
-
-Using Gemini Pro:
-```bash
-python pdf_reader_gemini.py path/to/your/book.pdf [--output results.json] [--summary-interval 10]
-```
-
-Using OpenAI GPT:
-```bash
-python pdf_reader.py path/to/your/book.pdf [--output results.json] [--summary-interval 10]
-```
-
-### Parameters
-
-- `pdf_path`: Path to your PDF file (required)
-- `--output`: Path for the output JSON file (default: analysis_results.json)
-- `--summary-interval`: Number of pages after which to generate a progressive summary (default: 10)
-
-## Output
-
-The script generates a JSON file containing:
-- Individual page analyses
-- Progressive summaries at specified intervals
-
-## License
-
-MIT
+# 📚 AI reads books: Page-by-Page PDF Knowledge Extractor & Summarizer
+
+The script performs an intelligent page-by-page analysis of PDF books, methodically extracting knowledge points and generating progressive summaries at specified intervals. It processes each page individually, allowing for detailed content understanding while maintaining the contextual flow of the book. Now with support for both OpenAI GPT and Google's Gemini Pro API!
+
+### Features
+
+- 📚 Automated PDF book analysis and knowledge extraction
+- 🤖 AI-powered content understanding and summarization (OpenAI GPT or Google Gemini Pro)
+- 📊 Interval-based progress summaries
+- 💾 Persistent knowledge base storage
+- 📝 Markdown-formatted summaries
+- 🎨 Color-coded terminal output for better visibility
+- 🔄 Resume capability with existing knowledge base
+- ⚙️ Configurable analysis intervals and test modes
+- 🚫 Smart content filtering (skips TOC, index pages, etc.)
+- 📂 Organized directory structure for outputs
+- 🔄 Choice between OpenAI GPT and Google Gemini Pro APIs
+
+## ❤️ Support & Get 400+ AI Projects
+
+This is one of 400+ fascinating projects in my collection! [Support me on Patreon](https://www.patreon.com/c/echohive42/membership) to get:
+
+- 🎯 Access to 400+ AI projects (and growing daily!)
+  - Including advanced projects like [2 Agent Real-time voice template with turn taking](https://www.patreon.com/posts/2-agent-real-you-118330397)
+- 📥 Full source code & detailed explanations
+- 📚 1000x Cursor Course
+- 🎓 Live coding sessions & AMAs
+- 💬 1-on-1 consultations (higher tiers)
+- 🎁 Exclusive discounts on AI tools & platforms (up to $180 value)
+
+## How to Use
+
+1. **Setup**
+   ```bash
+   # Clone the repository
+   git clone [repository-url]
+   cd [repository-name]
+
+   # Install requirements
+   pip install -r requirements.txt
+   ```
+
+2. **Configure**
+   - Create a `.env` file based on `.env.example`
+   - Add your API key (either OPENAI_API_KEY or GEMINI_API_KEY)
+   - Place your PDF file in the project root directory (for `read_books.py`, also set the `PDF_NAME` constant)
+
+3. **Run**
+   For OpenAI GPT:
+   ```bash
+   python read_books.py
+   ```
+
+   For Google Gemini Pro:
+   ```bash
+   python pdf_reader_gemini.py your_book.pdf
+   ```
+
+4. **Output**
+   The script will generate:
+   - `book_analysis/knowledge_bases/`: JSON files containing extracted knowledge
+   - `book_analysis/summaries/`: Markdown files with interval and final summaries
+   - `book_analysis/pdfs/`: Copy of your PDF file
+
+5. **Customization Options**
+   - Set `ANALYSIS_INTERVAL = None` to skip interval summaries
+   - Set `TEST_PAGES = None` to process entire book
+   - Choose between OpenAI GPT and Gemini Pro for analysis
+
+### Configuration Constants
+
+- `PDF_NAME`: The name of the PDF file to be analyzed.
+- `BASE_DIR`: The base directory for the analysis.
+- `PDF_DIR`: Directory where the PDF file is stored.
+- `KNOWLEDGE_DIR`: Directory where the knowledge base will be saved.
+- `SUMMARIES_DIR`: Directory where the summaries will be saved.
+- `PDF_PATH`: Full path to the PDF file.
+- `OUTPUT_PATH`: Path to the knowledge base JSON file.
+- `ANALYSIS_INTERVAL`: Number of pages after which an interval analysis is generated. Set to `None` to skip interval analyses.
+- `MODEL`: The model used for processing pages.
+- `ANALYSIS_MODEL`: The model used for generating analyses.
+- `TEST_PAGES`: Number of pages to process for testing. Set to `None` to process the entire book.
+
+### Classes and Functions
+
+#### `PageContent` Class
+
+A Pydantic model that represents the structure of the response from the AI API for page content analysis. It has two fields:
+
+- `has_content`: A boolean indicating if the page has relevant content.
+- `knowledge`: A list of knowledge points extracted from the page.
+
+#### `load_or_create_knowledge_base() -> Dict[str, Any]`
+
+Loads the existing knowledge base from the JSON file if it exists. If not, it returns an empty dictionary.
+
+#### `save_knowledge_base(knowledge_base: list[str])`
+
+Saves the knowledge base to a JSON file. It prints a message indicating the number of items saved.
+
+#### `process_page(client, page_text: str, current_knowledge: list[str], page_num: int) -> list[str]`
+
+Processes a single page of the PDF. It sends the page text to the AI API for analysis and updates the knowledge base with the extracted knowledge points. It also saves the updated knowledge base to a JSON file.
+
+#### `load_existing_knowledge() -> list[str]`
+
+Loads the existing knowledge base from the JSON file if it exists. If not, it returns an empty list.
+
+#### `analyze_knowledge_base(client, knowledge_base: list[str]) -> str`
+
+Generates a comprehensive summary of the entire knowledge base using the AI API. It returns the summary in markdown format.
+
+#### `setup_directories()`
+
+Sets up the necessary directories for the analysis. It clears any previously generated files and ensures the PDF file is in the correct location.
+
+#### `save_summary(summary: str, is_final: bool = False)`
+
+Saves the generated summary to a markdown file. It creates a file with a proper naming convention based on whether it is a final or interval summary.
+
+#### `print_instructions()`
+
+Prints instructions for using the script. It explains the configuration options and how to run the script.
+
+#### `main()`
+
+The main function that orchestrates the entire process. It sets up directories, loads the knowledge base, processes each page of the PDF, generates interval and final summaries, and saves them.
+
+### How It Works
+
+1. **Setup**: The script sets up the necessary directories and ensures the PDF file is in the correct location.
+2. **Load Knowledge Base**: It loads the existing knowledge base if it exists.
+3. **Process Pages**: It processes each page of the PDF, extracting knowledge points and updating the knowledge base.
+4. **Generate Summaries**: It generates interval summaries based on the `ANALYSIS_INTERVAL` and a final summary after processing all pages.
+5. **Save Results**: It saves the knowledge base and summaries to their respective files.
+
+### API Choice Considerations
+
+#### OpenAI GPT
+- More established and widely tested
+- Generally provides more consistent results
+- Higher costs but potentially better quality
+
+#### Google Gemini Pro
+- Newer alternative with competitive capabilities
+- More cost-effective option
+- Growing and improving rapidly
+- Potentially faster response times
+
+### Example Usage
+
+Using OpenAI GPT:
+```bash
+python read_books.py  # set PDF_NAME = "The Art of War.pdf" in the script
+```
+
+Using Gemini Pro:
+```bash
+python pdf_reader_gemini.py "The Art of War.pdf" --summary-interval 5
+```

From 832be8e5f6d5430db5e320d1f9cbe032b71c3bcc Mon Sep 17 00:00:00 2001
From: Utakata
Date: Mon, 6 Jan 2025 19:12:05 +0900
Subject: [PATCH 4/5] Add Japanese version of PDF reader using Gemini API

---
 pdf_reader_gemini_ja.py | 102 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 102 insertions(+)
 create mode 100644 pdf_reader_gemini_ja.py

diff --git a/pdf_reader_gemini_ja.py b/pdf_reader_gemini_ja.py
new file mode 100644
index 0000000..c4b6a7c
--- /dev/null
+++ b/pdf_reader_gemini_ja.py
@@ -0,0 +1,102 @@
+import os
+import PyPDF2
+import google.generativeai as genai
+from dotenv import load_dotenv
+from tqdm import tqdm
+import json
+
+# 環境変数の読み込み
+load_dotenv()
+
+# Gemini APIの設定
+GEMINI_API_KEY = os.getenv('GEMINI_API_KEY')
+if not GEMINI_API_KEY:
+    raise ValueError('GEMINI_API_KEY environment variable is not set')
+
+genai.configure(api_key=GEMINI_API_KEY)
+model = genai.GenerativeModel('gemini-pro')
+
+def extract_text_from_pdf(pdf_path):
+    with open(pdf_path, 'rb') as file:
+        pdf_reader = PyPDF2.PdfReader(file)
+        total_pages = len(pdf_reader.pages)
+        print(f'PDFの総ページ数: {total_pages}')
+
+        text_content = []
+        for page in range(total_pages):
+            text = pdf_reader.pages[page].extract_text()
+            text_content.append(text)
+
+    return text_content
+
+def analyze_page(page_text, page_number):
+    prompt = f"""以下の本のページ{page_number}を分析し、重要なポイントと洞察を抽出してください:
+
+{page_text}
+
+このページで議論されている主要なポイントと重要な概念について、簡潔な要約を提供してください。"""
+
+    try:
+        response = model.generate_content(prompt)
+        return response.text
+    except Exception as e:
+        print(f'ページ{page_number}の分析中にエラーが発生しました: {str(e)}')
+        return f'ページ{page_number}の分析中にエラーが発生しました'
+
+def main(pdf_path, output_path='analysis_results.json', summary_interval=10):
+    # PDFからテキストを抽出
+    pages_content = extract_text_from_pdf(pdf_path)
+
+    # 各ページを分析
+    analysis_results = []
+    progressive_summaries = []
+
+    print('\nページを分析中...')
+    for i, page_text in enumerate(tqdm(pages_content)):
+        # 個別ページの分析
+        page_analysis = analyze_page(page_text, i + 1)
+        analysis_results.append({
+            'page_number': i + 1,
+            'analysis': page_analysis
+        })
+
+        # 指定された間隔で進行的な要約を生成
+        if (i + 1) % summary_interval == 0:
+            pages_to_summarize = pages_content[i - summary_interval + 1:i + 1]
+            combined_text = '\n'.join(pages_to_summarize)
+
+            summary_prompt = f"""以下の節({i - summary_interval + 2}ページから{i + 1}ページ)の包括的な要約を提供してください:
+
+{combined_text}
+
+この節で扱われている主要なテーマ、概念、展開について簡潔な要約を提供してください。"""
+
+            try:
+                summary_response = model.generate_content(summary_prompt)
+                progressive_summaries.append({
+                    'pages': f'{i - summary_interval + 2}-{i + 1}',
+                    'summary': summary_response.text
+                })
+            except Exception as e:
+                print(f'ページ{i - summary_interval + 2}-{i + 1}の要約生成中にエラーが発生しました: {str(e)}')
+
+    # 結果を保存
+    results = {
+        'page_analyses': analysis_results,
+        'progressive_summaries': progressive_summaries
+    }
+
+    with open(output_path, 'w', encoding='utf-8') as f:
+        json.dump(results, f, indent=2, ensure_ascii=False)
+
+    print(f'\n分析が完了しました!結果は{output_path}に保存されました')
+
+if __name__ == '__main__':
+    import argparse
+    parser = argparse.ArgumentParser(description='PDFブック分析ツール(Gemini API使用)')
+    parser.add_argument('pdf_path', help='PDFファイルのパス')
+    parser.add_argument('--output', default='analysis_results.json', help='出力JSONファイルのパス')
+    parser.add_argument('--summary-interval', type=int, default=10, help='進行的な要約を生成するページ間隔')
+
+    args = parser.parse_args()
+    main(args.pdf_path, args.output, args.summary_interval)

From 0ce9254119e27b7caefa844707e299b0e1f30344 Mon Sep 17 00:00:00 2001
From: Utakata
Date: Mon, 6 Jan 2025 19:13:29 +0900
Subject: [PATCH 5/5] Add Japanese version of README

---
 README_ja.md | 137 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 137 insertions(+)
 create mode 100644 README_ja.md

diff --git a/README_ja.md b/README_ja.md
new file mode 100644
index 0000000..1523c15
--- /dev/null
+++ b/README_ja.md
@@ -0,0 +1,137 @@
+# 📚 AI読書アシスタント: ページごとのPDF知識抽出&要約ツール
+
+このスクリプトは、PDFの書籍を1ページずつインテリジェントに分析し、知識ポイントを抽出し、指定した間隔で進行的な要約を生成します。各ページを個別に処理しながら、書籍全体の文脈の流れを維持します。OpenAI GPTとGoogle Gemini Pro APIの両方をサポートしています!
+
+### 機能
+
+- 📚 PDFブックの自動分析と知識抽出
+- 🤖 AIによるコンテンツ理解と要約(OpenAI GPTまたはGoogle Gemini Pro)
+- 📊 一定間隔での進捗要約
+- 💾 知識ベースの永続的な保存
+- 📝 Markdown形式の要約
+- 🎨 見やすい色分けされたターミナル出力
+- 🔄 既存の知識ベースからの再開機能
+- ⚙️ 設定可能な分析間隔とテストモード
+- 🚫 スマートなコンテンツフィルタリング(目次、索引ページなどをスキップ)
+- 📂 整理された出力ディレクトリ構造
+- 🌏 日本語コンテンツの完全サポート
+
+## ❤️ サポートと400以上のAIプロジェクト入手
+
+これは400以上の魅力的なプロジェクトコレクションの1つです![Patreonでサポート](https://www.patreon.com/c/echohive42/membership)して以下を入手できます:
+
+- 🎯 400以上のAIプロジェクトへのアクセス(日々増加中!)
+- 📥 完全なソースコードと詳細な説明
+- 📚 1000x Cursorコース
+- 🎓 ライブコーディングセッションとAMA
+- 💬 1対1のコンサルテーション(上位ティア)
+- 🎁 AIツールとプラットフォームの限定割引(最大$180相当)
+
+## 使い方
+
+1. **セットアップ**
+   ```bash
+   # リポジトリのクローン
+   git clone [repository-url]
+   cd [repository-name]
+
+   # 必要なパッケージのインストール
+   pip install -r requirements.txt
+   ```
+
+2. **設定**
+   - `.env.example`を参考に`.env`ファイルを作成
+   - APIキーを設定(OPENAI_API_KEYまたはGEMINI_API_KEY)
+   - PDFファイルをプロジェクトのルートディレクトリに配置(`read_books.py`の場合は`PDF_NAME`も設定)
+
+3. **実行**
+   OpenAI GPTを使用する場合:
+   ```bash
+   python read_books.py
+   ```
+
+   Google Gemini Proを使用する場合:
+   ```bash
+   # 英語版
+   python pdf_reader_gemini.py book.pdf
+   # 日本語版
+   python pdf_reader_gemini_ja.py 本の名前.pdf
+   ```
+
+4. **出力**
+   スクリプトは以下を生成します:
+   - `book_analysis/knowledge_bases/`: 抽出された知識を含むJSONファイル
+   - `book_analysis/summaries/`: 中間要約と最終要約のMarkdownファイル
+   - `book_analysis/pdfs/`: PDFファイルのコピー
+
+5. **カスタマイズオプション**
+   - `ANALYSIS_INTERVAL`を`None`に設定すると中間要約をスキップ
+   - `TEST_PAGES`を`None`に設定すると書籍全体を処理
+   - 分析にOpenAI GPTとGemini Proを選択可能
+
+### 設定項目
+
+- `PDF_NAME`: 分析対象のPDFファイル名
+- `BASE_DIR`: 分析のベースディレクトリ
+- `PDF_DIR`: PDFファイルの保存ディレクトリ
+- `KNOWLEDGE_DIR`: 知識ベースの保存ディレクトリ
+- `SUMMARIES_DIR`: 要約の保存ディレクトリ
+- `PDF_PATH`: PDFファイルへのフルパス
+- `OUTPUT_PATH`: 知識ベースJSONファイルのパス
+- `ANALYSIS_INTERVAL`: 中間分析を生成するページ間隔。`None`で中間分析をスキップ
+- `MODEL`: ページ処理に使用するモデル
+- `ANALYSIS_MODEL`: 分析生成に使用するモデル
+- `TEST_PAGES`: テスト用の処理ページ数。`None`で書籍全体を処理
+
+### API選択の考慮点
+
+#### OpenAI GPT
+- より確立され広くテストされている
+- 一般的により一貫性のある結果を提供
+- コストは高いが品質が良い可能性がある
+
+#### Google Gemini Pro
+- 競争力のある機能を持つ新しい選択肢
+- よりコスト効率が良い
+- 急速に成長・改善中
+- 応答時間が潜在的に速い
+- 日本語の処理が特に優れている
+
+### 仕組み
+
+1. **セットアップ**: 必要なディレクトリを設定し、PDFファイルが正しい場所にあることを確認
+2. **知識ベースの読み込み**: 既存の知識ベースがあれば読み込み
+3. **ページ処理**: PDFの各ページを処理し、知識ポイントを抽出して知識ベースを更新
+4. **要約生成**: `ANALYSIS_INTERVAL`に基づいて中間要約を生成し、全ページ処理後に最終要約を生成
+5. **結果の保存**: 知識ベースと要約を各ファイルに保存
+
+### コマンドライン例
+
+OpenAI GPTを使用:
+```bash
+python read_books.py  # PDF_NAME = "孫子の兵法.pdf" をスクリプト内で設定
+```
+
+Gemini Pro(日本語版)を使用:
+```bash
+python pdf_reader_gemini_ja.py "孫子の兵法.pdf" --summary-interval 5
+```
+
+### 日本語PDFに関する注意点
+
+1. **文字コード**
+   - PDFファイルはUTF-8エンコーディングを推奨
+   - 特殊な文字が含まれる場合は正しく処理されない可能性あり
+
+2. **レイアウト**
+   - 縦書きPDFは正しく処理されない可能性あり
+   - 複雑なレイアウトの場合、テキスト抽出が不完全な場合あり
+
+3. **OCR**
+   - スキャンされたPDFの場合、事前にOCR処理が必要
+   - 画像として埋め込まれたテキストは処理不可
+
+4. **最適なパフォーマンス**
+   - テキストベースのPDFを使用
+   - シンプルなレイアウトの文書を推奨
+   - 必要に応じて事前にPDF最適化を実施
\ No newline at end of file
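Note for reviewers: both `pdf_reader_gemini.py` and `pdf_reader_gemini_ja.py` write the same two top-level keys (`page_analyses` and `progressive_summaries`) via `json.dump`. As a minimal consumer-side sketch (the `summarize_results` helper below is illustrative, not part of this patch series), the results file can be rendered into a markdown digest:

```python
import json  # needed when loading a real analysis_results.json

def summarize_results(results):
    """Render the analyzer's results dict as a short markdown digest."""
    lines = [f"# Book analysis: {len(results['page_analyses'])} page analyses"]
    for block in results['progressive_summaries']:
        lines.append(f"## Pages {block['pages']}")
        lines.append(block['summary'])
    return '\n'.join(lines)

# Illustrative sample in the shape the scripts write (values are made up):
sample = {
    'page_analyses': [{'page_number': 1, 'analysis': 'Introduces the main argument.'}],
    'progressive_summaries': [{'pages': '1-10', 'summary': 'Covers the opening chapters.'}],
}

print(summarize_results(sample))
# For a real run, load the file the scripts produce:
#   with open('analysis_results.json', encoding='utf-8') as f:
#       results = json.load(f)
```

This works unchanged for both the English and Japanese readers, since `ensure_ascii=False` in the scripts keeps Japanese text readable in the JSON.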