DeepSeeker — Lightweight Multi-LLM Deep Research Engine

DeepSeeker is a modular deep-research system inspired by OpenAI’s Deep Research, designed for transparent reasoning, iterative search, and multi-model collaboration.

It integrates:

LLM0 — Planner & Final Analyst (high-quality model)
LLM1 — Reader & Summarizer (cost-efficient model)
BingSift — Web search + result extraction backend

DeepSeeker runs a full research pipeline:

1. LLM0 analyzes the question and decides whether to perform web search.
2. If search is required, DeepSeeker uses BingSift to fetch SERP results.
3. LLM0 inspects results (title + snippet) and chooses which pages require deep reading.
4. LLM1 reads each selected page and produces structured summaries.
5. LLM0 synthesizes all summaries and generates a final, well-structured report.

All LLM responses follow a lightweight JSON protocol (MCP-like, but much simpler), ensuring predictable, controllable system behavior.
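
The control flow of the pipeline can be sketched in a few lines of Python. This is a minimal illustration, not DeepSeeker's actual orchestrator: the function run_pipeline and the five stage callables (plan, search, select, read, synthesize) are hypothetical names, and only the JSON keys (action, direct_answer, search, answer, used_results) are taken from the protocol described in this README.

```python
from __future__ import annotations

from typing import Callable


def run_pipeline(
    question: str,
    plan: Callable[[str], dict],
    search: Callable[[dict], list[dict]],
    select: Callable[[str, list[dict]], list[int]],
    read: Callable[[str, dict], dict],
    synthesize: Callable[[str, list[dict]], dict],
) -> dict:
    """Run the stages in order. Every stage is an injected, hypothetical callable."""
    decision = plan(question)                 # LLM0 decides whether to search
    if decision["action"] == "direct_answer":
        # No search needed: return the planner's answer directly.
        return {"answer": decision["direct_answer"], "used_results": []}

    results = search(decision["search"])      # BingSift fetches SERP results
    chosen = select(question, results)        # LLM0 picks result indices to read
    summaries = [read(question, results[i]) for i in chosen]  # LLM1 summarizes
    return synthesize(question, summaries)    # LLM0 writes the final report
```

Passing the stages in as callables keeps the sketch testable with stubs and mirrors the idea that each stage can be exercised separately.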

Features

Two-LLM architecture that balances cost and quality: a high-quality model plans and synthesizes, a cheaper model reads pages.
Fixed JSON protocol for the plan, selection, summarization, and synthesis stages.
Extensible orchestrator written in Python.
Search powered by BingSift (keyword filtering, domain control, freshness filters).
Human-readable step logging for full transparency.
CLI utilities for testing each stage:
- plan — LLM0 planning behaviour
- search — BingSift integration
- run — full research pipeline

Installation

pip install -r requirements.txt

Set environment variables:

export OPENAI_API_KEY="your-key"
# Optional: custom endpoint
# export OPENAI_BASE_URL="https://your-host/v1"

Configuration

DeepSeeker supports configuration via JSON file or environment variables.

Option 1: Configuration File (Recommended)

Create a default configuration file:

python -m deepseeker.cli init

This creates config.json in your current directory:

{
  "api_key": "your-openai-api-key",
  "base_url": "https://api.openai.com/v1",
  "llm0": {
    "model": "gpt-5.1-thinking",
    "max_output_tokens": 4096
  },
  "llm1": {
    "model": "gpt-4o-mini",
    "max_output_tokens": 1536
  },
  "search_max_results": 10,
  "search_freshness": "week"
}

Automatic Detection: If config.json exists in the current directory, it will be automatically used by all commands.

Custom Config File: You can also specify a custom config file:

python -m deepseeker.cli --config custom_config.json run --question "your question"

Option 2: Environment Variables

If no config.json file is found, DeepSeeker falls back to environment variables:

# API Configuration
export OPENAI_API_KEY="your-api-key"
export OPENAI_BASE_URL="https://api.openai.com/v1"  # Optional

# LLM Configuration
export DEEPSEEKER_LLM0_MODEL="gpt-5.1-thinking"
export DEEPSEEKER_LLM0_MAX_TOKENS="4096"
export DEEPSEEKER_LLM1_MODEL="gpt-4o-mini"
export DEEPSEEKER_LLM1_MAX_TOKENS="1536"

# Search Configuration
export DEEPSEEKER_SEARCH_MAX_RESULTS="10"
export DEEPSEEKER_SEARCH_FRESHNESS="week"

Note: To use environment variables, either delete config.json or specify a different config file with --config.

Priority Order

Configuration is loaded in this priority order:

1. Explicit --config parameter
2. Default config.json file (if exists)
3. Environment variables
4. Built-in defaults
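
The priority order can be illustrated with a short sketch. It assumes a hypothetical load_settings function and a hypothetical DEFAULTS dictionary, and it handles only the two search settings to stay brief. The real loader in DeepSeeker may be structured differently.

```python
import json
import os
from pathlib import Path
from typing import Optional

# Hypothetical built-in defaults, mirroring the sample values in this README.
DEFAULTS = {"search_max_results": 10, "search_freshness": "week"}


def load_settings(explicit_path: Optional[str] = None, cwd: str = ".", env=None) -> dict:
    """Resolve settings: --config file, then ./config.json, then env vars, then defaults."""
    env = os.environ if env is None else env
    settings = dict(DEFAULTS)

    if explicit_path is not None:             # 1. explicit --config parameter
        settings.update(json.loads(Path(explicit_path).read_text(encoding="utf-8")))
        return settings

    default_file = Path(cwd) / "config.json"
    if default_file.is_file():                # 2. default config.json
        settings.update(json.loads(default_file.read_text(encoding="utf-8")))
        return settings

    # 3. environment variables (only consulted when no config file was used)
    if "DEEPSEEKER_SEARCH_MAX_RESULTS" in env:
        settings["search_max_results"] = int(env["DEEPSEEKER_SEARCH_MAX_RESULTS"])
    if "DEEPSEEKER_SEARCH_FRESHNESS" in env:
        settings["search_freshness"] = env["DEEPSEEKER_SEARCH_FRESHNESS"]
    return settings                           # 4. defaults fill anything left unset
```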

Configuration Example

Here's a complete example of a config.json file:

{
  "api_key": "sk-your-openai-api-key-here",
  "base_url": "https://api.openai.com/v1",
  "llm0": {
    "model": "gpt-4o",
    "max_output_tokens": 4096
  },
  "llm1": {
    "model": "gpt-4o-mini",
    "max_output_tokens": 1536
  },
  "search_max_results": 15,
  "search_freshness": "week"
}

Tips:

Leave api_key empty ("") to use environment variable OPENAI_API_KEY
Leave base_url empty ("") to use default OpenAI endpoint
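
As a rough sketch of the two tips above, an empty string can be treated the same as a missing value. The helper name resolve_credentials is hypothetical, and raising an error when no key is found is an assumption of this example rather than documented behavior.

```python
from __future__ import annotations

import os

# Default endpoint as shown in the sample config.json.
DEFAULT_BASE_URL = "https://api.openai.com/v1"


def resolve_credentials(config: dict, env=None) -> tuple[str, str]:
    """Return (api_key, base_url), treating empty strings as 'not set'."""
    env = os.environ if env is None else env
    # Empty or missing api_key falls back to the OPENAI_API_KEY variable.
    api_key = config.get("api_key") or env.get("OPENAI_API_KEY", "")
    # Empty or missing base_url falls back to the default OpenAI endpoint.
    base_url = config.get("base_url") or DEFAULT_BASE_URL
    if not api_key:
        raise ValueError("no API key in config file or OPENAI_API_KEY")
    return api_key, base_url
```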

You can use different config files for different projects:

python -m deepseeker.cli --config project1.json run --question "..."
python -m deepseeker.cli --config project2.json run --question "..."

CLI Usage

1. Initialize Configuration

# Create default config.json
python -m deepseeker.cli init

# Create custom config file
python -m deepseeker.cli init --output my_config.json

2. Search

# Uses config.json or environment variables
python -m deepseeker.cli search --query "intel earnings"

# Override config settings
python -m deepseeker.cli search --query "intel earnings" --when week --max-results 20

3. Planning

# Uses config.json or environment variables
python -m deepseeker.cli plan --question "Explain ARM vs RISC-V for servers."

# Use custom config file
python -m deepseeker.cli --config custom.json plan --question "your question"

4. Run full pipeline

# Uses config.json or environment variables
python -m deepseeker.cli run --question "Latest advances in large-scale model training using distributed computing and GPU clusters?"

# Use custom config file
python -m deepseeker.cli --config custom.json run --question "your question"

You will see:

Markdown final answer
Key summary points
A full JSON log of internal steps

Lightweight JSON Protocol

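DeepSeeker requires every LLM response to be a JSON object matching one of the schemas below. In the plan schema, "|" separates the allowed values of a field and is not itself part of the JSON: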

LLM0 Plan

{
  "action": "direct_answer" | "search_then_answer",
  "direct_answer": "...",
  "search": {
    "query": "...",
    "when": "week",
    "include": [],
    "exclude": [],
    "allow_domains": [],
    "deny_domains": [],
    "max_results": 10
  },
  "notes": "..."
}
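
A plan reply can be checked before the orchestrator acts on it. The following is a minimal sketch: parse_plan and ALLOWED_ACTIONS are hypothetical names, and the specific rules (a non-empty direct_answer or a non-empty search.query, depending on the action) are assumptions derived from the schema above.

```python
import json

# The two action values listed in the plan schema.
ALLOWED_ACTIONS = {"direct_answer", "search_then_answer"}


def parse_plan(raw: str) -> dict:
    """Parse an LLM0 plan reply and reject replies that do not fit the schema."""
    plan = json.loads(raw)                    # raises ValueError on invalid JSON
    if not isinstance(plan, dict):
        raise ValueError("plan must be a JSON object")

    action = plan.get("action")
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")

    if action == "direct_answer" and not plan.get("direct_answer"):
        raise ValueError("direct_answer action needs a non-empty direct_answer")

    if action == "search_then_answer":
        search = plan.get("search")
        if not isinstance(search, dict) or not search.get("query"):
            raise ValueError("search_then_answer action needs search.query")
    return plan
```

Rejecting a malformed plan early gives the orchestrator a clear point at which to retry the model or stop.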

LLM1 Summary

{
  "title": "...",
  "summary": "...",
  "key_points": ["..."],
  "relevance_score": 0.0,
  "notes": "..."
}
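
One way to consume these summaries is to load them into a typed record and order them before synthesis. This sketch assumes that a higher relevance_score means a more relevant page, which the schema does not state. PageSummary, parse_summary, and rank are hypothetical names.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field


@dataclass
class PageSummary:
    title: str
    summary: str
    key_points: list[str] = field(default_factory=list)
    relevance_score: float = 0.0
    notes: str = ""


def parse_summary(raw: str) -> PageSummary:
    """Parse one LLM1 reply; title and summary are required, the rest optional."""
    data = json.loads(raw)
    return PageSummary(
        title=str(data["title"]),
        summary=str(data["summary"]),
        key_points=[str(p) for p in data.get("key_points", [])],
        relevance_score=float(data.get("relevance_score", 0.0)),
        notes=str(data.get("notes", "")),
    )


def rank(summaries: list[PageSummary], min_score: float = 0.0) -> list[PageSummary]:
    """Drop summaries below min_score and sort the rest, highest score first."""
    kept = [s for s in summaries if s.relevance_score >= min_score]
    return sorted(kept, key=lambda s: s.relevance_score, reverse=True)
```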

LLM0 Final Synthesis

{
  "answer": "markdown report",
  "key_points": [],
  "used_results": [],
  "notes": "..."
}
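
The synthesis object maps directly onto the final output (the Markdown answer followed by the key points). A minimal rendering sketch, with render_report as a hypothetical helper and the "Key points:" label chosen only for illustration:

```python
import json


def render_report(raw: str) -> str:
    """Turn an LLM0 synthesis reply into printable text."""
    data = json.loads(raw)
    lines = [data["answer"].rstrip()]         # the Markdown report itself
    points = data.get("key_points", [])
    if points:
        # Append the key points as a plain bullet list after a blank line.
        lines += ["", "Key points:"] + [f"- {p}" for p in points]
    return "\n".join(lines)
```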

Roadmap

License

MIT License. See the LICENSE file in the repository root.
