
SciLEx


SciLEx (Science Literature Exploration) is a Python toolkit for systematic literature reviews. It crawls 10 academic APIs, deduplicates papers, applies quality filtering, analyzes citation networks, and pushes the results to Zotero.

Cite this work:

Full text:

Célian Ringwald, Benjamin Navet. SciLEx, Science Literature Exploration Toolkit ⟨swh:1:dir:944639eb0260a034a5cbf8766d5ee9b74ca85330⟩.

Bibtex:

@softwareversion{scilex2026,
  TITLE = {{SciLEx, Science Literature Exploration Toolkit}},
  AUTHOR = {Ringwald, Célian and Navet, Benjamin},
  URL = {https://github.com/Wimmics/SciLEx},
  NOTE = {},
  INSTITUTION = {{University C{\^o}te d'Azur ; CNRS ; Inria}},
  YEAR = {2026},
  MONTH = Feb,
  SWHID = {swh:1:dir:944639eb0260a034a5cbf8766d5ee9b74ca85330},
  VERSION = {1.0},
  REPOSITORY = {https://github.com/Wimmics/SciLEx},
  LICENSE = {MIT Licence},
  KEYWORDS = {Python, Scientific literature, literature research, paper retrieval},
  HAL_ID = {},
  HAL_VERSION = {},
}

Framework

[Figure: SciLEx architecture diagram]


Key Features

  • Multi-API collection with parallel processing (10 academic APIs)
  • Smart deduplication using DOI, URL, and fuzzy title matching
  • 5-phase quality filtering pipeline with time-aware citation thresholds
  • Citation network extraction via CrossRef + OpenCitations + Semantic Scholar
  • HuggingFace enrichment: ML models, datasets, GitHub stats, AI keywords
  • Export to Zotero (bulk upload) or BibTeX (with PDF links)
  • Idempotent collections for safe re-runs
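
The deduplication order listed above (DOI, then URL, then fuzzy title matching) could be sketched roughly as follows. This is a minimal illustration, not SciLEx's actual implementation: the record fields (`doi`, `url`, `title`), the helper names, and the 0.9 similarity threshold are assumptions, and the standard library's `difflib.SequenceMatcher` stands in for whatever fuzzy matcher the toolkit really uses.

```python
import re
from difflib import SequenceMatcher


def normalize_title(title: str) -> str:
    """Lowercase a title and collapse punctuation/whitespace for comparison."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def is_duplicate(a: dict, b: dict, threshold: float = 0.9) -> bool:
    """Return True if two paper records likely describe the same paper.

    Field names and the threshold are hypothetical, chosen for illustration.
    """
    # 1. When both records carry a DOI, trust it (case-insensitive compare).
    if a.get("doi") and b.get("doi"):
        return a["doi"].lower() == b["doi"].lower()
    # 2. Identical, non-empty URL.
    if a.get("url") and a.get("url") == b.get("url"):
        return True
    # 3. Fall back to fuzzy title similarity; skip records without a title.
    title_a = normalize_title(a.get("title", ""))
    title_b = normalize_title(b.get("title", ""))
    if not title_a or not title_b:
        return False
    return SequenceMatcher(None, title_a, title_b).ratio() >= threshold


def deduplicate(papers: list[dict]) -> list[dict]:
    """Keep the first occurrence of each paper (quadratic, fine for a sketch)."""
    unique: list[dict] = []
    for paper in papers:
        if not any(is_duplicate(paper, kept) for kept in unique):
            unique.append(paper)
    return unique
```

Checking the DOI first keeps the expensive fuzzy comparison for records that lack a reliable identifier. A real pipeline would likely index records by DOI and URL instead of comparing every pair.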

Installation

# With uv (recommended)
uv sync

# With pip
pip install -e .

# Dev dependencies (pytest, ruff, coverage)
pip install -e ".[dev]"

Quick Start

# 1. Configure APIs and search parameters
cp scilex/api.config.yml.example scilex/api.config.yml
cp scilex/scilex.config.yml.example scilex/scilex.config.yml
cp scilex/scilex.advanced.yml.example scilex/scilex.advanced.yml

# 2. Activate your environment (pip/venv users)
source .venv/bin/activate       # macOS/Linux
# .venv\Scripts\activate        # Windows
# uv users: no activation needed, use: uv run scilex-collect

# 3. Collect papers from APIs
scilex-collect

# 4. Deduplicate & filter
scilex-aggregate

# 5. (Optional) Enrich with HuggingFace metadata
scilex-enrich

# 6. Export to Zotero or BibTeX
scilex-push-zotero          # Push to Zotero
scilex-export-bibtex        # Or export to BibTeX

See the Quick Start Guide for a complete walkthrough.
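
For scripted runs, the Quick Start steps could be wrapped in a small Python helper. The sketch below is only an illustration: the function names and parameters are hypothetical, and it assumes that the `uv run` prefix shown for `scilex-collect` works the same way for the other commands. It copies the example configs without overwriting existing files, then builds the command sequence in the order given above.

```python
import shutil
from pathlib import Path

# Config file names taken from the Quick Start commands above.
CONFIG_FILES = ["api.config.yml", "scilex.config.yml", "scilex.advanced.yml"]


def copy_example_configs(config_dir: Path) -> list[Path]:
    """Copy each `<name>.example` to `<name>` unless the target already exists."""
    created = []
    for name in CONFIG_FILES:
        target = config_dir / name
        example = config_dir / f"{name}.example"
        if example.exists() and not target.exists():
            shutil.copy(example, target)
            created.append(target)
    return created


def pipeline_commands(use_uv: bool = False, enrich: bool = False,
                      export: str = "zotero") -> list[list[str]]:
    """Return the Quick Start commands, in order, as argument lists."""
    steps = ["scilex-collect", "scilex-aggregate"]
    if enrich:
        steps.append("scilex-enrich")  # optional HuggingFace enrichment
    steps.append("scilex-push-zotero" if export == "zotero"
                 else "scilex-export-bibtex")
    # Assumption: uv users can prefix every command with `uv run`.
    prefix = ["uv", "run"] if use_uv else []
    return [prefix + [step] for step in steps]
```

Each returned argument list can be passed to `subprocess.run(cmd, check=True)` so that a failing step stops the pipeline before the next one starts.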


Supported APIs

| API | Key Required | Best For |
|-----|--------------|----------|
| SemanticScholar | Optional | CS/AI papers, citation networks |
| OpenAlex | Optional | Broad coverage, ORCID data |
| IEEE | Yes | Engineering, CS conferences |
| Arxiv | No | Preprints, physics, CS |
| Springer | Yes | Journals, books |
| Elsevier | Yes | Medical, life sciences |
| PubMed | Optional | 35M biomedical papers |
| HAL | No | French research, theses |
| DBLP | No | CS bibliography, 95%+ DOI |
| Istex | No | French institutional access |

See the API Comparison for rate limits, coverage details, and limitations.
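
To see which sources are available before any keys are configured, the key requirements from the table can be expressed as a small lookup. This is a hedged sketch: the `KEY_REQUIREMENT` mapping and the `usable_apis` helper are hypothetical names for illustration and are not part of the SciLEx package.

```python
# Key requirements transcribed from the Supported APIs table.
KEY_REQUIREMENT = {
    "SemanticScholar": "optional",
    "OpenAlex": "optional",
    "IEEE": "required",
    "Arxiv": "none",
    "Springer": "required",
    "Elsevier": "required",
    "PubMed": "optional",
    "HAL": "none",
    "DBLP": "none",
    "Istex": "none",
}


def usable_apis(configured_keys: set[str]) -> list[str]:
    """List the APIs that can be queried given the APIs you hold keys for."""
    return [
        api
        for api, need in KEY_REQUIREMENT.items()
        if need != "required" or api in configured_keys
    ]
```

With no keys at all, `usable_apis(set())` returns the seven sources marked "No" or "Optional"; adding `"IEEE"` to the set brings IEEE into the list.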


Documentation

Full documentation is available at scilex.readthedocs.io.


Contributing

Requirements

  • Python >=3.10
  • pip or uv package manager
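
A quick way to confirm that an interpreter meets the Python requirement before installing is shown below. It is a minimal sketch: `MIN_PYTHON` and `python_is_supported` are hypothetical names, not part of SciLEx.

```python
import sys

MIN_PYTHON = (3, 10)  # minimum version from the Requirements list


def python_is_supported(version: tuple = sys.version_info[:2]) -> bool:
    """Return True if the given (major, minor) version meets the minimum."""
    return tuple(version) >= MIN_PYTHON


if __name__ == "__main__":
    if not python_is_supported():
        sys.exit("SciLEx requires Python >= 3.10")
```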
