Add comprehensive migration documentation section by dataroaring · Pull Request #3341 · apache/doris-website

dataroaring · 2026-02-05T23:40:25Z

Summary

Create consolidated migration documentation section with guides for migrating from PostgreSQL, MySQL, Elasticsearch, and other OLAP systems to Apache Doris
Add migration overview page with comparison table of migration paths and methods
Include Chinese translations for all migration guides

Test plan

Verify all internal documentation links resolve correctly
Check sidebar navigation displays Migration section after Getting Started
Review English documentation renders properly
Review Chinese documentation renders properly

🤖 Generated with Claude Code

Create consolidated migration guides covering:
- Overview page with migration path comparison table
- PostgreSQL to Doris (JDBC Catalog, Flink CDC, Export/Import)
- MySQL to Doris (Flink CDC, JDBC Catalog, DataX)
- Elasticsearch to Doris (ES Catalog, inverted index migration)
- Other OLAP systems (ClickHouse, Greenplum, Hive/Iceberg/Hudi)

Each guide includes data type mappings, step-by-step instructions, and troubleshooting for common issues. Chinese translations included.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Elasticsearch nested type should map to Doris VARIANT type for more flexible schema handling. Added links to VARIANT documentation.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Copilot

Pull request overview

This pull request adds comprehensive migration documentation to help users migrate data from various databases and data systems to Apache Doris. The documentation covers four major migration sources (PostgreSQL, MySQL, Elasticsearch, and other OLAP systems) with an overview page that provides a comparison table of migration paths and methods.

Changes:

Adds a new "Migration" section to the documentation sidebar, positioned between "Getting Started" and "Guides"
Creates 5 new English documentation files with detailed migration guides
Provides complete Chinese translations for all migration guides
Includes practical examples, data type mappings, and best practices for each migration source

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
sidebars.ts	Adds Migration category to sidebar with 5 migration guide entries
docs/migration/overview.md	Overview page with migration path comparison table and method selection guide
docs/migration/postgresql-to-doris.md	Complete PostgreSQL migration guide with JDBC Catalog, Flink CDC, and export-import options
docs/migration/mysql-to-doris.md	MySQL migration guide emphasizing Flink CDC with full database sync capabilities
docs/migration/elasticsearch-to-doris.md	Elasticsearch migration guide covering ES Catalog, Logstash, and custom scripts
docs/migration/other-olap-to-doris.md	Migration guides for ClickHouse, Greenplum, Hive, Iceberg, Hudi, and Spark/Flink connectors
i18n/zh-CN/.../migration/*.md	Chinese translations of all 5 migration guides


Copilot · 2026-02-05T23:46:12Z

docs/migration/elasticsearch-to-doris.md

+`★ Insight ─────────────────────────────────────`
+1. **DUPLICATE KEY model** is best for log data where append-only writes are common
+2. **Inverted indexes** enable full-text search similar to Elasticsearch
+3. **Dynamic partitioning** automatically manages time-based data lifecycle
+`─────────────────────────────────────────────────`
+


The "Insight" section present in the English version (lines 147-151) is missing from the Chinese translation. This creates an inconsistency between the English and Chinese documentation. Either add a corresponding section to the Chinese version or remove it from the English version for consistency. If keeping it, consider using standard Docusaurus admonition syntax for better rendering.

Suggested change

-`★ Insight ─────────────────────────────────────`
-1. **DUPLICATE KEY model** is best for log data where append-only writes are common
-2. **Inverted indexes** enable full-text search similar to Elasticsearch
-3. **Dynamic partitioning** automatically manages time-based data lifecycle
-`─────────────────────────────────────────────────`

Copilot · 2026-02-05T23:46:13Z

docs/migration/other-olap-to-doris.md

+
+| Source System | Recommended Method | Notes |
+|---------------|-------------------|-------|
+| ClickHouse | JDBC Catalog + SQL Convertor | Schema and SQL syntax conversion needed |


The term "Convertor" should be spelled "Converter" for correct English spelling. "Converter" is the standard spelling for a tool that converts something from one form to another.

Suggested change

-| ClickHouse | JDBC Catalog + SQL Convertor | Schema and SQL syntax conversion needed |
+| ClickHouse | JDBC Catalog + SQL Converter | Schema and SQL syntax conversion needed |

Copilot · 2026-02-05T23:46:13Z

docs/migration/elasticsearch-to-doris.md

+`★ Insight ─────────────────────────────────────`
+1. **DUPLICATE KEY model** is best for log data where append-only writes are common
+2. **Inverted indexes** enable full-text search similar to Elasticsearch
+3. **Dynamic partitioning** automatically manages time-based data lifecycle
+`─────────────────────────────────────────────────`


This "Insight" section uses an unusual formatting style with inline code blocks (backticks) rather than a proper markdown callout or info box. This formatting is inconsistent with standard Docusaurus documentation practices and may not render as intended. Consider using a standard markdown blockquote (>) or a Docusaurus admonition (:::info, :::tip, etc.) instead for better visual consistency and rendering.

Suggested change

-`★ Insight ─────────────────────────────────────`
-1. **DUPLICATE KEY model** is best for log data where append-only writes are common
-2. **Inverted indexes** enable full-text search similar to Elasticsearch
-3. **Dynamic partitioning** automatically manages time-based data lifecycle
-`─────────────────────────────────────────────────`
+:::info Insight
+1. **DUPLICATE KEY model** is best for log data where append-only writes are common
+2. **Inverted indexes** enable full-text search similar to Elasticsearch
+3. **Dynamic partitioning** automatically manages time-based data lifecycle
+:::
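
For reference, the three points in the Insight block could be combined in one table definition. This is a minimal sketch and is not taken from the PR: the table name `app_logs`, its columns, the bucket count, and the 30-day retention window are all hypothetical, and `replication_num = 1` assumes a single-node test cluster.

```sql
-- Hypothetical log table; names, sizes, and retention are illustrative assumptions.
CREATE TABLE IF NOT EXISTS app_logs (
    ts       DATETIME     NOT NULL COMMENT "event time",
    level    VARCHAR(16)  COMMENT "log level",
    message  STRING       COMMENT "raw log line",
    -- Inverted index so that `message` supports full-text MATCH queries
    INDEX idx_message (message) USING INVERTED PROPERTIES ("parser" = "english")
)
DUPLICATE KEY(ts)                       -- append-only model, rows are stored as written
PARTITION BY RANGE(ts) ()               -- partitions are created by dynamic partitioning
DISTRIBUTED BY HASH(ts) BUCKETS 8
PROPERTIES (
    "replication_num" = "1",
    "dynamic_partition.enable" = "true",
    "dynamic_partition.time_unit" = "DAY",
    "dynamic_partition.start" = "-30",  -- drop partitions older than 30 days
    "dynamic_partition.end" = "3",      -- pre-create partitions 3 days ahead
    "dynamic_partition.prefix" = "p",
    "dynamic_partition.buckets" = "8"
);

-- Full-text search on the indexed column
SELECT ts, level, message
FROM app_logs
WHERE message MATCH_ANY 'timeout error'
ORDER BY ts DESC
LIMIT 20;
```

Whether `ts` alone is a suitable sort key and distribution column depends on the workload; the sketch only shows where each of the three features appears in the DDL.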

Copilot · 2026-02-05T23:46:14Z

docs/migration/postgresql-to-doris.md

+```sql
+-- Use Flink Doris Connector's database sync feature
+CREATE DATABASE IF NOT EXISTS sync_db;
+
+-- FlinkCDC whole database sync configuration
+-- See Flink Doris Connector documentation for complete setup


The PostgreSQL full database sync section (Step 3) provides only a placeholder comment directing users to see the Flink Doris Connector documentation, while the MySQL migration guide provides a complete working example with the CdcTools command. For consistency and completeness, consider adding a similar concrete example here showing the postgres-sync-database command, similar to the mysql-sync-database example in mysql-to-doris.md (lines 132-149).

Suggested change

-```sql
--- Use Flink Doris Connector's database sync feature
-CREATE DATABASE IF NOT EXISTS sync_db;
--- FlinkCDC whole database sync configuration
--- See Flink Doris Connector documentation for complete setup
+```bash
+# Use CdcTools to perform a full PostgreSQL database sync into Doris
+# This is similar in spirit to the mysql-sync-database command
+cdc-tools postgres-sync-database \
+  --pg-host pg-host \
+  --pg-port 5432 \
+  --pg-username pg_user \
+  --pg-password pg_password \
+  --pg-database source_db \
+  --include-tables "public.orders,public.customers" \
+  --doris-fe-nodes doris-fe:8030 \
+  --doris-username doris_user \
+  --doris-password doris_password \
+  --doris-database target_db \
+  --sink-label-prefix pg_full_sync

Change JSON type mappings to VARIANT across all migration docs:
- PostgreSQL: json/jsonb → VARIANT
- MySQL: JSON → VARIANT
- Elasticsearch: object, flattened → VARIANT

VARIANT type provides more flexible schema support for semi-structured data migration. Added links to VARIANT documentation.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
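
To illustrate what the VARIANT mapping means on the Doris side, here is a minimal sketch, assuming a Doris version where VARIANT is available (2.1 or later). The table `events`, its columns, and the sample document are hypothetical and do not come from the migration guides.

```sql
-- Hypothetical target table for migrated JSON documents; names are illustrative.
CREATE TABLE IF NOT EXISTS events (
    id      BIGINT   NOT NULL,
    payload VARIANT  COMMENT "json/jsonb in PostgreSQL, JSON in MySQL, object in Elasticsearch"
)
DUPLICATE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 8
PROPERTIES ("replication_num" = "1");  -- assumes a single-node test cluster

-- A JSON string is parsed into the VARIANT column on write
INSERT INTO events VALUES
    (1, '{"user": {"id": 42, "name": "alice"}, "tags": ["a", "b"]}');

-- Subfields are addressed with bracket syntax and cast to a concrete type
SELECT id,
       CAST(payload['user']['id']   AS BIGINT) AS user_id,
       CAST(payload['user']['name'] AS STRING) AS user_name
FROM events;
```

No schema for `payload` is declared up front, which is the flexibility the commit message refers to.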

Copilot AI review requested due to automatic review settings February 5, 2026 23:40

dataroaring had a problem deploying to Production February 5, 2026 23:40 — with GitHub Actions Failure

Copilot started reviewing on behalf of dataroaring February 5, 2026 23:40

Update ES nested type mapping from JSON to VARIANT

41790a4


dataroaring temporarily deployed to Production February 5, 2026 23:45 — with GitHub Actions Inactive

Copilot AI reviewed Feb 5, 2026


dataroaring temporarily deployed to Production February 6, 2026 00:32 — with GitHub Actions Inactive

dataroaring wants to merge 3 commits into master from feature/migration-documentation-consolidation


