WIP: perf: add stream download/upload mem benchmark #947

Open
killagu wants to merge 4 commits into master from perf/stream_mem

Conversation

killagu (Contributor) commented Dec 24, 2025

Summary by CodeRabbit

  • New Features

    • End-to-end stream download/upload benchmark with a containerized nginx server, health endpoint, and Node.js workload scripts.
  • Documentation

    • Added a Chinese quickstart and an in-depth coredump/memory analysis guide.
  • Tools & Scripts

    • Added container startup/stop/status helpers, file generator, benchmark runners, GC observer, and artifact collection utilities.
  • Chores

    • Updated ignore rules to exclude large benchmark test and coredump artifacts.

✏️ Tip: You can customize this high-level summary in your review settings.

Copilot AI review requested due to automatic review settings December 24, 2025 16:38
coderabbitai bot (Contributor) commented Dec 24, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Adds a self-contained Nginx + Node.js streaming benchmark suite: Docker image and compose, nginx config and test assets, Node benchmark scripts and GC instrumentation, helper scripts for build/run/monitoring and coredump handling, file generator, and .gitignore updates to ignore generated artifacts.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Docker & Container Setup: benchmark/stream_download/Dockerfile, benchmark/stream_download/docker-compose.nginx.yml, benchmark/stream_download/package.json | New Dockerfile (node base) installing nginx/utilities, healthcheck, workspace setup and CMD; compose file for nginx service; package.json sets CommonJS type. |
| Nginx Configuration & Test Assets: benchmark/stream_download/nginx.conf, benchmark/stream_download/nginx/test-file.txt, benchmark/stream_download/nginx/50mb_ones.txt | New nginx config exposing /download/, /upload/, /health with CORS, ranges, streaming settings; small test-file added; a 50MB payload file is referenced (added/ignored in .gitignore). |
| Benchmark Scripts: benchmark/stream_download/benchmark.js, benchmark/stream_download/benchmark_undici.js, benchmark/stream_download/gc.js | Two continuous Node.js benchmark scripts (urllib and undici) that stream-download a 50MB file and upload it; gc.js instruments GC and reports stats (a minimal GC-observer sketch follows this table). |
| Utilities & Orchestration Scripts: benchmark/stream_download/start-docker.sh, stop-docker.sh, status-docker.sh, start-nginx.sh, copy-coredump.sh, run-benchmark-with-coredump.sh | Scripts to build/run/stop the Docker container, start nginx inside the container with signal handling, check status, collect coredumps/heap snapshots, and run benchmarks that produce snapshots/core dumps. |
| File Generation & Formats: benchmark/stream_download/generate_50mb_file.sh, curl-format.txt, benchmark/stream_download/.gitignore, .gitignore | Script to generate nginx/50mb_ones.txt; curl write-out template; local .gitignore to ignore core/heapsnapshot/tmp; top-level .gitignore updated with a # Benchmark test files comment and a benchmark/stream_download/nginx/50mb_ones.txt entry. |
| Documentation & Analysis Guides: benchmark/stream_download/README.md, benchmark/stream_download/COREDUMP_ANALYSIS.md | Chinese README with quickstart and memory analysis steps; coredump analysis guide with commands and workflows. |
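For readers skimming the table above, here is a minimal sketch of what GC instrumentation of this kind typically looks like (illustrative only, using Node's built-in perf_hooks; this is not the actual gc.js from this PR):

```js
// Sketch: observe GC events via perf_hooks and periodically report counts and total pause time.
const { PerformanceObserver } = require('perf_hooks');

const stats = { count: 0, totalMs: 0 };
const obs = new PerformanceObserver(list => {
  for (const entry of list.getEntries()) {
    stats.count += 1;
    stats.totalMs += entry.duration;
  }
});
obs.observe({ entryTypes: ['gc'] });

setInterval(() => {
  console.log(`gc: ${stats.count} events, ${stats.totalMs.toFixed(1)}ms total pause`);
}, 5000);
```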

Sequence Diagram(s)

```mermaid
sequenceDiagram
  autonumber
  actor Bench as Benchmark Script
  participant FS as Filesystem
  participant Nginx as Nginx Server
  participant Docker as Container Runtime

  Note over Docker,Nginx: Containerized benchmark environment
  Bench->>Nginx: GET /download/50mb_ones.txt
  Nginx-->>Bench: stream file (200, ranges/CORS)
  Bench->>FS: stream write to temp file (UUID)
  Bench->>Nginx: POST /upload/ (stream from temp file)
  Nginx-->>Bench: 201 Created
  Bench->>FS: delete temp file
  alt SIGUSR2 toggles pause
    Bench-->>Bench: pause loop (sleep, optional GC)
  end
  Note right of Docker: healthcheck endpoint polled by Docker
```

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

enhancement

Suggested reviewers

  • elrrrrrrr

Poem

🐰 I hopped through configs, pipes, and ports,
I fetched big ones and sent them forth,
Temp files vanish without retorts,
Containers hum in tidy cohorts,
I nibble logs and dance — a benchmark sport 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 60.00%, which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title 'WIP: perf: add stream download/upload mem benchmark' directly describes the main change: adding memory benchmarking tools for stream download/upload scenarios. It is concise, specific, and accurately reflects the core objective of the PR. |
✨ Finishing touches
  • 📝 Generate docstrings
  • 🧪 Generate unit tests (beta)
    • Create PR with unit tests
    • Post copyable unit tests in a comment
    • Commit unit tests in branch perf/stream_mem

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

Comment @coderabbitai help to get the list of available commands and usage tips.

gemini-code-assist bot (Contributor)

Summary of Changes

Hello @killagu, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request establishes a comprehensive Docker-based infrastructure specifically designed for performance benchmarking of stream download and upload operations. It leverages Nginx as a high-performance web server, configured to handle streaming data efficiently, and integrates a Node.js client to simulate continuous data transfer. The setup provides all necessary tools for environment management, test data generation, and detailed performance measurement, enabling developers to assess memory and network performance under streaming conditions.

Highlights

  • Performance Benchmarking Environment: Introduced a new Docker-based environment for conducting stream download and upload memory benchmarks, utilizing Nginx as the server.
  • Continuous Stream Testing: Added a Node.js script that continuously downloads a large file and immediately uploads it, simulating a real-world streaming workload.
  • Docker Management and Test Utilities: Included a suite of shell scripts for building, running, stopping, and checking the status of the Docker benchmark container, along with a script to generate large test files and a custom curl format for detailed metrics.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

killagu (Contributor, Author) commented Dec 24, 2025

[image]

You can see that the JS heap and RSS actually differ by a large amount.
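For reference, one way to watch that gap while the loop runs is to sample process.memoryUsage() periodically; a minimal sketch (not part of this PR):

```js
// Sketch: log RSS vs V8 heap every 5 seconds to make the gap visible over time.
const mb = n => (n / 1024 / 1024).toFixed(1);
setInterval(() => {
  const { rss, heapUsed, heapTotal, external, arrayBuffers } = process.memoryUsage();
  console.log(
    `rss=${mb(rss)}MB heapUsed=${mb(heapUsed)}MB heapTotal=${mb(heapTotal)}MB ` +
    `external=${mb(external)}MB arrayBuffers=${mb(arrayBuffers)}MB`
  );
}, 5000);
```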

gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request introduces a new Nginx-based benchmark for stream download and upload performance, including Dockerfiles, Nginx configuration, shell scripts for managing the Docker environment, and a Node.js script (benchmark.js) to perform the actual download/upload tests. Review comments suggest several improvements: updating the Dockerfile to use a valid and alpine-based Node.js version, optimizing image size by removing vim and using --no-install-recommends, and improving dependency management by adding urllib to package.json. For the benchmark.js script, it's recommended to pipe the download stream directly to the upload stream to avoid intermediate disk I/O and to add error handling to the while(true) loop for robustness. Additionally, the README.md has a typo in a docker run command, the generate_50mb_file.sh script's dd command can be made more readable, and the start-docker.sh script should use a more reliable healthcheck polling mechanism instead of a fixed sleep duration.

@@ -0,0 +1,41 @@
FROM node:24.12.0

high

The Node.js version 24.12.0 specified does not appear to be a valid or current version. The latest Node.js version is 22.x, and the current LTS is 20.x. Using a non-existent version will cause the build to fail. Please use a current stable or LTS version. Using an -alpine image is also recommended for smaller image sizes.

FROM node:20.14.0-alpine

Comment on lines 6 to 16
```js
async function downloadAndUpload() {
  const tmpFilePath = path.join(__dirname, `${crypto.randomUUID()}.txt`);
  await urllib.request('http://127.0.0.1/download/50mb_ones.txt', {
    writeStream: fs.createWriteStream(tmpFilePath),
  });
  await urllib.request('http://127.0.0.1/upload/', {
    method: 'POST',
    stream: fs.createReadStream(tmpFilePath),
  });
  await fs.promises.rm(tmpFilePath);
}
```

high

This benchmark currently involves disk I/O by writing the downloaded file to a temporary path and then reading it back for upload. This can skew the results, as you're measuring disk speed as well as network performance. For a more accurate stream benchmark, you should pipe the download response stream directly into the upload request stream. This avoids the intermediate file and measures the pure stream handling performance.

Suggested change
-async function downloadAndUpload() {
-  const tmpFilePath = path.join(__dirname, `${crypto.randomUUID()}.txt`);
-  await urllib.request('http://127.0.0.1/download/50mb_ones.txt', {
-    writeStream: fs.createWriteStream(tmpFilePath),
-  });
-  await urllib.request('http://127.0.0.1/upload/', {
-    method: 'POST',
-    stream: fs.createReadStream(tmpFilePath),
-  });
-  await fs.promises.rm(tmpFilePath);
-}
+async function downloadAndUpload() {
+  const { res } = await urllib.request('http://127.0.0.1/download/50mb_ones.txt', {
+    streaming: true,
+    timeout: 60000, // 60 seconds
+  });
+  await urllib.request('http://127.0.0.1/upload/', {
+    method: 'POST',
+    stream: res,
+    timeout: 60000, // 60 seconds
+  });
+}

Comment on lines 18 to 22
```js
(async () => {
  while (true) {
    await downloadAndUpload();
  }
})();
```

high

The while(true) loop will exit on the first error that occurs within downloadAndUpload(), stopping the benchmark. To make the benchmark more robust, you should wrap the call in a try...catch block to handle potential errors and allow the loop to continue.

Suggested change
-(async () => {
-  while (true) {
-    await downloadAndUpload();
-  }
-})();
+(async () => {
+  while (true) {
+    try {
+      await downloadAndUpload();
+    } catch (err) {
+      console.error('An error occurred during download/upload:', err);
+      // Optional: wait a bit before retrying
+      await new Promise(resolve => setTimeout(resolve, 1000));
+    }
+  }
+})();

Comment on lines 4 to 9
```dockerfile
RUN apt-get update && apt-get install -y \
    nginx \
    curl \
    vim \
    && rm -rf /var/lib/apt/lists/* \
    && apt-get clean
```

medium

To optimize the Docker image size, it's recommended to:

  1. Use --no-install-recommends with apt-get install to avoid installing unnecessary packages.
  2. Remove vim as it's a large dependency and generally not needed in a production or benchmark image. If you need to debug, you can docker exec into a running container and install it manually.
RUN apt-get update && apt-get install -y --no-install-recommends \
    nginx \
    curl \
    && rm -rf /var/lib/apt/lists/* \
    && apt-get clean


COPY benchmark.js /root/workspace/benchmark.js

RUN cd /root/workspace && npm i urllib --registry https://registry.npmmirror.com

medium

Installing npm packages directly with npm i inside the Dockerfile is not ideal for dependency management. It's better practice to add urllib to a dependencies section in your package.json, copy package.json (and package-lock.json) into the image, and then run npm install or npm ci. This makes your dependencies explicit and leverages Docker's layer caching more effectively. I've added a separate comment on package.json with a suggestion. With that change, this line should be updated to use npm install.

RUN cd /root/workspace && npm install --registry https://registry.npmmirror.com

### 运行容器

```bash
docker run -rm -d --platform linux/amd64 \
```

medium

There's a typo in the docker run command. The -rm flag should be --rm.

Suggested change
-docker run -rm -d --platform linux/amd64 \
+docker run --rm -d --platform linux/amd64 \

echo "正在生成 $TARGET_SIZE_MB MB 的文件,内容都是1..."

# 使用dd命令生成文件,每块1KB,共50*1024块
dd if=/dev/zero bs=1024 count=$((TARGET_SIZE_MB * 1024)) | tr '\0' '1' > "$OUTPUT_FILE"

medium

The dd command is functionally correct, but can be made more readable by using 1M for block size and referencing the TARGET_SIZE_MB variable directly.

Suggested change
-dd if=/dev/zero bs=1024 count=$((TARGET_SIZE_MB * 1024)) | tr '\0' '1' > "$OUTPUT_FILE"
+dd if=/dev/zero bs=1M count=${TARGET_SIZE_MB} | tr '\0' '1' > "$OUTPUT_FILE"

Comment on lines +1 to +3
```json
{
  "type": "commonjs"
}
```

medium

This package.json file is missing the dependencies section. To make dependency management explicit and reliable, you should declare urllib here. This also works in conjunction with improvements to the Dockerfile to leverage layer caching.

Suggested change
 {
-  "type": "commonjs"
+  "type": "commonjs",
+  "dependencies": {
+    "urllib": "^3.22.0"
+  }
 }


# 等待容器启动
echo "等待容器启动..."
sleep 3

medium

Using a fixed sleep 3 to wait for the container to start is unreliable. The container might take more or less time depending on the host machine's performance. A more robust approach is to poll the healthcheck endpoint in a loop until it becomes available or a timeout is reached.

Suggested change
-sleep 3
+# Wait for container to be healthy
+echo "Waiting for container to be healthy..."
+for i in {1..20}; do
+  if curl -s -f http://localhost:"$HOST_PORT"/health > /dev/null; then
+    echo "Container is healthy."
+    break
+  fi
+  if [ $i -eq 20 ]; then
+    echo -e "${RED}Container did not become healthy in time.${NC}"
+    docker logs "$CONTAINER_NAME"
+    exit 1
+  fi
+  echo "Still waiting for container... ($i/20)"
+  sleep 1
+done


codecov bot commented Dec 24, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.26%. Comparing base (e6f9208) to head (b719d94).
⚠️ Report is 1 commit behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #947      +/-   ##
==========================================
- Coverage   95.39%   95.26%   -0.13%     
==========================================
  Files         197      197              
  Lines       22262    22262              
  Branches     2472     2456      -16     
==========================================
- Hits        21237    21209      -28     
- Misses       1025     1053      +28     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 10

🧹 Nitpick comments (8)
benchmark/stream_download/docker-compose.nginx.yml (1)

1-1: Consider updating to Docker Compose Specification format.

The version: '3.8' field is from the older versioned format. The Docker Compose Specification (no version field) is now the recommended approach and provides better forward compatibility.

🔎 Optional modernization
-version: '3.8'
-
 services:
   nginx-benchmark:

Reference: Docker Compose Specification

benchmark/stream_download/start-nginx.sh (1)

1-55: LGTM with minor suggestions.

The script properly handles nginx startup, validation, signal handling, and graceful shutdown. The structure and error handling are solid.

Optional improvements
  1. Replace arbitrary sleep with retry loop (lines 27-28):
# Wait for nginx to start with timeout
for i in {1..10}; do
  if kill -0 $NGINX_PID 2>/dev/null; then
    break
  fi
  sleep 0.5
done
  1. Add healthcheck verification (after line 40):
# Verify health endpoint
if command -v curl &> /dev/null; then
  sleep 1
  if curl -f http://localhost/health &> /dev/null; then
    echo "健康检查通过"
  else
    echo "警告: 健康检查失败"
  fi
fi
  1. Consider uncommenting and adjusting permissions (lines 17-19) if file ownership issues arise in production.
benchmark/stream_download/start-docker.sh (1)

50-52: Verify Dockerfile exists before building.

The script attempts to build a Docker image without checking if the Dockerfile exists, which could lead to confusing error messages.

🔎 Proposed fix
 # 构建 Docker 镜像
+if [ ! -f "Dockerfile" ]; then
+    echo -e "${RED}错误: Dockerfile 不存在${NC}"
+    exit 1
+fi
 echo "构建 Docker 镜像..."
 docker build -t "$IMAGE_NAME" .
benchmark/stream_download/Dockerfile (3)

38-38: Consider using standard npm registry and package.json.

Installing urllib directly via npm i without a package.json bypasses dependency locking and version control. Additionally, the npmmirror.com registry is China-specific and may cause issues for international users or CI/CD systems.

🔎 Recommended approach

Create a package.json in the benchmark directory:

{
  "name": "stream-benchmark",
  "version": "1.0.0",
  "dependencies": {
    "urllib": "^3.0.0"
  }
}

Then update the Dockerfile:

-RUN cd /root/workspace && npm i urllib --registry https://registry.npmmirror.com
+COPY package.json package-lock.json /root/workspace/
+RUN cd /root/workspace && npm ci

24-25: Clarify or remove unused debug port 9229.

Port 9229 is the Node.js debug port, but there's no indication that the benchmark requires debugging capabilities. If debugging isn't needed, consider removing this to reduce the attack surface.


27-28: Inconsistent WORKDIR usage.

The WORKDIR is set to /var/www/html (line 28) but subsequent operations create and work in /root/workspace (lines 34-38). This inconsistency can confuse developers about where files are located.

🔎 Proposed fix
-# 设置工作目录
-WORKDIR /var/www/html
-
 # 健康检查
 HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
     CMD curl -f http://localhost/health || exit 1
 
 RUN mkdir -p /root/workspace
-
 COPY benchmark.js /root/workspace/benchmark.js
-
 RUN cd /root/workspace && npm i urllib --registry https://registry.npmmirror.com
+
+WORKDIR /root/workspace

Also applies to: 34-38

benchmark/stream_download/status-docker.sh (1)

30-34: Verify curl availability before health check.

The script uses curl for the health check without verifying it's installed on the host. While curl is common, it's not guaranteed to be available on all systems.

🔎 Proposed fix
     echo ""
     echo "访问测试:"
-    if curl -s http://localhost:8080/health > /dev/null; then
+    if command -v curl >/dev/null 2>&1 && curl -s http://localhost:8080/health > /dev/null 2>&1; then
         echo -e "${GREEN}✓ HTTP 服务正常${NC}"
     else
-        echo -e "${RED}✗ HTTP 服务异常${NC}"
+        echo -e "${RED}✗ HTTP 服务异常 (或 curl 未安装)${NC}"
     fi
benchmark/stream_download/nginx.conf (1)

5-6: Potential root/alias directive conflict.

The root directive at line 6 and alias directive at line 17 for /download/ can interact unexpectedly. While this configuration may work, the alias overrides the root for that location, making the server-level root only apply to other locations. Consider removing the server-level root or documenting this interaction.

🔎 Recommended clarification
 server {
     listen 80;
     server_name localhost;
 
-    # 设置根目录为 nginx 文件夹
-    root /var/www/html/;
-    
     # 禁用缓存以支持流式下载测试
     sendfile off;
     tcp_nopush off;

Also applies to: 14-17

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f85ebce and 2639007.

📒 Files selected for processing (15)
  • .gitignore
  • benchmark/stream_download/Dockerfile
  • benchmark/stream_download/README.md
  • benchmark/stream_download/benchmark.js
  • benchmark/stream_download/cb53f70f-6c5c-4b5a-b179-e8294afbf04c.txt
  • benchmark/stream_download/curl-format.txt
  • benchmark/stream_download/docker-compose.nginx.yml
  • benchmark/stream_download/generate_50mb_file.sh
  • benchmark/stream_download/nginx.conf
  • benchmark/stream_download/nginx/test-file.txt
  • benchmark/stream_download/package.json
  • benchmark/stream_download/start-docker.sh
  • benchmark/stream_download/start-nginx.sh
  • benchmark/stream_download/status-docker.sh
  • benchmark/stream_download/stop-docker.sh
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx,js,jsx}: Use single quotes for string literals
Use ES5 trailing commas (comma in last element of objects/arrays)
Use 2-space indentation
Limit lines to 120 character width
Avoid parentheses around single arrow function parameters
Limit function parameters to a maximum of 6
Warn on console.log usage in source code
Do not use anonymous default exports
Use ES modules syntax (import/export) throughout

Files:

  • benchmark/stream_download/benchmark.js
🧠 Learnings (2)
📚 Learning: 2025-12-22T13:51:37.326Z
Learnt from: CR
Repo: cnpm/cnpmcore PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-22T13:51:37.326Z
Learning: Applies to test/**/*.test.ts : Tests must use eggjs/mock for mocking, assert from node:assert/strict, and test both success and error cases

Applied to files:

  • .gitignore
📚 Learning: 2025-12-22T13:51:37.326Z
Learnt from: CR
Repo: cnpm/cnpmcore PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-22T13:51:37.326Z
Learning: Applies to **/*.ts : Use ES modules (import/export) throughout the codebase - no CommonJS require()

Applied to files:

  • benchmark/stream_download/package.json
🧬 Code graph analysis (1)
benchmark/stream_download/generate_50mb_file.sh (1)
test/TestUtil.ts (1)
  • rm (178-184)
🪛 Shellcheck (0.11.0)
benchmark/stream_download/stop-docker.sh

[warning] 11-11: RED appears unused. Verify use (or export if used externally).

(SC2034)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
  • GitHub Check: test on mysql (node@22, shard@0/3, enableJSONBuilder@true)
  • GitHub Check: test on mysql (node@24, shard@1/3, enableJSONBuilder@false)
  • GitHub Check: test on mysql (node@24, shard@1/3, enableJSONBuilder@true)
  • GitHub Check: test on mysql (node@22, shard@2/3, enableJSONBuilder@true)
  • GitHub Check: test on mysql (node@24, shard@2/3, enableJSONBuilder@false)
  • GitHub Check: test on mysql (node@22, shard@1/3, enableJSONBuilder@false)
  • GitHub Check: test on mysql (node@20, shard@1/3, enableJSONBuilder@false)
  • GitHub Check: test on postgresql (node@20, shard@2/3)
  • GitHub Check: test on postgresql (node@22, shard@2/3)
  • GitHub Check: test on postgresql (node@24, shard@0/3)
  • GitHub Check: test on postgresql (node@22, shard@1/3)
  • GitHub Check: test on postgresql (node@20, shard@0/3)
  • GitHub Check: test on postgresql (node@22, shard@0/3)
  • GitHub Check: test on postgresql (node@24, shard@2/3)
  • GitHub Check: test on postgresql (node@20, shard@1/3)
  • GitHub Check: test on postgresql (node@24, shard@1/3)
  • GitHub Check: build-and-push-image
🔇 Additional comments (13)
.gitignore (1)

126-128: LGTM!

The ignore rule appropriately excludes the generated 50MB test file from version control, with a clear section comment.

benchmark/stream_download/curl-format.txt (1)

1-16: LGTM!

The curl format template correctly defines timing and data metric placeholders for standardized benchmark output.

benchmark/stream_download/nginx/test-file.txt (1)

1-16: LGTM!

The test file appropriately documents its purpose and provides clear test commands for download/upload verification.

benchmark/stream_download/start-docker.sh (1)

54-61: Good use of read-only volume mount for security.

The --volume flag with :ro (read-only) prevents the container from modifying the host files, which is appropriate for a benchmark that only needs to serve static files.

benchmark/stream_download/generate_50mb_file.sh (2)

22-28: Excellent cross-platform compatibility.

The script correctly handles both BSD (stat -f%z) and GNU (stat -c%s) variants of the stat command, ensuring portability across macOS and Linux. The size verification logic is also robust.


19-19: Note: Generated file is highly compressible.

The file contains only '1' characters, making it highly compressible. This may not accurately represent real-world scenarios where data entropy affects network and I/O performance. Consider whether this is suitable for your benchmark goals.
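If a less compressible payload is ever needed, a hypothetical alternative (not part of this PR; file name and size are illustrative) is to fill the file with random bytes instead, for example with a small Node.js script:

```js
// Sketch: generate ~50MB of random (incompressible) data, 1MB at a time.
const fs = require('fs');
const crypto = require('crypto');

const outPath = 'nginx/50mb_random.bin'; // hypothetical file name
fs.writeFileSync(outPath, Buffer.alloc(0)); // truncate/create the output file
for (let i = 0; i < 50; i++) {
  fs.appendFileSync(outPath, crypto.randomBytes(1024 * 1024)); // append 1MB of random bytes
}
```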

benchmark/stream_download/stop-docker.sh (2)

9-12: ShellCheck false positive - RED is used.

The static analysis warning about RED being unused is incorrect. The variable is defined here and used on line 11 in the echo statement. This is a false positive from ShellCheck.


16-30: LGTM: Proper container lifecycle management.

The script correctly checks container state before attempting to stop or remove it, preventing unnecessary errors. The use of grep -q with conditional logic ensures graceful handling of both running and stopped containers.

benchmark/stream_download/status-docker.sh (2)

55-62: Port check may require elevated privileges.

The netstat -tulnp and lsof -i :8080 commands may require sudo on some systems to display process information. The script handles command failure gracefully with 2>/dev/null, but the check might return false negatives on systems with restricted permissions.

This is acceptable for a development/benchmark script, but be aware that the port check might not work reliably without proper permissions.


17-43: Well-structured status reporting.

The script provides comprehensive status information with clear visual feedback using color codes, handles multiple container states (running, stopped, non-existent), and includes useful diagnostics like resource usage and HTTP health checks.

benchmark/stream_download/nginx.conf (3)

29-32: CORS wildcard acceptable for benchmarks.

The wildcard CORS configuration (Access-Control-Allow-Origin *) is appropriate for a local benchmark environment but would be insecure in production. Since this is explicitly a benchmark setup, this is fine.


34-36: Beware of 'if' directive in location context.

Using if in location context can cause unexpected behavior in nginx (see "If is Evil" in nginx documentation). While this specific usage for OPTIONS requests is generally safe, be aware of this limitation for future modifications.

For reference: https://www.nginx.com/resources/wiki/start/topics/depth/ifisevil/


8-12: Good streaming configuration for benchmarks.

The settings (sendfile off, tcp_nodelay on) are appropriate for benchmark testing where you want to measure actual streaming behavior without kernel optimizations that might mask performance characteristics.

Comment on lines 1 to 4
const urllib = require('urllib');
const fs = require('fs');
const crypto = require('crypto');
const path = require('path');

⚠️ Potential issue | 🔴 Critical

Convert to ES module imports.

The code uses CommonJS require() statements, which violates the project's coding guidelines mandating ES modules throughout the codebase.

🔎 Proposed fix
-const urllib = require('urllib');
-const fs = require('fs');
-const crypto = require('crypto');
-const path = require('path');
+import urllib from 'urllib';
+import fs from 'fs';
+import crypto from 'crypto';
+import path from 'path';

Note: This requires changing package.json to "type": "module" (see related comment on package.json).

Based on learnings: Use ES modules (import/export) throughout the codebase - no CommonJS require()

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-const urllib = require('urllib');
-const fs = require('fs');
-const crypto = require('crypto');
-const path = require('path');
+import urllib from 'urllib';
+import fs from 'fs';
+import crypto from 'crypto';
+import path from 'path';
🤖 Prompt for AI Agents
In benchmark/stream_download/benchmark.js around lines 1 to 4, the file uses
CommonJS require() calls; convert them to ES module imports (e.g., replace each
require with an import statement for urllib, fs, crypto, and path) and update
any downstream code that depended on CommonJS semantics if necessary; also
ensure package.json is set to "type": "module" so the ES imports are valid.

@@ -0,0 +1,22 @@
const urllib = require('urllib');

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for package.json files and check for urllib dependency
find . -name 'package.json' -type f | head -20

Repository: cnpm/cnpmcore

Length of output: 420


🏁 Script executed:

#!/bin/bash
# Check the main package.json for urllib
cat ./package.json

Repository: cnpm/cnpmcore

Length of output: 5286


🏁 Script executed:

#!/bin/bash
# Also search for urllib references in package.json files
rg -l 'package\.json' -g '**' | head -5
rg 'urllib' package.json -A 2 -B 2

Repository: cnpm/cnpmcore

Length of output: 301


🏁 Script executed:

#!/bin/bash
# Check the benchmark package.json
cat ./benchmark/stream_download/package.json

Repository: cnpm/cnpmcore

Length of output: 81


🏁 Script executed:

#!/bin/bash
# Also verify the require statement in the benchmark file
cat ./benchmark/stream_download/benchmark.js | head -20

Repository: cnpm/cnpmcore

Length of output: 646


🏁 Script executed:

#!/bin/bash
# Check if urllib is used anywhere else in the project and what versions might be expected
rg 'urllib' --type ts --type js -i | grep -v node_modules | head -20

Repository: cnpm/cnpmcore

Length of output: 328


🏁 Script executed:

#!/bin/bash
# Check if there's any documentation or comments about urllib in the project
rg 'urllib' -B 2 -A 2 --type ts --type js --type json | grep -v node_modules | head -40

Repository: cnpm/cnpmcore

Length of output: 938


🏁 Script executed:

#!/bin/bash
# Check if urllib is listed as a transitive dependency in the main package.json
# by looking for packages that might depend on urllib
rg '"urllib"' node_modules 2>/dev/null | head -5 || echo "node_modules not available"

Repository: cnpm/cnpmcore

Length of output: 39


🏁 Script executed:

#!/bin/bash
# Check if there are any npm scripts or docs about running benchmarks
cat ./package.json | grep -A 20 '"scripts"'

Repository: cnpm/cnpmcore

Length of output: 970


Add urllib to benchmark/stream_download/package.json dependencies.

The urllib package is required by benchmark.js but not declared in the benchmark's package.json. This will cause the benchmark to fail with a module not found error at runtime.

Update benchmark/stream_download/package.json to include urllib as a dependency:

{
  "type": "commonjs",
  "dependencies": {
    "urllib": "^3.0.0"
  }
}
🤖 Prompt for AI Agents
In benchmark/stream_download/benchmark.js around line 1, the file requires
"urllib" but benchmark/stream_download/package.json is missing that dependency;
update benchmark/stream_download/package.json to add "urllib" under dependencies
(e.g., "urllib": "^3.0.0") and ensure "type" remains "commonjs", then run npm
install (or update lockfile) so the module is available at runtime.

Comment on lines 6 to 16
```js
async function downloadAndUpload() {
  const tmpFilePath = path.join(__dirname, `${crypto.randomUUID()}.txt`);
  await urllib.request('http://127.0.0.1/download/50mb_ones.txt', {
    writeStream: fs.createWriteStream(tmpFilePath),
  });
  await urllib.request('http://127.0.0.1/upload/', {
    method: 'POST',
    stream: fs.createReadStream(tmpFilePath),
  });
  await fs.promises.rm(tmpFilePath);
}
```

⚠️ Potential issue | 🔴 Critical

Add error handling and resource cleanup.

The downloadAndUpload function lacks error handling, which could lead to:

  • Unhandled promise rejections causing the process to crash
  • Temp files left on disk if upload or deletion fails
  • No visibility into failures during benchmarking
🔎 Proposed improvements
 async function downloadAndUpload() {
   const tmpFilePath = path.join(__dirname, `${crypto.randomUUID()}.txt`);
-  await urllib.request('http://127.0.0.1/download/50mb_ones.txt', {
-    writeStream: fs.createWriteStream(tmpFilePath),
-  });
-  await urllib.request('http://127.0.0.1/upload/', {
-    method: 'POST',
-    stream: fs.createReadStream(tmpFilePath),
-  });
-  await fs.promises.rm(tmpFilePath);
+  try {
+    await urllib.request('http://127.0.0.1/download/50mb_ones.txt', {
+      writeStream: fs.createWriteStream(tmpFilePath),
+    });
+    await urllib.request('http://127.0.0.1/upload/', {
+      method: 'POST',
+      stream: fs.createReadStream(tmpFilePath),
+    });
+  } finally {
+    // Ensure temp file is cleaned up even on error
+    try {
+      await fs.promises.rm(tmpFilePath);
+    } catch (err) {
+      // Ignore if file doesn't exist
+      if (err.code !== 'ENOENT') throw err;
+    }
+  }
 }
🤖 Prompt for AI Agents
In benchmark/stream_download/benchmark.js around lines 6 to 16,
downloadAndUpload currently has no error handling or guaranteed cleanup; wrap
the download/upload sequence in a try/catch/finally: create the tmpFilePath and
streams, perform the download and upload inside try, log or rethrow any caught
errors in catch, and in finally ensure the read/write streams are
closed/destroyed and the temp file is removed if it exists (use
fs.promises.unlink or rm and guard with exists check), so failures during upload
or deletion won’t leak files or unhandled promise rejections.

Comment on lines 18 to 22
```js
(async () => {
  while (true) {
    await downloadAndUpload();
  }
})();
```

⚠️ Potential issue | 🟠 Major

Add logging, metrics, and graceful shutdown.

For a benchmark tool, this implementation has several issues:

  • No output to measure performance or track progress
  • No metrics collection (timing, throughput, memory usage)
  • Infinite loop with no exit condition or signal handling
  • Will leave orphaned temp files on forced termination
🔎 Suggested enhancements
+let iteration = 0;
+const startTime = Date.now();
+
+process.on('SIGINT', () => {
+  const duration = (Date.now() - startTime) / 1000;
+  console.log(`\nBenchmark stopped after ${iteration} iterations in ${duration.toFixed(2)}s`);
+  process.exit(0);
+});
+
 (async () => {
   while (true) {
-    await downloadAndUpload();
+    const iterStart = Date.now();
+    try {
+      await downloadAndUpload();
+      const duration = Date.now() - iterStart;
+      iteration++;
+      console.log(`Iteration ${iteration}: ${duration}ms`);
+    } catch (err) {
+      console.error(`Error in iteration ${iteration + 1}:`, err.message);
+      // Continue or exit based on error severity
+    }
   }
 })();

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In benchmark/stream_download/benchmark.js around lines 18 to 22, the loop
currently runs forever with no logs, metrics, signal handling, or temp-file
cleanup; update it to run a controlled benchmark: add configurable iteration
count or duration and emit periodic logs and metrics (timings, per-iteration
throughput, cumulative bytes, memory usage) after each downloadAndUpload run;
measure start/end time and bytes transferred to compute throughput and record to
a simple in-process metrics object or exportable JSON/CSV; implement graceful
shutdown by listening for SIGINT/SIGTERM to stop submitting new work, wait for
the current iteration to finish, clean up any temp files created by
downloadAndUpload, flush/serialize metrics and logs, and then exit; ensure
downloadAndUpload returns metadata (bytes, duration, temp paths) so the runner
can aggregate and delete temp files and report results.

Comment on lines +9 to +11
```yaml
    volumes:
      - ./nginx.conf:/etc/nginx/conf.d/default.conf:ro
      - ./nginx:/usr/share/nginx/html:ro
```

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check the root directory configured in nginx.conf
rg -n 'root\s+' benchmark/stream_download/nginx.conf

Repository: cnpm/cnpmcore

Length of output: 122


🏁 Script executed:

#!/bin/bash
# View the context around the root directives
echo "=== Context around line 6 ==="
sed -n '1,15p' benchmark/stream_download/nginx.conf

echo ""
echo "=== Context around line 76 ==="
sed -n '70,82p' benchmark/stream_download/nginx.conf

Repository: cnpm/cnpmcore

Length of output: 551


Fix the server-level root directive in nginx.conf to match the mounted volume path.

The default root /var/www/html/; at the server level (line 6) does not match the mounted volume path /usr/share/nginx/html. The container will fail to serve content because /var/www/html is not mounted in docker-compose. The root /usr/share/nginx/html; directive at line 76 only applies to the error page location block and won't cover general requests.

Change line 6 to root /usr/share/nginx/html/; to align with the docker-compose volume mount.

🤖 Prompt for AI Agents
In benchmark/stream_download/docker-compose.nginx.yml around lines 9 to 11, the
nginx server-level root in the mounted nginx.conf points to /var/www/html which
doesn't match the docker-compose volume mount /usr/share/nginx/html; update the
server-level root directive in nginx.conf (line 6) to read root
/usr/share/nginx/html/ so general requests use the mounted content (leave the
existing error-page-specific root intact).

Comment on lines +46 to +52
# 将上传内容重定向到 /dev/null
client_body_in_file_only clean;
client_body_temp_path /tmp/nginx_temp;
client_max_body_size 0; # 不限制上传大小

# 使用 Lua 模块或直接返回 201
return 201;

🛠️ Refactor suggestion | 🟠 Major

Unnecessary file I/O for upload endpoint.

The configuration writes uploaded request bodies to temporary files (client_body_in_file_only clean) but immediately returns 201 without processing them. This creates unnecessary disk I/O that defeats the purpose of a benchmark testing streaming upload performance.

🔎 Recommended fix
     location /upload/ {
         # 只允许 POST 方法
         limit_except POST {
             deny all;
         }
         
-        # 将上传内容重定向到 /dev/null
-        client_body_in_file_only clean;
-        client_body_temp_path /tmp/nginx_temp;
-        client_max_body_size 0;  # 不限制上传大小
+        # 丢弃上传内容以测试纯上传性能
+        client_body_in_single_buffer on;
+        client_max_body_size 100m;  # 合理限制上传大小
         
-        # 使用 Lua 模块或直接返回 201
         return 201;

This approach reads the body into memory without writing to disk, which better simulates streaming upload scenarios.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        # 将上传内容重定向到 /dev/null
-        client_body_in_file_only clean;
-        client_body_temp_path /tmp/nginx_temp;
-        client_max_body_size 0;  # 不限制上传大小
-        # 使用 Lua 模块或直接返回 201
-        return 201;
+        # 丢弃上传内容以测试纯上传性能
+        client_body_in_single_buffer on;
+        client_max_body_size 100m;  # 合理限制上传大小
+        return 201;
🤖 Prompt for AI Agents
In benchmark/stream_download/nginx.conf around lines 46 to 52, the config forces
writing request bodies to disk (client_body_in_file_only clean and
client_body_temp_path) even though the handler immediately returns 201, causing
unnecessary I/O; remove or disable those directives (set
client_body_in_file_only off or delete the line and remove
client_body_temp_path), ensure requests are buffered in memory by adding or
increasing client_body_buffer_size (and optionally enable
client_body_in_single_buffer on) while keeping client_max_body_size 0 if
unlimited uploads are desired, so uploads are handled in-memory and the endpoint
returns 201 without disk writes.

# 将上传内容重定向到 /dev/null
client_body_in_file_only clean;
client_body_temp_path /tmp/nginx_temp;
client_max_body_size 0; # 不限制上传大小

⚠️ Potential issue | 🟡 Minor

Unlimited upload size is a DoS risk.

client_max_body_size 0 allows unlimited upload sizes, which could be exploited in a DoS attack by uploading extremely large files. While this is a benchmark environment, consider setting a reasonable limit (e.g., 100MB or 1GB) to prevent accidental resource exhaustion.

🤖 Prompt for AI Agents
In benchmark/stream_download/nginx.conf around line 49, client_max_body_size is
set to 0 (unlimited), which is a DoS risk; change this to a reasonable cap (for
example "100m" or "1g") appropriate for your benchmark needs, update the
directive to that value on line 49, and document the chosen limit in a comment
so future maintainers understand the trade-off.

Comment on lines +1 to +3
{
"type": "commonjs"
}

⚠️ Potential issue | 🔴 Critical

Change module type to ES modules to align with project standards.

The package.json specifies "type": "commonjs", which contradicts the project's coding guidelines and learnings that mandate ES modules (import/export) throughout the codebase. This forces the benchmark.js file to use CommonJS require() instead of modern ES6 import statements.

🔎 Proposed fix
 {
-  "type": "commonjs"
+  "type": "module"
 }

This change will require updating benchmark.js to use ES module syntax (see related comment on that file).

Based on learnings: Use ES modules (import/export) throughout the codebase - no CommonJS require()

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 {
-  "type": "commonjs"
+  "type": "module"
 }
🤖 Prompt for AI Agents
In benchmark/stream_download/package.json around lines 1 to 3, the module type
is set to "commonjs" but the project standard requires ES modules; change the
"type" value to "module" in package.json and then update benchmark.js to use ES
module syntax (replace require() with import statements and export default where
applicable), ensure any file extensions or relative import paths comply with ES
module rules, and run the benchmark to verify no runtime import errors remain.

Comment on lines +34 to +43
```bash
# 检查是否有测试文件,如果没有就创建一些
if [ ! -f "$MOUNT_DIR/test-file.txt" ]; then
    echo -e "${YELLOW}创建测试文件...${NC}"
    cp nginx/test-file.txt "$MOUNT_DIR/" 2>/dev/null || echo "测试文件已存在"
fi

if [ ! -f "$MOUNT_DIR/large-test-file.bin" ]; then
    echo -e "${YELLOW}创建大测试文件...${NC}"
    cp nginx/large-test-file.bin "$MOUNT_DIR/" 2>/dev/null || dd if=/dev/zero of="$MOUNT_DIR/large-test-file.bin" bs=1M count=10
fi
```

⚠️ Potential issue | 🟠 Major

Fix path confusion in test file creation logic.

Lines 37 and 42 attempt to copy files from nginx/ to $MOUNT_DIR/ where MOUNT_DIR is defined as $(pwd)/nginx, resulting in circular or redundant paths (e.g., cp nginx/test-file.txt $(pwd)/nginx/test-file.txt). Additionally, the || echo "测试文件已存在" fallback message is misleading when the fallback actually creates files using dd.

🔎 Recommended fix
 # 检查是否有测试文件,如果没有就创建一些
 if [ ! -f "$MOUNT_DIR/test-file.txt" ]; then
     echo -e "${YELLOW}创建测试文件...${NC}"
-    cp nginx/test-file.txt "$MOUNT_DIR/" 2>/dev/null || echo "测试文件已存在"
+    echo "This is a test file" > "$MOUNT_DIR/test-file.txt"
 fi
 
 if [ ! -f "$MOUNT_DIR/large-test-file.bin" ]; then
     echo -e "${YELLOW}创建大测试文件...${NC}"
-    cp nginx/large-test-file.bin "$MOUNT_DIR/" 2>/dev/null || dd if=/dev/zero of="$MOUNT_DIR/large-test-file.bin" bs=1M count=10
+    dd if=/dev/zero of="$MOUNT_DIR/large-test-file.bin" bs=1M count=10
 fi
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 # 检查是否有测试文件,如果没有就创建一些
 if [ ! -f "$MOUNT_DIR/test-file.txt" ]; then
     echo -e "${YELLOW}创建测试文件...${NC}"
-    cp nginx/test-file.txt "$MOUNT_DIR/" 2>/dev/null || echo "测试文件已存在"
+    echo "This is a test file" > "$MOUNT_DIR/test-file.txt"
 fi
 
 if [ ! -f "$MOUNT_DIR/large-test-file.bin" ]; then
     echo -e "${YELLOW}创建大测试文件...${NC}"
-    cp nginx/large-test-file.bin "$MOUNT_DIR/" 2>/dev/null || dd if=/dev/zero of="$MOUNT_DIR/large-test-file.bin" bs=1M count=10
+    dd if=/dev/zero of="$MOUNT_DIR/large-test-file.bin" bs=1M count=10
 fi
🤖 Prompt for AI Agents
In benchmark/stream_download/start-docker.sh around lines 34 to 43, the script
copies from "nginx/..." into $MOUNT_DIR which is set to $(pwd)/nginx causing
circular paths and a misleading fallback message; fix by (1) resolving a proper
source directory (e.g. SCRIPT_DIR or repo root) and use that as the cp source
instead of "nginx/" when $MOUNT_DIR points to ./nginx, (2) add a conditional: if
source file exists then cp into $MOUNT_DIR, else if target missing create it
with dd, and (3) update the fallback echo to accurately reflect the action taken
(e.g. "created with dd" vs "already exists"). Ensure file-existence checks use
absolute/consistent paths so cp is never attempted from the same directory as
the target.

Copilot AI (Contributor) left a comment

Pull request overview

This work-in-progress PR adds benchmark infrastructure for testing memory usage during stream download and upload operations. The implementation provides a Docker-based nginx server with Node.js benchmark scripts to measure memory behavior during file transfers.

Key changes:

  • Docker-based nginx server configuration for hosting test files and accepting uploads
  • Shell scripts for managing the Docker environment (start, stop, status monitoring)
  • Node.js benchmark script that continuously downloads and uploads a 50MB file to test memory usage

Reviewed changes

Copilot reviewed 13 out of 15 changed files in this pull request and generated 11 comments.

Show a summary per file
| File | Description |
| --- | --- |
| benchmark/stream_download/Dockerfile | Defines container image with Node.js 24.12.0 and nginx for running benchmarks |
| benchmark/stream_download/nginx.conf | Configures nginx with download and upload endpoints for testing |
| benchmark/stream_download/benchmark.js | Implements continuous download/upload loop for memory testing |
| benchmark/stream_download/start-docker.sh | Shell script to build and start the benchmark Docker container |
| benchmark/stream_download/stop-docker.sh | Shell script to stop and clean up Docker containers and images |
| benchmark/stream_download/status-docker.sh | Shell script to check Docker container status and resource usage |
| benchmark/stream_download/start-nginx.sh | Shell script to initialize and start nginx within the container |
| benchmark/stream_download/generate_50mb_file.sh | Generates a 50MB test file filled with '1' characters |
| benchmark/stream_download/docker-compose.nginx.yml | Docker Compose configuration for alternative nginx-only setup |
| benchmark/stream_download/README.md | Documentation for building, running, and testing the benchmark |
| benchmark/stream_download/package.json | Specifies CommonJS module type for Node.js scripts |
| benchmark/stream_download/nginx/test-file.txt | Small test file for basic download validation |
| benchmark/stream_download/curl-format.txt | Format template for detailed curl timing output |
| .gitignore | Excludes generated 50MB test file from version control |

### 运行容器

```bash
docker run -rm -d --platform linux/amd64 \
```
Copilot AI commented Dec 24, 2025

The docker run command has a typo: -rm should be --rm (with double dash). This will cause the command to fail as -rm is not a valid Docker flag.

Suggested change
-docker run -rm -d --platform linux/amd64 \
+docker run --rm -d --platform linux/amd64 \

Comment on lines 8 to 15
```js
  await urllib.request('http://127.0.0.1/download/50mb_ones.txt', {
    writeStream: fs.createWriteStream(tmpFilePath),
  });
  await urllib.request('http://127.0.0.1/upload/', {
    method: 'POST',
    stream: fs.createReadStream(tmpFilePath),
  });
  await fs.promises.rm(tmpFilePath);
```
Copilot AI commented Dec 24, 2025

The benchmark script lacks error handling around the download and upload operations. If any operation fails (network error, disk full, etc.), the script will crash without cleanup, potentially leaving orphaned temporary files. Wrap the operations in try-catch blocks and ensure cleanup happens in a finally block.

Suggested change
-  await urllib.request('http://127.0.0.1/download/50mb_ones.txt', {
-    writeStream: fs.createWriteStream(tmpFilePath),
-  });
-  await urllib.request('http://127.0.0.1/upload/', {
-    method: 'POST',
-    stream: fs.createReadStream(tmpFilePath),
-  });
-  await fs.promises.rm(tmpFilePath);
+  let downloadStream;
+  let uploadStream;
+  try {
+    downloadStream = fs.createWriteStream(tmpFilePath);
+    await urllib.request('http://127.0.0.1/download/50mb_ones.txt', {
+      writeStream: downloadStream,
+    });
+    uploadStream = fs.createReadStream(tmpFilePath);
+    await urllib.request('http://127.0.0.1/upload/', {
+      method: 'POST',
+      stream: uploadStream,
+    });
+  } finally {
+    if (downloadStream) {
+      downloadStream.destroy();
+    }
+    if (uploadStream) {
+      uploadStream.destroy();
+    }
+    try {
+      await fs.promises.rm(tmpFilePath, { force: true });
+    } catch {
+      // ignore cleanup errors in benchmark script
+    }
+  }

Comment on lines 8 to 11
```js
  await urllib.request('http://127.0.0.1/download/50mb_ones.txt', {
    writeStream: fs.createWriteStream(tmpFilePath),
  });
  await urllib.request('http://127.0.0.1/upload/', {
```
Copilot AI commented Dec 24, 2025

The hardcoded URL 'http://127.0.0.1/download/50mb_ones.txt' uses port 80, but the README and other scripts configure the service to run on port 8080. This mismatch will cause the benchmark to fail when run. The URL should be 'http://127.0.0.1:8080/download/50mb_ones.txt' to match the Docker port mapping.

Suggested change
-  await urllib.request('http://127.0.0.1/download/50mb_ones.txt', {
-    writeStream: fs.createWriteStream(tmpFilePath),
-  });
-  await urllib.request('http://127.0.0.1/upload/', {
+  await urllib.request('http://127.0.0.1:8080/download/50mb_ones.txt', {
+    writeStream: fs.createWriteStream(tmpFilePath),
+  });
+  await urllib.request('http://127.0.0.1:8080/upload/', {

```js
  await urllib.request('http://127.0.0.1/download/50mb_ones.txt', {
    writeStream: fs.createWriteStream(tmpFilePath),
  });
  await urllib.request('http://127.0.0.1/upload/', {
```
Copilot AI commented Dec 24, 2025

The hardcoded URL 'http://127.0.0.1/upload/' uses port 80, but the README and other scripts configure the service to run on port 8080. This mismatch will cause the benchmark to fail when run. The URL should be 'http://127.0.0.1:8080/upload/' to match the Docker port mapping.

Suggested change
-  await urllib.request('http://127.0.0.1/upload/', {
+  await urllib.request('http://127.0.0.1:8080/upload/', {

Comment on lines +37 to +42
cp nginx/test-file.txt "$MOUNT_DIR/" 2>/dev/null || echo "测试文件已存在"
fi

if [ ! -f "$MOUNT_DIR/large-test-file.bin" ]; then
echo -e "${YELLOW}创建大测试文件...${NC}"
cp nginx/large-test-file.bin "$MOUNT_DIR/" 2>/dev/null || dd if=/dev/zero of="$MOUNT_DIR/large-test-file.bin" bs=1M count=10
Copilot AI Dec 24, 2025

The conditional logic at lines 36-38 attempts to copy files that may not exist yet. The script references 'nginx/test-file.txt' from the current directory when the MOUNT_DIR variable already points to '$(pwd)/nginx'. This creates a circular reference where it tries to copy 'nginx/test-file.txt' to 'nginx/test-file.txt'. Consider checking if the file exists in MOUNT_DIR directly, or create a simple test file if it doesn't exist.

Suggested change
cp nginx/test-file.txt "$MOUNT_DIR/" 2>/dev/null || echo "测试文件已存在"
fi
if [ ! -f "$MOUNT_DIR/large-test-file.bin" ]; then
echo -e "${YELLOW}创建大测试文件...${NC}"
cp nginx/large-test-file.bin "$MOUNT_DIR/" 2>/dev/null || dd if=/dev/zero of="$MOUNT_DIR/large-test-file.bin" bs=1M count=10
echo "This is a small test file for nginx benchmark." > "$MOUNT_DIR/test-file.txt"
fi
if [ ! -f "$MOUNT_DIR/large-test-file.bin" ]; then
echo -e "${YELLOW}创建大测试文件...${NC}"
dd if=/dev/zero of="$MOUNT_DIR/large-test-file.bin" bs=1M count=10

dd if=/dev/zero bs=1024 count=$((TARGET_SIZE_MB * 1024)) | tr '\0' '1' > "$OUTPUT_FILE"

# 验证文件大小
ACTUAL_SIZE=$(stat -f%z "$OUTPUT_FILE" 2>/dev/null || stat -c%s "$OUTPUT_FILE" 2>/dev/null)
Copilot AI Dec 24, 2025

The stat command usage at line 22 attempts to use both BSD (-f%z) and GNU (-c%s) syntax with fallback, but the command structure will fail. When the BSD version fails, it will not silently fall back to the GNU version - instead, it will show an error. Consider using a more robust approach by checking the OS type first or using a conditional that properly handles the error without displaying it.

Suggested change
ACTUAL_SIZE=$(stat -f%z "$OUTPUT_FILE" 2>/dev/null || stat -c%s "$OUTPUT_FILE" 2>/dev/null)
ACTUAL_SIZE=$(wc -c < "$OUTPUT_FILE")

Comment on lines 18 to 22
(async () => {
while (true) {
await downloadAndUpload();
}
})();
Copilot AI Dec 24, 2025

The infinite loop in the benchmark script will run continuously without any delay or control mechanism. This makes it difficult to observe memory behavior over time and could overwhelm the system. Consider adding a configurable iteration count or delay between iterations, and implement graceful shutdown handling for SIGINT/SIGTERM signals.

Suggested change
(async () => {
while (true) {
await downloadAndUpload();
}
})();
let keepRunning = true;
process.once('SIGINT', () => {
  keepRunning = false;
});
process.once('SIGTERM', () => {
  keepRunning = false;
});
const maxIterationsEnv = process.env.BENCHMARK_MAX_ITERATIONS;
let maxIterations;
if (maxIterationsEnv) {
  const parsed = Number.parseInt(maxIterationsEnv, 10);
  if (Number.isFinite(parsed) && parsed > 0) {
    maxIterations = parsed;
  }
}
const delayEnv = process.env.BENCHMARK_DELAY_MS;
const delayMs = (() => {
  if (!delayEnv) {
    return 0;
  }
  const parsed = Number.parseInt(delayEnv, 10);
  if (!Number.isFinite(parsed) || parsed < 0) {
    return 0;
  }
  return parsed;
})();
async function main() {
  for (let i = 0; keepRunning && (maxIterations === undefined || i < maxIterations); i++) {
    await downloadAndUpload();
    if (!keepRunning) {
      break;
    }
    if (delayMs > 0) {
      await new Promise(resolve => setTimeout(resolve, delayMs));
    }
  }
}
main().catch(err => {
  // eslint-disable-next-line no-console
  console.error('Benchmark failed:', err);
  process.exitCode = 1;
});

Comment on lines +18 to +19
# chown -R www-data:www-data /var/www/html
# chmod -R 755 /var/www/html
Copilot AI Dec 24, 2025

The commented-out lines for setting permissions on /var/www/html reference 'www-data:www-data', but the base image is 'node:24.12.0' which is Debian-based and may not have the www-data user configured in the same way. While these lines are commented out, if they were to be enabled, they might fail or require adjustment for the Node.js base image.

Suggested change
# chown -R www-data:www-data /var/www/html
# chmod -R 755 /var/www/html

- '8080:80'
volumes:
- ./nginx.conf:/etc/nginx/conf.d/default.conf:ro
- ./nginx:/usr/share/nginx/html:ro
Copilot AI Dec 24, 2025

The volume mount path in docker-compose.nginx.yml uses '/usr/share/nginx/html' (line 11), which is the standard nginx path, but the nginx.conf file references '/var/www/html/' as the root directory (line 6 in nginx.conf) and the Dockerfile creates '/var/www/html' as the working directory. This path inconsistency will cause the nginx service to fail to serve files correctly.

Comment on lines +1 to +5
# Nginx 下载/上传测试服务器

## 快速开始

> **注意**: 请先切换到 `benchmark/stream_download` 目录下执行以下命令
Copilot AI Dec 24, 2025

The benchmark script is located in a directory called 'stream_download', but it performs both download AND upload operations. The directory name doesn't accurately reflect its purpose. Consider renaming to 'stream_benchmark' or 'stream_upload_download' to better represent the full scope of functionality.

Suggested change
# Nginx 下载/上传测试服务器
## 快速开始
> **注意**: 请先切换到 `benchmark/stream_download` 目录下执行以下命令
# Nginx 下载/上传流式基准测试服务器
## 快速开始
> **注意**: 请先切换到 `benchmark/stream_download` (下载/上传流式基准测试目录)下执行以下命令

node --expose-gc benchmark.js

use kill -s USR2
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (3)
benchmark/stream_download/benchmark.js (3)

1-5: Convert to ES module imports.

The file uses CommonJS require() statements, violating the project's coding guideline that mandates ES modules throughout.

This was previously flagged. Ensure benchmark/stream_download/package.json is updated with "type": "module" and urllib is added as a dependency.

🔎 Proposed fix
-const urllib = require('urllib');
-const fs = require('fs');
-const crypto = require('crypto');
-const path = require('path');
-const { setTimeout } = require('timers/promises');
+import urllib from 'urllib';
+import fs from 'fs';
+import crypto from 'crypto';
+import path from 'path';
+import { setTimeout } from 'timers/promises';
+import { fileURLToPath } from 'url';
+
+const __dirname = path.dirname(fileURLToPath(import.meta.url));

Based on learnings: Use ES modules (import/export) throughout the codebase.


7-17: Previously flagged issues remain unaddressed.

This function has several issues already identified in past reviews:

  1. Port mismatch: URLs use port 80 but Docker config exposes port 8080
  2. No error handling: Failures leave orphaned temp files
  3. Disk I/O overhead: Consider piping the download stream directly into the upload for pure stream benchmarking (a sketch of this follows the consolidated fix below)
🔎 Consolidated fix addressing all issues
 async function downloadAndUpload() {
   const tmpFilePath = path.join(__dirname, `${crypto.randomUUID()}.txt`);
-  await urllib.request('http://127.0.0.1/download/50mb_ones.txt', {
-    writeStream: fs.createWriteStream(tmpFilePath),
-  });
-  await urllib.request('http://127.0.0.1/upload/', {
-    method: 'POST',
-    stream: fs.createReadStream(tmpFilePath),
-  });
-  await fs.promises.rm(tmpFilePath);
+  try {
+    await urllib.request('http://127.0.0.1:8080/download/50mb_ones.txt', {
+      writeStream: fs.createWriteStream(tmpFilePath),
+      timeout: 60000,
+    });
+    await urllib.request('http://127.0.0.1:8080/upload/', {
+      method: 'POST',
+      stream: fs.createReadStream(tmpFilePath),
+      timeout: 60000,
+    });
+  } finally {
+    try {
+      await fs.promises.rm(tmpFilePath, { force: true });
+    } catch {
+      // ignore cleanup errors
+    }
+  }
 }
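
For item 3 above (avoiding the on-disk intermediary), a rough sketch using undici's request is shown below; it is illustrative only, not code from the PR, and the 8080 port is assumed from the Docker port mapping discussed in this thread:

```js
const { request } = require('undici');

// Hedged sketch: pipe the download response body straight into the upload
// request so no temporary file is written. URL and port are assumptions.
async function pipeDownloadToUpload() {
  const download = await request('http://127.0.0.1:8080/download/50mb_ones.txt');
  await request('http://127.0.0.1:8080/upload/', {
    method: 'POST',
    body: download.body, // readable stream from the download response
  });
}
```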

19-31: Add graceful shutdown and iteration logging.

The infinite loop lacks SIGINT/SIGTERM handling and progress logging, as previously flagged.

The idle mode with optional GC triggering (lines 24-28) is appropriate for the memory benchmark purpose. However, consider adding:

  • Iteration counter and timing logs to track progress
  • SIGINT/SIGTERM handlers for clean exit
🔎 Suggested enhancement
+let iteration = 0;
 let downloading = true;
+
+process.on('SIGINT', () => process.exit(0));
+process.on('SIGTERM', () => process.exit(0));
+
 (async () => {
   while (true) {
     if (downloading) {
+      const start = Date.now();
       await downloadAndUpload();
+      iteration++;
+      console.log(`Iteration ${iteration}: ${Date.now() - start}ms`);
     } else {
       await setTimeout(100);
       if (globalThis.gc) {
         globalThis.gc();
       }
     }
   }
 })();
🧹 Nitpick comments (1)
benchmark/stream_download/benchmark.js (1)

33-35: Add logging when toggle state changes.

The SIGUSR2 handler toggles state silently. Adding a log message helps confirm the signal was received and what mode the benchmark is now in.

🔎 Proposed enhancement
 process.on('SIGUSR2', () => {
   downloading = !downloading;
+  console.log(`Benchmark mode: ${downloading ? 'downloading' : 'idle (GC enabled)'}`);
 });
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2639007 and f27060c.

📒 Files selected for processing (1)
  • benchmark/stream_download/benchmark.js
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx,js,jsx}: Use single quotes for string literals
Use ES5 trailing commas (comma in last element of objects/arrays)
Use 2-space indentation
Limit lines to 120 character width
Avoid parentheses around single arrow function parameters
Limit function parameters to a maximum of 6
Warn on console.log usage in source code
Do not use anonymous default exports
Use ES modules syntax (import/export) throughout

Files:

  • benchmark/stream_download/benchmark.js
🧠 Learnings (2)
📚 Learning: 2025-12-22T13:51:37.326Z
Learnt from: CR
Repo: cnpm/cnpmcore PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-22T13:51:37.326Z
Learning: Applies to **/*.ts : Use ES modules (import/export) throughout the codebase - no CommonJS require()

Applied to files:

  • benchmark/stream_download/benchmark.js
📚 Learning: 2025-12-22T13:51:03.262Z
Learnt from: CR
Repo: cnpm/cnpmcore PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-22T13:51:03.262Z
Learning: Applies to **/*.{ts,tsx,js,jsx} : Use ES modules syntax (import/export) throughout

Applied to files:

  • benchmark/stream_download/benchmark.js
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
  • GitHub Check: test on mysql (node@24, shard@1/3, enableJSONBuilder@false)
  • GitHub Check: test on mysql (node@22, shard@0/3, enableJSONBuilder@false)
  • GitHub Check: test on mysql (node@22, shard@1/3, enableJSONBuilder@false)
  • GitHub Check: test on mysql (node@20, shard@2/3, enableJSONBuilder@true)
  • GitHub Check: test on mysql (node@22, shard@0/3, enableJSONBuilder@true)
  • GitHub Check: test on mysql (node@20, shard@0/3, enableJSONBuilder@true)
  • GitHub Check: test on mysql (node@24, shard@2/3, enableJSONBuilder@false)
  • GitHub Check: test on postgresql (node@22, shard@0/3)
  • GitHub Check: test on postgresql (node@20, shard@2/3)
  • GitHub Check: test on postgresql (node@22, shard@1/3)
  • GitHub Check: test on postgresql (node@20, shard@1/3)
  • GitHub Check: test on postgresql (node@22, shard@2/3)
  • GitHub Check: test on postgresql (node@24, shard@0/3)
  • GitHub Check: test on postgresql (node@24, shard@1/3)
  • GitHub Check: test on postgresql (node@20, shard@0/3)
  • GitHub Check: test on postgresql (node@24, shard@2/3)
  • GitHub Check: build-and-push-image

@fengmk2
Member

fengmk2 commented Dec 25, 2025

[image]

You can see that the JS heap and RSS actually differ by a large margin.

That is exactly the phenomenon in question.

Copilot AI review requested due to automatic review settings December 25, 2025 01:50
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (3)
benchmark/stream_download/Dockerfile (3)

1-1: Verify Node.js version 24.12.0 exists.

Please confirm that Node.js version 24.12.0 is a valid, published version. According to project guidelines, the Node.js version must be ^20.18.0 or ^22.18.0 or ^24.11.0.

Run the following to verify the version exists:

#!/bin/bash
# Check if Node.js 24.12.0 image exists on Docker Hub
curl -s "https://hub.docker.com/v2/repositories/library/node/tags?page_size=100&name=24.12" | jq -r '.results[].name' | grep -E "^24\.12\.0$"

Based on learnings, the project requires specific Node.js versions.


4-9: Consider image optimization suggestions from previous reviews.

Previous reviews suggested using --no-install-recommends and removing vim to reduce image size. These are optional optimizations for a benchmark image.


40-40: Consider adding urllib to package.json.

Previous reviews suggested declaring urllib in package.json dependencies rather than installing it directly with npm i. This would make dependencies explicit and improve Docker layer caching.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f27060c and b677416.

📒 Files selected for processing (7)
  • benchmark/stream_download/.gitignore
  • benchmark/stream_download/Dockerfile
  • benchmark/stream_download/README.md
  • benchmark/stream_download/benchmark.js
  • benchmark/stream_download/benchmark_undici.js
  • benchmark/stream_download/gc.js
  • benchmark/stream_download/generate_50mb_file.sh
✅ Files skipped from review due to trivial changes (1)
  • benchmark/stream_download/.gitignore
🚧 Files skipped from review as they are similar to previous changes (2)
  • benchmark/stream_download/README.md
  • benchmark/stream_download/generate_50mb_file.sh
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx,js,jsx}: Use single quotes for string literals
Use ES5 trailing commas (comma in last element of objects/arrays)
Use 2-space indentation
Limit lines to 120 character width
Avoid parentheses around single arrow function parameters
Limit function parameters to a maximum of 6
Warn on console.log usage in source code
Do not use anonymous default exports
Use ES modules syntax (import/export) throughout

Files:

  • benchmark/stream_download/benchmark_undici.js
  • benchmark/stream_download/benchmark.js
  • benchmark/stream_download/gc.js
🧠 Learnings (3)
📚 Learning: 2025-12-22T13:51:37.326Z
Learnt from: CR
Repo: cnpm/cnpmcore PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-22T13:51:37.326Z
Learning: Applies to **/*.ts : Use ES modules (import/export) throughout the codebase - no CommonJS require()

Applied to files:

  • benchmark/stream_download/benchmark.js
📚 Learning: 2025-12-22T13:51:03.262Z
Learnt from: CR
Repo: cnpm/cnpmcore PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-22T13:51:03.262Z
Learning: Applies to **/*.{ts,tsx,js,jsx} : Use ES modules syntax (import/export) throughout

Applied to files:

  • benchmark/stream_download/benchmark.js
📚 Learning: 2025-12-22T13:51:03.262Z
Learnt from: CR
Repo: cnpm/cnpmcore PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-22T13:51:03.262Z
Learning: Applies to .npmrc,.nvmrc : Node.js version must be ^20.18.0 or ^22.18.0 or ^24.11.0

Applied to files:

  • benchmark/stream_download/Dockerfile
🧬 Code graph analysis (2)
benchmark/stream_download/benchmark_undici.js (1)
benchmark/stream_download/benchmark.js (7)
  • require (5-5)
  • fs (2-2)
  • crypto (3-3)
  • path (4-4)
  • tmpFilePath (15-15)
  • downloadResponse (16-18)
  • downloading (32-32)
benchmark/stream_download/benchmark.js (1)
benchmark/stream_download/gc.js (1)
  • require (1-1)
🪛 Biome (2.1.2)
benchmark/stream_download/gc.js

[error] 61-61: Unexpected constant condition.

(lint/correctness/noConstantCondition)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: test on mysql (node@20, shard@1/3, enableJSONBuilder@true)
  • GitHub Check: test on mysql (node@20, shard@2/3, enableJSONBuilder@false)
  • GitHub Check: test on mysql (node@20, shard@0/3, enableJSONBuilder@false)
  • GitHub Check: build-and-push-image
  • GitHub Check: test on mysql (node@24, shard@0/3, enableJSONBuilder@false)
  • GitHub Check: test on mysql (node@22, shard@1/3, enableJSONBuilder@false)
  • GitHub Check: test on mysql (node@24, shard@1/3, enableJSONBuilder@true)
  • GitHub Check: test on mysql (node@22, shard@0/3, enableJSONBuilder@false)
  • GitHub Check: test on mysql (node@24, shard@1/3, enableJSONBuilder@false)
  • GitHub Check: test on mysql (node@22, shard@2/3, enableJSONBuilder@false)
  • GitHub Check: test on mysql (node@24, shard@0/3, enableJSONBuilder@true)
  • GitHub Check: test on postgresql (node@22, shard@2/3)
  • GitHub Check: test on postgresql (node@24, shard@2/3)
  • GitHub Check: test on postgresql (node@24, shard@1/3)
  • GitHub Check: test on postgresql (node@22, shard@0/3)
  • GitHub Check: test on postgresql (node@20, shard@0/3)
  • GitHub Check: test on postgresql (node@20, shard@2/3)
  • GitHub Check: test on postgresql (node@20, shard@1/3)
  • GitHub Check: test on postgresql (node@22, shard@1/3)
  • GitHub Check: test on postgresql (node@24, shard@0/3)
🔇 Additional comments (6)
benchmark/stream_download/benchmark_undici.js (2)

1-5: CommonJS usage is consistent with package.json.

The use of require() is appropriate since package.json specifies "type": "commonjs". While the project generally prefers ES modules, this benchmark module explicitly uses CommonJS.


7-21: LGTM! Clean stream-based download and upload flow.

The function properly uses undici's streaming API with pipeline for download and createReadStream for upload, then cleans up the temporary file.

benchmark/stream_download/gc.js (1)

1-58: LGTM! Well-structured GC statistics collection.

The PerformanceObserver setup correctly tracks GC events, categorizes them by kind, and provides aggregated statistics through getGCStats().

benchmark/stream_download/benchmark.js (3)

1-7: CommonJS usage is consistent with package.json.

The use of require() is appropriate since package.json specifies "type": "commonjs". The import of setTimeout from 'timers/promises' is correct for promise-based delays.


14-30: Core benchmark logic is sound.

The function correctly downloads to a temporary file, uploads from that file, and cleans up. The disk I/O is intentional for this benchmark (measuring stream download/upload with filesystem intermediary). Previous reviews noted missing error handling, which is a valid concern but may be acceptable for a benchmark script.


33-56: Parallel execution of identical benchmark loops.

Two identical IIFEs run concurrently, each executing 3 parallel download/upload cycles when active. This design intentionally creates parallel load for the benchmark. Previous reviews suggested adding error handling, metrics, and graceful shutdown—those remain valid suggestions for production-quality tooling.

if (downloading) {
await downloadAndUpload();
} else {
await setTimeout(100);
Contributor

⚠️ Potential issue | 🔴 Critical

Missing import for setTimeout.

Line 29 uses setTimeout(100) as a promise, but setTimeout is not imported from 'timers/promises'. This will fail at runtime: the global setTimeout expects a callback function as its first argument, and it returns a timer rather than a promise, so the await would not pause as intended.

🔎 Proposed fix
 const fs = require('fs');
 const crypto = require('crypto');
 const path = require('path');
+const { setTimeout } = require('timers/promises');
 const { pipeline } = require('stream/promises');

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In benchmark/stream_download/benchmark_undici.js around line 29, the code calls
await setTimeout(100) but never imports the promise-based setTimeout, so the
global setTimeout returns a timer id and causes a runtime error; fix by adding
an import for the promise-based setTimeout from 'timers/promises' (e.g., import
{ setTimeout } from "timers/promises") at the top of the file and keep using
await setTimeout(100) as intended.

}

// only print GC stats if the GC environment variable is set
if (process.env.GC || true) {
Contributor

🛠️ Refactor suggestion | 🟠 Major

Remove constant condition to make environment check meaningful.

The condition if (process.env.GC || true) always evaluates to true, making the environment variable check ineffective. The logging will run unconditionally regardless of the GC environment variable setting.

🔎 Proposed fix
-if (process.env.GC || true) {
+if (process.env.GC) {
   setInterval(() => {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (process.env.GC || true) {
if (process.env.GC) {
setInterval(() => {
🧰 Tools
🪛 Biome (2.1.2)

[error] 61-61: Unexpected constant condition.

(lint/correctness/noConstantCondition)

🤖 Prompt for AI Agents
In benchmark/stream_download/gc.js around line 61, the condition `if
(process.env.GC || true)` always evaluates to true; remove the constant `||
true` so the branch actually depends on the environment variable (e.g., use `if
(process.env.GC)` or a stricter check like `if (process.env.GC === '1' ||
process.env.GC === 'true')`), and keep the existing logging/behavior inside the
block unchanged so it only runs when GC is enabled via env.
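
As a small illustration of that stricter check (a sketch, not code from the PR; the helper name and interval are arbitrary), the flag could be parsed once and reused:

```js
// Sketch: treat only explicit truthy values as enabling GC stat logging.
function isGCLoggingEnabled() {
  const value = process.env.GC;
  return value === '1' || value === 'true';
}

if (isGCLoggingEnabled()) {
  setInterval(() => {
    // print aggregated GC stats here, e.g. via getGCStats() from gc.js
  }, 5000); // interval length is an arbitrary assumption
}
```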

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 16 out of 17 changed files in this pull request and generated 9 comments.

Comment on lines +24 to +35
(async () => {
while (true) {
if (downloading) {
await downloadAndUpload();
} else {
await setTimeout(100);
if (globalThis.gc) {
globalThis.gc();
}
}
}
})();
Copilot AI Dec 25, 2025

The infinite while loop with no error handling could cause the benchmark to run indefinitely even if operations are failing. If download/upload operations hang, this could accumulate many pending operations. Consider adding error handling and potentially a mechanism to detect and report failures.
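
One hedged way to keep the loop observable without letting a single failure crash it (a sketch, not the PR's code; the failure threshold is an arbitrary assumption) is to catch per-iteration errors and bail out after repeated failures:

```js
// Sketch: survive transient failures, but stop after too many in a row.
let consecutiveFailures = 0;

(async () => {
  while (true) {
    try {
      await downloadAndUpload();
      consecutiveFailures = 0;
    } catch (err) {
      consecutiveFailures++;
      console.error(`downloadAndUpload failed (${consecutiveFailures} in a row):`, err);
      if (consecutiveFailures >= 10) {
        process.exitCode = 1;
        break;
      }
    }
  }
})();
```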

COPY benchmark.js /root/workspace/benchmark.js
COPY benchmark_undici.js /root/workspace/benchmark_undici.js

RUN cd /root/workspace && npm i urllib --registry https://registry.npmmirror.com
Copilot AI Dec 25, 2025

The npm package installation uses a hardcoded Chinese mirror registry. For a project that may be used internationally, consider using the default npm registry or making the registry configurable. If the Chinese mirror is required for specific performance reasons, consider adding a comment explaining why.

Comment on lines 16 to 28
const downloadResponse = await urllib.request(`${BASE_URL}/download/50mb_ones.txt`, {
writeStream: fs.createWriteStream(tmpFilePath),
});
const uploadResponse = await urllib.request(`${BASE_URL}/upload/`, {
method: 'POST',
stream: fs.createReadStream(tmpFilePath),
});
await fs.promises.rm(tmpFilePath);
count++;
if (count % 100 === 0) {
console.log(
`Downloaded and uploaded ${count} times, downloadResponse: ${downloadResponse.status}, uploadResponse: ${uploadResponse.status}`,
);
Copilot AI Dec 25, 2025

Missing error handling for the download and upload operations. If either request fails, the temporary file may not be cleaned up, leading to disk space issues over time. Consider wrapping the operations in a try-finally block to ensure cleanup happens even on failure.

Suggested change
const downloadResponse = await urllib.request(`${BASE_URL}/download/50mb_ones.txt`, {
writeStream: fs.createWriteStream(tmpFilePath),
});
const uploadResponse = await urllib.request(`${BASE_URL}/upload/`, {
method: 'POST',
stream: fs.createReadStream(tmpFilePath),
});
await fs.promises.rm(tmpFilePath);
count++;
if (count % 100 === 0) {
console.log(
`Downloaded and uploaded ${count} times, downloadResponse: ${downloadResponse.status}, uploadResponse: ${uploadResponse.status}`,
);
let downloadResponse;
let uploadResponse;
try {
  downloadResponse = await urllib.request(`${BASE_URL}/download/50mb_ones.txt`, {
    writeStream: fs.createWriteStream(tmpFilePath),
  });
  uploadResponse = await urllib.request(`${BASE_URL}/upload/`, {
    method: 'POST',
    stream: fs.createReadStream(tmpFilePath),
  });
  count++;
  if (count % 100 === 0) {
    console.log(
      `Downloaded and uploaded ${count} times, downloadResponse: ${downloadResponse.status}, uploadResponse: ${uploadResponse.status}`,
    );
  }
} finally {
  try {
    await fs.promises.rm(tmpFilePath);
  } catch {
    // Ignore cleanup errors to avoid masking original failure
  }
}

}

// only print GC stats if the GC environment variable is set
if (process.env.GC || true) {
Copilot AI Dec 25, 2025

The condition process.env.GC || true will always evaluate to true, making the environment variable check ineffective. This means GC stats will always be printed regardless of the GC environment variable setting. Consider changing to just process.env.GC if you want to make it conditional, or remove the condition entirely if you always want logging.

Suggested change
if (process.env.GC || true) {
if (process.env.GC) {

Comment on lines +10 to +20
// Download file
const downloadResponse = await request('http://127.0.0.1/download/50mb_ones.txt');
await pipeline(downloadResponse.body, fs.createWriteStream(tmpFilePath));

// Upload file
await request('http://127.0.0.1/upload/', {
method: 'POST',
body: fs.createReadStream(tmpFilePath),
});

await fs.promises.rm(tmpFilePath);
Copilot AI Dec 25, 2025

Missing error handling for the download and upload operations. If either request fails, the temporary file may not be cleaned up, leading to disk space issues over time. Consider wrapping the operations in a try-finally block to ensure cleanup happens even on failure.

Suggested change
// Download file
const downloadResponse = await request('http://127.0.0.1/download/50mb_ones.txt');
await pipeline(downloadResponse.body, fs.createWriteStream(tmpFilePath));
// Upload file
await request('http://127.0.0.1/upload/', {
method: 'POST',
body: fs.createReadStream(tmpFilePath),
});
await fs.promises.rm(tmpFilePath);
try {
  // Download file
  const downloadResponse = await request('http://127.0.0.1/download/50mb_ones.txt');
  await pipeline(downloadResponse.body, fs.createWriteStream(tmpFilePath));
  // Upload file
  await request('http://127.0.0.1/upload/', {
    method: 'POST',
    body: fs.createReadStream(tmpFilePath),
  });
} finally {
  try {
    await fs.promises.rm(tmpFilePath, { force: true });
  } catch {
    // Ignore cleanup errors to avoid masking original failure
  }
}

# 设置根目录为 nginx 文件夹
root /var/www/html/;

# 禁用缓存以支持流式下载测试
Copilot AI Dec 25, 2025

The nginx configuration disables sendfile which is a performance optimization. While the comment states this is to support streaming download tests, consider documenting why this is necessary for the benchmark specifically, as sendfile is generally recommended for serving static files efficiently.

Suggested change
# 禁用缓存以支持流式下载测试
# 为了基准测试精确观测“流式”下载行为,这里显式关闭 sendfile。
# sendfile 通常在生产环境中应开启以提升静态文件性能(零拷贝),
# 但它会绕过用户态缓冲和分块发送逻辑,使得内核直接把文件数据写入 socket。
# 本基准测试需要测量应用/代理层逐步写出响应数据的吞吐与延迟,因此禁用 sendfile,
# 确保响应经由 nginx 用户态处理与缓冲,从而更真实地反映流式下载场景的表现。

# 将上传内容重定向到 /dev/null
client_body_in_file_only clean;
client_body_temp_path /tmp/nginx_temp;
client_max_body_size 0; # 不限制上传大小
Copilot AI Dec 25, 2025

Setting client_max_body_size to 0 disables the upload size limit entirely. For a benchmark this may be intentional, but it creates a potential denial-of-service vector if the server is exposed. Consider adding a comment explaining this is intentional for benchmarking purposes only and should not be used in production.

Suggested change
client_max_body_size 0; # 不限制上传大小
client_max_body_size 0; # 不限制上传大小(仅用于压测,禁止用于生产环境)

add_header Cache-Control "no-cache, no-store, must-revalidate";
add_header Pragma "no-cache";

# 允许跨域访问(测试需要)
Copilot AI Dec 25, 2025

The CORS configuration allows all origins with wildcard (*). While this may be acceptable for a local benchmark, consider adding a comment noting this is for testing only and should be restricted in any production-like scenario.

Suggested change
# 允许跨域访问(测试需要)
# 允许跨域访问(测试需要)
# 注意:以下 CORS 配置仅用于本地/基准测试环境,禁止直接用于生产环境。
# 在生产环境中应将 Access-Control-Allow-Origin 限制为受信任的具体域名,而不是使用通配符 *。

Comment on lines +33 to +56
(async () => {
while (true) {
if (downloading) {
await Promise.all([downloadAndUpload(), downloadAndUpload(), downloadAndUpload()]);
} else {
await setTimeout(100);
if (globalThis.gc) {
globalThis.gc();
}
}
}
})();
(async () => {
while (true) {
if (downloading) {
await Promise.all([downloadAndUpload(), downloadAndUpload(), downloadAndUpload()]);
} else {
await setTimeout(100);
if (globalThis.gc) {
globalThis.gc();
}
}
}
})();
Copilot AI Dec 25, 2025

The infinite while loop with concurrent Promise.all calls could lead to uncontrolled resource consumption. If download/upload operations are slow or hang, this could accumulate many pending operations. Consider adding concurrency limits or waiting for previous operations to complete before starting new ones.
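
A minimal sketch of one option, assuming the existing downloadAndUpload helper (the batch size is illustrative, not taken from the PR): Promise.allSettled keeps one failed iteration from rejecting the whole batch, and each batch is awaited before the next starts:

```js
// Sketch only: bounded, batch-at-a-time concurrency with failure isolation.
const CONCURRENCY = 3;

async function runBatch() {
  const results = await Promise.allSettled(
    Array.from({ length: CONCURRENCY }, () => downloadAndUpload()),
  );
  for (const result of results) {
    if (result.status === 'rejected') {
      console.error('iteration failed:', result.reason);
    }
  }
}

(async () => {
  while (true) {
    await runBatch();
  }
})();
```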

- Add run-benchmark-with-coredump.sh to run benchmark and generate coredump/heap snapshot
- Add copy-coredump.sh to copy dump files from Docker container
- Add COREDUMP_ANALYSIS.md with detailed analysis guide
- Update Dockerfile to install gdb and enable coredump support
- Update start-docker.sh with --privileged and --ulimit for coredump
- Update benchmark.js to use configurable connection pool (POOL_SIZE env)
- Update .gitignore to exclude coredump files
- Update README.md with memory analysis documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
benchmark/stream_download/Dockerfile (1)

1-1: Verify that Node.js version 24.12.0 exists and is valid.

A previous review comment indicated that version 24.12.0 may not be valid. According to project learnings, Node.js version must be ^20.18.0 or ^22.18.0 or ^24.11.0. While 24.12.0 would satisfy ^24.11.0 if it exists, please verify that this specific version is available and is a stable release.

Does Node.js version 24.12.0 exist and is it a stable release?

Based on learnings, the project requires Node.js version ^20.18.0 or ^22.18.0 or ^24.11.0.

🧹 Nitpick comments (4)
benchmark/stream_download/copy-coredump.sh (2)

21-24: Consider checking if container is running, not just if it exists.

The script uses docker ps -a which includes stopped containers. Since the script executes commands inside the container, it needs the container to be running. If the container exists but is stopped, this check will pass but subsequent docker exec commands will fail with less clear error messages.

🔎 Proposed fix
-if ! docker ps -a | grep -q "$CONTAINER_NAME"; then
+if ! docker ps | grep -q "$CONTAINER_NAME"; then
-    echo -e "${RED}Error: Container $CONTAINER_NAME not found${NC}"
+    echo -e "${RED}Error: Container $CONTAINER_NAME is not running${NC}"
     exit 1
 fi

28-87: Consider using IFS= with while read loops for robustness.

The script uses while read -r without setting IFS=, which could theoretically cause issues with filenames containing whitespace or special characters. While the risk is low in this controlled environment (Node.js-generated heap snapshots and core dumps with predictable names), adding IFS= would make the script more robust.

🔎 Example fix for one section
-    echo "$HEAP_FILES" | while read -r file; do
+    echo "$HEAP_FILES" | while IFS= read -r file; do
         if [ -n "$file" ]; then

Apply similar changes to the other while read loops on lines 45, 61, and 77.

benchmark/stream_download/Dockerfile (1)

15-16: Avoid using chmod 777 for security reasons.

Setting permissions to 777 on /tmp/cores allows any user to read, write, and execute files in this directory. Since this is a benchmark/development image, the security risk may be acceptable, but consider using more restrictive permissions like 755 or 775.

🔎 Proposed fix
 RUN echo "ulimit -c unlimited" >> /etc/bash.bashrc \
     && mkdir -p /tmp/cores \
-    && chmod 777 /tmp/cores
+    && chmod 775 /tmp/cores
benchmark/stream_download/run-benchmark-with-coredump.sh (1)

39-41: Consider verifying that core dump configuration succeeded.

The script uses || true to suppress errors when setting /proc/sys/kernel/core_pattern, which requires privileged access. While this makes the script more portable, it means configuration failures will be silent, potentially leading to no core dumps being generated.

🔎 Suggested enhancement
 echo -e "${YELLOW}Enabling core dumps in container...${NC}"
-docker exec "$CONTAINER_NAME" bash -c "ulimit -c unlimited && echo '/tmp/core.%p' > /proc/sys/kernel/core_pattern 2>/dev/null || true"
+docker exec "$CONTAINER_NAME" bash -c "ulimit -c unlimited && echo '/tmp/core.%p' > /proc/sys/kernel/core_pattern 2>/dev/null || echo 'Warning: Could not set core_pattern (requires --privileged)'"

This provides feedback when the configuration fails.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b677416 and b719d94.

📒 Files selected for processing (8)
  • benchmark/stream_download/.gitignore
  • benchmark/stream_download/COREDUMP_ANALYSIS.md
  • benchmark/stream_download/Dockerfile
  • benchmark/stream_download/README.md
  • benchmark/stream_download/benchmark.js
  • benchmark/stream_download/copy-coredump.sh
  • benchmark/stream_download/run-benchmark-with-coredump.sh
  • benchmark/stream_download/start-docker.sh
🚧 Files skipped from review as they are similar to previous changes (4)
  • benchmark/stream_download/.gitignore
  • benchmark/stream_download/start-docker.sh
  • benchmark/stream_download/benchmark.js
  • benchmark/stream_download/README.md
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-12-22T13:51:03.277Z
Learnt from: CR
Repo: cnpm/cnpmcore PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-22T13:51:03.277Z
Learning: Applies to .npmrc,.nvmrc : Node.js version must be ^20.18.0 or ^22.18.0 or ^24.11.0

Applied to files:

  • benchmark/stream_download/Dockerfile
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: test on mysql (node@24, shard@2/3, enableJSONBuilder@false)
  • GitHub Check: test on mysql (node@22, shard@2/3, enableJSONBuilder@false)
  • GitHub Check: test on mysql (node@24, shard@2/3, enableJSONBuilder@true)
  • GitHub Check: test on mysql (node@20, shard@2/3, enableJSONBuilder@true)
  • GitHub Check: test on mysql (node@24, shard@1/3, enableJSONBuilder@true)
  • GitHub Check: test on mysql (node@20, shard@0/3, enableJSONBuilder@true)
  • GitHub Check: test on mysql (node@22, shard@0/3, enableJSONBuilder@false)
  • GitHub Check: test on mysql (node@24, shard@0/3, enableJSONBuilder@false)
  • GitHub Check: test on mysql (node@20, shard@1/3, enableJSONBuilder@false)
  • GitHub Check: build-and-push-image
  • GitHub Check: test on postgresql (node@24, shard@2/3)
  • GitHub Check: test on postgresql (node@20, shard@0/3)
  • GitHub Check: test on postgresql (node@20, shard@1/3)
  • GitHub Check: test on postgresql (node@24, shard@0/3)
  • GitHub Check: test on postgresql (node@24, shard@1/3)
  • GitHub Check: test on postgresql (node@22, shard@2/3)
  • GitHub Check: test on postgresql (node@22, shard@0/3)
  • GitHub Check: test on postgresql (node@20, shard@2/3)
  • GitHub Check: test on postgresql (node@22, shard@1/3)
🔇 Additional comments (2)
benchmark/stream_download/run-benchmark-with-coredump.sh (2)

28-32: LGTM - Container check is correct.

The script correctly uses docker ps (without -a) to verify the container is running, which is appropriate since the script needs to execute commands inside it.


50-60: LGTM - PID retrieval logic is sound.

The script correctly uses pgrep -f to find the benchmark process and includes good error handling with diagnostic output. In this controlled environment, the pattern "node.*benchmark.js" should reliably match the intended process.

Comment on lines +14 to +16
RUN echo "ulimit -c unlimited" >> /etc/bash.bashrc \
&& mkdir -p /tmp/cores \
&& chmod 777 /tmp/cores
Contributor

⚠️ Potential issue | 🟠 Major

The ulimit configuration in bash.bashrc won't affect the container's CMD process.

Setting ulimit -c unlimited in /etc/bash.bashrc only affects interactive bash shells. The CMD process (/usr/local/bin/start-nginx.sh) won't inherit this setting. Core dumps must be configured at runtime when starting the container (e.g., via --ulimit core=-1 flag) or within the start-nginx.sh script itself.

🔎 Verification

Based on the PR context, run-benchmark-with-coredump.sh line 41 already sets ulimit at runtime:

docker exec "$CONTAINER_NAME" bash -c "ulimit -c unlimited && ..."

This confirms that the Dockerfile configuration alone is insufficient. Consider removing the ineffective bash.bashrc line or documenting that runtime configuration is required.

🤖 Prompt for AI Agents
In benchmark/stream_download/Dockerfile around lines 14 to 16, the RUN appending
of "ulimit -c unlimited" to /etc/bash.bashrc is ineffective for the container
CMD process; remove that line and either (a) document that core dumps must be
enabled at runtime (e.g., docker run/exec --ulimit core=-1 or
run-benchmark-with-coredump.sh already does ulimit inside the container), or (b)
add a ulimit invocation to the container startup script
(/usr/local/bin/start-nginx.sh) so the CMD process inherits unlimited core size;
pick one approach and update the Dockerfile or scripts and comments accordingly.


Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants