Error Reference data file not found: MetaCyc_reference.RData when using pathway_annotation() in ggpicrust2 v2.1.2 #162

@santi-zaragoza

Hello @santi-zaragoza and @thardy615,

Thank you for reporting this issue. I've thoroughly analyzed the problem and can confirm this is a known issue that has been resolved in recent versions of the package.

Root Cause Analysis

The error Reference data file not found: MetaCyc_reference.RData occurs due to:

  1. Version-specific issue: You're likely using ggpicrust2 v2.1.2, which has a less robust file loading mechanism
  2. Installation method differences: The reference data files may not be properly installed depending on how the package was installed
  3. File path resolution: The older version only searches in one location for reference files
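
To make the third point concrete, here is a minimal sketch of the difference between checking one location and searching several. The helper name `find_reference_file` and the temporary directories are hypothetical stand-ins for illustration; they are not part of ggpicrust2, and the package's own loader may work differently.

```r
# Hypothetical helper (not part of ggpicrust2): return the first existing
# copy of `filename` among `search_dirs`, or fail with a message that lists
# every location that was tried.
find_reference_file <- function(filename, search_dirs) {
  candidates <- file.path(search_dirs, filename)
  found <- candidates[file.exists(candidates)]
  if (length(found) == 0) {
    stop(
      "Reference data file not found: ", filename,
      "\nSearched:\n", paste0("  - ", candidates, collapse = "\n"),
      call. = FALSE
    )
  }
  found[1]
}

# Demo: temporary directories stand in for real install locations
dir_a <- file.path(tempdir(), "lib_a")
dir_b <- file.path(tempdir(), "lib_b")
dir.create(dir_a, showWarnings = FALSE)
dir.create(dir_b, showWarnings = FALSE)
demo_file <- file.path(dir_b, "demo_reference.RData")
demo_reference <- data.frame(id = "X", description = "placeholder")
save(demo_reference, file = demo_file)

# Looking only in dir_a fails; searching both directories finds the file
single <- tryCatch(
  find_reference_file("demo_reference.RData", dir_a),
  error = function(e) NA_character_
)
multi <- find_reference_file("demo_reference.RData", c(dir_a, dir_b))
```

Listing the searched paths in the error message is a design choice that makes reports like this one easier to diagnose.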

Verified Solutions

Solution 1: Update to Latest Version (Recommended)

The issue has been fixed in version 2.3.2 with an enhanced load_reference_data function that:

  • Searches multiple file locations
  • Provides better error messages
  • Includes fallback mechanisms

# Remove current installation
remove.packages("ggpicrust2")

# Install latest version from CRAN (most stable)
install.packages("ggpicrust2")

# OR install latest development version from GitHub
# install.packages("devtools")
# devtools::install_github("cafferychen777/ggpicrust2")

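# Restart the R session so the updated package is loaded.
# .rs.restartR() is an internal RStudio helper and is not available in plain R;
# rstudioapi::restartSession() is the supported equivalent.
.rs.restartR()  # In RStudio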
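
To confirm whether an installation predates the fix, the installed version can be compared against 2.3.2. The sketch below uses a hypothetical helper, `needs_update`, built on base R's `package_version()`; the default threshold is simply the version named in this reply.

```r
# Hypothetical helper: TRUE when `installed` is older than the fixed release.
# The default "2.3.2" is the version named in this reply.
needs_update <- function(installed, fixed = "2.3.2") {
  package_version(installed) < package_version(fixed)
}

needs_update("2.1.2")
needs_update("2.3.2")

# With the package installed, the same check on the real version:
# needs_update(as.character(packageVersion("ggpicrust2")))
```

Comparing `package_version` objects rather than raw strings matters because string comparison can misorder versions such as 2.10.0 and 2.3.2.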

Solution 2: Verify Installation and Files

After installation, verify the reference files are present:

library(ggpicrust2)

# Check if reference files exist
ref_path <- system.file("extdata", "MetaCyc_reference.RData", package = "ggpicrust2")
cat("Reference file path:", ref_path, "\n")
cat("File exists:", file.exists(ref_path), "\n")

# List all files in extdata directory
extdata_path <- system.file("extdata", package = "ggpicrust2")
cat("Files in extdata directory:\n")
print(list.files(extdata_path))

# Check package version
cat("Package version:", as.character(packageVersion("ggpicrust2")), "\n")
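
If the file exists but annotation still fails, it can also help to look at what the file actually contains. The following is a sketch only: `inspect_rdata` is a hypothetical helper, and the demo runs on a throwaway file rather than on the package's reference data.

```r
# Hypothetical helper: load an .RData file into its own environment and
# report the name, class and dimensions of each object it contains.
inspect_rdata <- function(path) {
  if (!nzchar(path) || !file.exists(path)) {
    stop("File not found: ", path, call. = FALSE)
  }
  env <- new.env()
  object_names <- load(path, envir = env)
  data.frame(
    object = object_names,
    class = vapply(object_names, function(n) class(env[[n]])[1], character(1)),
    rows = vapply(object_names, function(n) NROW(env[[n]]), integer(1)),
    cols = vapply(object_names, function(n) NCOL(env[[n]]), integer(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Demo on a throwaway file
demo_path <- tempfile(fileext = ".RData")
demo_reference <- data.frame(id = c("A", "B"), description = c("x", "y"))
save(demo_reference, file = demo_path)
summary_df <- inspect_rdata(demo_path)
print(summary_df)

# On a real installation, pass the path from system.file() instead:
# inspect_rdata(system.file("extdata", "MetaCyc_reference.RData",
#                           package = "ggpicrust2"))
```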

Solution 3: Alternative Installation Methods

If the standard installation fails:

# Method 1: Install with dependencies
install.packages("ggpicrust2", dependencies = TRUE)

# Method 2: Install from GitHub with build_vignettes = FALSE
devtools::install_github("cafferychen777/ggpicrust2", 
                        build_vignettes = FALSE,
                        force = TRUE)

# Method 3: Install a specific version (assumes the repository has a v2.3.2 tag)
devtools::install_github("cafferychen777/ggpicrust2@v2.3.2")

Solution 4: Temporary Workaround (If update is not possible)

If you must use the current version temporarily:

# Create custom function to handle MetaCyc annotation
custom_metacyc_annotation <- function(daa_results_df) {
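  # Try to load reference data from multiple locations.
  # system.file() returns "" when nothing is found. For a normally installed
  # package the second lookup is expected to be empty, because the contents
  # of inst/ are copied to the package root at install time.
  possible_paths <- c(
    system.file("extdata", "MetaCyc_reference.RData", package = "ggpicrust2"),
    system.file("inst/extdata", "MetaCyc_reference.RData", package = "ggpicrust2")
  )

  ref_data <- NULL
  for (path in possible_paths) {
    if (nzchar(path) && file.exists(path)) {
      # Load into a private environment so nothing else is overwritten
      ref_env <- new.env()
      loaded <- load(path, envir = ref_env)
      # Prefer the expected object name; otherwise take the first object
      obj <- if ("MetaCyc_reference" %in% loaded) "MetaCyc_reference" else loaded[1]
      ref_data <- as.data.frame(ref_env[[obj]])
      break
    }
  }

  if (is.null(ref_data)) {
    stop("MetaCyc reference data not found in any location")
  }

  # Assumes the first two columns hold the pathway ID and its description
  if (ncol(ref_data) < 2) {
    stop("MetaCyc reference data has fewer than two columns")
  }
  colnames(ref_data)[1:2] <- c("id", "description")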
  
  # Perform annotation
  result <- daa_results_df
  result$description <- ref_data$description[match(result$feature, ref_data$id)]
  
  return(result)
}

# Use the custom function instead of pathway_annotation
# annotated_results <- custom_metacyc_annotation(your_daa_results)
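
The annotation step in the workaround relies on `match()`. A small self-contained sketch with toy data shows how it behaves, including for features that have no entry in the reference table. The IDs and descriptions below are placeholders, not real MetaCyc records.

```r
# Toy reference table; IDs and descriptions are placeholders, not MetaCyc data
toy_ref <- data.frame(
  id = c("PWY-A", "PWY-B"),
  description = c("placeholder description A", "placeholder description B"),
  stringsAsFactors = FALSE
)

toy_daa <- data.frame(
  feature = c("PWY-B", "PWY-Z", "PWY-A"),
  p_adjust = c(0.02, 0.04, 0.06),
  stringsAsFactors = FALSE
)

# match() gives, for each feature, the row of toy_ref with the same id (or NA)
idx <- match(toy_daa$feature, toy_ref$id)
toy_daa$description <- toy_ref$description[idx]

# Features that could not be annotated
unmatched <- toy_daa$feature[is.na(toy_daa$description)]
print(toy_daa)
print(unmatched)
```

Because unmatched features get `NA` instead of raising an error, it is worth checking for them explicitly after annotation.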

Verification Steps

After applying any solution, test with this code:

library(ggpicrust2)

# Test with sample data
data(metacyc_abundance)
data(metadata)

# Create sample DAA results for testing
test_daa <- data.frame(
  feature = c("PWY-1042", "PWY-5973", "PWY-6121"),
  p_values = c(0.01, 0.03, 0.05),
  p_adjust = c(0.02, 0.04, 0.06)
)

# Test MetaCyc annotation
tryCatch({
  annotated_results <- pathway_annotation(
    pathway = "MetaCyc",
    daa_results_df = test_daa,
    ko_to_kegg = FALSE
  )
  
  cat("✅ MetaCyc annotation successful!\n")
  cat("Number of features annotated:", nrow(annotated_results), "\n")
  cat("Features with descriptions:", sum(!is.na(annotated_results$description)), "\n")
  
}, error = function(e) {
  cat("❌ Error:", e$message, "\n")
})

Additional Information

For @thardy615: The corruption error during GitHub installation suggests network or dependency issues. Try:

  1. Installing from CRAN instead of GitHub
  2. Removing the existing installation and checking which library paths R uses: remove.packages("ggpicrust2"), then .libPaths()
  3. Installing with force = TRUE parameter
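
One thing that may be worth ruling out is a stale copy of the package in a second library path. The sketch below is an assumption about a possible cause, not a confirmed diagnosis; `find_installations` is a hypothetical helper that lists every library path containing a given package, along with the version found there.

```r
# Hypothetical helper: list every library path that contains a copy of `pkg`
find_installations <- function(pkg, libs = .libPaths()) {
  pkg_dirs <- file.path(libs, pkg)
  has_pkg <- file.exists(file.path(pkg_dirs, "DESCRIPTION"))
  read_version <- function(dir) {
    dcf <- read.dcf(file.path(dir, "DESCRIPTION"), fields = "Version")
    as.character(dcf[1, 1])
  }
  data.frame(
    library = libs[has_pkg],
    version = vapply(pkg_dirs[has_pkg], read_version, character(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# "stats" ships with R, so it is used here as a stand-in
print(find_installations("stats"))

# For this issue:
# find_installations("ggpicrust2")
```

More than one row in the result would mean R can see several copies, and the one loaded depends on the order of `.libPaths()`.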

Environment Information Needed

If the solutions above don't work, please provide:

# Run this and share the output
cat("R version:", R.version.string, "\n")
cat("Operating system:", Sys.info()["sysname"], Sys.info()["release"], "\n")
cat("ggpicrust2 version:", as.character(packageVersion("ggpicrust2")), "\n")
cat("Installation method: [CRAN/GitHub/other]\n")

# Check library paths
cat("Library paths:\n")
print(.libPaths())

# Check if package is properly installed
cat("Package location:", find.package("ggpicrust2"), "\n")

The latest version (2.3.2) includes comprehensive fixes for this issue and should resolve the problem permanently. Please try the update first, as it's the most reliable solution.

Best regards,
Caffery Yang
