
data-standardization

Here are 39 public repositories matching this topic...

🩺 A machine learning diabetes prediction model using a Support Vector Machine (SVM) classifier. It analyzes 8 medical features (glucose, BMI, age, etc.) from the Pima Indian dataset to predict diabetes risk with 75-80% accuracy. Built with Python, scikit-learn, and pandas. Includes data preprocessing, model training, and a prediction system.

  • Updated Jul 25, 2025
  • Jupyter Notebook

A new package processes textual descriptions of drone designs to extract structured summaries of their operational capabilities. It focuses on identifying and categorizing key features such as locomot

  • Updated Dec 21, 2025
  • Python

Hi folks. During my internship at KultureHire, I completed an end-to-end data analytics project. I created executive and functional dashboards using pivot tables, conducted a thorough analysis, and provided actionable recommendations. I'm excited to share my work and the insights I discovered.

  • Updated Nov 14, 2024

Highlighting expertise in data migration, data normalization and standardization, this project demonstrates successful data transfer from Snowflake to Databricks. It emphasizes optimized data flow and enhanced accessibility through standardization, showcasing a commitment to ethical data practices.

  • Updated Jul 3, 2024

🌟 Data Cleaning and Processing 🌟 Handled missing values, removed duplicates, standardized salary formats, and treated outliers for consistency. After the dataset was refined, the analysis revealed trends in company performance, job roles, and salary distributions. This project highlights the power of data preprocessing as the backbone of reliable analytics.

  • Updated Nov 15, 2025
  • Jupyter Notebook

This repository contains a SQL-based data cleaning project where raw layoffs data was transformed into a clean and structured dataset. The project showcases practical SQL techniques such as duplicate removal, data standardization, null handling, and schema optimization, following real-world data preparation best practices.

  • Updated Jan 12, 2026

Tutorial code for performing PCA (with mathematical explanation) on breast cancer features computed from digitized images of fine needle aspirate (FNA) of a breast mass. Center the data, calculate correlation matrix, compute principal components, visualize and interpret results.

  • Updated Dec 31, 2024
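
The steps named in the PCA description above (center the data, calculate the correlation matrix, compute principal components) can be sketched as follows. This is a minimal standard-library illustration, not the repository's code: the toy rows, the function names, and the use of power iteration (which finds only the leading component) are all assumptions made for the example.

```python
import math
import statistics


def standardize(rows):
    """Z-score each column: subtract its mean, divide by its sample std."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.stdev(c) for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def correlation_matrix(z):
    """Correlation matrix of already-standardized data: Z^T Z / (n - 1)."""
    n, d = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(d)]
            for i in range(d)]


def mat_vec(mat, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in mat]


def leading_component(mat, iters=500):
    """Power iteration for the largest eigenvalue and its unit eigenvector.

    Limitation: fails if the start vector happens to be orthogonal to the
    leading eigenvector; a full eigendecomposition avoids this.
    """
    d = len(mat)
    v = [float(i + 1) for i in range(d)]  # arbitrary non-uniform start
    for _ in range(iters):
        w = mat_vec(mat, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(mat, v)))
    return eigenvalue, v


# Hypothetical toy data: 5 samples, 3 features (not the breast cancer data).
rows = [
    [2.0, 4.1, 1.0],
    [3.0, 6.2, 0.5],
    [4.0, 7.9, 1.5],
    [5.0, 10.1, 0.7],
    [6.0, 12.0, 1.2],
]

z = standardize(rows)
corr = correlation_matrix(z)
eigenvalue, component = leading_component(corr)

# The trace of a correlation matrix equals the number of features, so this
# ratio is the share of total variance carried by the first component.
explained_ratio = eigenvalue / len(corr)

# Project each standardized sample onto the first principal component.
scores = [sum(a * b for a, b in zip(row, component)) for row in z]
```

Working from the correlation matrix rather than the covariance matrix is equivalent to running PCA on standardized features, which keeps features measured on large scales from dominating the components.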

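A Python-based ETL pipeline that standardizes heterogeneous IoT configuration data across 12 manufacturing sites. It provides automatic schema mapping, multi-source merging, and daily change-log generation for configuration lifecycle management. It automates the aggregation of 500,000+ IoT assets and generates daily audit trails, keeping platform logic consistent with edge-side implementation.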

  • Updated Jan 27, 2026
  • Python
