
Commit 0f16484

dylanuys and benliang99 authored

Release 3.0.11 (#223)
* V3 (#187)
* V3
* removing v2 ci pipeline
* removing outdated .gitmodules
* keeping the noise that sample size of 50 and slight decay of .5 in EMA provides to avoid having any one model completely dominate the subnet (a sketch of this EMA smoothing follows the commit message)
* release 2.2.6 datasets, models, and lora support (#188)
* deprecate stable-diffusion-inpainting
* .env templates
* V3/RGB (#191)
* bgr images --> rgb images
* proper BGR -> RGB conversion
* eradicate all usage of bgr in image challenge flow
* extract frames as rgb
* skip extraneous rgb conversion
* fix deeperforensics consistency
* v2 frame sampling parity + eidon mp4 fix
* missing import
* handling improper reporting of fps in wembs
* correct content-type on miner side
* max_fpx setting
* improved video metadata extraction
* cleaning up ffprobe options
* fixing first frame rotation edge case
* i2i fix

---------

Co-authored-by: Dylan Uys <dylan@bitmind.ai>

* V3 frame extraction (#192)
* bgr images --> rgb images
* proper BGR -> RGB conversion
* eradicate all usage of bgr in image challenge flow
* extract frames as rgb
* skip extraneous rgb conversion
* fix deeperforensics consistency
* v2 frame sampling parity + eidon mp4 fix
* missing import
* handling improper reporting of fps in wembs
* correct content-type on miner side
* max_fpx setting
* improved video metadata extraction
* cleaning up ffprobe options
* fixing first frame rotation edge case
* i2i fix
* frame extraction

---------

Co-authored-by: Dylan Uys <dylan@bitmind.ai>

* setup.sh
* removing wandb log call from generator
* V3/2.2.9 (#189)
* mugshot dataset
* black
* i2v support and fixed prompt motion enhancement
* gen pipeline updates for i2v
* fixing prompt indexing
* properly handling new prompt dictionary key (task type)
* V3/2.2.11 (#190)
* mugshot dataset
* black
* i2v support and fixed prompt motion enhancement
* gen pipeline updates for i2v
* prompt sanitation + i2v model
* more retries for prompt sanitation
* fixing truthy tuple assertion
* Update min_compute.yml
* fixing setup script name in docs
* correct script name
* updated requirements.txt with bittensor-cli
* removing wandb.off
* import cleanup
* miner substrate thread restart + vali autoupdate test
* temporary v3 branch set to test autoupdate
* autoupdate update
* lower frequency of autoupdate check
* autoupdate test
* check autoupdate at step 0
* typo
* autoupdate test
* don't set weights immediately at startup in case of many restarts
* Pyproject toml (#193)
* pyproject setup
* executable setup.sh
* autoupdate test
* resetting version after autoupdate tests
* Add Hugging Face model access instructions to validator docs; improve logging and fix LLM device mapping for multi-GPU
  - Added section to Validating.md with instructions for gaining access to required Hugging Face models (FLUX.1-dev, DeepFloyd IF).
  - Added logging of generation arguments in generation_pipeline.py.
  - Fix LLM loading for multi-GPU in prompt_generator.py: use device_map and remove .to(self.device) for quantized models. Quantized LLMs must use device_map for correct device placement; calling .to(self.device) causes device mismatch errors. Parse GPU ID from device string for device_map assignment.
* fixing image_samples check for i2i
* hf_xet requirement
* wandb autorestart
* Fix: raise error if image is None for i2i/i2v tasks and ensure image is converted from array
* fixing wandb autorestart
* error log
* Update setup.sh to install Node.js 20.x LTS from NodeSource for pm2 compatibility; add doc note for existing validators' Hugging Face access
* external port for proxy cuz tensordock rugged us (#196)
* incentive doc
* Typo
* proxy updates
* v2 parity encoding (#197)
* final autoupdate test
* reset version

---------

Co-authored-by: Benjamin S Liang <caliangben@gmail.com>
Co-authored-by: Dylan Uys <dylan@bitmind.ai>

* autoupdate set to main
* testing autoupdate on testnet
* autoupdate enabled by default
* autoupdate testnet
* pointing autoupdate at main by default
* removing extra state load command
* setting back to 360 epoch length
* burn for initial v3 release rampup
* debug log typo
* fixed merge to testnet
* Max Frames and Timeout (#203)
* fixing wandb cache clean paths (#202)
* max frames configuration
* fn header update
* slight increase to timeout
* adding extra metadata to testnet requests for miners (#201)
* remove max size arg
* Testnet Metadata (#204)
* adding extra metadata to testnet requests for miners
* adding label and mediatype to testnet metadata
* Log Augmentation Parameters (#205)
* log augmentation params
* braindead typo
* bump version
* [testnet] Release 3.0.5 (#207)
* fix hotkey check in sync_metagraph
* bump version
* [testnet] Miner healthchecks (#209)
* healthcheck wip
* remove old miner health task vars, change miner healthcheck endpoint name
* health count logging
* fixing query based on health logic
* updating healthcheck interval to 10
* removing unnecessary lock
* use DEFAULT_TIMEOUT
* adding blacklisting for bad responses
* fixing detect_image fallback
* a couple comments
* update functionality for generator and proxy
* [testnet] Revised Miner Healthcheck (#210)
* adding basic health dict to miner tracker
* move all request processing logic to epistula module
* reflect request processing updates in eval engine
* healthy/unhealthy miner uid functions
* proxy using simpler miner health from tracker state
* new fn signature for score_challenge
* [testnet] Image Scraping (#213)
* scraper wip
* fixing queries with max date set in tbs, also adding placeholder for reverse image search which I can't get to work right now due to captchas
* taking first sentence of prompt as initial version of search queries
* adding specific media scraping interval config; adding retry logic and error handling to scraping callback
* Fixing enum value for output path in scraper
* add selenium to requirements
* Fixing str treated as enum
* cleaning up
* increasing media update intervals
* [testnet] Safer Miner Prediction History (#214)
* centralizing logic for safely getting valid predictions and associated labels
* cleaning up
* Aura Dataset and Mask W&B Logging (#217)
* adding bm-aura-imagegen dataset
* log mask as npy artifact
* format
* increasing media cache refresh default
* disabling scrape_new_media_on_interval
* warnings for missing keys in miner history state loading
* lowering media scraping default (not currently used)
* removing unused healthcheck endpoint in miner
* [testnet] Binary Image Proxy Endpoint (#220)
* binary image endpoint
* adding video endpoint without preprocessing (#221)
* fix frames list (should be np array)
* adding wan2.1-t2v-1.3B (#222)
* removing scraping from this release
* version bump
* whitespace
* putting peft requirement back
* ftfy requirement
* refactor(models): make Wan2.1 VAE loading lazy
  - Convert VAE loading to use lazy tuple pattern (fn, args)
  - Update load_vae to use kwargs for consistency
* Remove duplicate function
* fix torch_dtype typo for wan
* extending window to 200
* imports

---------

Co-authored-by: Benjamin S Liang <caliangben@gmail.com>
Co-authored-by: Dylan Uys <dylan@bitmind.ai>
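As referenced in the EMA bullet above, the following is a minimal sketch of how an exponential moving average with a 0.5 decay damps any single challenge result so no one model immediately dominates; the function and variable names are hypothetical and this is not the subnet's actual scoring code.

# Hypothetical sketch: an EMA with decay 0.5 keeps half of the previous score
# on each update, so one strong (or lucky) result cannot instantly dominate.
def update_ema(prev_ema: float, new_score: float, decay: float = 0.5) -> float:
    """Blend the previous EMA with the newest per-challenge score."""
    return decay * prev_ema + (1.0 - decay) * new_score

ema = 0.0
for score in [1.0, 1.0, 0.0, 1.0]:  # toy per-challenge scores
    ema = update_ema(ema, score)
    print(round(ema, 3))  # 0.5, 0.75, 0.375, 0.688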
1 parent 1a10825 commit 0f16484

File tree

8 files changed: +217 lines, -11 lines

VERSION

Lines changed: 1 addition & 1 deletion

@@ -1 +1 @@
-3.0.10
+3.0.11

bitmind/__init__.py

Lines changed: 1 addition & 1 deletion

@@ -1,4 +1,4 @@
-__version__ = "3.0.10"
+__version__ = "3.0.11"
 
 version_split = __version__.split(".")
 __spec_version__ = (
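The hunk context above is truncated at the __spec_version__ expression. A common Bittensor-subnet convention, assumed here rather than shown in this diff, packs major/minor/patch into a single integer for on-chain version comparisons:

# Assumed convention only; the repository's exact formula is cut off above.
__version__ = "3.0.11"
version_split = __version__.split(".")
__spec_version__ = (
    1000 * int(version_split[0])
    + 10 * int(version_split[1])
    + int(version_split[2])
)  # 3.0.11 -> 3011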

bitmind/generation/models.py

Lines changed: 28 additions & 0 deletions

@@ -17,12 +17,15 @@
     AutoPipelineForInpainting,
     CogView4Pipeline,
     CogVideoXImageToVideoPipeline,
+    WanPipeline,
+    AutoencoderKLWan
 )
 
 from bitmind.generation.model_registry import ModelRegistry
 from bitmind.generation.util.model import (
     load_hunyuanvideo_transformer,
     load_annimatediff_motion_adapter,
+    load_vae,
     JanusWrapper,
 )
 from bitmind.types import ModelConfig, ModelTask
@@ -255,6 +258,31 @@ def get_text_to_video_models() -> List[ModelConfig]:
         List of text-to-video model configurations
     """
     return [
+        ModelConfig(
+            path="Wan-AI/Wan2.1-T2V-1.3B-Diffusers",
+            task=ModelTask.TEXT_TO_VIDEO,
+            pipeline_cls=WanPipeline,
+            pretrained_args={
+                "vae": (
+                    load_vae,
+                    {
+                        "vae_cls": AutoencoderKLWan,
+                        "model_id": "Wan-AI/Wan2.1-T2V-1.3B-Diffusers",
+                        "subfolder": "vae",
+                        "torch_dtype": torch.float32
+                    }
+                ),
+                "torch_dtype": torch.bfloat16
+            },
+            generate_args={
+                "resolution": [480, 832],
+                "num_frames": 81,
+                "guidance_scale": 5.0
+            },
+            save_args={"fps": 15},
+            use_autocast=False,
+            tags=["wan2.1"]
+        ),
         ModelConfig(
             path="tencent/HunyuanVideo",
             task=ModelTask.TEXT_TO_VIDEO,
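The commit message describes converting the Wan2.1 VAE to a lazy (fn, kwargs) tuple inside pretrained_args. The pipeline code that consumes ModelConfig is not part of this diff, so the resolver below is only a sketch of how such a tuple might be expanded just before from_pretrained; the resolve_pretrained_args name is hypothetical.

# Hypothetical sketch: expand any (callable, kwargs) values in pretrained_args
# right before the pipeline is constructed, so heavy components such as the
# Wan2.1 VAE are only loaded when their model is actually selected.
from typing import Any, Dict

def resolve_pretrained_args(pretrained_args: Dict[str, Any]) -> Dict[str, Any]:
    """Expand lazy (loader_fn, kwargs) pairs into loaded objects."""
    resolved = {}
    for key, value in pretrained_args.items():
        if isinstance(value, tuple) and len(value) == 2 and callable(value[0]):
            loader_fn, loader_kwargs = value
            resolved[key] = loader_fn(**loader_kwargs)  # e.g. load_vae(vae_cls=..., ...)
        else:
            resolved[key] = value
    return resolved

# Usage (sketch): args = resolve_pretrained_args(model_config.pretrained_args)
#                 pipe = model_config.pipeline_cls.from_pretrained(model_config.path, **args)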

bitmind/generation/util/model.py

Lines changed: 19 additions & 0 deletions

@@ -14,6 +14,25 @@
 from typing import Any, Dict, Optional
 
 
+def load_vae(vae_cls, model_id, subfolder, torch_dtype=torch.float32):
+    """
+    Load a VAE model.
+
+    Args:
+        vae_cls: The VAE class to instantiate
+        model_id: The model ID to load from
+        subfolder: The subfolder containing the VAE weights
+        torch_dtype: The torch dtype to use (default: torch.float32)
+    Returns:
+        A loaded VAE model
+    """
+    return vae_cls.from_pretrained(
+        model_id,
+        subfolder=subfolder,
+        torch_dtype=torch_dtype
+    )
+
+
 def load_hunyuanvideo_transformer(
     model_id: str = "tencent/HunyuanVideo",
     subfolder: str = "transformer",
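For reference, the new helper can be exercised directly with the diffusers classes registered in models.py above; this is a usage sketch mirroring that configuration, not code from this commit.

import torch
from diffusers import AutoencoderKLWan, WanPipeline
from bitmind.generation.util.model import load_vae

# Load the VAE in float32 (as configured in the Wan2.1 ModelConfig) while the
# rest of the pipeline runs in bfloat16.
vae = load_vae(
    vae_cls=AutoencoderKLWan,
    model_id="Wan-AI/Wan2.1-T2V-1.3B-Diffusers",
    subfolder="vae",
    torch_dtype=torch.float32,
)
pipe = WanPipeline.from_pretrained(
    "Wan-AI/Wan2.1-T2V-1.3B-Diffusers", vae=vae, torch_dtype=torch.bfloat16
)
# Generation then uses the config's generate_args (480x832, 81 frames, guidance 5.0).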

bitmind/scoring/eval_engine.py

Lines changed: 1 addition & 1 deletion

@@ -173,7 +173,7 @@ def _get_rewards_for_challenge(
                 miner_modality_metrics[modality] = self._empty_metrics()
                 continue
 
            metrics = self._get_metrics(uid, modality, window=100)
-            metrics = self._get_metrics(uid, modality, window=100)
+            metrics = self._get_metrics(uid, modality, window=200)
 
            binary_weight = self.config.scoring.binary_weight
            multiclass_weight = self.config.scoring.multiclass_weight

bitmind/scoring/miner_history.py

Lines changed: 1 addition & 1 deletion

@@ -15,7 +15,7 @@ class MinerHistory:
 
     VERSION = 2
 
-    def __init__(self, store_last_n_predictions: int = 100):
+    def __init__(self, store_last_n_predictions: int = 200):
         self.predictions: Dict[int, Dict[Modality, deque]] = {}
         self.labels: Dict[int, Dict[Modality, deque]] = {}
         self.miner_hotkeys: Dict[int, str] = {}
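Raising store_last_n_predictions to 200 is what allows the window=200 change in eval_engine.py above to actually see 200 samples, presumably because the per-miner deques are created with maxlen=store_last_n_predictions. The snippet below is a small illustration of that bounded-deque behavior with toy values, not code from the repository.

from collections import deque

# A deque with maxlen keeps only the most recent N entries, silently dropping
# the oldest, so per-miner prediction history stays bounded at 200.
predictions = deque(maxlen=200)
for i in range(250):
    predictions.append(i)

print(len(predictions))                 # 200
print(predictions[0], predictions[-1])  # 50 249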

neurons/proxy.py

Lines changed: 164 additions & 6 deletions

@@ -15,6 +15,7 @@
 import httpx
 import numpy as np
 import uvicorn
+from bittensor.core.settings import SS58_FORMAT, TYPE_REGISTRY
 from cryptography.exceptions import InvalidSignature
 from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey
 from fastapi import (
@@ -29,11 +30,12 @@
 from fastapi.security import APIKeyHeader
 from PIL import Image
 from bittensor.core.axon import FastAPIThreadedServer
+from substrateinterface import SubstrateInterface
 
 from bitmind.config import MAINNET_UID
 from bitmind.encoding import media_to_bytes
 from bitmind.epistula import query_miner
-from bitmind.metagraph import get_miner_uids
+from bitmind.metagraph import get_miner_uids, run_block_callback_thread
 from bitmind.scoring.miner_history import MinerHistory
 from bitmind.transforms import get_base_transforms
 from bitmind.types import Modality, NeuronType
@@ -69,7 +71,7 @@ def process_image(self, b64_image: str) -> np.ndarray:
             bt.logging.error(f"Error processing image: {e}")
             raise ValueError(f"Failed to process image: {str(e)}")
 
-    def process_video(self, video_data: bytes) -> np.ndarray:
+    def process_video(self, video_data: bytes, transform_frames=True) -> np.ndarray:
         """
         Process raw video bytes into frames and preprocess
 
@@ -105,10 +107,11 @@ def process_video(self, video_data: bytes) -> np.ndarray:
                 bt.logging.error("No frames extracted from video")
                 raise ValueError("No frames extracted from video")
 
-            transformed_frames = get_base_transforms(self.target_size)(
-                np.stack(frames)
-            )
-            video_bytes, content_type = media_to_bytes(transformed_frames)
+            frames = np.stack(frames)
+            if transform_frames:
+                frames = get_base_transforms(self.target_size)(frames)
+
+            video_bytes, content_type = media_to_bytes(frames)
             return video_bytes, content_type
 
         except Exception as e:
@@ -266,12 +269,24 @@ def setup_app(self):
             methods=["POST"],
             dependencies=[Depends(self.verify_auth)],
         )
+        router.add_api_route(
+            "/forward_image_binary",
+            self.handle_binary_image_request,
+            methods=["POST"],
+            dependencies=[Depends(self.verify_auth)],
+        )
         router.add_api_route(
             "/forward_video",
             self.handle_video_request,
             methods=["POST"],
             dependencies=[Depends(self.verify_auth)],
         )
+        router.add_api_route(
+            "/forward_video_binary",
+            self.handle_binary_video_request,
+            methods=["POST"],
+            dependencies=[Depends(self.verify_auth)],
+        )
         router.add_api_route(
             "/healthcheck",
             self.healthcheck,
@@ -291,6 +306,69 @@ def setup_app(self):
         )
         self.fast_api = FastAPIThreadedServer(config=fast_config)
 
+    async def handle_binary_image_request(self, request: Request) -> Dict[str, Any]:
+        """
+        Handle raw JPEG image processing requests.
+
+        Args:
+            request: FastAPI request object with binary JPEG image data
+
+        Returns:
+            Dictionary with prediction results
+        """
+        start_time = time.time()
+        request_id = str(uuid.uuid4())[:8]
+        bt.logging.trace(f"[{request_id}] Starting binary image request processing")
+
+        try:
+            image_data = await request.body()
+            if not image_data:
+                raise HTTPException(
+                    status_code=status.HTTP_400_BAD_REQUEST,
+                    detail="Empty image data",
+                )
+
+            query_start = time.time()
+            results = await self.query_miners(
+                media_bytes=image_data,
+                content_type="image/jpeg",
+                modality=Modality.IMAGE,
+                request_id=request_id,
+            )
+            bt.logging.debug(
+                f"[{request_id}] Miners queried in {time.time() - query_start:.2f}s"
+            )
+
+            predictions, uids = self.process_query_results(results)
+            response = {
+                "preds": [float(p) for p in predictions],
+                "fqdn": socket.getfqdn(),
+            }
+
+            # Add rich data if requested
+            rich_param = request.query_params.get("rich", "").lower()
+            if rich_param == "true":
+                response.update(self.get_rich_data(uids))
+
+            total_time = time.time() - start_time
+            bt.logging.debug(
+                f"[{request_id}] Binary image request processed in {total_time:.2f}s"
+            )
+
+            if len(self.request_times["image"]) >= self.max_request_history:
+                self.request_times["image"].pop(0)
+            self.request_times["image"].append(total_time)
+
+            return response
+
+        except Exception as e:
+            bt.logging.error(f"[{request_id}] Error processing binary image request: {e}")
+            bt.logging.error(traceback.format_exc())
+            raise HTTPException(
+                status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
+                detail=f"Error processing request: {str(e)}",
+            )
+
     async def handle_image_request(self, request: Request) -> Dict[str, Any]:
         """
         Handle image processing requests.
@@ -441,6 +519,86 @@ async def handle_video_request(self, request: Request) -> Dict[str, Any]:
                 detail=f"Error processing request: {str(e)}",
             )
 
+    async def handle_binary_video_request(self, request: Request) -> Dict[str, Any]:
+        """
+        Handle video processing requests.
+
+        Args:
+            request: FastAPI request object with form data containing video file
+
+        Returns:
+            Dictionary with prediction results
+        """
+        start_time = time.time()
+        request_id = str(uuid.uuid4())[:8]
+        bt.logging.trace(f"[{request_id}] Starting video request processing")
+
+        try:
+            form = await request.form()
+            if "video" not in form:
+                raise HTTPException(
+                    status_code=status.HTTP_400_BAD_REQUEST,
+                    detail="Missing 'video' field in form data",
+                )
+
+            video_file = form["video"]
+            video_data = await video_file.read()
+
+            if not video_data:
+                raise HTTPException(
+                    status_code=status.HTTP_400_BAD_REQUEST, detail="Empty video file"
+                )
+
+            rich_param = form.get("rich", "").lower()
+
+            proc_start = time.time()
+            media_bytes, content_type = await asyncio.to_thread(
+                self.media_processor.process_video, video_data, transform_frames=False
+            )
+            bt.logging.trace(
+                f"[{request_id}] Video processed in {time.time() - proc_start:.2f}s"
+            )
+
+            query_start = time.time()
+            results = await self.query_miners(
+                media_bytes=media_bytes,
+                content_type=content_type,
+                modality=Modality.VIDEO,
+                request_id=request_id,
+            )
+            bt.logging.debug(
+                f"[{request_id}] Miners queried in {time.time() - query_start:.2f}s"
+            )
+
+            predictions, uids = self.process_query_results(results)
+            response = {
+                "preds": [float(p) for p in predictions],
+                "fqdn": socket.getfqdn(),
+            }
+
+            # Add rich data if requested
+            if rich_param == "true":
+                response.update(self.get_rich_data(uids))
+
+            total_time = time.time() - start_time
+            bt.logging.debug(
+                f"[{request_id}] Video request processed in {total_time:.2f}s"
+            )
+
+            if len(self.request_times["video"]) >= self.max_request_history:
+                self.request_times["video"].pop(0)
+            self.request_times["video"].append(total_time)
+            return response
+
+        except Exception as e:
+            bt.logging.error(f"Error processing video request: {e}")
+            bt.logging.error(traceback.format_exc())
+            raise HTTPException(
+                status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
+                detail=f"Error processing request: {str(e)}",
+            )
+
+
     async def query_miners(
         self,
         media_bytes: bytes,
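A client might call the two new binary endpoints roughly as follows. The base URL and API-key header are placeholders (the actual header name depends on the proxy's APIKeyHeader configuration), so treat this as a sketch against assumed conventions rather than documented API usage; the endpoint paths, the 'video' form field, and the 'rich' flag come from the diff above.

import httpx

BASE_URL = "http://localhost:10913"        # placeholder proxy address
HEADERS = {"Authorization": "my-api-key"}  # placeholder; header name is an assumption

# /forward_image_binary takes raw JPEG bytes directly in the request body.
with open("sample.jpg", "rb") as f:
    r = httpx.post(
        f"{BASE_URL}/forward_image_binary?rich=true",
        content=f.read(),
        headers=HEADERS,
        timeout=60,
    )
print(r.json()["preds"])

# /forward_video_binary expects multipart form data with a 'video' field; the
# proxy extracts frames but skips resizing (transform_frames=False) before
# forwarding them to miners.
with open("sample.mp4", "rb") as f:
    r = httpx.post(
        f"{BASE_URL}/forward_video_binary",
        files={"video": ("sample.mp4", f, "video/mp4")},
        data={"rich": "true"},
        headers=HEADERS,
        timeout=120,
    )
print(r.json()["preds"])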

requirements.txt

Lines changed: 2 additions & 1 deletion

@@ -21,4 +21,5 @@ opencv-python==4.11.0.86
 wandb==0.19.9
 uvicorn==0.27.1
 python-multipart==0.0.20
-peft==0.15.0
+peft==0.15.0
+ftfy==6.3.1
