How to use CoreML on MacOS #81
I'm trying to enable CoreML on a FeatureExtraction pipeline, on an M1, with the model jinaai/jina-embeddings-v2-base-code. I start the session with:

```go
session, err := hugot.NewORTSession(
	options.WithOnnxLibraryPath(onnxPath),
	options.WithCoreML(make(map[string]string)),
)
```

Then I create the pipeline:

```go
config := hugot.FeatureExtractionConfig{
	ModelPath: modelPath,
	Name:      modelName,
}
p, err := hugot.NewPipeline(session, config)
```

And finally I run the pipeline:

```go
results, err := pipeline.RunPipeline([]string{text})
```

I build with the ORT tag, and if I remove the

Maybe it's the empty map of options, but I couldn't find any reference for the possible option values, only for the old version where a uint was used.
Hi @fabiodcorreia, it would be good to work through this together, as neither Riccardo nor I have any Apple devices to test on. So far I have just enabled the execution provider that was added in this commit: yalue/onnxruntime_go@217ac59. It shows the options that can be set, which are also documented here: https://onnxruntime.ai/docs/execution-providers/CoreML-ExecutionProvider.html. Could you also send me the path to the ORT library you installed? Was it the one here? https://github.com/microsoft/onnxruntime/releases/download/v1.22.0/onnxruntime-osx-arm64-1.22.0.tgz Perhaps also try the args from this Python example:
Here is what I found so far. Generating embeddings from 57 files with chunks of up to 2500 chars, using the model jina-embeddings-v2-base-code with one goroutine per core, I get the following results. Using just ONNX without CoreML takes ~1:30 and memory usage is flat at around 2 GB. Enabling CoreML:

When using MLProgram with other flags like MLComputeUnits, it looks like it falls back to ONNX without CoreML, because it takes the same time, memory is also flat at around 2 GB, and there is no GPU usage. With a smaller model, sentence-transformers/all-mpnet-base-v2, ONNX without CoreML takes 14s and with CoreML 44s. It looks like, regardless of scale, CoreML is always slower, and the strange thing is that I don't see any GPU activity. Besides CoreML, are there other options to use Metal on macOS? I would prefer ORT because I already have the scripts to download the ONNX runtime and everything ready for cross-compilation, and XLA looked more complex to ship as a dependency.
Hi @fabiodcorreia, before seeing this chat I did start taking a look yesterday at implementing the crossEncoder pipeline, since everyone seems to want it lol. Do you want me to leave it for you to contribute, or do you prefer that I implement it?
With Python I got the same behavior, but it shows more warning logs.

Setting `RequireStaticInputShapes=1` is also a lot faster compared to the value 0; with 0 it consumes all the memory of the computer most of the time. So in summary, the behavior of hugot is the same as Python's. Bad news for me, because after a run of 150 files I could fry eggs on the bottom of my MacBook, and my goal is to reach 4k files :/
Also, I don't see any advantage of CoreML vs ONNX only, at least for text embedding and cross-encoding text. Maybe for…