Model Gallery

9 models from 1 repository

insightface-opencv

Face recognition using OpenCV Zoo weights: YuNet detector + SFace 128-d recognizer (fp32). APACHE 2.0: safe for commercial use. Lower accuracy than the insightface packs and no demographic head, so `/v1/face/analyze` returns detection regions only. Weights (~40MB) are downloaded on install via LocalAI's gallery mechanism.

Repository: localai
License: apache-2.0

insightface-opencv-int8

Int8-quantized OpenCV Zoo face pair (YuNet int8 + SFace int8, ~12MB). Roughly 3x smaller and noticeably faster on CPU than the fp32 variant at comparable accuracy for face tasks. APACHE 2.0 — commercial-safe. Weights are downloaded on install via LocalAI's gallery mechanism.

Repository: localai
License: apache-2.0
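As a rough illustration of why an int8 variant is smaller, here is a minimal sketch of symmetric per-tensor int8 quantization. This is a generic technique and not necessarily the scheme used to produce the OpenCV Zoo int8 weights; the function names and sample weights are hypothetical.

```python
import struct

def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float values from the stored integers."""
    return [v * scale for v in q]

weights = [0.12, -0.5, 0.33, 0.0, -0.07, 0.49]  # hypothetical fp32 weights
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# Compare storage cost: 4 bytes per fp32 value vs 1 byte per int8 value.
fp32_bytes = len(struct.pack(f"{len(weights)}f", *weights))
int8_bytes = len(struct.pack(f"{len(q)}b", *q))
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(fp32_bytes, int8_bytes, max_error)
```

The raw tensor shrinks by 4x in this sketch, while the listing reports roughly 3x for the full download (~40MB vs ~12MB). The rounding error per weight is bounded by half the scale.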

face-detect-yunet-sface

Face recognition with OpenCV Zoo weights: YuNet detector + SFace 128-d recognizer, converted to a C++/ggml GGUF for the `face-detect` backend. APACHE 2.0: safe for commercial use. Lower accuracy than the buffalo packs and no demographic head, but the commercial-friendly alternative to the insightface buffalo line. The architecture (`facedetect.arch`) is read from the GGUF metadata, so this entry alone selects the YuNet + SFace engine.

Repository: localai
License: apache-2.0
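To make the "architecture is read from the GGUF metadata" idea concrete, the sketch below builds a tiny in-memory GGUF header with a single string key and reads it back. It follows the public GGUF header layout (magic, version, tensor count, key/value count, then key/value pairs) but handles string values only. The key `facedetect.arch` comes from the description above; the value `yunet-sface` and all helper names are hypothetical, and the real backend is C++ and may read the file differently.

```python
import io
import struct

GGUF_STRING = 8  # value-type id for strings in the GGUF format

def _write_string(buf, text):
    data = text.encode("utf-8")
    buf.write(struct.pack("<Q", len(data)))  # uint64 length prefix
    buf.write(data)

def build_tiny_gguf(metadata):
    """Build an in-memory GGUF header holding only string key/value pairs."""
    buf = io.BytesIO()
    buf.write(b"GGUF")                           # magic
    buf.write(struct.pack("<I", 3))              # format version
    buf.write(struct.pack("<Q", 0))              # tensor count (none here)
    buf.write(struct.pack("<Q", len(metadata)))  # metadata key/value count
    for key, value in metadata.items():
        _write_string(buf, key)
        buf.write(struct.pack("<I", GGUF_STRING))
        _write_string(buf, value)
    return buf.getvalue()

def _read_string(stream):
    (length,) = struct.unpack("<Q", stream.read(8))
    return stream.read(length).decode("utf-8")

def read_string_metadata(blob):
    """Parse the header and return all string-valued metadata entries."""
    stream = io.BytesIO(blob)
    if stream.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    (_version,) = struct.unpack("<I", stream.read(4))
    _tensor_count, kv_count = struct.unpack("<QQ", stream.read(16))
    result = {}
    for _ in range(kv_count):
        key = _read_string(stream)
        (value_type,) = struct.unpack("<I", stream.read(4))
        if value_type != GGUF_STRING:
            raise NotImplementedError("this sketch only handles string values")
        result[key] = _read_string(stream)
    return result

# The arch value below is a made-up placeholder, not the real stored string.
blob = build_tiny_gguf({"facedetect.arch": "yunet-sface"})
meta = read_string_metadata(blob)
print(meta["facedetect.arch"])
```

Because the engine choice travels inside the file, a loader written this way needs no separate configuration flag to pick the model family.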

speechbrain-ecapa-tdnn

Speaker (voice) recognition with SpeechBrain's ECAPA-TDNN trained on VoxCeleb. 192-d L2-normalised embeddings, ~1.9% Equal Error Rate on VoxCeleb1-O. APACHE 2.0: commercial-safe. The checkpoint is auto-downloaded from HuggingFace on the first LoadModel call, so the gallery `files:` list carries no separate weight file. The entry points directly at the upstream SpeechBrain HuggingFace repo, so every deployment fetches the same bytes.

Repository: localai
License: apache-2.0
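The Equal Error Rate quoted above is the operating point where the false accept rate equals the false reject rate. The following minimal sketch approximates it by sweeping thresholds over a set of trial scores. The scores are hypothetical toy values, not VoxCeleb1-O results, and real evaluations typically interpolate between thresholds rather than picking the closest one.

```python
def equal_error_rate(genuine, impostor):
    """Approximate EER by trying every observed score as a threshold."""
    best_gap, eer = None, None
    for threshold in sorted(set(genuine) | set(impostor)):
        # Same-speaker trials scored below the threshold are falsely rejected.
        frr = sum(s < threshold for s in genuine) / len(genuine)
        # Different-speaker trials scored at or above it are falsely accepted.
        far = sum(s >= threshold for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer

genuine = [0.82, 0.75, 0.91, 0.40, 0.68]   # hypothetical same-speaker scores
impostor = [0.10, 0.25, 0.45, 0.05, 0.30]  # hypothetical different-speaker scores
eer = equal_error_rate(genuine, impostor)
print(eer)
```

With this toy data one of five trials on each side is misclassified at the crossing point. A lower EER means the two score distributions overlap less.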

wespeaker-resnet34

Speaker recognition with WeSpeaker's ResNet34 trained on VoxCeleb, exported to ONNX. 256-d embeddings, CPU-friendly: it needs only onnxruntime and avoids the PyTorch runtime entirely. CC-BY-4.0. Pair with the `speaker-recognition` backend's OnnxDirectEngine. Use when ECAPA-TDNN's torch dependency is undesirable (small images, edge deployments).

Repository: localai
License: cc-by-4.0

voice-detect-ecapa-tdnn

Speaker (voice) recognition with SpeechBrain's ECAPA-TDNN trained on VoxCeleb, ported to C++/ggml and shipped as a single GGUF for the `voice-detect` backend. 192-d L2-normalised embeddings, ~1.9% Equal Error Rate on VoxCeleb1-O. APACHE 2.0 - commercial-safe. No Python / torch runtime: voice-detect.cpp reads the embedding architecture (`voicedetect.arch`) directly from the GGUF metadata, so installing this entry is all that is needed to select ECAPA-TDNN. Drives the VoiceVerify / VoiceEmbed gRPC rpcs and the /v1/voice/{verify,embed,register,identify,forget} REST endpoints.

Repository: localai
License: apache-2.0
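Verification with L2-normalised embeddings usually reduces to a dot product, since for unit-length vectors the dot product equals the cosine similarity. The sketch below shows that comparison on toy 4-d vectors (real embeddings here are 192-d). The listing does not state how the backend scores a pair, so cosine scoring, the 0.5 threshold, and all names are assumptions for illustration.

```python
import math

def l2_normalise(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalise a zero vector")
    return [x / norm for x in vec]

def cosine_score(a, b):
    """For unit-length vectors the dot product equals the cosine similarity."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    return sum(x * y for x, y in zip(a, b))

def same_speaker(a, b, threshold=0.5):  # threshold is a hypothetical value
    return cosine_score(a, b) >= threshold

# Toy embeddings standing in for model output.
enrolled = l2_normalise([0.9, 0.1, 0.3, 0.2])
probe_same = l2_normalise([0.8, 0.2, 0.35, 0.15])
probe_other = l2_normalise([-0.2, 0.9, -0.1, 0.4])

print(same_speaker(enrolled, probe_same), same_speaker(enrolled, probe_other))
```

The dimension check matters when mixing entries from this gallery: a 192-d embedding from one model cannot be compared with a 256-d embedding from another.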

voice-detect-wespeaker-resnet34

Speaker recognition with WeSpeaker's ResNet34 trained on VoxCeleb, converted to a C++/ggml GGUF for the `voice-detect` backend. 256-d embeddings, CPU-friendly and runtime-free (no onnxruntime or torch). CC-BY-4.0. Use when you want WeSpeaker's ResNet34 topology instead of ECAPA-TDNN. The embedding architecture (`voicedetect.arch`) is read from the GGUF metadata, so this entry alone selects the engine.

Repository: localai
License: cc-by-4.0

voice-detect-eres2net

Speaker recognition with 3D-Speaker's ERes2Net trained on VoxCeleb, converted to a C++/ggml GGUF for the `voice-detect` backend. 192-d embeddings with strong verification accuracy. APACHE 2.0. The embedding architecture (`voicedetect.arch`) is read from the GGUF metadata, so this entry alone selects the ERes2Net engine.

Repository: localai
License: apache-2.0

voice-detect-campplus

Speaker recognition with 3D-Speaker's CAM++ trained on VoxCeleb, converted to a C++/ggml GGUF for the `voice-detect` backend. 192-d embeddings, a fast context-aware masking topology well-suited to CPU and edge deployments. APACHE 2.0. The embedding architecture (`voicedetect.arch`) is read from the GGUF metadata, so this entry alone selects the CAM++ engine.

Repository: localai
License: apache-2.0