Model Gallery

22 models from 1 repository

depth-anything-3-base

Depth Anything 3 (base) monocular metric depth + camera pose, served via the native depth-anything.cpp backend (C++/ggml + purego, no Python at inference). Given an image it returns a dense depth map plus the recovered camera extrinsics (3x4) and intrinsics (3x3). Use GenerateImage (src -> normalized depth PNG at dst) or Predict (JSON depth stats + pose). q4_k is the recommended CPU default.

Repository: localai
License: apache-2.0
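The depth map, intrinsics (3x3) and extrinsics (3x4) described above are enough to lift a pixel into 3D. The sketch below shows one way to do that in plain Python. It assumes the extrinsics are world-to-camera, i.e. X_cam = R * X_world + t; the listing does not state the convention, so check the backend's output before relying on it. The function name `backproject` and the sample matrices are hypothetical.

```python
def backproject(u, v, depth, K, E):
    """Lift pixel (u, v) with a depth value to a 3D world point.

    K is a 3x3 intrinsics matrix, E is a 3x4 extrinsics matrix [R | t].
    ASSUMPTION: E maps world to camera (X_cam = R @ X_world + t).
    """
    fx, fy = K[0][0], K[1][1]
    cx, cy = K[0][2], K[1][2]

    # Pinhole model: pixel plus depth gives a camera-space point.
    xc = (u - cx) / fx * depth
    yc = (v - cy) / fy * depth
    zc = depth

    # Invert the rigid transform: X_world = R^T @ (X_cam - t).
    d = [xc - E[0][3], yc - E[1][3], zc - E[2][3]]
    return tuple(sum(E[r][c] * d[r] for r in range(3)) for c in range(3))


# Hypothetical camera: 500 px focal length, principal point at (320, 240).
K = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
E = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
point = backproject(320, 240, 2.0, K, E)
```

With an identity pose, the pixel at the principal point lands on the optical axis at the given depth. If the backend reports camera-to-world extrinsics instead, apply E directly rather than inverting it.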

depth-anything-3-base-q8_0

Depth Anything 3 (base), q8_0 — near-lossless 8-bit quant (~149 MB). Same depth + camera pose output as the q4_k default at higher fidelity.

Repository: localai
License: apache-2.0

depth-anything-3-base-f16

Depth Anything 3 (base), f16 — half precision (~233 MB), no measurable accuracy loss vs f32. Depth + camera pose.

Repository: localai
License: apache-2.0

depth-anything-3-base-f32

Depth Anything 3 (base), f32 — maximum fidelity (~412 MB). Reference-parity depth + camera pose.

Repository: localai
License: apache-2.0

depth-anything-3-giant

Depth Anything 3 (giant / vitg), f32 — the large backbone (~4.9 GB) for maximum quality depth + camera pose. GPU recommended.

Repository: localai
License: apache-2.0

depth-anything-3-small

Depth Anything 3 (small / vits), f32 — the smallest backbone (~131 MB) for fast CPU depth + camera pose. Same output as base at lower latency.

Repository: localai
License: apache-2.0

depth-anything-3-large

Depth Anything 3 (large / vitl), f32 (~1.6 GB) — higher quality depth + camera pose than base. GPU recommended for interactive use.

Repository: localai
License: apache-2.0

depth-anything-3-mono-large

Depth Anything 3 (monocular large / vitl), f32 (~1.3 GB) — single-image monocular depth + a sky mask (no camera pose). DPT single-head variant; use GenerateImage (src -> normalized depth PNG) or Predict (JSON depth stats).

Repository: localai
License: apache-2.0

depth-anything-3-metric-large

Depth Anything 3 (metric large / vitl), f32 (~1.3 GB) — single-image metric-scale depth (meters) + a sky mask. DPT single-head metric variant; use GenerateImage (src -> normalized depth PNG) or Predict (JSON metric depth stats, is_metric=true).

Repository: localai
License: apache-2.0

depth-anything-3-nested

Depth Anything 3 (nested giant+large), f32 — the recommended metric model. A two-branch pipeline: the anyview GIANT (vitg) branch and a metric ViT-L branch are run and aligned to recover true metric-scale depth (meters) + scaled camera pose from a single image. Downloads both branches (~6 GB total); GPU strongly recommended. Predict returns metric depth stats + pose (is_metric=true).

Repository: localai
License: apache-2.0
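The nested entry above says the two branches are "aligned" to recover metric scale, but the listing does not say how. As an illustration only, the sketch below shows one common alignment: a single least-squares scale factor that maps relative depth samples onto metric ones. This is an assumption, not the backend's documented method, and the name `fit_scale` is hypothetical.

```python
def fit_scale(relative, metric):
    """Return the scale s minimizing sum((s * r - m)^2) over paired samples.

    ASSUMPTION: a scale-only alignment; the actual pipeline may also use
    a shift, a robust estimator, or sky-mask filtering.
    """
    num = sum(r * m for r, m in zip(relative, metric))
    den = sum(r * r for r in relative)
    if den == 0:
        raise ValueError("relative depth samples are all zero")
    return num / den


# Hypothetical paired depth samples taken at the same pixels.
relative = [1.0, 2.0, 3.0]
metric = [2.0, 4.0, 6.0]
scale = fit_scale(relative, metric)
aligned = [scale * r for r in relative]
```

Applying the fitted scale to the relative depth map puts it in the same units as the metric branch, which is the effect the entry describes.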

depth-anything-2-base

Depth Anything V2 (base / ViT-B) monocular depth, served via the native depth-anything.cpp backend (C++/ggml + purego, no Python at inference). Given an image it returns a dense monocular depth map only — no camera pose, no confidence. This is the relative variant (relative inverse depth). Use GenerateImage (src -> normalized depth PNG at dst) or the Depth endpoint. q4_k is the recommended CPU default.

Repository: localai
License: apache-2.0
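Several entries mention a "normalized depth PNG". The listing does not specify the normalization, so the sketch below assumes a simple per-image min-max rescale to 8-bit gray values, which is a common choice for relative depth. The function name `normalize_to_uint8` is hypothetical.

```python
def normalize_to_uint8(depth):
    """Min-max rescale a 2D depth map (list of rows) to integers in 0..255.

    ASSUMPTION: per-image min-max normalization; the backend may use a
    different range, bit depth, or an inverted scale.
    """
    flat = [v for row in depth for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A constant map carries no contrast; return all zeros.
        return [[0 for _ in row] for row in depth]
    # int(x + 0.5) rounds half up for the non-negative values used here.
    return [[int((v - lo) / span * 255 + 0.5) for v in row] for row in depth]


gray = normalize_to_uint8([[0.0, 1.0], [0.5, 2.0]])
```

Because the rescale is per image, gray values from two different PNGs are not comparable, and absolute distances cannot be recovered from the PNG alone.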

depth-anything-2-base-q8_0

Depth Anything V2 (base / ViT-B), q8_0 — near-lossless 8-bit quant. Same relative monocular depth output as the q4_k default at higher fidelity. Use GenerateImage (src -> depth PNG) or the Depth endpoint.

Repository: localai
License: apache-2.0

depth-anything-2-base-f16

Depth Anything V2 (base / ViT-B), f16 — half precision, no measurable accuracy loss vs f32. Relative monocular depth only (no pose). Use GenerateImage (src -> depth PNG) or the Depth endpoint.

Repository: localai
License: apache-2.0

depth-anything-2-base-f32

Depth Anything V2 (base / ViT-B), f32 — maximum reference fidelity. Relative monocular depth only (no pose). Use GenerateImage (src -> depth PNG) or the Depth endpoint.

Repository: localai
License: apache-2.0

depth-anything-2-small

Depth Anything V2 (small / ViT-S), f32 — the smallest, fastest backbone for relative monocular depth on CPU. Depth only (no pose). Use GenerateImage (src -> depth PNG) or the Depth endpoint.

Repository: localai
License: apache-2.0

depth-anything-2-large

Depth Anything V2 (large / ViT-L), f32 — higher-quality relative monocular depth than base. Depth only (no pose). Use GenerateImage (src -> depth PNG) or the Depth endpoint.

Repository: localai
License: apache-2.0

depth-anything-2-metric-hypersim-small

Depth Anything V2 Metric (Hypersim, indoor / ViT-S), q4_k — metric monocular depth in METRES (indoor, max_depth 20). Depth only (no pose). Use GenerateImage (src -> depth PNG) or the Depth endpoint.

Repository: localai
License: apache-2.0

depth-anything-2-metric-hypersim-base

Depth Anything V2 Metric (Hypersim, indoor / ViT-B), q4_k — metric monocular depth in METRES (indoor, max_depth 20). Depth only (no pose). Use GenerateImage (src -> depth PNG) or the Depth endpoint.

Repository: localai
License: apache-2.0

depth-anything-2-metric-hypersim-large

Depth Anything V2 Metric (Hypersim, indoor / ViT-L), q4_k — highest-quality metric monocular depth in METRES (indoor, max_depth 20). Depth only (no pose). Use GenerateImage (src -> depth PNG) or the Depth endpoint.

Repository: localai
License: apache-2.0

depth-anything-2-metric-vkitti-small

Depth Anything V2 Metric (Virtual KITTI, outdoor / ViT-S), q4_k — metric monocular depth in METRES (outdoor, max_depth 80). Depth only (no pose). Use GenerateImage (src -> depth PNG) or the Depth endpoint.

Repository: localai
License: apache-2.0

depth-anything-2-metric-vkitti-base

Depth Anything V2 Metric (Virtual KITTI, outdoor / ViT-B), q4_k — metric monocular depth in METRES (outdoor, max_depth 80). Depth only (no pose). Use GenerateImage (src -> depth PNG) or the Depth endpoint.

Repository: localai
License: apache-2.0
