Show HN: Stop GPU pod placement from getting bottlenecked by reserved VRAM

5 days ago medicis1230 comments

We have built a GPU runtime for NVIDIA GPUs that can run multiple development, experimental, and inference workloads per GPU, with safe overcommit of VRAM, dynamic fractional allocation of GPU cores, and deduplication of weights in VRAM.

We are looking for teams to give it a try.

Details on getting a trial license: https://www.woolyai.com