Show HN: Run TRELLIS.2 Image-to-3D generation natively on Apple Silicon

(github.com)

119 points | by shivampkumar 4 hours ago

6 comments

gondar 2 hours ago
Nice work, although this model is not very good. I've tried a lot of different image-to-3D models, and the one from meshy.ai is the best; TRELLIS is in the useless tier. I really hope some good open-source models appear in this domain.
- shivampkumar 2 hours ago
  Hey, thanks for sharing this. TRELLIS.2 definitely has room to improve, especially on texturing.
  From what I've seen personally and in community benchmarks, it does fairly well on geometry and visual fidelity among open-source options, but I agree it's not perfect for every use case.
  Meshy is solid. I used it to print my girlfriend a mini 3D model of herself for her birthday last year!
  It's worth noting, though, that it's a paid service and the free tier has usage limits, while TRELLIS.2 is MIT-licensed with unlimited local generation. Different tradeoffs for different workflows. Hopefully the open-source side keeps improving.
post-it 1 hour ago
How much RAM does this use? I'm only sitting on 8 GB right now, and I'm trying to figure out whether I should buy 24 GB when it's time for a replacement or spring for 32.
- shivampkumar 19 minutes ago
  The model needed about 15GB at peak during generation - the 4B model loads multiple sub-models (1.3B each for shape and texture flow). 8GB won't be enough, but both 24GB and 32GB should be fine.
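As a rough back-of-envelope check on the figures above, here is a minimal sketch. It assumes 16-bit weights (the thread does not state the precision actually used), and the function names are hypothetical, not part of the project.

```python
# Back-of-envelope memory check.
# Assumption: 2 bytes per parameter (fp16/bf16); the thread does not say
# which precision the port actually uses.

def weights_gb(params: float, bytes_per_param: int = 2) -> float:
    """Approximate weight footprint in decimal gigabytes."""
    return params * bytes_per_param / 1e9


def fits(total_ram_gb: float, peak_gb: float = 15.0) -> bool:
    """True if the reported ~15 GB peak fits in the given unified memory.

    Ignores memory used by the OS and other apps, so real headroom is smaller.
    """
    return total_ram_gb > peak_gb


print(weights_gb(4e9))  # weights alone for a 4B-parameter model
for ram in (8, 24, 32):
    print(ram, fits(ram))
```

Under that assumption the weights alone come to about 8 GB, so the rest of the reported 15GB peak would presumably be activations and intermediate buffers.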
- Serhii-Set 34 minutes ago
  [dead]
kennyloginz 2 hours ago
So much effort, but no examples on the landing page.
- shivampkumar 1 hour ago
  You're right, thanks for flagging this. Let me run something and push some images.
- shivampkumar 1 hour ago
  added! will add more, maybe even a GIF
villgax 3 hours ago
That’s always been possible with the MPS backend. The reason people omit it in HF Spaces/demos is that HF doesn’t offer an MPS backend. People would rather have the thing work at the best speed than at 10x worse speed just for compatibility.
- shivampkumar 1 hour ago
  IMO TRELLIS.2 is a slightly different case from the HF models scenario. It depends on five compiled CUDA-only extensions -- flex_gemm for sparse convolution, flash_attn, o_voxel for CUDA hashmap ops, cumesh for mesh processing, and nvdiffrast for differentiable rasterization. These aren't PyTorch ops that fall back to MPS -- they're custom C++/CUDA kernels. The upstream setup.sh literally exits with "No supported GPU found" if nvidia-smi isn't present. The only reason I picked this up is that I thought it was cool and no one was working on the open issue requesting non-CUDA support for Apple Silicon back then (github.com/microsoft/TRELLIS.2/issues/74).
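To illustrate the nvidia-smi check described above, here is a minimal sketch of a detector that picks a backend label instead of exiting when nvidia-smi is missing. This is not the upstream setup.sh logic or the port's actual code; `pick_backend` and `detect_backend` are hypothetical names, and it uses only the Python standard library.

```python
# Hypothetical sketch: choose a backend label rather than exiting when
# nvidia-smi is absent. Not the upstream setup.sh logic or the port's code.
import platform
import shutil


def pick_backend(has_nvidia_smi: bool, system: str, machine: str) -> str:
    """Map host facts to a backend label: "cuda", "mps", or "cpu"."""
    if has_nvidia_smi:
        return "cuda"
    # Apple Silicon Macs report Darwin / arm64.
    if system == "Darwin" and machine == "arm64":
        return "mps"
    return "cpu"


def detect_backend() -> str:
    """Probe the current host using only the standard library."""
    return pick_backend(
        shutil.which("nvidia-smi") is not None,
        platform.system(),
        platform.machine(),
    )


print(detect_backend())
```

Selecting "mps" this way only helps if each of the custom CUDA kernels listed above has a non-CUDA replacement; the detection step alone does not make them run.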
- Reubend 2 hours ago
  Are you saying the original one worked with MPS? Or are you just saying it was always theoretically possible to build what OP posted?
- refulgentis 3 hours ago
  It’s always been possible, but it’s not possible because there’s no backend, and no one wants it to be possible because everyone needs it at 10x the speed of running on a Mac? I’m missing something, I think.
  - shivampkumar 1 hour ago
    I thought it was cool, and then I found the open issue mentioned above, which convinced me it's def something more people want.
    It IS significantly slower, about 3.5 minutes on my MacBook vs seconds on an H100. That's partly the pure-PyTorch backend overhead and partly just the hardware difference.
    For my use case the tradeoff works -- iterate locally without paying for cloud GPUs or waiting in queues.
jmatthews 1 hour ago
Well done
- serf 1 hour ago
  rad. how long does output take? trellis is a fun model.
  - shivampkumar 1 hour ago
    I was able to get it in 3.5 mins from a single image on my 24GB M4 Pro MacBook.
    I'm still working on this to try to replicate nvdiffrast better. Found an open source port, might look at it tonight.
- shivampkumar 1 hour ago
  thanks!
hank808 2 hours ago
[flagged]
- refulgentis 42 minutes ago
  Sunday night, and it's kinda cool idk man
- shivampkumar 1 hour ago
  I mean, I can see that it's niche. Did not expect so many upvotes, but ig it's less niche than I thought.
  If you're not working with 3D on Apple Silicon this isn't relevant to you. For the subset of people who are, running this 4B parameter 3D generation model locally on a Mac was previously blocked by hard CUDA dependencies with no workaround.
  - svnt 37 minutes ago
    Right, but it is at most a couple of hours with Claude Code, and it was posted on Sunday night.
- kennyloginz 2 hours ago
  Good question.