
Step-by-step guide to using TRELLIS 2 for image-to-3D and text-to-3D generation. Covers workflows, parameter tuning, output formats, and best practices. Start generating 3D models for free.
Last updated: April 14, 2026
TRELLIS 2 by Microsoft Research generates high-quality 3D assets from a single image or text prompt in about 3 seconds. The model uses a 4-billion-parameter architecture with a Sparse 3D VAE that achieves 16x spatial compression, as described in the research paper published at CVPR 2025. This guide covers every step of the process, from preparing your input to exporting production-ready models, whether you're using the open-source repository or an online platform.
TRELLIS 2 offers two ways to generate 3D models:
| Approach | Requirements | Best For |
|---|---|---|
| Online platform | A web browser | Quick generation, no setup |
| Local installation | NVIDIA GPU (16GB+ VRAM), Python 3.10+ | Custom pipelines, batch processing |
For most users, an online platform is the fastest way to start. If you need local installation, see our TRELLIS 2 installation guide.
Image-to-3D is the most popular workflow. You upload a single photo and TRELLIS 2 reconstructs a full 3D model.
The quality of your output depends heavily on the input image. Follow these guidelines:
Ideal input images have:
Avoid:
A well-prepared image can improve output quality by 30-50% compared to a random snapshot.
On our platform:
If you're running TRELLIS 2 locally, place your image in the project directory and run:
`python infer.py --image_path your_image.png`
TRELLIS 2 exposes several parameters that control output quality:
| Parameter | Default | Range | Effect |
|---|---|---|---|
| Resolution | 512 | 256-1536 | Higher = more detail, slower generation |
| Sampling Steps | 12 | 4-50 | More steps = better quality, slower |
| Guidance Scale | 7.5 | 1.0-20.0 | Higher = more faithful to input |
| Seed | Random | Any integer | Fixed seed = reproducible results |
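The ranges in the table above can be checked before a job is submitted. The following is a minimal sketch of such a check. The identifiers `resolution`, `sampling_steps`, and `guidance_scale` are hypothetical names chosen for illustration, not confirmed TRELLIS 2 option names; only the bounds are taken from the table.

```python
# Hypothetical parameter names; the bounds are copied from the parameter table.
PARAM_RANGES = {
    "resolution": (256, 1536),
    "sampling_steps": (4, 50),
    "guidance_scale": (1.0, 20.0),
}


def validate_settings(settings):
    """Return a list of problems; an empty list means every value is in range."""
    problems = []
    for name, value in settings.items():
        if name == "seed":
            # Any integer works as a seed; None stands for "pick one at random".
            if value is not None and not isinstance(value, int):
                problems.append(f"seed must be an integer or None, got {value!r}")
            continue
        if name not in PARAM_RANGES:
            problems.append(f"unknown parameter: {name}")
            continue
        low, high = PARAM_RANGES[name]
        if not low <= value <= high:
            problems.append(f"{name}={value} is outside {low}-{high}")
    return problems


if __name__ == "__main__":
    print(validate_settings({"resolution": 512, "sampling_steps": 12, "seed": 42}))
    print(validate_settings({"resolution": 2048, "sampling_steps": 2}))
```

Returning a list of messages rather than raising on the first error lets a caller report every out-of-range value at once.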
Recommended settings by use case:
| Use Case | Resolution | Steps | Guidance Scale |
|---|---|---|---|
| Quick preview | 256 | 4 | 5.0 |
| Standard quality | 512 | 12 | 7.5 |
| High quality | 1024 | 25 | 10.0 |
| Maximum quality | 1536 | 40 | 12.0 |
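If you switch between these use cases often, the table can be kept as data instead of being retyped. The sketch below stores the four rows as named presets. The `Preset` class and `get_preset` helper are illustrative names, not part of TRELLIS 2; the values are copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preset:
    resolution: int
    steps: int
    guidance_scale: float


# Values copied from the "Recommended settings by use case" table.
PRESETS = {
    "quick preview": Preset(256, 4, 5.0),
    "standard quality": Preset(512, 12, 7.5),
    "high quality": Preset(1024, 25, 10.0),
    "maximum quality": Preset(1536, 40, 12.0),
}


def get_preset(use_case):
    """Look up a preset by use-case name, ignoring case and surrounding spaces."""
    key = use_case.strip().lower()
    if key not in PRESETS:
        raise KeyError(f"unknown use case {use_case!r}; choose from {sorted(PRESETS)}")
    return PRESETS[key]


if __name__ == "__main__":
    print(get_preset("High quality"))
```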
Click generate and wait approximately 3-10 seconds (depending on resolution and hardware). After generation, review your model:
If the result needs improvement, try:
TRELLIS 2 supports multiple export formats:
| Format | Extension | Best For |
|---|---|---|
| GLB | .glb | Game engines (Unity, Unreal), Web viewers, AR/VR |
| OBJ | .obj | Universal compatibility, 3D editing software |
| STL | .stl | 3D printing (geometry only, no textures) |
| 3D Gaussian Splatting | .ply | Real-time rendering, web-based 3D viewers |
| NeRF | .npz | Photorealistic visualization |
Choose GLB for game development, STL for 3D printing, and Gaussian Splatting for web-based 3D experiences.
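In a script, the recommendation above can be encoded as a small lookup so that output files always get the right extension. This is a sketch under stated assumptions: the target labels (`"game development"`, `"3d printing"`, `"web 3d"`) and the `export_path` helper are made up for illustration, while the format-to-extension pairs come from the table.

```python
from pathlib import Path

# Recommended format per target, following the guidance above.
RECOMMENDED_FORMAT = {
    "game development": "GLB",
    "3d printing": "STL",
    "web 3d": "3D Gaussian Splatting",
}

# Extensions from the export format table.
EXTENSIONS = {
    "GLB": ".glb",
    "OBJ": ".obj",
    "STL": ".stl",
    "3D Gaussian Splatting": ".ply",
    "NeRF": ".npz",
}


def export_path(output_dir, model_name, target):
    """Build an output path using the extension of the format recommended for target."""
    fmt = RECOMMENDED_FORMAT[target.strip().lower()]
    return Path(output_dir) / f"{model_name}{EXTENSIONS[fmt]}"


if __name__ == "__main__":
    print(export_path("output_models", "longsword", "3D printing"))
```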
Text-to-3D lets you describe what you want in natural language and TRELLIS 2 generates it.
Good prompts are specific and descriptive. Here's a formula that works well:
`[Subject] + [Key Features] + [Style/Material] + [Optional: Color, Pose, etc.]`
Example prompts:
| Quality | Prompt |
|---|---|
| Basic | "a sword" |
| Good | "a medieval longsword with a leather-wrapped handle" |
| Excellent | "a medieval longsword with a double-edged steel blade, leather-wrapped handle, brass crossguard, and a ruby set in the pommel" |
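The prompt formula lends itself to a small helper when you generate many variations of one object. The following is one possible way to assemble the parts. `build_prompt` and its argument names are illustrative only, and the way the parts are joined (features after "with", style and extras after a comma) is an assumption about what reads naturally, not a TRELLIS 2 requirement.

```python
def join_items(items):
    """Join items as 'a', 'a and b', or 'a, b, and c', skipping blank entries."""
    items = [item.strip() for item in items if item and item.strip()]
    if len(items) <= 2:
        return " and ".join(items)
    return ", ".join(items[:-1]) + ", and " + items[-1]


def build_prompt(subject, features=(), style="", extras=()):
    """Assemble [Subject] + [Key Features] + [Style/Material] + [Optional extras]."""
    prompt = subject.strip()
    feature_text = join_items(features)
    if feature_text:
        prompt += " with " + feature_text
    tail = [part.strip() for part in (style, *extras) if part and part.strip()]
    if tail:
        prompt += ", " + ", ".join(tail)
    return prompt


if __name__ == "__main__":
    print(build_prompt(
        "a medieval longsword",
        features=["a double-edged steel blade", "leather-wrapped handle",
                  "brass crossguard", "a ruby set in the pommel"],
    ))
```

Called with the four features shown, the helper reproduces the "Excellent" prompt from the table.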
On our platform:
Locally:
`python infer.py --text_prompt "a medieval longsword with a ruby pommel"`
Text-to-3D often requires iteration. If the first result isn't what you envisioned:
TRELLIS 2 can accept multiple views of the same object to improve reconstruction quality. If you have photos from the front, side, and back, upload all of them:
`python infer.py --image_path front.png --image_path side.png --image_path back.png`
Multi-view input significantly improves back-side accuracy and overall geometric fidelity.
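When the list of views comes from a script rather than being typed by hand, the repeated `--image_path` flag can be generated programmatically. The sketch below only builds the argument list. It assumes the `infer.py` script and the `--image_path` flag behave as in the command shown above, and `build_multiview_command` is a hypothetical helper name.

```python
import shlex


def build_multiview_command(image_paths, script="infer.py", python="python"):
    """Return an argv list that passes each view with its own --image_path flag."""
    if not image_paths:
        raise ValueError("at least one image is required")
    argv = [python, script]
    for path in image_paths:
        argv += ["--image_path", str(path)]
    return argv


if __name__ == "__main__":
    argv = build_multiview_command(["front.png", "side.png", "back.png"])
    # shlex.join quotes arguments safely, so the printed line can be pasted into a shell.
    print(shlex.join(argv))
    # To run it instead of printing: subprocess.run(argv, check=True)
```

Passing a list to `subprocess.run` instead of a single string avoids quoting problems with file names that contain spaces.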
One of TRELLIS 2's unique features is local editing — modify specific parts of a generated 3D model without regenerating everything:
This is particularly useful for:
For generating multiple models, use batch mode:
`python infer.py --batch_dir ./input_images/ --output_dir ./output_models/`
This processes all images in the input directory sequentially, saving results to the output directory.
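Before starting a long batch, it can help to check which files in the input directory would actually be picked up. The following is a dry-run sketch that does not call TRELLIS 2 at all. The set of accepted extensions and the name-sorted order are assumptions made for illustration; the tool's own behavior may differ.

```python
from pathlib import Path

# Assumed set of accepted image types; adjust to match your installation.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def list_batch_inputs(batch_dir):
    """Return image files directly inside batch_dir, sorted by name for a stable order."""
    directory = Path(batch_dir)
    if not directory.is_dir():
        raise NotADirectoryError(f"{batch_dir} is not a directory")
    return sorted(
        path for path in directory.iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


if __name__ == "__main__":
    folder = Path("./input_images")
    if folder.is_dir():
        for image in list_batch_inputs(folder):
            print(image.name)
```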
Sampling steps control the denoising process. More steps produce cleaner geometry and sharper textures:
| Steps | Quality | Speed | Use Case |
|---|---|---|---|
| 4 | Draft | ~1s | Quick preview |
| 12 | Good | ~3s | Standard use |
| 25 | Very Good | ~6s | Production assets |
| 40+ | Excellent | ~10s | Final output |
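For step counts that fall between the rows of the table, a rough time estimate can be obtained by linear interpolation. The sketch below treats the table's figures as data points and reads "40+" as 40. Actual timings depend on hardware and resolution, so treat the result as a ballpark figure only; `estimate_seconds` is an illustrative helper, not a TRELLIS 2 function.

```python
# (steps, approximate seconds) pairs read from the table; "40+" is treated as 40.
STEP_TIMINGS = [(4, 1.0), (12, 3.0), (25, 6.0), (40, 10.0)]


def estimate_seconds(steps):
    """Linearly interpolate generation time between the table's data points."""
    if steps <= STEP_TIMINGS[0][0]:
        return STEP_TIMINGS[0][1]
    for (s0, t0), (s1, t1) in zip(STEP_TIMINGS, STEP_TIMINGS[1:]):
        if steps <= s1:
            return t0 + (t1 - t0) * (steps - s0) / (s1 - s0)
    # Beyond the last data point, extend the slope of the final segment.
    (s0, t0), (s1, t1) = STEP_TIMINGS[-2:]
    return t1 + (t1 - t0) * (steps - s1) / (s1 - s0)


if __name__ == "__main__":
    for steps in (8, 18, 30):
        print(steps, round(estimate_seconds(steps), 1))
```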
Guidance scale controls how closely the output follows the input. Think of it as "creativity vs. accuracy":
Higher resolution means more detail but requires more VRAM and time:
| Resolution | VRAM Required | Generation Time | Detail Level |
|---|---|---|---|
| 256 | 8 GB | ~1s | Basic shapes |
| 512 | 12 GB | ~3s | Good detail |
| 1024 | 16 GB | ~6s | High detail |
| 1536 | 24 GB | ~10s | Maximum detail |
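The table can also be read in reverse: given the VRAM on your GPU, which resolution is the highest you can attempt? Here is a minimal sketch of that lookup, using the table's figures as thresholds. `max_resolution_for` is a hypothetical helper, and the thresholds are the guide's approximate requirements rather than measured limits.

```python
# (resolution, VRAM in GB) pairs from the table above, lowest first.
VRAM_REQUIREMENTS = [(256, 8), (512, 12), (1024, 16), (1536, 24)]


def max_resolution_for(vram_gb):
    """Return the highest listed resolution that fits in vram_gb, or None if none fits."""
    best = None
    for resolution, required_gb in VRAM_REQUIREMENTS:
        if vram_gb >= required_gb:
            best = resolution
    return best


if __name__ == "__main__":
    for vram in (6, 12, 20, 24):
        print(vram, "GB ->", max_resolution_for(vram))
```

A result of `None` means the GPU is below the smallest listed requirement, in which case an online platform is the practical option.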
Export as GLB with these considerations:
Export as STL or OBJ:
Export as 3D Gaussian Splatting or GLB:
| Issue | Cause | Solution |
|---|---|---|
| Blurry textures | Low sampling steps | Increase to 25+ |
| Distorted geometry | Poor input image | Use a cleaner, well-lit photo |
| Missing details | Low resolution | Increase to 1024+ |
| Artifacts on back | Single-view input | Provide multiple views |
| Slow generation | High resolution + steps | Use online platform with optimized hardware |
| Out of memory | High resolution on limited GPU | Reduce resolution or use cloud generation |
According to community testing on Reddit and benchmarks published by 3D AI Studio, TRELLIS 2 currently leads in generation speed and overall output quality among open-source 3D generation models.
| Feature | TRELLIS 2 | Tripo3D | Meshy AI | Hunyuan3D |
|---|---|---|---|---|
| Generation speed | ~3s | ~10s | ~30s | ~15s |
| Max resolution | 1536³ | 1024³ | 1024³ | 1024³ |
| Output formats | GLB, OBJ, STL, GS, NeRF | GLB, OBJ, FBX | OBJ, FBX, STL, GLB | OBJ, GLB |
| Local editing | Yes | No | No | No |
| Open source | Yes (MIT) | No | No | Yes |
| Multi-view input | Yes | Yes | No | Yes |
| Text-to-3D | Yes | Yes | Yes | Yes |
TRELLIS 2 is a 4-billion-parameter AI model developed by Microsoft Research that generates high-quality 3D assets from text prompts or images. It uses a Sparse 3D VAE and a DiT architecture to produce 3D models in approximately 3 seconds. The source code is available on GitHub under the MIT license.
Yes. TRELLIS 2 is open source under the MIT license. You can run it locally for free if you have a compatible NVIDIA GPU. For those without GPU hardware, our online platform offers a free tier for 3D generation.
TRELLIS 2 supports GLB, OBJ, STL, 3D Gaussian Splatting (.ply), and NeRF (.npz) export formats. GLB is recommended for game engines, STL for 3D printing, and Gaussian Splatting for web-based 3D viewers.
Yes. TRELLIS 2 is released under the MIT license, which permits commercial use. However, always verify the license of any input images you use and check for potential trademark issues in generated content. See the GitHub license discussion for details.
TRELLIS 2 requires a minimum of 8 GB VRAM for 256-resolution generation. For standard 512-resolution output, 12 GB VRAM is recommended. High-quality 1024-1536 resolution generation requires 16-24 GB VRAM. If your GPU doesn't meet these requirements, use the online platform instead.
TRELLIS 2 generates 3D models in approximately 3 seconds, significantly faster than Tripo3D (~10s) and Meshy AI (~30s). It supports higher resolution (up to 1536³), offers local editing, and is the only fully open-source option among the three. See the detailed comparison table above for specifics.
The fastest way to try TRELLIS 2 is through our online platform:
No GPU needed. No Python installation. Upload an image or describe what you want and get a production-ready 3D model in seconds.
| Feature | Self-Hosted | Our Platform |
|---|---|---|
| Setup time | 2-4 hours | 0 minutes |
| GPU required | Yes (16GB+ VRAM) | No |
| Technical knowledge | Python, CUDA | None |
| Max resolution | Limited by your GPU | Up to 1536³ |
| Batch processing | Yes | Yes |
3D technology specialists focused on AI-powered 3D model generation, format conversion, and browser-based 3D rendering. We test and review 3D tools so you don't have to.
