• brucethemoose@lemmy.world
    4 days ago

    Soldered is better! It’s sometimes faster, and definitely faster if it happens to be LPDDR.

    But TBH the only thing that really matters is “how much VRAM do you have,” and Qwen 32B slots in at 24GB, or maybe 16GB if the GPU is totally empty and you tune your quantization carefully. And the cheapest way to get there (until 2025) is a used MI60, P40 or 3090.
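
    If you want to sanity-check the 24GB vs 16GB numbers, here is a rough back-of-the-envelope sketch. It assumes a round 32 billion parameters, and the 2 GiB overhead for KV cache and buffers is a placeholder I picked, not a measured figure; both function names are made up for illustration. Real usage depends on context length and the runtime.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billion: float, bits_per_weight: float,
         vram_gib: float, overhead_gib: float = 2.0) -> bool:
    """True if weights plus an assumed fixed overhead fit in VRAM.

    overhead_gib is a hypothetical allowance for KV cache and buffers.
    """
    return weight_gib(params_billion, bits_per_weight) + overhead_gib <= vram_gib


if __name__ == "__main__":
    # Compare a ~4.5-bit quant against a tighter ~3-bit quant.
    for bits in (4.5, 3.0):
        size = weight_gib(32, bits)
        print(f"{bits} bpw: {size:.1f} GiB weights, "
              f"fits 24: {fits(32, bits, 24)}, fits 16: {fits(32, bits, 16)}")
```

    Under those assumptions, a roughly 4.5-bit quant fits in 24GB but not 16GB, while something around 3 bits per weight squeezes into 16GB with little room left for context.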