• NVIDIA released a demo version of a chatbot that runs locally on your PC, giving it access to your files and documents.

• The chatbot, called Chat with RTX, can answer queries and create summaries based on personal data fed into it.

• It supports various file formats and can integrate YouTube videos for contextual queries, making it useful for data research and analysis.

  • @[email protected]
    50 points, 3 months ago

    That was an annoying read. It doesn’t say what this actually is.

    It’s not a new LLM. Chat with RTX is software for doing inference (that is, running LLMs) at home, using the hardware acceleration of RTX cards. There are several projects that do this, though they might not be quite as optimized for NVIDIA’s hardware.


    Go directly to NVIDIA to avoid the clickbait.

    Chat with RTX uses retrieval-augmented generation (RAG), NVIDIA TensorRT-LLM software and NVIDIA RTX acceleration to bring generative AI capabilities to local, GeForce-powered Windows PCs. Users can quickly, easily connect local files on a PC as a dataset to an open-source large language model like Mistral or Llama 2, enabling queries for quick, contextually relevant answers.

    Source: https://blogs.nvidia.com/blog/chat-with-rtx-available-now/

    Download page: https://www.nvidia.com/en-us/ai-on-rtx/chat-with-rtx-generative-ai/
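
    To make the RAG part concrete, here is a minimal sketch of the retrieval step, not NVIDIA's implementation. It assumes plain word-count vectors in place of real neural embeddings, and the file names and contents are hypothetical. It only builds the prompt that would be handed to a local model such as Mistral or Llama 2; no model is called.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the names of the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents.items(),
        key=lambda item: cosine(q, Counter(tokenize(item[1]))),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


def build_prompt(query, documents, k=1):
    """Put the retrieved text in front of the question for the model."""
    names = retrieve(query, documents, k)
    context = "\n".join(f"[{name}] {documents[name]}" for name in names)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical local files standing in for a user's documents.
docs = {
    "notes.txt": "The GPU driver was updated in March to fix a crash.",
    "recipe.txt": "Mix flour, water and yeast, then bake for thirty minutes.",
}

print(build_prompt("When was the GPU driver updated?", docs))
```

    The point of the design is that the model never sees the whole document folder, only the few passages that score highest for the current question.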

    • @[email protected]
      15 points, 3 months ago

      Pretty much every LLM you can download already has CUDA support via PyTorch.

      However, some of the easier-to-use frontends don’t use GPU acceleration because it’s a bit of a pain to configure across a wide range of hardware models and driver versions. IIRC GPT4All does not use GPU acceleration yet (might be outdated; I haven’t checked in a while).

      If this makes local LLMs more accessible to people who are not familiar with setting up a CUDA development environment or Python venvs, that’s great news.
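
      As a rough illustration of the CUDA point, this sketch checks whether PyTorch can see a CUDA GPU. The helper name `pick_device` is made up for this example, and it falls back to "cpu" when PyTorch is not installed at all.

```python
def pick_device():
    """Return "cuda" if PyTorch is installed and sees a CUDA GPU, else "cpu"."""
    try:
        import torch  # optional dependency; may not be installed
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```

      A frontend would pass the returned string to whatever loads the model; getting "cpu" on a machine with an NVIDIA card usually points to a driver or install problem.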

      • ɐɥO
        1 point, 3 months ago

        GPT4All somehow uses GPU acceleration on my RX 6600 XT

        • @[email protected]
          1 point, 3 months ago

          Ooh nice. Looking at the change logs, looks like they added Vulkan acceleration back in September. Probably not as good as CUDA/Metal on supported hardware though.

          • ɐɥO
            1 point, 3 months ago

            getting around 44 iterations/s (or whatever that means) on my gpu

  • @[email protected]
    24 points, 3 months ago (edited)

    “it gives the chatbot access to your files and documents”

    I’m sure nvidia will be trustworthy and responsible with this

  • @[email protected]
    20 points, 3 months ago

    They say it works without an internet connection, and if that’s true this could be pretty awesome. I’m always skeptical about interacting with chatbots that run in the cloud, but if I can put this behind a firewall so I know there’s no telemetry, I’m on board.

    • halfwaythere
      8 points, 3 months ago

      You can already do this. There are plenty of vids that show you how and it’s pretty easy to get started. Expanding functionality to get it to act and respond how you want is a bit more challenging. But definitely doable.

  • Scott
    10 points, 3 months ago

    The performance on my 3070 was awful; tools like LM Studio work significantly better.

    • @[email protected]
      19 points, 3 months ago

      Oh nooo, what an unfortunate turn of events. Guess that just means your GPU is too weak and old. How about upgrading to 40 series?

      – Nvidia, probably

    • @[email protected]
      4 points, 3 months ago

      On my 4090, the performance is much better than ChatGPT4. The output is way worse though.

      • Scott
        1 point, 3 months ago

        Yeah, my boss did a screen share with me and his was done instantly, while mine needed to recompile the embeddings for the 5th time

  • @[email protected]
    1 point, 3 months ago (edited)

    I’m a bit of a noob here. Can someone please give me a few examples of how I would use this on my local machine?