Let an LLM solve your vision task without any training, using an interactive run on VESSL.
VisProg is a neuro-symbolic approach that uses natural language instructions to solve complex visual tasks. It generates modular programs whose steps call computer vision models and image processing routines, which makes it flexible enough for tasks such as visual question answering and language-guided image editing. Because a new task is handled by composing existing modules rather than by training a new model, the same system can cover a wide range of user needs.
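To make the idea of a "modular program" more concrete, here is a minimal toy sketch in Python. It is not VisProg's actual code: the module names (`COUNT`, `COMPARE`), the step format, and the `run_program` helper are hypothetical stand-ins, and the modules work on plain lists of labels instead of real images. It only illustrates the general pattern of running a program step by step, where each step calls one module and stores its result for later steps.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical stand-in modules. In a real system these would wrap
# vision models or image processing routines.
def count_items(items: List[str], label: str) -> int:
    """Count how many detected labels match the requested one."""
    return sum(1 for item in items if item == label)

def compare(left: int, right: int) -> str:
    """Answer a yes/no question by comparing two counts."""
    return "yes" if left > right else "no"

# Registry that maps module names used in a program to Python callables.
MODULES: Dict[str, Callable[..., Any]] = {
    "COUNT": count_items,
    "COMPARE": compare,
}

# A step is (output variable, module name, arguments).
# String arguments that start with "$" refer to earlier results.
Step = Tuple[str, str, Dict[str, Any]]

def run_program(steps: List[Step], inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Execute the steps in order, keeping every result in a shared state."""
    state = dict(inputs)
    for out_var, module_name, kwargs in steps:
        resolved = {
            key: state[value[1:]]
            if isinstance(value, str) and value.startswith("$")
            else value
            for key, value in kwargs.items()
        }
        state[out_var] = MODULES[module_name](**resolved)
    return state

# Toy program for the question "Are there more dogs than cats?"
program: List[Step] = [
    ("DOGS", "COUNT", {"items": "$DETECTIONS", "label": "dog"}),
    ("CATS", "COUNT", {"items": "$DETECTIONS", "label": "cat"}),
    ("ANSWER", "COMPARE", {"left": "$DOGS", "right": "$CATS"}),
]

result = run_program(program, {"DETECTIONS": ["dog", "cat", "dog"]})
print(result["ANSWER"])  # prints "yes"
```

In this sketch the program itself is just data, so in principle an LLM could produce it from an instruction while the interpreter and modules stay fixed. That separation is the point of the example, not the specific step format.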
<aside>
💡 Replace "your openai api key" with your actual OpenAI API key.
</aside>
name: visprog
description: "Let an LLM solve your vision task without any training, using an interactive run on VESSL."
image: nvcr.io/nvidia/pytorch:22.10-py3
resources:
  cluster: aws-apne2
  preset: v1.v100-1.mem-52
run:
  - workdir: /root
    command: |
      echo $OPENAI_API_KEY
      git clone https://github.com/treasuraid/visprog.git
  - workdir: /root/visprog
    command: |
      conda env create -f environment.yaml
      source activate visprog
      pip install vessl opencv-python-headless
      python script/image_editing.py
env:
  OPENAI_API_KEY: "your openai api key"
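Since the config ships with a placeholder value for `OPENAI_API_KEY`, it is easy to launch a run without replacing it. The following is a minimal sketch of a pre-flight check that fails fast in that case. The `check_api_key` helper and the `PLACEHOLDER` constant are hypothetical names introduced for illustration; they are not part of the visprog repository or of VESSL.

```python
import os
from typing import Mapping, Optional

# Placeholder value used in the run config (assumed to be unchanged).
PLACEHOLDER = "your openai api key"

def check_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key, or raise if it is missing or still the placeholder."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    if key.lower() == PLACEHOLDER:
        raise RuntimeError("OPENAI_API_KEY still holds the placeholder value")
    return key
```

Calling `check_api_key()` with no arguments reads from the process environment, so it could run at the start of a script before any request is made. Passing a dictionary instead makes the check easy to try out locally.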