# Near AI

See the Near AI website for more on how we will achieve Open Source and User-owned AGI.
## ⚠️ Warning: Alpha software
NearAI is alpha software. This means that it is not yet ready for production use. We are actively working on improving the software and would love your help.
If you would like to help build our future, please see our contributing guide.
## About
The NearAI project is a toolkit to help build, measure, and deploy AI systems focused on agents.
NearAI consists of:
- app.near.ai, a place to share agents, environments, and datasets; powered by the NearAI API.
- A CLI tool to interact with the NearAI registry, download agents, run them in environments, and more.
- A library with access to the same tools as the CLI, but to be used programmatically.
This intro is split into two parts:
## Web Usage Guide
https://app.near.ai allows you to use NEAR AI Hub directly from your browser. It currently offers a subset of the features available from the full Hub API.
Features:
- NEAR AI Login
  - Log in with your NEAR account using your favourite wallet provider.
- Inference using the provider of your choice
  - Choose between the best open source models.
- Read the registry:
  - Datasets - https://app.near.ai/datasets
  - Benchmarks - https://app.near.ai/benchmarks
  - Models - https://app.near.ai/models
  - Agents - https://app.near.ai/agents
- View and manage your NEAR AI access keys: https://app.near.ai/settings
Source code is in demo. For the API specification, see openapi.json.
## CLI Usage Guide

### Registry
The registry is a place to store models, datasets, agents, and environments (more types to come). You can upload and download items from the registry using the `nearai registry` command.
> **About the registry:** The registry is backed by an S3 bucket with metadata stored in a database.
To upload an item to the registry, you need a directory containing a metadata.json file. The metadata.json file describes the item, and all the other files in the directory make up the item. For an agent that is a single `agent.py` file; for a dataset it may be hundreds of files.
The `metadata_template` command will create a template for you to fill in.
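For example, a minimal invocation from inside the item directory (the exact arguments accepted are an assumption; check the CLI help):

```shell
# Generate a metadata.json template to fill in.
nearai registry metadata_template
```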
Fill in name, version, category, and any other fields for which you have values. The notable categories are `model`, `dataset`, and `agent`:
```json
{
  "category": "agent",
  "description": "An example agent",
  "tags": [
    "python"
  ],
  "details": {},
  "show_entry": true,
  "name": "example-agent",
  "version": "1"
}
```
Upload an element to the registry using:
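A sketch; the subcommand name and path argument follow the upload/download usage the CLI advertises, but treat them as assumptions:

```shell
# Point the command at the directory containing metadata.json.
nearai registry upload <PATH_TO_ITEM_DIRECTORY>
```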
You can list elements in the registry using several filters:

```shell
nearai registry list --namespace <NAMESPACE> \
                     --category <CATEGORY> \
                     --tags <COMMA_SEPARATED_LIST_OF_TAGS> \
                     --show_all
```
Check the item is available by listing all elements in the registry of that category:
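For example, using the `--category` filter shown above to confirm an uploaded agent appears:

```shell
nearai registry list --category agent
```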
Show only items with the tags `quine` and `python`:
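Using the comma-separated `--tags` filter from the listing command above:

```shell
nearai registry list --tags quine,python
```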
Download this element locally. To download, refer to the item by its full name:
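A sketch, assuming entries are addressed as `<namespace>/<name>/<version>` (matching the name and version fields in metadata.json):

```shell
nearai registry download <NAMESPACE>/example-agent/1
```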
> **Tip:** If you start downloading an item and cancel the download midway, you should delete the folder at ~/.nearai/registry/ to trigger a new download.
Update the metadata of an item with the `registry update` command, and view info about an item with the `registry info` command.
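A sketch of both commands (the argument shapes are assumptions, not verified):

```shell
# Push changes made to a local metadata.json.
nearai registry update <PATH_TO_ITEM_DIRECTORY>

# Print the stored metadata for an entry.
nearai registry info <NAMESPACE>/example-agent/1
```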
### Agents
See the Agents documentation and How To guide:
- QUICKSTART: build and run a python agent on NearAI
- About Agents
- Agent Operation and Features
- Running an existing agent from the registry
- The Environment API
- Uploading an agent
- Running an agent remotely through the CLI
- Running an agent through the API
- Remote results
### Benchmarking
`nearai` includes a benchmarking tool to compare different agents and solvers on sets of reference evals (like `mbpp`).
To create a benchmark with `nearai`, you need two things:
1. A dataset in your `nearai` dataset registry.
2. A solver for that dataset implemented in the `nearai` library.
If you have a dataset and a solver, you can run a benchmark.
To run a benchmark, you can use the `nearai benchmark` command. For example, to run the `mbpp` benchmark with the `llama-v3-70b-instruct` model, you can use:
```shell
nearai benchmark run mbpp MBPPSolverStrategy \
    --model llama-v3-70b-instruct \
    --subset=train \
    --max_concurrent=1
```
### Fine tuning

We use `torchtune` for fine-tuning models. The following command will start a fine-tuning process using the `llama-3-8b-instruct` model and the `llama3` tokenizer.
```shell
poetry run python3 -m nearai finetune start \
    --model llama-3-8b-instruct \
    --format llama3-8b \
    --tokenizer tokenizers/llama-3 \
    --dataset <your-dataset> \
    --method nearai.finetune.text_completion.dataset \
    --column text \
    --split train \
    --num_procs 8
```
### Submit an experiment
To submit a new experiment run:
```shell
nearai submit --command <COMMAND> --name <EXPERIMENT_NAME> [--nodes <NUMBER_OF_NODES>] [--cluster <CLUSTER>]
```
This will submit a new experiment. The command must be executed from a folder that is a git repository (public GitHub repositories, and private GitHub repositories in the same organization as nearai, are supported). The current commit will be used for running the command, so make sure it is already available online. The diff with respect to the current commit will be applied remotely (new files are not included in the diff).
On each node the environment variable `ASSIGNED_SUPERVISORS` will be available with a comma-separated list of supervisors that are running the experiment. The current supervisor can be accessed via `nearai.CONFIG.supervisor_id`. See examples/prepare_data.py for an example.
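A minimal sketch of how a node might use these two values (the rank computation is illustrative, not part of the API):

```python
import os

import nearai

# Comma-separated list of supervisors running this experiment (set on each node).
supervisors = os.environ["ASSIGNED_SUPERVISORS"].split(",")

# The supervisor assigned to this node.
current = nearai.CONFIG.supervisor_id

# Illustrative: derive this node's rank from its position in the list,
# e.g. to shard a dataset across nodes.
rank = supervisors.index(current)
print(f"node {rank} of {len(supervisors)}")
```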