# Ollama

**System Architecture**

```mermaid
flowchart LR  
    User[User Request] --> Balancer[Load Balancer]  
    Balancer --> Node[Ollama Node]  
    Node -->|Inference Proof| Blockchain[AgentGPT Blockchain]  
    Blockchain -->|Token Stream| Node  
    Blockchain --> Audit[Audit Ledger]  
    Node --> Cache[Local GGUF Models]  
    style Node fill:#8fc,stroke:#333  
    style Cache fill:#fdd,stroke:#333  
```

***

#### **1. Local LLM Configuration**

**Ollama Modelfile Setup**

```yaml
# agentgpt-modelfile.yaml  
model: "llama3-70b-instruct"  
parameters:  
  num_ctx: 8192  
  temperature: 0.7  
  num_gpu_layers: 45  
quantization: "Q4_K_M"  
adapter: "web3-legal-v3.safetensors"  
blockchain:  
  reward_address: "0x..."  
  payment_per_token: 0.00015 AGPT  
system_prompt: |  
  RESPONSE IN JSON ONLY.  
  Include 'chain_id', 'model_hash', 'token_cost'.  
```
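To make the `payment_per_token` setting concrete, here is a minimal Python sketch of how a per-response payout could be derived from it. The helper name `reward_for_tokens` and the use of `Decimal` are illustrative assumptions, not part of the AgentGPT tooling.

```python
from decimal import Decimal

# Rate copied from the modelfile above: AGPT paid per generated token.
PAYMENT_PER_TOKEN = Decimal("0.00015")


def reward_for_tokens(token_count: int, rate: Decimal = PAYMENT_PER_TOKEN) -> Decimal:
    """Return the AGPT payout for a response of `token_count` tokens (hypothetical helper)."""
    if token_count < 0:
        raise ValueError("token_count must be non-negative")
    return rate * token_count


if __name__ == "__main__":
    # A response that fills the whole 8192-token context window (num_ctx).
    print(reward_for_tokens(8192))
```

`Decimal` is used instead of `float` so that small per-token amounts add up without binary rounding drift.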

**Deployment Command**:

```bash
agentgpt deploy-ollama \
  --models ./models \
  --hardware "gpu=2,ram=64GB" \
  --node-type "VALIDATOR"
```

***

#### **2. Proof-of-Compute Smart Contract**

**Verified Inference Paymaster (Solidity)**

```solidity
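// contracts/OllamaPaymaster.sol
pragma solidity ^0.8.25;

// Assumption: rewards are minted as an OpenZeppelin ERC-20; the token name and symbol are illustrative.
import {ERC20} from "@openzeppelin/contracts/token/ERC20/ERC20.sol";

contract ComputeVerifier is ERC20 {
    mapping(bytes32 => bool) public verifiedHashes;

    constructor() ERC20("AgentGPT", "AGPT") {}

    /// @notice Records a proof once and mints the per-token reward to the caller.
    /// @dev Only rejects duplicates; the zk proof itself is not verified here.
    function submitInferenceProof(
        bytes32 modelHash,
        bytes32 inputHash,
        bytes calldata zkProof
    ) external returns (uint256 reward) {
        bytes32 proofHash = keccak256(abi.encode(modelHash, inputHash, zkProof));
        require(!verifiedHashes[proofHash], "Duplicate proof");
        verifiedHashes[proofHash] = true;
        reward = 0.00015 ether * _countTokens(zkProof);
        _mint(msg.sender, reward);
    }

    /// @dev Reads the token count from the first 32-byte word of the proof.
    function _countTokens(bytes memory proof) internal pure returns (uint256) {
        return abi.decode(proof, (uint256));
    }
}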
```
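The contract's bookkeeping can be exercised off-chain before deploying. The sketch below is an in-memory Python stand-in, assuming the token count occupies the first 32-byte word of the proof (as `_countTokens` reads it). The names `encode_proof` and `ComputeVerifierSim` are hypothetical, SHA3-256 substitutes for `keccak256`, and plain concatenation substitutes for `abi.encode`, so the hashes will not match on-chain values.

```python
import hashlib

# 0.00015 ether expressed in wei, matching the reward multiplier in the contract.
REWARD_PER_TOKEN_WEI = 150_000_000_000_000


def encode_proof(token_count: int, proof_body: bytes = b"") -> bytes:
    """Prefix the proof with the token count as a 32-byte big-endian word."""
    return token_count.to_bytes(32, "big") + proof_body


class ComputeVerifierSim:
    """In-memory stand-in for the duplicate check and reward arithmetic."""

    def __init__(self) -> None:
        self.verified_hashes: set[bytes] = set()
        self.balances: dict[str, int] = {}

    def submit_inference_proof(
        self, sender: str, model_hash: bytes, input_hash: bytes, zk_proof: bytes
    ) -> int:
        if len(model_hash) != 32 or len(input_hash) != 32:
            raise ValueError("hashes must be 32 bytes")
        # SHA3-256 over a plain concatenation: NOT keccak256(abi.encode(...)).
        proof_hash = hashlib.sha3_256(model_hash + input_hash + zk_proof).digest()
        if proof_hash in self.verified_hashes:
            raise ValueError("Duplicate proof")
        self.verified_hashes.add(proof_hash)
        token_count = int.from_bytes(zk_proof[:32], "big")
        reward = REWARD_PER_TOKEN_WEI * token_count
        self.balances[sender] = self.balances.get(sender, 0) + reward
        return reward


if __name__ == "__main__":
    sim = ComputeVerifierSim()
    proof = encode_proof(100, b"opaque-proof-bytes")
    print(sim.submit_inference_proof("0xabc", b"\x01" * 32, b"\x02" * 32, proof))
```

Because the token count is supplied by the submitter inside the proof bytes, a production verifier would need to check the proof before trusting that number.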

***

#### **3. Edge Workflow Automation**

**Hybrid Local/Cloud Execution (Python)**

```python
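from ollama import AsyncClient
from agentgpt.web3 import ChainlinkOracle

class InferenceRouter:
    def __init__(self):
        # AsyncClient supports `await`; the server address is passed as `host`.
        self.ollama = AsyncClient(host='http://localhost:11434')
        self.oracle = ChainlinkOracle(job_id="ollama_verify_v2")

    async def process_legal_doc(self, text: str):
        response = await self.ollama.generate(
            model="llama3-legal-70b",
            prompt=f"Analyze contract risk: {text}",
            options={"temperature": 0.5}
        )
        # Ollama returns the generated text under the 'response' key.
        analysis = response['response']
        proof = await self.oracle.fetch_zkp(analysis)
        return {
            "analysis": analysis,
            "tx_hash": await self._log_to_chain(proof)
        }

    async def _log_to_chain(self, proof):
        # Placeholder: submit `proof` on-chain and return the transaction hash.
        raise NotImplementedError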
```

***

#### **4. Resource Governance**

**Hardware Tokenization Rules**

```yaml
resource_limits:  
  cpu:  
    max_usage: 85%  
    penalty: "0.02 AGPT/min-over"  
  vram:  
    min_required: 12GB  
models:  
  required:  
    - "llama3-8b-legal-q4"  
    - "mixtral-47b-web3"  
fallback:  
  strategy: "distribute_to_fleet"  
```
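One possible reading of the `penalty` rule is a flat fee for every minute in which CPU usage exceeds `max_usage`. The sketch below follows that reading; the one-sample-per-minute input and the function name `cpu_penalty` are assumptions made for illustration.

```python
from decimal import Decimal

MAX_CPU_USAGE = Decimal("85")         # percent, from resource_limits.cpu.max_usage
PENALTY_PER_MINUTE = Decimal("0.02")  # AGPT, from resource_limits.cpu.penalty


def cpu_penalty(samples_percent: list[float]) -> Decimal:
    """Total penalty for a list of per-minute CPU readings (assumed: one reading per minute)."""
    minutes_over = sum(
        1 for sample in samples_percent if Decimal(str(sample)) > MAX_CPU_USAGE
    )
    return PENALTY_PER_MINUTE * minutes_over


if __name__ == "__main__":
    # Two of the four minutes exceed the limit; exactly 85% does not count as over.
    print(cpu_penalty([80, 90, 85, 99.5]))
```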

***

#### **5. Real-Time Monitoring**

```bash
agentgpt monitor ollama --detail full  
```

**Output**:

```
OLLAMA NODE NETWORK  
┌───────────────┬───────────┬──────────┬──────────────┐  
│ Node ID       │ Load      │ AGPT/day │ Model Ready  │  
├───────────────┼───────────┼──────────┼──────────────┤  
│ node-7x2a     │ 78% CPU   │ 124.83   │ 22/24 models │  
│ node-gpu3     │ 2x A6000  │ 384.12   │ All models   │  
├───────────────┼───────────┼──────────┼──────────────┤  
│ TOTAL         │ 63% Avg   │ 508.95   │              │
└───────────────┴───────────┴──────────┴──────────────┘
Compliance: 99.1%
Failed Proofs: 0.02%
```
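The network total is the sum of the per-node `AGPT/day` figures. A small sketch of that aggregation, using the two rows from the sample output (the dictionary layout and function name are assumptions, not the monitor's actual data format):

```python
from decimal import Decimal

# Per-node daily earnings copied from the sample output above.
AGPT_PER_DAY = {
    "node-7x2a": Decimal("124.83"),
    "node-gpu3": Decimal("384.12"),
}


def total_agpt_per_day(per_node: dict[str, Decimal]) -> Decimal:
    """Sum daily earnings across all nodes."""
    return sum(per_node.values(), Decimal("0"))


if __name__ == "__main__":
    print(total_agpt_per_day(AGPT_PER_DAY))  # 508.95, the figure in the TOTAL row
```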

***

#### **6. Security & Compliance**

**Model Provenance Checker (Rust)**

```rust
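// agentgpt-provenance/src/lib.rs
use std::fs;

/// Checks the detached signature stored next to the model at `<model_path>.sig`.
/// `compute_blake3_hash`, `crypto::verify`, and `VerificationError` are assumed to be
/// defined elsewhere in this crate; `VerificationError` must implement
/// `From<std::io::Error>` for the `?` on the file read.
pub fn verify_model_signature(
    model_path: &str,
    public_key: &str,
) -> Result<bool, VerificationError> {
    let hash = compute_blake3_hash(model_path)?;
    let sig = fs::read_to_string(format!("{}.sig", model_path))?;
    crypto::verify(public_key, hash.as_bytes(), sig.as_bytes())
}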
```
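The hashing step of the provenance check can be tried with the Python standard library alone. This sketch streams a model file through a hash and compares it with a pinned digest. It is an integrity check only, not a signature verification, and it uses BLAKE2b as a stand-in because BLAKE3 is not in the standard library. The function names are hypothetical.

```python
import hashlib
import hmac
import os
import tempfile


def hash_model_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte GGUF files never sit fully in memory."""
    digest = hashlib.blake2b()  # stand-in for BLAKE3
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_digest(path: str, expected_hex: str) -> bool:
    """Compare the file hash with a pinned hex digest in constant time."""
    return hmac.compare_digest(hash_model_file(path), expected_hex.lower())


if __name__ == "__main__":
    # Demo with a throwaway file standing in for a GGUF model.
    payload = b"not a real model"
    with tempfile.NamedTemporaryFile(suffix=".gguf", delete=False) as tmp:
        tmp.write(payload)
    try:
        pinned = hashlib.blake2b(payload).hexdigest()
        print(matches_pinned_digest(tmp.name, pinned))
    finally:
        os.remove(tmp.name)
```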

***

#### **Troubleshooting Matrix**

| **Issue**             | **Resolution Protocol**                                 |
| --------------------- | ------------------------------------------------------- |
| Model Load Failure    | 1. Rehash GGUF files ▸ 2. Verify GPU compatibility      |
| ZK-Proof Mismatch     | 1. Retry with higher precision ▸ 2. Check CUDA versions |
| Token Undercount      | 1. Manually adjust via DAO vote ▸ 2. Reprocess logs     |
| Node Oversubscription | 1. Throttle requests ▸ 2. Activate spot nodes           |
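
For the "Throttle requests" step, one common approach is a token bucket placed in front of the node. The sketch below is a generic illustration of that technique, not the mechanism AgentGPT uses; the class name and parameters are assumptions.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate_per_s` tokens per second."""

    def __init__(
        self,
        rate_per_s: float,
        capacity: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate_per_s
        self.capacity = capacity
        self.tokens = float(capacity)
        self.clock = clock
        self.last = clock()

    def allow(self) -> bool:
        """Return True and consume a token if the request may proceed."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate_per_s=5, capacity=10)
    accepted = sum(bucket.allow() for _ in range(25))
    print(f"accepted {accepted} of 25 back-to-back requests")
```

The injectable `clock` keeps the limiter testable without sleeping.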

***

#### **Best Practices**

1. **Model Quantization Guide**:

```bash
agentgpt ollama quantize \
  --model "llama3-70b" \
  --quant "Q5_K_S" \
  --gpu_layers 45
```

2. **Hardware Allocation**:

```yaml
gpu_priority:  
  - "legal_analysis: A100"  
  - "sentiment: T4"  
cpu_fallback:  
  enabled: true  
  max_threads: 8  
```

3. **Federated Learning**:

```solidity
function submit_model_update(bytes32 diffHash) external {  
    require(stakedTokens[msg.sender] >= 1000 ether, "Insufficient stake");  
    modelUpdates.push(diffHash);  
    _schedule_consensus_check();  
}  
```
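
The hardware allocation rules in item 2 can be read as a lookup with a CPU fallback. A minimal sketch of that selection logic follows; `pick_device`, the device label format, and the idea of passing in the set of available GPUs are all illustrative assumptions.

```python
# Values mirror the gpu_priority and cpu_fallback settings shown above.
GPU_PRIORITY = {"legal_analysis": "A100", "sentiment": "T4"}
CPU_FALLBACK = {"enabled": True, "max_threads": 8}


def pick_device(task: str, available_gpus: set[str]) -> str:
    """Return the preferred GPU for a task, or the CPU fallback if it is unavailable."""
    preferred = GPU_PRIORITY.get(task)
    if preferred in available_gpus:
        return f"gpu:{preferred}"
    if CPU_FALLBACK["enabled"]:
        return f"cpu:{CPU_FALLBACK['max_threads']}"
    raise RuntimeError(f"no device available for task {task!r}")


if __name__ == "__main__":
    print(pick_device("legal_analysis", {"A100", "T4"}))
    print(pick_device("sentiment", {"A100"}))
```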

