Built-in Data Scraper

Web scraping integration in the Data Hub allows users to automatically collect data from various online sources based on specified criteria. This ensures that the data used for AI workflows is current and relevant to the use case.

Start Scraping Data

Define Data Requirements: Specify the type of data needed, including topics, keywords, and preferred sources, using the natural language chat interface. This information is used to configure the scraper tools to target the most relevant sources.

Configure the Scraper: The default scraper is set up to search for and collect data that meets the defined criteria. You can adjust settings such as crawl depth, rate limits, and data extraction patterns to fine-tune the scraping process, as in the sketch below.
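As an illustration, a scrape job with these settings might be expressed as a simple configuration object. The field names here are hypothetical, not the actual Data Hub schema:

```python
# Hypothetical scrape-job configuration; field names are illustrative
# placeholders, not the Evolve Data Hub's real schema.
scrape_job = {
    "topics": ["decentralized compute", "LLM inference"],
    "keywords": ["GPU sharing", "agent flows"],
    "preferred_sources": ["arxiv.org", "reputable news sites"],
    "crawl_depth": 2,        # follow links up to 2 hops from seed pages
    "rate_limit_rps": 1.0,   # max requests per second per domain
    "extraction_patterns": {
        # CSS selectors marking where article content lives
        "article_body": "article, main, div.post-content",
    },
}
```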

Data Collection Process

Automated Scraping: The scraper automatically crawls the web, extracting data from relevant sources. This includes news articles, social media posts, research papers, and other publicly available information.

Data Cleaning and Refinement: Collected data is cleaned to remove any irrelevant or duplicate information. This involves parsing the data, removing unnecessary tags or formatting, and normalizing the data to a consistent structure.
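A minimal sketch of this fetch, clean, and deduplicate loop, using the common requests and BeautifulSoup libraries as stand-ins for the built-in scraper's internals:

```python
import hashlib
import requests
from bs4 import BeautifulSoup

def fetch_and_clean(url: str) -> str:
    """Fetch a page and reduce it to normalized plain text."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    # Strip markup that carries no content.
    for tag in soup(["script", "style", "nav", "footer"]):
        tag.decompose()
    # Collapse whitespace into a consistent single-spaced form.
    return " ".join(soup.get_text(separator=" ").split())

def deduplicate(documents: list[str]) -> list[str]:
    """Drop exact duplicates by content hash."""
    seen, unique = set(), []
    for doc in documents:
        digest = hashlib.sha256(doc.encode()).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique
```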

Multimodal Data Types

  • Text Data: The scraper tools are adept at handling textual data, extracting content, metadata, and context from web pages.

  • Multimedia Data: The tools can also handle multimedia such as images, videos, and audio files, extracting transcripts and metadata so the content is represented in a corresponding textual form.

  • Structured Data: For structured sources such as tables and databases, the scraper extracts the data into a queryable SQL structure (see the sketch after this list).
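For instance, a structured-data pass could load HTML tables into SQLite and query them with plain SQL. This sketch uses pandas and a placeholder URL, not the platform's actual pipeline:

```python
import sqlite3
import pandas as pd

# Pull every <table> from a page into DataFrames (URL is hypothetical),
# then persist them to SQLite so they can be queried with plain SQL.
tables = pd.read_html("https://example.com/stats")
conn = sqlite3.connect("scraped.db")
for i, table in enumerate(tables):
    table.to_sql(f"scraped_table_{i}", conn, if_exists="replace", index=False)

rows = conn.execute("SELECT * FROM scraped_table_0 LIMIT 5").fetchall()
```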

Output and Usage

Vector Knowledge Graphs: Refined data is organized into vector knowledge graphs for structured representation. These graphs capture the relationships between different data points, making it easier to retrieve and analyze the information.

Integration with AI Models: These knowledge graphs are then available to agent flows as memory, enhancing the agents' information base and performance. The structured data supports efficient querying and retrieval for LLM-driven tasks and workflows.
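A rough sketch of the retrieval step: embed refined text chunks, then return the chunk closest to a query by cosine similarity. The model name and in-memory index are placeholders for the platform's actual vector store:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Placeholder embedding model; the platform's actual model may differ.
model = SentenceTransformer("all-MiniLM-L6-v2")

chunks = ["...refined text chunk 1...", "...refined text chunk 2..."]
chunk_vecs = model.encode(chunks, normalize_embeddings=True)

query_vecs = model.encode(["latest GPU sharing news"], normalize_embeddings=True)
scores = chunk_vecs @ query_vecs.T    # cosine similarity (vectors are unit-length)
best = int(np.argmax(scores))
print(chunks[best])                   # most relevant chunk to hand to the agent
```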
